Sample records for performance validation test

  1. Embedded performance validity testing in neuropsychological assessment: Potential clinical tools.

    PubMed

    Rickards, Tyler A; Cranston, Christopher C; Touradji, Pegah; Bechtold, Kathleen T

    2018-01-01

    The article aims to suggest clinically useful tools in neuropsychological assessment for efficient use of embedded measures of performance validity. To accomplish this, we integrated available validity-related and statistical research from the literature, consensus statements, and survey-based data from practicing neuropsychologists. We provide recommendations for 1) cutoffs for embedded performance validity tests including Reliable Digit Span, California Verbal Learning Test (Second Edition) Forced Choice Recognition, Rey-Osterrieth Complex Figure Test Combination Score, Wisconsin Card Sorting Test Failure to Maintain Set, and the Finger Tapping Test; 2) selecting the number of performance validity measures to administer in an assessment; and 3) hypothetical clinical decision-making models for use of performance validity testing in a neuropsychological assessment, collectively considering behavior, patient reporting, and data indicating invalid or noncredible performance. Performance validity testing helps inform the clinician about an individual's general approach to tasks: response to failure, task engagement and persistence, and compliance with task demands. These data-driven clinical suggestions provide a resource for clinicians and are intended to instigate conversation within the field, encourage more uniform, testable decisions, and guide future research in this area.

  2. Performance Validity Testing in Neuropsychology: Scientific Basis and Clinical Application-A Brief Review.

    PubMed

    Greher, Michael R; Wodushek, Thomas R

    2017-03-01

    Performance validity testing refers to neuropsychologists' methodology for determining whether neuropsychological test performances completed in the course of an evaluation are valid (ie, the results of true neurocognitive function) or invalid (ie, overly impacted by the patient's effort/engagement in testing). This determination relies upon the use of either standalone tests designed for this sole purpose, or specific scores/indicators embedded within traditional neuropsychological measures that have demonstrated this utility. In response to a greater appreciation for the critical role that performance validity issues play in neuropsychological testing and the need to measure this variable to the best of our ability, the scientific base for performance validity testing has expanded greatly over the last 20 to 30 years. As such, the majority of current day neuropsychologists in the United States use a variety of measures for the purpose of performance validity testing as part of everyday forensic and clinical practice and address this issue directly in their evaluations. The following is the first article of a 2-part series that will address the evolution of performance validity testing in the field of neuropsychology, both in terms of the science as well as the clinical application of this measurement technique. The second article of this series will review performance validity tests in terms of methods for development of these measures, and maximizing of diagnostic accuracy.

  3. Performance Evaluation of a Data Validation System

    NASA Technical Reports Server (NTRS)

    Wong, Edmond (Technical Monitor); Sowers, T. Shane; Santi, L. Michael; Bickford, Randall L.

    2005-01-01

    Online data validation is a performance-enhancing component of modern control and health management systems. It is essential that performance of the data validation system be verified prior to its use in a control and health management system. A new Data Qualification and Validation (DQV) Test-bed application was developed to provide a systematic test environment for this performance verification. The DQV Test-bed was used to evaluate a model-based data validation package known as the Data Quality Validation Studio (DQVS). DQVS was employed as the primary data validation component of a rocket engine health management (EHM) system developed under NASA's NGLT (Next Generation Launch Technology) program. In this paper, the DQVS and DQV Test-bed software applications are described, and the DQV Test-bed verification procedure for this EHM system application is presented. Test-bed results are summarized and implications for EHM system performance improvements are discussed.

  4. Performance and Symptom Validity Testing as a Function of Medical Board Evaluation in U.S. Military Service Members with a History of Mild Traumatic Brain Injury.

    PubMed

    Armistead-Jehle, Patrick; Cole, Wesley R; Stegman, Robert L

    2018-02-01

    The study was designed to replicate and extend previous findings demonstrating the high rates of invalid neuropsychological testing in military service members (SMs) with a history of mild traumatic brain injury (mTBI) assessed in the context of a medical evaluation board (MEB). Two hundred thirty-one active duty SMs (61 of whom were undergoing an MEB) underwent neuropsychological assessment. Performance validity (Word Memory Test) and symptom validity (MMPI-2-RF) test data were compared across those evaluated within disability (MEB) and clinical contexts. As with previous studies, there were significantly more individuals in an MEB context that failed performance (MEB = 57%, non-MEB = 31%) and symptom validity testing (MEB = 57%, non-MEB = 22%), and performance validity testing had a notable effect on cognitive test scores. Performance and symptom validity test failure rates did not vary as a function of the reason for disability evaluation when divided into behavioral versus physical health conditions. These data are consistent with past studies, and extend those studies by including symptom validity testing and investigating the effect of reason for MEB. This and previous studies demonstrate that more than 50% of SMs seen in the context of an MEB will fail performance validity tests and over-report on symptom validity measures. These results emphasize the importance of using both performance and symptom validity testing when evaluating SMs with a history of mTBI, especially if they are being seen for disability evaluations, in order to ensure the accuracy of cognitive and psychological test data. Published by Oxford University Press 2017. This work is written by (a) US Government employee(s) and is in the public domain in the US.

  5. Functional performance testing of the hip in athletes: a systematic review for reliability and validity.

    PubMed

    Kivlan, Benjamin R; Martin, Robroy L

    2012-08-01

    The purpose of this study was to systematically review the literature for functional performance tests with evidence of reliability and validity that could be used for a young, athletic population with hip dysfunction. A search of PubMed and SPORTDiscus databases was performed to identify movement, balance, hop/jump, or agility functional performance tests from the current peer-reviewed literature used to assess function of the hip in young, athletic subjects. The single-leg stance, deep squat, single-leg squat, and star excursion balance tests (SEBT) demonstrated evidence of validity and normative data for score interpretation. The single-leg stance test and SEBT have evidence of validity with association to hip abductor function. The deep squat test demonstrated evidence as a functional performance test for evaluating femoroacetabular impingement. Hop/Jump tests and agility tests have no reported evidence of reliability or validity in a population of subjects with hip pathology. Use of functional performance tests in the assessment of hip dysfunction has not been well established in the current literature. Diminished squat depth and provocation of pain during the single-leg balance test have been associated with patients diagnosed with FAI and gluteal tendinopathy, respectively. The SEBT and single-leg squat tests provided evidence of convergent validity through an analysis of kinematics and muscle function in normal subjects. Reliability of functional performance tests has not been established on patients with hip dysfunction. Further study is needed to establish reliability and validity of functional performance tests that can be used in a young, athletic population with hip dysfunction. Level of Evidence: 2b (Systematic Review of Literature).

  6. Performance validity testing in neuropsychology: a clinical guide, critical review, and update on a rapidly evolving literature.

    PubMed

    Lippa, Sara M

    2018-04-01

    Over the past two decades, there has been much research on measures of response bias and myriad measures have been validated in a variety of clinical and research samples. This critical review aims to guide clinicians through the use of performance validity tests (PVTs) from test selection and administration through test interpretation and feedback. Recommended cutoffs and relevant test operating characteristics are presented. Other important issues to consider during test selection, administration, interpretation, and feedback are discussed including order effects, coaching, impact on test data, and methods to combine measures and improve predictive power. When interpreting performance validity measures, neuropsychologists must use particular caution in cases of dementia, low intelligence, English as a second language/minority cultures, or low education. PVTs provide valuable information regarding response bias and, under the right circumstances, can provide excellent evidence of response bias. Only after consideration of the entire clinical picture, including validity test performance, can concrete determinations regarding the validity of test data be made.

  7. Effort, symptom validity testing, performance validity testing and traumatic brain injury.

    PubMed

    Bigler, Erin D

    2014-01-01

    To understand the neurocognitive effects of brain injury, valid neuropsychological test findings are paramount. This review examines the research on what has been referred to as symptom validity testing (SVT). Performance above a designated cut-score signifies a 'passing' SVT performance, which is likely the best indicator of valid neuropsychological test findings. Likewise, substantially below cut-point performance that nears chance or is at chance signifies invalid test performance. Significantly below chance is the sine qua non neuropsychological indicator for malingering. However, the interpretative problems with SVT performance below the cut-point yet far above chance are substantial, as pointed out in this review. This intermediate, border-zone performance on SVT measures is where substantial interpretative challenges exist. Case studies are used to highlight the many areas where additional research is needed. Historical perspectives are reviewed along with the neurobiology of effort. Reasons why performance validity testing (PVT) may be better than the SVT term are reviewed. Advances in neuroimaging techniques may be key in better understanding the meaning of border zone SVT failure. The review demonstrates the problems with rigidity in interpretation with established cut-scores. A better understanding of how certain types of neurological, neuropsychiatric and/or even test conditions may affect SVT performance is needed.
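
    The "significantly below chance" criterion described in this record can be illustrated with a one-sided binomial calculation on a two-alternative forced-choice measure. The following is a minimal sketch only: the function name, the 50-item test length, and the example score are assumptions for illustration and are not taken from the review.

```python
from math import comb


def below_chance_p(correct: int, items: int, chance: float = 0.5) -> float:
    """One-sided binomial probability of scoring `correct` or fewer by guessing."""
    return sum(
        comb(items, k) * chance**k * (1 - chance) ** (items - k)
        for k in range(correct + 1)
    )


# Hypothetical example: 15 of 50 two-choice items answered correctly.
p = below_chance_p(15, 50)
print(f"P(score <= 15 | pure guessing) = {p:.4f}")
```

    A small probability here suggests a score lower than guessing alone would plausibly produce. Scores below a cut-score but above chance, the border zone the review discusses, are not addressed by this calculation.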

  8. 15 CFR 995.27 - Format validation software testing.

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ... 15 Commerce and Foreign Trade 3 2013-01-01 2013-01-01 false Format validation software testing... of NOAA ENC Products § 995.27 Format validation software testing. Tests shall be performed verifying, as far as reasonable and practicable, that CEVAD's data testing software performs the checks, as...

  9. 15 CFR 995.27 - Format validation software testing.

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ... 15 Commerce and Foreign Trade 3 2014-01-01 2014-01-01 false Format validation software testing... of NOAA ENC Products § 995.27 Format validation software testing. Tests shall be performed verifying, as far as reasonable and practicable, that CEVAD's data testing software performs the checks, as...

  10. 15 CFR 995.27 - Format validation software testing.

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ... 15 Commerce and Foreign Trade 3 2012-01-01 2012-01-01 false Format validation software testing... of NOAA ENC Products § 995.27 Format validation software testing. Tests shall be performed verifying, as far as reasonable and practicable, that CEVAD's data testing software performs the checks, as...

  11. 15 CFR 995.27 - Format validation software testing.

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ... 15 Commerce and Foreign Trade 3 2011-01-01 2011-01-01 false Format validation software testing... of NOAA ENC Products § 995.27 Format validation software testing. Tests shall be performed verifying, as far as reasonable and practicable, that CEVAD's data testing software performs the checks, as...

  12. FUNCTIONAL PERFORMANCE TESTING OF THE HIP IN ATHLETES: A SYSTEMATIC REVIEW FOR RELIABILITY AND VALIDITY

    PubMed Central

    Martin, RobRoy L.

    2012-01-01

    Purpose/Background: The purpose of this study was to systematically review the literature for functional performance tests with evidence of reliability and validity that could be used for a young, athletic population with hip dysfunction. Methods: A search of PubMed and SPORTDiscus databases was performed to identify movement, balance, hop/jump, or agility functional performance tests from the current peer-reviewed literature used to assess function of the hip in young, athletic subjects. Results: The single-leg stance, deep squat, single-leg squat, and star excursion balance tests (SEBT) demonstrated evidence of validity and normative data for score interpretation. The single-leg stance test and SEBT have evidence of validity with association to hip abductor function. The deep squat test demonstrated evidence as a functional performance test for evaluating femoroacetabular impingement. Hop/Jump tests and agility tests have no reported evidence of reliability or validity in a population of subjects with hip pathology. Conclusions: Use of functional performance tests in the assessment of hip dysfunction has not been well established in the current literature. Diminished squat depth and provocation of pain during the single-leg balance test have been associated with patients diagnosed with FAI and gluteal tendinopathy, respectively. The SEBT and single-leg squat tests provided evidence of convergent validity through an analysis of kinematics and muscle function in normal subjects. Reliability of functional performance tests has not been established on patients with hip dysfunction. Further study is needed to establish reliability and validity of functional performance tests that can be used in a young, athletic population with hip dysfunction. Level of Evidence: 2b (Systematic Review of Literature). PMID:22893860

  13. Student mathematical imagination instruments: construction, cultural adaptation and validity

    NASA Astrophysics Data System (ADS)

    Dwijayanti, I.; Budayasa, I. K.; Siswono, T. Y. E.

    2018-03-01

    Imagination has an important role as the center of sensorimotor activity of the students. The purpose of this research is to construct an instrument for students' mathematical imagination in understanding the concept of algebraic expression. The researchers assess validity using questionnaire and test techniques and analyze the data using a descriptive method. Stages performed include: 1) constructing the embodiment of the imagination; 2) determining the learning style questionnaire; 3) constructing the instruments; 4) translating to Indonesian as well as adapting the learning style questionnaire content to student culture; 5) performing content validation. The results indicate that the constructed instrument is valid by content validation and empirical validation, so that it can be used with revisions. Content validation involves Indonesian linguists, English linguists and mathematics material experts. Empirical validation is done through a legibility test (10 students) and shows that in general the language used can be understood. In addition, a questionnaire test (86 students) was analyzed using a point-biserial correlation technique, resulting in 16 valid items, with a reliability test using KR 20 meeting the medium reliability criterion. The test instrument trial (32 students) found all items to be valid, and the reliability test using KR 21 gave a reliability of 0.62.

  14. Agility performance in high-level junior basketball players: the predictive value of anthropometrics and power qualities.

    PubMed

    Sisic, Nedim; Jelicic, Mario; Pehar, Miran; Spasic, Miodrag; Sekulic, Damir

    2016-01-01

    In basketball, anthropometric status is an important factor when identifying and selecting talents, while agility is one of the most vital motor performances. The aim of this investigation was to evaluate the influence of anthropometric variables and power capacities on different preplanned agility performances. The participants were 92 high-level, junior-age basketball players (16-17 years of age; 187.6±8.72 cm in body height, 78.40±12.26 kg in body mass), randomly divided into a validation and cross-validation subsample. The predictor set consisted of 16 anthropometric variables and three tests of power capacities (Sargent-jump, broad-jump and medicine-ball-throw). The criteria were three tests of agility: a T-Shape-Test; a Zig-Zag-Test, and a test of running with a 180-degree turn (T180). Forward stepwise multiple regressions were calculated for validation subsamples and then cross-validated. Cross validation included correlations between observed and predicted scores, dependent samples t-test between predicted and observed scores; and Bland Altman graphics. Analysis of variance identified centres as being advanced in most of the anthropometric indices, and medicine-ball-throw (all at P<0.05); with no significant between-position-differences for other studied motor performances. Multiple regression models originally calculated for the validation subsample were then cross-validated, and confirmed for Zig-zag-Test (R of 0.71 and 0.72 for the validation and cross-validation subsample, respectively). Anthropometrics were not strongly related to agility performance, but leg length is found to be negatively associated with performance in basketball-specific agility. Power capacities are confirmed to be an important factor in agility. The results highlighted the importance of sport-specific tests when studying pre-planned agility performance in basketball. The improvement in power capacities will probably result in an improvement in agility in basketball athletes, while anthropometric indices should be used in order to identify those athletes who can achieve superior agility performance.
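
    The Bland Altman comparison of observed and predicted scores mentioned in this record is commonly summarized by a mean difference (bias) and 95% limits of agreement. The sketch below shows that calculation under stated assumptions: the function name and the example times are hypothetical, and the abstract does not say exactly how the authors computed their graphics.

```python
from statistics import mean, stdev


def bland_altman(observed, predicted):
    """Return (bias, lower limit, upper limit) for paired measurements.

    Limits of agreement are bias +/- 1.96 * SD of the paired differences.
    """
    diffs = [o - p for o, p in zip(observed, predicted)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical agility times in seconds (not data from the study).
observed = [10.0, 11.0, 12.0]
predicted = [9.0, 11.0, 13.0]
bias, lower, upper = bland_altman(observed, predicted)
print(bias, lower, upper)
```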

  15. Development, construct validity and test-retest reliability of a field-based wheelchair mobility performance test for wheelchair basketball.

    PubMed

    de Witte, Annemarie M H; Hoozemans, Marco J M; Berger, Monique A M; van der Slikke, Rienk M A; van der Woude, Lucas H V; Veeger, Dirkjan H E J

    2018-01-01

    The aim of this study was to develop and describe a wheelchair mobility performance test in wheelchair basketball and to assess its construct validity and reliability. To mimic mobility performance of wheelchair basketball matches in a standardised manner, a test was designed based on observation of wheelchair basketball matches and expert judgement. Forty-six players performed the test to determine its validity and 23 players performed the test twice for reliability. Independent-samples t-tests were used to assess whether the times needed to complete the test were different for classifications, playing standards and sex. Intraclass correlation coefficients (ICC) were calculated to quantify reliability of performance times. Males performed better than females (P < 0.001, effect size [ES] = -1.26) and international men performed better than national men (P < 0.001, ES = -1.62). Performance time of low (≤2.5) and high (≥3.0) classification players was borderline not significant with a moderate ES (P = 0.06, ES = 0.58). The reliability was excellent for overall performance time (ICC = 0.95). These results show that the test can be used as a standardised mobility performance test to validly and reliably assess the capacity in mobility performance of elite wheelchair basketball athletes. Furthermore, the described methodology of development is recommended for use in other sports to develop sport-specific tests.
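
    The effect sizes (ES) reported alongside the independent-samples t-tests in this record can be illustrated with Cohen's d using a pooled standard deviation. This is a sketch under an assumption: the abstract does not state which effect size formula was used, and the function name and example values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(a, b):
    """Cohen's d for two independent groups, using the pooled sample SD."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)


# Hypothetical completion times in seconds; a negative d means group a was faster.
d = cohens_d([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
print(d)
```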

  16. Characterizing the GOES-R (GOES-16) Geostationary Lightning Mapper (GLM) On-Orbit Performance

    NASA Technical Reports Server (NTRS)

    Rudlosky, Scott D.; Goodman, Steven J.; Koshak, William J.; Blakeslee, Richard J.; Buechler, Dennis E.; Mach, Douglas M.; Bateman, Monte

    2017-01-01

    Two overlapping efforts help to characterize the GLM performance, the Post Launch Test (PLT) phase to validate the predicted pre-launch instrument performance and the Post Launch Product Test (PLPT) phase to validate the lightning detection product used in forecast and warning decision-making. This paper documents the calibration and validation plans and activities for the first 6 months of GLM on-orbit testing and validation commencing with first light on 4 January 2017. The PLT phase addresses image quality, on-orbit calibration, RTEP threshold tuning, image navigation, noise filtering, and solar intrusion assessment, resulting in a GLM calibration parameter file. The PLPT includes four main activities, the Reference Data Comparisons (RDC), Algorithm Testing (AT), Instrument Navigation and Registration Testing (INRT), and Long Term Baseline Testing (LTBT). Field campaigns are also designed to contribute valuable insights into the GLM performance capabilities. The PLPT tests each contribute to the beta, provisional, and fully validated GLM data.

  17. Validation of alternative methods for toxicity testing.

    PubMed Central

    Bruner, L H; Carr, G J; Curren, R D; Chamberlain, M

    1998-01-01

    Before nonanimal toxicity tests may be officially accepted by regulatory agencies, it is generally agreed that the validity of the new methods must be demonstrated in an independent, scientifically sound validation program. Validation has been defined as the demonstration of the reliability and relevance of a test method for a particular purpose. This paper provides a brief review of the development of the theoretical aspects of the validation process and updates current thinking about objectively testing the performance of an alternative method in a validation study. Validation of alternative methods for eye irritation testing is a specific example illustrating important concepts. Although discussion focuses on the validation of alternative methods intended to replace current in vivo toxicity tests, the procedures can be used to assess the performance of alternative methods intended for other uses. PMID:9599695

  18. The Reliability and Validity of Protocols for the Assessment of Endurance Sports Performance: An Updated Review

    ERIC Educational Resources Information Center

    Stevens, Christopher John; Dascombe, Ben James

    2015-01-01

    Sports performance testing is one of the most common and important measures used in sport science. Performance testing protocols must have high reliability to ensure any changes are not due to measurement error or inter-individual differences. High validity is also important to ensure test performance reflects true performance. Time-trial…

  19. Further examination of embedded performance validity indicators for the Conners' Continuous Performance Test and Brief Test of Attention in a large outpatient clinical sample.

    PubMed

    Sharland, Michael J; Waring, Stephen C; Johnson, Brian P; Taran, Allise M; Rusin, Travis A; Pattock, Andrew M; Palcher, Jeanette A

    2018-01-01

    Assessing test performance validity is a standard clinical practice and although studies have examined the utility of cognitive/memory measures, few have examined attention measures as indicators of performance validity beyond the Reliable Digit Span. The current study further investigates the classification probability of embedded Performance Validity Tests (PVTs) within the Brief Test of Attention (BTA) and the Conners' Continuous Performance Test (CPT-II), in a large clinical sample. This was a retrospective study of 615 patients consecutively referred for comprehensive outpatient neuropsychological evaluation. Non-credible performance was defined two ways: failure on one or more PVTs and failure on two or more PVTs. Classification probability of the BTA and CPT-II into non-credible groups was assessed. Sensitivity, specificity, positive predictive value, and negative predictive value were derived to identify clinically relevant cut-off scores. When using failure on two or more PVTs as the indicator for non-credible responding compared to failure on one or more PVTs, highest classification probability, or area under the curve (AUC), was achieved by the BTA (AUC = .87 vs. .79). CPT-II Omission, Commission, and Total Errors exhibited higher classification probability as well. Overall, these findings corroborate previous findings, extending them to a large clinical sample. BTA and CPT-II are useful embedded performance validity indicators within a clinical battery but should not be used in isolation without other performance validity indicators.
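
    The classification statistics named in this record (sensitivity, specificity, positive and negative predictive value, and AUC) follow directly from a 2x2 table and from pairwise score comparisons. The sketch below is illustrative only: the function names, the counts, and the scores are hypothetical, and the AUC helper assumes that higher scores indicate non-credible performance.

```python
def classification_stats(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard diagnostic accuracy statistics from 2x2 confusion counts."""
    return {
        "sensitivity": tp / (tp + fn),  # non-credible cases correctly flagged
        "specificity": tn / (tn + fp),  # credible cases correctly passed
        "ppv": tp / (tp + fp),          # flagged cases that are truly non-credible
        "npv": tn / (tn + fn),          # passed cases that are truly credible
    }


def auc(pos_scores, neg_scores) -> float:
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos_scores
        for n in neg_scores
    )
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical counts and scores (not data from the study).
stats = classification_stats(tp=40, fp=10, tn=90, fn=20)
print(stats)
print(auc([3, 4], [1, 2]))
```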

  20. Development and Validation of Targeted Next-Generation Sequencing Panels for Detection of Germline Variants in Inherited Diseases.

    PubMed

    Santani, Avni; Murrell, Jill; Funke, Birgit; Yu, Zhenming; Hegde, Madhuri; Mao, Rong; Ferreira-Gonzalez, Andrea; Voelkerding, Karl V; Weck, Karen E

    2017-06-01

    The number of targeted next-generation sequencing (NGS) panels for genetic diseases offered by clinical laboratories is rapidly increasing. Before an NGS-based test is implemented in a clinical laboratory, appropriate validation studies are needed to determine the performance characteristics of the test. To provide examples of assay design and validation of targeted NGS gene panels for the detection of germline variants associated with inherited disorders. The approaches used by 2 clinical laboratories for the development and validation of targeted NGS gene panels are described. Important design and validation considerations are examined. Clinical laboratories must validate performance specifications of each test prior to implementation. Test design specifications and validation data are provided, outlining important steps in validation of targeted NGS panels by clinical diagnostic laboratories.

  1. Performance validation of the ANSER control laws for the F-18 HARV

    NASA Technical Reports Server (NTRS)

    Messina, Michael D.

    1995-01-01

    The ANSER control laws were implemented in Ada by NASA Dryden for flight test on the High Alpha Research Vehicle (HARV). The Ada implementation was tested in the hardware-in-the-loop (HIL) simulation, and results were compared to those obtained with the NASA Langley batch Fortran implementation of the control laws which are considered the 'truth model.' This report documents the performance validation test results between these implementations. This report contains the ANSER performance validation test plan, HIL versus batch time-history comparisons, simulation scripts used to generate checkcases, and detailed analysis of discrepancies discovered during testing.
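
    A time-history comparison between a reference ("truth model") output and a second implementation, as described in this record, can be sketched as a sample-by-sample tolerance check. Everything in the example is an assumption for illustration: the function name, the tolerance, and the sample values are hypothetical and do not come from the report.

```python
def compare_histories(reference, candidate, tol):
    """Return (index, reference, candidate) for samples that differ by more than tol."""
    if len(reference) != len(candidate):
        raise ValueError("time histories must have the same number of samples")
    return [
        (i, r, c)
        for i, (r, c) in enumerate(zip(reference, candidate))
        if abs(r - c) > tol
    ]


# Hypothetical signal samples: batch (reference) versus HIL (candidate).
batch = [0.0, 1.0, 2.0]
hil = [0.0, 1.0005, 2.1]
discrepancies = compare_histories(batch, hil, tol=1e-3)
print(discrepancies)
```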

  2. Performance validation of the ANSER Control Laws for the F-18 HARV

    NASA Technical Reports Server (NTRS)

    Messina, Michael D.

    1995-01-01

    The ANSER control laws were implemented in Ada by NASA Dryden for flight test on the High Alpha Research Vehicle (HARV). The Ada implementation was tested in the hardware-in-the-loop (HIL) simulation, and results were compared to those obtained with the NASA Langley batch Fortran implementation of the control laws which are considered the 'truth model'. This report documents the performance validation test results between these implementations. This report contains the ANSER performance validation test plan, HIL versus batch time-history comparisons, simulation scripts used to generate checkcases, and detailed analysis of discrepancies discovered during testing.

  3. An Efficient Data Partitioning to Improve Classification Performance While Keeping Parameters Interpretable.

    PubMed

    Korjus, Kristjan; Hebart, Martin N; Vicente, Raul

    2016-01-01

    Supervised machine learning methods typically require splitting data into multiple chunks for training, validating, and finally testing classifiers. For finding the best parameters of a classifier, training and validation are usually carried out with cross-validation. This is followed by application of the classifier with optimized parameters to a separate test set for estimating the classifier's generalization performance. With limited data, this separation of test data creates a difficult trade-off between having more statistical power in estimating generalization performance versus choosing better parameters and fitting a better model. We propose a novel approach that we term "Cross-validation and cross-testing" improving this trade-off by re-using test data without biasing classifier performance. The novel approach is validated using simulated data and electrophysiological recordings in humans and rodents. The results demonstrate that the approach has a higher probability of discovering significant results than the standard approach of cross-validation and testing, while maintaining the nominal alpha level. In contrast to nested cross-validation, which is maximally efficient in re-using data, the proposed approach additionally maintains the interpretability of individual parameters. Taken together, we suggest an addition to currently used machine learning approaches which may be particularly useful in cases where model weights do not require interpretation, but parameters do.
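
    The standard workflow this record describes (choose parameters by cross-validation, then estimate generalization performance on a separate test set) can be sketched with the standard library alone. This is not the authors' "cross-validation and cross-testing" method. The 1-D nearest-neighbour classifier, the synthetic data, the candidate values of k, and all function names are assumptions for illustration.

```python
import random


def knn_predict(train, x, k):
    """Majority label among the k training points nearest to x (1-D features)."""
    nearest = sorted(train, key=lambda pt: abs(pt[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if votes * 2 > k else 0


def accuracy(train, test, k):
    """Fraction of test points whose predicted label matches the true label."""
    return sum(knn_predict(train, x, k) == y for x, y in test) / len(test)


def cross_val_accuracy(data, k, folds=5):
    """Mean validation accuracy over interleaved folds."""
    scores = []
    for i in range(folds):
        val = data[i::folds]
        train = [pt for j, pt in enumerate(data) if j % folds != i]
        scores.append(accuracy(train, val, k))
    return sum(scores) / folds


rng = random.Random(0)
# Synthetic data: class 0 centred at 0.0, class 1 centred at 2.0.
data = [(rng.gauss(2.0 * label, 1.0), label) for label in [0, 1] * 100]
rng.shuffle(data)
dev, test = data[:150], data[150:]  # the test set is never used for tuning

best_k = max([1, 3, 5, 7, 9], key=lambda k: cross_val_accuracy(dev, k))
test_acc = accuracy(dev, test, best_k)
print(best_k, round(test_acc, 3))
```

    Holding out 50 of 200 points illustrates the trade-off the abstract mentions: those points sharpen the performance estimate but are unavailable for choosing k or fitting the model.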

  4. Meta-Analysis of Integrity Tests: A Critical Examination of Validity Generalization and Moderator Variables.

    DTIC Science & Technology

    1992-06-01

    predicting both job performance and counterproductive behaviors on the job such as theft, disciplinary problems, and absenteeism. Validities were found to ... be generalizable. The estimated mean operational predictive validity of integrity tests for supervisory ratings of job performance is .41. For the ... (Performing organization: University of Iowa; report number 92-1.)

  5. Use of the color trails test as an embedded measure of performance validity.

    PubMed

    Henry, George K; Algina, James

    2013-01-01

    One hundred personal injury litigants and disability claimants referred for a forensic neuropsychological evaluation were administered both portions of the Color Trails Test (CTT) as part of a more comprehensive battery of standardized tests. Subjects who failed two or more free-standing tests of cognitive performance validity formed the Failed Performance Validity (FPV) group, while subjects who passed all free-standing performance validity measures were assigned to the Passed Performance Validity (PPV) group. A cutscore of ≥45 seconds to complete Color Trails 1 (CT1) was associated with a classification accuracy of 78%, good sensitivity (66%) and high specificity (90%), while a cutscore of ≥84 seconds to complete Color Trails 2 (CT2) was associated with a classification accuracy of 82%, good sensitivity (74%) and high specificity (90%). A CT1 cutscore of ≥58 seconds and a CT2 cutscore of ≥100 seconds were associated with 100% positive predictive power at base rates from 20 to 50%.
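
    The dependence of positive predictive power on base rate in this record follows from Bayes' rule. The sketch below uses the CT1 sensitivity and specificity reported above; the function names are hypothetical, and the equal group sizes used in the accuracy line are an assumption, since the abstract does not state the group sizes.

```python
def ppv(sensitivity: float, specificity: float, base_rate: float) -> float:
    """Positive predictive value at a given base rate of invalid performance."""
    true_pos = sensitivity * base_rate
    false_pos = (1 - specificity) * (1 - base_rate)
    return true_pos / (true_pos + false_pos)


def accuracy(sensitivity: float, specificity: float, base_rate: float) -> float:
    """Overall classification accuracy at a given base rate."""
    return sensitivity * base_rate + specificity * (1 - base_rate)


# CT1 cutscore of >=45 seconds: sensitivity 0.66, specificity 0.90 (from the abstract).
for base_rate in (0.2, 0.3, 0.4, 0.5):
    print(base_rate, round(ppv(0.66, 0.90, base_rate), 2))

# With equal-sized groups (an assumption), accuracy is the mean of the two rates.
print(round(accuracy(0.66, 0.90, 0.5), 2))
```

    Because false positives enter the denominator, positive predictive value reaches 100% only when specificity is 100%, in which case it no longer depends on the base rate.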

  6. Development and validation of a new questionnaire for the assessment of subjective physical performance in adult patients with haemophilia--the HEP-Test-Q.

    PubMed

    von Mackensen, S; Czepa, D; Herbsleb, M; Hilberg, T

    2010-01-01

    Specific research studies for the investigation of physical performance in haemophilic patients are rare. However, these instruments become increasingly more important to evaluate therapeutic treatments. Within the frame of the Haemophilia & Exercise Project (HEP), a new questionnaire, namely HEP-Test-Q, has been developed for the assessment of subjective physical performance in haemophilic adults. In this article, the development and validation of the HEP-Test-Q is described. The development consisted of different phases including item collection, pilot testing and field testing. The preliminary version was pilot-tested in 24 German HEP-participants. Following evaluation and preliminary psychometric analysis, the HEP-Test-Q was revised. The final version consists of 25 items pertaining to the domains 'mobility', 'strength & coordination', 'endurance' and 'body perception', which was administered to 43 German haemophilic patients (43.8 +/- 11.2 years). Psychometric analysis included reliability and validity testing. Convergent validity was tested correlating the HEP-Test-Q with SF-36, Haem-A-QoL, HAL and the Orthopaedic Joint Score. Discriminant validity tested different clinical subgroups. Patients accepted the questionnaire and found it easy to fill in. Psychometric testing revealed good values for reliability in terms of internal consistency (Cronbach's alpha = 0.96) and test-retest reliability (r = 0.90) as well as for convergent validity correlating highly with Haem-A-QoL, HAL and SF-36. Discriminant validity testing showed significant differences for age, hepatitis A and hepatitis B and the number of target joints. HEP-Test-Q is a short and well-accepted questionnaire, assessing subjective physical performance of haemophiliacs, which might be combined with objective assessments to reveal aspects, which cannot be measured objectively, such as body perception.
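
    Internal consistency of the kind reported above (Cronbach's alpha) can be computed from item-level responses. The sketch below uses the standard formula, alpha = k/(k-1) × (1 − Σ item variances / variance of total scores). The function name and the response matrix are hypothetical and are not data from the HEP-Test-Q study.

```python
from statistics import variance


def cronbach_alpha(item_scores: list[list[float]]) -> float:
    """Cronbach's alpha; item_scores[i][j] is respondent i's score on item j."""
    n_items = len(item_scores[0])
    if n_items < 2 or len(item_scores) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in item_scores]) for j in range(n_items)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in item_scores])
    return (n_items / (n_items - 1)) * (1.0 - sum(item_vars) / total_var)


# Hypothetical responses: five respondents, four items on a 1-5 scale.
responses = [
    [4, 5, 4, 4],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [3, 3, 3, 4],
    [1, 2, 1, 2],
]
print(f"alpha = {cronbach_alpha(responses):.2f}")
```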

  7. Alphabus Mechanical Validation Plan and Test Campaign

    NASA Astrophysics Data System (ADS)

    Calvisi, G.; Bonnet, D.; Belliol, P.; Lodereau, P.; Redoundo, R.

    2012-07-01

    A joint team of the two leading European satellite companies (Astrium and Thales Alenia Space) worked with the support of ESA and CNES to define a product line able to efficiently address the upper segment of communications satellites: Alphabus. Starting in 2009 and up to 2011, the mechanical validation of the Alphabus platform was obtained through static tests performed on a dedicated static model and environmental tests performed on the first satellite based on Alphabus: Alphasat I-XL. The mechanical validation of the Alphabus platform presented an excellent opportunity to improve the validation and qualification process with respect to static, sine vibration, acoustic and L/V shock environments, minimizing the recurrent cost of manufacturing, integration and testing. A main driver of mechanical testing is that mechanical acceptance testing at satellite level will be performed with empty tanks due to technical constraints (limitation of existing vibration devices) and programmatic advantages (test risk reduction, test schedule minimization). In this paper the impacts that such testing logic has on the validation plan are briefly recalled and its actual application to the Alphasat PFM mechanical test campaign is detailed.

  8. An Efficient Data Partitioning to Improve Classification Performance While Keeping Parameters Interpretable

    PubMed Central

    Korjus, Kristjan; Hebart, Martin N.; Vicente, Raul

    2016-01-01

    Supervised machine learning methods typically require splitting data into multiple chunks for training, validating, and finally testing classifiers. For finding the best parameters of a classifier, training and validation are usually carried out with cross-validation. This is followed by application of the classifier with optimized parameters to a separate test set for estimating the classifier’s generalization performance. With limited data, this separation of test data creates a difficult trade-off between having more statistical power in estimating generalization performance versus choosing better parameters and fitting a better model. We propose a novel approach that we term “Cross-validation and cross-testing” improving this trade-off by re-using test data without biasing classifier performance. The novel approach is validated using simulated data and electrophysiological recordings in humans and rodents. The results demonstrate that the approach has a higher probability of discovering significant results than the standard approach of cross-validation and testing, while maintaining the nominal alpha level. In contrast to nested cross-validation, which is maximally efficient in re-using data, the proposed approach additionally maintains the interpretability of individual parameters. Taken together, we suggest an addition to currently used machine learning approaches which may be particularly useful in cases where model weights do not require interpretation, but parameters do. PMID:27564393
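
    The standard partitioning that the abstract takes as its baseline (a held-out test set plus cross-validation folds on the remaining data) might look like the sketch below. It shows only that conventional scheme, not the authors' "Cross-validation and cross-testing" procedure, whose details the abstract does not give. The function name and default values are hypothetical.

```python
import random


def holdout_and_folds(n_samples, test_fraction=0.2, k=5, seed=0):
    """Return (test_idx, folds); folds is a list of (train_idx, val_idx) pairs."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)

    # Samples reserved for the final estimate of generalization performance.
    n_test = int(round(n_samples * test_fraction))
    test_idx, dev_idx = indices[:n_test], indices[n_test:]

    # k-fold cross-validation on the remaining samples, used for parameter selection.
    folds = []
    for i in range(k):
        val_idx = dev_idx[i::k]  # every k-th remaining sample forms fold i
        val_set = set(val_idx)
        train_idx = [j for j in dev_idx if j not in val_set]
        folds.append((train_idx, val_idx))
    return test_idx, folds


test_idx, folds = holdout_and_folds(50)
print(len(test_idx), [len(val) for _, val in folds])
```

    The trade-off described in the abstract is visible in `test_fraction`: a larger test set leaves fewer samples for choosing parameters and fitting the model.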

  9. System performance testing of the DSN radio science system, Mark 3-78

    NASA Technical Reports Server (NTRS)

    Berman, A. L.; Mehta, J. S.

    1978-01-01

    System performance tests are required to evaluate system performance following initial system implementation and subsequent modification, and to validate system performance prior to actual operational usage. Non-real-time end-to-end Radio Science system performance tests are described that are based on the comparison of open-loop radio science data to equivalent closed-loop radio metric data, as well as an abbreviated Radio Science real-time system performance test that validates critical Radio Science System elements at the Deep Space Station prior to actual operational usage.

  10. 15 CFR 995.27 - Format validation software testing.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... 15 Commerce and Foreign Trade 3 2010-01-01 2010-01-01 false Format validation software testing... CERTIFICATION REQUIREMENTS FOR NOAA HYDROGRAPHIC PRODUCTS AND SERVICES CERTIFICATION REQUIREMENTS FOR... of NOAA ENC Products § 995.27 Format validation software testing. Tests shall be performed verifying...

  11. Reliability, validity and description of timed performance of the Jebsen-Taylor Test in patients with muscular dystrophies.

    PubMed

    Artilheiro, Mariana Cunha; Fávero, Francis Meire; Caromano, Fátima Aparecida; Oliveira, Acary de Souza Bulle; Carvas, Nelson; Voos, Mariana Callil; Sá, Cristina Dos Santos Cardoso de

    2017-12-08

    The Jebsen-Taylor Test evaluates upper limb function by measuring timed performance on everyday activities. The test is used to assess and monitor the progression of patients with Parkinson disease, cerebral palsy, stroke and brain injury. This study aimed to analyze the reliability, internal consistency and validity of the Jebsen-Taylor Test in people with Muscular Dystrophy and to describe and classify upper limb timed performance of people with Muscular Dystrophy. Fifty patients with Muscular Dystrophy were assessed. Non-dominant and dominant upper limb performances on the Jebsen-Taylor Test were filmed. Two raters evaluated timed performance for inter-rater reliability analysis. Test-retest reliability was investigated by using intraclass correlation coefficients. Internal consistency was assessed using the Cronbach alpha. Construct validity was assessed by comparing the Jebsen-Taylor Test with the Performance of Upper Limb. The internal consistency of the Jebsen-Taylor Test was good (Cronbach's α=0.98). Inter-rater reliability was very high (0.903-0.999), except for writing, with an intraclass correlation coefficient of 0.772-1.000. Strong correlations between the Jebsen-Taylor Test and the Performance of Upper Limb Module were found (rho=-0.712). The Jebsen-Taylor Test is a reliable and valid measure of timed performance for people with Muscular Dystrophy. Copyright © 2017 Associação Brasileira de Pesquisa e Pós-Graduação em Fisioterapia. Published by Elsevier Editora Ltda. All rights reserved.

  12. Five-Kilometers Time Trial: Preliminary Validation of a Short Test for Cycling Performance Evaluation.

    PubMed

    Dantas, Jose Luiz; Pereira, Gleber; Nakamura, Fabio Yuzo

    2015-09-01

    The five-kilometer time trial (TT5km) has been used to assess aerobic endurance performance without further investigation of its validity. This study aimed to perform a preliminary validation of the TT5km to rank well-trained cyclists based on aerobic endurance fitness and assess changes in aerobic endurance performance. After the incremental test, 20 cyclists (age = 31.3 ± 7.9 years; body mass index = 22.7 ± 1.5 kg/m²; maximal aerobic power = 360.5 ± 49.5 W) performed the TT5km twice, during which performance (time to complete, absolute and relative power output, average speed) and physiological responses (heart rate and electromyography activity) were collected. The validation criteria were pacing strategy, absolute and relative reliability, validity, and sensitivity. Sensitivity index was obtained from the ratio between the smallest worthwhile change and typical error. The TT5km showed high absolute (coefficient of variation < 3%) and relative (intraclass correlation coefficient > 0.95) reliability of performance variables, whereas it presented low reliability of physiological responses. The TT5km performance variables were highly correlated with the aerobic endurance indices obtained from the incremental test (r > 0.70). These variables showed an adequate sensitivity index (> 1). TT5km is a valid test to rank the aerobic endurance fitness of well-trained cyclists and to differentiate changes in aerobic endurance performance. Coaches can detect performance changes through either absolute (± 17.7 W) or relative power output (± 0.3 W·kg⁻¹), the time to complete the test (± 13.4 s) and the average speed (± 1.0 km·h⁻¹). Furthermore, TT5km performance can also be used to rank the athletes according to their aerobic endurance fitness.
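
    The sensitivity index described in this record (smallest worthwhile change divided by typical error) can be illustrated with a short calculation. The abstract does not say how either quantity was derived, so the sketch assumes two common conventions: typical error as the standard deviation of test-retest differences divided by √2, and smallest worthwhile change as 0.2 × the between-subject standard deviation. The function names and power outputs are hypothetical.

```python
from math import sqrt
from statistics import stdev


def typical_error(trial1, trial2):
    """Assumed convention: SD of test-retest differences divided by sqrt(2)."""
    diffs = [b - a for a, b in zip(trial1, trial2)]
    return stdev(diffs) / sqrt(2)


def smallest_worthwhile_change(trial1, trial2, factor=0.2):
    """Assumed convention: 0.2 x between-subject SD of each subject's mean score."""
    means = [(a + b) / 2 for a, b in zip(trial1, trial2)]
    return factor * stdev(means)


def sensitivity_index(trial1, trial2, factor=0.2):
    """Ratio of smallest worthwhile change to typical error; > 1 suggests adequate sensitivity."""
    return smallest_worthwhile_change(trial1, trial2, factor) / typical_error(trial1, trial2)


# Hypothetical mean power outputs (W) for four cyclists over two trials.
trial1 = [300.0, 340.0, 380.0, 420.0]
trial2 = [302.0, 338.0, 382.0, 418.0]
print(f"sensitivity index = {sensitivity_index(trial1, trial2):.2f}")
```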

  13. Validation of the Narrowing Beam Walking Test in Lower Limb Prosthesis Users.

    PubMed

    Sawers, Andrew; Hafner, Brian

    2018-04-11

    To evaluate the content, construct, and discriminant validity of the Narrowing Beam Walking Test (NBWT), a performance-based balance test for lower limb prosthesis users. Cross-sectional study. Research laboratory and prosthetics clinic. Unilateral transtibial and transfemoral prosthesis users (N=40). Not applicable. Content validity was examined by quantifying the percentage of participants receiving maximum or minimum scores (ie, ceiling and floor effects). Convergent construct validity was examined using correlations between participants' NBWT scores and scores or times on existing clinical balance tests regularly administered to lower limb prosthesis users. Known-groups construct validity was examined by comparing NBWT scores between groups of participants with different fall histories, amputation levels, amputation etiologies, and functional levels. Discriminant validity was evaluated by analyzing the area under each test's receiver operating characteristic (ROC) curve. No minimum or maximum scores were recorded on the NBWT. NBWT scores demonstrated strong correlations (ρ=.70‒.85) with scores/times on performance-based balance tests (timed Up and Go test, Four Square Step Test, and Berg Balance Scale) and a moderate correlation (ρ=.49) with the self-report Activities-specific Balance Confidence scale. NBWT performance was significantly lower among participants with a history of falls (P=.003), transfemoral amputation (P=.011), and a lower mobility level (P<.001). The NBWT also had the largest area under the ROC curve (.81) and was the only test to exhibit an area that was statistically significantly >.50 (ie, chance). The results provide strong evidence of content, construct, and discriminant validity for the NBWT as a performance-based test of balance ability. The evidence supports its use to assess balance impairments and fall risk in unilateral transtibial and transfemoral prosthesis users. Copyright © 2018 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.
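
    The area under the ROC curve reported in this record can be read as the probability that a randomly chosen member of one group outscores a randomly chosen member of the other, with .50 corresponding to chance. A minimal sketch of that pairwise formulation follows. The function name and scores are hypothetical and are not NBWT data.

```python
def roc_auc(higher_group, lower_group):
    """Share of cross-group pairs in which the higher_group member scores higher; ties count 0.5."""
    wins = 0.0
    for h in higher_group:
        for l in lower_group:
            if h > l:
                wins += 1.0
            elif h == l:
                wins += 0.5
    return wins / (len(higher_group) * len(lower_group))


# Hypothetical balance scores: non-fallers are expected to score higher than fallers.
non_fallers = [0.62, 0.55, 0.71, 0.48, 0.66]
fallers = [0.41, 0.52, 0.35, 0.58]
print(f"AUC = {roc_auc(non_fallers, fallers):.2f}")
```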

  14. Diagnostic validity of physical examination tests for common knee disorders: An overview of systematic reviews and meta-analysis.

    PubMed

    Décary, Simon; Ouellet, Philippe; Vendittoli, Pascal-André; Roy, Jean-Sébastien; Desmeules, François

    2017-01-01

    More evidence on diagnostic validity of physical examination tests for knee disorders is needed to lower frequently used and costly imaging tests. To conduct a systematic review of systematic reviews (SR) and meta-analyses (MA) evaluating the diagnostic validity of physical examination tests for knee disorders. A structured literature search was conducted in five databases until January 2016. Methodological quality was assessed using the AMSTAR. Seventeen reviews were included with mean AMSTAR score of 5.5 ± 2.3. Based on six SR, only the Lachman test for ACL injuries is diagnostically valid when individually performed (Likelihood ratio (LR+):10.2, LR-:0.2). Based on two SR, the Ottawa Knee Rule is a valid screening tool for knee fractures (LR-:0.05). Based on one SR, the EULAR criteria had a post-test probability of 99% for the diagnosis of knee osteoarthritis. Based on two SR, a complete physical examination performed by a trained health provider was found to be diagnostically valid for ACL, PCL and meniscal injuries as well as for cartilage lesions. When individually performed, common physical tests are rarely able to rule in or rule out a specific knee disorder, except the Lachman for ACL injuries. There is low-quality evidence concerning the validity of combining history elements and physical tests. Copyright © 2016 Elsevier Ltd. All rights reserved.
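
    Likelihood ratios such as those reported above translate a pre-test probability into a post-test probability through odds: post-test odds = pre-test odds × LR. The sketch below applies that standard conversion to the Lachman test values from the abstract (LR+ 10.2, LR- 0.2). The 30% pre-test probability and the function name are hypothetical and chosen only for illustration.

```python
def post_test_probability(pre_test_probability: float, likelihood_ratio: float) -> float:
    """Convert a pre-test probability to a post-test probability via odds."""
    pre_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


# Hypothetical pre-test probability of an ACL injury (not from the abstract).
pre_test = 0.30

# Likelihood ratios reported in the abstract for the Lachman test.
print(f"positive test: {post_test_probability(pre_test, 10.2):.2f}")
print(f"negative test: {post_test_probability(pre_test, 0.2):.2f}")
```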

  15. Establishing the reliability and concurrent validity of physical performance tests using virtual reality equipment for community-dwelling healthy elders.

    PubMed

    Griswold, David; Rockwell, Kyle; Killa, Carri; Maurer, Michael; Landgraff, Nancy; Learman, Ken

    2015-01-01

    The aim of this study was to determine the reliability and concurrent validity of commonly used physical performance tests using the OmniVR Virtual Rehabilitation System for healthy community-dwelling elders. Participants (N = 40) were recruited by the authors and were screened for eligibility. The initial method of measurement was randomized to either virtual reality (VR) or clinically based measures (CM). Physical performance tests included the five times sit to stand (5 × STS), Timed Up and Go (TUG), Forward Functional Reach (FFR) and 30-s stand test. A random number generator determined the testing order. The test-retest reliability for the VR and CM was determined. Furthermore, concurrent validity was determined using a Pearson product moment correlation (Pearson r). The VR demonstrated excellent reliability for 5 × STS intraclass correlation coefficient (ICC) = 0.931(3,1), FFR ICC = 0.846(3,1) and the TUG ICC = 0.944(3,1). The concurrent validity data for the VR and CM (ICC 3, k) were moderate for FFR ICC = 0.682, excellent for 5 × STS ICC = 0.889 and excellent for the TUG ICC = 0.878. The concurrent validity of the 30-s stand test was good ICC = 0.735(3,1). This study supports the use of VR equipment for measuring physical performance tests in the clinic for healthy community-dwelling elders. Virtual reality equipment is not only used to treat balance impairments but it is also used to measure and determine physical impairments through the use of physical performance tests. Virtual reality equipment is a reliable and valid tool for collecting physical performance data for the 5 × STS, FFR, TUG and 30-s stand test for healthy community-dwelling elders.

  16. Validity of the Optometry Admission Test in Predicting Performance in Schools and Colleges of Optometry.

    ERIC Educational Resources Information Center

    Kramer, Gene A.; Johnston, JoElle

    1997-01-01

    A study examined the relationship between Optometry Admission Test scores and pre-optometry or undergraduate grade point average (GPA) with first and second year performance in optometry schools. The test's predictive validity was limited but significant, and comparable to those reported for other admission tests. In addition, the scores…

  17. Noncredible cognitive performance at clinical evaluation of adult ADHD: An embedded validity indicator in a visuospatial working memory test.

    PubMed

    Fuermaier, Anselm B M; Tucha, Oliver; Koerts, Janneke; Lange, Klaus W; Weisbrod, Matthias; Aschenbrenner, Steffen; Tucha, Lara

    2017-12-01

    The assessment of performance validity is an essential part of the neuropsychological evaluation of adults with attention-deficit/hyperactivity disorder (ADHD). Most available tools, however, are inaccurate regarding the identification of noncredible performance. This study describes the development of a visuospatial working memory test, including a validity indicator for noncredible cognitive performance of adults with ADHD. Visuospatial working memory of adults with ADHD (n = 48) was first compared to the test performance of healthy individuals (n = 48). Furthermore, a simulation design was performed including 252 individuals who were randomly assigned to either a control group (n = 48) or to 1 of 3 simulation groups who were requested to feign ADHD (n = 204). Additional samples of 27 adults with ADHD and 69 instructed simulators were included to cross-validate findings from the first samples. Adults with ADHD showed impaired visuospatial working memory performance of medium size as compared to healthy individuals. Simulation groups committed significantly more errors and had shorter response times as compared to patients with ADHD. Moreover, binary logistic regression analysis was carried out to derive a validity index that optimally differentiates between true and feigned ADHD. ROC analysis demonstrated high classification rates of the validity index, as shown in excellent specificity (95.8%) and adequate sensitivity (60.3%). The visuospatial working memory test as presented in this study therefore appears sensitive in indicating cognitive impairment of adults with ADHD. Furthermore, the embedded validity index revealed promising results concerning the detection of noncredible cognitive performance of adults with ADHD. (PsycINFO Database Record (c) 2017 APA, all rights reserved).

  18. Further Validation of the Conner's Adult Attention Deficit/Hyperactivity Rating Scale Infrequency Index (CII) for Detection of Non-Credible Report of Attention Deficit/Hyperactivity Disorder Symptoms.

    PubMed

    Cook, Carolyn M; Bolinger, Elizabeth; Suhr, Julie

    2016-06-01

    Attention deficit/hyperactivity disorder (ADHD) can be easily presented in a non-credible manner, through non-credible report of ADHD symptoms and/or by non-credible performance on neuropsychological tests. While most studies have focused on detection of non-credible performance using performance validity tests, there are few studies examining the ability to detect non-credible report of ADHD symptoms. We provide further validation data for a recently developed measure of non-credible ADHD symptom report, the Conner's Adult ADHD Rating Scales (CAARS) Infrequency Index (CII). Using archival data from 86 adults referred for concerns about ADHD, we examined the accuracy of the CII in detecting extreme scores on the CAARS and invalid reporting on validity indices of the Minnesota Multiphasic Personality Inventory-2 Restructured Format (MMPI-2-RF). We also examined the accuracy of the CII in detecting non-credible performance on standalone and embedded performance validity tests. The CII was 52% sensitive to extreme scores on CAARS DSM symptom subscales (with 97% specificity) and 20%-36% sensitive to invalid responding on MMPI-2-RF validity scales (with near 90% specificity), providing further evidence for the interpretation of the CII as an indicator of non-credible ADHD symptom report. However, the CII detected only 18% of individuals who failed a standalone performance validity test (Word Memory Test), with 87.8% specificity, and was not accurate in detecting non-credible performance using embedded digit span cutoffs. Future studies should continue to examine how best to assess for non-credible symptom report in ADHD referrals. © The Author 2016. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  19. Development of self and peer performance assessment on iodometric titration experiment

    NASA Astrophysics Data System (ADS)

    Nahadi; Siswaningsih, W.; Kusumaningtyas, H.

    2018-05-01

    This study aims to describe the process of developing a reliable and valid assessment to measure students’ performance on iodometric titration, and the effect of self and peer assessment on students’ performance. The self and peer instrument provides valuable feedback for improving student performance. The developed assessment contains a rubric and a task to facilitate self and peer assessment. The participants were 24 second-grade students at a vocational high school in Bandung. The participants were divided into two groups. The first 12 students were involved in the validity test of the developed assessment, while the remaining 12 students participated in the reliability test. Content validity was evaluated based on expert judgment. The content validity results show that the developed performance assessment instrument was categorized as valid on each task, with reliability classified as very good. Analysis of the impact of implementing self and peer assessment showed that the peer instrument supported the self assessment.

  20. Validation of the breast evaluation questionnaire for breast hypertrophy and breast reduction.

    PubMed

    Lewin, Richard; Elander, Anna; Lundberg, Jonas; Hansson, Emma; Thorarinsson, Andri; Claudelin, Malin; Bladh, Helena; Lidén, Mattias

    2018-06-13

    There is a lack of published, validated questionnaires for evaluating psychosocial morbidity in patients with breast hypertrophy undergoing breast reduction surgery. To validate the breast evaluation questionnaire (BEQ), originally developed for the assessment of breast augmentation patients, for the assessment of psychosocial morbidity in patients with breast hypertrophy undergoing breast reduction surgery. Validation study. Subjects: Women with macromastia. Methods: The validation of the BEQ, adapted to breast reduction, was performed in several steps. Content validity, reliability, construct validity and responsiveness were assessed. The original version was adjusted according to the results for content validity, which resulted in item reduction and a modified BEQ (mBEQ) that was then assessed for reliability, construct validity and responsiveness. Internal and external validation was performed for the modified BEQ. Convergent validity was tested against Breast-Q (reduction) and discriminant validity was tested against the SF-36. Known-groups validation revealed significant differences between the normal population and patients undergoing breast reduction surgery. The BEQ showed good reliability by test-retest analysis and high responsiveness. The modified BEQ may be a reliable, valid and responsive instrument for assessing women who undergo breast reduction.

  1. Isokinetic knee strength qualities as predictors of jumping performance in high-level volleyball athletes: multiple regression approach.

    PubMed

    Sattler, Tine; Sekulic, Damir; Spasic, Miodrag; Osmankac, Nedzad; Vicente João, Paulo; Dervisevic, Edvin; Hadzic, Vedran

    2016-01-01

    Previous investigations noted the potential importance of isokinetic strength in rapid muscular performances, such as jumping. This study aimed to identify the influence of isokinetic-knee-strength on specific jumping performance in volleyball. The secondary aim of the study was to evaluate reliability and validity of the two volleyball-specific jumping tests. The sample comprised 67 female (21.96±3.79 years; 68.26±8.52 kg; 174.43±6.85 cm) and 99 male (23.62±5.27 years; 84.83±10.37 kg; 189.01±7.21 cm) high-level volleyball players who competed in 1st and 2nd National Division. Subjects were randomly divided into validation (N.=55 and 33 for males and females, respectively) and cross-validation subsamples (N.=54 and 34 for males and females, respectively). The set of predictors included isokinetic tests to evaluate the eccentric and concentric strength capacities of the knee extensors and flexors for the dominant and non-dominant leg. The main outcome measure for the isokinetic testing was peak torque (PT), which was later normalized for body mass and expressed as PT/kg. Block-jump and spike-jump performances were measured over three trials, and observed as criteria. Forward stepwise multiple regressions were calculated for validation subsamples and then cross-validated. Cross-validation included correlations between, and t-test differences between, observed and predicted scores, and Bland-Altman plots. Jumping tests were found to be reliable (spike jump: ICC of 0.79 and 0.86; block-jump: ICC of 0.86 and 0.90; for males and females, respectively), and their validity was confirmed by significant t-test differences between 1st vs. 2nd division players. Isokinetic variables were found to be significant predictors of jumping performance in females, but not among males. In females, the isokinetic-knee measures were shown to be stronger and more valid predictors of the block-jump (42% and 64% of the explained variance for validation and cross-validation subsample, respectively) than of the spike-jump (39% and 34% of the explained variance for validation and cross-validation subsample, respectively). Differences between prediction models calculated for males and females are mostly explained by gender-specific biomechanics of jumping. The study defined the importance of knee-isokinetic-strength in volleyball jumping performance in female athletes. Further studies should evaluate the association between ankle-isokinetic-strength and volleyball-specific jumping performances. Results reinforce the need for the cross-validation of prediction models in sport and exercise sciences.

  2. Red flags in the clinical interview may forecast invalid neuropsychological testing.

    PubMed

    Keesler, Michael E; McClung, Kirstie; Meredith-Duliba, Tawny; Williams, Kelli; Swirsky-Sacchetti, Thomas

    2017-04-01

    Evaluating assessment validity is expected in neuropsychological evaluation, particularly in cases with identified secondary gain, where malingering or somatization may be present. Validity is typically assessed with standalone measures and embedded indices, all within the testing portion of the examination; research on the validity of self-report in the clinical interview is limited. Based on experience with litigation-involved examinees recovering from mild traumatic brain injury (mTBI), it was hypothesized that inconsistently reported date of injury (DOI) and/or loss of consciousness (LOC) might predict invalid performance on neurocognitive testing. This archival study examined cases of litigation-involved mTBI patients seen at an outpatient neuropsychological practice in Philadelphia, PA. Coded data included demographic variables, performance validity measures, and consistency between self-report and medicolegal records. A significant relationship was found between the consistency of examinees' self-report with records and their scores on performance validity testing, χ²(1, N = 84) = 24.18, p < .01, Φ = .49. Post hoc testing revealed significant between-group differences in three of four comparisons, with medium to large effect sizes. A final post hoc analysis found a significant relationship between the number of performance validity tests (PVTs) failed and the extent to which an examinee incorrectly reported DOI, r(83) = .49, p < .01. Using inconsistently reported LOC and/or DOI to predict an examinee's performance as invalid had a 75% sensitivity and a 75% specificity. Examinees whose reported DOI or LOC differs from records may be more likely to fail one or more PVTs, suggesting possible symptom exaggeration and/or underperformance on cognitive testing.

  3. Validation and clinical utility of the executive function performance test in persons with traumatic brain injury.

    PubMed

    Baum, C M; Wolf, T J; Wong, A W K; Chen, C H; Walker, K; Young, A C; Carlozzi, N E; Tulsky, D S; Heaton, R K; Heinemann, A W

    2017-07-01

    This study examined the relationships between the Executive Function Performance Test (EFPT), the NIH Toolbox Cognitive Function tests, and neuropsychological executive function measures in 182 persons with traumatic brain injury (TBI) and 46 controls to evaluate construct, discriminant, and predictive validity. Construct validity: There were moderate correlations between the EFPT and the NIH Toolbox Crystallized (r = -.479), Fluid Tests (r = -.420), and Total Composite Scores (r = -.496). Discriminant validity: Significant differences were found in the EFPT total and sequence scores across control, complicated mild/moderate, and severe TBI groups. We found differences in the organisation score between control and severe, and between mild and severe TBI groups. Both TBI groups had significantly lower scores in safety and judgement than controls. Compared to the controls, the severe TBI group demonstrated significantly lower performance on all instrumental activities of daily living (IADL) tasks. Compared to the mild TBI group, the controls performed better on the medication task, the severe TBI group performed worse in the cooking and telephone tasks. Predictive validity: The EFPT predicted the self-perception of independence measured by the TBI-QOL (beta = -0.49, p < .001) for the severe TBI group. Overall, these data support the validity of the EFPT for use in individuals with TBI.

  4. Tests for the Assessment of Sport-Specific Performance in Olympic Combat Sports: A Systematic Review With Practical Recommendations.

    PubMed

    Chaabene, Helmi; Negra, Yassine; Bouguezzi, Raja; Capranica, Laura; Franchini, Emerson; Prieske, Olaf; Hbacha, Hamdi; Granacher, Urs

    2018-01-01

    The regular monitoring of physical fitness and sport-specific performance is important in elite sports to increase the likelihood of success in competition. This study aimed to systematically review and to critically appraise the methodological quality, validation data, and feasibility of the sport-specific performance assessment in Olympic combat sports like amateur boxing, fencing, judo, karate, taekwondo, and wrestling. A systematic search was conducted in the electronic databases PubMed, Google-Scholar, and Science-Direct up to October 2017. Studies in combat sports were included that reported validation data (e.g., reliability, validity, sensitivity) of sport-specific tests. Overall, 39 studies were eligible for inclusion in this review. The majority of studies (74%) contained sample sizes <30 subjects. Nearly one-third of the reviewed studies lacked a sufficient description (e.g., anthropometrics, age, expertise level) of the included participants. Seventy-two percent of studies did not sufficiently report inclusion/exclusion criteria of their participants. In 62% of the included studies, the description and/or inclusion of a familiarization session(s) was either incomplete or nonexistent. Sixty percent of studies did not report any details about the stability of testing conditions. Approximately half of the studies examined reliability measures of the included sport-specific tests (intraclass correlation coefficient [ICC] = 0.43-1.00). Content validity was addressed in all included studies, criterion validity (only the concurrent aspect of it) in approximately half of the studies with correlation coefficients ranging from r = -0.41 to 0.90. Construct validity was reported in 31% of the included studies and predictive validity in only one. Test sensitivity was addressed in 13% of the included studies. The majority of studies (64%) ignored and/or provided incomplete information on test feasibility and methodological limitations of the sport-specific test. In 28% of the included studies, insufficient information or a complete lack of information was provided in the respective field of the test application. Several methodological gaps exist in studies that used sport-specific performance tests in Olympic combat sports. Additional research should adopt more rigorous validation procedures in the application and description of sport-specific performance tests in Olympic combat sports.

  5. Ada (Tradename) Compiler Validation Summary Report. Harris Corporation. HARRIS Ada Compiler, Version 1.0. Harris H1200 and H800.

    DTIC Science & Technology

    This Validation Summary Report (VSR) summarizes the results and conclusions of validation testing performed on the HARRIS Ada Compiler, Version 1.0...at compile time, at link time, or during execution. On-site testing was performed 28 APR 1986 through 30 APR 1986 at Harris Corporation, Ft. Lauderdale

  6. Validity of clinical color vision tests for air traffic control specialists.

    DOT National Transportation Integrated Search

    1992-10-01

    An experiment on the relationship between aeromedical color vision screening test performance and performance on color-dependent tasks of Air Traffic Control Specialists was replicated to expand the data base supporting the job-related validity of th...

  7. Performance Validity Testing in Neuropsychology: Methods for Measurement Development and Maximizing Diagnostic Accuracy.

    PubMed

    Wodushek, Thomas R; Greher, Michael R

    2017-05-01

    In the first column in this 2-part series, Performance Validity Testing in Neuropsychology: Scientific Basis and Clinical Application-A Brief Review, the authors introduced performance validity tests (PVTs) and their function, provided a justification for why they are necessary, traced their ongoing endorsement by neuropsychological organizations, and described how they are used and interpreted by ever increasing numbers of clinical neuropsychologists. To enhance readers' understanding of these measures, this second column briefly describes common detection strategies used in PVTs as well as the typical methods used to validate new PVTs and determine cut scores for valid/invalid determinations. We provide a discussion of the latest research demonstrating how neuropsychologists can combine multiple PVTs in a single battery to improve sensitivity/specificity to invalid responding. Finally, we discuss future directions for the research and application of PVTs.

  8. Validating use of a critical thinking test for the dental admission test.

    PubMed

    Tsai, Tsung-Hsun

    2014-04-01

    The purpose of this study was to validate the use of a test to assess dental school applicants' critical thinking abilities. The intent was to include this test on the Dental Admission Test (DAT) if it was shown to enhance the DAT's validity. Correlation and regression analyses of undergraduate and dental school performance with scores on each of the tests on the DAT battery and the California Critical Thinking Skills Test (CCTST) were performed. Data were collected from 439 third- and fourth-year dental students who consented to participate and were enrolled at one of the ten accredited dental schools included in the study. These ten dental schools were from most regions of the United States. This study concluded that including the CCTST on the DAT did not significantly enhance the DAT's validity.

  9. Vacuum decay container closure integrity leak test method development and validation for a lyophilized product-package system.

    PubMed

    Patel, Jayshree; Mulhall, Brian; Wolf, Heinz; Klohr, Steven; Guazzo, Dana Morton

    2011-01-01

    A leak test performed according to ASTM F2338-09 Standard Test Method for Nondestructive Detection of Leaks in Packages by Vacuum Decay Method was developed and validated for container-closure integrity verification of a lyophilized product in a parenteral vial package system. This nondestructive leak test method is intended for use in manufacturing as an in-process package integrity check, and for testing product stored on stability in lieu of sterility tests. Method development and optimization challenge studies incorporated artificially defective packages representing a range of glass vial wall and sealing surface defects, as well as various elastomeric stopper defects. Method validation required 3 days of random-order replicate testing of a test sample population of negative-control, no-defect packages and positive-control, with-defect packages. Positive-control packages were prepared using vials each with a single hole laser-drilled through the glass vial wall. Hole creation and hole size certification was performed by Lenox Laser. Validation study results successfully demonstrated the vacuum decay leak test method's ability to accurately and reliably detect those packages with laser-drilled holes greater than or equal to approximately 5 μm in nominal diameter. All development and validation studies were performed at Whitehouse Analytical Laboratories in Whitehouse, NJ, under the direction of consultant Dana Guazzo of RxPax, LLC, using a VeriPac 455 Micro Leak Test System by Packaging Technologies & Inspection (Tuckahoe, NY). Bristol Myers Squibb (New Brunswick, NJ) fully subsidized all work. A leak test performed according to ASTM F2338-09 Standard Test Method for Nondestructive Detection of Leaks in Packages by Vacuum Decay Method was developed and validated to detect defects in stoppered vial packages containing lyophilized product for injection. 
This nondestructive leak test method is intended for use in manufacturing as an in-process package integrity check, and for testing product stored on stability in lieu of sterility tests. Test method validation study results proved the method capable of detecting holes laser-drilled through the glass vial wall greater than or equal to 5 μm in nominal diameter. Total test time is less than 1 min per package. All method development and validation studies were performed at Whitehouse Analytical Laboratories in Whitehouse, NJ, under the direction of consultant Dana Guazzo of RxPax, LLC, using a VeriPac 455 Micro Leak Test System by Packaging Technologies & Inspection (Tuckahoe, NY). Bristol Myers Squibb (New Brunswick, NJ) fully subsidized all work.

  10. The reliability and validity of the Complex Task Performance Assessment: A performance-based assessment of executive function.

    PubMed

    Wolf, Timothy J; Dahl, Abigail; Auen, Colleen; Doherty, Meghan

    2017-07-01

    The objective of this study was to evaluate the inter-rater reliability, test-retest reliability, concurrent validity, and discriminant validity of the Complex Task Performance Assessment (CTPA): an ecologically valid performance-based assessment of executive function. Community control participants (n = 20) and individuals with mild stroke (n = 14) participated in this study. All participants completed the CTPA and a battery of cognitive assessments at initial testing. The control participants completed the CTPA at two different times one week apart. The intra-class correlation coefficient (ICC) for inter-rater reliability for the total score on the CTPA was .991. The ICCs for all of the sub-scores of the CTPA were also high (.889-.977). The CTPA total score was significantly correlated with Condition 4 of the DKEFS Color-Word Interference Test (ρ = -.425) and the Wechsler Test of Adult Reading (ρ = -.493). Finally, there were significant differences between control subjects and individuals with mild stroke on the total score of the CTPA (p = .007) and all sub-scores except interpretation failures and total items incorrect. These results are also consistent with other current executive function performance-based assessments and indicate that the CTPA is a reliable and valid performance-based measure of executive function.
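
    The intraclass correlation coefficients reported above can be illustrated with a short sketch. The abstract does not state which ICC form was used, so the code below assumes the common two-way, single-rating forms (absolute agreement and consistency) computed from ANOVA mean squares. The function name and the ratings matrix are hypothetical and are not data from the study.

```python
def icc_two_way(ratings):
    """Return (absolute-agreement ICC, consistency ICC) for single ratings.

    `ratings` is a list of rows, one row per subject and one column per
    rater (or per test occasion). Assumes a complete matrix.
    """
    n = len(ratings)         # number of subjects
    k = len(ratings[0])      # number of raters or occasions
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)               # between-subject mean square
    msc = ss_cols / (k - 1)               # between-rater mean square
    mse = ss_err / ((n - 1) * (k - 1))    # residual mean square

    consistency = (msr - mse) / (msr + (k - 1) * mse)
    agreement = (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
    return agreement, consistency


# Illustrative ratings only: 6 subjects scored by 4 raters.
example = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
agreement, consistency = icc_two_way(example)
print(f"ICC (absolute agreement) = {agreement:.3f}")
print(f"ICC (consistency)        = {consistency:.3f}")
```

    In this illustrative matrix the raters differ systematically in their average score, so the absolute-agreement value comes out much lower than the consistency value. This is one reason a reported ICC is easier to interpret when its form is stated.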

  11. Evaluating the accuracy of the Wechsler Memory Scale-Fourth Edition (WMS-IV) logical memory embedded validity index for detecting invalid test performance.

    PubMed

    Soble, Jason R; Bain, Kathleen M; Bailey, K Chase; Kirton, Joshua W; Marceaux, Janice C; Critchfield, Edan A; McCoy, Karin J M; O'Rourke, Justin J F

    2018-01-08

    Embedded performance validity tests (PVTs) allow for continuous assessment of invalid performance throughout neuropsychological test batteries. This study evaluated the utility of the Wechsler Memory Scale-Fourth Edition (WMS-IV) Logical Memory (LM) Recognition score as an embedded PVT using the Advanced Clinical Solutions (ACS) for WAIS-IV/WMS-IV Effort System. This mixed clinical sample comprised 97 total participants, 71 of whom were classified as valid and 26 as invalid based on three well-validated, freestanding criterion PVTs. Overall, the LM embedded PVT demonstrated poor concordance with the criterion PVTs and unacceptable psychometric properties using ACS validity base rates (42% sensitivity/79% specificity). Moreover, 15-39% of participants obtained an invalid ACS base rate despite having a normatively-intact age-corrected LM Recognition total score. Receiver operating characteristic curve analysis revealed that a Recognition total score cutoff of <61% correct improved specificity (92%) while sensitivity remained weak (31%). Thus, results indicated the LM Recognition embedded PVT is not appropriate for use from an evidence-based perspective, and that clinicians may be faced with reconciling how a normatively intact cognitive performance on the Recognition subtest could simultaneously reflect invalid performance validity.
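
    The sensitivity and specificity figures quoted above follow from a two-by-two classification table. The sketch below is a minimal illustration, assuming cell counts chosen only to be consistent with the group sizes (26 invalid, 71 valid) and the rounded percentages in the abstract. The abstract does not report the cell counts, and the function name is hypothetical.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Compute sensitivity and specificity from confusion-matrix counts.

    tp: invalid performers flagged by the embedded PVT
    fn: invalid performers missed by the embedded PVT
    tn: valid performers correctly passed
    fp: valid performers incorrectly flagged
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Assumed counts (not reported in the abstract): 11 + 15 = 26 invalid,
# 56 + 15 = 71 valid.
sens, spec = sensitivity_specificity(tp=11, fn=15, tn=56, fp=15)
print(f"sensitivity = {sens:.0%}, specificity = {spec:.0%}")
```

    Under the same assumption, moving the cutoff so that only 8 of the 26 invalid cases are flagged while 65 of the 71 valid cases pass would give roughly 31% sensitivity and 92% specificity, the trade-off described for the <61% cutoff.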

  12. PLCO Ovarian Phase III Validation Study — EDRN Public Portal

    Cancer.gov

    Our preliminary data indicate that the performance of CA 125 as a screening test for ovarian cancer can be improved upon by additional biomarkers. With completion of one additional validation step, we will be ready to test the performance of a consensus marker panel in a phase III validation study. Given the original aims of the PLCO trial, we believe that the PLCO represents an ideal longitudinal cohort offering specimens for phase III validation of ovarian cancer biomarkers.

  13. Validity and Reliability of the 8-Item Work Limitations Questionnaire.

    PubMed

    Walker, Timothy J; Tullar, Jessica M; Diamond, Pamela M; Kohl, Harold W; Amick, Benjamin C

    2017-12-01

    Purpose To evaluate factorial validity, scale reliability, test-retest reliability, convergent validity, and discriminant validity of the 8-item Work Limitations Questionnaire (WLQ) among employees from a public university system. Methods A secondary analysis using de-identified data from employees who completed an annual Health Assessment between the years 2009-2015 tested research aims. Confirmatory factor analysis (CFA) (n = 10,165) tested the latent structure of the 8-item WLQ. Scale reliability was determined using a CFA-based approach while test-retest reliability was determined using the intraclass correlation coefficient. Convergent/discriminant validity was tested by evaluating relations between the 8-item WLQ with health/performance variables for convergent validity (health-related work performance, number of chronic conditions, and general health) and demographic variables for discriminant validity (gender and institution type). Results A 1-factor model with three correlated residuals demonstrated excellent model fit (CFI = 0.99, TLI = 0.99, RMSEA = 0.03, and SRMR = 0.01). The scale reliability was acceptable (0.69, 95% CI 0.68-0.70) and the test-retest reliability was very good (ICC = 0.78). Low-to-moderate associations were observed between the 8-item WLQ and the health/performance variables while weak associations were observed between the demographic variables. Conclusions The 8-item WLQ demonstrated sufficient reliability and validity among employees from a public university system. Results suggest the 8-item WLQ is a usable alternative for studies when the more comprehensive 25-item WLQ is not available.

  14. Validity and Reliability of Baseline Testing in a Standardized Environment.

    PubMed

    Higgins, Kathryn L; Caze, Todd; Maerlender, Arthur

    2017-08-11

    The Immediate Postconcussion Assessment and Cognitive Testing (ImPACT) is a computerized neuropsychological test battery commonly used to determine cognitive recovery from concussion based on comparing post-injury scores to baseline scores. This model is based on the premise that ImPACT baseline test scores are a valid and reliable measure of optimal cognitive function at baseline. Growing evidence suggests that this premise may not be accurate and that a large contributor to invalid and unreliable baseline test scores may be the protocol and environment in which baseline tests are administered. This study examined the effects of a standardized environment and administration protocol on the reliability and performance validity of athletes' baseline test scores on ImPACT by comparing scores obtained in two different group-testing settings. Three hundred sixty-one Division 1 cohort-matched collegiate athletes' baseline data were assessed using a variety of indicators of potential performance invalidity; internal reliability was also examined. Thirty-one to thirty-nine percent of the baseline cases had at least one indicator of low performance validity, but there were no significant differences in validity indicators based on the environment in which the testing was conducted. Internal consistency reliability scores were in the acceptable to good range, with no significant differences between administration conditions. These results suggest that athletes may be reliably performing at levels lower than their best effort would produce. © The Author 2017. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  15. Phase 1 Validation Testing and Simulation for the WEC-Sim Open Source Code

    NASA Astrophysics Data System (ADS)

    Ruehl, K.; Michelen, C.; Gunawan, B.; Bosma, B.; Simmons, A.; Lomonaco, P.

    2015-12-01

    WEC-Sim is an open source code to model wave energy converter (WEC) performance in operational waves, developed by Sandia and NREL and funded by the US DOE. The code is a time-domain modeling tool developed in MATLAB/SIMULINK using the multibody dynamics solver SimMechanics, and solves the WEC's governing equations of motion using the Cummins time-domain impulse response formulation in 6 degrees of freedom. The WEC-Sim code has undergone verification through code-to-code comparisons; however, validation of the code has been limited to publicly available experimental data sets. While these data sets provide preliminary code validation, the experimental tests were not explicitly designed for code validation, and as a result are limited in their ability to validate the full functionality of the WEC-Sim code. Therefore, dedicated physical model tests for WEC-Sim validation have been performed. This presentation provides an overview of the WEC-Sim validation experimental wave tank tests performed at the Oregon State University's Directional Wave Basin at Hinsdale Wave Research Laboratory. Phase 1 of experimental testing was focused on device characterization and completed in Fall 2015. Phase 2 is focused on WEC performance and scheduled for Winter 2015/2016. These experimental tests were designed explicitly to validate the performance of the WEC-Sim code and its new feature additions. Upon completion, the WEC-Sim validation data set will be made publicly available to the wave energy community. For the physical model test, a controllable model of a floating wave energy converter has been designed and constructed. The instrumentation includes state-of-the-art devices to measure pressure fields, motions in 6 DOF, multi-axial load cells, torque transducers, position transducers, and encoders. The model also incorporates a fully programmable Power-Take-Off system which can be used to generate or absorb wave energy. Numerical simulations of the experiments using WEC-Sim will be presented. These simulations highlight the code features included in the latest release of WEC-Sim (v1.2), including: wave directionality, nonlinear hydrostatics and hydrodynamics, user-defined wave elevation time-series, state space radiation, and WEC-Sim compatibility with BEMIO (open source AQWA/WAMIT/NEMOH coefficient parser).

  16. Validity, Reliability, and Sensitivity of a Volleyball Intermittent Endurance Test.

    PubMed

    Rodríguez-Marroyo, Jose A; Medina-Carrillo, Javier; García-López, Juan; Morante, Juan C; Villa, José G; Foster, Carl

    2017-03-01

    To analyze the concurrent and construct validity of a volleyball intermittent endurance test (VIET). The VIET's test-retest reliability and sensitivity to assess seasonal changes were also studied. During the preseason, 71 volleyball players of different competitive levels took part in this study. All performed the VIET and a graded treadmill test with gas-exchange measurement (GXT). Thirty-one of the players performed an additional VIET to analyze the test-retest reliability. To test the VIET's sensitivity, 28 players repeated the VIET and GXT at the end of their season. Significant (P < .001) relationships between VIET distance and maximal oxygen uptake (r = .74) and GXT maximal speed (r = .78) were observed. There were no significant differences between the VIET performance test and retest (1542.1 ± 338.1 vs 1567.1 ± 358.2 m). Significant (P < .001) relationships and intraclass correlation coefficient (ICC) were found (r = .95, ICC = .96) for VIET performance. VIET performance increased significantly (P < .001) with player performance level and was sensitive to fitness changes across the season (1458.8 ± 343.5 vs 1581.1 ± 334.0 m, P < .01). The VIET may be considered a valid, reliable, and sensitive test to assess the aerobic endurance in volleyball players.

  17. Predicting Performance in Higher Education Using Proximal Predictors.

    PubMed

    Niessen, A Susan M; Meijer, Rob R; Tendeiro, Jorge N

    2016-01-01

    We studied the validity of two methods for predicting academic performance and student-program fit that were proximal to important study criteria. Applicants to an undergraduate psychology program participated in a selection procedure containing a trial-studying test based on a work sample approach, and specific skills tests in English and math. Test scores were used to predict academic achievement and progress after the first year, achievement in specific course types, enrollment, and dropout after the first year. All tests showed positive significant correlations with the criteria. The trial-studying test was consistently the best predictor in the admission procedure. We found no significant differences between the predictive validity of the trial-studying test and prior educational performance, and substantial shared explained variance between the two predictors. Only applicants with lower trial-studying scores were significantly less likely to enroll in the program. In conclusion, the trial-studying test yielded predictive validities similar to that of prior educational performance and possibly enabled self-selection. In admissions aimed at student-program fit, or in admissions in which past educational performance is difficult to use, a trial-studying test is a good instrument to predict academic performance.

  18. Validation of sterilizing grade filtration.

    PubMed

    Jornitz, M W; Meltzer, T H

    2003-01-01

    Validation considerations for sterilizing grade filters, namely 0.2 micron filters, changed when the FDA voiced concerns about the validity of Bacterial Challenge tests performed in the past. Such validation exercises are nowadays considered to be filter qualification. Filter validation requires more thorough analysis, especially Bacterial Challenge testing with the actual drug product under process conditions. To do so, viability testing is a necessity to determine the Bacterial Challenge test methodology. In addition to these two compulsory tests, other evaluations such as extractable, adsorption, and chemical compatibility tests should be considered. PDA Technical Report # 26, Sterilizing Filtration of Liquids, describes all parameters and aspects required for the comprehensive validation of filters. The report is a most helpful tool for validation of liquid filters used in the biopharmaceutical industry. It sets the cornerstones of validation requirements and other filtration considerations.

  19. Ride qualities criteria validation/pilot performance study: Flight test results

    NASA Technical Reports Server (NTRS)

    Nardi, L. U.; Kawana, H. Y.; Greek, D. C.

    1979-01-01

    Pilot performance during a terrain following flight was studied for ride quality criteria validation. Data from manual and automatic terrain following operations conducted during low level penetrations were analyzed to determine the effect of ride qualities on crew performance. The conditions analyzed included varying levels of turbulence, terrain roughness, and mission duration with a ride smoothing system on and off. Limited validation of the B-1 ride quality criteria and some of the first order interactions between ride qualities and pilot/vehicle performance are highlighted. An earlier B-1 flight simulation program correlated well with the flight test results.

  20. Validity of Highlighting on Text Comprehension

    NASA Astrophysics Data System (ADS)

    So, Joey C. Y.; Chan, Alan H. S.

    2009-10-01

    In this study, 38 university students performed a Chinese reading task on a light-emitting diode (LED) display under different task conditions to determine the effects of highlighting, and of its validity, on comprehension performance. Four levels of validity (0%, 33%, 67% and 100%) and a control condition with no highlighting were tested. Each subject was required to perform the five experimental conditions, in which different passages were read and comprehended. The results showed that the condition with 100% highlighting validity yielded better comprehension performance than the other validity levels and the condition with no highlighting. The comprehension score of the condition without highlighting was lower than those of the highlighting conditions with distracters, though not significantly so.

  1. Evaluation of the methodological quality of studies of the performance of diagnostic tests for bovine tuberculosis using QUADAS.

    PubMed

    Downs, Sara H; More, Simon J; Goodchild, Anthony V; Whelan, Adam O; Abernethy, Darrell A; Broughan, Jennifer M; Cameron, Angus; Cook, Alasdair J; Ricardo de la Rua-Domenech, R; Greiner, Matthias; Gunn, Jane; Nuñez-Garcia, Javier; Rhodes, Shelley; Rolfe, Simon; Sharp, Michael; Upton, Paul; Watson, Eamon; Welsh, Michael; Woolliams, John A; Clifton-Hadley, Richard S; Parry, Jessica E

    2018-05-01

    There has been little assessment of the methodological quality of studies measuring the performance (sensitivity and/or specificity) of diagnostic tests for animal diseases. In a systematic review, 190 studies of tests for bovine tuberculosis (bTB) in cattle (published 1934-2009) were assessed by at least one of 18 reviewers using the QUADAS (Quality Assessment of Diagnostic Accuracy Studies) checklist adapted for animal disease tests. VETQUADAS (VQ) included items measuring clarity in reporting (n = 3), internal validity (n = 9) and external validity (n = 2). A similar pattern for compliance was observed in studies of different diagnostic test types. Compliance significantly improved with year of publication for all items measuring clarity in reporting and external validity but only improved in four of the nine items measuring internal validity (p < 0.05). A total of 107 references, of which 83 had performance data eligible for inclusion in a meta-analysis, were reviewed by two reviewers. In these references, agreement between reviewers' responses was 71% for compliance, 32% for unsure and 29% for non-compliance. Mean compliance was 2 for reporting items, 5.2 for internal validity and 1.5 for external validity. The index test result was described in sufficient detail in 80.1% of studies and was interpreted without knowledge of the reference standard test result in only 33.1%. Loss to follow-up was adequately explained in only 31.1% of studies. The prevalence of deficiencies observed may be due to inadequate reporting but may also reflect lack of attention to methodological issues that could bias the results of diagnostic test performance estimates. QUADAS was a useful tool for assessing and comparing the quality of studies measuring the performance of diagnostic tests but might be improved further by including explicit assessment of population sampling strategy. Crown Copyright © 2017. Published by Elsevier B.V. All rights reserved.

  2. Tests for the Assessment of Sport-Specific Performance in Olympic Combat Sports: A Systematic Review With Practical Recommendations

    PubMed Central

    Chaabene, Helmi; Negra, Yassine; Bouguezzi, Raja; Capranica, Laura; Franchini, Emerson; Prieske, Olaf; Hbacha, Hamdi; Granacher, Urs

    2018-01-01

    The regular monitoring of physical fitness and sport-specific performance is important in elite sports to increase the likelihood of success in competition. This study aimed to systematically review and critically appraise the methodological quality, validation data, and feasibility of sport-specific performance assessment in Olympic combat sports such as amateur boxing, fencing, judo, karate, taekwondo, and wrestling. A systematic search was conducted in the electronic databases PubMed, Google Scholar, and ScienceDirect up to October 2017. Studies in combat sports were included that reported validation data (e.g., reliability, validity, sensitivity) of sport-specific tests. Overall, 39 studies were eligible for inclusion in this review. The majority of studies (74%) had sample sizes of fewer than 30 subjects. Nearly one-third of the reviewed studies lacked a sufficient description (e.g., anthropometrics, age, expertise level) of the included participants. Seventy-two percent of studies did not sufficiently report inclusion/exclusion criteria of their participants. In 62% of the included studies, the description and/or inclusion of a familiarization session(s) was either incomplete or nonexistent. Sixty percent of studies did not report any details about the stability of testing conditions. Approximately half of the studies examined reliability measures of the included sport-specific tests (intraclass correlation coefficient [ICC] = 0.43–1.00). Content validity was addressed in all included studies, and criterion validity (only its concurrent aspect) in approximately half of the studies, with correlation coefficients ranging from r = −0.41 to 0.90. Construct validity was reported in 31% of the included studies and predictive validity in only one. Test sensitivity was addressed in 13% of the included studies. The majority of studies (64%) ignored and/or provided incomplete information on test feasibility and methodological limitations of the sport-specific test. In 28% of the included studies, insufficient information or a complete lack of information was provided in the respective field of the test application. Several methodological gaps exist in studies that used sport-specific performance tests in Olympic combat sports. Additional research should adopt more rigorous validation procedures in the application and description of sport-specific performance tests in Olympic combat sports. PMID:29692739

  3. Word Memory Test Performance Across Cognitive Domains, Psychiatric Presentations, and Mild Traumatic Brain Injury.

    PubMed

    Rowland, Jared A; Miskey, Holly M; Brearly, Timothy W; Martindale, Sarah L; Shura, Robert D

    2017-05-01

    The current study addressed two aims: (i) determine how Word Memory Test (WMT) performance relates to test performance across numerous cognitive domains and (ii) evaluate how current psychiatric disorders or mild traumatic brain injury (mTBI) history affects performance on the WMT after excluding participants with poor symptom validity. Participants were 235 Iraq and Afghanistan-era veterans (mean age = 35.5) who completed a comprehensive neuropsychological battery. Participants were divided into two groups based on WMT performance (Pass = 193, Fail = 42). Tests were grouped into cognitive domains and an average z-score was calculated for each domain. Significant differences were found between those who passed and those who failed the WMT on the memory, attention, executive function, and motor output domain z-scores. WMT failure was associated with a larger performance decrement in the memory domain than the sensation or visuospatial-construction domains. Participants with a current psychiatric diagnosis or mTBI history were significantly more likely to fail the WMT, even after removing participants with poor symptom validity. Results suggest that the WMT is most appropriate for assessing validity in the domains of attention, executive function, motor output and memory, with little relationship to performance in domains of sensation or visuospatial-construction. Comprehensive cognitive batteries would benefit from inclusion of additional performance validity tests in these domains. Additionally, symptom validity did not explain higher rates of WMT failure in individuals with a current psychiatric diagnosis or mTBI history. Further research is needed to better understand how these conditions may affect WMT performance. Published by Oxford University Press 2016. This work is written by (a) US Government employee(s) and is in the public domain in the US.

  4. Electrolysis Performance Improvement and Validation Experiment

    NASA Technical Reports Server (NTRS)

    Schubert, Franz H.

    1992-01-01

    Viewgraphs on electrolysis performance improvement and validation experiment are presented. Topics covered include: water electrolysis: an ever increasing need/role for space missions; static feed electrolysis (SFE) technology: a concept developed for space applications; experiment objectives: why test in microgravity environment; and experiment description: approach, hardware description, test sequence and schedule.

  5. Validity and test-retest reliability of an at-work production loss instrument.

    PubMed

    Aboagye, E; Jensen, I; Bergström, G; Hagberg, J; Axén, I; Lohela-Karlsson, M

    2016-07-01

    Besides causing ill health, a poor work environment may contribute to production loss. Production loss assessment instruments emphasize health-related consequences but there is no instrument to measure reduced work performance related to the work environment. To examine convergent validity and test-retest reliability of health-related production loss (HRPL) and work environment-related production loss (WRPL) against a valid comparable instrument, the Health and Work Performance Questionnaire (HPQ). Cross-sectional study of employees, not on sick leave, who were asked to self-rate their work performance and production losses. Using the Pearson correlation and Bland and Altman's Test of Agreement, convergent validity was examined. Subgroup analyses were performed for employees recording problem-specific reduced work performance. Consistency of pairs of HRPL and WRPL for samples responding to both assessments was expressed using Intraclass Correlation Coefficient (ICC) and tests of repeatability. A total of 88 employees participated and 44 responded to both assessments. Test of agreement between measurements estimates a mean difference of 0.34 for HRPL and -0.03 for WRPL compared with work performance. This indicates that the production loss questions are valid and moderately associated with work performance for the total sample and subgroups. ICC for paired HRPL assessments was 0.90 and 0.91 for WRPL, i.e. the test-retest reliability was good and suggests stability in the instrument. HRPL and WRPL can be used to measure production loss due to health-related and work environment-related problems. These results may have implications for advancing methods of assessing production loss, which represents an important cost to employers. © The Author 2016. Published by Oxford University Press on behalf of the Society of Occupational Medicine. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
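
    The agreement analysis mentioned above (mean difference between two measurements) can be sketched as a Bland-Altman style computation. The sketch below is a minimal illustration, assuming paired scores on a common scale and the conventional 1.96 standard deviation limits of agreement. The function name and the paired ratings are hypothetical and are not data from the study.

```python
from statistics import mean, stdev


def bland_altman(a, b):
    """Return (bias, lower limit, upper limit) for paired measurements.

    bias is the mean of the pairwise differences a[i] - b[i]; the limits
    of agreement are bias +/- 1.96 sample standard deviations.
    """
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation (n - 1 denominator)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical paired self-ratings from the same employees on two instruments.
production_loss = [3, 5, 2, 6, 4, 7, 3, 5]
work_performance = [3, 4, 2, 6, 5, 6, 3, 4]
bias, lower, upper = bland_altman(production_loss, work_performance)
print(f"mean difference = {bias:.2f}, limits of agreement = [{lower:.2f}, {upper:.2f}]")
```

    A mean difference close to zero indicates little systematic offset between the two instruments, while the width of the limits shows how far individual pairs can disagree.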

  6. Development of an Agility Test for Badminton Players and Assessment of Its Validity and Test-Retest Reliability.

    PubMed

    Loureiro, Luiz de França Bahia; de Freitas, Paulo Barbosa

    2016-04-01

    Badminton requires open and fast actions toward the shuttlecock, but there is no specific agility test for badminton players with specific movements. To develop an agility test that simultaneously assesses perception and motor capacity and examine the test's concurrent and construct validity and its test-retest reliability. The Badcamp agility test consists of running as fast as possible to 6 targets placed on the corners and middle points of a rectangular area (5.6 × 4.2 m) from the start position located in the center of it, following visual stimuli presented in a luminous panel. The authors recruited 43 badminton players (17-32 y old) to evaluate concurrent (with shuttle-run agility test--SRAT) and construct validity and test-retest reliability. Results revealed that Badcamp presents concurrent and construct validity, as its performance is strongly related to SRAT (ρ = 0.83, P < .001), with performance of experts being better than nonexpert players (P < .01). In addition, Badcamp is reliable, as no difference (P = .07) and a high intraclass correlation (ICC = .93) were found in the performance of the players on 2 different occasions. The findings indicate that Badcamp is an effective, valid, and reliable tool to measure agility, allowing coaches and athletic trainers to evaluate players' athletic condition and training effectiveness and possibly detect talented individuals in this sport.

  7. Implementation and application of an interactive user-friendly validation software for RADIANCE

    NASA Astrophysics Data System (ADS)

    Sundaram, Anand; Boonn, William W.; Kim, Woojin; Cook, Tessa S.

    2012-02-01

    RADIANCE extracts CT dose parameters from dose sheets using optical character recognition and stores the data in a relational database. To facilitate validation of RADIANCE's performance, a simple user interface was initially implemented and about 300 records were evaluated. Here, we extend this interface to achieve a wider variety of functions and perform a larger-scale validation. The validator uses some data from the RADIANCE database to prepopulate quality-testing fields, such as correspondence between calculated and reported total dose-length product. The interface also displays relevant parameters from the DICOM headers. A total of 5,098 dose sheets were used to test the performance accuracy of RADIANCE in dose data extraction. Several search criteria were implemented. All records were searchable by accession number, study date, or dose parameters beyond chosen thresholds. Validated records were searchable according to additional criteria from validation inputs. An error rate of 0.303% was demonstrated in the validation. Dose monitoring is increasingly important and RADIANCE provides an open-source solution with a high level of accuracy. The RADIANCE validator has been updated to enable users to test the integrity of their installation and verify that their dose monitoring is accurate and effective.

  8. Reliability and validity of the test of incremental respiratory endurance measures of inspiratory muscle performance in COPD.

    PubMed

    Formiga, Magno F; Roach, Kathryn E; Vital, Isabel; Urdaneta, Gisel; Balestrini, Kira; Calderon-Candelario, Rafael A; Campos, Michael A; Cahalin, Lawrence P

    2018-01-01

    The Test of Incremental Respiratory Endurance (TIRE) provides a comprehensive assessment of inspiratory muscle performance by measuring maximal inspiratory pressure (MIP) over time. The integration of MIP over inspiratory duration (ID) provides the sustained maximal inspiratory pressure (SMIP). Evidence on the reliability and validity of these measurements in COPD is not currently available. Therefore, we assessed the reliability, responsiveness and construct validity of the TIRE measures of inspiratory muscle performance in subjects with COPD. Test-retest reliability, known-groups and convergent validity assessments were implemented simultaneously in 81 male subjects with mild to very severe COPD. TIRE measures were obtained using the portable PrO2 device, following standard guidelines. All TIRE measures were found to be highly reliable, with SMIP demonstrating the strongest test-retest reliability with a nearly perfect intraclass correlation coefficient (ICC) of 0.99, while MIP and ID clustered closely together behind SMIP with ICC values of about 0.97. Our findings also demonstrated known-groups validity of all TIRE measures, with SMIP and ID yielding larger effect sizes when compared to MIP in distinguishing between subjects of different COPD status. Finally, our analyses confirmed convergent validity for both SMIP and ID, but not MIP. The TIRE measures of MIP, SMIP and ID have excellent test-retest reliability and demonstrated known-groups validity in subjects with COPD. SMIP and ID also demonstrated evidence of moderate convergent validity and appear to be more stable measures in this patient population than the traditional MIP.

  9. Coverage of the Test of Memory Malingering, Victoria Symptom Validity Test, and Word Memory Test on the Internet: is test security threatened?

    PubMed

    Bauer, Lyndsey; McCaffrey, Robert J

    2006-01-01

    In forensic neuropsychological settings, maintaining test security has become critically important, especially in regard to symptom validity tests (SVTs). Coaching, which can entail providing patients or litigants with information about the cognitive sequelae of head injury, or teaching them test-taking strategies to avoid detection of symptom dissimulation has been examined experimentally in many research studies. Emerging evidence supports that coaching strategies affect psychological and neuropsychological test performance to differing degrees depending on the coaching paradigm and the tests administered. The present study sought to examine Internet coverage of SVTs because it is potentially another source of coaching, or information that is readily available. Google searches were performed on the Test of Memory Malingering, the Victoria Symptom Validity Test, and the Word Memory Test. Results indicated that there is a variable amount of information available about each test that could threaten test security and validity should inappropriately interested parties find it. Steps that could be taken to improve this situation and limitations to this exploration are discussed.

  10. Does IQ Really Predict Job Performance?

    PubMed Central

    Richardson, Ken; Norgate, Sarah H.

    2015-01-01

    IQ has played a prominent part in developmental and adult psychology for decades. In the absence of a clear theoretical model of internal cognitive functions, however, construct validity for IQ tests has always been difficult to establish. Test validity, therefore, has always been indirect, by correlating individual differences in test scores with what are assumed to be other criteria of intelligence. Job performance has, for several reasons, been one such criterion. Correlations of around 0.5 have been regularly cited as evidence of test validity, and as justification for the use of the tests in developmental studies, in educational and occupational selection and in research programs on sources of individual differences. Here, those correlations are examined together with the quality of the original data and the many corrections needed to arrive at them. It is concluded that considerable caution needs to be exercised in citing such correlations for test validation purposes. PMID:26405429

  11. Validity, Reliability, and Performance Determinants of a New Job-Specific Anaerobic Work Capacity Test for the Norwegian Navy Special Operations Command.

    PubMed

    Angeltveit, Andreas; Paulsen, Gøran; Solberg, Paul A; Raastad, Truls

    2016-02-01

    Operators in Special Operation Forces (SOF) have a particularly demanding profession where physical and psychological capacities can be challenged to the extremes. The diversity of physical capacities needed depends on the mission. Consequently, tests used to monitor SOF operators' physical fitness should cover a broad range of physical capacities. Whereas tests for strength and aerobic endurance are established, there is no test for specific anaerobic work capacity described in the literature. The purpose of this study was therefore to evaluate the reliability and validity, and to identify performance determinants, of a new test developed for testing specific anaerobic work capacity in SOF operators. Nineteen active young students were included in the concurrent validity part of the study. The students performed the evacuation (EVAC) test 3 times and the results were compared for reliability and with performance in the Wingate cycle test, 300-m sprint, and a maximal accumulated oxygen deficit (MAOD) test. In part II of the study, 21 Norwegian Navy Special Operations Command operators conducted the EVAC test, anthropometric measurements, a dual x-ray absorptiometry scan, leg press, isokinetic knee extensions, maximal oxygen uptake test, and countermovement jump (CMJ) test. The EVAC test showed good reliability after 1 familiarization trial (intraclass correlation = 0.89; coefficient of variance = 3.7%). The EVAC test correlated well with the Wingate test (r = -0.68), 300-m sprint time (r = 0.51), and 300-m mean power (W) (r = -0.67). No significant correlation was found with the MAOD test. In part II of the study, height, body mass, lean body mass, isokinetic knee extension torque, maximal oxygen uptake, and maximal power in a CMJ were significantly correlated with performance in the EVAC test. The EVAC test is a reliable and valid test for anaerobic work capacity for SOF operators, and muscle mass, leg strength, and leg power seem to be the most important determinants of performance.

  12. Validity and Reliability of a Medicine Ball Explosive Power Test.

    ERIC Educational Resources Information Center

    Stockbrugger, Barry A.; Haennel, Robert G.

    2001-01-01

    Evaluated the validity and reliability of a medicine ball throw test to evaluate explosive power. Data on competitive sand volleyball players who performed a medicine ball throw and a standard countermovement jump indicated that the medicine ball throw test was a valid and reliable way to assess explosive power for an analogous total-body movement…

  13. Validation of Helicopter Gear Condition Indicators Using Seeded Fault Tests

    NASA Technical Reports Server (NTRS)

    Dempsey, Paula; Brandon, E. Bruce

    2013-01-01

    A "seeded fault test" in support of a rotorcraft condition based maintenance program (CBM), is an experiment in which a component is tested with a known fault while health monitoring data is collected. These tests are performed at operating conditions comparable to operating conditions the component would be exposed to while installed on the aircraft. Performance of seeded fault tests is one method used to provide evidence that a Health Usage Monitoring System (HUMS) can replace current maintenance practices required for aircraft airworthiness. Actual in-service experience of the HUMS detecting a component fault is another validation method. This paper will discuss a hybrid validation approach that combines in service-data with seeded fault tests. For this approach, existing in-service HUMS flight data from a naturally occurring component fault will be used to define a component seeded fault test. An example, using spiral bevel gears as the targeted component, will be presented. Since the U.S. Army has begun to develop standards for using seeded fault tests for HUMS validation, the hybrid approach will be mapped to the steps defined within their Aeronautical Design Standard Handbook for CBM. This paper will step through their defined processes, and identify additional steps that may be required when using component test rig fault tests to demonstrate helicopter CI performance. The discussion within this paper will provide the reader with a better appreciation for the challenges faced when defining a seeded fault test for HUMS validation.

  14. Convergent validity and sex differences in healthy elderly adults for performance on 3D virtual reality navigation learning and 2D hidden maze tasks.

    PubMed

    Tippett, William J; Lee, Jang-Han; Mraz, Richard; Zakzanis, Konstantine K; Snyder, Peter J; Black, Sandra E; Graham, Simon J

    2009-04-01

    This study assessed the convergent validity of a virtual environment (VE) navigation learning task, the Groton Maze Learning Test (GMLT), and selected traditional neuropsychological tests performed in a group of healthy elderly adults (n = 24). The cohort was divided equally between males and females to explore performance variability due to sex differences, which were subsequently characterized and reported as part of the analysis. To facilitate performance comparisons, specific "efficiency" scores were created for both the VE navigation task and the GMLT. Men reached peak performance more rapidly than women during VE navigation and on the GMLT and significantly outperformed women on the first learning trial in the VE. Results suggest reasonable convergent validity across the VE task, GMLT, and selected neuropsychological tests for assessment of spatial memory.

  15. Validating an artificial intelligence human proximity operations system with test cases

    NASA Astrophysics Data System (ADS)

    Huber, Justin; Straub, Jeremy

    2013-05-01

    An artificial intelligence-controlled robot (AICR) operating in close proximity to humans poses risk to these humans. Validating the performance of an AICR is an ill posed problem, due to the complexity introduced by the erratic (noncomputer) actors. In order to prove the AICR's usefulness, test cases must be generated to simulate the actions of these actors. This paper discusses AICR's performance validation in the context of a common human activity, moving through a crowded corridor, using test cases created by an AI use case producer. This test is a two-dimensional simplification relevant to autonomous UAV navigation in the national airspace.

  16. Victoria Symptom Validity Test performance in children and adolescents with neurological disorders.

    PubMed

    Brooks, Brian L

    2012-12-01

    It is becoming increasingly important to study, use, and promote the utility of measures that are designed to detect non-compliance with testing (i.e., poor effort, symptom non-validity, response bias) as part of neuropsychological assessments with children and adolescents. Several measures have evidence for use in pediatrics, but there is a paucity of published support for the Victoria Symptom Validity Test (VSVT) in this population. The purpose of this study was to examine the performance on the VSVT in a sample of pediatric patients with known neurological disorders. The sample consisted of 100 consecutively referred children and adolescents between the ages of 6 and 19 years (mean = 14.0, SD = 3.1) with various neurological diagnoses. On the VSVT total items, 95% of the sample had performance in the "valid" range, with 5% being deemed "questionable" and 0% deemed "invalid". On easy items, 97% were "valid", 2% were "questionable", and 1% was "invalid." For difficult items, 84% were "valid," 16% were "questionable," and 0% was "invalid." For those patients given two effort measures (i.e., VSVT and Test of Memory Malingering; n = 65), none was identified as having poor test-taking compliance on both measures. VSVT scores were significantly correlated with age, intelligence, processing speed, and functional ratings of daily abilities (attention, executive functioning, and adaptive functioning), but not with objective performance on the measures of sustained attention, verbal memory, or visual memory. The VSVT has potential to be used in neuropsychological assessments with pediatric patients.

  17. Performance testing for superpave and structural validation.

    DOT National Transportation Integrated Search

    2012-11-01

    The primary objective of this full-scale accelerated pavement testing was to evaluate the performance of unmodified and polymer modified asphalt binders and to recommend improved specification tests over existing SUperior PERforming Asphalt PAVEm...

  18. Validation of hot-poured crack sealant performance-based guidelines.

    DOT National Transportation Integrated Search

    2017-06-01

    This report summarizes a comprehensive research effort to validate thresholds for performance-based guidelines and grading system for hot-poured asphalt crack sealants. A series of performance tests were established in earlier research and includ...

  19. Addressing criticisms of existing predictive bias research: cognitive ability test scores still overpredict African Americans' job performance.

    PubMed

    Berry, Christopher M; Zhao, Peng

    2015-01-01

    Predictive bias studies have generally suggested that cognitive ability test scores overpredict job performance of African Americans, meaning these tests are not predictively biased against African Americans. However, at least 2 issues call into question existing over-/underprediction evidence: (a) a bias identified by Aguinis, Culpepper, and Pierce (2010) in the intercept test typically used to assess over-/underprediction and (b) a focus on the level of observed validity instead of operational validity. The present study developed and utilized a method of assessing over-/underprediction that draws on the math of subgroup regression intercept differences, does not rely on the biased intercept test, allows for analysis at the level of operational validity, and can use meta-analytic estimates as input values. Therefore, existing meta-analytic estimates of key parameters, corrected for relevant statistical artifacts, were used to determine whether African American job performance remains overpredicted at the level of operational validity. African American job performance was typically overpredicted by cognitive ability tests across levels of job complexity and across conditions wherein African American and White regression slopes did and did not differ. Because the present study does not rely on the biased intercept test and because appropriate statistical artifact corrections were carried out, the present study's results are not affected by the 2 issues mentioned above. The present study represents strong evidence that cognitive ability tests generally overpredict job performance of African Americans. (c) 2015 APA, all rights reserved.

  20. Dynamic testing in schizophrenia: does training change the construct validity of a test?

    PubMed

    Wiedl, Karl H; Schöttke, Henning; Green, Michael F; Nuechterlein, Keith H

    2004-01-01

    Dynamic testing typically involves specific interventions for a test to assess the extent to which test performance can be modified, beyond level of baseline (static) performance. This study used a dynamic version of the Wisconsin Card Sorting Test (WCST) that is based on cognitive remediation techniques within a test-training-test procedure. From results of previous studies with schizophrenia patients, we concluded that the dynamic and static versions of the WCST should have different construct validity. This hypothesis was tested by examining the patterns of correlations with measures of executive functioning, secondary verbal memory, and verbal intelligence. Results demonstrated a specific construct validity of WCST dynamic (i.e., posttest) scores as an index of problem solving (Tower of Hanoi) and secondary verbal memory and learning (Auditory Verbal Learning Test), whereas the impact of general verbal capacity and selective attention (Verbal IQ, Stroop Test) was reduced. It is concluded that the construct validity of the test changes with dynamic administration and that this difference helps to explain why the dynamic version of the WCST predicts functional outcome better than the static version.

  1. Physical performance tests after stroke: reliability and validity.

    PubMed

    Maeda, A; Yuasa, T; Nakamura, K; Higuchi, S; Motohashi, Y

    2000-01-01

    To evaluate the reliability and validity of the modified physical performance tests for stroke survivors who live in a community. The subjects included 40 stroke survivors and 40 apparently healthy independent elderly persons. The physical performance tests for the stroke survivors comprised two physical capacity evaluation tasks that represented physical abilities necessary to perform the main activities of daily living, e.g., standing-up ability (time needed to stand up from bed rest) and walking ability (time needed to walk 10 m). Regarding the reliability of tests, significant correlations were confirmed between test and retest of physical performance tests with both short and long intervals in individuals after stroke. Regarding the validity of tests, the authors studied the significant correlations between the maximum isometric strength of the quadriceps muscle and the time needed to walk 10 m, centimeters reached while sitting and reaching, and the time needed to stand up from bed rest. The authors confirmed that there were significant correlations between the instrumental activity of daily living and the time needed to stand up from bed rest, along with the time needed to walk 10 m for the stroke survivors. These physical performance tests are useful guides for evaluating a level of activity of daily living and physical frailty of stroke survivors living in a community.

  2. Content Validity Index and Intra- and Inter-Rater Reliability of a New Muscle Strength/Endurance Test Battery for Swedish Soldiers

    PubMed Central

    Larsson, Helena; Tegern, Matthias; Monnier, Andreas; Skoglund, Jörgen; Helander, Charlotte; Persson, Emelie; Malm, Christer; Broman, Lisbet; Aasa, Ulrika

    2015-01-01

    The objective of this study was to examine the content validity of commonly used muscle performance tests in military personnel and to investigate the reliability of a proposed test battery. For the content validity investigation, thirty selected tests were those described in the literature and/or commonly used in the Nordic and North Atlantic Treaty Organization (NATO) countries. Nine selected experts rated, on a four-point Likert scale, the relevance of these tests in relation to five different work tasks: lifting, carrying equipment on the body or in the hands, climbing, and digging. Thereafter, a content validity index (CVI) was calculated for each work task. The result showed excellent CVI (≥0.78) for sixteen tests, which covered one or more of the military work tasks. Three of the tests, the functional lower-limb loading test (the Ranger test), dead-lift with kettlebells, and back extension, showed excellent content validity for four of the work tasks. For the development of a new muscle strength/endurance test battery, these three tests were further supplemented with two other tests, namely, the chins and side-bridge test. The inter-rater reliability was high (intraclass correlation coefficient, ICC2,1 0.99) for all five tests. The intra-rater reliability was good to high (ICC3,1 0.82–0.96) with an acceptable standard error of mean (SEM), except for the side-bridge test (SEM%>15). Thus, the final suggested test battery for a valid and reliable evaluation of soldiers’ muscle performance comprised the following four tests: the Ranger test, dead-lift with kettlebells, chins, and back extension test. The criterion-related validity of the test battery should be further evaluated for soldiers exposed to varying physical workload. PMID:26177030
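
    A content validity index of the kind reported here can be sketched as follows. The abstract does not state the exact formula, so this sketch assumes the conventional item-level definition: the share of experts who give a rating of 3 or 4 on the four-point scale. The function name `item_cvi` and the ratings are hypothetical.

```python
def item_cvi(ratings, relevant_from=3):
    """Share of experts who rated an item as relevant (3 or 4 on a 1-4 scale)."""
    if not ratings:
        raise ValueError("at least one rating is required")
    return sum(r >= relevant_from for r in ratings) / len(ratings)


EXCELLENT = 0.78  # threshold for "excellent" CVI quoted in the abstract

# Hypothetical ratings from nine experts for one test and one work task.
ratings = [4, 4, 3, 3, 4, 2, 3, 4, 3]

cvi = item_cvi(ratings)
print(f"CVI = {cvi:.2f}, excellent = {cvi >= EXCELLENT}")
```

    With nine raters, a single rating below 3 still leaves the index above 0.78, while two such ratings bring it just below the threshold.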

  3. A Human Proximity Operations System test case validation approach

    NASA Astrophysics Data System (ADS)

    Huber, Justin; Straub, Jeremy

    A Human Proximity Operations System (HPOS) poses numerous risks in a real world environment. These risks range from mundane tasks such as avoiding walls and fixed obstacles to the critical need to keep people and processes safe in the context of the HPOS's situation-specific decision making. Validating the performance of an HPOS, which must operate in a real-world environment, is an ill-posed problem due to the complexity that is introduced by erratic (non-computer) actors. In order to prove the HPOS's usefulness, test cases must be generated to simulate possible actions of these actors, so the HPOS can be shown to be able to perform safely in environments where it will be operated. The HPOS must demonstrate its ability to be as safe as a human, across a wide range of foreseeable circumstances. This paper evaluates the use of test cases to validate HPOS performance and utility. It considers an HPOS's safe performance in the context of a common human activity, moving through a crowded corridor, and extrapolates (based on this) to the suitability of using test cases for AI validation in other areas of prospective application.

  4. Specificity rates for non-clinical, bilingual, Mexican Americans on three popular performance validity measures.

    PubMed

    Gasquoine, Philip G; Weimer, Amy A; Amador, Arnoldo

    2017-04-01

    To measure specificity as failure rates for non-clinical, bilingual, Mexican Americans on three popular performance validity measures: (a) the language format Reliable Digit Span; (b) visual-perceptual format Test of Memory Malingering; and (c) visual-perceptual format Dot Counting, using optimal/suboptimal effort cut scores developed for monolingual, English-speakers. Participants were 61 consecutive referrals, aged between 18 and 65 years, with <16 years of education who were subjectively bilingual (confirmed via formal assessment) and chose the language of assessment, Spanish or English, for the performance validity tests. Failure rates were 38% for Reliable Digit Span, 3% for the Test of Memory Malingering, and 7% for Dot Counting. For Reliable Digit Span, the failure rates for Spanish (46%) and English (31%) languages of administration did not differ significantly. Optimal/suboptimal effort cut scores derived for monolingual English-speakers can be used with Spanish/English bilinguals when using the visual-perceptual format Test of Memory Malingering and Dot Counting. The high failure rate for Reliable Digit Span suggests it should not be used as a performance validity measure with Spanish/English bilinguals, irrespective of the language of test administration, Spanish or English.

  5. C-TOC (Cognitive Testing on Computer): investigating the usability and validity of a novel self-administered cognitive assessment tool in aging and early dementia.

    PubMed

    Jacova, Claudia; McGrenere, Joanna; Lee, Hyunsoo S; Wang, William W; Le Huray, Sarah; Corenblith, Emily F; Brehmer, Matthew; Tang, Charlotte; Hayden, Sherri; Beattie, B Lynn; Hsiung, Ging-Yuek R

    2015-01-01

    Cognitive Testing on Computer (C-TOC) is a novel computer-based test battery developed to improve both usability and validity in the computerized assessment of cognitive function in older adults. C-TOC's usability was evaluated concurrently with its iterative development to version 4 in subjects with and without cognitive impairment, and health professional advisors representing different ethnocultural groups. C-TOC version 4 was then validated against neuropsychological tests (NPTs), and by comparing performance scores of subjects with normal cognition, Cognitive Impairment Not Dementia (CIND) and Alzheimer disease. C-TOC's language tests were validated in subjects with aphasic disorders. The most important usability issue that emerged from consultations with 27 older adults and with 8 cultural advisors was the test-takers' understanding of the task, particularly executive function tasks. User interface features did not pose significant problems. C-TOC version 4 tests correlated with comparator NPT (r=0.4 to 0.7). C-TOC test scores were normal (n=16)>CIND (n=16)>Alzheimer disease (n=6). All normal/CIND NPT performance differences were detected on C-TOC. Low computer knowledge adversely affected test performance, particularly in CIND. C-TOC detected impairments in aphasic disorders (n=11). In general, C-TOC had good validity in detecting cognitive impairment. Ensuring test-takers' understanding of the tasks, and considering their computer knowledge appear important steps towards C-TOC's implementation.

  6. Minimizing false positive error with multiple performance validity tests: response to Bilder, Sugar, and Hellemann (2014 this issue).

    PubMed

    Larrabee, Glenn J

    2014-01-01

    Bilder, Sugar, and Hellemann (2014 this issue) contend that empirical support is lacking for use of multiple performance validity tests (PVTs) in evaluation of the individual case, differing from the conclusions of Davis and Millis (2014), and Larrabee (2014), who found no substantial increase in false positive rates using a criterion of failure of ≥ 2 PVTs and/or Symptom Validity Tests (SVTs) out of multiple tests administered. Reconsideration of data presented in Larrabee (2014) supports a criterion of ≥ 2 out of up to 7 PVTs/SVTs, as keeping false positive rates close to and in most cases below 10% in cases with bona fide neurologic, psychiatric, and developmental disorders. Strategies to minimize risk of false positive error are discussed, including (1) adjusting individual PVT cutoffs or criterion for number of PVTs failed, for examinees who have clinical histories placing them at risk for false positive identification (e.g., severe TBI, schizophrenia), (2) using the history of the individual case to rule out conditions known to result in false positive errors, (3) using normal performance in domains mimicked by PVTs to show that sufficient native ability exists for valid performance on the PVT(s) that have been failed, and (4) recognizing that as the number of PVTs/SVTs failed increases, the likelihood of valid clinical presentation decreases, with a corresponding increase in the likelihood of invalid test performance and symptom report.
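
    The concern about false positives when several validity tests are given can be made concrete with a simplified binomial sketch. It assumes that every test has the same per-test false positive rate (10% here) and that failures are statistically independent. Both are illustrative assumptions rather than claims from the abstract, and real PVTs are typically correlated, so empirical rates need not match this model. The function name `prob_at_least_k_failures` is hypothetical.

```python
from math import comb


def prob_at_least_k_failures(n_tests, k, fp_rate):
    """P(at least k of n independent tests are failed) at a given per-test false positive rate."""
    return sum(
        comb(n_tests, j) * fp_rate**j * (1 - fp_rate) ** (n_tests - j)
        for j in range(k, n_tests + 1)
    )


# Chance that a valid performer fails two or more tests, for batteries of different size.
for n in (2, 5, 7):
    p = prob_at_least_k_failures(n, 2, 0.10)
    print(f"{n} tests: P(>=2 failures) = {p:.3f}")
```

    Under these assumptions the chance of two or more failures grows with the number of tests administered, which is why the choice of criterion and the empirical failure rates in clinical samples matter.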

  7. [Comparison of the Wechsler Memory Scale-III and the Spain-Complutense Verbal Learning Test in acquired brain injury: construct validity and ecological validity].

    PubMed

    Luna-Lario, P; Pena, J; Ojeda, N

    2017-04-16

    To perform an in-depth examination of the construct validity and the ecological validity of the Wechsler Memory Scale-III (WMS-III) and the Spain-Complutense Verbal Learning Test (TAVEC). The sample consists of 106 adults with acquired brain injury who were treated in the Area of Neuropsychology and Neuropsychiatry of the Complejo Hospitalario de Navarra and displayed memory deficit as the main sequela, measured by means of specific memory tests. The construct validity is determined by examining the tasks required in each test over the basic theoretical models, comparing the performance according to the parameters offered by the tests, contrasting the severity indices of each test and analysing their convergence. The external validity is explored through the correlation between the tests and by using regression models. According to the results obtained, both the WMS-III and the TAVEC have construct validity. The TAVEC is more sensitive and captures not only the deficits in mnemonic consolidation, but also in the executive functions involved in memory. The working memory index of the WMS-III is useful for predicting the return to work at two years after the acquired brain injury, but none of the instruments anticipates the disability and dependence at least six months after the injury. We reflect upon the construct validity of the tests and their insufficient capacity to predict functionality when the sequelae become chronic.

  8. A high power ion thruster for deep space missions

    NASA Astrophysics Data System (ADS)

    Polk, James E.; Goebel, Dan M.; Snyder, John S.; Schneider, Analyn C.; Johnson, Lee K.; Sengupta, Anita

    2012-07-01

    The Nuclear Electric Xenon Ion System ion thruster was developed for potential outer planet robotic missions using nuclear electric propulsion (NEP). This engine was designed to operate at power levels ranging from 13 to 28 kW at specific impulses of 6000-8500 s and for burn times of up to 10 years. State-of-the-art performance and life assessment tools were used to design the thruster, which featured 57-cm-diameter carbon-carbon composite grids operating at voltages of 3.5-6.5 kV. Preliminary validation of the thruster performance was accomplished with a laboratory model thruster, while in parallel, a flight-like development model (DM) thruster was completed and two DM thrusters fabricated. The first thruster completed full performance testing and a 2000-h wear test. The second successfully completed vibration tests at the full protoflight levels defined for this NEP program and then passed performance validation testing. The thruster design, performance, and the experimental validation of the design tools are discussed in this paper.

  9. A high power ion thruster for deep space missions.

    PubMed

    Polk, James E; Goebel, Dan M; Snyder, John S; Schneider, Analyn C; Johnson, Lee K; Sengupta, Anita

    2012-07-01

    The Nuclear Electric Xenon Ion System ion thruster was developed for potential outer planet robotic missions using nuclear electric propulsion (NEP). This engine was designed to operate at power levels ranging from 13 to 28 kW at specific impulses of 6000-8500 s and for burn times of up to 10 years. State-of-the-art performance and life assessment tools were used to design the thruster, which featured 57-cm-diameter carbon-carbon composite grids operating at voltages of 3.5-6.5 kV. Preliminary validation of the thruster performance was accomplished with a laboratory model thruster, while in parallel, a flight-like development model (DM) thruster was completed and two DM thrusters fabricated. The first thruster completed full performance testing and a 2000-h wear test. The second successfully completed vibration tests at the full protoflight levels defined for this NEP program and then passed performance validation testing. The thruster design, performance, and the experimental validation of the design tools are discussed in this paper.

  10. Prevalence of Invalid Performance on Baseline Testing for Sport-Related Concussion by Age and Validity Indicator.

    PubMed

    Abeare, Christopher A; Messa, Isabelle; Zuccato, Brandon G; Merker, Bradley; Erdodi, Laszlo

    2018-03-12

    Estimated base rates of invalid performance on baseline testing (base rates of failure) for the management of sport-related concussion range from 6.1% to 40.0%, depending on the validity indicator used. The instability of this key measure represents a challenge in the clinical interpretation of test results that could undermine the utility of baseline testing. The objective was to determine the prevalence of invalid performance on baseline testing and to assess whether the prevalence varies as a function of age and validity indicator. This retrospective, cross-sectional study included data collected between January 1, 2012, and December 31, 2016, from a clinical referral center in the Midwestern United States. Participants included 7897 consecutively tested, equivalently proportioned male and female athletes aged 10 to 21 years, who completed baseline neurocognitive testing for the purpose of concussion management. Baseline assessment was conducted with the Immediate Postconcussion Assessment and Cognitive Testing (ImPACT), a computerized neurocognitive test designed for assessment of concussion. Base rates of failure on published ImPACT validity indicators were compared within and across age groups. Hypotheses were developed after data collection but prior to analyses. Of the 7897 study participants, 4086 (51.7%) were male, mean (SD) age was 14.71 (1.78) years, 7820 (99.0%) were primarily English speaking, and the mean (SD) educational level was 8.79 (1.68) years. The base rate of failure ranged from 6.4% to 47.6% across individual indicators. Most of the sample (55.7%) failed at least 1 of 4 validity indicators. The base rate of failure varied considerably across age groups (117 of 140 [83.6%] for those aged 10 years to 14 of 48 [29.2%] for those aged 21 years), representing a risk ratio of 2.86 (95% CI, 2.60-3.16; P < .001). The results for base rate of failure were surprisingly high overall and varied widely depending on the specific validity indicator and the age of the examinee. The strong age association, with 3 of 4 participants aged 10 to 12 years failing validity indicators, suggests that the clinical interpretation and utility of baseline testing in this age group is questionable. These findings underscore the need for close scrutiny of performance validity indicators on baseline testing across age groups.
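
    The risk ratio quoted in this abstract can be approximately recovered from the two age-group counts it reports. The sketch below computes only the point estimate; the helper name is hypothetical, and the confidence interval is deliberately not reproduced because the abstract does not state how it was derived.

```python
def risk_ratio(events_a, total_a, events_b, total_b):
    """Ratio of the failure proportion in group A to that in group B."""
    return (events_a / total_a) / (events_b / total_b)


# Counts taken from the abstract: 117 of 140 (age 10) vs. 14 of 48 (age 21).
rate_age_10 = 117 / 140
rate_age_21 = 14 / 48
rr = risk_ratio(117, 140, 14, 48)

if __name__ == "__main__":
    print(f"{rate_age_10:.1%} vs {rate_age_21:.1%}, risk ratio ~ {rr:.3f}")
```

    The unrounded ratio comes out slightly above the reported 2.86; a difference of that size may simply reflect rounding of intermediate values.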

  11. Test validity and performance validity: considerations in providing a framework for development of an ability-focused neuropsychological test battery.

    PubMed

    Larrabee, Glenn J

    2014-11-01

    Literature on test validity and performance validity is reviewed to propose a framework for specification of an ability-focused battery (AFB). Factor analysis supports six domains of ability: first, verbal symbolic; secondly, visuoperceptual and visuospatial judgment and problem solving; thirdly, sensorimotor skills; fourthly, attention/working memory; fifthly, processing speed; finally, learning and memory (which can be divided into verbal and visual subdomains). The AFB should include at least three measures for each of the six domains, selected based on various criteria for validity including sensitivity to presence of disorder, sensitivity to severity of disorder, correlation with important activities of daily living, and containing embedded/derived measures of performance validity. Criterion groups should include moderate and severe traumatic brain injury, and Alzheimer's disease. Validation groups should also include patients with left and right hemisphere stroke, to determine measures sensitive to lateralized cognitive impairment and so that the moderating effects of auditory comprehension impairment and neglect can be analyzed on AFB measures. © The Author 2014. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  12. The Stroop test as a measure of performance validity in adults clinically referred for neuropsychological assessment.

    PubMed

    Erdodi, Laszlo A; Sagar, Sanya; Seke, Kristian; Zuccato, Brandon G; Schwartz, Eben S; Roth, Robert M

    2018-06-01

    This study was designed to develop performance validity indicators embedded within the Delis-Kaplan Executive Function Systems (D-KEFS) version of the Stroop task. Archival data from a mixed clinical sample of 132 patients (50% male; M Age = 43.4; M Education = 14.1) clinically referred for neuropsychological assessment were analyzed. Criterion measures included the Warrington Recognition Memory Test-Words and 2 composites based on several independent validity indicators. An age-corrected scaled score ≤6 on any of the 4 trials reliably differentiated psychometrically defined credible and noncredible response sets with high specificity (.87-.94) and variable sensitivity (.34-.71). An inverted Stroop effect was less sensitive (.14-.29), but comparably specific (.85-.90) to invalid performance. Aggregating the newly developed D-KEFS Stroop validity indicators further improved classification accuracy. Failing the validity cutoffs was unrelated to self-reported depression or anxiety. However, it was associated with elevated somatic symptom report. In addition to processing speed and executive function, the D-KEFS version of the Stroop task can function as a measure of performance validity. A multivariate approach to performance validity assessment is generally superior to univariate models. (PsycINFO Database Record (c) 2018 APA, all rights reserved).

  13. Validity threats: overcoming interference with proposed interpretations of assessment data.

    PubMed

    Downing, Steven M; Haladyna, Thomas M

    2004-03-01

    Factors that interfere with the ability to interpret assessment scores or ratings in the proposed manner threaten validity. To be interpreted in a meaningful manner, all assessments in medical education require sound, scientific evidence of validity. The purpose of this essay is to discuss 2 major threats to validity: construct under-representation (CU) and construct-irrelevant variance (CIV). Examples of each type of threat for written, performance and clinical performance examinations are provided. The CU threat to validity refers to undersampling the content domain. Using too few items, cases or clinical performance observations to adequately generalise to the domain represents CU. Variables that systematically (rather than randomly) interfere with the ability to meaningfully interpret scores or ratings represent CIV. Issues such as flawed test items written at inappropriate reading levels or statistically biased questions represent CIV in written tests. For performance examinations, such as standardised patient examinations, flawed cases or cases that are too difficult for student ability contribute CIV to the assessment. For clinical performance data, systematic rater error, such as halo or central tendency error, represents CIV. The term face validity is rejected as representative of any type of legitimate validity evidence, although the fact that the appearance of the assessment may be an important characteristic other than validity is acknowledged. There are multiple threats to validity in all types of assessment in medical education. Methods to eliminate or control validity threats are suggested.

  14. Validity and reliability of a video questionnaire to assess physical function in older adults.

    PubMed

    Balachandran, Anoop; N Verduin, Chelsea; Potiaumpai, Melanie; Ni, Meng; Signorile, Joseph F

    2016-08-01

    Self-report questionnaires are widely used to assess physical function in older adults. However, they often lack a clear frame of reference and hence interpreting and rating task difficulty levels can be problematic for the responder. Consequently, the usefulness of traditional self-report questionnaires for assessing higher-level functioning is limited. Video-based questionnaires can overcome some of these limitations by offering a clear and objective visual reference for the performance level against which the subject is to compare his or her perceived capacity. Hence the purpose of the study was to develop and validate a novel, video-based questionnaire to assess physical function in older adults independently living in the community. A total of 61 community-living adults, 60 years or older, were recruited. To examine validity, 35 of the subjects completed the video questionnaire and two types of physical performance tests: a test of instrumental activity of daily living (IADL) included in the Short Physical Functional Performance battery (PFP-10), and a composite of 3 performance tests (30-s chair stand, single-leg balance and usual gait speed). To ascertain reliability, two-week test-retest reliability was assessed in the remaining 26 subjects who did not participate in validity testing. The video questionnaire showed a moderate correlation with the IADLs (Spearman rho=0.64, p<0.001; 95% CI (0.4, 0.8)), and a lower correlation with the composite score of physical performance tests (Spearman rho=0.49, p<0.01; 95% CI (0.18, 0.7)). The test-retest assessment yielded an intra-class correlation (ICC) of 0.87 (p<0.001; 95% CI (0.70, 0.94)) and a Cronbach's alpha of 0.89 demonstrating good reliability and internal consistency. Our results show that the video questionnaire developed to evaluate physical function in community-living older adults is a valid and reliable assessment tool; however, further validation is needed for definitive conclusions. Copyright © 2016 Elsevier Inc. All rights reserved.

  15. Assessment of human epidermal model LabCyte EPI-MODEL for in vitro skin irritation testing according to European Centre for the Validation of Alternative Methods (ECVAM)-validated protocol.

    PubMed

    Katoh, Masakazu; Hamajima, Fumiyasu; Ogasawara, Takahiro; Hata, Ken-Ichiro

    2009-06-01

    A validation study of an in vitro skin irritation testing method using a reconstructed human skin model has been conducted by the European Centre for the Validation of Alternative Methods (ECVAM), and a protocol using EpiSkin (SkinEthic, France) has been approved. The structural and performance criteria of skin models for testing are defined in the ECVAM Performance Standards announced along with the approval. We have performed several evaluations of the new reconstructed human epidermal model LabCyte EPI-MODEL, and confirmed that it is applicable to skin irritation testing as defined in the ECVAM Performance Standards. We selected 19 materials (nine irritants and ten non-irritants) available in Japan as test chemicals among the 20 reference chemicals described in the ECVAM Performance Standard. A test chemical was applied to the surface of the LabCyte EPI-MODEL for 15 min, after which it was completely removed and the model then post-incubated for 42 hr. Cell viability was measured by MTT assay and skin irritancy of the test chemical evaluated. In addition, interleukin-1 alpha (IL-1alpha) concentration in the culture supernatant after post-incubation was measured to provide a complementary evaluation of skin irritation. Evaluation of the 19 test chemicals resulted in 79% accuracy, 78% sensitivity and 80% specificity, confirming that the in vitro skin irritancy of the LabCyte EPI-MODEL correlates highly with in vivo skin irritation. These results suggest that LabCyte EPI-MODEL is applicable to the skin irritation testing protocol set out in the ECVAM Performance Standards.
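
    The reported accuracy, sensitivity and specificity are mutually consistent with the stated sample of nine irritants and ten non-irritants. The sketch below shows the arithmetic. The abstract does not give the underlying counts, so the values used here (7 of 9 irritants and 8 of 10 non-irritants correctly classified) are inferred assumptions chosen because they reproduce the rounded percentages; the function name is likewise hypothetical.

```python
def classification_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity and accuracy from a 2x2 confusion table."""
    return {
        "sensitivity": tp / (tp + fn),          # irritants correctly flagged
        "specificity": tn / (tn + fp),          # non-irritants correctly cleared
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


# Inferred (not reported) counts: 7/9 irritants and 8/10 non-irritants correct.
metrics = classification_metrics(tp=7, fn=2, tn=8, fp=2)

if __name__ == "__main__":
    for name, value in metrics.items():
        print(f"{name}: {value:.1%}")
```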

  16. Predictive validity of pre-admission assessments on medical student performance.

    PubMed

    Dabaliz, Al-Awwab; Kaadan, Samy; Dabbagh, M Marwan; Barakat, Abdulaziz; Shareef, Mohammad Abrar; Al-Tannir, Mohamad; Obeidat, Akef; Mohamed, Ayman

    2017-11-24

    To examine the predictive validity of pre-admission variables on students' performance in a medical school in Saudi Arabia. In this retrospective study, we collected admission and college performance data for 737 students in preclinical and clinical years. Data included high school scores and other standardized test scores, such as those of the National Achievement Test and the General Aptitude Test. Additionally, we included the scores of the Test of English as a Foreign Language (TOEFL) and the International English Language Testing System (IELTS) exams. Those datasets were then compared with college performance indicators, namely the cumulative Grade Point Average (cGPA) and progress test, using multivariate linear regression analysis. In preclinical years, both the National Achievement Test (p=0.04, B=0.08) and TOEFL (p=0.017, B=0.01) scores were positive predictors of cGPA, whereas the General Aptitude Test (p=0.048, B=-0.05) negatively predicted cGPA. Moreover, none of the pre-admission variables were predictive of progress test performance in the same group. On the other hand, none of the pre-admission variables were predictive of cGPA in clinical years. Overall, cGPA strongly predicted students' progress test performance (p<0.001 and B=19.02). Only the National Achievement Test and TOEFL significantly predicted performance in preclinical years. However, these variables do not predict progress test performance, meaning that they do not predict the functional knowledge reflected in the progress test. We report various strengths and deficiencies in the current medical college admission criteria, and call for employing more sensitive and valid ones that predict student performance and functional knowledge, especially in the clinical years.

  17. Predictive validity of pre-admission assessments on medical student performance

    PubMed Central

    Dabaliz, Al-Awwab; Kaadan, Samy; Dabbagh, M. Marwan; Barakat, Abdulaziz; Shareef, Mohammad Abrar; Al-Tannir, Mohamad; Obeidat, Akef

    2017-01-01

    Objectives To examine the predictive validity of pre-admission variables on students’ performance in a medical school in Saudi Arabia. Methods In this retrospective study, we collected admission and college performance data for 737 students in preclinical and clinical years. Data included high school scores and other standardized test scores, such as those of the National Achievement Test and the General Aptitude Test. Additionally, we included the scores of the Test of English as a Foreign Language (TOEFL) and the International English Language Testing System (IELTS) exams. Those datasets were then compared with college performance indicators, namely the cumulative Grade Point Average (cGPA) and progress test, using multivariate linear regression analysis. Results In preclinical years, both the National Achievement Test (p=0.04, B=0.08) and TOEFL (p=0.017, B=0.01) scores were positive predictors of cGPA, whereas the General Aptitude Test (p=0.048, B=-0.05) negatively predicted cGPA. Moreover, none of the pre-admission variables were predictive of progress test performance in the same group. On the other hand, none of the pre-admission variables were predictive of cGPA in clinical years. Overall, cGPA strongly predicted students’ progress test performance (p<0.001 and B=19.02). Conclusions Only the National Achievement Test and TOEFL significantly predicted performance in preclinical years. However, these variables do not predict progress test performance, meaning that they do not predict the functional knowledge reflected in the progress test. We report various strengths and deficiencies in the current medical college admission criteria, and call for employing more sensitive and valid ones that predict student performance and functional knowledge, especially in the clinical years. PMID:29176032

  18. The Validity and Incremental Validity of Knowledge Tests, Low-Fidelity Simulations, and High-Fidelity Simulations for Predicting Job Performance in Advanced-Level High-Stakes Selection

    ERIC Educational Resources Information Center

    Lievens, Filip; Patterson, Fiona

    2011-01-01

    In high-stakes selection among candidates with considerable domain-specific knowledge and experience, investigations of whether high-fidelity simulations (assessment centers; ACs) have incremental validity over low-fidelity simulations (situational judgment tests; SJTs) are lacking. Therefore, this article integrates research on the validity of…

  19. Translation, cultural adaptation and validation of the Diabetes Attitudes Scale - third version into Brazilian Portuguese

    PubMed Central

    Vieira, Gisele de Lacerda Chaves; Pagano, Adriana Silvino; Reis, Ilka Afonso; Rodrigues, Júlia Santos Nunes; Torres, Heloísa de Carvalho

    2018-01-01

    ABSTRACT Objective: to perform the translation, adaptation and validation of the Diabetes Attitudes Scale - third version instrument into Brazilian Portuguese. Methods: methodological study carried out in six stages: initial translation, synthesis of the initial translation, back-translation, evaluation of the translated version by the Committee of Judges (27 Linguists and 29 health professionals), pre-test and validation. The pre-test and validation (test-retest) steps included 22 and 120 health professionals, respectively. The Content Validity Index, the analyses of internal consistency and reproducibility were performed using the R statistical program. Results: in the content validation, the instrument presented good acceptance among the Judges with a mean Content Validity Index of 0.94. The scale presented acceptable internal consistency (Cronbach’s alpha = 0.60), while the correlation of the total score at the test and retest moments was considered high (Polychoric Correlation Coefficient = 0.86). The Intra-class Correlation Coefficient, for the total score, presented a value of 0.65. Conclusion: the Brazilian version of the instrument (Escala de Atitudes dos Profissionais em relação ao Diabetes Mellitus) was considered valid and reliable for application by health professionals in Brazil. PMID:29319739

  20. The development and testing of a skin tear risk assessment tool.

    PubMed

    Newall, Nelly; Lewin, Gill F; Bulsara, Max K; Carville, Keryln J; Leslie, Gavin D; Roberts, Pam A

    2017-02-01

    The aim of the present study is to develop a reliable and valid skin tear risk assessment tool. The six characteristics identified in a previous case control study as constituting the best risk model for skin tear development were used to construct a risk assessment tool. The ability of the tool to predict skin tear development was then tested in a prospective study. Between August 2012 and September 2013, 1466 tertiary hospital patients were assessed at admission and followed up for 10 days to see if they developed a skin tear. The predictive validity of the tool was assessed using receiver operating characteristic (ROC) analysis. When the tool was found not to have performed as well as hoped, secondary analyses were performed to determine whether a potentially better performing risk model could be identified. The tool was found to have high sensitivity but low specificity and therefore have inadequate predictive validity. Secondary analysis of the combined data from this and the previous case control study identified an alternative better performing risk model. The tool developed and tested in this study was found to have inadequate predictive validity. The predictive validity of an alternative, more parsimonious model now needs to be tested. © 2015 Medicalhelplines.com Inc and John Wiley & Sons Ltd.

  1. What tests should you use to assess small intestinal bacterial overgrowth in systemic sclerosis?

    PubMed

    Braun-Moscovici, Yolanda; Braun, Marius; Khanna, Dinesh; Balbir-Gurman, Alexandra; Furst, Daniel E

    2015-01-01

    Small intestinal bacterial overgrowth (SIBO) plays a major role in the pathogenesis of malabsorption in SSc patients and is a source of great morbidity and even mortality, in those patients. This manuscript reviews which tests are valid and should be used in SSc when evaluating SIBO. We performed systematic literature searches in PubMed, Embase and the Cochrane library from 1966 up to November 2014 for English language, published articles examining bacterial overgrowth in SSc (e.g. malabsorption tests, breath tests, xylose test, etc). Articles obtained from these searches were reviewed for additional references. The validity of the tests was evaluated according to the OMERACT principles of truth, discrimination and feasibility. From a total of 65 titles, 22 articles were reviewed and 20 were ultimately extracted to examine the validity of tests for GI morphology, bacterial overgrowth and malabsorption in SSc. Only 1 test (hydrogen and methane breath tests) is fully validated. Four tests are partially validated, including jejunal cultures, xylose, lactulose tests, and 72 hours fecal fat test. Only 1 of a total of 5 GI tests of bacterial overgrowth (see above) is fully validated in SSc. For clinical trials, fully validated tests are preferred, although some investigators use partially validated tests (4 tests). Further validation of GI tests in SSc is needed.

  2. Revalidation of the NASA Ames 11-by 11-Foot Transonic Wind Tunnel with a Commercial Airplane Model

    NASA Technical Reports Server (NTRS)

    Kmak, Frank J.; Hudgins, M.; Hergert, D.; George, Michael W. (Technical Monitor)

    2001-01-01

    The 11-By 11-Foot Transonic leg of the Unitary Plan Wind Tunnel (UPWT) was modernized to improve tunnel performance, capability, productivity, and reliability. Wind tunnel tests to demonstrate the readiness of the tunnel for a return to production operations included an Integrated Systems Test (IST), calibration tests, and airplane validation tests. One of the two validation tests was a 0.037-scale Boeing 777 model that was previously tested in the 11-By 11-Foot tunnel in 1991. The objective of the validation tests was to compare pre-modernization and post-modernization results from the same airplane model in order to substantiate the operational readiness of the facility. Evaluation of within-test, test-to-test, and tunnel-to-tunnel data repeatability were made to study the effects of the tunnel modifications. Tunnel productivity was also evaluated to determine the readiness of the facility for production operations. The operation of the facility, including model installation, tunnel operations, and the performance of tunnel systems, was observed and facility deficiency findings generated. The data repeatability studies and tunnel-to-tunnel comparisons demonstrated outstanding data repeatability and a high overall level of data quality. Despite some operational and facility problems, the validation test was successful in demonstrating the readiness of the facility to perform production airplane wind tunnel tests.

  3. Testing expert systems

    NASA Technical Reports Server (NTRS)

    Chang, C. L.; Stachowitz, R. A.

    1988-01-01

    Software quality is of primary concern in all large-scale expert system development efforts. Building appropriate validation and test tools for ensuring software reliability of expert systems is therefore required. The Expert Systems Validation Associate (EVA) is a validation system under development at the Lockheed Artificial Intelligence Center. EVA provides a wide range of validation and test tools to check correctness, consistency, and completeness of an expert system. Testing is a major function of EVA. It means executing an expert system with test cases with the intent of finding errors. In this paper, we describe many different types of testing such as function-based testing, structure-based testing, and data-based testing. We describe how appropriate test cases may be selected in order to perform good and thorough testing of an expert system.

  4. The prone bridge test: Performance, validity, and reliability among older and younger adults.

    PubMed

    Bohannon, Richard W; Steffl, Michal; Glenney, Susan S; Green, Michelle; Cashwell, Leah; Prajerova, Kveta; Bunn, Jennifer

    2018-04-01

    The prone bridge maneuver, or plank, has been viewed as a potential alternative to curl-ups for assessing trunk muscle performance. The purpose of this study was to assess prone bridge test performance, validity, and reliability among younger and older adults. Sixty younger (20-35 years old) and 60 older (60-79 years old) participants completed this study. Groups were evenly divided by sex. Participants completed surveys regarding physical activity and abdominal exercise participation. Height, weight, body mass index (BMI), and waist circumference were measured. On two occasions, 5-9 days apart, participants held a prone bridge until volitional exhaustion or until repeated technique failure. Validity was examined using data from the first session: convergent validity by calculating correlations between survey responses, anthropometrics, and prone bridge time, and known-groups validity by using an ANOVA comparing bridge times of younger and older adults and of men and women. Test-retest reliability was examined by using a paired t-test to compare prone bridge times for Session 1 and Session 2. Furthermore, an intraclass correlation coefficient (ICC) was used to characterize relative reliability and minimal detectable change (MDC95%) was used to describe absolute reliability. The mean prone bridge time was 145.3 ± 71.5 s, and was positively correlated with physical activity participation (p ≤ 0.001) and negatively correlated with BMI and waist circumference (p ≤ 0.003). Younger participants had significantly longer plank times than older participants (p = 0.003). The ICC between testing sessions was 0.915. The prone bridge test is a valid and reliable measure for evaluating abdominal performance in both younger and older adults. Copyright © 2017 Elsevier Ltd. All rights reserved.
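
    The abstract names the minimal detectable change but does not report its value. One commonly used way to derive it from an ICC is via the standard error of measurement, SEM = SD × sqrt(1 − ICC), with MDC95 = 1.96 × sqrt(2) × SEM. The sketch below applies that convention; whether the authors used exactly this formula, and whether the reported 71.5 s is the appropriate SD to plug in, are both assumptions, so the result should be read as illustrative only. Function names are hypothetical.

```python
from math import sqrt


def standard_error_of_measurement(sd, icc):
    """SEM = SD * sqrt(1 - ICC)."""
    return sd * sqrt(1.0 - icc)


def minimal_detectable_change_95(sd, icc):
    """MDC95 = 1.96 * sqrt(2) * SEM (two measurements, 95% confidence)."""
    return 1.96 * sqrt(2.0) * standard_error_of_measurement(sd, icc)


if __name__ == "__main__":
    # SD and ICC as quoted in the abstract; using them together is an assumption.
    sd_seconds, icc = 71.5, 0.915
    print(f"SEM   ~ {standard_error_of_measurement(sd_seconds, icc):.1f} s")
    print(f"MDC95 ~ {minimal_detectable_change_95(sd_seconds, icc):.1f} s")
```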

  5. 6DOF Testing of the SLS Inertial Navigation Unit

    NASA Technical Reports Server (NTRS)

    Geohagan, Kevin W.; Bernard, William P.; Oliver, T. Emerson; Strickland, Dennis J.; Leggett, Jared O.

    2018-01-01

    The Navigation System on the NASA Space Launch System (SLS) Block 1 vehicle performs initial alignment of the Inertial Navigation System (INS) navigation frame through gyrocompass alignment (GCA). In lieu of direct testing of GCA accuracy in support of requirement verification, the SLS Navigation Team proposed and conducted an engineering test to, among other things, validate the GCA performance and overall behavior of the SLS INS model through comparison with test data. This paper will detail dynamic hardware testing of the SLS INS, conducted by the SLS Navigation Team at Marshall Space Flight Center's 6DOF Table Facility, in support of GCA performance characterization and INS model validation. A 6-DOF motion platform was used to produce 6DOF pad twist and sway dynamics while a simulated SLS flight computer communicated with the INS. Tests conducted include an evaluation of GCA algorithm robustness to increasingly dynamic pad environments, an examination of GCA algorithm stability and accuracy over long durations, and a long-duration static test to gather enough data for Allan Variance analysis. Test setup, execution, and data analysis will be discussed, including analysis performed in support of SLS INS model validation.
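
    The abstract mentions a long-duration static test to gather data for Allan Variance analysis but gives no detail on the computation. As a minimal sketch of the general technique, the function below computes the non-overlapping Allan variance of evenly sampled rate data for one averaging factor. It is not the SLS team's implementation; the function name and the sample data are hypothetical.

```python
def allan_variance(rate_samples, m):
    """Non-overlapping Allan variance at averaging time m * tau0.

    rate_samples: evenly spaced sensor readings (e.g. gyro rate).
    m: number of consecutive samples averaged into each cluster.
    """
    if m < 1:
        raise ValueError("m must be a positive integer")
    n_clusters = len(rate_samples) // m
    if n_clusters < 2:
        raise ValueError("need at least two full clusters")
    # Average each block of m samples, then difference adjacent averages.
    means = [
        sum(rate_samples[i * m:(i + 1) * m]) / m for i in range(n_clusters)
    ]
    squared_diffs = [
        (means[i + 1] - means[i]) ** 2 for i in range(n_clusters - 1)
    ]
    return 0.5 * sum(squared_diffs) / len(squared_diffs)


if __name__ == "__main__":
    samples = [0.0, 1.0] * 4  # hypothetical alternating readings
    for m in (1, 2):
        print(m, allan_variance(samples, m))
```

    In practice the variance is evaluated over a range of averaging times and its square root (the Allan deviation) is plotted on log-log axes, which is why a long static recording is needed.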

  6. Symptom validity testing in memory clinics: Hippocampal-memory associations and relevance for diagnosing mild cognitive impairment.

    PubMed

    Rienstra, Anne; Groot, Paul F C; Spaan, Pauline E J; Majoie, Charles B L M; Nederveen, Aart J; Walstra, Gerard J M; de Jonghe, Jos F M; van Gool, Willem A; Olabarriaga, Silvia D; Korkhov, Vladimir V; Schmand, Ben

    2013-01-01

    Patients with mild cognitive impairment (MCI) do not always convert to dementia. In such cases, abnormal neuropsychological test results may not validly reflect cognitive symptoms due to brain disease, and the usual brain-behavior relationships may be absent. This study examined symptom validity in a memory clinic sample and its effect on the associations between hippocampal volume and memory performance. Eleven of 170 consecutive patients (6.5%; 13% of patients younger than 65 years) referred to memory clinics showed noncredible performance on symptom validity tests (SVTs, viz. Word Memory Test and Test of Memory Malingering). They were compared to a demographically matched group (n = 57) selected from the remaining patients. Hippocampal volume, measured by an automated volumetric method (Freesurfer), was correlated with scores on six verbal memory tests. The median correlation was r = .49 in the matched group. However, the relation was absent (median r = -.11) in patients who failed SVTs. Memory clinic samples may include patients who show noncredible performance, which invalidates their MCI diagnosis. This underscores the importance of applying SVTs in evaluating patients with cognitive complaints that may signify a predementia stage, especially when these patients are relatively young.

  7. Validity of FAA-approved color vision tests for class II and class III aeromedical screening.

    DOT National Transportation Integrated Search

    1993-09-01

    All clinical color vision tests currently used in the medical examination of pilots were studied regarding validity for prediction of performance on practical tests of ability to discriminate the aviation signal colors, red, green, and white given un...

  8. Evaluating the dynamic response of in-flight thrust calculation techniques during throttle transients

    NASA Technical Reports Server (NTRS)

    Ray, Ronald J.

    1994-01-01

    New flight test maneuvers and analysis techniques for evaluating the dynamic response of in-flight thrust models during throttle transients have been developed and validated. The approach is based on the aircraft and engine performance relationship between thrust and drag. Two flight test maneuvers, a throttle step and a throttle frequency sweep, were developed and used in the study. Graphical analysis techniques, including a frequency domain analysis method, were also developed and evaluated. They provide quantitative and qualitative results. Four thrust calculation methods were used to demonstrate and validate the test technique. Flight test applications on two high-performance aircraft confirmed the test methods as valid and accurate. These maneuvers and analysis techniques were easy to implement and use. Flight test results indicate the analysis techniques can identify the combined effects of model error and instrumentation response limitations on the calculated thrust value. The methods developed in this report provide an accurate approach for evaluating, validating, or comparing thrust calculation methods for dynamic flight applications.

  9. The reliability and validity of fatigue measures during multiple-sprint work: an issue revisited.

    PubMed

    Glaister, Mark; Howatson, Glyn; Pattison, John R; McInnes, Gill

    2008-09-01

    The ability to repeatedly produce a high-power output or sprint speed is a key fitness component of most field and court sports. The aim of this study was to evaluate the validity and reliability of eight different approaches to quantify this parameter in tests of multiple-sprint performance. Ten physically active men completed two trials of each of two multiple-sprint running protocols with contrasting recovery periods. Protocol 1 consisted of 12 x 30-m sprints repeated every 35 seconds; protocol 2 consisted of 12 x 30-m sprints repeated every 65 seconds. All testing was performed in an indoor sports facility, and sprint times were recorded using twin-beam photocells. All but one of the formulae showed good construct validity, as evidenced by similar within-protocol fatigue scores. However, the assumptions on which many of the formulae were based, combined with poor or inconsistent test-retest reliability (coefficient of variation range: 0.8-145.7%; intraclass correlation coefficient range: 0.09-0.75), suggested many problems regarding logical validity. In line with previous research, the results support the percentage decrement calculation as the most valid and reliable method of quantifying fatigue in tests of multiple-sprint performance.
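
The abstract names the percentage decrement calculation but does not spell it out. A minimal sketch follows, assuming the commonly used formulation in which the total sprint time is compared with an "ideal" total (fastest sprint time multiplied by the number of sprints). The function name and the sample times are hypothetical and are not data from the study.

```python
def percentage_decrement(sprint_times):
    """Fatigue score: percent by which the total time exceeds the ideal total.

    Assumed formulation: 100 * (total time / ideal time) - 100,
    where ideal time = fastest sprint time * number of sprints.
    """
    if not sprint_times:
        raise ValueError("need at least one sprint time")
    total = sum(sprint_times)
    ideal = min(sprint_times) * len(sprint_times)
    return 100.0 * total / ideal - 100.0


# Hypothetical 12 x 30-m sprint times in seconds (illustrative only).
times = [4.50, 4.52, 4.55, 4.58, 4.60, 4.63, 4.65, 4.68, 4.70, 4.72, 4.75, 4.78]
print(f"fatigue: {percentage_decrement(times):.2f}%")
```

A score of 0 means every sprint matched the fastest one; larger values indicate a greater drop-off across the set.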

  10. Changing abilities vs. changing tasks: Examining validity degradation with test scores and college performance criteria both assessed longitudinally.

    PubMed

    Dahlke, Jeffrey A; Kostal, Jack W; Sackett, Paul R; Kuncel, Nathan R

    2018-05-03

    We explore potential explanations for validity degradation using a unique predictive validation data set containing up to four consecutive years of high school students' cognitive test scores and four complete years of those students' college grades. This data set permits analyses that disentangle the effects of predictor-score age and timing of criterion measurements on validity degradation. We investigate the extent to which validity degradation is explained by criterion dynamism versus the limited shelf-life of ability scores. We also explore whether validity degradation is attributable to fluctuations in criterion variability over time and/or GPA contamination from individual differences in course-taking patterns. Analyses of multiyear predictor data suggest that changes to the determinants of performance over time have much stronger effects on validity degradation than does the shelf-life of cognitive test scores. The age of predictor scores had only a modest relationship with criterion-related validity when the criterion measurement occasion was held constant. Practical implications and recommendations for future research are discussed. (PsycINFO Database Record (c) 2018 APA, all rights reserved).

  11. An Evaluation of Test Speededness in an Assessment for Third-Grade Gifted Students

    ERIC Educational Resources Information Center

    Hailey, Emily; Callahan, Carolyn M.; Azano, Amy; Moon, Tonya R.

    2012-01-01

    Reliability and validity are integral concepts in assessment design. Test speededness, the influence of time constraints on test taker performance, is often an overlooked threat to reliability and validity, especially in classroom-based testing. The purpose of this study is to evaluate the degree of test speededness of classroom-based assessments…

  12. TESTING BALANCE AND FALL RISK IN PERSONS WITH PARKINSON DISEASE, AN ARGUMENT FOR ECOLOGICALLY VALID TESTING

    PubMed Central

    Foreman, K. Bo; Addison, Odessa; Kim, Han S.; Dibble, Leland E.

    2010-01-01

    Introduction: Despite clear deficits in postural control, most clinical examination tools lack accuracy in identifying persons with Parkinson disease (PD) who have fallen or are at risk for falls. We assert that this is in part due to the lack of ecological validity of the testing. Methods: To test this assertion, we examined the responsiveness and predictive validity of the Functional Gait Assessment (FGA), the Pull test, and the Timed up and Go (TUG) during clinically defined ON and OFF medication states. To address responsiveness, ON/OFF medication performance was compared. To address predictive validity, areas under the curve (AUC) of receiver operating characteristic (ROC) curves were compared. Comparisons were made using separate non-parametric tests. Results: Thirty-six persons (24 male, 12 female) with PD (22 fallers, 14 non-fallers) participated. Only the FGA was able to detect differences between fallers and non-fallers for both ON/OFF medication testing. The predictive validity of the FGA and the TUG for fall identification was higher during OFF medication compared to ON medication testing. The predictive validity of the FGA was higher than the TUG and the Pull test during ON and OFF medication testing. Discussion: In order to most accurately identify fallers, clinicians should test persons with PD in ecologically relevant conditions and tasks. In this study, interpretation of the OFF medication performance and use of the FGA provided more accurate prediction of those who would fall. PMID:21215674
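
Predictive validity in this record is summarized as the area under the ROC curve. As an illustration only, the sketch below computes an AUC using its rank interpretation: the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, with ties counted as half. The function name and the scores are hypothetical; the scores are assumed to be oriented so that higher means higher risk, so a scale on which lower values indicate greater risk would need to be reversed first.

```python
def auc(scores_pos, scores_neg):
    """Area under the ROC curve via pairwise comparison.

    scores_pos: scores of cases with the outcome (e.g. fallers)
    scores_neg: scores of cases without it (e.g. non-fallers)
    Higher scores are assumed to indicate higher risk.
    """
    if not scores_pos or not scores_neg:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical risk scores (illustrative only, not study data).
fallers = [7, 9, 6, 8]
non_fallers = [3, 6, 4, 5]
print(f"AUC = {auc(fallers, non_fallers):.3f}")
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of the two groups.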

  13. Ultrasonic inspection of a glued laminated timber fabricated with defects

    Treesearch

    Robert Emerson; David Pollock; David McLean; Kenneth Fridley; Robert Ross; Roy Pellerin

    2001-01-01

    The Federal Highway Administration (FHWA) set up a validation test to compare the effectiveness of various nondestructive inspection techniques for detecting artificial defects in glulam members. The validation test consisted of a glulam beam fabricated with artificial defects known to FHWA personnel but not originally known to the scientists performing the validation...

  14. Domestic violence on children: development and validation of an instrument to evaluate knowledge of health professionals

    PubMed Central

    Oliveira, Lanuza Borges; Soares, Fernanda Amaral; Silveira, Marise Fagundes; de Pinho, Lucinéia; Caldeira, Antônio Prates; Leite, Maísa Tavares de Souza

    2016-01-01

    ABSTRACT Objective: to develop and validate an instrument to evaluate the knowledge of health professionals about domestic violence on children. Method: this was a study conducted with 194 physicians, nurses and dentists. A literature review was performed for preparation of the items and identification of the dimensions. Apparent and content validation was performed using analysis of three experts and 27 professors of the pediatric health discipline. For construct validation, Cronbach's alpha was used, and the Kappa test was applied to verify reproducibility. The criterion validation was conducted using the Student's t-test. Results: the final instrument included 56 items; the Cronbach alpha was 0.734, the Kappa test showed a correlation greater than 0.6 for most items, and the Student t-test showed a statistically significant value to the level of 5% for the two selected variables: years of education and using the Family Health Strategy. Conclusion: the instrument is valid and can be used as a promising tool to develop or direct actions in public health and evaluate knowledge about domestic violence on children. PMID:27556878
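
The record reports a Cronbach's alpha without showing how such a coefficient is obtained. A minimal sketch of the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), is given below. The function name and the response matrix are hypothetical and do not reproduce the study's data.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a respondents-by-items score matrix.

    rows: list of lists, one inner list of item scores per respondent.
    Uses sample variances throughout.
    """
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    item_columns = list(zip(*rows))  # transpose: one tuple per item
    item_variance_sum = sum(variance(col) for col in item_columns)
    total_variance = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical responses: 5 respondents x 4 items (illustrative only).
responses = [
    [3, 4, 3, 4],
    [2, 2, 3, 2],
    [4, 5, 4, 5],
    [1, 2, 1, 2],
    [3, 3, 4, 3],
]
print(f"alpha = {cronbach_alpha(responses):.3f}")
```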

  15. vvtools v. 1.0

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Drake, Richard R.

    Vvtools is a suite of testing tools, with a focus on reproducible verification and validation. They are written in pure Python, and contain a test harness and an automated process management tool. Users of vvtools can develop suites of verification and validation tests and run them on small to large high performance computing resources in an automated and reproducible way. The test harness enables complex processes to be performed in each test and even supports a one-level parent/child dependency between tests. It includes a built in capability to manage workloads requiring multiple processors and platforms that use batch queueing systems.

  16. Readability Level of Standardized Test Items and Student Performance: The Forgotten Validity Variable

    ERIC Educational Resources Information Center

    Hewitt, Margaret A.; Homan, Susan P.

    2004-01-01

    Test validity issues considered by test developers and school districts rarely include individual item readability levels. In this study, items from a major standardized test were examined for individual item readability level and item difficulty. The Homan-Hewitt Readability Formula was applied to items across three grade levels. Results of…

  17. An exploratory study into the effect of time-restricted internet access on face-validity, construct validity and reliability of postgraduate knowledge progress testing

    PubMed Central

    2013-01-01

    Background: Yearly formative knowledge testing (also known as progress testing) was shown to have a limited construct-validity and reliability in postgraduate medical education. One way to improve construct-validity and reliability is to improve the authenticity of a test. As easily accessible internet has become inseparably linked to daily clinical practice, we hypothesized that allowing internet access for a limited amount of time during the progress test would improve the perception of authenticity (face-validity) of the test, which would in turn improve the construct-validity and reliability of postgraduate progress testing. Methods: Postgraduate trainees taking the yearly knowledge progress test were asked to participate in a study where they could access the internet for 30 minutes at the end of a traditional pen and paper test. Before and after the test they were asked to complete a short questionnaire regarding the face-validity of the test. Results: Mean test scores increased significantly for all training years. Trainees indicated that the face-validity of the test improved with internet access and that they would like to continue to have internet access during future testing. Internet access did not improve the construct-validity or reliability of the test. Conclusion: Improving the face-validity of postgraduate progress testing, by adding the possibility to search the internet for a limited amount of time, positively influences test performance and face-validity. However, it did not change the reliability or the construct-validity of the test. PMID:24195696

  18. Predicting Job Performance for the Visually Impaired: Validity of the Fine Finger Dexterity Work Task.

    ERIC Educational Resources Information Center

    Giesen, J. Martin; And Others

    The study was designed to determine the reliability and criterion validity of a psychomotor performance test (the Fine Finger Dexterity Work Task Unit) with 40 partially or totally blind adults. Reliability was established by using the test-retest method. A supervisory rating was developed and the reliability established by using the split-half…

  19. Six Years of Comprehensive, Clinical, Performance-Based Assessment Using Standardized Patients at the Southern Illinois University School of Medicine.

    ERIC Educational Resources Information Center

    Vu, Nu Viet; And Others

    1992-01-01

    The use of a performance-based assessment of senior medical students' clinical skills utilizing standardized patients was evaluated, with 6,804 student-patient encounters involving 405 students over 6 years. Results provide evidence for test security, content validity, construct validity, reliability, and test ability to discriminate a wide range…

  20. Predicting psychopharmacological drug effects on actual driving performance (SDLP) from psychometric tests measuring driving-related skills.

    PubMed

    Verster, Joris C; Roth, Thomas

    2012-03-01

    There are various methods to examine driving ability. Comparisons between these methods and their relationship with actual on-road driving are often not determined. The objective of this study was to determine whether laboratory tests measuring driving-related skills could adequately predict on-the-road driving performance during normal traffic. Ninety-six healthy volunteers performed a standardized on-the-road driving test. Subjects were instructed to drive with a constant speed and steady lateral position within the right traffic lane. Standard deviation of lateral position (SDLP), i.e., the weaving of the car, was determined. The subjects also performed a psychometric test battery including the DSST, Sternberg memory scanning test, a tracking test, and a divided attention test. Difference scores from placebo for parameters of the psychometric tests and SDLP were computed and correlated with each other. A stepwise linear regression analysis determined the predictive validity of the laboratory test battery to SDLP. Stepwise regression analyses revealed that the combination of five parameters, hard tracking, tracking and reaction time of the divided attention test, and reaction time and percentage of errors of the Sternberg memory scanning test, together had a predictive validity of 33.4%. The psychometric tests in this test battery showed insufficient predictive validity to replace the on-the-road driving test during normal traffic.

  1. Simulation verification techniques study: Simulation performance validation techniques document. [for the space shuttle system]

    NASA Technical Reports Server (NTRS)

    Duncan, L. M.; Reddell, J. P.; Schoonmaker, P. B.

    1975-01-01

    Techniques and support software for the efficient performance of simulation validation are discussed. Overall validation software structure, the performance of validation at various levels of simulation integration, guidelines for check case formulation, methods for real time acquisition and formatting of data from an all up operational simulator, and methods and criteria for comparison and evaluation of simulation data are included. Vehicle subsystems modules, module integration, special test requirements, and reference data formats are also described.

  2. Reliability and Validity of the Standing Heel-Rise Test

    ERIC Educational Resources Information Center

    Yocum, Allison; McCoy, Sarah Westcott; Bjornson, Kristie F.; Mullens, Pamela; Burton, Gay Naganuma

    2010-01-01

    A standardized protocol for a pediatric heel-rise test was developed and reliability and validity are reported. Fifty-seven children developing typically (CDT) and 34 children with plantar flexion weakness performed three tests: unilateral heel rise, vertical jump, and force measurement using handheld dynamometry. Intraclass correlation…

  3. Applied Chaos Level Test for Validation of Signal Conditions Underlying Optimal Performance of Voice Classification Methods.

    PubMed

    Liu, Boquan; Polce, Evan; Sprott, Julien C; Jiang, Jack J

    2018-05-17

    The purpose of this study is to introduce a chaos level test to evaluate linear and nonlinear voice type classification method performances under varying signal chaos conditions without subjective impression. Voice signals were constructed with differing degrees of noise to model signal chaos. Within each noise power, 100 Monte Carlo experiments were applied to analyze the output of jitter, shimmer, correlation dimension, and spectrum convergence ratio. The computational output of the 4 classifiers was then plotted against signal chaos level to investigate the performance of these acoustic analysis methods under varying degrees of signal chaos. A diffusive behavior detection-based chaos level test was used to investigate the performances of different voice classification methods. Voice signals were constructed by varying the signal-to-noise ratio to establish differing signal chaos conditions. Chaos level increased sigmoidally with increasing noise power. Jitter and shimmer performed optimally when the chaos level was less than or equal to 0.01, whereas correlation dimension was capable of analyzing signals with chaos levels of less than or equal to 0.0179. Spectrum convergence ratio demonstrated proficiency in analyzing voice signals with all chaos levels investigated in this study. The results of this study corroborate the performance relationships observed in previous studies and, therefore, demonstrate the validity of the validation test method. The presented chaos level validation test could be broadly utilized to evaluate acoustic analysis methods and establish the most appropriate methodology for objective voice analysis in clinical practice.

  4. Initial validation of a web-based self-administered neuropsychological test battery for older adults and seniors.

    PubMed

    Hansen, Tor Ivar; Haferstrom, Elise Christina D; Brunner, Jan F; Lehn, Hanne; Håberg, Asta Kristine

    2015-01-01

    Computerized neuropsychological tests are effective in assessing different cognitive domains, but are often limited by the need of proprietary hardware and technical staff. Web-based tests can be more accessible and flexible. We aimed to investigate validity, effects of computer familiarity, education, and age, and the feasibility of a new web-based self-administered neuropsychological test battery (Memoro) in older adults and seniors. A total of 62 (37 female) participants (mean age 60.7 years) completed the Memoro web-based neuropsychological test battery and a traditional battery composed of similar tests intended to measure the same cognitive constructs. Participants were assessed on computer familiarity and how they experienced the two batteries. To properly test the factor structure of Memoro, an additional factor analysis in 218 individuals from the HUNT population was performed. Comparing Memoro to traditional tests, we observed good concurrent validity (r = .49-.63). The performance on the traditional and Memoro test battery was consistent, but differences in raw scores were observed with higher scores on verbal memory and lower in spatial memory in Memoro. Factor analysis indicated two factors: verbal and spatial memory. There were no correlations between test performance and computer familiarity after adjustment for age or age and education. Subjects reported that they preferred web-based testing as it allowed them to set their own pace, and they did not feel scrutinized by an administrator. Memoro showed good concurrent validity compared to neuropsychological tests measuring similar cognitive constructs. Based on the current results, Memoro appears to be a tool that can be used to assess cognitive function in older and senior adults. Further work is necessary to ascertain its validity and reliability.

  5. Initial validation of a web-based self-administered neuropsychological test battery for older adults and seniors

    PubMed Central

    Hansen, Tor Ivar; Haferstrom, Elise Christina D.; Brunner, Jan F.; Lehn, Hanne; Håberg, Asta Kristine

    2015-01-01

    Introduction: Computerized neuropsychological tests are effective in assessing different cognitive domains, but are often limited by the need of proprietary hardware and technical staff. Web-based tests can be more accessible and flexible. We aimed to investigate validity, effects of computer familiarity, education, and age, and the feasibility of a new web-based self-administered neuropsychological test battery (Memoro) in older adults and seniors. Method: A total of 62 (37 female) participants (mean age 60.7 years) completed the Memoro web-based neuropsychological test battery and a traditional battery composed of similar tests intended to measure the same cognitive constructs. Participants were assessed on computer familiarity and how they experienced the two batteries. To properly test the factor structure of Memoro, an additional factor analysis in 218 individuals from the HUNT population was performed. Results: Comparing Memoro to traditional tests, we observed good concurrent validity (r = .49–.63). The performance on the traditional and Memoro test battery was consistent, but differences in raw scores were observed with higher scores on verbal memory and lower in spatial memory in Memoro. Factor analysis indicated two factors: verbal and spatial memory. There were no correlations between test performance and computer familiarity after adjustment for age or age and education. Subjects reported that they preferred web-based testing as it allowed them to set their own pace, and they did not feel scrutinized by an administrator. Conclusions: Memoro showed good concurrent validity compared to neuropsychological tests measuring similar cognitive constructs. Based on the current results, Memoro appears to be a tool that can be used to assess cognitive function in older and senior adults. Further work is necessary to ascertain its validity and reliability. PMID:26009791

  6. Validation of an Alzheimer’s disease assessment battery in Asian participants with mild to moderate Alzheimer’s disease

    PubMed Central

    Shen, Joan HQ; Shen, Qi; Yu, Holly; Lai, Jin-Shei; Beaumont, Jennifer L; Zhang, Zhenxin; Wang, Huali; Kim, Seong Yoon; Chen, Christopher; Kwok, Timothy; Wang, Shuu-Jiun; Lee, Dong Young; Harrison, John; Cummings, Jeffrey

    2014-01-01

    There is a lack of validated tools for assessing Alzheimer’s disease (AD) across Asia. This study evaluates the psychometric properties of the Alzheimer’s Disease Assessment Scale-Cognitive Subscale (ADAS-Cog), Disability Assessment for Dementia (DAD), and Neuropsychological Test Battery (NTB) in Asian participants. Participants with mild to moderate AD (n=251) and healthy controls (n=51) from Mainland China, Taiwan, Singapore, Hong Kong, and South Korea completed selected instruments at several time points. Test-retest reliability was better than 0.70 for all tests. AD participants performed significantly more poorly than controls on every score. Within the AD group, greater disease severity corresponded to significantly poorer performance. The AD group test performance worsened over time and there was a trend for worse performance in AD compared to healthy controls over time. The ADAS-Cog, DAD, and NTB are reliable, valid, and responsive measures in this population and could be used for clinical trials across Asian countries/regions. PMID:25628967

  7. Exploring rationality in schizophrenia.

    PubMed

    Revsbech, Rasmus; Mortensen, Erik Lykke; Owen, Gareth; Nordgaard, Julie; Jansson, Lennart; Sæbye, Ditte; Flensborg-Madsen, Trine; Parnas, Josef

    2015-06-01

    Background: Empirical studies of rationality (syllogisms) in patients with schizophrenia have obtained different results. One study found that patients reason more logically if the syllogism is presented through an unusual content. Aims: To explore syllogism-based rationality in schizophrenia. Method: Thirty-eight first-admitted patients with schizophrenia and 38 healthy controls solved 29 syllogisms that varied in presentation content (ordinary v. unusual) and validity (valid v. invalid). Statistical tests were made of unadjusted and adjusted group differences in models adjusting for intelligence and neuropsychological test performance. Results: Controls outperformed patients on all syllogism types, but the difference between the two groups was only significant for valid syllogisms presented with unusual content. However, when adjusting for intelligence and neuropsychological test performance, all group differences became non-significant. Conclusions: When taking intelligence and neuropsychological performance into account, patients with schizophrenia and controls perform similarly on syllogism tests of rationality. Declaration of interest: None. © The Royal College of Psychiatrists 2015. This is an open access article distributed under the terms of the Creative Commons Non-Commercial, No Derivatives (CC BY-NC-ND) licence.

  8. Analytic Validation of Immunohistochemical Assays: A Comparison of Laboratory Practices Before and After Introduction of an Evidence-Based Guideline.

    PubMed

    Fitzgibbons, Patrick L; Goldsmith, Jeffrey D; Souers, Rhona J; Fatheree, Lisa A; Volmar, Keith E; Stuart, Lauren N; Nowak, Jan A; Astles, J Rex; Nakhleh, Raouf E

    2017-09-01

    Context: Laboratories must demonstrate analytic validity before any test can be used clinically, but studies have shown inconsistent practices in immunohistochemical assay validation. Objective: To assess changes in immunohistochemistry analytic validation practices after publication of an evidence-based laboratory practice guideline. Design: A survey on current immunohistochemistry assay validation practices and on the awareness and adoption of a recently published guideline was sent to subscribers enrolled in one of 3 relevant College of American Pathologists proficiency testing programs and to additional nonsubscribing laboratories that perform immunohistochemical testing. The results were compared with an earlier survey of validation practices. Results: Analysis was based on responses from 1085 laboratories that perform immunohistochemical staining. Of 1057 responses, 65.4% (691) were aware of the guideline recommendations before this survey was sent and 79.9% (550 of 688) of those have already adopted some or all of the recommendations. Compared with the 2010 survey, a significant number of laboratories now have written validation procedures for both predictive and nonpredictive marker assays and specifications for the minimum numbers of cases needed for validation. There was also significant improvement in compliance with validation requirements, with 99% (100 of 102) having validated their most recently introduced predictive marker assay, compared with 74.9% (326 of 435) in 2010. The difficulty in finding validation cases for rare antigens and resource limitations were cited as the biggest challenges in implementing the guideline. Conclusions: Dissemination of the 2014 evidence-based guideline validation practices had a positive impact on laboratory performance; some or all of the recommendations have been adopted by nearly 80% of respondents.

  9. Reliability and validity of generalizable skills instruments for students who are deaf, blind, or visually impaired.

    PubMed

    Loeding, B L; Greenan, J P

    1998-12-01

    The study examined the validity and reliability of four assessments, with three instruments per domain. Domains included generalizable mathematics, communication, interpersonal relations, and reasoning skills. Participants were deaf, legally blind, or visually impaired students enrolled in vocational classes at residential secondary schools. The researchers estimated the internal consistency reliability, test-retest reliability, and construct validity correlations of three subinstruments: student self-ratings, teacher ratings, and performance assessments. The data suggest that these instruments are highly internally consistent measures of generalizable vocational skills. Four performance assessments have high-to-moderate test-retest reliability estimates, and were generally considered to possess acceptable validity and reliability.

  10. Fundamentals of endoscopic surgery: creation and validation of the hands-on test.

    PubMed

    Vassiliou, Melina C; Dunkin, Brian J; Fried, Gerald M; Mellinger, John D; Trus, Thadeus; Kaneva, Pepa; Lyons, Calvin; Korndorffer, James R; Ujiki, Michael; Velanovich, Vic; Kochman, Michael L; Tsuda, Shawn; Martinez, Jose; Scott, Daniel J; Korus, Gary; Park, Adrian; Marks, Jeffrey M

    2014-03-01

    The Fundamentals of Endoscopic Surgery™ (FES) program consists of online materials and didactic and skills-based tests. All components were designed to measure the skills and knowledge required to perform safe flexible endoscopy. The purpose of this multicenter study was to evaluate the reliability and validity of the hands-on component of the FES examination, and to establish the pass score. Expert endoscopists identified the critical skill set required for flexible endoscopy. These skills were then modeled in a virtual reality simulator (GI Mentor™ II, Simbionix™ Ltd., Airport City, Israel) to create five tasks and metrics. Scores were designed to measure both speed and precision. Validity evidence was assessed by correlating performance with self-reported endoscopic experience (surgeons and gastroenterologists [GIs]). Internal consistency of each test task was assessed using Cronbach's alpha. Test-retest reliability was determined by having the same participant perform the test a second time and comparing their scores. Passing scores were determined by a contrasting groups methodology and use of receiver operating characteristic curves. A total of 160 participants (17% GIs) performed the simulator test. Scores on the five tasks showed good internal consistency reliability and all had significant correlations with endoscopic experience. Total FES scores correlated 0.73 with participants' level of endoscopic experience, providing evidence of their validity, and their internal consistency reliability (Cronbach's alpha) was 0.82. Test-retest reliability was assessed in 11 participants, and the intraclass correlation was 0.85. The passing score was determined and is estimated to have a sensitivity (true positive rate) of 0.81 and a 1-specificity (false positive rate) of 0.21. The FES hands-on skills test examines the basic procedural components required to perform safe flexible endoscopy. It meets rigorous standards of reliability and validity required for high-stakes examinations, and, together with the knowledge component, may help contribute to the definition and determination of competence in endoscopy.

  11. Reliability and validity of the Assessment of Daily Activity Performance (ADAP) in community-dwelling older women.

    PubMed

    de Vreede, Paul L; Samson, Monique M; van Meeteren, Nico L; Duursma, Sijmen A; Verhaar, Harald J

    2006-08-01

    The Assessment of Daily Activity Performance (ADAP) test was developed, and modeled after the Continuous-scale Physical Functional Performance (CS-PFP) test, to provide a quantitative assessment of older adults' physical functional performance. The aim of this study was to determine the intra-examiner reliability and construct validity of the ADAP in a community-living older population, and to identify the importance of tester experience. Forty-three community-dwelling, older women (mean age 75 yr +/-4.3) were randomized to the test-retest reliability study (n=19) or validation study (n=24). The intra-examiner reliability of an experienced (tester 1) and an inexperienced tester (tester 2) was assessed by comparing test and retest scores of 19 participants. Construct validity was assessed by comparing the ADAP scores of 24 participants with self-perceived function by the SF-36 Health Survey, muscle function tests, and the Timed Up and Go test (TUG). Tester 1 had good consistency and reliability scores (mean difference between test and retest scores (DIF), -1.05+/-1.99; 95% confidence interval (CI), -2.58 to 0.48; Cronbach's alpha (alpha) range, 0.83 to 0.98; intraclass correlation (ICC) range, 0.75 to 0.96; Limits of Agreement (LoA), -2.58 to 4.95). Tester 2 had lower reliability scores (DIF, -2.45+/-4.36; 95% CI, -5.56 to 0.67; alpha range, 0.53 to 0.94; ICC range, 0.36 to 0.90; LoA, -6.09 to 10.99), with a systematic difference between test and retest scores for the ADAP domain lower-body strength (-3.81; 95% CI, -6.09 to -1.54). ADAP correlated with SF-36 Physical Functioning scale (r=0.67), TUG test (r=-0.91) and with isometric knee extensor strength (r=0.80). The ADAP test is a reliable and valid instrument. Our results suggest that testers should practise using the test, to improve reliability, before applying it to clinical settings.
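
The record reports mean test-retest differences and Limits of Agreement without describing the calculation. The sketch below shows one conventional (Bland-Altman style) approach, assuming limits of mean difference ± 1.96 standard deviations of the paired differences; the study may have used a different variant. The function name and scores are hypothetical.

```python
from statistics import mean, stdev


def limits_of_agreement(test, retest):
    """Return (mean difference, lower limit, upper limit) for paired scores.

    Assumed convention: limits = mean difference +/- 1.96 * SD of differences.
    """
    if len(test) != len(retest) or len(test) < 2:
        raise ValueError("need two equal-length lists with at least 2 pairs")
    diffs = [a - b for a, b in zip(test, retest)]
    d = mean(diffs)
    s = stdev(diffs)  # sample standard deviation of the differences
    return d, d - 1.96 * s, d + 1.96 * s


# Hypothetical test and retest scores for 6 participants (illustrative only).
first = [52.0, 47.5, 60.0, 55.0, 49.0, 58.5]
second = [53.5, 48.0, 59.0, 57.0, 50.5, 58.0]
dif, lower, upper = limits_of_agreement(first, second)
print(f"DIF = {dif:.2f}, LoA = {lower:.2f} to {upper:.2f}")
```

A mean difference far from zero suggests a systematic shift between sessions, while wide limits indicate poor agreement for individual participants.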

  12. Embedded performance validity tests within the Hopkins Verbal Learning Test - Revised and the Brief Visuospatial Memory Test - Revised.

    PubMed

    Sawyer, R John; Testa, S Marc; Dux, Moira

    2017-01-01

    Various research studies and neuropsychology practice organizations have reiterated the importance of developing embedded performance validity tests (PVTs) to detect potentially invalid neurocognitive test data. This study investigated whether measures within the Hopkins Verbal Learning Test - Revised (HVLT-R) and the Brief Visuospatial Memory Test - Revised (BVMT-R) could accurately classify individuals who fail two or more PVTs during routine clinical assessment. The present sample consisted of 109 United States military veterans (mean age = 52.4, SD = 13.3), all of whom were clinically referred patients who received a battery of neuropsychological tests. Based on performance validity findings, veterans were assigned to valid (n = 86) or invalid (n = 23) groups. Of the 109 patients in the overall sample, 77 were administered the HVLT-R and 75 were administered the BVMT-R, which were examined for classification accuracy. The HVLT-R Recognition Discrimination Index and the BVMT-R Retention Percentage showed good to adequate discrimination with an area under the curve of .78 and .70, respectively. The HVLT-R Recognition Discrimination Index showed sensitivity of .53 with specificity of .93. The BVMT-R Retention Percentage demonstrated sensitivity of .31 with specificity of .92. When used in conjunction with other PVTs, these new embedded PVTs may be effective in the detection of invalid test data, although they are not intended for use in patients with dementia.

  13. A Testing Platform for Validation of Overhead Conductor Aging Models and Understanding Thermal Limits

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Irminger, Philip; Starke, Michael R; Dimitrovski, Aleksandar D

    2014-01-01

    Power system equipment manufacturers and researchers continue to experiment with novel overhead electric conductor designs that support better conductor performance and address congestion issues. To address the technology gap in testing these novel designs, Oak Ridge National Laboratory constructed the Powerline Conductor Accelerated Testing (PCAT) facility to evaluate the performance of novel overhead conductors in an accelerated fashion in a field environment. Additionally, PCAT has the capability to test advanced sensors and measurement methods for assessing overhead conductor performance and condition. Equipped with extensive measurement and monitoring devices, PCAT provides a platform to improve/validate conductor computer models and assess the performance of novel conductors. The PCAT facility and its testing capabilities are described in this paper.

  14. The validity of ACT-PEP test scores for predicting academic performance of registered nurses in BSN programs.

    PubMed

    Yang, J C; Noble, J

    1990-01-01

    This study investigated the validity of three American College Testing-Proficiency Examination Program (ACT-PEP) tests (Maternal and Child Nursing, Psychiatric/Mental Health Nursing, Adult Nursing) for predicting the academic performance of registered nurses (RNs) enrolled in bachelor's degree BSN programs nationwide. This study also examined RN students' performance on the ACT-PEP tests by their demographic characteristics: student's age, sex, race, student status (full- or part-time), and employment status (full- or part-time). The total sample for the three tests comprised 2,600 students from eight institutions nationwide. The median correlation coefficients between the three ACT-PEP tests and the semester grade point averages ranged from .36 to .56. Median correlation coefficients increased over time, supporting the stability of ACT-PEP test scores for predicting academic performance over time. The relative importance of selected independent variables for predicting academic performance was also examined; the most important variable for predicting academic performance was typically the ACT-PEP test score. Across the institutions, student demographic characteristics did not contribute significantly to explaining academic performance, over and above ACT-PEP scores.

  15. The Validity of Value-Added Estimates from Low-Stakes Testing Contexts: The Impact of Change in Test-Taking Motivation and Test Consequences

    ERIC Educational Resources Information Center

    Finney, Sara J.; Sundre, Donna L.; Swain, Matthew S.; Williams, Laura M.

    2016-01-01

    Accountability mandates often prompt assessment of student learning gains (e.g., value-added estimates) via achievement tests. The validity of these estimates has been questioned when performance on tests is low stakes for students. To assess the effects of motivation on value-added estimates, we assigned students to one of three test consequence…

  16. Validity and reliability of the Short Physical Performance Battery (SPPB): a pilot study on mobility in the Colombian Andes.

    PubMed

    Gómez, José Fernando; Curcio, Carmen-Lucía; Alvarado, Beatriz; Zunzunegui, María Victoria; Guralnik, Jack

    2013-07-01

    To assess the validity (convergent and construct) and reliability of the Short Physical Performance Battery (SPPB) among non-disabled adults between 65 to 74 years of age residing in the Andes Mountains of Colombia. Design Validation study; 150 subjects aged 65 to 74 years recruited from elderly associations (day-centers) in Manizales, Colombia. The SPPB tests of balance, including time to walk 4 meters and time required to stand from a chair 5 times were administered to all participants. Reliability was analyzed with a 7-day interval between assessments and use of repeated ANOVA testing. Construct validity was assessed using factor analysis and by testing the relationship between SPPB and depressive symptoms, cognitive function, and self rated health (SRH), while the concurrent validity was measured through relationships with mobility limitations and disability in Activities of Daily Living (ADL). ANOVA tests were used to establish these associations. Test-retest reliability of the SPPB was high: 0.87 (CI95%: 0.77-0.96). A one factor solution was found with three SPPB tests. SPPB was related to self-rated health, limitations in walking and climbing steps and to indicators of disability, as well as to cognitive function and depression. There was a graded decrease in the mean SPPB score with increasing disability and poor health. The Spanish version of SPPB is reliable and valid to assess physical performance among older adults from our region. Future studies should establish their clinical applications and explore usage in population studies.

  17. Validity and reliability of the Short Physical Performance Battery (SPPB)

    PubMed Central

    Curcio, Carmen-Lucía; Alvarado, Beatriz; Zunzunegui, María Victoria; Guralnik, Jack

    2013-01-01

    Objectives: To assess the validity (convergent and construct) and reliability of the Short Physical Performance Battery (SPPB) among non-disabled adults between 65 to 74 years of age residing in the Andes Mountains of Colombia. Methods: Design Validation study; Participants: 150 subjects aged 65 to 74 years recruited from elderly associations (day-centers) in Manizales, Colombia. Measurements: The SPPB tests of balance, including time to walk 4 meters and time required to stand from a chair 5 times were administered to all participants. Reliability was analyzed with a 7-day interval between assessments and use of repeated ANOVA testing. Construct validity was assessed using factor analysis and by testing the relationship between SPPB and depressive symptoms, cognitive function, and self rated health (SRH), while the concurrent validity was measured through relationships with mobility limitations and disability in Activities of Daily Living (ADL). ANOVA tests were used to establish these associations. Results: Test-retest reliability of the SPPB was high: 0.87 (CI95%: 0.77-0.96). A one factor solution was found with three SPPB tests. SPPB was related to self-rated health, limitations in walking and climbing steps and to indicators of disability, as well as to cognitive function and depression. There was a graded decrease in the mean SPPB score with increasing disability and poor health. Conclusion: The Spanish version of SPPB is reliable and valid to assess physical performance among older adults from our region. Future studies should establish their clinical applications and explore usage in population studies. PMID:24892614

  18. Development, Testing, and Validation of a Model-Based Tool to Predict Operator Responses in Unexpected Workload Transitions

    NASA Technical Reports Server (NTRS)

    Sebok, Angelia; Wickens, Christopher; Sargent, Robert

    2015-01-01

    One human factors challenge is predicting operator performance in novel situations. Approaches such as drawing on relevant previous experience, and developing computational models to predict operator performance in complex situations, offer potential methods to address this challenge. A few concerns with modeling operator performance are that models need to be realistic, and they need to be tested empirically and validated. In addition, many existing human performance modeling tools are complex and require that an analyst gain significant experience to be able to develop models for meaningful data collection. This paper describes an effort to address these challenges by developing an easy to use model-based tool, using models that were developed from a review of existing human performance literature and targeted experimental studies, and performing an empirical validation of key model predictions.

  19. Reliability and validity of two isometric squat tests.

    PubMed

    Blazevich, Anthony J; Gill, Nicholas; Newton, Robert U

    2002-05-01

    The purpose of the present study was first to examine the reliability of isometric squat (IS) and isometric forward hack squat (IFHS) tests to determine if repeated measures on the same subjects yielded reliable results. The second purpose was to examine the relation between isometric and dynamic measures of strength to assess validity. Fourteen male subjects performed maximal IS and IFHS tests on 2 occasions and 1 repetition maximum (1-RM) free-weight squat and forward hack squat (FHS) tests on 1 occasion. The 2 tests were found to be highly reliable (intraclass correlation coefficient [ICC](IS) = 0.97 and ICC(IFHS) = 1.00). There was a strong relation between average IS and 1-RM squat performance, and between IFHS and 1-RM FHS performance (r(squat) = 0.77, r(FHS) = 0.76; p < 0.01), but a weak relation between squat and FHS test performances (r < 0.55). There was also no difference between observed 1-RM values and those predicted by our regression equations. Errors in predicting 1-RM performance were in the order of 8.5% (standard error of the estimate [SEE] = 13.8 kg) and 7.3% (SEE = 19.4 kg) for IS and IFHS respectively. Correlations between isometric and 1-RM tests were not of sufficient size to indicate high validity of the isometric tests. Together the results suggest that IS and IFHS tests could detect small differences in multijoint isometric strength between subjects, or performance changes over time, and that the scores in the isometric tests are well related to 1-RM performance. However, there was a small error when predicting 1-RM performance from isometric performance, and these tests have not been shown to discriminate between small changes in dynamic strength. The weak relation between squat and FHS test performance can be attributed to differences in the movement patterns of the tests.

  20. Physical examination tests of the shoulder: a systematic review and meta-analysis of diagnostic test performance.

    PubMed

    Gismervik, Sigmund Ø; Drogset, Jon O; Granviken, Fredrik; Rø, Magne; Leivseth, Gunnar

    2017-01-25

    Physical examination tests of the shoulder (PETS) are clinical examination maneuvers designed to aid the assessment of shoulder complaints. Despite more than 180 PETS described in the literature, evidence of their validity and usefulness in diagnosing the shoulder is questioned. This meta-analysis aims to use diagnostic odds ratio (DOR) to evaluate how much PETS shift overall probability and to rank the test performance of single PETS in order to aid the clinician's choice of which tests to use. This study adheres to the principles outlined in the Cochrane guidelines and the PRISMA statement. A fixed effect model was used to assess the overall diagnostic validity of PETS by pooling DOR for different PETS with similar biomechanical rationale when possible. Single PETS were assessed and ranked by DOR. Clinical performance was assessed by sensitivity, specificity, accuracy and likelihood ratio. Six thousand nine-hundred abstracts and 202 full-text articles were assessed for eligibility; 20 articles were eligible and data from 11 articles could be included in the meta-analysis. All PETS for SLAP (superior labral anterior posterior) lesions pooled gave a DOR of 1.38 [1.13, 1.69]. The Supraspinatus test for any full thickness rotator cuff tear obtained the highest DOR of 9.24 (sensitivity was 0.74, specificity 0.77). Compression-Rotation test obtained the highest DOR (6.36) among single PETS for SLAP lesions (sensitivity 0.43, specificity 0.89) and Hawkins test obtained the highest DOR (2.86) for impingement syndrome (sensitivity 0.58, specificity 0.67). No single PETS showed superior clinical test performance. The clinical performance of single PETS is limited. However, when the different PETS for SLAP lesions were pooled, we found a statistically significant change in post-test probability indicating an overall statistical validity. We suggest that clinicians choose their PETS among those with the highest pooled DOR and, to assess validity in their own specific clinical settings, review the inclusion criteria of the included primary studies. We further propose that future studies on the validity of PETS use randomized research designs rather than the accuracy design relying less on well-established gold standard reference tests and efficient treatment options.
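
    As an illustration of the diagnostic odds ratio used in the abstract above, the following minimal sketch computes DOR from a single sensitivity/specificity pair using the standard definition DOR = LR+ / LR-. The function name and the dictionary layout are assumptions made for this example, not part of the study. Because the abstract reports rounded sensitivity and specificity, and the study's DORs come from a meta-analytic model, the recomputed values are only expected to be close to the reported ones, not identical.

```python
def diagnostic_odds_ratio(sensitivity, specificity):
    """Standard DOR: positive likelihood ratio divided by negative likelihood ratio."""
    lr_pos = sensitivity / (1.0 - specificity)   # LR+ = sens / (1 - spec)
    lr_neg = (1.0 - sensitivity) / specificity   # LR- = (1 - sens) / spec
    return lr_pos / lr_neg


# (sensitivity, specificity, DOR as reported in the abstract)
reported = {
    "Supraspinatus (full-thickness cuff tear)": (0.74, 0.77, 9.24),
    "Compression-Rotation (SLAP lesion)": (0.43, 0.89, 6.36),
    "Hawkins (impingement syndrome)": (0.58, 0.67, 2.86),
}

for name, (sens, spec, dor_reported) in reported.items():
    dor = diagnostic_odds_ratio(sens, spec)
    print(f"{name}: recomputed DOR = {dor:.2f}, reported DOR = {dor_reported}")
```

    A DOR of 1 means the test does not discriminate at all, which is why the pooled DOR of 1.38 for SLAP lesions, although statistically different from 1, corresponds to only a small shift in post-test probability.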

  1. Reliability and criterion-related validity testing (construct) of the Endotracheal Suction Assessment Tool (ESAT©).

    PubMed

    Davies, Kylie; Bulsara, Max K; Ramelet, Anne-Sylvie; Monterosso, Leanne

    2018-05-01

    To establish criterion-related construct validity and test-retest reliability for the Endotracheal Suction Assessment Tool© (ESAT©). Endotracheal tube suction performed in children can significantly affect clinical stability. Previously identified clinical indicators for endotracheal tube suction were used as criteria when designing the ESAT©. Content validity was reported previously. The final stages of psychometric testing are presented. Observational testing was used to measure construct validity and determine whether the ESAT© could guide "inexperienced" paediatric intensive care nurses' decision-making regarding endotracheal tube suction. Test-retest reliability of the ESAT© was performed at two time points. The researchers and paediatric intensive care nurse "experts" developed 10 hypothetical clinical scenarios with predetermined endotracheal tube suction outcomes. "Experienced" (n = 12) and "inexperienced" (n = 14) paediatric intensive care nurses were presented with the scenarios and the ESAT© guiding decision-making about whether to perform endotracheal tube suction for each scenario. Outcomes were compared with those predetermined by the "experts" (n = 9). Test-retest reliability of the ESAT© was measured at two consecutive time points (4 weeks apart) with "experienced" and "inexperienced" paediatric intensive care nurses using the same scenarios and tool to guide decision-making. No differences were observed between endotracheal tube suction decisions made by "experts" (n = 9), "inexperienced" (n = 14) and "experienced" (n = 12) nurses confirming the tool's construct validity. No differences were observed between groups for endotracheal tube suction decisions at T1 and T2. Criterion-related construct validity and test-retest reliability of the ESAT© were demonstrated. Further testing is recommended to confirm reliability in the clinical setting with the "inexperienced" nurse to guide decision-making related to endotracheal tube suction. 
The ESAT© is the first validated tool to systematically guide endotracheal nursing practice for the "inexperienced" nurse. © 2018 John Wiley & Sons Ltd.

  2. Reproducibility, Reliability, and Validity of Fuchsin-Based Beads for the Evaluation of Masticatory Performance.

    PubMed

    Sánchez-Ayala, Alfonso; Farias-Neto, Arcelino; Vilanova, Larissa Soares Reis; Costa, Marina Abrantes; Paiva, Ana Clara Soares; Carreiro, Adriana da Fonte Porto; Mestriner-Junior, Wilson

    2016-08-01

    Rehabilitation of masticatory function is inherent to prosthodontics; however, despite the various techniques for evaluating oral comminution, the methodological suitability of these has not been completely studied. The aim of this study was to determine the reproducibility, reliability, and validity of a test food based on fuchsin beads for masticatory function assessment. Masticatory performance was evaluated in 20 dentate subjects (mean age, 23.3 years) using two kinds of test foods and methods: fuchsin beads and ultraviolet-visible spectrophotometry, and silicone cubes and multiple sieving as gold standard. Three examiners conducted five masticatory performance trials with each test food. Reproducibility of the results from both test foods was separately assessed using the intraclass correlation coefficient (ICC). Reliability and validity of fuchsin bead data were measured by comparing the average mean of absolute differences and the measurement means, respectively, regarding silicone cube data using the paired Student's t-test (α = 0.05). Intraexaminer and interexaminer ICC for the fuchsin bead values were 0.65 and 0.76 (p < 0.001), respectively; those for the silicone cubes values were 0.93 and 0.91 (p < 0.001), respectively. Reliability revealed intraexaminer (p < 0.001) and interexaminer (p < 0.05) differences between the average means of absolute differences of each test food. Validity also showed differences between the measurement means of each test food (p < 0.001). Intra- and interexaminer reproducibility of the test food based on fuchsin beads for evaluation of masticatory performance were good and excellent, respectively; however, the reliability and validity were low, because fuchsin beads do not measure the grinding capacity of masticatory function as silicone cubes do; instead, this test food describes the crushing potential of teeth. Thus, the two kinds of test foods evaluate different properties of masticatory capacity, confirming fuchsin beads as a useful tool for this purpose. © 2015 by the American College of Prosthodontists.

  3. Effect of response format on cognitive reflection: Validating a two- and four-option multiple choice question version of the Cognitive Reflection Test.

    PubMed

    Sirota, Miroslav; Juanchich, Marie

    2018-03-27

    The Cognitive Reflection Test, measuring intuition inhibition and cognitive reflection, has become extremely popular because it reliably predicts reasoning performance, decision-making, and beliefs. Across studies, the response format of CRT items sometimes differs, based on the assumed construct equivalence of tests with open-ended versus multiple-choice items (the equivalence hypothesis). Evidence and theoretical reasons, however, suggest that the cognitive processes measured by these response formats and their associated performances might differ (the nonequivalence hypothesis). We tested the two hypotheses experimentally by assessing the performance in tests with different response formats and by comparing their predictive and construct validity. In a between-subjects experiment (n = 452), participants answered stem-equivalent CRT items in an open-ended, a two-option, or a four-option response format and then completed tasks on belief bias, denominator neglect, and paranormal beliefs (benchmark indicators of predictive validity), as well as on actively open-minded thinking and numeracy (benchmark indicators of construct validity). We found no significant differences between the three response formats in the numbers of correct responses, the numbers of intuitive responses (with the exception of the two-option version, which had a higher number than the other tests), and the correlational patterns of the indicators of predictive and construct validity. All three test versions were similarly reliable, but the multiple-choice formats were completed more quickly. We speculate that the specific nature of the CRT items helps build construct equivalence among the different response formats. We recommend using the validated multiple-choice version of the CRT presented here, particularly the four-option CRT, for practical and methodological reasons. Supplementary materials and data are available at https://osf.io/mzhyc/ .

  4. Performance Tested Method multiple laboratory validation study of ELISA-based assays for the detection of peanuts in food.

    PubMed

    Park, Douglas L; Coates, Scott; Brewer, Vickery A; Garber, Eric A E; Abouzied, Mohamed; Johnson, Kurt; Ritter, Bruce; McKenzie, Deborah

    2005-01-01

    Performance Tested Method multiple laboratory validations for the detection of peanut protein in 4 different food matrixes were conducted under the auspices of the AOAC Research Institute. In this blind study, 3 commercially available ELISA test kits were validated: Neogen Veratox for Peanut, R-Biopharm RIDASCREEN FAST Peanut, and Tepnel BioKits for Peanut Assay. The food matrixes used were breakfast cereal, cookies, ice cream, and milk chocolate spiked at 0 and 5 ppm peanut. Analyses of the samples were conducted by laboratories representing industry and international and U.S. governmental agencies. All 3 commercial test kits successfully identified spiked and peanut-free samples. The validation study required 60 analyses on test samples at the target level 5 microg peanut/g food and 60 analyses at a peanut-free level, which was designed to ensure that the lower 95% confidence limit for the sensitivity and specificity would not be <90%. The probability that a test sample contains an allergen given a prevalence rate of 5% and a positive test result using a single test kit analysis with 95% sensitivity and 95% specificity, which was demonstrated for these test kits, would be 50%. When 2 test kits are run simultaneously on all samples, the probability becomes 95%. It is therefore recommended that all field samples be analyzed with at least 2 of the validated kits.
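
    The 50% and 95% figures quoted in the abstract above follow from Bayes' rule. The sketch below reproduces them under two assumptions that the abstract does not state explicitly: that "the probability becomes 95%" refers to both kits returning a positive result, and that the kits' errors are conditionally independent given the true status of the sample. The function name and the `n_tests` parameter are hypothetical and chosen for this example.

```python
def positive_predictive_value(prevalence, sensitivity, specificity, n_tests=1):
    """P(allergen present | all n_tests kits positive), via Bayes' rule.

    Assumes the kits err independently given the sample's true status.
    """
    # Probability of a truly contaminated sample testing positive on every kit
    true_pos = prevalence * sensitivity ** n_tests
    # Probability of a peanut-free sample testing positive on every kit
    false_pos = (1.0 - prevalence) * (1.0 - specificity) ** n_tests
    return true_pos / (true_pos + false_pos)


one_kit = positive_predictive_value(0.05, 0.95, 0.95, n_tests=1)
two_kits = positive_predictive_value(0.05, 0.95, 0.95, n_tests=2)
print(f"One positive kit:  {one_kit:.2f}")
print(f"Two positive kits: {two_kits:.2f}")
```

    With a 5% prevalence, false positives from the 95% of peanut-free samples are as numerous as true positives from a single kit, which is why one positive result is only a coin flip; requiring agreement between two kits reduces the false-positive rate from 5% to 0.25%.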

  5. Two-colour chewing gum mixing ability test for evaluating masticatory performance in children with mixed dentition: validity and reliability study.

    PubMed

    Kaya, M S; Güçlü, B; Schimmel, M; Akyüz, S

    2017-11-01

    The unappealing taste of the chewing material and the time-consuming repetitive task in masticatory performance tests using artificial foodstuff may discourage children from performing natural chewing movements. Therefore, the aim was to determine the validity and reliability of a two-colour chewing gum mixing ability test for masticatory performance (MP) assessment in mixed dentition children. Masticatory performance was tested in two groups: systemically healthy fully dentate young adults and children in mixed dentition. Median particle size was assessed using a comminution test, and a two-colour chewing gum mixing ability test was applied for MP analysis. Validity was tested with Pearson correlation, and reliability was tested with intra-class correlation coefficient, Pearson correlation and Bland-Altman plots. Both comminution and two-colour chewing gum mixing ability tests revealed statistically significant MP differences between children (n = 25) and adults (n = 27, both P < 0·01). Pearson correlation between comminution and two-colour chewing gum mixing ability tests was positive and significant (r = 0·418, P = 0·002). Correlations for interobserver reliability and test-retest values were significant (r = 0·990, P = 0·0001 and r = 0·995, P = 0·0001). Although both methods could discriminate MP differences, the comminution test detected these differences generally in a wider range compared to two-colour chewing gum mixing ability test. However, considering the high reliability of the results, the two-colour chewing gum mixing ability test can be used to assess masticatory performance in children, especially at non-clinical settings. © 2017 John Wiley & Sons Ltd.

  6. Safety validation test equipment operation

    NASA Astrophysics Data System (ADS)

    Kurosaki, Tadaaki; Watanabe, Takashi

    1992-08-01

    An overview of the activities conducted on safety validation test equipment operation for materials used for NASA manned missions is presented. Safety validation tests, such as flammability, odor, offgassing, and so forth were conducted in accordance with NASA-NHB-8060.1C using test subjects common with those used by NASA, and the equipment used were qualified for their functions and performances in accordance with NASDA-CR-99124 'Safety Validation Test Qualification Procedures.' Test procedure systems were established by preparing 'Common Procedures for Safety Validation Test' as well as test procedures for flammability, offgassing, and odor tests. The test operation organization chaired by the General Manager of the Parts and Material Laboratory of NASDA (National Space Development Agency of Japan) was established, and the test leaders and operators in the organization were qualified in accordance with the specified procedures. One-hundred-one tests had been conducted so far by the Parts and Material Laboratory according to the request submitted by the manufacturers through the Space Station Group and the Safety and Product Assurance for Manned Systems Office.

  7. Clinical Functional Capacity Testing in Patients With Facioscapulohumeral Muscular Dystrophy: Construct Validity and Interrater Reliability of Antigravity Tests.

    PubMed

    Rijken, Noortje H; van Engelen, Baziel G; Weerdesteyn, Vivian; Geurts, Alexander C

    2015-12-01

    To evaluate the construct validity and interrater reliability of 4 simple antigravity tests in a small group of patients with facioscapulohumeral muscular dystrophy (FSHD). Case-control study. University medical center. Patients with various severity levels of FSHD (n=9) and healthy control subjects (n=10) were included (N=19). Not applicable. A 4-point ordinal scale was designed to grade performance on the following 4 antigravity tests: sit to stance, stance to sit, step up, and step down. In addition, the 6-minute walk test, 10-m walking test, Berg Balance Scale, and timed Up and Go test were administered as conventional tests. Construct validity was determined by linear regression analysis using the Clinical Severity Score (CSS) as the dependent variable. Interrater agreement was tested using a κ analysis. Patients with FSHD performed worse on all 4 antigravity tests compared with the controls. Stronger correlations were found within than between test categories (antigravity vs conventional). The antigravity tests revealed the highest explained variance with regard to the CSS (R(2)=.86, P=.014). Interrater agreement was generally good. The results of this exploratory study support the construct validity and interrater reliability of the proposed antigravity tests for the assessment of functional capacity in patients with FSHD taking into account the use of compensatory strategies. Future research should further validate these results in a larger sample of patients with FSHD. Copyright © 2015 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.

  8. Performance Testing of a Trace Contaminant Control Subassembly for the International Space Station

    NASA Technical Reports Server (NTRS)

    Perry, J. L.; Curtis, R. E.; Alexandre, K. L.; Ruggiero, L. L.; Shtessel, N.

    1998-01-01

    As part of the International Space Station (ISS) Trace Contaminant Control Subassembly (TCCS) development, a performance test has been conducted to provide reference data for flight verification analyses. This test, which used the U.S. Habitation Module (U.S. Hab) TCCS as the test article, was designed to add to the existing database on TCCS performance. Included in this database are results obtained during ISS development testing; testing of functionally similar TCCS prototype units; and bench scale testing of activated charcoal, oxidation catalyst, and granular lithium hydroxide (LiOH). The present database has served as the basis for the development and validation of a computerized TCCS process simulation model. This model serves as the primary means for verifying the ISS TCCS performance. In order to mitigate risk associated with this verification approach, the U.S. Hab TCCS performance test provides an additional set of data which serve to anchor both the process model and previously-obtained development test data to flight hardware performance. The following discussion provides relevant background followed by a summary of the test hardware, objectives, requirements, and facilities. Facility and test article performance during the test is summarized, test results are presented, and the TCCS's performance relative to past test experience is discussed. Performance predictions made with the TCCS process model are compared with the U.S. Hab TCCS test results to demonstrate its validation.

  9. Comprehension of Written Grammar Test: Reliability and Known-Groups Validity Study With Hearing and Deaf and Hard-of-Hearing Students.

    PubMed

    Cannon, Joanna E; Hubley, Anita M; Millhoff, Courtney; Mazlouman, Shahla

    2016-01-01

    The aim of the current study was to gather validation evidence for the Comprehension of Written Grammar (CWG; Easterbrooks, 2010) receptive test of 26 grammatical structures of English print for use with children who are deaf and hard of hearing (DHH). Reliability and validity data were collected for 98 participants (49 DHH and 49 hearing) in Grades 2-6. The objectives were to: (a) examine 4-week test-retest reliability data; and (b) provide evidence of known-groups validity by examining expected differences between the groups on the CWG vocabulary pretest and main test, as well as selected structures. Results indicated excellent test-retest reliability estimates for CWG test scores. DHH participants performed statistically significantly lower on the CWG vocabulary pretest and main test than the hearing participants. Significantly lower performance by DHH participants on most expected grammatical structures (e.g., basic sentence patterns, auxiliary "be" singular/plural forms, tense, comparatives, and complementation) also provided known groups evidence. Overall, the findings of this study showed strong evidence of the reliability of scores and known group-based validity of inferences made from the CWG. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  10. Testing Math or Testing Language? The Construct Validity of the KeyMath-Revised for Children With Intellectual Disability and Language Difficulties.

    PubMed

    Rhodes, Katherine T; Branum-Martin, Lee; Morris, Robin D; Romski, MaryAnn; Sevcik, Rose A

    2015-11-01

    Although it is often assumed that mathematics ability alone predicts mathematics test performance, linguistic demands may also predict achievement. This study examined the role of language in mathematics assessment performance for children with intellectual disability (ID) at less severe levels, on the KeyMath-Revised Inventory (KM-R) with a sample of 264 children, in grades 2-5. Using confirmatory factor analysis, the hypothesis that the KM-R would demonstrate discriminant validity with measures of language abilities in a two-factor model was compared to two plausible alternative models. Results indicated that KM-R did not have discriminant validity with measures of children's language abilities and was a multidimensional test of both mathematics and language abilities for this population of test users. Implications are considered for test development, interpretation, and intervention.

  11. Effects of Coaching on the Validity of the SAT: A Simulation Study.

    ERIC Educational Resources Information Center

    Baydar, Nazli

    The effects of student coaching in preparation for the College Board Scholastic Aptitude Test (SAT) on the predictive validity of this test for freshman year performance were studied using data on 1985 freshman year students from four colleges. After the validity of the SAT was estimated for each school, a given proportion of students was picked,…

  12. Flight Test 4 Preliminary Results: NASA Ames SSI

    NASA Technical Reports Server (NTRS)

    Isaacson, Doug; Gong, Chester; Reardon, Scott; Santiago, Confesor

    2016-01-01

    Realization of the expected proliferation of Unmanned Aircraft System (UAS) operations in the National Airspace System (NAS) depends on the development and validation of performance standards for UAS Detect and Avoid (DAA) Systems. The RTCA Special Committee 228 is charged with leading the development of draft Minimum Operational Performance Standards (MOPS) for UAS DAA Systems. NASA, as a participating member of RTCA SC-228, is committed to supporting the development and validation of draft requirements as well as the safety substantiation and end-to-end assessment of DAA system performance. The Unmanned Aircraft System (UAS) Integration into the National Airspace System (NAS) Project conducted a flight test program, referred to as Flight Test 4, at Armstrong Flight Research Center from April to June 2016. Part of the test flights were dedicated to the NASA Ames-developed Detect and Avoid (DAA) System referred to as JADEM (Java Architecture for DAA Extensibility and Modeling). The encounter scenarios, which involved NASA's Ikhana UAS and a manned intruder aircraft, were designed to collect data on DAA system performance in real-world conditions and uncertainties with four different surveillance sensor systems. Flight Test 4 has four objectives: (1) validate DAA requirements in stressing cases that drive MOPS requirements, including: high-speed cooperative intruder, low-speed non-cooperative intruder, high vertical closure rate encounter, and Mode C/S-only intruder (i.e., without ADS-B); (2) validate the TCAS/DAA alerting and guidance interoperability concept in the presence of realistic sensor, tracking and navigational errors and in multiple-intruder encounters against both cooperative and non-cooperative intruders; (3) validate Well Clear Recovery guidance in the presence of realistic sensor, tracking and navigational errors; and (4) validate DAA alerting and guidance requirements in the presence of realistic sensor, tracking and navigational errors.
The results will be presented at RTCA Special Committee 228 in support of final verification and validation of the DAA MOPS.

  13. Developing the Persian version of the homophone meaning generation test

    PubMed Central

    Ebrahimipour, Mona; Motamed, Mohammad Reza; Ashayeri, Hassan; Modarresi, Yahya; Kamali, Mohammad

    2016-01-01

    Background: Finding the right word is a necessity in communication, and its evaluation has always been a challenging clinical issue, suggesting the need for valid and reliable measurements. The Homophone Meaning Generation Test (HMGT) can measure the ability to switch between verbal concepts, which is required in word retrieval. The purpose of this study was to adapt and validate the Persian version of the HMGT. Methods: The first phase involved the adaptation of the HMGT to the Persian language. The second phase concerned the psychometric testing. The word-finding performance was assessed in 90 Persian-speaking healthy individuals (20-50 year old; 45 males and 45 females) through three naming tasks: Semantic Fluency, Phonemic Fluency, and Homophone Meaning Generation Test. The participants had no history of neurological or psychiatric diseases, alcohol abuse, severe depression, or history of speech, language, or learning problems. Results: The internal consistency coefficient was larger than 0.8 for all the items with a total Cronbach’s alpha of 0.80. Interrater and intrarater reliability were also excellent. The validity of all items was above 0.77, and the content validity index (0.99) was appropriate. The Persian HMGT had strong convergent validity with semantic and phonemic switching and adequate divergent validity with semantic and phonemic clustering. Conclusion: The Persian version of the Homophone Meaning Generation Test is an appropriate, valid, and reliable test to evaluate the ability to switch between verbal concepts in the assessment of word-finding performance. PMID:27390705

  14. Developing the Persian version of the homophone meaning generation test.

    PubMed

    Ebrahimipour, Mona; Motamed, Mohammad Reza; Ashayeri, Hassan; Modarresi, Yahya; Kamali, Mohammad

    2016-01-01

    Finding the right word is a necessity in communication, and its evaluation has always been a challenging clinical issue, suggesting the need for valid and reliable measurements. The Homophone Meaning Generation Test (HMGT) can measure the ability to switch between verbal concepts, which is required in word retrieval. The purpose of this study was to adapt and validate the Persian version of the HMGT. The first phase involved the adaptation of the HMGT to the Persian language. The second phase concerned the psychometric testing. The word-finding performance was assessed in 90 Persian-speaking healthy individuals (20-50 year old; 45 males and 45 females) through three naming tasks: Semantic Fluency, Phonemic Fluency, and Homophone Meaning Generation Test. The participants had no history of neurological or psychiatric diseases, alcohol abuse, severe depression, or history of speech, language, or learning problems. The internal consistency coefficient was larger than 0.8 for all the items with a total Cronbach's alpha of 0.80. Interrater and intrarater reliability were also excellent. The validity of all items was above 0.77, and the content validity index (0.99) was appropriate. The Persian HMGT had strong convergent validity with semantic and phonemic switching and adequate divergent validity with semantic and phonemic clustering. The Persian version of the Homophone Meaning Generation Test is an appropriate, valid, and reliable test to evaluate the ability to switch between verbal concepts in the assessment of word-finding performance.

  15. Preliminary Report on a National Cross-Validation of the Computerized Adaptive Screening Test (CAST).

    ERIC Educational Resources Information Center

    Knapp, Deirdre J.; Pliske, Rebecca M.

    A study was conducted to validate the Army's Computerized Adaptive Screening Test (CAST), using data from 2,240 applicants from 60 army recruiting stations across the nation. CAST is a computer-assisted adaptive test used to predict performance on the Armed Forces Qualification Test (AFQT). AFQT scores are computed by adding four subtest scores of…

  16. 78 FR 28733 - Medical Devices; General Hospital and Personal Use Monitoring Devices; Classification of the...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-05-16

    ... Toxicology Testing. Labeling (dose limits). Electromagnetic incompatibility........ Electromagnetic... analysis and nonclinical testing must validate electromagnetic compatibility performance, wireless... electromagnetic compatibility performance, wireless performance, and electrical safety; and (4) Labeling must...

  17. An Evaluation of Computerized Tests as Predictors of Job Performance: II. Differential Validity for Global and Job Element Criteria. Final Report.

    ERIC Educational Resources Information Center

    Cory, Charles H.

    This report presents data concerning the validity of a set of experimental computerized and paper-and-pencil tests for measures of on-job performance on global and job elements. It reports on the usefulness of 30 experimental and operational variables for predicting marks on 42 job elements and on a global criterion for Electrician's Mate,…

  18. Measuring verbal and non-verbal communication in aphasia: reliability, validity, and sensitivity to change of the Scenario Test.

    PubMed

    van der Meulen, Ineke; van de Sandt-Koenderman, W Mieke E; Duivenvoorden, Hugo J; Ribbers, Gerard M

    2010-01-01

    This study explores the psychometric qualities of the Scenario Test, a new test to assess daily-life communication in severe aphasia. The test is innovative in that it: (1) examines the effectiveness of verbal and non-verbal communication; and (2) assesses patients' communication in an interactive setting, with a supportive communication partner. To determine the reliability, validity, and sensitivity to change of the Scenario Test and discuss its clinical value. The Scenario Test was administered to 122 persons with aphasia after stroke and to 25 non-aphasic controls. Analyses were performed for the entire group of persons with aphasia, as well as for a subgroup of persons unable to communicate verbally (n = 43). Reliability (internal consistency, test-retest reliability, inter-judge, and intra-judge reliability) and validity (internal validity, convergent validity, known-groups validity) and sensitivity to change were examined using standard psychometric methods. The Scenario Test showed high levels of reliability. Internal consistency (Cronbach's alpha = 0.96; item-rest correlations = 0.58-0.82) and test-retest reliability (ICC = 0.98) were high. Agreement between judges in total scores was good, as indicated by the high inter- and intra-judge reliability (ICC = 0.86-1.00). Agreement in scores on the individual items was also good (square-weighted kappa values 0.61-0.92). The test demonstrated good levels of validity. A principal component analysis for categorical data identified two dimensions, interpreted as general communication and communicative creativity. Correlations with three other instruments measuring communication in aphasia, that is, Spontaneous Speech interview from the Aachen Aphasia Test (AAT), Amsterdam-Nijmegen Everyday Language Test (ANELT), and Communicative Effectiveness Index (CETI), were moderate to strong (0.50-0.85) suggesting good convergent validity. 
Group differences were observed between persons with aphasia and non-aphasic controls, as well as between persons with aphasia unable to use speech to convey information and those able to communicate verbally; this indicates good known-groups validity. The test was sensitive to changes in performance, measured over a period of 6 months. The data support the reliability and validity of the Scenario Test as an instrument for examining daily-life communication in aphasia. The test focuses on multimodal communication; its psychometric qualities enable future studies on the effect of Alternative and Augmentative Communication (AAC) training in aphasia.

  19. Advanced Concept Studies for Supersonic Commercial Transports Entering Service in the 2018 to 2020 Period

    NASA Technical Reports Server (NTRS)

    Morgenstern, John; Norstrud, Nicole; Sokhey, Jack; Martens, Steve; Alonso, Juan J.

    2013-01-01

    Lockheed Martin Aeronautics Company (LM), working in conjunction with General Electric Global Research (GE GR), Rolls-Royce Liberty Works (RRLW), and Stanford University, herein presents results from the "N+2 Supersonic Validations" contract's initial 22-month phase, addressing the NASA solicitation "Advanced Concept Studies for Supersonic Commercial Transports Entering Service in the 2018 to 2020 Period." This report version adds documentation of an additional three-month low-boom test task. The key technical objective of this effort was to validate integrated airframe and propulsion technologies and design methodologies. These capabilities aspired to produce a viable supersonic vehicle design with environmental and performance characteristics. Supersonic testing of both airframe and propulsion technologies (including LM3: 97-023 low boom testing and April-June nozzle acoustic testing) verified LM's supersonic low-boom design methodologies and both GE and RRLW's nozzle technologies for future implementation. The N+2 program is aligned with NASA's Supersonic Project and is focused on providing system-level solutions capable of overcoming the environmental and performance/efficiency barriers to practical supersonic flight. NASA proposed "Initial Environmental Targets and Performance Goals for Future Supersonic Civil Aircraft". The LM N+2 studies are built upon LM's prior N+3 100 passenger design studies. The LM N+2 program addresses low boom design and methodology validations with wind tunnel testing, performance and efficiency goals with system level analysis, and low noise validations with two nozzle (GE and RRLW) acoustic tests.

  20. Reliability and validity of functional performance tests in dancers with hip dysfunction.

    PubMed

    Kivlan, Benjamin R; Carcia, Christopher R; Clemente, F Richard; Phelps, Amy L; Martin, Robroy L

    2013-08-01

    Quasi-experimental, repeated measures. Functional performance tests that identify hip joint impairments and assess the effect of intervention have not been adequately described for dancers. The purpose of this study was to examine the reliability and validity of hop and balance tests among a group of dancers with musculoskeletal pain in the hip region. Nineteen female dancers (age: 18.90±1.11 years; height: 164.85±6.95 cm; weight: 60.37±8.29 kg) with unilateral hip pain were assessed utilizing the cross-over reach, medial triple hop, lateral triple hop, and cross-over hop tests on two occasions, 2 days apart. Test-retest reliability and comparisons between the involved and uninvolved side for each respective test were determined. Intra-class correlation coefficients for the functional performance tests ranged from 0.89-0.96. The cross-over reach test had a SEM of 2.79 cm and a MDC of 7.73 cm. The medial and lateral triple hop tests had SEM values of 7.51 cm and 8.17 cm, and MDC values of 20.81 cm and 22.62 cm, respectively. The SEM was 0.15 seconds and the MDC was 0.42 seconds for the cross-over hop test. Performance on the medial triple hop test was significantly less on the involved side (370.21±38.26 cm) compared to the uninvolved side (388.05±41.49 cm); t(18) = -4.33, p<0.01. The side-to-side comparisons of the cross-over reach test (involved mean=61.68±10.9 cm; uninvolved mean=61.69±8.63 cm); t(18) = -0.004, p=0.99, lateral triple hop test (involved mean=306.92±35.79 cm; uninvolved mean=310.68±24.49 cm); t(18) = -0.55, p=0.59, and cross-over hop test (involved mean=2.49±0.34 seconds; uninvolved mean= 2.61±0.42 seconds; t(18) = -1.84, p=0.08) were not statistically different between sides. The functional performance tests used in this study can be reliably performed on dancers with unilateral hip pain. The medial triple hop test was the only functional performance test with evidence of validity in side-to-side comparisons.
These results suggest that the medial triple hop test may be a reliable and valid functional performance test to assess impairments related to hip pain among dancers. 3b. Non-consecutive cohort study.
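
    The abstract reports SEM and MDC values but does not state how the MDC was derived. A minimal sketch, assuming the conventional 95% formula MDC = SEM × 1.96 × √2 (an assumption, not confirmed by the record), reproduces the reported figures to within rounding. The function name mdc95 is hypothetical.

```python
import math


def mdc95(sem: float) -> float:
    """Minimal detectable change at 95% confidence.

    Assumes the conventional formula MDC = SEM * 1.96 * sqrt(2);
    the abstract itself does not say which formula was used.
    """
    return sem * 1.96 * math.sqrt(2)


# SEM values and reported MDC values quoted from the abstract.
reported = {
    "cross-over reach (cm)": (2.79, 7.73),
    "medial triple hop (cm)": (7.51, 20.81),
    "lateral triple hop (cm)": (8.17, 22.62),
    "cross-over hop (s)": (0.15, 0.42),
}

for test_name, (sem, reported_mdc) in reported.items():
    print(f"{test_name}: computed {mdc95(sem):.2f}, reported {reported_mdc}")
```

    Small differences between computed and reported values would be expected if the authors worked from unrounded SEM values.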

  1. RELIABILITY AND VALIDITY OF FUNCTIONAL PERFORMANCE TESTS IN DANCERS WITH HIP DYSFUNCTION

    PubMed Central

    Carcia, Christopher R.; Clemente, F. Richard; Phelps, Amy L.; Martin, RobRoy L.

    2013-01-01

    Study Design: Quasi-experimental, repeated measures. Purpose/Background: Functional performance tests that identify hip joint impairments and assess the effect of intervention have not been adequately described for dancers. The purpose of this study was to examine the reliability and validity of hop and balance tests among a group of dancers with musculoskeletal pain in the hip region. Methods: Nineteen female dancers (age: 18.90±1.11 years; height: 164.85±6.95 cm; weight: 60.37±8.29 kg) with unilateral hip pain were assessed utilizing the cross-over reach, medial triple hop, lateral triple hop, and cross-over hop tests on two occasions, 2 days apart. Test-retest reliability and comparisons between the involved and uninvolved side for each respective test were determined. Results: Intra-class correlation coefficients for the functional performance tests ranged from 0.89-0.96. The cross-over reach test had a SEM of 2.79 cm and a MDC of 7.73 cm. The medial and lateral triple hop tests had SEM values of 7.51 cm and 8.17 cm, and MDC values of 20.81 cm and 22.62 cm, respectively. The SEM was 0.15 seconds and the MDC was 0.42 seconds for the cross-over hop test. Performance on the medial triple hop test was significantly less on the involved side (370.21±38.26 cm) compared to the uninvolved side (388.05±41.49 cm); t(18) = −4.33, p<0.01. The side-to-side comparisons of the cross-over reach test (involved mean=61.68±10.9 cm; uninvolved mean=61.69±8.63 cm); t(18) = −0.004, p=0.99, lateral triple hop test (involved mean=306.92±35.79 cm; uninvolved mean=310.68±24.49 cm); t(18) = −0.55, p=0.59, and cross-over hop test (involved mean=2.49±0.34 seconds; uninvolved mean= 2.61±0.42 seconds; t(18) = −1.84, p=0.08) were not statistically different between sides. Conclusion: The functional performance tests used in this study can be reliably performed on dancers with unilateral hip pain. 
The medial triple hop test was the only functional performance test with evidence of validity in side-to-side comparisons. These results suggest that the medial triple hop test may be a reliable and valid functional performance test to assess impairments related to hip pain among dancers. Level of Evidence: 3b. Non-consecutive cohort study PMID:24175123

  2. Performance of a Cartridge-Based Assay for Detection of Clinically Significant Human Papillomavirus (HPV) Infection: Lessons from VALGENT (Validation of HPV Genotyping Tests)

    PubMed Central

    Geraets, Daan; Cuzick, Jack; Cadman, Louise; Moore, Catherine; Vanden Broeck, Davy; Padalko, Elisaveta; Quint, Wim; Arbyn, Marc

    2016-01-01

    The Validation of Human Papillomavirus (HPV) Genotyping Tests (VALGENT) studies offer an opportunity to clinically validate HPV assays for use in primary screening for cervical cancer and also provide a framework for the comparison of analytical and type-specific performance. Through VALGENT, we assessed the performance of the cartridge-based Xpert HPV assay (Xpert HPV), which detects 14 high-risk (HR) types and resolves HPV16 and HPV18/45. Samples from women attending the United Kingdom cervical screening program enriched with cytologically abnormal samples were collated. All had been previously tested by a clinically validated standard comparator test (SCT), the GP5+/6+ enzyme immunoassay (EIA). The clinical sensitivity and specificity of the Xpert HPV for the detection of cervical intraepithelial neoplasia grade 2 or higher (CIN2+) and CIN3+ relative to those of the SCT were assessed as were the inter- and intralaboratory reproducibilities according to international criteria for test validation. Type concordance for HPV16 and HPV18/45 between the Xpert HPV and the SCT was also analyzed. The Xpert HPV detected 94% of CIN2+ and 98% of CIN3+ lesions among all screened women and 90% of CIN2+ and 96% of CIN3+ lesions in women 30 years and older. The specificity for CIN1 or less (≤CIN1) was 83% (95% confidence interval [CI], 80 to 85%) in all women and 88% (95% CI, 86 to 91%) in women 30 years and older. Inter- and intralaboratory agreements for the Xpert HPV were 98% and 97%, respectively. The kappa agreements for HPV16 and HPV18/45 between the clinically validated reference test (GP5+/6+ LMNX) and the Xpert HPV were 0.92 and 0.91, respectively. The clinical performance and reproducibility of the Xpert HPV are comparable to those of well-established HPV assays and fulfill the criteria for use in primary cervical cancer screening. PMID:27385707

  3. Reliability and Validity of the Inline Skating Skill Test

    PubMed Central

    Radman, Ivan; Ruzic, Lana; Padovan, Viktoria; Cigrovski, Vjekoslav; Podnar, Hrvoje

    2016-01-01

    This study aimed to examine the reliability and validity of the inline skating skill test. Based on previous skating experience forty-two skaters (26 female and 16 male) were randomized into two groups (competitive level vs. recreational level). They performed the test four times, with a recovery time of 45 minutes between sessions. Prior to testing, the participants rated their skating skill using a scale from 1 to 10. The protocol included performance time measurement through a course, combining different skating techniques. Trivial changes in performance time between the repeated sessions were determined in both competitive females/males and recreational females/males (-1.7% [95% CI: -5.8–2.6%] – 2.2% [95% CI: 0.0–4.5%]). In all four subgroups, the skill test had a low mean within-individual variation (1.6% [95% CI: 1.2–2.4%] – 2.7% [95% CI: 2.1–4.0%]) and high mean inter-session correlation (ICC = 0.97 [95% CI: 0.92–0.99] – 0.99 [95% CI: 0.98–1.00]). The comparison of detected typical errors and smallest worthwhile changes (calculated as standard deviations × 0.2) revealed that the skill test was able to track changes in skaters’ performances. Competitive-level skaters needed shorter time (24.4–26.4%, all p < 0.01) to complete the test in comparison to recreational-level skaters. Moreover, moderate correlation (ρ = 0.80–0.82; all p < 0.01) was observed between the participant’s self-rating and achieved performance times. In conclusion, the proposed test is a reliable and valid method to evaluate inline skating skills in amateur competitive and recreational level skaters. Further studies are needed to evaluate the reproducibility of this skill test in different populations including elite inline skaters. Key points Study evaluated the reliability and construct validity of a newly developed inline skating skill test. Evaluated test is a first protocol designed to assess specific inline skating skill. 
Two groups of amateur skaters with different skating proficiency repeated the skill test on four separate occasions. The results suggest that the evaluated test is reliable and valid for evaluating inline skating skill in amateur skaters. PMID:27803616

  4. Performance assessment instrument to assess the senior high students' psychomotor for the salt hydrolysis material

    NASA Astrophysics Data System (ADS)

    Nahadi, Firman, Harry; Yulina, Erlis

    2016-02-01

    The purpose of this study was to develop a performance assessment instrument for assessing the psychomotor competence of high school students on salt hydrolysis concepts. The design used in this study was Research & Development, which consists of three phases: development, testing, and application of the instrument. Subjects in this study were 93 high school students in class XI science. In the development phase, seven validators validated a 17-task instrument. In the test phase, 19 students were divided into three groups that performed the salt hydrolysis lab work at different times and were observed by six raters. The first, second, and third groups consisted of five, six, and eight students, respectively. In the application phase, two raters observed the performance of 74 students in the salt hydrolysis lab work on several occasions. The results showed that 16 of the 17 tasks of the performance assessment instrument can be stated to be valid, with CVR values of 1.00 and 0.714. The remaining task was not valid, with a CVR value of 0.429, below the critical value (0.622). In the test phase, the reliability values obtained for the instrument were 0.951 for the five-student group, 0.806 for the six-student group, and 0.743 for the eight-student group. In the interviews, teachers strongly agreed with the performance instrument developed. They stated that the instrument was feasible to use with a maximum of six students in a single observation.
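
    The three CVR values in this abstract are consistent with Lawshe's content validity ratio, CVR = (n_e - N/2) / (N/2), applied to a panel of N = 7 validators with 7, 6, and 5 of them endorsing a task. The abstract does not name the formula, so the sketch below is an assumption; the function name content_validity_ratio is hypothetical, and the critical value is simply the one quoted in the record.

```python
def content_validity_ratio(n_agree: int, n_panel: int) -> float:
    """Lawshe's content validity ratio: (n_agree - N/2) / (N/2)."""
    if n_panel <= 0 or not 0 <= n_agree <= n_panel:
        raise ValueError("need 0 <= n_agree <= n_panel and n_panel > 0")
    half = n_panel / 2
    return (n_agree - half) / half


N_VALIDATORS = 7        # panel size stated in the abstract
CRITICAL_VALUE = 0.622  # critical value quoted in the abstract

# The endorsement counts (7, 6, 5) are inferred from the reported CVRs,
# not stated in the abstract.
for n_agree in (7, 6, 5):
    cvr = content_validity_ratio(n_agree, N_VALIDATORS)
    verdict = "valid" if cvr >= CRITICAL_VALUE else "not valid"
    print(f"{n_agree}/{N_VALIDATORS} validators agree: CVR = {cvr:.3f} ({verdict})")
```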

  5. Reaction time as an indicator of insufficient effort: Development and validation of an embedded performance validity parameter.

    PubMed

    Stevens, Andreas; Bahlo, Simone; Licha, Christina; Liske, Benjamin; Vossler-Thies, Elisabeth

    2016-11-30

    Subnormal performance in attention tasks may result from various sources including lack of effort. In this report, the derivation and validation of a performance validity parameter for reaction time is described, using a set of malingering-indices ("Slick-criteria"), and 3 independent samples of participants (total n = 893). The Slick-criteria yield an estimate of the probability of malingering based on the presence of an external incentive, evidence from neuropsychological testing, from self-report and clinical data. In study (1) a validity parameter is derived using reaction time data of a sample, composed of inpatients with recent severe brain lesions not involved in litigation and of litigants with and without brain lesion. In study (2) the validity parameter is tested in an independent sample of litigants. In study (3) the parameter is applied to an independent sample comprising cooperative and non-cooperative testees. Logistic regression analysis led to a derived validity parameter based on median reaction time and standard deviation. It performed satisfactorily in studies (2) and (3) (study 2 sensitivity=0.94, specificity=1.00; study 3 sensitivity=0.79, specificity=0.87). The findings suggest that median reaction time and standard deviation may be used as indicators of negative response bias. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  6. Development and validation of a high-fidelity phonomicrosurgical trainer.

    PubMed

    Klein, Adam M; Gross, Jennifer

    2017-04-01

    To validate the use of a high-fidelity phonomicrosurgical trainer. A high-fidelity phonomicrosurgical trainer, based on a previously validated model by Contag et al., was designed with multilayered vocal folds that more closely mimic the consistency of true vocal folds, containing intracordal lesions to practice phonomicrosurgical removal. A training module was developed to simulate the true phonomicrosurgical experience. A validation study with novice and expert surgeons was conducted. Novices and experts were instructed to remove the lesion from the synthetic vocal folds, and novices were given four training trials. Performances were measured by the amount of time spent and tissue injury (microflap, superficial, deep) to the vocal fold. An independent Student t test and Fisher exact tests were used to compare subjects. A matched-paired t test and Wilcoxon signed rank tests were used to compare novice performance on the first and fourth trials and assess for improvement. Experts completed the excision with fewer total errors than novices (P = .004) and caused less injury to the microflap (P = .05) and superficial tissue (P = .003). Novices improved their performance with training, making fewer total errors (P = .002) and superficial tissue injuries (P = .02) and spending less time on removal (P = .002) after several practice trials. This high-fidelity phonomicrosurgical trainer has been validated for novice surgeons. It can distinguish between experts and novices; and after training, it helped to improve novice performance. N/A. Laryngoscope, 127:888-893, 2017. © 2016 The American Laryngological, Rhinological and Otological Society, Inc.

  7. Procedures for Constructing and Using Criterion-Referenced Performance Tests.

    ERIC Educational Resources Information Center

    Campbell, Clifton P.; Allender, Bill R.

    1988-01-01

    Criterion-referenced performance tests (CRPT) provide a realistic method for objectively measuring task proficiency against predetermined attainment standards. This article explains the procedures of constructing, validating, and scoring CRPTs and includes a checklist for a welding test. (JOW)

  8. NDARC - NASA Design and Analysis of Rotorcraft Validation and Demonstration

    NASA Technical Reports Server (NTRS)

    Johnson, Wayne

    2010-01-01

    Validation and demonstration results from the development of the conceptual design tool NDARC (NASA Design and Analysis of Rotorcraft) are presented. The principal tasks of NDARC are to design a rotorcraft to satisfy specified design conditions and missions, and then analyze the performance of the aircraft for a set of off-design missions and point operating conditions. The aircraft chosen as NDARC development test cases are the UH-60A single main-rotor and tail-rotor helicopter, the CH-47D tandem helicopter, the XH-59A coaxial lift-offset helicopter, and the XV-15 tiltrotor. These aircraft were selected because flight performance data, a weight statement, detailed geometry information, and a correlated comprehensive analysis model are available for each. Validation consists of developing the NDARC models for these aircraft by using geometry and weight information, airframe wind tunnel test data, engine decks, rotor performance tests, and comprehensive analysis results; and then comparing the NDARC results for aircraft and component performance with flight test data. Based on the calibrated models, the capability of the code to size rotorcraft is explored.

  9. Exploring rationality in schizophrenia

    PubMed Central

    Mortensen, Erik Lykke; Owen, Gareth; Nordgaard, Julie; Jansson, Lennart; Sæbye, Ditte; Flensborg-Madsen, Trine; Parnas, Josef

    2015-01-01

    Background Empirical studies of rationality (syllogisms) in patients with schizophrenia have obtained different results. One study found that patients reason more logically if the syllogism is presented through an unusual content. Aims To explore syllogism-based rationality in schizophrenia. Method Thirty-eight first-admitted patients with schizophrenia and 38 healthy controls solved 29 syllogisms that varied in presentation content (ordinary v. unusual) and validity (valid v. invalid). Statistical tests were made of unadjusted and adjusted group differences in models adjusting for intelligence and neuropsychological test performance. Results Controls outperformed patients on all syllogism types, but the difference between the two groups was only significant for valid syllogisms presented with unusual content. However, when adjusting for intelligence and neuropsychological test performance, all group differences became non-significant. Conclusions When taking intelligence and neuropsychological performance into account, patients with schizophrenia and controls perform similarly on syllogism tests of rationality. Declaration of interest None. Copyright and usage © The Royal College of Psychiatrists 2015. This is an open access article distributed under the terms of the Creative Commons Non-Commercial, No Derivatives (CC BY-NC-ND) licence. PMID:27703730

  10. Prospective validation of pathologic complete response models in rectal cancer: Transferability and reproducibility.

    PubMed

    van Soest, Johan; Meldolesi, Elisa; van Stiphout, Ruud; Gatta, Roberto; Damiani, Andrea; Valentini, Vincenzo; Lambin, Philippe; Dekker, Andre

    2017-09-01

    Multiple models have been developed to predict pathologic complete response (pCR) in locally advanced rectal cancer patients. Unfortunately, validation of these models normally omits the implications of cohort differences on prediction model performance. In this work, we will perform a prospective validation of three pCR models, including information whether this validation will target transferability or reproducibility (cohort differences) of the given models. We applied a novel methodology, the cohort differences model, to predict whether a patient belongs to the training or to the validation cohort. If the cohort differences model performs well, it would suggest a large difference in cohort characteristics meaning we would validate the transferability of the model rather than reproducibility. We tested our method in a prospective validation of three existing models for pCR prediction in 154 patients. Our results showed a large difference between training and validation cohort for one of the three tested models [Area under the Receiver Operating Curve (AUC) cohort differences model: 0.85], signaling the validation leans towards transferability. Two out of three models had a lower AUC for validation (0.66 and 0.58), one model showed a higher AUC in the validation cohort (0.70). We have successfully applied a new methodology in the validation of three prediction models, which allows us to indicate if a validation targeted transferability (large differences between training/validation cohort) or reproducibility (small cohort differences). © 2017 American Association of Physicists in Medicine.

  11. Validation of a short-term memory test for the recognition of people and faces.

    PubMed

    Leyk, D; Sievert, A; Heiss, A; Gorges, W; Ridder, D; Alexander, T; Wunderlich, M; Ruther, T

    2008-08-01

    Memorising and processing faces is a short-term memory dependent task of utmost importance in the security domain, in which constant and high performance is a must. Especially in access or passport control-related tasks, the timely identification of performance decrements is essential, margins of error are narrow and inadequate performance may have grave consequences. However, conventional short-term memory tests frequently use abstract settings with little relevance to working situations. They may thus be unable to capture task-specific decrements. The aim of the study was to devise and validate a new test, better reflecting job specifics and employing appropriate stimuli. After 1.5 s (short) or 4.5 s (long) presentation, a set of seven portraits of faces had to be memorised for comparison with two control stimuli. Stimulus appearance followed 2 s (first item) and 8 s (second item) after set presentation. Twenty-eight subjects (12 male, 16 female) were tested at seven different times of day, 3 h apart. Recognition rates were above 60% even for the least favourable condition. Recognition was significantly better in the 'long' condition (+10%) and for the first item (+18%). Recognition time showed significant differences (10%) between items. Minor effects of learning were found for response latencies only. Based on occupationally relevant metrics, the test displayed internal and external validity, consistency and suitability for further use in test/retest scenarios. In public security, especially where access to restricted areas is monitored, margins of error are narrow and operator performance must remain high and level. Appropriate schedules for personnel, based on valid test results, are required. However, task-specific data and performance tests, permitting the description of task specific decrements, are not available. Commonly used tests may be unsuitable due to undue abstraction and insufficient reference to real-world conditions. Thus, tests are required that account for task-specific conditions and neurophysiological characteristics.

  12. Construct Validity of Fresh Frozen Human Cadaver as a Training Model in Minimal Access Surgery

    PubMed Central

    Macafee, David; Pranesh, Nagarajan; Horgan, Alan F.

    2012-01-01

    Background: The construct validity of fresh human cadaver as a training tool has not been established previously. The aims of this study were to investigate the construct validity of fresh frozen human cadaver as a method of training in minimal access surgery and determine if novices can be rapidly trained using this model to a safe level of performance. Methods: Junior surgical trainees, novices (<3 laparoscopic procedures performed) in laparoscopic surgery, performed 10 repetitions of a set of structured laparoscopic tasks on fresh frozen cadavers. Expert laparoscopists (>100 laparoscopic procedures) performed 3 repetitions of identical tasks. Performances were scored using a validated, objective Global Operative Assessment of Laparoscopic Skills scale. Scores for 3 consecutive repetitions were compared between experts and novices to determine construct validity. Furthermore, to determine if the novices reached a safe level, a trimmed mean of the experts' scores was used to define a benchmark. Mann-Whitney U test was used for construct validity analysis and 1-sample t test to compare performances of the novice group with the benchmark safe score. Results: Ten novices and 2 experts were recruited. Four out of 5 tasks (nondominant to dominant hand transfer; simulated appendicectomy; intracorporeal and extracorporeal knot tying) showed construct validity. Novices’ scores became comparable to benchmark scores between the eighth and tenth repetition. Conclusion: Minimal access surgical training using fresh frozen human cadavers appears to have construct validity. The laparoscopic skills of novices can be accelerated through to a safe level within 8 to 10 repetitions. PMID:23318058

  13. Work zone performance measures pilot test.

    DOT National Transportation Integrated Search

    2011-04-01

    Currently, a well-defined and validated set of metrics to use in monitoring work zone performance does not exist. This pilot test was conducted to assist state DOTs in identifying what work zone performance measures can and should be targeted, what...

  14. Relationships between the handball-specific complex test, non-specific field tests and the match performance score in elite professional handball players.

    PubMed

    Hermassi, Souhail; Chelly, Mohamed-Souhaiel; Wollny, Rainer; Hoffmeyer, Birgit; Fieseler, Georg; Schulze, Stephan; Irlenbusch, Lars; Delank, Karl-Stefan; Shephard, Roy J; Bartels, Thomas; Schwesig, René

    2018-06-01

    This study assessed the validity of the handball-specific complex test (HBCT) and two non-specific field tests in professional elite handball athletes, using the match performance score (MPS) as the gold standard of performance. Thirteen elite male handball players (age: 27.4±4.8 years; premier German league) performed the HBCT, the Yo-Yo Intermittent Recovery (YYIR) test and a repeated shuttle sprint ability (RSA) test at the beginning of pre-season training. The RSA results were evaluated in terms of best time, total time, and fatigue decrement. Heart rates (HR) were assessed at selected times throughout all tests; the recovery HR was measured immediately post-test and 10 minutes later. The match performance score was based on various handball specific parameters (e.g., field goals, assists, steals, blocks, and technical mistakes) as seen during all matches of the immediately subsequent season (2015/2016). The parameters of run 1, run 2, and HR recovery at minutes 6 and 10 of the RSA test all showed a variance of more than 10% (range: 11-15%). However, the variance of scores for the YYIR test was much smaller (range: 1-7%). The resting HR (r² = 0.18), HR recovery at minute 10 (r² = 0.10), lactate concentration at rest (r² = 0.17), recovery of heart rate from 0 to 10 minutes (r² = 0.15), and velocity of second throw at first trial (r² = 0.37) were the most valid HBCT parameters. Much effort is necessary to assess MPS and to develop valid tests. Speed and the rate of functional recovery seem the best predictors of competitive performance for elite handball players.

  15. Statistical validation of normal tissue complication probability models.

    PubMed

    Xu, Cheng-Jian; van der Schaaf, Arjen; Van't Veld, Aart A; Langendijk, Johannes A; Schilstra, Cornelis

    2012-09-01

    To investigate the applicability and value of double cross-validation and permutation tests as established statistical approaches in the validation of normal tissue complication probability (NTCP) models. A penalized regression method, LASSO (least absolute shrinkage and selection operator), was used to build NTCP models for xerostomia after radiation therapy treatment of head-and-neck cancer. Model assessment was based on the likelihood function and the area under the receiver operating characteristic curve. Repeated double cross-validation showed the uncertainty and instability of the NTCP models and indicated that the statistical significance of model performance can be obtained by permutation testing. Repeated double cross-validation and permutation tests are recommended to validate NTCP models before clinical use. Copyright © 2012 Elsevier Inc. All rights reserved.
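
    The permutation-testing step mentioned in the abstract above can be sketched as follows: shuffle the outcome labels many times and count how often the shuffled AUC reaches the observed AUC. This is a simplified illustration on fixed prediction scores; the abstract's LASSO model building and repeated double cross-validation are not reproduced here, and all names and data are hypothetical.

```python
import random

def auc(labels, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def permutation_p_value(labels, scores, n_perm=2000, seed=0):
    """Return (observed AUC, one-sided permutation p-value)."""
    rng = random.Random(seed)
    observed = auc(labels, scores)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any link between labels and scores
        if auc(shuffled, scores) >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)

# Hypothetical complication labels (1 = complication) and predicted risks.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
scores = [0.10, 0.15, 0.20, 0.30, 0.55, 0.60, 0.70, 0.90]
observed_auc, p_value = permutation_p_value(labels, scores)
print(observed_auc, p_value)
```

    In a fuller analysis the whole model-building procedure would be rerun on each permuted data set, so that the null distribution reflects variable selection as well as scoring.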

  16. Cross-cultural adaptation and validation of the sino-nasal outcome test (SNOT-22) for Spanish-speaking patients.

    PubMed

    de los Santos, Gonzalo; Reyes, Pablo; del Castillo, Raúl; Fragola, Claudio; Royuela, Ana

    2015-11-01

    Our objective was to perform translation, cross-cultural adaptation and validation of the sino-nasal outcome test 22 (SNOT-22) to Spanish language. SNOT-22 was translated, back translated, and a pretest trial was performed. The study included 119 individuals divided into 60 cases, who met diagnostic criteria for chronic rhinosinusitis according to the European Position Paper on Rhinosinusitis 2012; and 59 controls, who reported no sino-nasal disease. Internal consistency was evaluated with Cronbach's alpha test, reproducibility with Kappa coefficient, reliability with intraclass correlation coefficient (ICC), validity with Mann-Whitney U test and responsiveness with Wilcoxon test. In cases, Cronbach's alpha was 0.91 both before and after treatment, as for controls, it was 0.90 at their first test assessment and 0.88 at 3 weeks. Kappa coefficient was calculated for each item, with an average score of 0.69. ICC was also performed for each item, with a score of 0.87 in the overall score and an average among all items of 0.71. Median score for cases was 47, and 2 for controls, finding the difference to be highly significant (Mann-Whitney U test, p < 0.001). Clinical changes were observed among treated patients, with a median score of 47 and 13.5 before and after treatment, respectively (Wilcoxon test, p < 0.001). The effect size resulted in 0.14 in treated patients whose status at 3 weeks was unvarying; 1.03 in those who were better and 1.89 for much better group. All controls were unvarying with an effect size of 0.05. The Spanish version of the SNOT-22 has the internal consistency, reliability, reproducibility, validity and responsiveness necessary to be a valid instrument to be used in clinical practice.
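
    For readers unfamiliar with the internal-consistency statistic reported above, Cronbach's alpha for k items is k/(k-1) times (1 minus the sum of item variances divided by the variance of the total score). The sketch below applies that standard formula to made-up questionnaire responses; the data and function name are hypothetical and unrelated to the SNOT-22 study data.

```python
from statistics import variance

def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(scores[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical responses: 5 respondents, 3 items.
responses = [
    [3, 4, 3],
    [2, 2, 3],
    [4, 5, 4],
    [1, 2, 2],
    [3, 3, 4],
]
print(round(cronbach_alpha(responses), 2))
```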

  17. Two-Speed Gearbox Dynamic Simulation Predictions and Test Validation

    NASA Technical Reports Server (NTRS)

    Lewicki, David G.; DeSmidt, Hans; Smith, Edward C.; Bauman, Steven W.

    2010-01-01

    Dynamic simulations and experimental validation tests were performed on a two-stage, two-speed gearbox as part of the drive system research activities of the NASA Fundamental Aeronautics Subsonics Rotary Wing Project. The gearbox was driven by two electromagnetic motors and had two electromagnetic, multi-disk clutches to control output speed. A dynamic model of the system was created which included a direct current electric motor with proportional-integral-derivative (PID) speed control, a two-speed gearbox with dual electromagnetically actuated clutches, and an eddy current dynamometer. A six degree-of-freedom model of the gearbox accounted for the system torsional dynamics and included gear, clutch, shaft, and load inertias as well as shaft flexibilities and a dry clutch stick-slip friction model. Experimental validation tests were performed on the gearbox in the NASA Glenn gear noise test facility. Gearbox output speed and torque as well as drive motor speed and current were compared to those from the analytical predictions. The experiments correlate very well with the predictions, thus validating the dynamic simulation methodologies.

  18. The analytical validation of the Oncotype DX Recurrence Score assay

    PubMed Central

    Baehner, Frederick L

    2016-01-01

    In vitro diagnostic multivariate index assays are highly complex molecular assays that can provide clinically actionable information regarding the underlying tumour biology and facilitate personalised treatment. These assays are only useful in clinical practice if all of the following are established: analytical validation (i.e., how accurately/reliably the assay measures the molecular characteristics), clinical validation (i.e., how consistently/accurately the test detects/predicts the outcomes of interest), and clinical utility (i.e., how likely the test is to significantly improve patient outcomes). In considering the use of these assays, clinicians often focus primarily on the clinical validity/utility; however, the analytical validity of an assay (e.g., its accuracy, reproducibility, and standardisation) should also be evaluated and carefully considered. This review focuses on the rigorous analytical validation and performance of the Oncotype DX® Breast Cancer Assay, which is performed at the Central Clinical Reference Laboratory of Genomic Health, Inc. The assay process includes tumour tissue enrichment (if needed), RNA extraction, gene expression quantitation (using a gene panel consisting of 16 cancer genes plus 5 reference genes and quantitative real-time RT-PCR), and an automated computer algorithm to produce a Recurrence Score® result (scale: 0–100). This review presents evidence showing that the Recurrence Score result reported for each patient falls within a tight clinically relevant confidence interval. Specifically, the review discusses how the development of the assay was designed to optimise assay performance, presents data supporting its analytical validity, and describes the quality control and assurance programmes that ensure optimal test performance over time. PMID:27729940

  19. The analytical validation of the Oncotype DX Recurrence Score assay.

    PubMed

    Baehner, Frederick L

    2016-01-01

    In vitro diagnostic multivariate index assays are highly complex molecular assays that can provide clinically actionable information regarding the underlying tumour biology and facilitate personalised treatment. These assays are only useful in clinical practice if all of the following are established: analytical validation (i.e., how accurately/reliably the assay measures the molecular characteristics), clinical validation (i.e., how consistently/accurately the test detects/predicts the outcomes of interest), and clinical utility (i.e., how likely the test is to significantly improve patient outcomes). In considering the use of these assays, clinicians often focus primarily on the clinical validity/utility; however, the analytical validity of an assay (e.g., its accuracy, reproducibility, and standardisation) should also be evaluated and carefully considered. This review focuses on the rigorous analytical validation and performance of the Oncotype DX® Breast Cancer Assay, which is performed at the Central Clinical Reference Laboratory of Genomic Health, Inc. The assay process includes tumour tissue enrichment (if needed), RNA extraction, gene expression quantitation (using a gene panel consisting of 16 cancer genes plus 5 reference genes and quantitative real-time RT-PCR), and an automated computer algorithm to produce a Recurrence Score® result (scale: 0-100). This review presents evidence showing that the Recurrence Score result reported for each patient falls within a tight clinically relevant confidence interval. Specifically, the review discusses how the development of the assay was designed to optimise assay performance, presents data supporting its analytical validity, and describes the quality control and assurance programmes that ensure optimal test performance over time.

  20. Large format array controller (aLFA-C): tests and characterisation at ESA

    NASA Astrophysics Data System (ADS)

    Lemmel, Frédéric; ter Haar, Jörg; van der Biezen, John; Duvet, Ludovic; Nelms, Nick; Blommaert, Sander; Butler, Bart; van der Luijt, Cornelis; Heijnen, Jerko; Smit, Hans; Visser, Ivo

    2016-08-01

    For future near infrared astronomy missions, ESA is developing a complete detection and conversion chain (photon to SpaceWire chain system): Large Format Array (aLFA-N) based on MCT type detectors. aLFA-C (Astronomy Large Format Array Controller): a versatile cryogenic detector controller. An aLFA-C prototype was developed by Caeleste (Belgium) under ESA contract (400106260400). To validate independently the performances of the aLFA-C prototype and consolidate the definition of the follow-on activity, a dedicated test bench has been designed and developed in ESTEC/ESA within the Payload Technology Validation group. This paper presents the test setup and the performance validation of the first prototype of this controller at room and cryogenic temperature. Test setup and software needed to test the HAWAII-2RG and aLFA-N detectors with the aLFA-C prototype at cryogenic temperature will be also presented.

  1. Updating the Trainability Tests Literature on Black-White Subgroup Differences and Reconsidering Criterion-Related Validity

    ERIC Educational Resources Information Center

    Roth, Philip L.; Buster, Maury A.; Bobko, Philip

    2011-01-01

    A number of applied psychologists have suggested that trainability test Black-White ethnic group differences are low or relatively low (e.g., Siegel & Bergman, 1975), though data are scarce. Likewise, there are relatively few estimates of criterion-related validity for trainability tests predicting job performance (cf. Robertson & Downs,…

  2. Slip resistance of winter footwear on snow and ice measured using maximum achievable incline.

    PubMed

    Hsu, Jennifer; Shaw, Robert; Novak, Alison; Li, Yue; Ormerod, Marcus; Newton, Rita; Dutta, Tilak; Fernie, Geoff

    2016-05-01

    Protective footwear is necessary for preventing injurious slips and falls in winter conditions. Valid methods for assessing footwear slip resistance on winter surfaces are needed in order to evaluate footwear and outsole designs. The purpose of this study was to utilise a method of testing winter footwear that was ecologically valid in terms of involving actual human testers walking on realistic winter surfaces to produce objective measures of slip resistance. During the experiment, eight participants tested six styles of footwear on wet ice, on dry ice, and on dry ice after walking over soft snow. Slip resistance was measured by determining the maximum incline angles participants were able to walk up and down in each footwear-surface combination. The results indicated that testing on a variety of surfaces is necessary for establishing winter footwear performance and that standard mechanical bench tests for footwear slip resistance do not adequately reflect actual performance. Practitioner Summary: Existing standardised methods for measuring footwear slip resistance lack validation on winter surfaces. By determining the maximum inclines participants could walk up and down slopes of wet ice, dry ice, and ice with snow, in a range of footwear, an ecologically valid test for measuring winter footwear performance was established.

  3. Slip resistance of winter footwear on snow and ice measured using maximum achievable incline

    PubMed Central

    Hsu, Jennifer; Shaw, Robert; Novak, Alison; Li, Yue; Ormerod, Marcus; Newton, Rita; Dutta, Tilak; Fernie, Geoff

    2016-01-01

    Abstract Protective footwear is necessary for preventing injurious slips and falls in winter conditions. Valid methods for assessing footwear slip resistance on winter surfaces are needed in order to evaluate footwear and outsole designs. The purpose of this study was to utilise a method of testing winter footwear that was ecologically valid in terms of involving actual human testers walking on realistic winter surfaces to produce objective measures of slip resistance. During the experiment, eight participants tested six styles of footwear on wet ice, on dry ice, and on dry ice after walking over soft snow. Slip resistance was measured by determining the maximum incline angles participants were able to walk up and down in each footwear–surface combination. The results indicated that testing on a variety of surfaces is necessary for establishing winter footwear performance and that standard mechanical bench tests for footwear slip resistance do not adequately reflect actual performance. Practitioner Summary: Existing standardised methods for measuring footwear slip resistance lack validation on winter surfaces. By determining the maximum inclines participants could walk up and down slopes of wet ice, dry ice, and ice with snow, in a range of footwear, an ecologically valid test for measuring winter footwear performance was established. PMID:26555738

  4. Reliability and validity of the revised Gibson Test of Cognitive Skills, a computer-based test battery for assessing cognition across the lifespan.

    PubMed

    Moore, Amy Lawson; Miller, Terissa M

    2018-01-01

    The purpose of the current study is to evaluate the validity and reliability of the revised Gibson Test of Cognitive Skills, a computer-based battery of tests measuring short-term memory, long-term memory, processing speed, logic and reasoning, visual processing, as well as auditory processing and word attack skills. This study included 2,737 participants aged 5-85 years. A series of studies was conducted to examine the validity and reliability using the test performance of the entire norming group and several subgroups. The evaluation of the technical properties of the test battery included content validation by subject matter experts, item analysis and coefficient alpha, test-retest reliability, split-half reliability, and analysis of concurrent validity with the Woodcock Johnson III Tests of Cognitive Abilities and Tests of Achievement. Results indicated strong sources of evidence of validity and reliability for the test, including internal consistency reliability coefficients ranging from 0.87 to 0.98, test-retest reliability coefficients ranging from 0.69 to 0.91, split-half reliability coefficients ranging from 0.87 to 0.91, and concurrent validity coefficients ranging from 0.53 to 0.93. The Gibson Test of Cognitive Skills-2 is a reliable and valid tool for assessing cognition in the general population across the lifespan.

  5. Development and validation of a web-based questionnaire for surveying the health and working conditions of high-performance marine craft populations

    PubMed Central

    de Alwis, Manudul Pahansen; Lo Martire, Riccardo; Äng, Björn O; Garme, Karl

    2016-01-01

    Background High-performance marine craft crews are susceptible to various adverse health conditions caused by multiple interactive factors. However, there are limited epidemiological data available for assessment of working conditions at sea. Although questionnaire surveys are widely used for identifying exposures, outcomes and associated risks with high accuracy levels, until now, no validated epidemiological tool exists for surveying occupational health and performance in these populations. Aim To develop and validate a web-based questionnaire for epidemiological assessment of occupational and individual risk exposure pertinent to the musculoskeletal health conditions and performance in high-performance marine craft populations. Method A questionnaire for investigating the association between work-related exposure, performance and health was initially developed by a consensus panel under four subdomains, viz. demography, lifestyle, work exposure and health and systematically validated by expert raters for content relevance and simplicity in three consecutive stages, each iteratively followed by a consensus panel revision. The item content validity index (I-CVI) was determined as the proportion of experts giving a rating of 3 or 4. The scale content validity index (S-CVI/Ave) was computed by averaging the I-CVIs for the assessment of the questionnaire as a tool. Finally, the questionnaire was pilot tested. Results The S-CVI/Ave increased from 0.89 to 0.96 for relevance and from 0.76 to 0.94 for simplicity, resulting in 36 items in the final questionnaire. The pilot test confirmed the feasibility of the questionnaire. Conclusions The present study shows that the web-based questionnaire fulfils previously published validity acceptance criteria and is therefore considered valid and feasible for the empirical surveying of epidemiological aspects among high-performance marine craft crews and similar populations. PMID:27324717
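
    The two content validity indices defined in the abstract above are simple to compute: the I-CVI is the proportion of experts rating an item 3 or 4, and the S-CVI/Ave is the mean of the I-CVIs. A minimal sketch follows, assuming a 4-point rating scale; the ratings and function names are hypothetical and are not the study's data.

```python
from statistics import mean

def item_cvi(ratings):
    """I-CVI: proportion of experts giving the item a rating of 3 or 4."""
    return sum(1 for r in ratings if r >= 3) / len(ratings)

def scale_cvi_ave(ratings_by_item):
    """S-CVI/Ave: average of the item-level I-CVIs."""
    return mean(item_cvi(r) for r in ratings_by_item)

# Hypothetical relevance ratings: 3 items, each rated by 5 experts.
ratings = [
    [4, 4, 3, 4, 3],  # all five experts rate 3 or 4
    [4, 3, 2, 4, 3],  # four of five
    [3, 2, 2, 4, 4],  # three of five
]
print([item_cvi(r) for r in ratings])
print(scale_cvi_ave(ratings))
```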

  6. Danish VISA-A questionnaire with validation and reliability testing for Danish-speaking Achilles tendinopathy patients.

    PubMed

    Iversen, J V; Bartels, E M; Jørgensen, J E; Nielsen, T G; Ginnerup, C; Lind, M C; Langberg, H

    2016-12-01

    The VISA-A questionnaire has proven to be a valid and reliable tool for assessing severity of Achilles tendinopathy (AT). The aim was to translate and cross-culturally adapt the VISA-A questionnaire for a Danish-speaking AT population, and subsequently perform validity and reliability tests. Translation and subsequent cross-cultural adaptation were performed as translation, synthesis, reverse translation, expert review, and pretesting. The final Danish version (VISA-A-DK) was tested for reliability on healthy controls (n = 75) and patients (n = 36). Tests for internal consistency, validity, and structure were performed on 71 patients. VISA-A-DK showed good reliability for patients (r = 0.80, ICC = 0.79) and healthy individuals (r = 0.98, ICC = 0.97). Internal consistency was 0.73 (Cronbach's alpha). The mean VISA-A-DK score in AT patients was 51 [47-55]. This was significantly lower than healthy controls with a score of 93 (90-95). Criterion validity was considered good when comparing the scores of the Danish version with the original version in both healthy individuals and patients. VISA-A-DK is a valid and reliable instrument and has been shown to be compatible with the original version in assessment of AT patients. VISA-A-DK is a useful tool in the assessment of AT, both in research and in a clinical setting. © 2015 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  7. Development of Internet-Based Tasks for the Executive Function Performance Test.

    PubMed

    Rand, Debbie; Lee Ben-Haim, Keren; Malka, Rachel; Portnoy, Sigal

    The Executive Function Performance Test (EFPT) is a reliable and valid performance-based tool to assess executive functions (EFs). This study's objective was to develop and verify two Internet-based tasks for the EFPT. A cross-sectional study assessed the alternate-form reliability of the Internet-based bill-paying and telephone-use tasks in healthy adults and people with subacute stroke (Study 1). It also sought to establish the tasks' criterion reliability for assessing EF deficits by correlating performance with that on the Trail Making Test in five groups: healthy young adults, healthy older adults, people with subacute stroke, people with chronic stroke, and young adults with attention deficit hyperactivity disorder (Study 2). The alternative-form reliability and initial construct validity for the Internet-based bill-paying task were verified. Criterion validity was established for both tasks. The Internet-based tasks are comparable to the original EFPT tasks and can be used for assessment of EF deficits. Copyright © 2018 by the American Occupational Therapy Association, Inc.

  8. UAS in the NAS Flight Test Series 4 Overview

    NASA Technical Reports Server (NTRS)

    Murphy, Jim

    2016-01-01

    Flight Test Series 4 (FT4) provides the researchers with an opportunity to expand on the data collected during the first flight tests. Following Flight Test Series 3, additional scripted encounters with different aircraft performance and sensors will be conducted. FT4 is presently planned for Spring of 2016 to ensure collection of data to support the validation of the final RTCA Phase 1 DAA (Detect and Avoid) Minimum Operational Performance Standards (MOPS). There are three research objectives associated with this goal: (1) evaluate the performance of the DAA system against cooperative and non-cooperative aircraft encounters; (2) evaluate UAS (Unmanned Aircraft Systems) pilot performance in response to DAA maneuver guidance and alerting with live intruder encounters; and (3) evaluate TCAS/DAA (Traffic Alert and Collision Avoidance System/Detect and Avoid) interoperability. This flight test series will focus on only the Scripted Encounters configuration, supporting the collection of data to validate the interoperability of DAA and collision avoidance algorithms.

  9. Hyper-X Engine Design and Ground Test Program

    NASA Technical Reports Server (NTRS)

    Voland, R. T.; Rock, K. E.; Huebner, L. D.; Witte, D. W.; Fischer, K. E.; McClinton, C. R.

    1998-01-01

    The Hyper-X Program, NASA's focused hypersonic technology program jointly run by NASA Langley and Dryden, is designed to move hypersonic, air-breathing vehicle technology from the laboratory environment to the flight environment, the last stage preceding prototype development. The Hyper-X research vehicle will provide the first ever opportunity to obtain data on an airframe integrated supersonic combustion ramjet propulsion system in flight, providing the first flight validation of wind tunnel, numerical and analytical methods used for design of these vehicles. A substantial portion of the integrated vehicle/engine flowpath development, engine systems verification and validation and flight test risk reduction efforts are experimentally based, including vehicle aeropropulsive force and moment database generation for flight control law development, and integrated vehicle/engine performance validation. The Mach 7 engine flowpath development tests have been completed, and effort is now shifting to engine controls, systems and performance verification and validation tests, as well as additional flight test risk reduction tests. The engine wind tunnel tests required for these efforts range from tests of partial width engines in both small and large scramjet test facilities, to tests of the full flight engine on a vehicle simulator and tests of a complete flight vehicle in the Langley 8-Ft. High Temperature Tunnel. These tests will begin in the summer of 1998 and continue through 1999. The first flight test is planned for early 2000.

  10. Meta-Analysis of Integrity Tests: A Critical Examination of Validity Generalization and Moderator Variables

    DTIC Science & Technology

    1992-06-01

    Performing organization: University of Iowa. Monitoring organization: Defense Personnel...data points. Results indicate that integrity test validities are positive and in many cases substantial for predicting both job performance and

  11. The 2.3 kW Ion Thruster Wear Test

    NASA Technical Reports Server (NTRS)

    Parkes, James; Rawlin, Vincent K.; Sovey, James S.; Kussmaul, Michael J.; Patterson, Michael J.

    1995-01-01

    A 30-cm diameter xenon ion thruster is under development at NASA to provide an ion propulsion option for auxiliary and primary propulsion on missions of national interest. Specific efforts include thruster design optimizations, component life testing and validation, and performance characterizations. Under this program, the ion thruster will be brought to engineering model development status. This paper describes the results of a 2.3-kW 2000-hour wear test performed to identify life limiting phenomena, measure the performance and characterize the operation of the thruster, and obtain wear, erosion, and surface contamination data. These data are being used as a database for proceeding with additional life validation tests, and to provide input to flight thruster design requirements.

  12. The importance of assessing for validity of symptom report and performance in attention deficit/hyperactivity disorder (ADHD): Introduction to the special section on noncredible presentation in ADHD.

    PubMed

    Suhr, Julie A; Berry, David T R

    2017-12-01

    Invalid self-report and invalid performance occur with high base rates in attention deficit/hyperactivity disorder (ADHD; Harrison, 2006; Musso & Gouvier, 2014). Although much research has focused on the development and validation of symptom validity tests (SVTs) and performance validity tests (PVTs) for psychiatric and neurological presentations, less attention has been given to the use of SVTs and PVTs in ADHD evaluation. This introduction to the special section describes a series of studies examining the use of SVTs and PVTs in adult ADHD evaluation. We present the series of studies in the context of prior research on noncredible presentation and call for future research using improved research methods and with a focus on assessment issues specific to ADHD evaluation. (PsycINFO Database Record (c) 2017 APA, all rights reserved).

  13. Validity: Applying Current Concepts and Standards to Gynecologic Surgery Performance Assessments

    ERIC Educational Resources Information Center

    LeClaire, Edgar L.; Nihira, Mikio A.; Hardré, Patricia L.

    2015-01-01

    Validity is critical for meaningful assessment of surgical competency. According to the Standards for Educational and Psychological Testing, validation involves the integration of data from well-defined classifications of evidence. In the authoritative framework, data from all classifications support construct validity claims. The two aims of this…

  14. An initial investigation into the validity of a computer-based auditory processing assessment (Feather Squadron).

    PubMed

    Barker, Matthew D; Purdy, Suzanne C

    2016-01-01

    This research investigates a novel method for identifying and measuring school-aged children with poor auditory processing through a tablet computer. Feasibility and test-retest reliability are investigated by examining the percentage of Group 1 participants able to complete the tasks and developmental effects on performance. Concurrent validity was investigated against traditional tests of auditory processing using Group 2. There were 847 students aged 5 to 13 years in group 1, and 46 aged 5 to 14 years in group 2. Some tasks could not be completed by the youngest participants. Significant correlations were found between results of most auditory processing areas assessed by the Feather Squadron test and traditional auditory processing tests. Test-retest comparisons indicated good reliability for most of the Feather Squadron assessments and some of the traditional tests. The results indicate the Feather Squadron assessment is a time-efficient, feasible, concurrently valid, and reliable approach for measuring auditory processing in school-aged children. Clinically, this may be a useful option for audiologists when performing auditory processing assessments as it is a relatively fast, engaging, and easy way to assess auditory processing abilities. Research is needed to investigate further the construct validity of this new assessment by examining the association between performance on Feather Squadron and objective evoked potential, lesion studies, and/or functional imaging measures of auditory function.

  15. Construct validity of the individual work performance questionnaire.

    PubMed

    Koopmans, Linda; Bernaards, Claire M; Hildebrandt, Vincent H; de Vet, Henrica C W; van der Beek, Allard J

    2014-03-01

    To examine the construct validity of the Individual Work Performance Questionnaire (IWPQ). A total of 1424 Dutch workers from three occupational sectors (blue, pink, and white collar) participated in the study. First, IWPQ scores were correlated with related constructs (convergent validity). Second, differences between known groups were tested (discriminative validity). First, IWPQ scores correlated weakly to moderately with absolute and relative presenteeism, and work engagement. Second, significant differences in IWPQ scores were observed for workers differing in job satisfaction, and workers differing in health. Overall, the results indicate acceptable construct validity of the IWPQ. Researchers are provided with a reliable and valid instrument to measure individual work performance comprehensively and generically, among workers from different occupational sectors, with and without health problems.

  16. Catch-up validation study of an in vitro skin irritation test method based on an open source reconstructed epidermis (phase II).

    PubMed

    Groeber, F; Schober, L; Schmid, F F; Traube, A; Kolbus-Hernandez, S; Daton, K; Hoffmann, S; Petersohn, D; Schäfer-Korting, M; Walles, H; Mewes, K R

    2016-10-01

    To replace the Draize skin irritation assay (OECD guideline 404) several test methods based on reconstructed human epidermis (RHE) have been developed and were adopted in the OECD test guideline 439. However, all validated test methods in the guideline are linked to RHE provided by only three companies. Thus, the availability of these test models is dependent on the commercial interest of the producer. To overcome this limitation and thus to increase the accessibility of in vitro skin irritation testing, an open source reconstructed epidermis (OS-REp) was introduced. To demonstrate the capacity of the OS-REp in regulatory risk assessment, a catch-up validation study was performed. The participating laboratories used in-house generated OS-REp to assess the set of 20 reference substances according to the performance standards amending the OECD test guideline 439. Testing was performed under blinded conditions. The within-laboratory reproducibility of 87% and the inter-laboratory reproducibility of 85% prove a high reliability of irritancy testing using the OS-REp protocol. In addition, the prediction capacity, with an accuracy of 80%, was comparable to that of previously published RHE-based test protocols. Taken together, the results indicate that the OS-REp test method can be used as a standalone alternative skin irritation test replacing the OECD test guideline 404. Copyright © 2016 The Authors. Published by Elsevier Ltd. All rights reserved.

  17. Input Range Testing for the General Mission Analysis Tool (GMAT)

    NASA Technical Reports Server (NTRS)

    Hughes, Steven P.

    2007-01-01

    This document contains a test plan for testing input values to the General Mission Analysis Tool (GMAT). The plan includes four primary types of information, which rigorously define all tests that should be performed to validate that GMAT will accept allowable inputs and deny disallowed inputs. The first is a complete list of all allowed object fields in GMAT. The second type of information is test input to be attempted for each field. The third type of information is allowable input values for all object fields in GMAT. The final piece of information is how GMAT should respond to both valid and invalid information. It is VERY important to note that the tests below must be performed for both the Graphical User Interface and the script!! The examples are illustrated using a scripting perspective, because it is simpler to write up. However, the test must be performed for both interfaces to GMAT.
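
    The accept/deny idea described in this abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the field name `ExampleAngleDeg`, its range, and the helper functions are hypothetical and are not taken from GMAT or from the test plan itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldSpec:
    """Allowed range for one numeric field (hypothetical, for illustration)."""
    name: str
    minimum: float
    maximum: float


def accepts(spec: FieldSpec, raw: str) -> bool:
    """Return True if the raw text parses as a number inside the allowed range."""
    try:
        value = float(raw)
    except ValueError:
        return False
    # NaN fails both comparisons, so it is rejected here as well.
    return spec.minimum <= value <= spec.maximum


def run_range_tests(spec, valid_inputs, invalid_inputs):
    """Collect every input whose accept/deny outcome differs from the plan."""
    failures = []
    for raw in valid_inputs:
        if not accepts(spec, raw):
            failures.append((raw, "expected accept"))
    for raw in invalid_inputs:
        if accepts(spec, raw):
            failures.append((raw, "expected deny"))
    return failures


# Hypothetical field and test inputs; not actual GMAT fields or limits.
spec = FieldSpec(name="ExampleAngleDeg", minimum=0.0, maximum=360.0)
failures = run_range_tests(
    spec,
    valid_inputs=["0", "180.5", "360"],
    invalid_inputs=["-0.1", "360.1", "abc", "", "nan"],
)
print(failures)  # an empty list means every input behaved as planned
```

    A real plan of this kind would pair each field with its own list of valid and invalid inputs and would run the same cases through every interface under test.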

  18. Flight-Test Validation and Flying Qualities Evaluation of a Rotorcraft UAV Flight Control System

    NASA Technical Reports Server (NTRS)

    Mettler, Bernard; Tuschler, Mark B.; Kanade, Takeo

    2000-01-01

    This paper presents a process of design and flight-test validation and flying qualities evaluation of a flight control system for a rotorcraft-based unmanned aerial vehicle (RUAV). The keystone of this process is an accurate flight-dynamic model of the aircraft, derived by using system identification modeling. The model captures the most relevant dynamic features of our unmanned rotorcraft, and explicitly accounts for the presence of a stabilizer bar. Using the identified model we were able to determine the performance margins of our original control system and identify limiting factors. The performance limitations were addressed and the attitude control system was optimized for three different performance levels: slow, medium, fast. The optimized control laws will be implemented in our RUAV. We will first determine the validity of our control design approach by flight test validating our optimized controllers. Subsequently, we will fly a series of maneuvers with the three optimized controllers to determine the level of flying qualities that can be attained. The outcome will enable us to draw important conclusions on the flying qualities requirements for small-scale RUAVs.

  19. Timed activity performance in persons with upper limb amputation: A preliminary study.

    PubMed

    Resnik, Linda; Borgia, Mathew; Acluche, Frantzy

    55 subjects with upper limb amputation were administered the T-MAP twice within one week. To develop a timed measure of activity performance for persons with upper limb amputation (T-MAP); examine the measure's internal consistency, test-retest reliability and validity; and compare scores by prosthesis use. Measures of activity performance for persons with upper limb amputation are needed. The time required to perform daily activities is a meaningful metric that has implications for participation in life roles. Internal consistency and test-retest reliability were evaluated. Construct validity was examined by comparing scores by amputation level. Exploratory analyses compared sub-group scores, and examined correlations with other measures. Scale alpha was 0.77, ICC was 0.93. Timed scores differed by amputation level. Subjects using a prosthesis took longer to perform all tasks. T-MAP was not correlated with other measures of dexterity or activity, but was correlated with pain for non-prosthesis users. The timed scale had adequate internal consistency and excellent test-retest reliability. Analyses support reliability and construct validity of the T-MAP. 2c "outcomes" research. Published by Elsevier Inc.

  20. CSI Flight Computer System and experimental test results

    NASA Technical Reports Server (NTRS)

    Sparks, Dean W., Jr.; Peri, F., Jr.; Schuler, P.

    1993-01-01

    This paper describes the CSI Computer System (CCS) and the experimental tests performed to validate its functionality. This system is comprised of two major components: the space flight qualified Excitation and Damping Subsystem (EDS) which performs controls calculations; and the Remote Interface Unit (RIU) which is used for data acquisition, transmission, and filtering. The flight-like RIU is the interface between the EDS and the sensors and actuators positioned on the particular structure under control. The EDS and RIU communicate over the MIL-STD-1553B, a space flight qualified bus. To test the CCS under realistic conditions, it was connected to the Phase-0 CSI Evolutionary Model (CEM) at NASA Langley Research Center. The following schematic shows how the CCS is connected to the CEM. Various tests were performed which validated the ability of the system to perform control/structures experiments.

  1. Investigating the Validity of an Integrated Listening-Speaking Task: A Discourse-Based Analysis of Test Takers' Oral Performances

    ERIC Educational Resources Information Center

    Frost, Kellie; Elder, Catherine; Wigglesworth, Gillian

    2012-01-01

    Performance on integrated tasks requires candidates to engage skills and strategies beyond language proficiency alone, in ways that can be difficult to define and measure for testing purposes. While it has been widely recognized that stimulus materials impact test performance, our understanding of the way in which test takers make use of these…

  2. The predictive validity of a situational judgement test, a clinical problem solving test and the core medical training selection methods for performance in specialty training.

    PubMed

    Patterson, Fiona; Lopes, Safiatu; Harding, Stephen; Vaux, Emma; Berkin, Liz; Black, David

    2017-02-01

    The aim of this study was to follow up a sample of physicians who began core medical training (CMT) in 2009. This paper examines the long-term validity of CMT and GP selection methods in predicting performance in the Membership of Royal College of Physicians (MRCP(UK)) examinations. We performed a longitudinal study, examining the extent to which the GP and CMT selection methods (T1) predict performance in the MRCP(UK) examinations (T2). A total of 2,569 applicants from 2008-09 who completed CMT and GP selection methods were included in the study. Looking at MRCP(UK) part 1, part 2 written and PACES scores, both CMT and GP selection methods show evidence of predictive validity for the outcome variables, and hierarchical regressions show the GP methods add significant value to the CMT selection process. CMT selection methods predict performance in important outcomes and have good evidence of validity; the GP methods may have an additional role alongside the CMT selection methods. © Royal College of Physicians 2017. All rights reserved.

  3. Diagnostic Validity of Wechsler Subtest Scatter

    ERIC Educational Resources Information Center

    Watkins, Marley W.

    2005-01-01

    Cognitive subtest scatter has often been considered to be diagnostically significant. The current study tested the diagnostic validity of four separate operationalizations of WISC-III subtest scatter: (a) range of verbal, performance, and full-scale subtests; (b) variance of verbal, performance, and full-scale subtests; (c) number of subtests…

  4. Performance on a virtual reality angled laparoscope task correlates with spatial ability of trainees.

    PubMed

    Rosenthal, Rachel; Hamel, Christian; Oertli, Daniel; Demartines, Nicolas; Gantert, Walter A

    2010-08-01

    The aim of the present study was to investigate whether trainees' performance on a virtual reality angled laparoscope navigation task correlates with scores obtained on a validated conventional test of spatial ability. 56 participants of a surgery workshop performed an angled laparoscope navigation task on the Xitact LS 500 virtual reality Simulator. Performance parameters were correlated with the score of a validated paper-and-pencil test of spatial ability. Performance at the conventional spatial ability test significantly correlated with performance at the virtual reality task for overall task score (p < 0.001), task completion time (p < 0.001) and economy of movement (p = 0.035), not for endoscope travel speed (p = 0.947). In conclusion, trainees' performance in a standardized virtual reality camera navigation task correlates with their innate spatial ability. This VR session holds potential to serve as an assessment tool for trainees.

  5. Validation of EncephalApp, Smartphone-Based Stroop Test, for the Diagnosis of Covert Hepatic Encephalopathy.

    PubMed

    Bajaj, Jasmohan S; Heuman, Douglas M; Sterling, Richard K; Sanyal, Arun J; Siddiqui, Muhammad; Matherly, Scott; Luketic, Velimir; Stravitz, R Todd; Fuchs, Michael; Thacker, Leroy R; Gilles, HoChong; White, Melanie B; Unser, Ariel; Hovermale, James; Gavis, Edith; Noble, Nicole A; Wade, James B

    2015-10-01

    Detection of covert hepatic encephalopathy (CHE) is difficult, but point-of-care testing could increase rates of diagnosis. We aimed to validate the ability of the smartphone app EncephalApp, a streamlined version of Stroop App, to detect CHE. We evaluated face validity, test-retest reliability, and external validity. Patients with cirrhosis (n = 167; 38% with overt HE [OHE]; mean age, 55 years; mean Model for End-Stage Liver Disease score, 12) and controls (n = 114) were each given a paper and pencil cognitive battery (standard) along with EncephalApp. EncephalApp has Off and On states; results measured were OffTime, OnTime, OffTime+OnTime, and number of runs required to complete 5 off and on runs. Thirty-six patients with cirrhosis underwent driving simulation tests, and EncephalApp results were correlated with results. Test-retest reliability was analyzed in a subgroup of patients. The test was performed before and after transjugular intrahepatic portosystemic shunt placement, and before and after correction for hyponatremia, to determine external validity. All patients with cirrhosis performed worse on paper and pencil and EncephalApp tests than controls. Patients with cirrhosis and OHE performed worse than those without OHE. Age-dependent EncephalApp cutoffs (younger or older than 45 years) were set. An OffTime+OnTime value of >190 seconds identified all patients with CHE with an area under the receiver operator characteristic value of 0.91; the area under the receiver operator characteristic value was 0.88 for diagnosis of CHE in those without OHE. EncephalApp times correlated with crashes and illegal turns in driving simulation tests. Test-retest reliability was high (intraclass coefficient, 0.83) among 30 patients retested 1-3 months apart. OffTime+OnTime increased significantly (206 vs 255 seconds, P = .007) among 10 patients retested 33 ± 7 days after transjugular intrahepatic portosystemic shunt placement. 
    OffTime+OnTime decreased significantly (242 vs 225 seconds, P = .03) in 7 patients tested before and after correction for hyponatremia (126 ± 3 to 132 ± 4 meq/L, P = .01) 10 ± 5 days apart. A smartphone app called EncephalApp has good face validity, test-retest reliability, and external validity for the diagnosis of CHE. Copyright © 2015 AGA Institute. Published by Elsevier Inc. All rights reserved.

  6. SAS molecular tests Escherichia coli O157 detection kit. Performance tested method 031203.

    PubMed

    Bapanpally, Chandra; Montier, Laura; Khan, Shah; Kasra, Akif; Brunelle, Sharon L

    2014-01-01

    The SAS Molecular tests Escherichia coli O157 Detection method, a loop-mediated isothermal amplification method, performed as well as or better than the U.S. Department of Agriculture, Food Safety Inspection Service Microbiology Laboratory Guidebook and the U.S. Food and Drug Administration Bacteriological Analytical Manual reference methods for ground beef, beef trim, bagged mixed lettuce, and fresh spinach. Ground beef (30% fat, 25 g test portion) was validated for 7-8 h enrichment, leafy greens were validated in a 6-7 h enrichment, and ground beef (30% fat, 375 g composite test portion) and beef trim (375 g composite test portion) were validated in a 16-20 h enrichment. The method performance for meat and leafy green matrixes was also shown to be acceptable under conditions of co-enrichment with Salmonella. Thus, after a short co-enrichment step, ground beef, beef trim, lettuce, and spinach can be tested for both Salmonella and E. coli O157. The SAS Molecular tests Salmonella Detection Kit was validated using the same test portions as for the SAS Molecular tests E. coli O157 Detection Kit and those results are presented in a separate report. Inclusivity and exclusivity testing revealed no false negatives and no false positives among the 50 E. coli O157 strains, including H7 and non-motile strains, and 30 non-E. coli O157 strains examined. Finally, the method was shown to be robust when DNA extract hold time and DNA volume were varied. The method comparison and robustness data suggest a full 7 h enrichment time should be used for 25 g ground beef test portions.

  7. Threats to Validity When Using Open-Ended Items in International Achievement Studies: Coding Responses to the PISA 2012 Problem-Solving Test in Finland

    ERIC Educational Resources Information Center

    Arffman, Inga

    2016-01-01

    Open-ended (OE) items are widely used to gather data on student performance in international achievement studies. However, several factors may threaten validity when using such items. This study examined Finnish coders' opinions about threats to validity when coding responses to OE items in the PISA 2012 problem-solving test. A total of 6…

  8. Validation of the Information/Communications Technology Literacy Test

    DTIC Science & Technology

    2016-10-01

    nested set. Table 11 presents the results of incremental validity analyses for job knowledge/performance criteria by MOS. Figure 7 presents much...Systems Operator-Analyst (25B) and Nodal Network Systems Operator-Maintainer (25N) MOS. This report documents technical procedures and results of the...research effort. Results suggest that the ICTL test has potential as a valid and highly efficient predictor of valued outcomes in Signal school MOS. Not

  9. The Geant4 physics validation repository

    NASA Astrophysics Data System (ADS)

    Wenzel, H.; Yarba, J.; Dotti, A.

    2015-12-01

    The Geant4 collaboration regularly performs validation and regression tests. The results are stored in a central repository and can be easily accessed via a web application. In this article we describe the Geant4 physics validation repository which consists of a relational database storing experimental data and Geant4 test results, a Java API and a web application. The functionality of these components and the technology choices we made are also described.

  10. Integrated Testing Approaches for the NASA Ares I Crew Launch Vehicle

    NASA Technical Reports Server (NTRS)

    Taylor, James L.; Cockrell, Charles E.; Tuma, Margaret L.; Askins, Bruce R.; Bland, Jeff D.; Davis, Stephan R.; Patterson, Alan F.; Taylor, Terry L.; Robinson, Kimberly L.

    2008-01-01

    The Ares I crew launch vehicle is being developed by the U.S. National Aeronautics and Space Administration (NASA) to provide crew and cargo access to the International Space Station (ISS) and, together with the Ares V cargo launch vehicle, serves as a critical component of NASA's future human exploration of the Moon. During the preliminary design phase, NASA defined and began implementing plans for integrated ground and flight testing necessary to achieve the first human launch of Ares I. The individual Ares I flight hardware elements - including the first stage five segment booster (FSB), upper stage, and J-2X upper stage engine - will undergo extensive development, qualification, and certification testing prior to flight. Key integrated system tests include the upper stage Main Propulsion Test Article (MPTA), acceptance tests of the integrated upper stage and upper stage engine assembly, a full-scale integrated vehicle ground vibration test (IVGVT), aerodynamic testing to characterize vehicle performance, and integrated testing of the avionics and software components. The Ares I-X development flight test will provide flight data to validate engineering models for aerodynamic performance, stage separation, structural dynamic performance, and control system functionality. The Ares I-Y flight test will validate ascent performance of the first stage, stage separation functionality, and the ability of the upper stage to manage cryogenic propellants to achieve upper stage engine start conditions, and will provide a high-altitude demonstration of the launch abort system (LAS) following stage separation. The Orion 1 flight test will be conducted as a full, un-crewed, operational flight test through the entire ascent flight profile prior to the first crewed launch.

  11. System-Integrated Finite Element Analysis of a Full-Scale Helicopter Crash Test with Deployable Energy Absorbers

    NASA Technical Reports Server (NTRS)

    Annett, Martin S.; Polanco, Michael A.

    2010-01-01

    A full-scale crash test of an MD-500 helicopter was conducted in December 2009 at NASA Langley's Landing and Impact Research facility (LandIR). The MD-500 helicopter was fitted with a composite honeycomb Deployable Energy Absorber (DEA) and tested under vertical and horizontal impact velocities of 26-ft/sec and 40-ft/sec, respectively. The objectives of the test were to evaluate the performance of the DEA concept under realistic crash conditions and to generate test data for validation of a system integrated finite element model. In preparation for the full-scale crash test, a series of sub-scale and MD-500 mass simulator tests was conducted to evaluate the impact performances of various components, including a new crush tube and the DEA blocks. Parameters defined within the system integrated finite element model were determined from these tests. The objective of this paper is to summarize the finite element models developed and analyses performed, beginning with pre-test predictions and continuing through post-test validation.

  12. LS-DYNA Analysis of a Full-Scale Helicopter Crash Test

    NASA Technical Reports Server (NTRS)

    Annett, Martin S.

    2010-01-01

    A full-scale crash test of an MD-500 helicopter was conducted in December 2009 at NASA Langley's Landing and Impact Research facility (LandIR). The MD-500 helicopter was fitted with a composite honeycomb Deployable Energy Absorber (DEA) and tested under vertical and horizontal impact velocities of 26 ft/sec and 40 ft/sec, respectively. The objectives of the test were to evaluate the performance of the DEA concept under realistic crash conditions and to generate test data for validation of a system integrated LS-DYNA finite element model. In preparation for the full-scale crash test, a series of sub-scale and MD-500 mass simulator tests was conducted to evaluate the impact performances of various components, including a new crush tube and the DEA blocks. Parameters defined within the system integrated finite element model were determined from these tests. The objective of this paper is to summarize the finite element models developed and analyses performed, beginning with pre-test and continuing through post-test validation.

  13. The Arthroscopic Surgical Skill Evaluation Tool (ASSET).

    PubMed

    Koehler, Ryan J; Amsdell, Simon; Arendt, Elizabeth A; Bisson, Leslie J; Braman, Jonathan P; Bramen, Jonathan P; Butler, Aaron; Cosgarea, Andrew J; Harner, Christopher D; Garrett, William E; Olson, Tyson; Warme, Winston J; Nicandri, Gregg T

    2013-06-01

    Surgeries employing arthroscopic techniques are among the most commonly performed in orthopaedic clinical practice; however, valid and reliable methods of assessing the arthroscopic skill of orthopaedic surgeons are lacking. The Arthroscopic Surgery Skill Evaluation Tool (ASSET) will demonstrate content validity, concurrent criterion-oriented validity, and reliability when used to assess the technical ability of surgeons performing diagnostic knee arthroscopic surgery on cadaveric specimens. Cross-sectional study; Level of evidence, 3. Content validity was determined by a group of 7 experts using the Delphi method. Intra-articular performance of a right and left diagnostic knee arthroscopic procedure was recorded for 28 residents and 2 sports medicine fellowship-trained attending surgeons. Surgeon performance was assessed by 2 blinded raters using the ASSET. Concurrent criterion-oriented validity, interrater reliability, and test-retest reliability were evaluated. Content validity: The content development group identified 8 arthroscopic skill domains to evaluate using the ASSET. Concurrent criterion-oriented validity: Significant differences in the total ASSET score (P < .05) between novice, intermediate, and advanced experience groups were identified. Interrater reliability: The ASSET scores assigned by each rater were strongly correlated (r = 0.91, P < .01), and the intraclass correlation coefficient between raters for the total ASSET score was 0.90. Test-retest reliability: There was a significant correlation between ASSET scores for both procedures attempted by each surgeon (r = 0.79, P < .01). The ASSET appears to be a useful, valid, and reliable method for assessing surgeon performance of diagnostic knee arthroscopic surgery in cadaveric specimens. Studies are ongoing to determine its generalizability to other procedures as well as to the live operating room and other simulated environments.

  14. Relationships between the Kaufman Brief Intelligence Test and the Wechsler Adult Intelligence Scale-Third Edition.

    PubMed

    Walters, Steven O; Weaver, Kenneth A

    2003-06-01

    The Kaufman Brief Intelligence Test detects learning problems of young students and is a screen for whether a more comprehensive test of intelligence is needed. A study to assess whether this test was valid as an adult intelligence test was conducted with 20 undergraduate psychology majors. The correlations between the Kaufman Brief Intelligence Test's Composite, Vocabulary, and Matrices test scores and their corresponding Wechsler Adult Intelligence Scale-Third Edition test scores, the Full Scale (r=.88), Verbal (r=.77), and Performance scores (r=.87), indicated very strong relationships. In addition, no significant differences were obtained between the Composite, Vocabulary, and Matrices means of the Kaufman Brief Intelligence Test and the Full Scale, Verbal, and Performance means of the WAIS-III. The Kaufman Brief Intelligence Test appears to be a valid test of intelligence for adults.

  15. Validity of a basketball-specific complex test in female professional players.

    PubMed

    Schwesig, René; Hermassi, Souhail; Lauenroth, Andreas; Laudner, Kevin; Koke, Alexander; Bartels, Thomas; Delank, Stefan; Schulze, Stephan

    2018-06-01

    The purpose of this study was to assess the validity of a new basketball-specific complex test (BBCT) based on the ascertained match performance. Fourteen female professional basketball players (ages: 23.4 ± 1.8 years) performed the BBCT and a treadmill test (TT) at the beginning of pre-season training. Lactate, heart rate (HR), time, shooting precision and number of errors were measured during the four test sequences of the BBCT (short distance sprinting with direction changes, with and without a ball; fast break; lay-up parcours; sprint endurance test). In addition, lactate threshold (LT) and HR were assessed at selected times throughout the TT and the BBCT and over 6 (TT) or 10 (BBCT) minutes after the tests. The match performance score (mps) was calculated on specific parameters (e.g., points) collected during all matches during the subsequent season (22 matches). The mps served as the "gold standard" within the validation process for the BBCT and the TT. TT parameters demonstrated an explained variance (EV) between 0 % (HR recovery) and 11 % (running speed at 6 mmol/l LT). The EV from the BBCT was higher and ranged from 0 % (HR recovery 6 minutes after end of exercise) to 28 % (sprint endurance test after 8 of 10 sprints). Ten out of 21 BBCT parameters (48 %) and 2 out of 5 TT parameters (40 %) demonstrated an EV higher than 10 %. Average EV for all parameters was 12 % (BBCT) and 6 % (TT), respectively. The BBCT had a higher validity than the TT for predicting match performance. These findings suggest that coaches and scientists should consider using the BBCT testing protocol to estimate the match performance abilities of elite female players. © Georg Thieme Verlag KG Stuttgart · New York.

  16. Evaluation of the Clinical Performance of the HPV-Risk Assay Using the VALGENT-3 Panel.

    PubMed

    Polman, N J; Oštrbenk, A; Xu, L; Snijders, P J F; Meijer, C J L M; Poljak, M; Heideman, D A M; Arbyn, M

    2017-12-01

    Human papillomavirus (HPV) testing is increasingly being incorporated into cervical cancer screening. The Validation of HPV Genotyping Tests (VALGENT) is a framework designed to evaluate the clinical performance of various HPV tests relative to that of the validated and accepted comparator test in a formalized and uniform manner. The aim of this study was to evaluate the clinical performance of the HPV-Risk assay with samples from the VALGENT-3 panel and to compare its performance to that of the clinically validated Hybrid Capture 2 assay (HC2). The VALGENT-3 panel comprises 1,300 consecutive samples from women participating in routine cervical cancer screening and is enriched with 300 samples from women with abnormal cytology. DNA was extracted from original ThinPrep PreservCyt medium aliquots, and HPV testing was performed using the HPV-Risk assay by investigators blind to the clinical data. HPV prevalence was analyzed, and the clinical performance of the HPV-Risk assay for the detection of cervical intraepithelial neoplasia grade 3 or worse (CIN3+) and CIN2 or worse (CIN2+) relative to the performance of HC2 was assessed. The sensitivity of the HPV-Risk assay for the detection of CIN3+ was similar to that of HC2 (relative sensitivity, 1.00; 95% confidence interval [CI], 0.95 to 1.05; P = 1.000), but the specificity of the HPV-Risk assay was significantly higher than that of HC2 (relative specificity, 1.02; 95% CI, 1.01 to 1.04; P < 0.001). For the detection of CIN2+, similar results were obtained, with the relative sensitivity being 0.98 (95% CI, 0.93 to 1.02; P = 0.257) and the relative specificity being 1.02 (95% CI, 1.01 to 1.03; P < 0.001). The performance of the HPV-Risk assay for the detection of CIN3+ and CIN2+ was noninferior to that of HC2, with all P values being ≤0.006. In conclusion, the HPV-Risk assay demonstrated noninferiority to the clinically validated HC2 by the use of samples from the VALGENT-3 panel for test validation and comparison. 
Copyright © 2017 Polman et al.

  17. Incremental Validity of the New MCAT.

    ERIC Educational Resources Information Center

    Friedman, Charles P.; Bakewell, William E., Jr.

    1980-01-01

    The ability of the new Medical College Admission Test (MCAT) to predict performance of first-year medical students at the University of North Carolina was studied. Its incremental validity, determined by computing the additional variance in performance explainable by the MCAT after the effects of other admissions variables were taken into account,…

  18. Validity and Responsiveness of the Two-Minute Walk Test for Measuring Functional Recovery After Total Knee Arthroplasty.

    PubMed

    Unnanuntana, Aasis; Ruangsomboon, Pakpoom; Keesukpunt, Worawut

    2018-06-01

    The 2-minute walk test (2mwt) is a performance-based test that evaluates functional recovery after total knee arthroplasty (TKA). This study evaluated its validity compared with the modified Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC), Oxford Knee Score (OKS), modified Knee Score, Numerical Pain Rating Scale, and Timed Up and Go test, and its responsiveness in assessing functional recovery in TKA patients. This prospective cohort study included 162 patients undergoing primary TKA between 2013 and 2015. We used patient-reported outcome measures (modified WOMAC, OKS, modified Knee Score, Numerical Pain Rating Scale) and performance-based tests (2mwt and Timed Up and Go test) at baseline and 3, 6, and 12 months postoperatively. The construct validity of 2mwt was determined between the 2mwt distances walked and other outcome measurements. To assess responsiveness, effect size and standardized response mean were analyzed. Minimal clinically important difference of 2mwt at 12 months after TKA was also calculated. All outcome measurements improved significantly from baseline to 3, 6, and 12 months postoperatively. Bivariate analysis revealed mild to moderate associations between the 2mwt and modified WOMAC function subscales, and moderate to strong associations with OKS. Mild to moderate correlations were found for pain and stiffness between 2mwt and other outcome measurements. The effect size and standardized response mean at 12 months were large, with a minimal clinically important difference of 12.7 m. 2mwt is a validated performance-based test with responsiveness properties. Being simple and easy to perform, it can be used routinely in clinical practice to evaluate functional recovery after TKA. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.
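
    The abstract reports effect size and standardized response mean as its responsiveness indices but does not give formulas or raw data. The sketch below assumes the definitions commonly used in responsiveness studies (effect size = mean change / SD of baseline scores; standardized response mean = mean change / SD of the change scores). The function name and the walking distances are hypothetical and are not taken from the study.

```python
from statistics import mean, stdev


def responsiveness(baseline, follow_up):
    """Return (effect_size, standardized_response_mean) for paired scores.

    Assumed definitions (not stated in the abstract):
      effect size = mean change / SD of baseline scores
      SRM         = mean change / SD of change scores
    """
    if len(baseline) != len(follow_up) or len(baseline) < 2:
        raise ValueError("need paired samples with at least two patients")
    changes = [after - before for before, after in zip(baseline, follow_up)]
    mean_change = mean(changes)
    return mean_change / stdev(baseline), mean_change / stdev(changes)


# Hypothetical 2-minute walk distances in metres (illustration only).
baseline_m = [100, 110, 90, 105, 95]
month12_m = [130, 135, 125, 140, 120]
es, srm = responsiveness(baseline_m, month12_m)
print(f"effect size = {es:.2f}, SRM = {srm:.2f}")
```

    Both indices scale the same mean change by a different standard deviation, which is why studies often report them side by side.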

  19. Spring performance tester for miniature extension springs

    DOEpatents

    Salzbrenner, Bradley; Boyce, Brad

    2017-05-16

    A spring performance tester and method of testing a spring are disclosed that have improved accuracy and precision over prior art spring testers. The tester can perform static and cyclic testing. The spring tester can provide validation for product acceptance as well as test for cyclic degradation of springs, such as the change in the spring rate and fatigue failure.

  20. Validation of multiprocessor systems

    NASA Technical Reports Server (NTRS)

    Siewiorek, D. P.; Segall, Z.; Kong, T.

    1982-01-01

    Experiments that can be used to validate fault free performance of multiprocessor systems in aerospace systems integrating flight controls and avionics are discussed. Engineering prototypes for two fault tolerant multiprocessors are tested.

  1. Land Ice Verification and Validation Kit

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    2015-07-15

    To address a pressing need to better understand the behavior and complex interaction of ice sheets within the global Earth system, significant development of continental-scale, dynamical ice-sheet models is underway. The associated verification and validation process of these models is being coordinated through a new, robust, python-based extensible software package, the Land Ice Verification and Validation toolkit (LIVV). This release provides robust and automated verification and a performance evaluation on LCF platforms. The performance V&V involves a comprehensive comparison of model performance relative to expected behavior on a given computing platform. LIVV operates on a set of benchmark and test data, and provides comparisons for a suite of community prioritized tests, including configuration and parameter variations, bit-4-bit evaluation, and plots of tests where differences occur.

  2. The Edinburgh Postnatal Depression Scale (EPDS): translation and validation study of the Iranian version.

    PubMed

    Montazeri, Ali; Torkan, Behnaz; Omidvari, Sepideh

    2007-04-04

    The Edinburgh Postnatal Depression Scale (EPDS) is a widely used instrument to measure postnatal depression. This study aimed to translate and to test the reliability and validity of the EPDS in Iran. The English language version of the EPDS was translated into Persian (Iranian language) and was used in this study. The questionnaire was administered to a consecutive sample of 100 women with normal (n = 50) and caesarean section (n = 50) deliveries at two points in time: 6 to 8 weeks and 12 to 14 weeks after delivery. Statistical analysis was performed to test the reliability and validity of the EPDS. Overall 22% of women at time 1 and 18% at time 2 reported experiencing postpartum depression. In general, the Iranian version of the EPDS was found to be acceptable to almost all women. Cronbach's alpha coefficient (to test reliability) was found to be 0.77 at time 1 and 0.86 at time 2. In addition, test-retest reliability was assessed and the intraclass correlation coefficient was found to be 0.80. Validity, as assessed using known-groups comparison, showed satisfactory results. The questionnaire discriminated well between sub-groups of women differing in mode of delivery in the expected direction. The factor analysis indicated a three-factor structure that jointly accounted for 58% of the variance. This preliminary validation study of the Iranian version of the EPDS proved that it is an acceptable, reliable and valid measure of postnatal depression. It seems that the EPDS not only measures postpartum depression but also may be measuring something more.
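
    Cronbach's alpha, the reliability coefficient reported in this record, can be computed from a respondents-by-items score table. The sketch below uses the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The function name and the small score table are hypothetical; they are not EPDS data.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    n_items = len(scores[0])
    if n_items < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two respondents")
    if any(len(row) != n_items for row in scores):
        raise ValueError("every respondent must answer every item")
    # Variance of each item across respondents (columns of the table).
    item_variances = [variance(column) for column in zip(*scores)]
    # Variance of each respondent's total score.
    total_variance = variance([sum(row) for row in scores])
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses: 5 respondents x 3 items (illustration only).
example_scores = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [5, 5, 4],
    [3, 3, 4],
]
print(f"alpha = {cronbach_alpha(example_scores):.3f}")
```

    Sample variances are used for both the items and the totals; the ratio is unchanged if population variances are used consistently instead.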

  3. Validation conform ISO-15189 of assays in the field of autoimmunity: Joint efforts in The Netherlands.

    PubMed

    Mulder, Leontine; van der Molen, Renate; Koelman, Carin; van Leeuwen, Ester; Roos, Anja; Damoiseaux, Jan

    2018-05-01

    ISO 15189:2012 requires validation of methods used in the medical laboratory, and lists a series of performance parameters for consideration to include. Although these performance parameters are feasible for clinical chemistry analytes, application in the validation of autoimmunity tests is a challenge. Lack of gold standards or reference methods in combination with the scarcity of well-defined diagnostic samples of patients with rare diseases make validation of new assays difficult. The present manuscript describes the initiative of Dutch medical immunology laboratory specialists to combine efforts and perform multi-center validation studies of new assays in the field of autoimmunity. Validation data and reports are made available to interested Dutch laboratory specialists. Copyright © 2018 Elsevier B.V. All rights reserved.

  4. Development of diagnostic test instruments to reveal level student conception in kinematic and dynamics

    NASA Astrophysics Data System (ADS)

    Handhika, J.; Cari, C.; Suparmi, A.; Sunarno, W.; Purwandari, P.

    2018-03-01

    The purpose of this research was to develop a diagnostic test instrument to reveal students' conceptions in kinematics and dynamics. The diagnostic test was developed based on content indicators for the concepts of (1) displacement and distance, (2) instantaneous and average velocity, (3) zero and constant acceleration, (4) gravitational acceleration, (5) Newton's first Law, and (6) Newton's third Law. The diagnostic test development model includes: diagnostic test requirement analysis, formulating test-making objectives, developing tests, checking the validity of the content and the performance of reliability, and application of tests. The Content Validation Index (CVI) was 0.85, which falls in the highly relevant category. Three questions had a negative Content Validation Ratio (CVR) of -0.6; after the distractors were revised and the visual presentation was clarified, the CVR became 1 (highly relevant). When the test was applied, 16 valid test items were obtained, with a Cronbach's alpha of 0.80. It can be concluded that the diagnostic test can be used to reveal the level of students' conceptions in kinematics and dynamics.
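
    The abstract does not say how the CVR was computed or how many experts rated each question. A minimal sketch, assuming Lawshe's widely used formula CVR = (n_e - N/2) / (N/2), where n_e is the number of experts rating an item essential and N is the panel size, shows how values such as -0.6 and 1 can arise. The panel size of 10 and the function name are hypothetical.

```python
def content_validity_ratio(n_essential, n_experts):
    """Lawshe-style CVR: ranges from -1 (no expert agrees) to 1 (all agree)."""
    if n_experts <= 0 or not 0 <= n_essential <= n_experts:
        raise ValueError("need 0 <= n_essential <= n_experts and n_experts > 0")
    half_panel = n_experts / 2
    return (n_essential - half_panel) / half_panel


# Hypothetical panel of 10 experts (the abstract does not state the panel size).
before_revision = content_validity_ratio(n_essential=2, n_experts=10)
after_revision = content_validity_ratio(n_essential=10, n_experts=10)
print(before_revision, after_revision)
```

    Under this assumption a negative CVR means fewer than half of the panel rated the item essential, which is consistent with the items being revised.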

  5. Evaluating Maintenance Performance: The Development of Graphic Symbolic Substitutes for Criterion Referenced Job Task Performance Tests for Electronic Maintenance. Final Report.

    ERIC Educational Resources Information Center

    Shriver, Edgar L.; Foley, John P., Jr.

    A battery of criterion referenced Job Task Performance Tests (JTPT) was developed because paper and pencil tests of job knowledge and electronic theory had very poor criterion-related or empirical validity with respect to the ability of electronic maintenance men to perform their job. Although the original JTPT required the use of actual…

  6. The development of performance-based practical assessment model at civil engineering workshop in state polytechnic

    NASA Astrophysics Data System (ADS)

    Kristinayanti, W. S.; Mas Pertiwi, I. G. A. I.; Evin Yudhi, S.; Lokantara, W. D.

    2018-01-01

    Assessment is an important element in education that shall oversee students’ competence not only in terms of the cognitive aspect, but also the students’ psychomotor aspect in a comprehensive way. Civil Engineering Department at Bali State Polytechnic, as a vocational education institution, emphasizes not only the theoretical foundation of the study, but also the application through practicum in workshop-based learning. We are aware of a need for performance-based assessment for these students, which would be essential for the student’s all-round performance in their studies. We try to develop a performance-based practicum assessment model that is needed to assess student’s ability in workshop-based learning. This research was conducted in three stages, 1) learning needs analysis, 2) instruments development, and 3) testing of instruments. The study uses rubrics set-up to test students’ competence in the workshop and test the validity. We obtained 34 valid statements out of 35, with a Cronbach’s alpha of 0.977. In expert test we obtained a value of CVI = 0.75 which means that the drafted assessment is empirically valid within the trial group.

  7. Testing to the Top: Everything But the Kitchen Sink?

    ERIC Educational Resources Information Center

    Dietel, Ron

    2011-01-01

    Two tests intended to measure student achievement of the Common Core State Standards will face intense scrutiny, but the test makers say they will include performance assessments and other items that are not multiple-choice questions. Incorporating performance items on these tests will bring up issues over scoring, costs, and validity.

  8. Performance Validation of Version 152.0 ANSER Control Laws for the F-18 HARV

    NASA Technical Reports Server (NTRS)

    Messina, Michael D.

    1996-01-01

    The Actuated Nose Strakes for Enhanced Rolling (ANSER) Control Laws were modified as a result of Phase 3 F/A-18 High Alpha Research Vehicle (HARV) flight testing. The control law modifications for the next software release were designated version 152.0. The Ada implementation was tested in the Hardware-In-the-Loop (HIL) simulation and results were compared to those obtained with the NASA Langley batch Fortran implementation of the control laws which are considered the 'truth model.' This report documents the performance validation test results between these implementations for ANSER control law version 152.0.

  9. The Geant4 physics validation repository

    DOE PAGES

    Wenzel, H.; Yarba, J.; Dotti, A.

    2015-12-23

    The Geant4 collaboration regularly performs validation and regression tests. The results are stored in a central repository and can be easily accessed via a web application. In this article we describe the Geant4 physics validation repository which consists of a relational database storing experimental data and Geant4 test results, a Java API and a web application. Lastly, the functionality of these components and the technology choices we made are also described.

  10. Development of an instrument based on the protection motivation theory to measure factors influencing women's intention to first pap test practice.

    PubMed

    Hassani, Lale; Dehdari, Tahereh; Hajizadeh, Ebrahim; Shojaeizadeh, Davoud; Abedini, Mehrandokht; Nedjat, Saharnaz

    2014-01-01

    Given that there are many Iranian women who have never had a Pap smear, this study was designed to develop and validate a measurement tool based on the Protection Motivation Theory to assess factors influencing the Iranian women's intention to perform first Pap testing. In this psychometric research, to determine the Content Validity Index (CVI) and the Content Validity Ratio (CVR), a panel of experts (n=10) reviewed scale items. Reliability was estimated through the Intraclass Correlation Coefficient (n=30) and internal consistency (n=240). Also, factor analysis (exploratory and confirmatory) was performed on the data of the sample women who had never had a Pap smear test (n=240). A 26-item questionnaire was developed. The CVI and CVR scores of the scale were 0.89 and 0.90, respectively. Exploratory factor analysis yielded a 26-item questionnaire with seven factors (perceived vulnerability and severity, fear, response costs, response efficacy, self-efficacy, and protection motivation (or intention)) that jointly accounted for 72.76% of the observed variance. Confirmatory factor analysis indicated a good fit for the data. Internal consistency (range 0.70-0.93) and test-retest reliability (range 0.72-0.96) of sub-scales were acceptable. This study showed that the designed instrument was a valid and reliable tool for measuring the factors influencing the women's intention to perform their first Pap testing.

  11. Reliability and criterion-related validity of a new repeated agility test

    PubMed Central

    Makni, E; Jemni, M; Elloumi, M; Chamari, K; Nabli, MA; Padulo, J; Moalla, W

    2016-01-01

    The study aimed to assess the reliability and the criterion-related validity of a new repeated sprint T-test (RSTT) that includes intense multidirectional intermittent efforts. The RSTT consisted of 7 maximal repeated executions of the agility T-test with 25 s of passive recovery rest in between. Forty-five team sports players performed two RSTTs separated by 3 days to assess the reliability of best time (BT) and total time (TT) of the RSTT. The intra-class correlation coefficient analysis revealed a high relative reliability between test and retest for BT and TT (>0.90). The standard error of measurement (<0.50) showed that the RSTT has a good absolute reliability. The minimal detectable change values for BT and TT related to the RSTT were 0.09 s and 0.58 s, respectively. To check the criterion-related validity of the RSTT, players performed a repeated linear sprint (RLS) and a repeated sprint with changes of direction (RSCD). Significant correlations between the BT and TT of the RLS, RSCD and RSTT were observed (p<0.001). The RSTT is, therefore, a reliable and valid measure of the intermittent repeated sprint agility performance. As this ability is required in all team sports, it is suggested that team sports coaches, fitness coaches and sports scientists consider this test in their training follow-up. PMID:27274109
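
    The abstract reports a standard error of measurement (SEM) and minimal detectable change (MDC) without stating the formulas. The sketch below assumes the commonly used relations SEM = SD * sqrt(1 - ICC) and MDC95 = 1.96 * sqrt(2) * SEM. The function names and input values are hypothetical and do not reproduce the study's figures.

```python
import math


def standard_error_of_measurement(sd, icc):
    """SEM from the between-subject SD and a reliability coefficient (ICC)."""
    if not 0 <= icc <= 1:
        raise ValueError("ICC is expected to lie between 0 and 1")
    return sd * math.sqrt(1 - icc)


def minimal_detectable_change(sem, z=1.96):
    """MDC at the confidence level implied by z (1.96 for 95%)."""
    # sqrt(2) accounts for measurement error in both the test and the retest.
    return z * math.sqrt(2) * sem


# Hypothetical inputs: SD of 0.5 s across players and ICC of 0.91.
sem = standard_error_of_measurement(sd=0.5, icc=0.91)
mdc95 = minimal_detectable_change(sem)
print(f"SEM = {sem:.3f} s, MDC95 = {mdc95:.3f} s")
```

    A change between two sessions smaller than the MDC cannot be distinguished from measurement noise at the chosen confidence level.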

  12. Do candidate reactions relate to job performance or affect criterion-related validity? A multistudy investigation of relations among reactions, selection test scores, and job performance.

    PubMed

    McCarthy, Julie M; Van Iddekinge, Chad H; Lievens, Filip; Kung, Mei-Chuan; Sinar, Evan F; Campion, Michael A

    2013-09-01

    Considerable evidence suggests that how candidates react to selection procedures can affect their test performance and their attitudes toward the hiring organization (e.g., recommending the firm to others). However, very few studies of candidate reactions have examined one of the outcomes organizations care most about: job performance. We attempt to address this gap by developing and testing a conceptual framework that delineates whether and how candidate reactions might influence job performance. We accomplish this objective using data from 4 studies (total N = 6,480), 6 selection procedures (personality tests, job knowledge tests, cognitive ability tests, work samples, situational judgment tests, and a selection inventory), 5 key candidate reactions (anxiety, motivation, belief in tests, self-efficacy, and procedural justice), 2 contexts (industry and education), 3 continents (North America, South America, and Europe), 2 study designs (predictive and concurrent), and 4 occupational areas (medical, sales, customer service, and technological). Consistent with previous research, candidate reactions were related to test scores, and test scores were related to job performance. Further, there was some evidence that reactions affected performance indirectly through their influence on test scores. Finally, in no cases did candidate reactions affect the prediction of job performance by increasing or decreasing the criterion-related validity of test scores. Implications of these findings and avenues for future research are discussed. PsycINFO Database Record (c) 2013 APA, all rights reserved

  13. Blood-alcohol proficiency test program

    DOT National Transportation Integrated Search

    1975-01-01

    A preliminary survey has been performed to ascertain the validity of the blood alcohol analysis performed by a number of laboratories on a voluntary basis. Values of accuracy and precision of the tests are presented. /Abstract from report summary pag...

  14. ESGE-ESGENA technical specification for process validation and routine testing of endoscope reprocessing in washer-disinfectors according to EN ISO 15883, parts 1, 4, and ISO/TS 15883-5.

    PubMed

    Beilenhoff, Ulrike; Biering, Holger; Blum, Reinhard; Brljak, Jadranka; Cimbro, Monica; Dumonceau, Jean-Marc; Hassan, Cesare; Jung, Michael; Neumann, Christiane; Pietsch, Michael; Pineau, Lionel; Ponchon, Thierry; Rejchrt, Stanislav; Rey, Jean-François; Schmidt, Verona; Tillett, Jayne; van Hooft, Jeanin

    2017-12-01

    1 Prerequisites. The clinical service provider should obtain confirmation from the endoscope washer-disinfector (EWD) manufacturer that all endoscopes intended to be used can be reprocessed in the EWD. 2 Installation qualification. This can be performed by different parties but national guidelines should define who has the responsibilities, taking into account legal requirements. 3 Operational qualification. This should include parametric tests to verify that the EWD is working according to its specifications. 4 Performance qualification. Testing of cleaning performance, microbiological testing of routinely used endoscopes, and the quality of the final rinse water should be considered in all local guidelines. The extent of these tests depends on local requirements. According to the results of type testing performed during EWD development, other parameters can be tested if local regulatory authorities accept this. Chemical residues on endoscope surfaces should be searched for, if acceptable test methods are available. 5 Routine inspections. National guidelines should consider both technical and performance criteria. Individual risk analyses performed in the validation and requalification processes are helpful for defining appropriate test frequencies for routine inspections. © Georg Thieme Verlag KG Stuttgart · New York.

  15. Psychometric properties of the motor diagnostics in the German football talent identification and development programme.

    PubMed

    Höner, Oliver; Votteler, Andreas; Schmid, Markus; Schultz, Florian; Roth, Klaus

    2015-01-01

    The utilisation of motor performance tests for talent identification in youth sports is discussed intensively in talent research. This article examines the reliability, differential stability and validity of the motor diagnostics conducted nationwide by the German football talent identification and development programme and provides reference values for a standardised interpretation of the diagnostics results. Highly selected players (the top 4% of their age groups, U12-U15) took part in the diagnostics at 17 measurement points between spring 2004 and spring 2012 (N = 68,158). The heterogeneous test battery measured speed abilities and football-specific technical skills (sprint, agility, dribbling, ball control, shooting, juggling). For all measurement points, the overall score and the speed tests showed high internal consistency, high test-retest reliability and satisfying differential stability. The diagnostics demonstrated satisfying factorial-related validity with plausible and stable loadings on the two empirical factors "speed" and "technical skills". The score, and the technical skills dribbling and juggling, differentiated the most among players of different performance levels and thus showed the highest criterion-related validity. Satisfactory psychometric properties for the diagnostics are an important prerequisite for a scientifically sound rating of players' actual motor performance and for the future examination of the prognostic validity for success in adulthood.

  16. Academic performance, career potential, creativity, and job performance: can one construct predict them all?

    PubMed

    Kuncel, Nathan R; Hezlett, Sarah A; Ones, Deniz S

    2004-01-01

    This meta-analysis addresses the question of whether 1 general cognitive ability measure developed for predicting academic performance is valid for predicting performance in both educational and work domains. The validity of the Miller Analogies Test (MAT; W. S. Miller, 1960) for predicting 18 academic and work-related criteria was examined. MAT correlations with other cognitive tests (e.g., Raven's Matrices [J. C. Raven, 1965]; Graduate Record Examinations) also were meta-analyzed. The results indicate that the abilities measured by the MAT are shared with other cognitive ability instruments and that these abilities are generalizably valid predictors of academic and vocational criteria, as well as evaluations of career potential and creativity. These findings contradict the notion that intelligence at work is wholly different from intelligence at school, extending the voluminous literature that supports the broad importance of general cognitive ability (g).

  17. A COMPARISON OF THE EMPIRICAL VALIDITY OF SIX TESTS OF ABILITY WITH EDUCABLE MENTAL RETARDATES.

    ERIC Educational Resources Information Center

    MUELLER, MAX W.

    AN INVESTIGATION OF THE VALIDITY OF INTELLIGENCE AND OTHER TESTS USED IN THE DIAGNOSIS OF RETARDED CHILDREN WAS PERFORMED. EXPERIMENTAL SAMPLES CONSISTED OF 101 CHILDREN SELECTED FROM SPECIAL CLASSES FOR EDUCABLE MENTALLY RETARDED (EMR) WHOSE AGES RANGED FROM 6.9 TO 10 YEARS AND WHOSE IQ SCORES RANGED FROM 50 TO 80. THE TESTS EVALUATED WERE (1)…

  18. Impact of External Cue Validity on Driving Performance in Parkinson's Disease

    PubMed Central

    Scally, Karen; Charlton, Judith L.; Iansek, Robert; Bradshaw, John L.; Moss, Simon; Georgiou-Karistianis, Nellie

    2011-01-01

    This study sought to investigate the impact of external cue validity on simulated driving performance in 19 Parkinson's disease (PD) patients and 19 healthy age-matched controls. Braking points and distance between deceleration point and braking point were analysed for red traffic signals preceded either by Valid Cues (correctly predicting signal), Invalid Cues (incorrectly predicting signal), and No Cues. Results showed that PD drivers braked significantly later and travelled significantly further between deceleration and braking points compared with controls for Invalid and No-Cue conditions. No significant group differences were observed for driving performance in response to Valid Cues. The benefit of Valid Cues relative to Invalid Cues and No Cues was significantly greater for PD drivers compared with controls. Trail Making Test (B-A) scores correlated with driving performance for PDs only. These results highlight the importance of external cues and higher cognitive functioning for driving performance in mild to moderate PD. PMID:21789275

  19. A clinical test of stepping and change of direction to identify multiple falling older adults.

    PubMed

    Dite, Wayne; Temple, Viviene A

    2002-11-01

    To establish the reliability and validity of a new clinical test of dynamic standing balance, the Four Square Step Test (FSST), to evaluate its sensitivity, specificity, and predictive value in identifying subjects who fall, and to compare it with 3 established balance and mobility tests. A 3-group comparison performed by using 3 validated tests and 1 new test. A rehabilitation center and university medical school in Australia. Eighty-one community-dwelling adults over the age of 65 years. Subjects were age- and gender-matched to form 3 groups: multiple fallers, nonmultiple fallers, and healthy comparisons. Not applicable. Time to complete the FSST and Timed Up and Go test and the number of steps to complete the Step Test and Functional Reach Test distance. High reliability was found for interrater (n=30, intraclass correlation coefficient [ICC]=.99) and retest reliability (n=20, ICC=.98). Evidence for validity was found through correlation with other existing balance tests. Validity was supported, with the FSST showing significantly better performance scores (P<.01) for each of the healthier and less impaired groups. The FSST also revealed a sensitivity of 85%, a specificity of 88% to 100%, and a positive predictive value of 86%. As a clinical test, the FSST is reliable, valid, easy to score, quick to administer, requires little space, and needs no special equipment. It is unique in that it involves stepping over low objects (2.5cm) and movement in 4 directions. The FSST had higher combined sensitivity and specificity for identifying differences between groups in the selected sample population of older adults than the 3 tests with which it was compared. Copyright 2002 by the American Congress of Rehabilitation Medicine and the American Academy of Physical Medicine and Rehabilitation
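
    Sensitivity, specificity, and positive predictive value, as reported for the FSST, all derive from a 2x2 table of test result against true faller status. The sketch below shows the standard definitions. The function name and the counts are hypothetical and are not the study's data.

```python
def diagnostic_metrics(true_pos, false_pos, false_neg, true_neg):
    """Sensitivity, specificity and positive predictive value from 2x2 counts."""
    return {
        # Proportion of actual fallers that the test flags.
        "sensitivity": true_pos / (true_pos + false_neg),
        # Proportion of non-fallers that the test clears.
        "specificity": true_neg / (true_neg + false_pos),
        # Proportion of flagged subjects who really are fallers.
        "ppv": true_pos / (true_pos + false_pos),
    }


# Hypothetical counts for a cutoff on test completion time (illustration only).
metrics = diagnostic_metrics(true_pos=17, false_pos=2, false_neg=3, true_neg=18)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

    Unlike sensitivity and specificity, the positive predictive value depends on how common fallers are in the sample, so it does not transfer directly to populations with a different fall rate.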

  20. The Reliability and Validity of a Performance Task for Evaluating Science Process Skills.

    ERIC Educational Resources Information Center

    Adams, Cheryll M.; Callahan, Carolyn M.

    1995-01-01

    The Diet Cola Test was designed as a process assessment of science aptitude in intermediate grade students. Investigations of the instrument's reliability and validity indicated that data did not support use of the instrument for identifying individual students' aptitude. However, results suggested the test's appropriateness for evaluating…

  1. Transfer of skills on LapSim virtual reality laparoscopic simulator into the operating room in urology.

    PubMed

    Alwaal, Amjad; Al-Qaoud, Talal M; Haddad, Richard L; Alzahrani, Tarek M; Delisle, Josee; Anidjar, Maurice

    2015-01-01

    Assessing the predictive validity of the LapSim simulator within a urology residency program. Twelve urology residents at McGill University were enrolled in the study between June 2008 and December 2011. The residents had weekly training on the LapSim that consisted of 3 tasks (cutting, clip-applying, and lifting and grasping). They underwent monthly assessment of their LapSim performance using total time, tissue damage and path length among other parameters as surrogates for their economy of movement and respect for tissue. The residents' last LapSim performance was compared with their first performance of radical nephrectomy on anesthetized porcine models in their 4th year of training. Two independent urologic surgeons rated the resident performance on the porcine models, and a kappa test with standardized weight function was used to assess for inter-observer bias. A nonparametric Spearman correlation test was used to compare each rater's cumulative score with the cumulative score obtained on the porcine models in order to test the predictive validity of the LapSim simulator. The kappa results demonstrated acceptable agreement between the two observers among all domains of the rating scale of performance except for confidence of movement and efficiency. In addition, poor predictive validity of the LapSim simulator was demonstrated. Predictive validity was not demonstrated for the LapSim simulator in the context of a urology residency training program.

  2. Initial construct validity evidence of a virtual human application for competency assessment in breaking bad news to a cancer patient.

    PubMed

    Guetterman, Timothy C; Kron, Frederick W; Campbell, Toby C; Scerbo, Mark W; Zelenski, Amy B; Cleary, James F; Fetters, Michael D

    2017-01-01

    Despite interest in using virtual humans (VHs) for assessing health care communication, evidence of validity is limited. We evaluated the validity of a VH application, MPathic-VR, for assessing performance-based competence in breaking bad news (BBN) to a VH patient. We used a two-group quasi-experimental design, with residents participating in a 3-hour seminar on BBN. Group A (n=15) completed the VH simulation before and after the seminar, and Group B (n=12) completed the VH simulation only after the BBN seminar to avoid the possibility that testing alone affected performance. Pre- and postseminar differences for Group A were analyzed with a paired t-test, and comparisons between Groups A and B were analyzed with an independent t-test. Compared to the preseminar result, Group A's postseminar scores improved significantly, indicating that the VH program was sensitive to differences in assessing performance-based competence in BBN. Postseminar scores of Group A and Group B were not significantly different, indicating that both groups performed similarly on the VH program. Improved pre-post scores demonstrate acquisition of skills in BBN to a VH patient. Pretest sensitization did not appear to influence posttest assessment. These results provide initial construct validity evidence that the VH program is effective for assessing BBN performance-based communication competence.

  3. Initial construct validity evidence of a virtual human application for competency assessment in breaking bad news to a cancer patient

    PubMed Central

    Guetterman, Timothy C; Kron, Frederick W; Campbell, Toby C; Scerbo, Mark W; Zelenski, Amy B; Cleary, James F; Fetters, Michael D

    2017-01-01

    Background Despite interest in using virtual humans (VHs) for assessing health care communication, evidence of validity is limited. We evaluated the validity of a VH application, MPathic-VR, for assessing performance-based competence in breaking bad news (BBN) to a VH patient. Methods We used a two-group quasi-experimental design, with residents participating in a 3-hour seminar on BBN. Group A (n=15) completed the VH simulation before and after the seminar, and Group B (n=12) completed the VH simulation only after the BBN seminar to avoid the possibility that testing alone affected performance. Pre- and postseminar differences for Group A were analyzed with a paired t-test, and comparisons between Groups A and B were analyzed with an independent t-test. Results Compared to the preseminar result, Group A’s postseminar scores improved significantly, indicating that the VH program was sensitive to differences in assessing performance-based competence in BBN. Postseminar scores of Group A and Group B were not significantly different, indicating that both groups performed similarly on the VH program. Conclusion Improved pre–post scores demonstrate acquisition of skills in BBN to a VH patient. Pretest sensitization did not appear to influence posttest assessment. These results provide initial construct validity evidence that the VH program is effective for assessing BBN performance-based communication competence. PMID:28794664

  4. Evaluation of the reliability and validity for X16 balance testing scale for the elderly.

    PubMed

    Ju, Jingjuan; Jiang, Yu; Zhou, Peng; Li, Lin; Ye, Xiaolei; Wu, Hongmei; Shen, Bin; Zhang, Jialei; He, Xiaoding; Niu, Chunjin; Xia, Qinghua

    2018-05-10

    Balance performance is considered an indicator of functional status in the elderly; large-scale population screening and evaluation in the community context, followed by proper interventions, would be of great significance at the public health level. However, there has been no suitable balance testing scale available for large-scale studies in the unique community context of urban China. A balance scale named the X16 balance testing scale was developed, composed of 3 domains and 16 items. The balance abilities of a total of 1985 functionally independent and active community-dwelling elderly adults were tested using the X16 scale. The internal consistency, split-half reliability, content validity, construct validity, and discriminant validity of the X16 balance testing scale were evaluated. Factor analysis was performed to identify alternative factor structure. The eigenvalues of factors 1, 2, and 3 were 8.53, 1.79, and 1.21, respectively, and their cumulative contribution to the total variance reached 72.0%. These 3 factors mainly represented the domains static balance, postural stability, and dynamic balance. The Cronbach alpha coefficient for the scale was 0.933. The Spearman correlation coefficients between items and their corresponding domains ranged from 0.538 to 0.964. The correlation coefficient between each item and its corresponding domain was higher than the coefficients between that item and other domains. With increasing age, the scores for balance performance and for the domains static balance, postural stability, and dynamic balance declined gradually (P < 0.001), as did the proportion of the elderly with intact balance performance (P < 0.001). The reliability and validity of the X16 balance testing scale are adequate and acceptable. Because it is simple and quick to use, it is practical for repeated and routine use, especially in community settings and large-scale screening.
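
The cumulative variance figure in this abstract can be checked with a short calculation. The sketch below assumes the factor analysis was run on the correlation matrix of the 16 items, so that total variance equals the number of items; the abstract does not state this, and the function name is invented for illustration.

```python
# Eigenvalues of factors 1-3 as reported in the abstract.
eigenvalues = [8.53, 1.79, 1.21]

# Assumption: 16 standardized items, so total variance = 16.
n_items = 16


def cumulative_variance_explained(eigs, total_variance):
    """Share of total variance accounted for by the retained factors."""
    return sum(eigs) / total_variance


share = cumulative_variance_explained(eigenvalues, n_items)
print(f"cumulative variance explained: {share:.3f}")
```

Under that assumption the result is close to the reported 72.0%; any small gap would be expected because the published eigenvalues are rounded.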

  5. System-Level Experimental Validations for Supersonic Commercial Transport Aircraft Entering Service in the 2018-2020 Time Period

    NASA Technical Reports Server (NTRS)

    Magee, Todd E.; Wilcox, Peter A.; Fugal, Spencer R.; Acheson, Kurt E.; Adamson, Eric E.; Bidwell, Alicia L.; Shaw, Stephen G.

    2013-01-01

    This report describes the work conducted by The Boeing Company under American Recovery and Reinvestment Act (ARRA) and NASA funding to experimentally validate the conceptual design of a supersonic airliner feasible for entry into service in the 2018 to 2020 timeframe (NASA N+2 generation). The report discusses the design, analysis and development of a low-boom concept that meets aggressive sonic boom and performance goals for a cruise Mach number of 1.8. The design is achieved through integrated multidisciplinary optimization tools. The report also describes the detailed design and fabrication of both sonic boom and performance wind tunnel models of the low-boom concept. Additionally, a description of the detailed validation wind tunnel testing that was performed with the wind tunnel models is provided along with validation comparisons with pretest Computational Fluid Dynamics (CFD). Finally, the report describes the evaluation of existing NASA sonic boom pressure rail measurement instrumentation and a detailed description of new sonic boom measurement instrumentation that was constructed for the validation wind tunnel testing.

  6. The predictive validity of the BioMedical Admissions Test for pre-clinical examination performance.

    PubMed

    Emery, Joanne L; Bell, John F

    2009-06-01

    Some medical courses in the UK have many more applicants than places and almost all applicants have the highest possible previous and predicted examination grades. The BioMedical Admissions Test (BMAT) was designed to assist in the student selection process specifically for a number of 'traditional' medical courses with clear pre-clinical and clinical phases and a strong focus on science teaching in the early years. It is intended to supplement the information provided by examination results, interviews and personal statements. This paper reports on the predictive validity of the BMAT and its predecessor, the Medical and Veterinary Admissions Test. Results from the earliest 4 years of the test (2000-2003) were matched to the pre-clinical examination results of those accepted onto the medical course at the University of Cambridge. Correlation and logistic regression analyses were performed for each cohort. Section 2 of the test ('Scientific Knowledge') correlated more strongly with examination marks than did Section 1 ('Aptitude and Skills'). It also had a stronger relationship with the probability of achieving the highest examination class. The BMAT and its predecessor demonstrate predictive validity for the pre-clinical years of the medical course at the University of Cambridge. The test identifies important differences in skills and knowledge between candidates, not shown by their previous attainment, which predict their examination performance. It is thus a valid source of additional admissions information for medical courses with a strong scientific emphasis when previous attainment is very high.

  7. Cluster analysis of novel isometric strength measures produces a valid and evidence-based classification structure for wheelchair track racing.

    PubMed

    Connick, Mark J; Beckman, Emma; Vanlandewijck, Yves; Malone, Laurie A; Blomqvist, Sven; Tweedy, Sean M

    2017-11-25

    The Para athletics wheelchair-racing classification system employs best practice to ensure that classes comprise athletes whose impairments cause a comparable degree of activity limitation. However, decision-making is largely subjective and scientific evidence which reduces this subjectivity is required. To evaluate whether isometric strength tests were valid for the purposes of classifying wheelchair racers and whether cluster analysis of the strength measures produced a valid classification structure. Thirty-two international level, male wheelchair racers from classes T51-54 completed six isometric strength tests evaluating elbow extensors, shoulder flexors, trunk flexors and forearm pronators and two wheelchair performance tests-Top-Speed (0-15 m) and Top-Speed (absolute). Strength tests significantly correlated with wheelchair performance were included in a cluster analysis and the validity of the resulting clusters was assessed. All six strength tests correlated with performance (r=0.54-0.88). Cluster analysis yielded four clusters with reasonable overall structure (mean silhouette coefficient=0.58) and large intercluster strength differences. Six athletes (19%) were allocated to clusters that did not align with their current class. While the mean wheelchair racing performance of the resulting clusters was unequivocally hierarchical, the mean performance of current classes was not, with no difference between current classes T53 and T54. Cluster analysis of isometric strength tests produced classes comprising athletes who experienced a similar degree of activity limitation. The strength tests reported can provide the basis for a new, more transparent, less subjective wheelchair racing classification system, pending replication of these findings in a larger, representative sample. This paper also provides guidance for development of evidence-based systems in other Para sports. 

  8. Testing the Construct Validity of a Virtual Reality Hip Arthroscopy Simulator.

    PubMed

    Khanduja, Vikas; Lawrence, John E; Audenaert, Emmanuel

    2017-03-01

    To test the construct validity of the hip diagnostics module of a virtual reality hip arthroscopy simulator. Nineteen orthopaedic surgeons performed a simulated arthroscopic examination of a healthy hip joint using a 70° arthroscope in the supine position. Surgeons were categorized as either expert (those who had performed 250 hip arthroscopies or more) or novice (those who had performed fewer than this). Twenty-one specific targets were visualized within the central and peripheral compartments; 9 via the anterior portal, 9 via the anterolateral portal, and 3 via the posterolateral portal. This was immediately followed by a task testing basic probe examination of the joint in which a series of 8 targets were probed via the anterolateral portal. During the tasks, the surgeon's performance was evaluated by the simulator using a set of predefined metrics including task duration, number of soft tissue and bone collisions, and distance travelled by instruments. No repeat attempts at the tasks were permitted. Construct validity was then evaluated by comparing novice and expert group performance metrics over the 2 tasks using the Mann-Whitney test, with a P value of less than .05 considered significant. On the visualization task, the expert group outperformed the novice group on time taken (P = .0003), number of collisions with soft tissue (P = .001), number of collisions with bone (P = .002), and distance travelled by the arthroscope (P = .02). On the probe examination, the 2 groups differed only in the time taken to complete the task (P = .025) with no significant difference in other metrics. Increased experience in hip arthroscopy was reflected by significantly better performance on the virtual reality simulator across 2 tasks, supporting its construct validity. This study validates a virtual reality hip arthroscopy simulator and supports its potential for developing basic arthroscopic skills. Level III.

  9. Nucleic acid tests for the detection of alpha human papillomaviruses.

    PubMed

    Poljak, Mario; Cuzick, Jack; Kocjan, Boštjan J; Iftner, Thomas; Dillner, Joakim; Arbyn, Marc

    2012-11-20

    Testing for high-risk types of alpha human papillomaviruses (HPV) is an invaluable part of clinical guidelines for cervical carcinoma screening, management and treatment. In this comprehensive inventory of commercial tests for detection of alpha-HPV, we identified at least 125 distinct HPV tests and at least 84 variants of the original tests. However, only a small subset of HPV tests has documented clinical performance for any of the standard HPV testing indications. For more than 75% of HPV tests currently on the market, no single publication in peer-reviewed literature can be identified. HPV tests that have not been validated and lack proof of reliability, reproducibility and accuracy should not be used in clinical management. Once incorporated in the lab, it is essential that the whole procedure of HPV testing is subject to continuous and rigorous quality assurance to avoid sub-optimal, potentially harmful practices. Manufacturers of HPV tests are urged to put more effort into evaluating their current and future products analytically, using international standards, and for clinical applications, using clinically validated endpoints. To assist with analytical validation, the World Health Organization is developing international standards for HPV types other than HPV16 and HPV18 and is planning development of external quality control panels specifically designed to be used for performance evaluation of current and future HPV tests. There is a need for more competitively priced HPV tests, especially for resource-poor countries, and uniform test validation criteria based on international standards should enable issuing more competitive and fair tender notices for purchasing. Automation systems allowing large-scale testing, as well as further increases in clinical performance, are the main needs in the further improvement of HPV tests. 
This article forms part of a special supplement entitled "Comprehensive Control of HPV Infections and Related Diseases," Vaccine Volume 30, Supplement 5, 2012.

  10. Validity of the MCAT in Predicting Performance in the First Two Years of Medical School.

    ERIC Educational Resources Information Center

    Jones, Robert F.; Thomae-Forgues, Maria

    1984-01-01

    The first systematic summary of predictive validity research on the new Medical College Admission Test (MCAT) is presented. The results show that MCAT scores have significant predictive validity with respect to first- and second-year medical school course grades. Further directions for MCAT validity research are described. (Author/MLW)

  11. System-Level Experimental Validations for Supersonic Commercial Transport Aircraft Entering Service in the 2018-2020 Time Period

    NASA Technical Reports Server (NTRS)

    Magee, Todd E.; Fugal, Spencer R.; Fink, Lawrence E.; Adamson, Eric E.; Shaw, Stephen G.

    2015-01-01

    This report describes the work conducted under NASA funding for the Boeing N+2 Supersonic Experimental Validation project to experimentally validate the conceptual design of a supersonic airliner feasible for entry into service in the 2018 to 2020 timeframe (NASA N+2 generation). The primary goal of the project was to develop a low-boom configuration optimized for minimum sonic boom signature (65 to 70 PLdB). This was a very aggressive goal that could be achieved only through integrated multidisciplinary optimization tools validated in relevant ground and, later, flight environments. The project was split into two phases. Phase I of the project covered the detailed aerodynamic design of a low-boom airliner as well as the wind tunnel tests to validate that design (ref. 1). This report covers Phase II of the project, which continued the design methodology development of Phase I with a focus on the propulsion integration aspects as well as the testing involved to validate those designs. One of the major airplane configuration features of the Boeing N+2 low-boom design was the overwing nacelle. The location of the nacelle allowed for a minimal effect on the boom signature; however, it added a level of difficulty to designing an inlet with acceptable performance in the overwing flow field. Using the Phase I work as the starting point, the goals of the Phase II project were to design and verify inlet performance while maintaining a low-boom signature. The Phase II project was successful in meeting all contract objectives. New modular nacelles were built for the larger Performance Model along with a propulsion rig with an electrically-actuated mass flow plug. Two new mounting struts were built for the smaller Boom Model, along with new nacelles. Propulsion integration testing was performed using an instrumented fan face and a mass flow plug, while boom signatures were measured using a wall-mounted pressure rail.
A side study of testing in different wind tunnels was completed as a precursor to the selection of the facilities used for validation testing. As facility schedules allowed, the propulsion testing was done at the NASA Glenn Research Center (GRC) 8 x 6-Foot wind tunnel, while boom and force testing was done at the NASA Ames Research Center (ARC) 9 x 7-Foot wind tunnel. During boom testing, a live balance was used for gathering force data. This report is broken down into nine sections. The first technical section (Section 2) covers the general scope of the Phase II activities, goals, a description of the design and testing efforts, and the project plan and schedule. Section 3 covers the details of the propulsion system concepts and design evolution. A series of short tests to evaluate the suitability of different wind tunnels for boom, propulsion, and force testing was also performed under the Phase II effort, with the results covered in Section 4. The propulsion integration testing is covered in Section 5 and the boom and force testing in Section 6. CFD comparisons and analyses are included in Section 7. Section 8 includes the conclusions and lessons learned.

  12. Investigating Score Dependability in English/Chinese Interpreter Certification Performance Testing: A Generalizability Theory Approach

    ERIC Educational Resources Information Center

    Han, Chao

    2016-01-01

    As a property of test scores, reliability/dependability constitutes an important psychometric consideration, and it underpins the validity of measurement results. A review of interpreter certification performance tests (ICPTs) reveals that (a) although reliability/dependability checking has been recognized as an important concern, its theoretical…

  13. Relationship of Temporal Lobe Volumes to Neuropsychological Test Performance in Healthy Children

    ERIC Educational Resources Information Center

    Wells, Carolyn T.; Mahone, E. Mark; Matson, Melissa A.; Kates, Wendy R.; Hay, Trisha; Horska, Alena

    2008-01-01

    Ecological validity of neuropsychological assessment includes the ability of tests to predict real-world functioning and/or covary with brain structures. Studies have examined the relationship between adaptive skills and test performance, with less focus on the association between regional brain volumes and neurobehavioral function in healthy…

  14. Early Childhood Practitioner Judgments of the Social Validity of Performance Checklists and Parent Practice Guides

    ERIC Educational Resources Information Center

    Dunst, Carl J.

    2017-01-01

    Findings from three field test evaluations of early childhood intervention practitioner performance checklists and three parent practice guides are reported. Forty-two practitioners from three early childhood intervention programs reviewed the checklists and practice guides and made (1) social validity judgments of both products, (2) judgments of…

  15. Word Memory Test Predicts Recovery in Claimants With Work-Related Head Injury.

    PubMed

    Colangelo, Annette; Abada, Abigail; Haws, Calvin; Park, Joanne; Niemeläinen, Riikka; Gross, Douglas P

    2016-05-01

    To investigate the predictive validity of the Word Memory Test (WMT), a verbal memory neuropsychological test developed as a performance validity measure to assess memory, effort, and performance consistency. Cohort study with 1-year follow-up. Workers' compensation rehabilitation facility. Participants included workers' compensation claimants with work-related head injury (N=188; mean age, 44y; 161 men [85.6%]). Not applicable. Outcome measures for determining predictive validity included days to suspension of wage replacement benefits during the 1-year follow-up and work status at discharge in claimants undergoing rehabilitation. Analysis included multivariable Cox and logistic regression. Better WMT performance was significantly but weakly correlated with younger age (r=-.30), documented brain abnormality (r=.28), and loss of consciousness at the time of injury (r=.25). Claimants with documented brain abnormalities on diagnostic imaging scans performed better (∼9%) on the WMT than those without brain abnormalities. The WMT predicted days receiving benefits (adjusted hazard ratio, 1.13; 95% confidence interval, 1.04-1.24) and work status outcome at program discharge (adjusted odds ratio, 1.62; 95% confidence interval, 1.13-2.34). Our results provide evidence for the predictive validity of the WMT in workers' compensation claimants. Younger claimants and those with more severe brain injuries performed better on the WMT. It may be that financial incentives or other factors related to the compensation claim affected the performance.

  16. Predictive Validity of the Air Force Officer Qualifying Test for USAF Air Battle Manager Training Performance

    DTIC Science & Technology

    2008-09-01

    performance criteria including passing/failing training, training grades, class rank (Carretta & Ree, 2003; Olea & Ree, 1994), and several non...are consistent with prior validations of the AFOQT versus academic performance criteria in pilot (Carretta & Ree, 1995; Olea & Ree, 1994; Ree...Carretta, & Teachout, 1995)) and navigator (Olea & Ree, 1994) training. Subsequent analyses took three different approaches to examine the

  17. A simple test of choice stepping reaction time for assessing fall risk in people with multiple sclerosis.

    PubMed

    Tijsma, Mylou; Vister, Eva; Hoang, Phu; Lord, Stephen R

    2017-03-01

    Purpose To determine (a) the discriminant validity for established fall risk factors and (b) the predictive validity for falls of a simple test of choice stepping reaction time (CSRT) in people with multiple sclerosis (MS). Method People with MS (n = 210, 21-74y) performed the CSRT, sensorimotor, balance and neuropsychological tests in a single session. They were then followed up for falls using monthly fall diaries for 6 months. Results The CSRT test had excellent discriminant validity with respect to established fall risk factors. Frequent fallers (≥3 falls) performed significantly worse in the CSRT test than non-frequent fallers (0-2 falls), with the odds of suffering frequent falls increasing 69% with each SD increase in CSRT (OR = 1.69, 95% CI: 1.27-2.26, p < 0.001). In regression analysis, CSRT was best explained by sway, time to complete the 9-Hole Peg test, knee extension strength of the weaker leg, proprioception and the time to complete the Trails B test (multiple R² = 0.449, p < 0.001). Conclusions A simple low tech CSRT test has excellent discriminative and predictive validity in relation to falls in people with MS. This test may prove useful in documenting longitudinal changes in fall risk in relation to MS disease progression and effects of interventions. Implications for rehabilitation Good choice stepping reaction time (CSRT) is required for maintaining balance. A simple low-tech CSRT test has excellent discriminative and predictive validity in relation to falls in people with MS. This test may prove useful in documenting longitudinal changes in fall risk in relation to MS disease progression and effects of interventions.
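
The reading of the odds ratio in this abstract ("69% with each SD increase") follows directly from OR = 1.69. The sketch below shows that arithmetic and how the effect compounds over more than one SD, assuming a standard logistic model that is linear in log-odds; the function names are invented for illustration.

```python
import math


def percent_increase_in_odds(odds_ratio):
    """Convert an odds ratio into a percentage change in odds."""
    return (odds_ratio - 1.0) * 100.0


def odds_multiplier(odds_ratio_per_sd, n_sd):
    """Odds multiplier for a change of n_sd standard deviations.

    Assumes the effect is linear on the log-odds scale, so per-SD
    odds ratios multiply rather than add.
    """
    return odds_ratio_per_sd ** n_sd


or_per_sd = 1.69  # reported in the abstract
pct = percent_increase_in_odds(or_per_sd)
two_sd = odds_multiplier(or_per_sd, 2)
log_odds_coef = math.log(or_per_sd)  # regression coefficient per SD on the log-odds scale

print(f"{pct:.0f}% higher odds per SD; x{two_sd:.2f} odds for 2 SD; beta = {log_odds_coef:.3f}")
```

Note that this describes a change in odds, not in probability; the change in probability depends on the baseline rate of frequent falling.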

  18. Reliability and Validity of the Inline Skating Skill Test.

    PubMed

    Radman, Ivan; Ruzic, Lana; Padovan, Viktoria; Cigrovski, Vjekoslav; Podnar, Hrvoje

    2016-09-01

    This study aimed to examine the reliability and validity of the inline skating skill test. Based on previous skating experience forty-two skaters (26 female and 16 male) were randomized into two groups (competitive level vs. recreational level). They performed the test four times, with a recovery time of 45 minutes between sessions. Prior to testing, the participants rated their skating skill using a scale from 1 to 10. The protocol included performance time measurement through a course, combining different skating techniques. Trivial changes in performance time between the repeated sessions were determined in both competitive females/males and recreational females/males (-1.7% [95% CI: -5.8-2.6%] - 2.2% [95% CI: 0.0-4.5%]). In all four subgroups, the skill test had a low mean within-individual variation (1.6% [95% CI: 1.2-2.4%] - 2.7% [95% CI: 2.1-4.0%]) and high mean inter-session correlation (ICC = 0.97 [95% CI: 0.92-0.99] - 0.99 [95% CI: 0.98-1.00]). The comparison of detected typical errors and smallest worthwhile changes (calculated as standard deviations × 0.2) revealed that the skill test was able to track changes in skaters' performances. Competitive-level skaters needed shorter time (24.4-26.4%, all p < 0.01) to complete the test in comparison to recreational-level skaters. Moreover, moderate correlation (ρ = 0.80-0.82; all p < 0.01) was observed between the participant's self-rating and achieved performance times. In conclusion, the proposed test is a reliable and valid method to evaluate inline skating skills in amateur competitive and recreational level skaters. Further studies are needed to evaluate the reproducibility of this skill test in different populations including elite inline skaters.
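
The comparison of typical error with the smallest worthwhile change (SWC, stated in the abstract as standard deviation × 0.2) can be sketched as follows. The completion times and the typical error are hypothetical, and the decision rule (typical error no larger than the SWC) is a common convention assumed here rather than something the abstract spells out.

```python
from statistics import stdev


def smallest_worthwhile_change(scores, factor=0.2):
    """SWC as a fraction of the between-subject standard deviation."""
    return factor * stdev(scores)


def can_track_changes(typical_error, swc):
    """Assumed rule: the test is sensitive when its noise does not exceed the SWC."""
    return typical_error <= swc


# Hypothetical completion times in seconds, not data from the study.
times = [30.0, 32.0, 34.0, 36.0, 38.0]
hypothetical_typical_error = 0.5

swc = smallest_worthwhile_change(times)
sensitive = can_track_changes(hypothetical_typical_error, swc)
print(f"SWC = {swc:.2f} s, sensitive = {sensitive}")
```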

  19. Yo-Yo IR2 testing of elite and sub-elite soccer players: performance, heart rate response and correlations to other interval tests.

    PubMed

    Ingebrigtsen, Jørgen; Bendiksen, Mads; Randers, Morten Bredsgaard; Castagna, Carlo; Krustrup, Peter; Holtermann, Andreas

    2012-01-01

    We examined performance, heart rate response and construct validity of the Yo-Yo IR2 test by testing 111 elite and 92 sub-elite soccer players from Norway and Denmark. VO₂max, Yo-Yo IR1 and repeated sprint tests (RSA) (n = 51) and match-analyses (n = 39) were also performed. Yo-Yo IR2 and Yo-Yo IR1 performance was 41 and 25% better (P < 0.01) for elite than sub-elite players, respectively, and heart rate after 2 and 4 min of the Yo-Yo IR2 test was 20 and 15 bpm (9 and 6% HRmax), respectively, lower (P < 0.01) for elite players. RSA performance and VO₂max was not different between competitive levels (P > 0.05). For top-teams, Yo-Yo IR2 performance (28%) and sprinting distance (25%) during match were greater (P < 0.05) than for bottom-teams. For elite and sub-elite players, Yo-Yo IR2 performance was correlated (P < 0.05) with Yo-Yo IR1 performance (r = 0.74 and 0.76) and mean RSA time (r = -0.74 and -0.34). We conclude that the Yo-Yo IR2 test has a high discriminant and concurrent validity, as it discriminates between players of different within- and between-league competitive levels and is correlated to other frequently used intermittent elite soccer tests.

  20. Field assessment of balance in 10 to 14 year old children, reproducibility and validity of the Nintendo Wii board.

    PubMed

    Larsen, Lisbeth Runge; Jørgensen, Martin Grønbech; Junge, Tina; Juul-Kristensen, Birgit; Wedderkopp, Niels

    2014-06-10

    Because body proportions in childhood are different to those in adulthood, children have a relatively higher centre of mass location. This biomechanical difference and the fact that children's movements have not yet fully matured result in different sway performances in children and adults. When assessing static balance, it is essential to use objective, sensitive tools, and these types of measurement have previously been performed in laboratory settings. However, the emergence of technologies like the Nintendo Wii Board (NWB) might allow balance assessment in field settings. As the NWB has only been validated and tested for reproducibility in adults, the purpose of this study was to examine reproducibility and validity of the NWB in a field setting, in a population of children. Fifty-four 10-14 year-olds from the CHAMPS-Study DK performed four different balance tests: bilateral stance with eyes open (1), unilateral stance on dominant (2) and non-dominant leg (3) with eyes open, and bilateral stance with eyes closed (4). Three rounds of the four tests were completed with the NWB and with a force platform (AMTI). To assess reproducibility, an intra-day test-retest design was applied with a two-hour break between sessions. Bland-Altman plots supplemented by Minimum Detectable Change (MDC) and concordance correlation coefficient (CCC) demonstrated satisfactory reproducibility for the NWB and the AMTI (MDC: 26.3-28.2%, CCC: 0.76-0.86) using Centre Of Pressure path Length as measurement parameter. Bland-Altman plots demonstrated satisfactory concurrent validity between the NWB and the AMTI, supplemented by satisfactory CCC in all tests (CCC: 0.74-0.87). The ranges of the limits of agreement in the validity study were comparable to the limits of agreement of the reproducibility study. Both NWB and AMTI have satisfactory reproducibility for testing static balance in a population of children. Concurrent validity of NWB compared with AMTI was satisfactory. 
Furthermore, the results from the concurrent validity study were comparable to the reproducibility results of the NWB and the AMTI. Thus, NWB has the potential to replace the AMTI in field settings in studies including children. Future studies are needed to examine intra-subject variability and to test the predictive validity of NWB.
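
The two agreement statistics named in this abstract, Bland-Altman limits of agreement and the concordance correlation coefficient (CCC), can be computed as in the sketch below. The path-length values are hypothetical, the function names are invented here, and the 1.96 multiplier and Lin's population-variance form of the CCC are the usual textbook definitions, assumed rather than taken from the study.

```python
from statistics import mean, stdev, pvariance


def bland_altman_limits(a, b):
    """Return (bias, lower limit, upper limit) for paired measurements a and b."""
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    half_width = 1.96 * stdev(diffs)
    return bias, bias - half_width, bias + half_width


def concordance_ccc(a, b):
    """Lin's concordance correlation coefficient using population moments."""
    mean_a, mean_b = mean(a), mean(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / len(a)
    return 2 * cov / (pvariance(a) + pvariance(b) + (mean_a - mean_b) ** 2)


# Hypothetical centre-of-pressure path lengths (cm), not data from the study.
nwb = [50.0, 62.0, 71.0, 80.0]
amti = [52.0, 60.0, 73.0, 78.0]

bias, lower, upper = bland_altman_limits(nwb, amti)
ccc = concordance_ccc(nwb, amti)
print(f"bias = {bias:.2f}, limits = ({lower:.2f}, {upper:.2f}), CCC = {ccc:.3f}")
```

Unlike a plain correlation, the CCC is reduced by any systematic offset between the two devices, which is why it is paired with the Bland-Altman bias when one instrument is proposed as a replacement for another.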

  1. Field assessment of balance in 10 to 14 year old children, reproducibility and validity of the Nintendo Wii board

    PubMed Central

    2014-01-01

    Background Because body proportions in childhood are different to those in adulthood, children have a relatively higher centre of mass location. This biomechanical difference and the fact that children’s movements have not yet fully matured result in different sway performances in children and adults. When assessing static balance, it is essential to use objective, sensitive tools, and these types of measurement have previously been performed in laboratory settings. However, the emergence of technologies like the Nintendo Wii Board (NWB) might allow balance assessment in field settings. As the NWB has only been validated and tested for reproducibility in adults, the purpose of this study was to examine reproducibility and validity of the NWB in a field setting, in a population of children. Methods Fifty-four 10–14 year-olds from the CHAMPS-Study DK performed four different balance tests: bilateral stance with eyes open (1), unilateral stance on dominant (2) and non-dominant leg (3) with eyes open, and bilateral stance with eyes closed (4). Three rounds of the four tests were completed with the NWB and with a force platform (AMTI). To assess reproducibility, an intra-day test-retest design was applied with a two-hour break between sessions. Results Bland-Altman plots supplemented by Minimum Detectable Change (MDC) and concordance correlation coefficient (CCC) demonstrated satisfactory reproducibility for the NWB and the AMTI (MDC: 26.3-28.2%, CCC: 0.76-0.86) using Centre Of Pressure path Length as measurement parameter. Bland-Altman plots demonstrated satisfactory concurrent validity between the NWB and the AMTI, supplemented by satisfactory CCC in all tests (CCC: 0.74-0.87). The ranges of the limits of agreement in the validity study were comparable to the limits of agreement of the reproducibility study. Conclusion Both NWB and AMTI have satisfactory reproducibility for testing static balance in a population of children. 
Concurrent validity of NWB compared with AMTI was satisfactory. Furthermore, the results from the concurrent validity study were comparable to the reproducibility results of the NWB and the AMTI. Thus, NWB has the potential to replace the AMTI in field settings in studies including children. Future studies are needed to examine intra-subject variability and to test the predictive validity of NWB. PMID:24913461

  2. Role of test motivation in intelligence testing.

    PubMed

    Duckworth, Angela Lee; Quinn, Patrick D; Lynam, Donald R; Loeber, Rolf; Stouthamer-Loeber, Magda

    2011-05-10

    Intelligence tests are widely assumed to measure maximal intellectual performance, and predictive associations between intelligence quotient (IQ) scores and later-life outcomes are typically interpreted as unbiased estimates of the effect of intellectual ability on academic, professional, and social life outcomes. The current investigation critically examines these assumptions and finds evidence against both. First, we examined whether motivation is less than maximal on intelligence tests administered in the context of low-stakes research situations. Specifically, we completed a meta-analysis of random-assignment experiments testing the effects of material incentives on intelligence-test performance on a collective 2,008 participants. Incentives increased IQ scores by an average of 0.64 SD, with larger effects for individuals with lower baseline IQ scores. Second, we tested whether individual differences in motivation during IQ testing can spuriously inflate the predictive validity of intelligence for life outcomes. Trained observers rated test motivation among 251 adolescent boys completing intelligence tests using a 15-min "thin-slice" video sample. IQ score predicted life outcomes, including academic performance in adolescence and criminal convictions, employment, and years of education in early adulthood. After adjusting for the influence of test motivation, however, the predictive validity of intelligence for life outcomes was significantly diminished, particularly for nonacademic outcomes. Collectively, our findings suggest that, under low-stakes research conditions, some individuals try harder than others, and, in this context, test motivation can act as a third-variable confound that inflates estimates of the predictive validity of intelligence for life outcomes.
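
To put the reported incentive effect (0.64 SD) on a familiar scale, a standardized difference can be converted to IQ points. The sketch assumes the conventional IQ standard deviation of 15, which the abstract does not state, and uses a function name invented for illustration.

```python
# Assumption: IQ scores are scaled to a standard deviation of 15 points.
IQ_SD = 15.0


def smd_to_iq_points(effect_in_sd, iq_sd=IQ_SD):
    """Convert a standardized mean difference into IQ points."""
    return effect_in_sd * iq_sd


incentive_effect_sd = 0.64  # average effect reported in the abstract
points = smd_to_iq_points(incentive_effect_sd)
print(f"approximate shift: {points:.1f} IQ points")
```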

  3. Role of test motivation in intelligence testing

    PubMed Central

    Duckworth, Angela Lee; Quinn, Patrick D.; Lynam, Donald R.; Loeber, Rolf; Stouthamer-Loeber, Magda

    2011-01-01

    Intelligence tests are widely assumed to measure maximal intellectual performance, and predictive associations between intelligence quotient (IQ) scores and later-life outcomes are typically interpreted as unbiased estimates of the effect of intellectual ability on academic, professional, and social life outcomes. The current investigation critically examines these assumptions and finds evidence against both. First, we examined whether motivation is less than maximal on intelligence tests administered in the context of low-stakes research situations. Specifically, we completed a meta-analysis of random-assignment experiments testing the effects of material incentives on intelligence-test performance on a collective 2,008 participants. Incentives increased IQ scores by an average of 0.64 SD, with larger effects for individuals with lower baseline IQ scores. Second, we tested whether individual differences in motivation during IQ testing can spuriously inflate the predictive validity of intelligence for life outcomes. Trained observers rated test motivation among 251 adolescent boys completing intelligence tests using a 15-min “thin-slice” video sample. IQ score predicted life outcomes, including academic performance in adolescence and criminal convictions, employment, and years of education in early adulthood. After adjusting for the influence of test motivation, however, the predictive validity of intelligence for life outcomes was significantly diminished, particularly for nonacademic outcomes. Collectively, our findings suggest that, under low-stakes research conditions, some individuals try harder than others, and, in this context, test motivation can act as a third-variable confound that inflates estimates of the predictive validity of intelligence for life outcomes. PMID:21518867

  4. Evaluation of the Thermo Scientific SureTect Listeria species assay. AOAC Performance Tested Method 071304.

    PubMed

    Cloke, Jonathan; Evans, Katharine; Crabtree, David; Hughes, Annette; Simpson, Helen; Holopainen, Jani; Wickstrand, Nina; Kauppinen, Mikko; Leon-Velarde, Carlos; Larson, Nathan; Dave, Keron

    2014-01-01

    The Thermo Scientific SureTect Listeria species Assay is a new real-time PCR assay for the detection of all species of Listeria in food and environmental samples. This validation study was conducted using the AOAC Research Institute (RI) Performance Tested Methods program to validate the SureTect Listeria species Assay in comparison to the reference method detailed in International Organization for Standardization 11290-1:1996 including amendment 1:2004 in a variety of foods plus plastic and stainless steel. The food matrixes validated were smoked salmon, processed cheese, fresh bagged spinach, cantaloupe, cooked prawns, cooked sliced turkey meat, cooked sliced ham, salami, pork frankfurters, and raw ground beef. All matrixes were tested by Thermo Fisher Scientific, Microbiology Division, Basingstoke, UK. In addition, three matrixes (pork frankfurters, fresh bagged spinach, and stainless steel surface samples) were analyzed independently as part of the AOAC-RI-controlled independent laboratory study by the University of Guelph, Canada. Using probability of detection statistical analysis, a significant difference in favour of the SureTect assay was demonstrated between the SureTect and reference method for high level spiked samples of pork frankfurters, smoked salmon, cooked prawns, stainless steel, and low-spiked samples of salami. For all other matrixes, no significant difference was seen between the two methods during the study. Inclusivity testing was conducted with 68 different isolates of Listeria species, all of which were detected by the SureTect Listeria species Assay. None of the 33 exclusivity isolates were detected by the SureTect Listeria species Assay. Ruggedness testing was conducted to evaluate the performance of the assay with specific method deviations outside of the recommended parameters open to variation, which demonstrated that the assay gave reliable performance. Accelerated stability testing was additionally conducted, validating the assay shelf life.

  5. The Gender Difference: Validity of Standardized Admission Tests in Predicting MBA Performance.

    ERIC Educational Resources Information Center

    Hancock, Terence

    1999-01-01

    Of 120 female and 149 male master of business administration (MBA) students, women performed significantly less well on the Graduate Management Admission Test (GMAT). There were no differences in overall MBA grade point average, indicating no strong correlation between the GMAT and MBA performance. (SK)

  6. Correlation Results for a Mass Loaded Vehicle Panel Test Article Finite Element Models and Modal Survey Tests

    NASA Technical Reports Server (NTRS)

    Maasha, Rumaasha; Towner, Robert L.

    2012-01-01

    High-fidelity Finite Element Models (FEMs) were developed to support a recent test program at Marshall Space Flight Center (MSFC). The FEMs correspond to test articles used for a series of acoustic tests. Modal survey tests were used to validate the FEMs for five acoustic tests (a bare panel and four different mass-loaded panel configurations). An additional modal survey test was performed on the empty test fixture (orthogrid panel mounting fixture, between the reverb and anechoic chambers). Modal survey tests were used to test-validate the dynamic characteristics of FEMs used for acoustic test excitation. Modal survey testing and subsequent model correlation has validated the natural frequencies and mode shapes of the FEMs. The modal survey test results provide a basis for the analysis models used for acoustic loading response test and analysis comparisons.

  7. Preliminary Report on Oak Ridge National Laboratory Testing of Drake/ACSS/MA2/E3X

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Irminger, Philip; King, Daniel J.; Herron, Andrew N.

    2016-01-01

    A key to industry acceptance of a new technology is extensive validation in field trials. The Powerline Conductor Accelerated Test facility (PCAT) at Oak Ridge National Laboratory (ORNL) is specifically designed to evaluate the performance and reliability of a new conductor technology under real world conditions. The facility is set up to capture large amounts of data during testing. General Cable used the ORNL PCAT facility to validate the performance of TransPowr with E3X Technology, a standard overhead conductor with an inorganic high emissivity, low absorptivity surface coating. Extensive testing has demonstrated a significant improvement in conductor performance across a wide range of operating temperatures, indicating that E3X Technology can provide a reduction in temperature, a reduction in sag, and an increase in ampacity when applied to the surface of any overhead conductor. This report provides initial results of that testing.

  8. The Arthroscopic Surgical Skill Evaluation Tool (ASSET)

    PubMed Central

    Koehler, Ryan J.; Amsdell, Simon; Arendt, Elizabeth A; Bisson, Leslie J; Braman, Jonathan P; Butler, Aaron; Cosgarea, Andrew J; Harner, Christopher D; Garrett, William E; Olson, Tyson; Warme, Winston J.; Nicandri, Gregg T.

    2014-01-01

    Background: Surgeries employing arthroscopic techniques are among the most commonly performed in orthopaedic clinical practice; however, valid and reliable methods of assessing the arthroscopic skill of orthopaedic surgeons are lacking. Hypothesis: The Arthroscopic Surgery Skill Evaluation Tool (ASSET) will demonstrate content validity, concurrent criterion-oriented validity, and reliability when used to assess the technical ability of surgeons performing diagnostic knee arthroscopy on cadaveric specimens. Study Design: Cross-sectional study; Level of evidence, 3. Methods: Content validity was determined by a group of seven experts using a Delphi process. Intra-articular performance of a right and left diagnostic knee arthroscopy was recorded for twenty-eight residents and two sports medicine fellowship trained attending surgeons. Subject performance was assessed by two blinded raters using the ASSET. Concurrent criterion-oriented validity, inter-rater reliability, and test-retest reliability were evaluated. Results: Content validity: The content development group identified 8 arthroscopic skill domains to evaluate using the ASSET. Concurrent criterion-oriented validity: Significant differences in total ASSET score (p<0.05) between novice, intermediate, and advanced experience groups were identified. Inter-rater reliability: The ASSET scores assigned by each rater were strongly correlated (r=0.91, p<0.01) and the intra-class correlation coefficient between raters for the total ASSET score was 0.90. Test-retest reliability: There was a significant correlation between ASSET scores for both procedures attempted by each individual (r=0.79, p<0.01). Conclusion: The ASSET appears to be a useful, valid, and reliable method for assessing surgeon performance of diagnostic knee arthroscopy in cadaveric specimens. Studies are ongoing to determine its generalizability to other procedures as well as to the live OR and other simulated environments. PMID:23548808

  9. The Leuven Embedded Figures Test (L-EFT): measuring perception, intelligence or executive function?

    PubMed Central

    Van der Hallen, Ruth; Wagemans, Johan; de-Wit, Lee; Chamberlain, Rebecca

    2018-01-01

    Performance on the Embedded Figures Test (EFT) has been interpreted as a reflection of local/global perceptual style, weak central coherence and/or field independence, as well as a measure of intelligence and executive function. The variable ways in which EFT findings have been interpreted demonstrate that the construct validity of this measure is unclear. In order to address this lack of clarity, we investigated to what extent performance on a new Embedded Figures Test (L-EFT) correlated with measures of intelligence, executive functions and estimates of local/global perceptual styles. In addition, we compared L-EFT performance to the original group EFT to directly contrast both tasks. Taken together, our results indicate that performance on the L-EFT does not correlate strongly with estimates of local/global perceptual style, intelligence or executive functions. Additionally, the results show that performance on the L-EFT is similarly associated with memory span and fluid intelligence as the group EFT. These results suggest that the L-EFT does not reflect a general perceptual or cognitive style/ability. These results further emphasize that empirical data on the construct validity of a task do not always align with the face validity of a task. PMID:29607257

  10. An ecologically valid performance-based social functioning assessment battery for schizophrenia.

    PubMed

    Shi, Chuan; He, Yi; Cheung, Eric F C; Yu, Xin; Chan, Raymond C K

    2013-12-30

    Psychiatrists now pay more attention to social functioning outcomes in schizophrenia. Evaluating real-world functioning in people with schizophrenia is challenging because of cultural differences, and no such instrument exists for the Chinese setting. This study aimed to report the validation of an ecologically valid performance-based everyday functioning assessment for schizophrenia, namely the Beijing Performance-based Functional Ecological Test (BJ-PERFECT). Fifty community-dwelling adults with schizophrenia and 37 healthy controls were recruited. Fifteen of the healthy controls were re-tested one week later. All participants were administered the University of California, San Diego, Performance-based Skill Assessment-Brief version (UPSA-B) and the MATRICS Consensus Cognitive Battery (MCCB). The finalized assessment included three subdomains: transportation, financial management and work ability. The test-retest and inter-rater reliabilities were good. The total score significantly correlated with the UPSA-B. The performance of individuals with schizophrenia was significantly more impaired than that of healthy controls, especially in the domain of work ability. Among individuals with schizophrenia, functional outcome was influenced by premorbid functioning, negative symptoms and neurocognition such as processing speed, visual learning and attention/vigilance. © 2013 Elsevier Ireland Ltd. All rights reserved.

  11. The predictive validity of selection for entry into postgraduate training in general practice: evidence from three longitudinal studies

    PubMed Central

    Patterson, Fiona; Lievens, Filip; Kerrin, Máire; Munro, Neil; Irish, Bill

    2013-01-01

    Background: The selection methodology for UK general practice is designed to accommodate several thousand applicants per year and targets six core attributes identified in a multi-method job-analysis study. Aim: To evaluate the predictive validity of selection methods for entry into postgraduate training, comprising a clinical problem-solving test, a situational judgement test, and a selection centre. Design and setting: A three-part longitudinal predictive validity study of selection into training for UK general practice. Method: In sample 1, participants were junior doctors applying for training in general practice (n = 6824). In sample 2, participants were GP registrars 1 year into training (n = 196). In sample 3, participants were GP registrars sitting the licensing examination after 3 years, at the end of training (n = 2292). The outcome measures include: assessor ratings of performance in a selection centre comprising job simulation exercises (sample 1); supervisor ratings of trainee job performance 1 year into training (sample 2); and licensing examination results, including an applied knowledge examination and a 12-station clinical skills objective structured clinical examination (OSCE; sample 3). Results: Performance ratings at selection predicted subsequent supervisor ratings of job performance 1 year later. Selection results also significantly predicted performance on both the clinical skills OSCE and applied knowledge examination for licensing at the end of training. Conclusion: In combination, these longitudinal findings provide good evidence of the predictive validity of the selection methods, and are the first reported for entry into postgraduate training. Results show that the best predictor of work performance and training outcomes is a combination of a clinical problem-solving test, a situational judgement test, and a selection centre. Implications for selection methods for all postgraduate specialties are considered. PMID:24267856

  12. A Malay version of the Child Oral Impacts on Daily Performances (Child-OIDP) index: assessing validity and reliability.

    PubMed

    Yusof, Zamros Y M; Jaafar, Nasruddin

    2012-06-08

    The study aimed to develop and test a Malay version of the Child-OIDP index, evaluate its psychometric properties and report on the prevalence of oral impacts on eight daily performances in a sample of 11-12 year old Malaysian schoolchildren. The Child-OIDP index was translated from English into Malay. The Malay version was tested for reliability and validity on a non-random sample of 132, 11-12 year old schoolchildren from two urban schools in Kuala Lumpur. Psychometric analysis of the Malay Child-OIDP involved face, content, criterion and construct validity tests as well as internal and test-retest reliability. Non-parametric statistical methods were used to assess relationships between Child-OIDP scores and other subjective outcome measures. The standardised Cronbach's alpha was 0.80 and the weighted Kappa was 0.84 (intraclass correlation = 0.79). The index showed significant associations with different subjective measures viz. perceived satisfaction with mouth, perceived needs for dental treatment, perceived oral health status and toothache experience in the previous 3 months (p < 0.05). Two-thirds (66.7%) of the sample had oral impacts affecting one or more performances in the past 3 months. The three most frequently affected performances were cleaning teeth (36.4%), eating foods (34.8%) and maintaining emotional stability (26.5%). In terms of severity of impact, the ability to relax was most severely affected by their oral conditions, followed by ability to socialise and doing schoolwork. Almost three-quarters (74.2%) of schoolchildren with oral impacts had up to three performances affected by their oral conditions. This study indicated that the Malay Child-OIDP index is a valid and reliable instrument to measure the oral impacts of daily performances in 11-12 year old urban schoolchildren in Malaysia.

  13. The predictive validity of selection for entry into postgraduate training in general practice: evidence from three longitudinal studies.

    PubMed

    Patterson, Fiona; Lievens, Filip; Kerrin, Máire; Munro, Neil; Irish, Bill

    2013-11-01

    The selection methodology for UK general practice is designed to accommodate several thousand applicants per year and targets six core attributes identified in a multi-method job-analysis study. The aim was to evaluate the predictive validity of selection methods for entry into postgraduate training, comprising a clinical problem-solving test, a situational judgement test, and a selection centre. This was a three-part longitudinal predictive validity study of selection into training for UK general practice. In sample 1, participants were junior doctors applying for training in general practice (n = 6824). In sample 2, participants were GP registrars 1 year into training (n = 196). In sample 3, participants were GP registrars sitting the licensing examination after 3 years, at the end of training (n = 2292). The outcome measures include: assessor ratings of performance in a selection centre comprising job simulation exercises (sample 1); supervisor ratings of trainee job performance 1 year into training (sample 2); and licensing examination results, including an applied knowledge examination and a 12-station clinical skills objective structured clinical examination (OSCE; sample 3). Performance ratings at selection predicted subsequent supervisor ratings of job performance 1 year later. Selection results also significantly predicted performance on both the clinical skills OSCE and applied knowledge examination for licensing at the end of training. In combination, these longitudinal findings provide good evidence of the predictive validity of the selection methods, and are the first reported for entry into postgraduate training. Results show that the best predictor of work performance and training outcomes is a combination of a clinical problem-solving test, a situational judgement test, and a selection centre. Implications for selection methods for all postgraduate specialties are considered.

  14. The influence of validity criteria on Immediate Post-Concussion Assessment and Cognitive Testing (ImPACT) test-retest reliability among high school athletes.

    PubMed

    Brett, Benjamin L; Solomon, Gary S

    2017-04-01

    Research findings to date on the stability of Immediate Post-Concussion Assessment and Cognitive Testing (ImPACT) Composite scores have been inconsistent, requiring further investigation. The use of test validity criteria across these studies also has been inconsistent. Using multiple measures of stability, we examined test-retest reliability of repeated ImPACT baseline assessments in high school athletes across various validity criteria reported in previous studies. A total of 1146 high school athletes completed baseline cognitive testing using the online ImPACT test battery at two time points approximately two years apart. No participant sustained a concussion between assessments. Five forms of validity criteria used in previous test-retest studies were applied to the data, and differences in reliability were compared. Intraclass correlation coefficients (ICCs) ranged in composite scores from .47 (95% confidence interval, CI [.38, .54]) to .83 (95% CI [.81, .85]) and showed little change across a two-year interval for all five sets of validity criteria. Regression based methods (RBMs) examining the test-retest stability demonstrated a lack of significant change in composite scores across the two-year interval for all forms of validity criteria, with no cases falling outside the expected range of 90% confidence intervals. The application of more stringent validity criteria does not alter test-retest reliability, nor does it account for some of the variation observed across previously performed studies. As such, the ImPACT manual validity criteria should be used in the determination of test validity and in the individualized approach to concussion management. Potential future efforts to improve test-retest reliability are discussed.

  15. Development and validation of a web-based questionnaire for surveying the health and working conditions of high-performance marine craft populations.

    PubMed

    de Alwis, Manudul Pahansen; Lo Martire, Riccardo; Äng, Björn O; Garme, Karl

    2016-06-20

    High-performance marine craft crews are susceptible to various adverse health conditions caused by multiple interactive factors. However, there are limited epidemiological data available for assessment of working conditions at sea. Although questionnaire surveys are widely used for identifying exposures, outcomes and associated risks with high accuracy levels, until now, no validated epidemiological tool exists for surveying occupational health and performance in these populations. To develop and validate a web-based questionnaire for epidemiological assessment of occupational and individual risk exposure pertinent to the musculoskeletal health conditions and performance in high-performance marine craft populations. A questionnaire for investigating the association between work-related exposure, performance and health was initially developed by a consensus panel under four subdomains, viz. demography, lifestyle, work exposure and health and systematically validated by expert raters for content relevance and simplicity in three consecutive stages, each iteratively followed by a consensus panel revision. The item content validity index (I-CVI) was determined as the proportion of experts giving a rating of 3 or 4. The scale content validity index (S-CVI/Ave) was computed by averaging the I-CVIs for the assessment of the questionnaire as a tool. Finally, the questionnaire was pilot tested. The S-CVI/Ave increased from 0.89 to 0.96 for relevance and from 0.76 to 0.94 for simplicity, resulting in 36 items in the final questionnaire. The pilot test confirmed the feasibility of the questionnaire. The present study shows that the web-based questionnaire fulfils previously published validity acceptance criteria and is therefore considered valid and feasible for the empirical surveying of epidemiological aspects among high-performance marine craft crews and similar populations. Published by the BMJ Publishing Group Limited. 
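    The two content validity indices described in the abstract above can be sketched in a few lines. This is a minimal illustration, assuming a 4-point rating scale (the abstract only says that ratings of 3 or 4 count); the function names and the expert ratings are hypothetical and are not taken from the study.

```python
from statistics import mean


def item_cvi(ratings, relevant=(3, 4)):
    """I-CVI: proportion of experts whose rating is 3 or 4."""
    if not ratings:
        raise ValueError("each item needs at least one expert rating")
    return sum(1 for r in ratings if r in relevant) / len(ratings)


def scale_cvi_ave(ratings_by_item):
    """S-CVI/Ave: the average of the item-level I-CVIs."""
    return mean(item_cvi(r) for r in ratings_by_item)


# Hypothetical data: 3 questionnaire items, each rated by 5 experts.
example_ratings = [
    [4, 4, 3, 4, 3],
    [4, 3, 2, 4, 4],
    [3, 4, 4, 1, 2],
]
i_cvis = [item_cvi(r) for r in example_ratings]
s_cvi_ave = scale_cvi_ave(example_ratings)
print(i_cvis, round(s_cvi_ave, 2))
```

    Keeping the item-level and scale-level functions separate mirrors the two-step definition in the abstract: each item is scored first, and the scale value is only an average of those scores.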

  16. Reliability and validity of a talent identification test battery for seated and standing Paralympic throws.

    PubMed

    Spathis, Jemima Grace; Connick, Mark James; Beckman, Emma Maree; Newcombe, Peter Anthony; Tweedy, Sean Michael

    2015-01-01

    Paralympic throwing events for athletes with physical impairments comprise seated and standing javelin, shot put, discus and seated club throwing. Identification of talented throwers would enable prediction of future success and promote participation; however, a valid and reliable talent identification battery for Paralympic throwing has not been reported. This study evaluates the reliability and validity of a talent identification battery for Paralympic throws. Participants were non-disabled so that impairment would not confound analyses, and results would provide an indication of normative performance. Twenty-eight non-disabled participants (13 M; 15 F) aged 23.6 years (±5.44) performed five kinematically distinct criterion throws (three seated, two standing) and nine talent identification tests (three anthropometric, six motor); 23 were tested a second time to evaluate test-retest reliability. Talent identification test-retest reliability was evaluated using Intra-class Correlation Coefficient (ICC) and Bland-Altman plots (Limits of Agreement). Spearman's correlation assessed strength of association between criterion throws and talent identification tests. Reliability was generally acceptable (mean ICC = 0.89), but two seated talent identification tests require more extensive familiarisation. Correlation strength (mean rs = 0.76) indicated that the talent identification tests can be used to validly identify individuals with competitively advantageous attributes for each of the five kinematically distinct throwing activities. Results facilitate further research in this understudied area.

  17. Development and validation of the Perceived Game-Specific Soccer Competence Scale.

    PubMed

    Forsman, Hannele; Gråstén, Arto; Blomqvist, Minna; Davids, Keith; Liukkonen, Jarmo; Konttinen, Niilo

    2016-07-01

    The objective of this study was to create a valid, self-reported, game-specific soccer competence scale. A structural model of perceived competence, performance measures and motivation was tested as the basis for the scale. A total of 1321 soccer players (261 females, 1060 males) ranging from 12 to 15 years (13.4 ± 1.0 years) participated in the study. They completed the Perceived Game-Specific Soccer Competence Scale (PGSSCS), self-assessments of tactical skills and motivation, as well as technical and speed and agility tests. Results of factor analyses, tests of internal consistency and correlations between PGSSCS subscales, performance measures and motivation supported the reliability and validity of the PGSSCS. The scale can be considered a suitable instrument to assess perceived game-specific competence among young soccer players.

  18. Validation of the comprehensive feeding practices questionnaire in parents of preschool children in Brazil.

    PubMed

    Warkentin, Sarah; Mais, Laís Amaral; Latorre, Maria do Rosário Dias de Oliveira; Carnell, Susan; Taddei, José Augusto de Aguiar Carrazedo

    2016-07-19

    Recent national surveys in Brazil have demonstrated a decrease in the consumption of traditional food and a parallel increase in the consumption of ultra-processed food, which has contributed to a rise in obesity prevalence in all age groups. Environmental factors, especially familial factors, have a strong influence on the food intake of preschool children, and this has led to the development of psychometric scales to measure parents' feeding practices. The aim of this study was to test the validity of a translated and adapted Comprehensive Feeding Practices Questionnaire in a sample of Brazilian preschool-aged children enrolled in private schools. A transcultural adaptation process was performed in order to develop a modified questionnaire (43 items). After piloting, the questionnaire was sent to parents, along with additional questions about family characteristics. Test-retest reliability was assessed in one of the schools. Factor analysis with oblique rotation was performed. Internal reliability was tested using Cronbach's alpha and correlations between factors, discriminant validity using marker variables of child's food intake, and convergent validity via correlations with parental perceptions of perceived responsibility for feeding and concern about the child's weight were also performed. The final sample consisted of 402 preschool children. Factor analysis resulted in a final questionnaire of 43 items distributed over 6 factors. Cronbach alpha values were adequate (0.74 to 0.88), between-factor correlations were low, and discriminant validity and convergent validity were acceptable. The modified CFPQ demonstrated significant internal reliability in this urban Brazilian sample. Scale validation within different cultures is essential for a more comprehensive understanding of parental feeding practices for preschoolers.
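    As a rough illustration of the internal-reliability statistic named in the abstract above, the following sketch computes the raw (unstandardised) Cronbach's alpha from item and total-score variances. The function name and the response matrix are hypothetical; the study's own data and software are not described in the abstract.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Raw Cronbach's alpha.

    scores: one row per respondent, one column per item.
    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    if len(scores) < 2:
        raise ValueError("need at least two respondents")
    k = len(scores[0])
    if k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need at least two items and equal-length rows")
    item_variances = [variance(column) for column in zip(*scores)]
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses: 4 respondents answering 3 items.
example_scores = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [3, 3, 4],
]
alpha = cronbach_alpha(example_scores)
print(round(alpha, 3))
```

    The sketch uses sample variances (n - 1 in the denominator) throughout. The same denominator must be used for items and totals so that the ratio is consistent.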

  19. Oxygen uptake during functional activities after stroke—Reliability and validity of a portable ergospirometry system

    PubMed Central

    Brurok, Berit; Tjønna, Arnt Erik; Tørhaug, Tom; Askim, Torunn

    2017-01-01

    Background: People with stroke have a low peak aerobic capacity and experience increased effort during performance of daily activities. The purpose of this study was to examine test-retest reliability of a portable ergospirometry system in people with stroke during performance of functional activities in a field-test. Secondary aims were to examine the proportion of oxygen consumed during the field-test in relation to the peak-test and to analyse the correlation between the oxygen uptake during the field-test and peak-test in order to support the validity of the field-test. Methods: With simultaneous measurement of oxygen consumption, participants performed a standardized field-test consisting of five activities: walking over ground, stair walking, stepping over obstacles, walking slalom between cones, and, from a standing position, lifting objects from one height to another. All activities were performed at self-selected speed. Prior to the field-test, a peak aerobic capacity test was performed. The field-test was repeated after a minimum of 2 and a maximum of 14 days. ICC2,1 and Bland Altman tests (Limits of Agreement, LoA) were used to analyse test-retest reliability. Results: In total 31 participants (39% women, mean (SD) age 54.5 (12.7) years and 21.1 (14.3) months’ post-stroke) were included. The ICC2,1 was ≥ 0.80 for absolute V̇O2, relative V̇O2, minute ventilation, CO2, respiratory exchange ratio, heart rate and Borg's rating of perceived exertion. ICC2,1 for total time to complete the field-test was 0.99. Mean difference in steady state V̇O2 during Test 1 and Test 2 was -0.40 (2.12). The LoAs were -3.75 and 4.51. Participants spent 60.7% of their V̇O2peak performing functional activities. Correlation between field-test and peak-test was 0.689, p = 0.001 for absolute and 0.733, p = 0.001 for relative V̇O2. Conclusions: This study presents the first evidence on reliability of oxygen uptake during performance of functional activities after stroke, showing very good test-retest reliability. The secondary analysis showed that the amount of energy spent during the field-test relative to the peak-test was high and the correlation between the two tests was good, supporting the validity of this method. PMID:29065164
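    For readers unfamiliar with the Bland Altman limits of agreement mentioned in the abstract above, here is a minimal sketch of the conventional calculation (mean difference plus or minus 1.96 standard deviations of the paired differences). The abstract does not state exactly how its limits were computed, so the 1.96 multiplier, the function name and the paired values below are assumptions for illustration only.

```python
from statistics import mean, stdev


def limits_of_agreement(test1, test2, z=1.96):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    if len(test1) != len(test2) or len(test1) < 2:
        raise ValueError("need two equal-length series with at least 2 pairs")
    differences = [a - b for a, b in zip(test1, test2)]
    bias = mean(differences)         # mean test-retest difference
    spread = z * stdev(differences)  # half-width of the agreement interval
    return bias, bias - spread, bias + spread


# Hypothetical paired measurements from two test sessions.
session_1 = [10, 12, 14, 16]
session_2 = [11, 11, 15, 15]
bias, lower, upper = limits_of_agreement(session_1, session_2)
print(bias, round(lower, 2), round(upper, 2))
```

    A narrow interval centred near zero suggests that the two sessions agree closely. How narrow is acceptable is a clinical judgement rather than something the calculation decides.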

  20. An entropy-based nonparametric test for the validation of surrogate endpoints.

    PubMed

    Miao, Xiaopeng; Wang, Yong-Cheng; Gangopadhyay, Ashis

    2012-06-30

    We present a nonparametric test to validate surrogate endpoints based on measure of divergence and random permutation. This test is a proposal to directly verify the Prentice statistical definition of surrogacy. The test does not impose distributional assumptions on the endpoints, and it is robust to model misspecification. Our simulation study shows that the proposed nonparametric test outperforms the practical test of the Prentice criterion in terms of both robustness of size and power. We also evaluate the performance of three leading methods that attempt to quantify the effect of surrogate endpoints. The proposed method is applied to validate magnetic resonance imaging lesions as the surrogate endpoint for clinical relapses in a multiple sclerosis trial. Copyright © 2012 John Wiley & Sons, Ltd.
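    The random-permutation idea behind the test described above can be illustrated with a generic sketch. The abstract does not give the entropy-based divergence statistic, so this example swaps in a plain absolute difference in means. It is not the authors' test, and the function name, default settings and data are hypothetical.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_permutations=2000, seed=0):
    """Two-sided permutation p-value for a difference in group means."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # random relabelling under the null hypothesis
        statistic = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if statistic >= observed:
            extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical, clearly separated groups.
p_separated = permutation_p_value([1, 2, 3, 4, 5], [101, 102, 103, 104, 105])
# Hypothetical identical groups: no difference to detect.
p_identical = permutation_p_value([1, 2, 3], [1, 2, 3])
print(p_separated < 0.05, p_identical)
```

    Because the reference distribution is built by reshuffling the observed data, no distributional assumption about the endpoints is needed, which is the property the abstract emphasises.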

  1. Parameterization of Model Validating Sets for Uncertainty Bound Optimizations. Revised

    NASA Technical Reports Server (NTRS)

    Lim, K. B.; Giesy, D. P.

    2000-01-01

    Given measurement data, a nominal model and a linear fractional transformation uncertainty structure with an allowance on unknown but bounded exogenous disturbances, easily computable tests for the existence of a model validating uncertainty set are given. Under mild conditions, these tests are necessary and sufficient for the case of complex, nonrepeated, block-diagonal structure. For the more general case which includes repeated and/or real scalar uncertainties, the tests are only necessary but become sufficient if a collinearity condition is also satisfied. With the satisfaction of these tests, it is shown that a parameterization of all model validating sets of plant models is possible. The new parameterization is used as a basis for a systematic way to construct or perform uncertainty tradeoff with model validating uncertainty sets which have specific linear fractional transformation structure for use in robust control design and analysis. An illustrative example which includes a comparison of candidate model validating sets is given.

  2. Author Response to Sabour (2018), "Comment on Hall et al. (2017), 'How to Choose Between Measures of Tinnitus Loudness for Clinical Research? A Report on the Reliability and Validity of an Investigator-Administered Test and a Patient-Reported Measure Using Baseline Data Collected in a Phase IIa Drug Trial'".

    PubMed

    Hall, Deborah A; Mehta, Rajnikant L; Fackrell, Kathryn

    2018-03-08

    The authors respond to a letter to the editor (Sabour, 2018) concerning the interpretation of validity in the context of evaluating treatment-related change in tinnitus loudness over time. The authors refer to several landmark methodological publications and an international standard concerning the validity of patient-reported outcome measurement instruments. The tinnitus loudness rating performed better against our reported acceptability criteria for (face and convergent) validity than did the tinnitus loudness matching test. It is important to distinguish between tests that evaluate the validity of measuring treatment-related change over time and tests that quantify the accuracy of diagnosing tinnitus as a case and non-case.

  3. Construct Validity and Scoring Methods of the World Health Organization: Health and Work Performance Questionnaire Among Workers With Arthritis and Rheumatological Conditions.

    PubMed

    AlHeresh, Rawan; LaValley, Michael P; Coster, Wendy; Keysor, Julie J

    2017-06-01

    To evaluate construct validity and scoring methods of the World Health Organization Health and Work Performance Questionnaire (HPQ) for people with arthritis. Construct validity was examined through hypothesis testing using the recommended guidelines of the Consensus-based Standards for the Selection of Health Measurement Instruments (COSMIN). The HPQ using the absolute scoring method showed moderate construct validity, as four of the seven hypotheses were met. The HPQ using the relative scoring method had weak construct validity, as only one of the seven hypotheses was met. The absolute scoring method for the HPQ is superior in construct validity to the relative scoring method in assessing work performance among people with arthritis and related rheumatic conditions; however, more research is needed to further explore other psychometric properties of the HPQ.

  4. An alternative to the balance error scoring system: using a low-cost balance board to improve the validity/reliability of sports-related concussion balance testing.

    PubMed

    Chang, Jasper O; Levy, Susan S; Seay, Seth W; Goble, Daniel J

    2014-05-01

    Recent guidelines advocate sports medicine professionals to use balance tests to assess sensorimotor status in the management of concussions. The present study sought to determine whether a low-cost balance board could provide a valid, reliable, and objective means of performing this balance testing. Criterion validity testing relative to a gold standard and 7 day test-retest reliability. University biomechanics laboratory. Thirty healthy young adults. Balance ability was assessed on 2 days separated by 1 week using (1) a gold standard measure (ie, scientific grade force plate), (2) a low-cost Nintendo Wii Balance Board (WBB), and (3) the Balance Error Scoring System (BESS). Validity of the WBB center of pressure path length and BESS scores were determined relative to the force plate data. Test-retest reliability was established based on intraclass correlation coefficients. Composite scores for the WBB had excellent validity (r = 0.99) and test-retest reliability (R = 0.88). Both the validity (r = 0.10-0.52) and test-retest reliability (r = 0.61-0.78) were lower for the BESS. These findings demonstrate that a low-cost balance board can provide improved balance testing accuracy/reliability compared with the BESS. This approach provides a potentially more valid/reliable, yet affordable, means of assessing sports-related concussion compared with current methods.

  5. Comparative Validity of the Descriptive Tests of Mathematical Skills (DTMS) and SAT-Mathematics (SAT-M) for Predicting Performance in Freshman College Mathematics Courses: Prefatory Report

    ERIC Educational Resources Information Center

    McLoughlin, M. Padraig M. M.; Bluford, Dontrell A.

    2004-01-01

    This study investigated the predictive validity of the Descriptive Tests of Mathematical Skills (DTMS) and the SAT-Mathematics (SAT-M) tests as placement tools for entering students in a small, liberal arts, historically black institution (HBI) using regression analysis. The placement schema is four-tiered: for a remedial algebra course, college…

  6. Vertical jumping tests in volleyball: reliability, validity, and playing-position specifics.

    PubMed

    Sattler, Tine; Sekulic, Damir; Hadzic, Vedran; Uljevic, Ognjen; Dervisevic, Edvin

    2012-06-01

    Vertical jumping is known to be important in volleyball, and jumping performance tests are frequently studied for their reliability and validity. However, most studies concerning jumping in volleyball have dealt with standard rather than sport-specific jumping procedures and tests. The aims of this study, therefore, were (a) to determine the reliability and factorial validity of 2 volleyball-specific jumping tests, the block jump (BJ) test and the attack jump (AJ) test, relative to 2 frequently used and systematically validated jumping tests, the countermovement jump test and the squat jump test and (b) to establish volleyball position-specific differences in the jumping tests and simple anthropometric indices (body height [BH], body weight, and body mass index [BMI]). The BJ was performed from a defensive volleyball position, with the hands positioned in front of the chest. During an AJ, the players used a 2- to 3-step approach and performed a drop jump with an arm swing followed by a quick vertical jump. A total of 95 high-level volleyball players (all men) participated in this study. The reliability of the jumping tests ranged from 0.97 to 0.99 for Cronbach's alpha coefficients, from 0.93 to 0.97 for interitem correlation coefficients and from 2.1 to 2.8 for coefficients of variation. The highest reliability was found for the specific jumping tests. The factor analysis extracted one significant component, and all of the tests were highly intercorrelated. The analysis of variance with post hoc analysis showed significant differences between 5 playing positions in some of the jumping tests. In general, receivers had a greater jumping capacity, followed by libero players. The differences in jumping capacities should be emphasized vis-a-vis differences in the anthropometric measures of players, where middle hitters had higher BH and body weight, followed by opposite hitters and receivers, with no differences in the BMI between positions.

  7. Assessment of performance validity in the Stroop Color and Word Test in mild traumatic brain injury patients: a criterion-groups validation design.

    PubMed

    Guise, Brian J; Thompson, Matthew D; Greve, Kevin W; Bianchini, Kevin J; West, Laura

    2014-03-01

    The current study assessed performance validity on the Stroop Color and Word Test (Stroop) in mild traumatic brain injury (TBI) using criterion-groups validation. The sample consisted of 77 patients with a reported history of mild TBI. Data from 42 moderate-severe TBI and 75 non-head-injured patients with other clinical diagnoses were also examined. TBI patients were categorized on the basis of Slick, Sherman, and Iverson (1999) criteria for malingered neurocognitive dysfunction (MND). Classification accuracy is reported for three indicators (Word, Color, and Color-Word residual raw scores) from the Stroop across a range of injury severities. With false-positive rates set at approximately 5%, sensitivity was as high as 29%. The clinical implications of these findings are discussed. © 2012 The British Psychological Society.
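
    Reporting sensitivity at a fixed false-positive rate, as this record does, amounts to choosing a cutoff from the credible-performance group and then counting how many non-credible cases fall at or below it. The sketch below illustrates that idea under stated assumptions: lower scores indicate worse performance, the function names are hypothetical, and the scores are invented. It is not the authors' procedure.

```python
def cutoff_for_fp_rate(credible_scores, max_fp_rate=0.05):
    """Largest cutoff c such that the share of credible scores <= c
    does not exceed max_fp_rate. Returns None if no cutoff qualifies."""
    n = len(credible_scores)
    best = None
    for c in sorted(set(credible_scores)):
        fp_rate = sum(s <= c for s in credible_scores) / n
        if fp_rate <= max_fp_rate:
            best = c
        else:
            break  # the false-positive rate only grows as c increases
    return best


def sensitivity(noncredible_scores, cutoff):
    """Share of non-credible cases flagged, i.e. scoring at or below the cutoff."""
    return sum(s <= cutoff for s in noncredible_scores) / len(noncredible_scores)


# Invented scores for illustration only (assumption: lower = worse).
credible = [28, 35, 36, 38, 40, 41, 42, 43, 44, 45,
            46, 47, 48, 49, 50, 51, 52, 53, 54, 55]
noncredible = [22, 25, 27, 30, 33, 37, 39, 41, 44, 46]

cutoff = cutoff_for_fp_rate(credible, max_fp_rate=0.05)
sens = sensitivity(noncredible, cutoff)
print(f"cutoff={cutoff}, sensitivity={sens:.2f}")
```

    Holding the false-positive rate near 5% keeps specificity high at the cost of sensitivity, which is why modest sensitivity values are common for this kind of indicator.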

  8. Validation of a Dumbbell Body Sway Test in Olympic Air Pistol Shooting

    PubMed Central

    Mon, Daniel; Zakynthinaki, Maria S.; Cordente, Carlos A.; Monroy Antón, Antonio; López Jiménez, David

    2014-01-01

    We present and validate a test able to provide reliable body sway measurements in air pistol shooting, without the use of a gun. 46 senior male pistol shooters who competed in Spanish air pistol championships participated in the study. Body sway data of two static bipodal balance tests have been compared: during the first test, shooting was simulated by use of a dumbbell, while during the second test the shooter's own pistol was used. Both tests were performed the day before the competition, during the official training time and at the training stands to simulate competition conditions. The participants' performance was determined as the total score of 60 shots at competition. Apart from the commonly used variables that refer to movements of the shooter's centre of pressure (COP), such as COP displacements on the X and Y axes, maximum and average COP velocities and total COP area, the present analysis also included variables that provide information regarding the axes of the COP ellipse (length and angle with respect to X). A strong statistically significant correlation between the two tests was found (with an interclass correlation varying between 0.59 and 0.92). A statistically significant inverse linear correlation was also found between performance and COP movements. The study concludes that dumbbell tests are perfectly valid for measuring body sway by simulating pistol shooting. PMID:24756067

  9. RELIABILITY AND VALIDITY OF A MODIFIED ISOMETRIC DYNAMOMETER IN THE ASSESSMENT OF MUSCULAR PERFORMANCE IN INDIVIDUALS WITH ANTERIOR CRUCIATE LIGAMENT RECONSTRUCTION

    PubMed Central

    de Vasconcelos, Rodrigo Antunes; Bevilaqua-Grossi, Débora; Shimano, Antonio Carlos; Paccola, Cleber Jansen; Salvini, Tânia Fátima; Prado, Christiane Lanatovits; Junior, Wilson A. Mello

    2015-01-01

    Objectives: The aim of this study was to evaluate the reliability and validity of a modified isometric dynamometer (MID) in assessing performance deficits of the knee extensor and flexor muscles in normal individuals and in those with ACL reconstructions. Methods: Sixty male subjects were invited to participate in the study and were divided into three groups of 20 subjects each: control group (GC), group of individuals with ACL reconstruction with patellar tendon graft (GTP), and group of individuals with ACL reconstruction with hamstrings graft (GTF). All individuals performed isometric tests on the MID; the muscular strength deficits collected were subsequently compared with those from tests performed on the Biodex System 3 operating in the isometric and isokinetic modes at speeds of 60°/s and 180°/s. Intraclass correlation coefficient (ICC) calculations were done to assess MID reliability; specificity, sensitivity and Kappa consistency coefficient calculations were done to assess the MID's validity in detecting muscular deficits; and intra- and intergroup comparisons across the four strength tests were made using ANOVA. Results: The modified isometric dynamometer (MID) showed excellent reliability and good validity in the assessment of the performance of the knee extensor and flexor muscle groups. In the comparison between groups, the GTP showed significantly greater deficits than the GTF and GC groups. Conclusion: Isometric dynamometers connected to mechanotherapy equipment could be an alternative option for collecting data on performance deficits of the extensor and flexor muscle groups of the knee in subjects with ACL reconstruction. PMID:27004175

  10. Model-Based Verification and Validation of Spacecraft Avionics

    NASA Technical Reports Server (NTRS)

    Khan, M. Omair; Sievers, Michael; Standley, Shaun

    2012-01-01

    Verification and Validation (V&V) at JPL is traditionally performed on flight or flight-like hardware running flight software. For some time, the complexity of avionics has increased exponentially while the time allocated for system integration and associated V&V testing has remained fixed. There is an increasing need to perform comprehensive system level V&V using modeling and simulation, and to use scarce hardware testing time to validate models; the norm for thermal and structural V&V for some time. Our approach extends model-based V&V to electronics and software through functional and structural models implemented in SysML. We develop component models of electronics and software that are validated by comparison with test results from actual equipment. The models are then simulated enabling a more complete set of test cases than possible on flight hardware. SysML simulations provide access and control of internal nodes that may not be available in physical systems. This is particularly helpful in testing fault protection behaviors when injecting faults is either not possible or potentially damaging to the hardware. We can also model both hardware and software behaviors in SysML, which allows us to simulate hardware and software interactions. With an integrated model and simulation capability we can evaluate the hardware and software interactions and identify problems sooner. The primary missing piece is validating SysML model correctness against hardware; this experiment demonstrated such an approach is possible.

  11. Ada (Trade name) Compiler Validation Summary Report: Rational. Rational Environment (Trademark) A952. Rational Architecture (R1000 (Trade name) Model 200).

    DTIC Science & Technology

    1987-05-06

    Rational. Rational Environment A_9_5_2. Rational Architecture (R1000 Model 200) ... validation testing performed on the Rational Environment, A_9_5_2, using Version 1.8 of the Ada Compiler Validation Capability (ACVC). The Rational Environment is hosted on a Rational Architecture (R1000 Model 200) operating under Rational Environment, Release A 95 2. Programs processed by this

  12. Comparative Predictive Validity of the New MCAT Using Different Admissions Criteria.

    ERIC Educational Resources Information Center

    Golmon, Melton E.; Berry, Charles A.

    1981-01-01

    New Medical College Admission Test (MCAT) scores and undergraduate academic achievement were examined for their validity in predicting the performance of two select student populations at Northwestern University Medical School. The data support the hypothesis that New MCAT scores possess substantial predictive validity. (Author/MLW)

  13. The Validity of Interpersonal Skills Assessment via Situational Judgment Tests for Predicting Academic Success and Job Performance

    ERIC Educational Resources Information Center

    Lievens, Filip; Sackett, Paul R.

    2012-01-01

    This study provides conceptual and empirical arguments why an assessment of applicants' procedural knowledge about interpersonal behavior via a video-based situational judgment test might be valid for academic and postacademic success criteria. Four cohorts of medical students (N = 723) were followed from admission to employment. Procedural…

  14. Authentication of Electromagnetic Interference Removal in Johnson Noise Thermometry

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Britton Jr, Charles L.; Roberts, Michael

    This report summarizes the testing performed offsite at the TVA Kingston Fossil Plant (KFP). This location is selected as a valid offsite test facility because the environment is very similar to the expected industrial nuclear power plant environment. This report will discuss the EMI discovered in the environment, the removal technique validity, and results from the measurements.

  15. Predicting Curriculum and Test Performance at Age 7 Years from Pupil Background, Baseline Skills and Phonological Awareness at Age 5

    ERIC Educational Resources Information Center

    Savage, R.; Carless, S.

    2004-01-01

    Background: Phonological awareness tests are known to be amongst the best predictors of literacy; however their predictive validity alongside current school screening practice (baseline assessment, pupil background data) and to National Curricular outcome measures is unknown. Aim: We explored the validity of phonological awareness and orthographic…

  16. Automated Vision Test Development and Validation

    DTIC Science & Technology

    2016-11-01

    Deputy Chief, Aerosp Med Consultation Div Chair, Aerospace Medicine Department This report is published in the interest of...produce software for desktop displays; and to evaluate features such as user interfaces, threshold algorithms, validity of results, and screening...cost of performing full threshold testing on over 30% of normal subjects, which is quite time consuming. This effort was accomplished using desktop

  17. [Validation of three screening tests used for early detection of cervical cancer].

    PubMed

    Rodriguez-Reyes, Esperanza Rosalba; Cerda-Flores, Ricardo M; Quiñones-Pérez, Juan M; Cortés-Gutiérrez, Elva I

    2008-01-01

    To evaluate the validity (sensitivity, specificity, and accuracy) of three screening methods used in the early detection of cervical carcinoma against the histopathological diagnosis. A selected sample of 107 women who attended the Opportune Detection of Cervicouterine Cancer Program at the Hospital de Zona 46, Instituto Mexicano del Seguro Social, in Durango during 2003 was included. The Papanicolaou test, the acetic acid test, molecular detection of human papillomavirus, and histopathological diagnosis were performed in all patients at the time of the gynecological exam. Detection and typing of human papillomavirus were performed by polymerase chain reaction (PCR) and restriction fragment length polymorphism (RFLP) analysis. Histopathological diagnosis was considered the gold standard. Validity was evaluated using the Bayesian method for diagnostic tests. The positive cases for the acetic acid test, Papanicolaou, and PCR numbered 47, 22, and 19, respectively. The accuracy values were 0.70, 0.80 and 0.99, respectively. Since the molecular method showed greater validity in the early detection of cervical carcinoma, we consider its implementation in suitable programs for the opportune detection of cervicouterine cancer in Mexico to be of vital importance. However, in order to validate this conclusion, cross-sectional studies in different regions of the country must be carried out.

  18. Reliability and validity of the closed kinetic chain upper extremity stability test.

    PubMed

    Lee, Dong-Rour; Kim, Laurentius Jongsoon

    2015-04-01

    [Purpose] The purpose of this study was to examine the reliability and validity of the Closed Kinetic Chain Upper Extremity Stability (CKCUES) test. [Subjects and Methods] A sample of 40 subjects (20 males, 20 females) with and without pain in the upper limbs was recruited. The subjects were tested twice, three days apart to assess the reliability of the CKCUES test. The CKCUES test was performed four times, and the average was calculated using the data of the last 3 tests. In order to test the validity of the CKCUES test, peak torque of internal/external shoulder rotation was measured using an isokinetic dynamometer, and maximum grip strength was measured using a hand dynamometer, and their Pearson correlation coefficients with the average values of the CKCUES test were calculated. [Results] The reliability of the CKCUES test was very high (ICC=0.97). The correlations between the CKCUES test and maximum grip strength (r=0.78-0.79), and the peak torque of internal/external shoulder rotation (r=0.87-0.94) were high indicating its validity. [Conclusion] The reliability and validity of the CKCUES test were high. The CKCUES test is expected to be used for clinical tests on upper limb stability at low price.

  19. Testing and validating environmental models

    USGS Publications Warehouse

    Kirchner, J.W.; Hooper, R.P.; Kendall, C.; Neal, C.; Leavesley, G.

    1996-01-01

    Generally accepted standards for testing and validating ecosystem models would benefit both modellers and model users. Universally applicable test procedures are difficult to prescribe, given the diversity of modelling approaches and the many uses for models. However, the generally accepted scientific principles of documentation and disclosure provide a useful framework for devising general standards for model evaluation. Adequately documenting model tests requires explicit performance criteria, and explicit benchmarks against which model performance is compared. A model's validity, reliability, and accuracy can be most meaningfully judged by explicit comparison against the available alternatives. In contrast, current practice is often characterized by vague, subjective claims that model predictions show 'acceptable' agreement with data; such claims provide little basis for choosing among alternative models. Strict model tests (those that invalid models are unlikely to pass) are the only ones capable of convincing rational skeptics that a model is probably valid. However, 'false positive' rates as low as 10% can substantially erode the power of validation tests, making them insufficiently strict to convince rational skeptics. Validation tests are often undermined by excessive parameter calibration and overuse of ad hoc model features. Tests are often also divorced from the conditions under which a model will be used, particularly when it is designed to forecast beyond the range of historical experience. In such situations, data from laboratory and field manipulation experiments can provide particularly effective tests, because one can create experimental conditions quite different from historical data, and because experimental data can provide a more precisely defined 'target' for the model to hit. 
We present a simple demonstration showing that the two most common methods for comparing model predictions to environmental time series (plotting model time series against data time series, and plotting predicted versus observed values) have little diagnostic power. We propose that it may be more useful to statistically extract the relationships of primary interest from the time series, and test the model directly against them.
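
    One way to see why a 10% 'false positive' rate can leave a skeptic unconvinced is a small Bayes' rule calculation. This is an illustrative sketch only: the Bayesian framing, the skeptic's prior of 0.05, and the 0.90 pass rate for valid models are assumptions introduced here, not figures from the record.

```python
def posterior_valid(prior_valid, pass_if_valid, pass_if_invalid):
    """P(model is valid | it passed the test), by Bayes' rule.

    pass_if_invalid plays the role of the test's 'false positive' rate:
    the chance that an invalid model nevertheless passes.
    """
    numerator = pass_if_valid * prior_valid
    evidence = numerator + pass_if_invalid * (1.0 - prior_valid)
    return numerator / evidence


# Assumed values: a skeptic's prior of 0.05 and a 0.90 pass rate for valid models.
for fp in (0.10, 0.01):
    p = posterior_valid(prior_valid=0.05, pass_if_valid=0.90, pass_if_invalid=fp)
    print(f"false-positive rate {fp:.2f}: P(valid | passed) = {p:.2f}")
```

    Under these assumed inputs, a passed test with a 10% false-positive rate raises the skeptic's belief only to about 0.32, whereas a 1% rate raises it to about 0.83, which matches the record's point that only strict tests are persuasive.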

  20. Evaluation of the Thermo Scientific SureTect Salmonella species assay. AOAC Performance Tested Method 051303.

    PubMed

    Cloke, Jonathan; Clark, Dorn; Radcliff, Roy; Leon-Velarde, Carlos; Larson, Nathan; Dave, Keron; Evans, Katharine; Crabtree, David; Hughes, Annette; Simpson, Helen; Holopainen, Jani; Wickstrand, Nina; Kauppinen, Mikko

    2014-01-01

    The Thermo Scientific SureTect Salmonella species Assay is a new real-time PCR assay for the detection of Salmonellae in food and environmental samples. This validation study was conducted using the AOAC Research Institute (RI) Performance Tested Methods program to validate the SureTect Salmonella species Assay in comparison to the reference method detailed in International Organization for Standardization 6579:2002 in a variety of food matrixes, namely, raw ground beef, raw chicken breast, raw ground pork, fresh bagged lettuce, pork frankfurters, nonfat dried milk powder, cooked peeled shrimp, pasteurized liquid whole egg, ready-to-eat meal containing beef, and stainless steel surface samples. With the exception of liquid whole egg and fresh bagged lettuce, which were tested in-house, all matrixes were tested by Marshfield Food Safety, Marshfield, WI, on behalf of Thermo Fisher Scientific. In addition, three matrixes (pork frankfurters, lettuce, and stainless steel surface samples) were analyzed independently as part of the AOAC-RI-controlled laboratory study by the University of Guelph, Canada. No significant difference by probability of detection or McNemar's chi-squared statistical analysis was found between the candidate and reference methods for any of the food matrixes or environmental surface samples tested during the validation study. Inclusivity and exclusivity testing was conducted with 117 and 36 isolates, respectively, which demonstrated that the SureTect Salmonella species Assay was able to detect all the major groups of Salmonella enterica subspecies enterica (e.g., Typhimurium) and the less common subspecies of S. enterica (e.g., arizoniae) and the rarely encountered S. bongori. None of the exclusivity isolates analyzed were detected by the SureTect Salmonella species Assay.
Ruggedness testing was conducted to evaluate the performance of the assay with specific method deviations outside of the recommended parameters open to variation (enrichment time and temperature, and lysis temperature), which demonstrated that the assay gave reliable performance. Accelerated stability testing was additionally conducted, validating the assay shelf life.

  1. An investigation of new toxicity test method performance in validation studies: 1. Toxicity test methods that have predictive capacity no greater than chance.

    PubMed

    Bruner, L H; Carr, G J; Harbell, J W; Curren, R D

    2002-06-01

    An approach commonly used to measure new toxicity test method (NTM) performance in validation studies is to divide toxicity results into positive and negative classifications, and then identify true positive (TP), true negative (TN), false positive (FP) and false negative (FN) results. After this step is completed, the contingent probability statistics (CPS), sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) are calculated. Although these statistics are widely used and often the only statistics used to assess the performance of toxicity test methods, there is little specific guidance in the validation literature on what values for these statistics indicate adequate performance. The purpose of this study was to begin developing data-based answers to this question by characterizing the CPS obtained from an NTM whose data have a completely random association with a reference test method (RTM). Determining the CPS of this worst-case scenario is useful because it provides a lower baseline from which the performance of an NTM can be judged in future validation studies. It also provides an indication of relationships in the CPS that help identify random or near-random relationships in the data. The results from this study of randomly associated tests show that the values obtained for the statistics vary significantly depending on the cut-offs chosen, that high values can be obtained for individual statistics, and that the different measures cannot be considered independently when evaluating the performance of an NTM. When the association between results of an NTM and RTM is random, the sum of the complementary pairs of statistics (sensitivity + specificity, NPV + PPV) is approximately 1, and the prevalence (i.e., the proportion of toxic chemicals in the population of chemicals) and PPV are equal.
Given that combinations of high sensitivity-low specificity or low sensitivity-high specificity (i.e., the sum of the sensitivity and specificity equal to approximately 1) indicate lack of predictive capacity, an NTM having these performance characteristics should be considered no better for predicting toxicity than chance alone.
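
    The relationships described in this record can be checked with a small worked example. The sketch below computes the four contingent probability statistics from a 2x2 table; the function name and the counts are hypothetical, and the table is constructed so that the NTM result is exactly independent of the RTM result (the NTM calls 40% of chemicals positive whether or not they are toxic).

```python
def contingent_probability_stats(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV, NPV and prevalence from a 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "prevalence": (tp + fn) / (tp + fp + fn + tn),
    }


# Hypothetical counts: 1000 chemicals, 300 toxic (prevalence 0.30).
# The NTM flags 40% of each group, so its result carries no information:
# toxic: 120 TP, 180 FN; non-toxic: 280 FP, 420 TN.
stats = contingent_probability_stats(tp=120, fp=280, fn=180, tn=420)

for name, value in stats.items():
    print(f"{name}: {value:.2f}")
print(f"sensitivity + specificity = {stats['sensitivity'] + stats['specificity']:.2f}")
print(f"PPV + NPV = {stats['ppv'] + stats['npv']:.2f}")
```

    Because the table is exactly independent, both sums equal 1 and PPV equals prevalence; with real data from a randomly associated test these relations would hold only approximately because of sampling variation.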

  2. The predictive validity of three versions of the MCAT in relation to performance in medical school, residency, and licensing examinations: a longitudinal study of 36 classes of Jefferson Medical College.

    PubMed

    Callahan, Clara A; Hojat, Mohammadreza; Veloski, Jon; Erdmann, James B; Gonnella, Joseph S

    2010-06-01

    The Medical College Admission Test (MCAT) has undergone several revisions for content and validity since its inception. With another comprehensive review pending, this study examines changes in the predictive validity of the MCAT's three recent versions. Study participants were 7,859 matriculants in 36 classes entering Jefferson Medical College between 1970 and 2005; 1,728 took the pre-1978 version of the MCAT; 3,032 took the 1978-1991 version, and 3,099 took the post-1991 version. MCAT subtest scores were the predictors, and performance in medical school, attrition, scores on the medical licensing examinations, and ratings of clinical competence in the first year of residency were the criterion measures. No significant improvement in validity coefficients was observed for performance in medical school or residency. Validity coefficients for all three versions of the MCAT in predicting Part I/Step 1 remained stable (in the mid-0.40s, P < .01). A systematic decline was observed in the validity coefficients of the MCAT versions in predicting Part II/Step 2. It started at 0.47 for the pre-1978 version, decreased to between 0.42 and 0.40 for the 1978-1991 versions, and to 0.37 for the post-1991 version. Validity coefficients for the MCAT versions in predicting Part III/Step 3 remained near 0.30. These were generally larger for women than men. Although the findings support the short- and long-term predictive validity of the MCAT, opportunities to strengthen it remain. Subsequent revisions should increase the test's ability to predict performance on United States Medical Licensing Examination Step 2 and must minimize the differential validity for gender.

  3. Ruggedness testing and validation of a practical analytical method for > 100 veterinary drug residues in bovine muscle by ultrahigh performance liquid chromatography – tandem mass spectrometry

    USDA-ARS's Scientific Manuscript database

    In this study, optimization, extension, and validation of a streamlined, qualitative and quantitative multiclass, multiresidue method was conducted to monitor more than 100 veterinary drug residues in meat using ultrahigh-performance liquid chromatography – tandem mass spectrometry (UHPLC-MS/MS). I...

  4. The Effects of Surface Structure Variables on Performance in Reading Comprehension Tests.

    ERIC Educational Resources Information Center

    Drum, Priscilla; And Others

    1981-01-01

    Concludes that reading comprehension tests that are valid for beginning readers should incorporate different factors than tests appropriate for upper elementary readers, since word recognition and word meaning are prime sources of difficulty for younger readers while content density depresses the performance of readers in upper elementary grades.…

  5. Cold flow testing of the Space Shuttle Main Engine alternate turbopump development high pressure fuel turbine model

    NASA Technical Reports Server (NTRS)

    Gaddis, Stephen W.; Hudson, Susan T.; Johnson, P. D.

    1992-01-01

    NASA's Marshall Space Flight Center has established a cold airflow turbine test program to experimentally determine the performance of liquid rocket engine turbopump drive turbines. Testing of the SSME alternate turbopump development (ATD) fuel turbine was conducted for back-to-back comparisons with the baseline SSME fuel turbine results obtained in the first quarter of 1991. Turbine performance, Reynolds number effects, and turbine diagnostics, such as stage reactions and exit swirl angles, were investigated at the turbine design point and at off-design conditions. The test data showed that the ATD fuel turbine test article was approximately 1.4 percent higher in efficiency and flowed 5.3 percent more than the baseline fuel turbine test article. This paper describes the method and results used to validate the ATD fuel turbine aerodynamic design. The results are being used to determine the ATD high pressure fuel turbopump (HPFTP) turbine performance over its operating range, anchor the SSME ATD steady-state performance model, and validate various prediction and design analyses.

  6. Psychological collectivism: a measurement validation and linkage to group member performance.

    PubMed

    Jackson, Christine L; Colquitt, Jason A; Wesson, Michael J; Zapata-Phelan, Cindy P

    2006-07-01

    The 3 studies presented here introduce a new measure of the individual-difference form of collectivism. Psychological collectivism is conceptualized as a multidimensional construct with the following 5 facets: preference for in-groups, reliance on in-groups, concern for in-groups, acceptance of in-group norms, and prioritization of in-group goals. Study 1 developed and tested the new measure in a sample of consultants. Study 2 cross-validated the measure using an alumni sample of a Southeastern university, assessing its convergent validity with other collectivism measures. Study 3 linked scores on the measure to 4 dimensions of group member performance (task performance, citizenship behavior, counterproductive behavior, and withdrawal behavior) in a computer software firm and assessed discriminant validity using the Big Five. The results of the studies support the construct validity of the measure and illustrate the potential value of collectivism as a predictor of group member performance. ((c) 2006 APA, all rights reserved).

  7. [Design and validation of a questionnaire for psychosocial nursing diagnosis in Primary Care].

    PubMed

    Brito-Brito, Pedro Ruymán; Rodríguez-Álvarez, Cristobalina; Sierra-López, Antonio; Rodríguez-Gómez, José Ángel; Aguirre-Jaime, Armando

    2012-01-01

    To develop a valid, reliable and easy-to-use questionnaire for a psychosocial nursing diagnosis. The study was performed in two phases: first phase, questionnaire design and construction; second phase, validity and reliability tests. A bank of items was constructed using the NANDA classification as a theoretical framework. Each item was assigned a Likert scale or dichotomous response. The combination of responses to the items constituted the diagnostic rules to assign up to 28 labels. A group of experts carried out the validity test for content. Other validated scales were used as reference standards for the criterion validity tests. Forty-five nurses provided the questionnaire to the patients on three separate occasions over a period of three weeks, and the other validated scales only once to 188 randomly selected patients in Primary Care centres in Tenerife (Spain). Validity tests for construct confirmed the six dimensions of the questionnaire with 91% of total variance explained. Validity tests for criterion showed a specificity of 66%-100%, and showed high correlations with the reference scales when the questionnaire was assigning nursing diagnoses. Reliability tests showed agreement of 56%-91% (P<.001), and a 93% internal consistency. The Questionnaire for Psychosocial Nursing Diagnosis was called CdePS, and included 61 items. The CdePS is a valid, reliable and easy-to-use tool in Primary Care centres to improve the assigning of a psychosocial nursing diagnosis. Copyright © 2011 Elsevier España, S.L. All rights reserved.

  8. Scopolamine disrupts place navigation in rats and humans: a translational validation of the Hidden Goal Task in the Morris water maze and a real maze for humans.

    PubMed

    Laczó, Jan; Markova, Hana; Lobellova, Veronika; Gazova, Ivana; Parizkova, Martina; Cerman, Jiri; Nekovarova, Tereza; Vales, Karel; Klovrzova, Sylva; Harrison, John; Windisch, Manfred; Vlcek, Kamil; Svoboda, Jan; Hort, Jakub; Stuchlik, Ales

    2017-02-01

    Development of new drugs for treatment of Alzheimer's disease (AD) requires valid paradigms for testing their efficacy and sensitive tests validated in translational research. We present validation of a place-navigation task, a Hidden Goal Task (HGT) based on the Morris water maze (MWM), in comparable animal and human protocols. We used scopolamine to model cognitive dysfunction similar to that seen in AD and donepezil, a symptomatic medication for AD, to assess its potential reversible effect on this scopolamine-induced cognitive dysfunction. We tested the effects of scopolamine and the combination of scopolamine and donepezil on place navigation and compared their effects in human and rat versions of the HGT. Place navigation testing consisted of 4 sessions of HGT performed at baseline, 2, 4, and 8 h after dosing in humans or 1, 2.5, and 5 h in rats. Scopolamine worsened performance in both animals and humans. In the animal experiment, co-administration of donepezil alleviated the negative effect of scopolamine. In the human experiment, subjects co-administered with scopolamine and donepezil performed similarly to subjects on placebo and scopolamine, indicating a partial ameliorative effect of donepezil. In the task based on the MWM, scopolamine impaired place navigation, while co-administration of donepezil alleviated this effect in comparable animal and human protocols. Using scopolamine and donepezil to challenge place navigation testing can be studied concurrently in animals and humans and may be a valid and reliable model for translational research, as well as for preclinical and clinical phases of drug trials.

  9. Blood collection tubes as medical devices: The potential to affect assays and proposed verification and validation processes for the clinical laboratory.

    PubMed

    Bowen, Raffick A R; Adcock, Dorothy M

    2016-12-01

    Blood collection tubes (BCTs) are an often under-recognized variable in the preanalytical phase of clinical laboratory testing. Unfortunately, even the best-designed and manufactured BCTs may not work well in all clinical settings. Clinical laboratories, in collaboration with healthcare providers, should carefully evaluate BCTs prior to putting them into clinical use to determine their limitations and ensure that patients are not placed at risk because of inaccuracies due to poor tube performance. Selection of the best BCTs can be achieved through comparing advertising materials, reviewing the literature, observing the device at a scientific meeting, receiving a demonstration, evaluating the device under simulated conditions, or testing the device with patient samples. Although many publications have discussed method validations, few detail how to perform experiments for tube verification and validation. This article highlights the most common and impactful variables related to BCTs and discusses the validation studies that a typical clinical laboratory should perform when selecting BCTs. We also present a brief review of how in vitro diagnostic devices, particularly BCTs, are regulated in the United States, the European Union, and Canada. The verification and validation of BCTs will help to avoid the economic and human costs associated with incorrect test results, including poor patient care, unnecessary testing, and delays in test results. We urge laboratorians, tube manufacturers, diagnostic companies, and other researchers to take all the necessary steps to protect against the adverse effects of BCT components and their additives on clinical assays. Copyright © 2016 The Canadian Society of Clinical Chemists. Published by Elsevier Inc. All rights reserved.

  10. Reliability and Validity of Korean Version of Apraxia Screen of TULIA (K-AST).

    PubMed

    Kim, Soo Jin; Yang, You-Na; Lee, Jong Won; Lee, Jin-Youn; Jeong, Eunhwa; Kim, Bo-Ram; Lee, Jongmin

    2016-10-01

    To evaluate the reliability and validity of Korean version of AST (K-AST) as a bedside screening test of apraxia in patients with stroke for early and reliable detection. AST was translated into Korean, and the translated version received authorization from the author of AST. The performances of K-AST in 26 patients (21 males, 5 females; mean age 65.42±17.31 years) with stroke (23 ischemic, 3 hemorrhagic) were videotaped. To test the reliability and validity of K-AST, the recorded performances were assessed by two physiatrists and two occupational therapists twice at a 1-week interval. The patient performances at admission in Korean version of Mini-Mental State Examination (K-MMSE), self-care and transfer categories of Functional Independence Measure (FIM), and motor praxis area of Loewenstein Occupational Therapy Cognitive Assessment, the second edition (LOTCA-II) were also evaluated. Scores of motor praxis area of LOTCA-II were used to assess the validity of K-AST. Inter-rater reliabilities were 0.983 (p<0.001) at the first assessment and 0.982 (p<0.001) at the second assessment. For intra-rater (test-retest) reliabilities, the values of four raters were 0.978 (p<0.001), 0.957 (p<0.001), 0.987 (p<0.001), and 0.977 (p<0.001). K-AST showed significant correlation (r=0.758, p<0.001) with motor praxis area of LOTCA-II test. K-AST also showed positive correlations with the total FIM score (r=0.694, p<0.001), the self-care category of FIM (r=0.705, p<0.001) and the transfer category of FIM (r=0.653, p<0.001). K-AST is a reliable and valid test for bedside screening of apraxia.

  11. A machine learning approach to multi-level ECG signal quality classification.

    PubMed

    Li, Qiao; Rajagopalan, Cadathur; Clifford, Gari D

    2014-12-01

    Current electrocardiogram (ECG) signal quality assessment studies have aimed to provide a two-level classification: clean or noisy. However, clinical usage demands more specific noise level classification for varying applications. This work outlines a five-level ECG signal quality classification algorithm. A total of 13 signal quality metrics were derived from segments of ECG waveforms, which were labeled by experts. A support vector machine (SVM) was trained to perform the classification and tested on a simulated dataset and was validated using data from the MIT-BIH arrhythmia database (MITDB). The simulated training and test datasets were created by selecting clean segments of the ECG in the 2011 PhysioNet/Computing in Cardiology Challenge database, and adding three types of real ECG noise at different signal-to-noise ratio (SNR) levels from the MIT-BIH Noise Stress Test Database (NSTDB). The MITDB was re-annotated for five levels of signal quality. Different combinations of the 13 metrics were trained and tested on the simulated datasets and the best combination that produced the highest classification accuracy was selected and validated on the MITDB. Performance was assessed using classification accuracy (Ac), and a single class overlap accuracy (OAc), which assumes that an individual type classified into an adjacent class is acceptable. An Ac of 80.26% and an OAc of 98.60% on the test set were obtained by selecting 10 metrics while 57.26% (Ac) and 94.23% (OAc) were the numbers for the unseen MITDB validation data without retraining. By performing the fivefold cross validation, an Ac of 88.07±0.32% and OAc of 99.34±0.07% were gained on the validation fold of MITDB. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
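
    The two performance measures named in the abstract above can be illustrated with a minimal sketch. It assumes the five quality levels are coded as integers 1 to 5 and that "adjacent class" means a difference of at most one level; the function name and the sample labels are hypothetical and are not taken from the study.

```python
def accuracy_metrics(true_levels, predicted_levels):
    """Return (Ac, OAc) for integer-coded signal quality levels.

    Ac  = fraction of segments assigned exactly the annotated level.
    OAc = fraction assigned the annotated level or an adjacent one
          (assumption: adjacent means the levels differ by at most 1).
    """
    if not true_levels or len(true_levels) != len(predicted_levels):
        raise ValueError("inputs must be non-empty and of equal length")
    pairs = list(zip(true_levels, predicted_levels))
    exact = sum(1 for t, p in pairs if t == p)
    within_one = sum(1 for t, p in pairs if abs(t - p) <= 1)
    return exact / len(pairs), within_one / len(pairs)


# Hypothetical annotations and classifier outputs, for illustration only.
true_levels = [1, 2, 3, 4, 5, 3]
predicted_levels = [1, 3, 3, 2, 5, 4]
ac, oac = accuracy_metrics(true_levels, predicted_levels)
print(f"Ac = {ac:.3f}, OAc = {oac:.3f}")
```

    Under this reading, OAc can never be lower than Ac, which is consistent with the reported pairs of values.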

  12. A Systematic Review of the Reliability and Validity of Behavioural Tests Used to Assess Behavioural Characteristics Important in Working Dogs.

    PubMed

    Brady, Karen; Cracknell, Nina; Zulch, Helen; Mills, Daniel Simon

    2018-01-01

    Working dogs are selected based on predictions from tests that they will be able to perform specific tasks in often challenging environments. However, withdrawal from service in working dogs is still a big problem, bringing into question the reliability of the selection tests used to make these predictions. A systematic review was undertaken aimed at bringing together available information on the reliability and predictive validity of the assessment of behavioural characteristics used with working dogs to establish the quality of selection tests currently available for use to predict success in working dogs. The search procedures resulted in 16 papers meeting the criteria for inclusion. A large range of behaviour tests and parameters were used in the identified papers, and so behaviour tests and their underpinning constructs were grouped on the basis of their relationship with positive core affect (willingness to work, human-directed social behaviour, object-directed play tendencies) and negative core affect (human-directed aggression, approach withdrawal tendencies, sensitivity to aversives). We then examined the papers for reports of inter-rater reliability, within-session intra-rater reliability, test-retest validity and predictive validity. The review revealed a widespread lack of information relating to the reliability and validity of measures to assess behaviour and inconsistencies in terminologies, study parameters and indices of success. There is a need to standardise the reporting of these aspects of behavioural tests in order to improve the knowledge base of what characteristics are predictive of optimal performance in working dog roles, improving selection processes and reducing working dog redundancy. We suggest the use of a framework based on explaining the direct or indirect relationship of the test with core affect.

  13. Simulation verification techniques study

    NASA Technical Reports Server (NTRS)

    Schoonmaker, P. B.; Wenglinski, T. H.

    1975-01-01

    Results are summarized of the simulation verification techniques study which consisted of two tasks: to develop techniques for simulator hardware checkout and to develop techniques for simulation performance verification (validation). The hardware verification task involved definition of simulation hardware (hardware units and integrated simulator configurations), survey of current hardware self-test techniques, and definition of hardware and software techniques for checkout of simulator subsystems. The performance verification task included definition of simulation performance parameters (and critical performance parameters), definition of methods for establishing standards of performance (sources of reference data or validation), and definition of methods for validating performance. Both major tasks included definition of verification software and assessment of verification data base impact. An annotated bibliography of all documents generated during this study is provided.

  14. Testing Game-Based Performance in Team-Handball.

    PubMed

    Wagner, Herbert; Orwat, Matthias; Hinz, Matthias; Pfusterschmied, Jürgen; Bacharach, David W; von Duvillard, Serge P; Müller, Erich

    2016-10-01

    Wagner, H, Orwat, M, Hinz, M, Pfusterschmied, J, Bacharach, DW, von Duvillard, SP, and Müller, E. Testing game-based performance in team-handball. J Strength Cond Res 30(10): 2794-2801, 2016. Team-handball is a fast paced game of defensive and offensive action that includes specific movements of jumping, passing, throwing, checking, and screening. To date and to the best of our knowledge, a game-based performance test (GBPT) for team-handball does not exist. Therefore, the aim of this study was to develop and validate such a test. Seventeen experienced team-handball players performed 2 GBPTs separated by 7 days between each test, an incremental treadmill running test, and a team-handball test game (TG) (2 × 20 minutes). Peak oxygen uptake (V̇O2peak), blood lactate concentration (BLC), heart rate (HR), sprinting time, time of offensive and defensive actions as well as running intensities, ball velocity, and jump height were measured in the game-based test. Reliability of the tests was calculated using an intraclass correlation coefficient (ICC). Additionally, we measured V̇O2peak in the incremental treadmill running test and BLC, HR, and running intensities in the team-handball TG to determine the validity of the GBPT. For the test-retest reliability, we found an ICC >0.70 for the peak BLC and HR, mean offense and defense time, as well as ball velocity that yielded an ICC >0.90 for the V̇O2peak in the GBPT. Percent walking and standing constituted 73% of total time. Moderate (18%) and high (9%) intensity running in the GBPT was similar to the team-handball TG. Our results indicated that the GBPT is a valid and reliable test to analyze team-handball performance (physiological and biomechanical variables) under conditions similar to competition.

  15. Validating workplace performance assessments in health sciences students: a case study from speech pathology.

    PubMed

    McAllister, Sue; Lincoln, Michelle; Ferguson, Allison; McAllister, Lindy

    2013-01-01

    Valid assessment of health science students' ability to perform in the real world of workplace practice is critical for promoting quality learning and ultimately certifying students as fit to enter the world of professional practice. Current practice in performance assessment in the health sciences field has been hampered by multiple issues regarding assessment content and process. Evidence for the validity of scores derived from assessment tools are usually evaluated against traditional validity categories with reliability evidence privileged over validity, resulting in the paradoxical effect of compromising the assessment validity and learning processes the assessments seek to promote. Furthermore, the dominant statistical approaches used to validate scores from these assessments fall under the umbrella of classical test theory approaches. This paper reports on the successful national development and validation of measures derived from an assessment of Australian speech pathology students' performance in the workplace. Validation of these measures considered each of Messick's interrelated validity evidence categories and included using evidence generated through Rasch analyses to support score interpretation and related action. This research demonstrated that it is possible to develop an assessment of real, complex, work based performance of speech pathology students, that generates valid measures without compromising the learning processes the assessment seeks to promote. The process described provides a model for other health professional education programs to trial.

  16. Femtosecond laser micro-inscription of optical coherence tomography resolution test artifacts.

    PubMed

    Tomlins, Peter H; Smith, Graham N; Woolliams, Peter D; Rasakanthan, Janarthanan; Sugden, Kate

    2011-04-25

    Optical coherence tomography (OCT) systems are becoming more commonly used in biomedical imaging and, to enable continued uptake, a reliable method of characterizing their performance and validating their operation is required. This paper outlines the use of femtosecond laser subsurface micro-inscription techniques to fabricate an OCT test artifact for validating the resolution performance of a commercial OCT system. The key advantage of this approach is that by utilizing the nonlinear absorption a three dimensional grid of highly localized point and line defects can be written in clear fused silica substrates.

  17. Development of an Itemwise Efficiency Scoring Method: Concurrent, Convergent, Discriminant, and Neuroimaging-Based Predictive Validity Assessed in a Large Community Sample

    PubMed Central

    Moore, Tyler M.; Reise, Steven P.; Roalf, David R.; Satterthwaite, Theodore D.; Davatzikos, Christos; Bilker, Warren B.; Port, Allison M.; Jackson, Chad T.; Ruparel, Kosha; Savitt, Adam P.; Baron, Robert B.; Gur, Raquel E.; Gur, Ruben C.

    2016-01-01

    Traditional “paper-and-pencil” testing is imprecise in measuring speed and hence limited in assessing performance efficiency, but computerized testing permits precision in measuring itemwise response time. We present a method of scoring performance efficiency (combining information from accuracy and speed) at the item level. Using a community sample of 9,498 youths age 8-21, we calculated item-level efficiency scores on four neurocognitive tests, and compared the concurrent, convergent, discriminant, and predictive validity of these scores to simple averaging of standardized speed and accuracy-summed scores. Concurrent validity was measured by the scores' abilities to distinguish men from women and their correlations with age; convergent and discriminant validity were measured by correlations with other scores inside and outside of their neurocognitive domains; predictive validity was measured by correlations with brain volume in regions associated with the specific neurocognitive abilities. Results provide support for the ability of itemwise efficiency scoring to detect signals as strong as those detected by standard efficiency scoring methods. We find no evidence of superior validity of the itemwise scores over traditional scores, but point out several advantages of the former. The itemwise efficiency scoring method shows promise as an alternative to standard efficiency scoring methods, with overall moderate support from tests of four different types of validity. This method allows the use of existing item analysis methods and provides the convenient ability to adjust the overall emphasis of accuracy versus speed in the efficiency score, thus adjusting the scoring to the real-world demands the test is aiming to fulfill. PMID:26866796

  18. Analytic Validation of Immunohistochemistry Assays: New Benchmark Data From a Survey of 1085 Laboratories.

    PubMed

    Stuart, Lauren N; Volmar, Keith E; Nowak, Jan A; Fatheree, Lisa A; Souers, Rhona J; Fitzgibbons, Patrick L; Goldsmith, Jeffrey D; Astles, J Rex; Nakhleh, Raouf E

    2017-09-01

    Context: A cooperative agreement between the College of American Pathologists (CAP) and the United States Centers for Disease Control and Prevention was undertaken to measure laboratories' awareness and implementation of an evidence-based laboratory practice guideline (LPG) on immunohistochemical (IHC) validation practices published in 2014. Objective: To establish new benchmark data on IHC laboratory practices. Design: A 2015 survey on IHC assay validation practices was sent to laboratories subscribed to specific CAP proficiency testing programs and to additional nonsubscribing laboratories that perform IHC testing. Specific questions were designed to capture laboratory practices not addressed in a 2010 survey. Results: The analysis was based on responses from 1085 laboratories that perform IHC staining. Ninety-six percent (809 of 844) always documented validation of IHC assays. Sixty percent (648 of 1078) had separate procedures for predictive and nonpredictive markers, 42.7% (220 of 515) had procedures for laboratory-developed tests, 50% (349 of 697) had procedures for testing cytologic specimens, and 46.2% (363 of 785) had procedures for testing decalcified specimens. Minimum case numbers were specified by 85.9% (720 of 838) of laboratories for nonpredictive markers and 76% (584 of 768) for predictive markers. Median concordance requirements were 95% for both types. For initial validation, 75.4% (538 of 714) of laboratories adopted the 20-case minimum for nonpredictive markers and 45.9% (266 of 579) adopted the 40-case minimum for predictive markers as outlined in the 2014 LPG. The most common method for validation was correlation with morphology and expected results. Laboratories also reported which assay changes necessitated revalidation and their minimum case requirements. Conclusions: Benchmark data on current IHC validation practices and procedures may help laboratories understand the issues and influence further refinement of LPG recommendations.

  19. Scoring Systems to Estimate Intracerebral Control and Survival Rates of Patients Irradiated for Brain Metastases

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Rades, Dirk; Dziggel, Liesa; Haatanen, Tiina

    2011-07-15

    Purpose: To create and validate scoring systems for intracerebral control (IC) and overall survival (OS) of patients irradiated for brain metastases. Methods and Materials: In this study, 1,797 patients were randomly assigned to the test (n = 1,198) or the validation group (n = 599). Two scoring systems were developed, one for IC and another for OS. The scores included prognostic factors found significant on multivariate analyses. Age, performance status, extracerebral metastases, interval tumor diagnosis to RT, and number of brain metastases were associated with OS. Tumor type, performance status, interval, and number of brain metastases were associated with IC. The score for each factor was determined by dividing the 6-month IC or OS rate (given in percent) by 10. The total score represented the sum of the scores for each factor. The score groups of the test group were compared with the corresponding score groups of the validation group. Results: In the test group, 6-month IC rates were 17% for 14-18 points, 49% for 19-23 points, and 77% for 24-27 points (p < 0.0001). IC rates in the validation group were 19%, 52%, and 77%, respectively (p < 0.0001). In the test group, 6-month OS rates were 9% for 15-19 points, 41% for 20-25 points, and 78% for 26-30 points (p < 0.0001). OS rates in the validation group were 7%, 39%, and 79%, respectively (p < 0.0001). Conclusions: Patients irradiated for brain metastases can be given scores to estimate OS and IC. IC and OS rates of the validation group were similar to the test group demonstrating the validity and reproducibility of both scores.
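
    The scoring rule described in the abstract above (each factor's 6-month rate in percent divided by 10, then summed over factors) can be written as a short sketch. The per-factor rates below are hypothetical placeholders, not values from the study, and the function names are illustrative. The abstract reports integer point ranges but does not state a rounding rule, so none is applied here.

```python
def factor_score(six_month_rate_percent):
    """Score for one prognostic factor: 6-month rate (in percent) divided by 10."""
    if not 0 <= six_month_rate_percent <= 100:
        raise ValueError("rate must be a percentage between 0 and 100")
    return six_month_rate_percent / 10


def total_score(rates_by_factor):
    """Total score: sum of the factor scores for one patient."""
    return sum(factor_score(rate) for rate in rates_by_factor.values())


# Hypothetical 6-month IC rates (percent) for one patient's factor categories.
ic_rates = {
    "tumor type": 45,
    "performance status": 60,
    "interval": 50,
    "number of brain metastases": 55,
}
ic_total = total_score(ic_rates)
print(f"Total IC score: {ic_total}")
```

    With the four IC factors, the total can range from 0 to 40 in principle; the groups reported in the abstract span 14 to 27 points.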

  20. Reliability and validity analysis of the open-source Chinese Foot and Ankle Outcome Score (FAOS).

    PubMed

    Ling, Samuel K K; Chan, Vincent; Ho, Karen; Ling, Fona; Lui, T H

    2017-12-21

    Develop the first reliable and validated open-source outcome scoring system in the Chinese language for foot and ankle problems. Translation of the English FAOS into Chinese following regular protocols. First, two forward-translations were created separately, these were then combined into a preliminary version by an expert committee, and was subsequently back-translated into English. The process was repeated until the original and back translations were congruent. This version was then field tested on actual patients who provided feedback for modification. The final Chinese FAOS version was then tested for reliability and validity. Reliability analysis was performed on 20 subjects while validity analysis was performed on 50 subjects. Tools used to validate the Chinese FAOS were the SF36 and Pain Numeric Rating Scale (NRS). Internal consistency between the FAOS subgroups was measured using Cronbach's alpha. Spearman's correlation was calculated between each subgroup in the FAOS, SF36 and NRS. The Chinese FAOS passed both reliability and validity testing; meaning it is reliable, internally consistent and correlates positively with the SF36 and the NRS. The Chinese FAOS is a free, open-source scoring system that can be used to provide a relatively standardised outcome measure for foot and ankle studies. Copyright © 2017 Elsevier Ltd. All rights reserved.
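
    Cronbach's alpha, the internal consistency statistic named in the abstract above, follows the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of the summed scores). The sketch below uses only the Python standard library; the response data are hypothetical and the function name is illustrative, not part of the study's analysis.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of items, each a list of per-subject scores.

    item_scores[i][j] is the score of subject j on item i.
    Sample variances (n - 1 denominator) are used throughout.
    """
    k = len(item_scores)
    if k < 2:
        raise ValueError("at least two items are required")
    n_subjects = len(item_scores[0])
    if n_subjects < 2 or any(len(item) != n_subjects for item in item_scores):
        raise ValueError("every item needs the same number (>= 2) of subjects")
    totals = [sum(subject) for subject in zip(*item_scores)]
    total_variance = variance(totals)
    if total_variance == 0:
        raise ValueError("summed scores have zero variance")
    item_variance_sum = sum(variance(item) for item in item_scores)
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical responses: 3 items answered by 5 subjects.
responses = [
    [2, 3, 4, 4, 5],
    [1, 3, 3, 4, 5],
    [2, 2, 4, 5, 5],
]
alpha = cronbach_alpha(responses)
print(f"Cronbach's alpha = {alpha:.3f}")
```

    Items that rise and fall together across subjects push alpha toward 1; identical items give exactly 1.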

  1. How'd they do it? Malingering strategies on symptom validity tests.

    PubMed

    Tan, Jing Ee; Slick, Daniel J; Strauss, Esther; Hultsch, David F

    2002-12-01

    Twenty-five undergraduate students were instructed to feign believable impairment following a brain injury from a car accident and 27 students were told to perform like they had recovered from such an injury. Three forced-choice tests, the Test of Memory Malingering (TOMM), Victoria Symptom Validity Test (VSVT), and Word Memory Test (WMT) were given. Test-taking strategies were evaluated by means of a questionnaire given at the end of the test session. The results revealed that all the tasks differentiated between groups. Using conventional cut-scores, the WMT proved most efficient while the VSVT captured the most participants in the definitive below-chance category. Individuals instructed to feign injury were more likely to prepare prior to the experiment, with feigning of memory loss as the most frequently reported strategy. Regardless, preparation effort did not translate into believable performance on the tests.

  2. Predictive validities of several clinical color vision tests for aviation signal light gun performance.

    DOT National Transportation Integrated Search

    1975-01-01

    Scores on the American Optical Company (AOC) test (1965 edition), Dvorine test, Farnsworth Lantern test, Color Threshold Tester, Farnsworth-Munsell 100-Hue test, Farnsworth Panel D-15 test, and Schmidt-Haensch Anomaloscope were obtained from 137 men ...

  3. Lightweight ZERODUR: Validation of Mirror Performance and Mirror Modeling Predictions

    NASA Technical Reports Server (NTRS)

    Hull, Tony; Stahl, H. Philip; Westerhoff, Thomas; Valente, Martin; Brooks, Thomas; Eng, Ron

    2017-01-01

    Upcoming spaceborne missions, both moderate and large in scale, require extreme dimensional stability while relying both upon established lightweight mirror materials, and also upon accurate modeling methods to predict performance under varying boundary conditions. We describe tests, recently performed at NASA's XRCF chambers and laboratories in Huntsville Alabama, during which a 1.2 m diameter, f/1.29, 88% lightweighted SCHOTT ZERODUR® mirror was tested for thermal stability under static loads in steps down to 230K. Test results are compared to model predictions, based upon recently published data on ZERODUR®. In addition to monitoring the mirror surface for thermal perturbations in XRCF Thermal Vacuum tests, static load gravity deformations have been measured and compared to model predictions. Also the Modal Response (dynamic disturbance) was measured and compared to model. We will discuss the fabrication approach and optomechanical design of the ZERODUR® mirror substrate by SCHOTT, its optical preparation for test by Arizona Optical Systems (AOS), and summarize the outcome of NASA's XRCF tests and model validations.

  4. Relationship of Temporal Lobe Volumes to Neuropsychological Test Performance in Healthy Children

    PubMed Central

    Wells, Carolyn T.; Matson, Melissa A.; Kates, Wendy R.; Hay, Trisha; Horska, Alena

    2008-01-01

    Ecological validity of neuropsychological assessment includes the ability of tests to predict real-world functioning and/or covary with brain structures. Studies have examined the relationship between adaptive skills and test performance, with less focus on the association between regional brain volumes and neurobehavioral function in healthy children. The present study examined the relationship between temporal lobe gray matter volumes and performance on two neuropsychological tests hypothesized to measure temporal lobe functioning (Visual Perception-VP; Peabody Picture Vocabulary Test, Third Edition-PPVT-III) in 48 healthy children ages 5-18 years. After controlling for age and gender, left and right temporal and left occipital volumes were significant predictors of VP. Left and right frontal and temporal volumes were significant predictors of PPVT-III. Temporal volume emerged as the strongest lobar correlate with both tests. These results provide convergent and discriminant validity supporting VP as a measure of the “what” system; but suggest the PPVT-III as a complex measure of receptive vocabulary, potentially involving executive function demands. PMID:18513844

  5. Lightweight ZERODUR®: Validation of mirror performance and mirror modeling predictions

    NASA Astrophysics Data System (ADS)

    Hull, Anthony B.; Stahl, H. Philip; Westerhoff, Thomas; Valente, Martin; Brooks, Thomas; Eng, Ron

    2017-01-01

    Upcoming spaceborne missions, both moderate and large in scale, require extreme dimensional stability while relying both upon established lightweight mirror materials, and also upon accurate modeling methods to predict performance under varying boundary conditions. We describe tests, recently performed at NASA’s XRCF chambers and laboratories in Huntsville Alabama, during which a 1.2m diameter, f/1.29 88% lightweighted SCHOTT lightweighted ZERODUR® mirror was tested for thermal stability under static loads in steps down to 230K. Test results are compared to model predictions, based upon recently published data on ZERODUR®. In addition to monitoring the mirror surface for thermal perturbations in XRCF Thermal Vacuum tests, static load gravity deformations have been measured and compared to model predictions. Also the Modal Response (dynamic disturbance) was measured and compared to model. We will discuss the fabrication approach and optomechanical design of the ZERODUR® mirror substrate by SCHOTT, its optical preparation for test by Arizona Optical Systems (AOS), and summarize the outcome of NASA’s XRCF tests and model validations.

  6. TU-D-201-05: Validation of Treatment Planning Dose Calculations: Experience Working with MPPG 5.a

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Xue, J; Park, J; Kim, L

    2016-06-15

    Purpose: Newly published medical physics practice guideline (MPPG 5.a.) has set the minimum requirements for commissioning and QA of treatment planning dose calculations. We present our experience in the validation of a commercial treatment planning system based on MPPG 5.a. Methods: In addition to tests traditionally performed to commission a model-based dose calculation algorithm, extensive tests were carried out at short and extended SSDs, various depths, oblique gantry angles and off-axis conditions to verify the robustness and limitations of a dose calculation algorithm. A comparison between measured and calculated dose was performed based on validation tests and evaluation criteria recommended by MPPG 5.a. An ion chamber was used for the measurement of dose at points of interest, and diodes were used for photon IMRT/VMAT validations. Dose profiles were measured with a three-dimensional scanning system and calculated in the TPS using a virtual water phantom. Results: Calculated and measured absolute dose profiles were compared at each specified SSD and depth for open fields. The disagreement is easily identifiable with the difference curve. Subtle discrepancy has revealed the limitation of the measurement, e.g., a spike at the high dose region and an asymmetrical penumbra observed on the tests with an oblique MLC beam. The excellent results we had (> 98% pass rate on 3%/3mm gamma index) on the end-to-end tests for both IMRT and VMAT are attributed to the quality beam data and the good understanding of the modeling. The limitation of the model and the uncertainty of measurement were considered when comparing the results. Conclusion: The extensive tests recommended by the MPPG encourage us to understand the accuracy and limitations of a dose algorithm as well as the uncertainty of measurement. Our experience has shown how the suggested tests can be performed effectively to validate dose calculation models.

  7. A new instrument to assess physician skill at thoracic ultrasound, including pleural effusion markup.

    PubMed

    Salamonsen, Matthew; McGrath, David; Steiler, Geoff; Ware, Robert; Colt, Henri; Fielding, David

    2013-09-01

    To reduce complications and increase success, thoracic ultrasound is recommended to guide all chest drainage procedures. Despite this, no tools currently exist to assess proceduralist training or competence. This study aims to validate an instrument to assess physician skill at performing thoracic ultrasound, including effusion markup, and examine its validity. We developed an 11-domain, 100-point assessment sheet in line with British Thoracic Society guidelines: the Ultrasound-Guided Thoracentesis Skills and Tasks Assessment Test (UGSTAT). The test was used to assess 22 participants (eight novices, seven intermediates, seven advanced) on two occasions while performing thoracic ultrasound on a pleural effusion phantom. Each test was scored by two blinded expert examiners. Validity was examined by assessing the ability of the test to stratify participants according to expected skill level (analysis of variance) and demonstrating test-retest and intertester reproducibility by comparison of repeated scores (mean difference [95% CI] and paired t test) and the intraclass correlation coefficient. Mean scores for the novice, intermediate, and advanced groups were 49.3, 73.0, and 91.5 respectively, which were all significantly different (P < .0001). There were no significant differences between repeated scores. Procedural training on mannequins prior to unsupervised performance on patients is rapidly becoming the standard in medical education. This study has validated the UGSTAT, which can now be used to determine the adequacy of thoracic ultrasound training prior to clinical practice. It is likely that its role could be extended to live patients, providing a way to document ongoing procedural competence.

  8. Familiarization, validity and smallest detectable difference of the isometric squat test in evaluating maximal strength.

    PubMed

    Drake, David; Kennedy, Rodney; Wallace, Eric

    2018-02-06

    Isometric multi-joint tests are considered reliable and have strong relationships with 1RM performance. However, limited evidence is available for the isometric squat in terms of effects of familiarization and reliability. This study aimed to assess the effect of familiarization and the stability reliability of the isometric squat test, to determine its smallest detectable difference, and to examine its correlation with 1RM squat performance. Thirty-six strength-trained participants volunteered to take part in this study. Following three familiarization sessions, test-retest reliability was evaluated with a 48-hour window between each time point. Isometric squat peak, net and relative force were assessed. Results showed that three familiarizations were required, and that the isometric squat had a high level of stability reliability and a smallest detectable difference of 11% for peak and relative force. Isometric strength at a knee angle of ninety degrees had a strong significant relationship with 1RM squat performance. In conclusion, the isometric squat is a valid test to assess multi-joint strength and can discriminate between strong and weak 1RM squat performance. Changes greater than 11% in peak and relative isometric squat performance should be considered as meaningful in participants who are familiar with the test.

  9. Impact on Participation and Autonomy: Test of Validity and Reliability for Older Persons.

    PubMed

    Hammar, Isabelle Ottenvall; Ekelund, Christina; Wilhelmson, Katarina; Eklund, Kajsa

    2014-11-06

    In research and healthcare it is important to measure older persons' self-determination in order to improve their possibilities to decide for themselves in daily life. The questionnaire Impact on Participation and Autonomy (IPA) assesses self-determination, but is not constructed for older persons. The aim of this study was to examine the validity and reliability of the IPA-S questionnaire for persons aged 70 years and older. The study was performed in two steps; first a validity test of the Swedish version of the questionnaire, IPA-S, followed by a reliability test-retest of an adjusted version. The validity was tested with focus groups and individual interviews on persons aged 77-88 years, and the reliability on persons aged 70-99 years. The validity test result showed that IPA-S is valid for older persons but it was too extensive and the phrasing of the items needed adjustments. The reliability test-retest on the adjusted questionnaire, IPA- Older persons (IPA-O), showed that 15 of 22 items had high agreement. IPA-O can be used to measure older persons' self-determination in their care and rehabilitation.

  10. School-based behavioral assessment tools are reliable and valid for measurement of fruit and vegetable intake, physical activity, and television viewing in young children.

    PubMed

    Economos, Christina D; Sacheck, Jennifer M; Kwan Ho Chui, Kenneth; Irizarry, Laura; Irizzary, Laura; Guillemont, Juliette; Collins, Jessica J; Hyatt, Raymond R

    2008-04-01

    Interventions aiming to modify the dietary and physical activity behaviors of young children require precise and accurate measurement tools. As part of a larger community-based project, three school-based questionnaires were developed to assess (a) fruit and vegetable intake, (b) physical activity and television (TV) viewing, and (c) perceived parental support for diet and physical activity. Test-retest reliability was performed on all questionnaires and validity was measured for fruit and vegetable intake, physical activity, and TV viewing. Eighty-four school children (8.3+/-1.1 years) were studied. Test-retest reliability was performed by administering questionnaires twice, 1 to 2 hours apart. Validity of the fruit and vegetable questionnaire was measured by direct observation, while the physical activity and TV questionnaire was validated by a parent phone interview. All three questionnaires yielded excellent test-retest reliability (P<0.001). The majority of fruit and vegetable questions and the questions regarding specific physical activities and TV viewing were valid. Low validity scores were found for questions on watching TV during breakfast or dinner. These questionnaires are reliable and valid tools to assess fruit and vegetable intake, physical activity, and TV viewing behaviors in early elementary school-aged children. Methods for assessment of children's TV viewing during meals should be further investigated because of parent-child discrepancies.

  11. The Validity and Contributing Physiological Factors to 30-15 Intermittent Fitness Test Performance in Rugby League.

    PubMed

    Scott, Tannath J; Duthie, Grant M; Delaney, Jace A; Sanctuary, Colin E; Ballard, David A; Hickmans, Jeremy A; Dascombe, Ben J

    2017-09-01

    Scott, TJ, Duthie, GM, Delaney, JA, Sanctuary, CE, Ballard, DA, Hickmans, JA, and Dascombe, BJ. The validity and contributing physiological factors to 30-15 intermittent fitness test performance in rugby league. J Strength Cond Res 31(9): 2409-2416, 2017-This study examined the validity of the 30-15 Intermittent Fitness Test (30-15IFT) within rugby league. Sixty-three Australian elite and junior-elite rugby league players (22.5 ± 4.5 years, 96.1 ± 9.5 kg, Σ7 skinfolds: 71.0 ± 18.7 mm) from a professional club participated in this study. Players were assessed for anthropometry (body mass, Σ7 skinfolds, lean mass index), prolonged high-intensity intermittent running (PHIR; measured by 30-15IFT), predicted aerobic capacity (MSFT) and power (AAS), speed (40 m sprint), repeated sprint, and change of direction (COD-505 agility test) ability before and after an 11-week preseason training period. Validity of the 30-15IFT was established using Pearson's coefficient correlations. Forward stepwise regression model identified the fewest variables that could predict individual final velocity (VIFT) and change within 30-15IFT performance. Significant correlations between VIFT and Σ7 skinfolds, repeated sprint decrement, V̇O2maxMSFT, and average aerobic speed were observed. A total of 71.8% of the adjusted variance in 30-15IFT performance was explained using a 4-step best fit model (V̇O2maxMSFT, 61.4%; average aerobic speed, 4.7%; maximal velocity, 4.1%; lean mass index, 1.6%). Across the training period, 25% of the variance was accounted by ΔV̇O2maxMSFT (R = 0.25). These relationships suggest that the 30-15IFT is a valid test of PHIR within rugby league. Poor correlations were observed with measures of acceleration, speed, and COD. These findings demonstrate that although the 30-15IFT is a valid measure of PHIR, it also simultaneously examines various physiological capacities that differ between sporting cohorts.

  12. Development of a framework for international certification by OIE of diagnostic tests validated as fit for purpose.

    PubMed

    Wright, P; Edwards, S; Diallo, A; Jacobson, R

    2006-01-01

    Historically, the OIE has focused on test methods applicable to trade and the international movement of animals and animal products. With its expanding role as the World Organisation for Animal Health, the OIE has recognised the need to evaluate test methods relative to specific diagnostic applications other than trade. In collaboration with its international partners, the OIE solicited input from experts through consultants' meetings on the development of guidelines for validation and certification of diagnostic assays for infectious animal diseases. Recommendations from the first meeting were formally adopted and have subsequently been acted upon by the OIE. A validation template has been developed that specifically requires a test to be fit or suited for its intended purpose (e.g. as a screening or a confirmatory test). This is a key criterion for validation. The template incorporates four distinct stages of validation, each of which has bearing on the evaluation of fitness for purpose. The OIE has just recently created a registry for diagnostic tests that fulfil these validation requirements. Assay developers are invited to submit validation dossiers to the OIE for evaluation by a panel of experts. Recognising that validation is an incremental process, tests methods achieving at least the first stages of validation may be provisionally accepted. To provide additional confidence in assay performance, the OIE, through its network of Reference Laboratories, has embarked on the development of evaluation panels. These panels would contain specially selected test samples that would assist in verifying fitness for purpose.

  13. Development of a framework for international certification by the OIE of diagnostic tests validated as fit for purpose.

    PubMed

    Wright, P; Edwards, S; Diallo, A; Jacobson, R

    2007-01-01

    Historically, the OIE has focussed on test methods applicable to trade and the international movement of animals and animal products. With its expanding role as the World Organisation for Animal Health, the OIE has recognised the need to evaluate test methods relative to specific diagnostic applications other than trade. In collaboration with its international partners, the OIE solicited input from experts through consultants meetings on the development of guidelines for validation and certification of diagnostic assays for infectious animal diseases. Recommendations from the first meeting were formally adopted and have subsequently been acted upon by the OIE. A validation template has been developed that specifically requires a test to be fit or suited for its intended purpose (e.g. as a screening or a confirmatory test). This is a key criterion for validation. The template incorporates four distinct stages of validation, each of which has bearing on the evaluation of fitness for purpose. The OIE has just recently created a registry for diagnostic tests that fulfil these validation requirements. Assay developers are invited to submit validation dossiers to the OIE for evaluation by a panel of experts. Recognising that validation is an incremental process, tests methods achieving at least the first stages of validation may be provisionally accepted. To provide additional confidence in assay performance, the OIE, through its network of Reference Laboratories, has embarked on the development of evaluation panels. These panels would contain specially selected test samples that would assist in verifying fitness for purpose.

  14. Predictive validity of the Biomedical Admissions Test: an evaluation and case study.

    PubMed

    McManus, I C; Ferguson, Eamonn; Wakeford, Richard; Powis, David; James, David

    2011-01-01

    There has been an increase in the use of pre-admission selection tests for medicine. Such tests need to show good psychometric properties. Here, we use a paper by Emery and Bell [2009. The predictive validity of the Biomedical Admissions Test for pre-clinical examination performance. Med Educ 43:557-564] as a case study to evaluate and comment on the reporting of psychometric data in the field of medical student selection (and the comments apply to many papers in the field). We highlight pitfalls when reliability data are not presented, how simple zero-order associations can lead to inaccurate conclusions about the predictive validity of a test, and how biases need to be explored and reported. We show with BMAT that it is the knowledge part of the test which does all the predictive work. We show that without evidence of incremental validity it is difficult to assess the value of any selection tests for medicine.

  15. Incremental Validity of Useful Field of View Subtests for the Prediction of Instrumental Activities of Daily Living

    PubMed Central

    Aust, Frederik; Edwards, Jerri D.

    2015-01-01

    Introduction: The Useful Field of View Test (UFOV®) is a cognitive measure that predicts older adults’ ability to perform a range of everyday activities. However, little is known about the individual contribution of each subtest to these predictions and the underlying constructs of UFOV performance remain a topic of debate. Method: We investigated the incremental validity of UFOV subtests for the prediction of Instrumental Activities of Daily Living (IADL) performance in two independent datasets, the SKILL (n = 828) and ACTIVE (n = 2426) studies. We, then, explored the cognitive and visual abilities assessed by UFOV using a range of neuropsychological and vision tests administered in the SKILL study. Results: In the four subtest variant of UFOV, only subtests 2 and 3 consistently made independent contributions to the prediction of IADL performance across three different behavioral measures. In all cases, the incremental validity of UFOV subtests 1 and 4 was negligible. Furthermore, we found that UFOV was related to processing speed, general non-speeded cognition, and visual function; the omission of subtests 1 and 4 from the test score did not affect these associations. Conclusions: UFOV subtests 1 and 4 appear to be of limited use to predict IADL and possibly other everyday activities. Future experimental research should investigate if shortening the UFOV by omitting these subtests is a reliable and valid assessment approach. PMID:26782018

  16. Evaluating the Medical Symptom Validity Test (MSVT) in a Sample of Veterans Between the Ages of 18 to 64.

    PubMed

    Reslan, Summar; Axelrod, Bradley N

    2017-01-01

    The purpose of the current study was to compare three potential profiles of the Medical Symptom Validity Test (MSVT; Pass, Genuine Memory Impairment Profile [GMIP], and Fail) on other freestanding and embedded performance validity tests (PVTs). Notably, a quantitatively computed version of the GMIP was utilized in this investigation. Data obtained from veterans referred for a neuropsychological evaluation in a metropolitan Veteran Affairs medical center were included (N = 494). Individuals age 65 and older were not included to exclude individuals with dementia from this investigation. The sample revealed 222 (45%) in the Pass group. Of the 272 who failed the easy subtests of the MSVT, 221 (81%) met quantitative criteria for the GMIP and 51 (19%) were classified as Fail. The Pass group failed fewer freestanding and embedded PVTs and obtained higher raw scores on all PVTs than both GMIP and Fail groups. The differences in performances of the GMIP and Fail groups were minimal. Specifically, GMIP protocols failed fewer freestanding PVTs than the Fail group; failure on embedded PVTs did not differ between GMIP and Fail. The MSVT GMIP incorporates the presence of clinical correlates of disability to assist with this distinction, but future research should consider performances on other freestanding measures of performance validity to differentiate cognitive impairment from invalidity.
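    The group percentages quoted in the abstract above can be rechecked from the reported counts. The short Python sketch below is only an illustrative arithmetic check; the variable and function names are assumptions introduced here, and the figures are the ones stated in the abstract (N = 494, 222 Pass, 221 GMIP, 51 Fail).

```python
# Counts as reported in the abstract (names are illustrative, not from the study).
total = 494          # veterans included
passed = 222         # Pass group
gmip = 221           # met quantitative GMIP criteria
fail = 51            # classified as Fail

# Everyone who did not pass failed the easy subtests.
failed_easy = total - passed

def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to a whole number."""
    return round(100 * part / whole)

# The GMIP and Fail groups should partition the failed-easy-subtests group.
assert gmip + fail == failed_easy

pass_pct = pct(passed, total)        # share of the full sample
gmip_pct = pct(gmip, failed_easy)    # share of those failing easy subtests
fail_pct = pct(fail, failed_easy)

print(pass_pct, gmip_pct, fail_pct)
```

    Note that the Pass percentage is taken over the full sample, while the GMIP and Fail percentages are taken over the 272 who failed the easy subtests, which matches how the abstract reports them.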

  17. Preliminary Results for the OECD/NEA Time Dependent Benchmark using Rattlesnake, Rattlesnake-IQS and TDKENO

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    DeHart, Mark D.; Mausolff, Zander; Weems, Zach

    2016-08-01

    One goal of the MAMMOTH M&S project is to validate the analysis capabilities within MAMMOTH. Historical data has shown limited value for validation of full three-dimensional (3D) multi-physics methods. Initial analysis considered the TREAT startup minimum critical core and one of the startup transient tests. At present, validation is focusing on measurements taken during the M8CAL test calibration series. These exercises will be valuable in preliminary assessment of the ability of MAMMOTH to perform coupled multi-physics calculations; calculations performed to date are being used to validate the neutron transport solver Rattlesnake and the fuels performance code BISON. Other validation projects outside of TREAT are available for single-physics benchmarking. Because the transient solution capability of Rattlesnake is one of the key attributes that makes it unique for TREAT transient simulations, validation of the transient solution of Rattlesnake using other time dependent kinetics benchmarks has considerable value. The Nuclear Energy Agency (NEA) of the Organization for Economic Cooperation and Development (OECD) has recently developed a computational benchmark for transient simulations. This benchmark considered both two-dimensional (2D) and 3D configurations for a total number of 26 different transients. All are negative reactivity insertions, typically returning to the critical state after some time.

  18. Linguistic validation and reliability properties are weakly investigated of most dementia-specific quality of life measurements-a systematic review.

    PubMed

    Dichter, Martin Nikolaus; Schwab, Christian G G; Meyer, Gabriele; Bartholomeyczik, Sabine; Halek, Margareta

    2016-02-01

    For people with dementia, the concept of quality of life (Qol) reflects the disease's impact on the whole person. Thus, Qol is an increasingly used outcome measure in dementia research. This systematic review was performed to identify available dementia-specific Qol measurements and to assess the quality of linguistic validations and reliability studies of these measurements (PROSPERO 2013: CRD42014008725). The MEDLINE, CINAHL, EMBASE, PsycINFO, and Cochrane Methodology Register databases were systematically searched without any date restrictions. Forward and backward citation tracking were performed on the basis of selected articles. A total of 70 articles addressing 19 dementia-specific Qol measurements were identified; nine measurements were adapted to nonorigin countries. The quality of the linguistic validations varied from insufficient to good. Internal consistency was the most frequently tested reliability property. Most of the reliability studies lacked internal validity. Qol measurements for dementia are insufficiently linguistic validated and not well tested for reliability. None of the identified measurements can be recommended without further research. The application of international guidelines and quality criteria is strongly recommended for the performance of linguistic validations and reliability studies of dementia-specific Qol measurements.

  19. Calibration and Validation of a Finite Element Model of THOR-K Anthropomorphic Test Device for Aerospace Safety Applications

    NASA Technical Reports Server (NTRS)

    Putnam, J. B.; Unataroiu, C. D.; Somers, J. T.

    2014-01-01

    The THOR anthropomorphic test device (ATD) has been developed and continuously improved by the National Highway Traffic Safety Administration to provide automotive manufacturers an advanced tool that can be used to assess the injury risk of vehicle occupants in crash tests. Recently, a series of modifications were completed to improve the biofidelity of THOR ATD [1]. The updated THOR Modification Kit (THOR-K) ATD was employed at Wright-Patterson Air Base in 22 impact tests in three configurations: vertical, lateral, and spinal [2]. Although a computational finite element (FE) model of the THOR had been previously developed [3], updates to the model were needed to incorporate the recent changes in the modification kit. The main goal of this study was to develop and validate a FE model of the THOR-K ATD. The CAD drawings of the THOR-K ATD were reviewed and FE models were developed for the updated parts. For example, the head-skin geometry was found to change significantly, so its model was re-meshed (Fig. 1a). A protocol was developed to calibrate each component identified as key to the kinematic and kinetic response of the THOR-K head/neck ATD FE model (Fig. 1b). The available ATD tests were divided in two groups: a) calibration tests where the unknown material parameters of deformable parts (e.g., head skin, pelvis foam) were optimized to match the data and b) validation tests where the model response was only compared with test data by calculating their score using CORrelation and Analysis (CORA) rating system. Finally, the whole ATD model was validated under horizontal-, vertical-, and lateral-loading conditions against data recorded in the Wright Patterson tests [2]. Overall, the final THOR-K ATD model developed in this study is shown to respond similarly to the ATD in all validation tests. This good performance indicates that the optimization performed during calibration by using the CORA score as objective function is not test specific. Therefore confidence is provided in the ATD model for uses in predicting response in test conditions not performed in this study, such as those observed in the spacecraft landing. Comparison studies with ATD and human models may also be performed to contribute to future changes in THOR ATD design in an effort to improve its biofidelity, which has been traditionally based on post-mortem human subject testing and designer experience.

  20. Development and validation of criterion-referenced clinically relevant fitness standards for maintaining physical independence in later years.

    PubMed

    Rikli, Roberta E; Jones, C Jessie

    2013-04-01

    To develop and validate criterion-referenced fitness standards for older adults that predict the level of capacity needed for maintaining physical independence into later life. The proposed standards were developed for use with a previously validated test battery for older adults-the Senior Fitness Test (Rikli, R. E., & Jones, C. J. (2001). Development and validation of a functional fitness test for community-residing older adults. Journal of Aging and Physical Activity, 6, 127-159; Rikli, R. E., & Jones, C. J. (1999a). Senior fitness test manual. Champaign, IL: Human Kinetics.). A criterion measure to assess physical independence was identified. Next, scores from a subset of 2,140 "moderate-functioning" older adults from a larger cross-sectional database, together with findings from longitudinal research on physical capacity and aging, were used as the basis for proposing fitness standards (performance cut points) associated with having the ability to function independently. Validity and reliability analyses were conducted to test the standards for their accuracy and consistency as predictors of physical independence. Performance standards are presented for men and women ages 60-94 indicating the level of fitness associated with remaining physically independent until late in life. Reliability and validity indicators for the standards ranged between .79 and .97. The proposed standards provide easy-to-use, previously unavailable methods for evaluating physical capacity in older adults relative to that associated with physical independence. Most importantly, the standards can be used in planning interventions that target specific areas of weakness, thus reducing risk for premature loss of mobility and independence.

  1. Analytical validation of a psychiatric pharmacogenomic test.

    PubMed

    Jablonski, Michael R; King, Nina; Wang, Yongbao; Winner, Joel G; Watterson, Lucas R; Gunselman, Sandra; Dechairo, Bryan M

    2018-05-01

    The aim of this study was to validate the analytical performance of a combinatorial pharmacogenomics test designed to aid in the appropriate medication selection for neuropsychiatric conditions. Genomic DNA was isolated from buccal swabs. Twelve genes (65 variants/alleles) associated with psychotropic medication metabolism, side effects, and mechanisms of actions were evaluated by bead array, MALDI-TOF mass spectrometry, and/or capillary electrophoresis methods (GeneSight Psychotropic, Assurex Health, Inc.). The combinatorial pharmacogenomics test has a dynamic range of 2.5-20 ng/μl of input genomic DNA, with comparable performance for all assays included in the test. Both the precision and accuracy of the test were >99.9%, with individual gene components between 99.4 and 100%. This study demonstrates that the combinatorial pharmacogenomics test is robust and reproducible, making it suitable for clinical use.

  2. Validity Issues in Assessing Dispositions: The Confirmatory Factor Analysis of a Teacher Dispositions Form

    ERIC Educational Resources Information Center

    Niu, Chunling; Everson, Kimberlee; Dietrich, Sylvia; Zippay, Cassie

    2017-01-01

    Critics against the inclusion of dispositions as part of the teacher education accreditation focus on the dearth of empirical literature on reliably and validly accessing dispositions (Borko, Liston, & Whitcomb, 2007). In this study, a confirmatory factor analysis (CFA) was performed to test the factorial validity of a teacher dispositions…

  3. Performing a Content Validation Study.

    ERIC Educational Resources Information Center

    Spool, Mark D.

    Content validity is concerned with three components: (1) the job content; (2) the test content, and (3) the strength of the relationship between the two. A content validation study, to be considered adequate and defensible should include at least the following four procedures: (1) A thorough and accurate job analysis (to define the job content);…

  4. Validity of the Medical College Admission Test for Predicting MD-PhD Student Outcomes

    ERIC Educational Resources Information Center

    Bills, James L.; VanHouten, Jacob; Grundy, Michelle M.; Chalkley, Roger; Dermody, Terence S.

    2016-01-01

    The Medical College Admission Test (MCAT) is a quantitative metric used by MD and MD-PhD programs to evaluate applicants for admission. This study assessed the validity of the MCAT in predicting training performance measures and career outcomes for MD-PhD students at a single institution. The study population consisted of 153 graduates of the…

  5. Validity Evidence for ACT Compass® Placement Tests. ACT Research Report Series 2014 (2)

    ERIC Educational Resources Information Center

    Westrick, Paul A.; Allen, Jeff

    2014-01-01

    We examined the validity of using Compass® test scores and high school grade point average (GPA) for placing students in first-year college courses and for identifying students at risk of not succeeding. Consistent with other research, the combination of high school GPA and Compass scores performed better than either measure used alone. Results…

  6. An Investigation into the Validity of the TOEFL iBT Speaking Test for International Teaching Assistant Certification

    ERIC Educational Resources Information Center

    Farnsworth, Timothy L.

    2013-01-01

    This study examined the construct validity of the TOEFL iBT Speaking subsection for the purposes of international teaching assistant (ITA) certification, a purpose for which it was not specifically designed. The factor structure of the new TOEFL was compared with that of another language performance test in use at a major American research…

  7. Testing a Multi-Stage Screening System: Predicting Performance on Australia's National Achievement Test Using Teachers' Ratings of Academic and Social Behaviors

    ERIC Educational Resources Information Center

    Kettler, Ryan J.; Elliott, Stephen N.; Davies, Michael; Griffin, Patrick

    2012-01-01

    This study addresses the predictive validity of results from a screening system of academic enablers, with a sample of Australian elementary school students, when the criterion variable is end-of-year achievement. The investigation included (a) comparing the predictive validity of a brief criterion-referenced nomination system with more…

  8. Cross-cultural adaptation, reliability and validity of the Turkish version of the Hospital for Special Surgery (HSS) Knee Score.

    PubMed

    Narin, Selnur; Unver, Bayram; Bakırhan, Serkan; Bozan, Ozgür; Karatosun, Vasfi

    2014-01-01

    The purpose of this study was to adapt the English version of the Hospital for Special Surgery (HSS) knee score for use in a Turkish population and to evaluate its validity, reliability and cultural adaptation. Standard forward-back translation of the HSS knee score was performed and the Turkish version was applied in 73 patients. The Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC), Mini-Mental State Examination and sit-to-stand test were also performed and analyzed. Internal consistency reliability was tested using Cronbach's alpha. The intraclass correlation coefficient (ICC) was used to calculate the test-retest reliability at one-week intervals. Validity was assessed by calculating the Pearson correlation between the HSS, WOMAC and sit-to-stand test scores. The ICC ranged from 0.98 to 0.99 with high internal consistency (Cronbach's alpha: 0.87). The WOMAC score correlated with total HSS score (r: -0.80, p<0.001) and sit-to-stand score (r: 0.12, p: 0.312). The Turkish version of the HSS knee score is reliable and valid in evaluating the total knee arthroplasty in Turkish patients.
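
    The internal-consistency statistic named in this record, Cronbach's alpha, can be sketched in a few lines. This is a minimal illustration of the standard formula and not the authors' analysis code; the function name cronbach_alpha and the sample scores are hypothetical and are not data from the study.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a questionnaire.

    items: list of per-item score lists; items[j][i] is the score of
    respondent i on item j (hypothetical layout chosen for this sketch).
    """
    k = len(items)
    if k < 2:
        raise ValueError("need at least two items")
    n = len(items[0])
    if any(len(item) != n for item in items):
        raise ValueError("every item needs one score per respondent")
    # Total score of each respondent across all items.
    totals = [sum(item[i] for item in items) for i in range(n)]
    # Sum of the item variances versus the variance of the total score.
    item_var_sum = sum(variance(item) for item in items)
    total_var = variance(totals)
    return k / (k - 1) * (1 - item_var_sum / total_var)


if __name__ == "__main__":
    # Made-up scores for three items answered by five respondents.
    scores = [
        [3, 4, 2, 5, 4],
        [2, 4, 3, 5, 3],
        [3, 5, 2, 4, 4],
    ]
    print(round(cronbach_alpha(scores), 3))
```

    Sample variances are used here; because the same divisor appears in the numerator and denominator of the ratio, population variances would give the same alpha.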

  9. Simulations of carbon fiber composite delamination tests

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kay, G

    2007-10-25

    Simulations of mode I interlaminar fracture toughness tests of a carbon-reinforced composite material (BMS 8-212) were conducted with LSDYNA. The fracture toughness tests were performed by U.C. Berkeley. The simulations were performed to investigate the validity and practicality of employing decohesive elements to represent interlaminar bond failures that are prevalent in carbon-fiber composite structure penetration events. The simulations employed a decohesive element formulation that was verified on a simple two element model before being employed to perform the full model simulations. Care was required during the simulations to ensure that the explicit time integration of LSDYNA duplicate the near steady-state testing conditions. In general, this study validated the use of employing decohesive elements to represent the interlaminar bond failures seen in carbon-fiber composite structures, but the practicality of employing the elements to represent the bond failures seen in carbon-fiber composite structures during penetration events was not established.

  10. Validation of EncephalApp, Smartphone-based Stroop Test, for the Diagnosis of Covert Hepatic Encephalopathy

    PubMed Central

    Bajaj, Jasmohan S; Heuman, Douglas M; Sterling, Richard K; Sanyal, Arun J; Siddiqui, Muhammad; Matherly, Scott; Luketic, Velimir; Stravitz, R Todd; Fuchs, Michael; Thacker, Leroy R; Gilles, HoChong; White, Melanie B; Unser, Ariel; Hovermale, James; Gavis, Edith; Noble, Nicole A; Wade, James B

    2014-01-01

    Background & Aims Detection of covert hepatic encephalopathy (CHE) is difficult but point of care testing could increase rates of diagnosis. We aimed to validate the ability of the smartphone app EncephalApp, a streamlined version of Stroop App, to detect CHE. We evaluated face validity, test–retest reliability, and external validity. Methods Patients with cirrhosis (n=167; 38% with overt HE [OHE]; mean age, 55 years; mean model for end-stage liver disease score, 12) and controls (n=114) were each given a paper and pencil cognitive battery (standard) along with EncephalApp. EncephalApp has Off and On states; results measured were: OffTime, OnTime, OffTime+OnTime, and number of runs required to complete 5 off and on runs. Thirty-six patients with cirrhosis underwent driving simulation tests, and EncephalApp results were correlated with results. Test–retest reliability was analyzed in a subgroup of patients. The test was performed before and after transjugular intra-hepatic portosystemic shunt placement, before and after correction for hyponatremia, to determine external validity. Results All patients with cirrhosis performed worse on paper and pencil and EncephalApp tests than controls. Patients with cirrhosis and OHE performed worse than those without OHE. Age-dependent EncephalApp cut-offs (younger or older than 45 years) were set. An OffTime+OnTime value of >190 seconds identified all patients with CHE with an area under the receiver operator characteristic (AUROC) value of 0.91; the AUROC value was 0.88 for diagnosis of CHE in those without OHE. EncephalApp times correlated with crashes and illegal turns in driving simulation tests. Test–retest reliability was high (intra-class coefficient, 0.83) among 30 patients retested 1–3 months apart. OffTime+OnTime increased significantly (206 vs 255, P=.007) among 10 patients retested 33±7 days after transjugular intra-hepatic portosystemic shunt placement. 
OffTime+OnTime decreased significantly (242 vs 225, P=.03) in 7 patients tested before and after correction for hyponatremia (126±3 to 132±4 meq/L, P=.01), 10±5 days apart. Conclusions A smartphone app called EncephalApp has good face validity, test–retest reliability, and external validity for the diagnosis of CHE. PMID:24846278
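
    The AUROC and cut-off figures reported above can be illustrated with a small sketch. It computes the AUROC as the probability that a randomly chosen patient has a longer time than a randomly chosen control (the rank-based definition), plus sensitivity and specificity at a fixed cut-off. The function names and the sample times are hypothetical; only the 190-second cut-off is taken from the abstract, and the study's own software is not stated there.

```python
def auroc(positives, negatives):
    """Rank-based AUROC: share of (positive, negative) pairs in which the
    positive case has the higher value; ties count as half a win."""
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def sensitivity_specificity(positives, negatives, cutoff):
    """Classify a value above the cut-off as test-positive."""
    sensitivity = sum(p > cutoff for p in positives) / len(positives)
    specificity = sum(n <= cutoff for n in negatives) / len(negatives)
    return sensitivity, specificity


if __name__ == "__main__":
    # Made-up OffTime+OnTime values in seconds, not data from the study.
    with_che = [200, 210, 250]
    without_che = [150, 180, 205]
    print(auroc(with_che, without_che))
    print(sensitivity_specificity(with_che, without_che, cutoff=190))
```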

  11. Using artificial intelligence for automating testing of a resident space object collision avoidance system on an orbital spacecraft

    NASA Astrophysics Data System (ADS)

    Straub, Jeremy

    2014-06-01

    Resident space objects (RSOs) pose a significant threat to orbital assets. Due to high relative velocities, even a small RSO can cause significant damage to an object that it strikes. Worse, in many cases a collision may create numerous additional RSOs, if the impacted object shatters apart. These new RSOs will have heterogeneous mass, size and orbital characteristics. Collision avoidance systems (CASs) are used to maneuver spacecraft out of the path of RSOs to prevent these impacts. A RSO CAS must be validated to ensure that it is able to perform effectively given a virtually unlimited number of strike scenarios. This paper presents work on the creation of a testing environment and AI testing routine that can be utilized to perform verification and validation activities for cyber-physical systems. It reviews prior work on automated and autonomous testing. Comparative performance (relative to the performance of a human tester) is discussed.

  12. Assessing the driving performance of older adult drivers: on-road versus simulated driving.

    PubMed

    Lee, Hoe C; Cameron, Don; Lee, Andy H

    2003-09-01

    To validate a laboratory-based driving simulator in measuring on-road driving performance, 129 older adult drivers were assessed with both the simulator and an on-road test. The driving performance of the participants was gauged by appropriate and reliable age-specific assessment criteria, which were found to be negatively correlated with age. Using principal component analysis, two performance indices were developed from the criteria to represent the overall performance in simulated driving and the on-road assessment. There was significant positive association between the two indices, with the simulated driving performance index explaining over two-thirds of the variability of the on-road driving performance index, after adjustment for age and gender of the drivers. The results supported the validity of the driving simulator and it is a safer and more economical method than the on-road testing to assess the driving performance of older adult drivers.

  13. A Retrospective Performance Assessment of the Developmental Neurotoxicity Study in Support of OECD Test Guideline 426

    PubMed Central

    Makris, Susan L.; Raffaele, Kathleen; Allen, Sandra; Bowers, Wayne J.; Hass, Ulla; Alleva, Enrico; Calamandrei, Gemma; Sheets, Larry; Amcoff, Patric; Delrue, Nathalie; Crofton, Kevin M.

    2009-01-01

    Objective We conducted a review of the history and performance of developmental neurotoxicity (DNT) testing in support of the finalization and implementation of Organisation of Economic Co-operation and Development (OECD) DNT test guideline 426 (TG 426). Information sources and analysis In this review we summarize extensive scientific efforts that form the foundation for this testing paradigm, including basic neurotoxicology research, interlaboratory collaborative studies, expert workshops, and validation studies, and we address the relevance, applicability, and use of the DNT study in risk assessment. Conclusions The OECD DNT guideline represents the best available science for assessing the potential for DNT in human health risk assessment, and data generated with this protocol are relevant and reliable for the assessment of these end points. The test methods used have been subjected to an extensive history of international validation, peer review, and evaluation, which is contained in the public record. The reproducibility, reliability, and sensitivity of these methods have been demonstrated, using a wide variety of test substances, in accordance with OECD guidance on the validation and international acceptance of new or updated test methods for hazard characterization. Multiple independent, expert scientific peer reviews affirm these conclusions. PMID:19165382

  14. Analysis of Flowfields over Four-Engine DC-X Rockets

    NASA Technical Reports Server (NTRS)

    Wang, Ten-See; Cornelison, Joni

    1996-01-01

    The objective of this study is to validate a computational methodology for the aerodynamic performance of an advanced conical launch vehicle configuration. The computational methodology is based on a three-dimensional, viscous flow, pressure-based computational fluid dynamics formulation. Both wind-tunnel and ascent flight-test data are used for validation. Emphasis is placed on multiple-engine power-on effects. Computational characterization of the base drag in the critical subsonic regime is the focus of the validation effort; until recently, almost no multiple-engine data existed for a conical launch vehicle configuration. Parametric studies using high-order difference schemes are performed for the cold-flow tests, whereas grid studies are conducted for the flight tests. The computed vehicle axial force coefficients, forebody, aftbody, and base surface pressures compare favorably with those of tests. The results demonstrate that with adequate grid density and proper distribution, a high-order difference scheme, finite rate afterburning kinetics to model the plume chemistry, and a suitable turbulence model to describe separated flows, plume/air mixing, and boundary layers, computational fluid dynamics is a tool that can be used to predict the low-speed aerodynamic performance for rocket design and operations.

  15. Assessment of Technical Skills in Young Soccer Goalkeepers: Reliability and Validity of Two Goalkeeper-Specific Tests.

    PubMed

    Rebelo-Gonçalves, Ricardo; Figueiredo, António J; Coelho-E-Silva, Manuel J; Tessitore, Antonio

    2016-09-01

    The purpose of this study was to evaluate the reproducibility and validity of two new tests designed to examine goalkeeper-specific technique. Twenty-six goalkeepers (14.49 ± 2.52 years old) completed two trial sessions, each separated by one week, to evaluate the reproducibility of the Sprint-Keeper Test (S-Keeper) and the Lateral Shuffle-Keeper Test (LS-Keeper). Construct validity was assessed among forty goalkeepers (14.49 ± 1.71 years old) by competitive level (elite versus non-elite), after controlling for chronological age. All participants were examined in vertical jump (CMJ and CMJ-free arms), acceleration (5-m and 10-m sprint) and goalkeeper-specific technique. The S-Keeper requires the goalkeeper to accelerate during 3 m and dive over a stationary ball after performing a change of direction in a total distance of 10 m. The LS-Keeper involves three changes of direction and a diving save over a stationary ball, in a total distance of 12.55 m. Performance was respectively measured as total time for the right and left sides in each protocol. Bivariate correlations between repeated measures were high and significant (r = 0.835 - 0.912). Test-retest results for the S-Keeper and LS-Keeper showed good reliability (reliability coefficients > 0.88, intra-class correlation coefficient > 0.908 and coefficients of variation < 4.37%), even though participants tended to improve performance when diving to their right side (p < 0.05). Both tests were able to detect significant differences between elite and non-elite goalkeepers, particularly to the left side (p < 0.05). These findings suggest that the S-Keeper and LS-Keeper are reliable and valid tests for assessing goalkeeper-specific technique. Both protocols can be used as a practical tool to provide relevant information about the influence of several components of performance in the overall execution of a diving save, particularly movement patterns, take-off movements and possible asymmetries.

  16. Development of Flight-Test Performance Estimation Techniques for Small Unmanned Aerial Systems

    NASA Astrophysics Data System (ADS)

    McCrink, Matthew Henry

    This dissertation provides a flight-testing framework for assessing the performance of fixed-wing, small-scale unmanned aerial systems (sUAS) by leveraging sub-system models of components unique to these vehicles. The development of the sub-system models, and their links to broader impacts on sUAS performance, is the key contribution of this work. The sub-system modeling and analysis focuses on the vehicle's propulsion, navigation and guidance, and airframe components. Quantification of the uncertainty in the vehicle's power available and control states is essential for assessing the validity of both the methods and results obtained from flight-tests. Therefore, detailed propulsion and navigation system analyses are presented to validate the flight testing methodology. Propulsion system analysis required the development of an analytic model of the propeller in order to predict the power available over a range of flight conditions. The model is based on the blade element momentum (BEM) method. Additional corrections are added to the basic model in order to capture the Reynolds-dependent scale effects unique to sUAS. The model was experimentally validated using a ground based testing apparatus. The BEM predictions and experimental analysis allow for a parameterized model relating the electrical power, measurable during flight, to the power available required for vehicle performance analysis. Navigation system details are presented with a specific focus on the sensors used for state estimation, and the resulting uncertainty in vehicle state. Uncertainty quantification is provided by detailed calibration techniques validated using quasi-static and hardware-in-the-loop (HIL) ground based testing. The HIL methods introduced use a soft real-time flight simulator to provide inertial quality data for assessing overall system performance. 
Using this tool, the uncertainty in vehicle state estimation based on a range of sensors, and vehicle operational environments is presented. The propulsion and navigation system models are used to evaluate flight-testing methods for evaluating fixed-wing sUAS performance. A brief airframe analysis is presented to provide a foundation for assessing the efficacy of the flight-test methods. The flight-testing presented in this work is focused on validating the aircraft drag polar, zero-lift drag coefficient, and span efficiency factor. Three methods are detailed and evaluated for estimating these design parameters. Specific focus is placed on the influence of propulsion and navigation system uncertainty on the resulting performance data. Performance estimates are used in conjunction with the propulsion model to estimate the impact sensor and measurement uncertainty on the endurance and range of a fixed-wing sUAS. Endurance and range results for a simplistic power available model are compared to the Reynolds-dependent model presented in this work. Additional parameter sensitivity analysis related to state estimation uncertainties encountered in flight-testing are presented. Results from these analyses indicate that the sub-system models introduced in this work are of first-order importance, on the order of 5-10% change in range and endurance, in assessing the performance of a fixed-wing sUAS.

  17. Development of a specific anaerobic field test for aerobic gymnastics.

    PubMed

    Alves, Christiano Robles Rodrigues; Borelli, Marcello Tadeu Caetano; Paineli, Vitor de Salles; Azevedo, Rafael de Almeida; Borelli, Claudia Cristine Gomes; Lancha Junior, Antônio Herbert; Gualano, Bruno; Artioli, Guilherme Giannini

    2015-01-01

    The current investigation aimed to develop a valid specific field test to evaluate anaerobic physical performance in Aerobic Gymnastics athletes. We first designed the Specific Aerobic Gymnast Anaerobic Test (SAGAT), which included gymnastics-specific elements performed in maximal repeated sprint fashion, with a total duration of 80-90 s. In order to validate the SAGAT, three independent sub-studies were performed to evaluate the concurrent validity (Study I, n=8), the reliability (Study II, n=10) and the sensitivity (Study III, n=30) of the test in elite female athletes. In Study I, a positive correlation was shown between lower-body Wingate test and SAGAT performance (Mean power: p = 0.03, r = -0.69, CI: -0.94 to 0.03 and Peak power: p = 0.02, r = -0.72, CI: -0.95 to -0.04) and between upper-body Wingate test and SAGAT performance (Mean power: p = 0.03, r = -0.67, CI: -0.94 to 0.02 and Peak power: p = 0.03, r = -0.69, CI: -0.94 to 0.03). Additionally, plasma lactate was similarly increased in response to SAGAT (p = 0.002), lower-body Wingate Test (p = 0.021) and a simulated competition (p = 0.007). In Study II, no differences were found between the time to complete the SAGAT in repeated trials (p = 0.84; Cohen's d effect size = 0.09; ICC = 0.97, CI: 0.89 to 0.99; MDC95 = 0.12 s). Finally, in Study III the time to complete the SAGAT was significantly lower during the competition cycle when compared to the period before the preparatory cycle (p < 0.001), showing an improvement in SAGAT performance after a specific Aerobic Gymnastics training period. Taken together, these data have demonstrated that SAGAT is a specific, reliable and sensitive measurement of specific anaerobic performance in elite female Aerobic Gymnastics, presenting great potential to be largely applied in training settings.
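
    The MDC95 value reported in Study II is conventionally derived from the ICC through the standard error of measurement: SEM = SD * sqrt(1 - ICC) and MDC95 = 1.96 * sqrt(2) * SEM. The sketch below applies that common formula; whether the authors used exactly this computation is an assumption, and the standard deviation in the example is hypothetical because the abstract does not report it.

```python
import math


def sem_from_icc(sd, icc):
    """Standard error of measurement from the between-subject SD and the ICC."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("ICC must lie between 0 and 1")
    return sd * math.sqrt(1.0 - icc)


def mdc95(sd, icc):
    """Minimal detectable change at the 95% confidence level."""
    # sqrt(2) accounts for measurement error in both of the two trials.
    return 1.96 * math.sqrt(2.0) * sem_from_icc(sd, icc)


if __name__ == "__main__":
    # ICC = 0.97 is from the abstract; the SD of 0.5 s is a made-up value.
    print(round(mdc95(sd=0.5, icc=0.97), 3))
```

    A change in test time smaller than the MDC95 cannot be distinguished from measurement error at the 95% level, which is why the statistic is reported alongside the ICC.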

  18. Development of a Specific Anaerobic Field Test for Aerobic Gymnastics

    PubMed Central

    Paineli, Vitor de Salles; Azevedo, Rafael de Almeida; Borelli, Claudia Cristine Gomes; Lancha Junior, Antônio Herbert; Gualano, Bruno; Artioli, Guilherme Giannini

    2015-01-01

    The current investigation aimed to develop a valid specific field test to evaluate anaerobic physical performance in Aerobic Gymnastics athletes. We first designed the Specific Aerobic Gymnast Anaerobic Test (SAGAT), which included gymnastics-specific elements performed in maximal repeated sprint fashion, with a total duration of 80-90 s. In order to validate the SAGAT, three independent sub-studies were performed to evaluate the concurrent validity (Study I, n=8), the reliability (Study II, n=10) and the sensitivity (Study III, n=30) of the test in elite female athletes. In Study I, a positive correlation was shown between lower-body Wingate test and SAGAT performance (Mean power: p = 0.03, r = -0.69, CI: -0.94 to 0.03 and Peak power: p = 0.02, r = -0.72, CI: -0.95 to -0.04) and between upper-body Wingate test and SAGAT performance (Mean power: p = 0.03, r = -0.67, CI: -0.94 to 0.02 and Peak power: p = 0.03, r = -0.69, CI: -0.94 to 0.03). Additionally, plasma lactate was similarly increased in response to SAGAT (p = 0.002), lower-body Wingate Test (p = 0.021) and a simulated competition (p = 0.007). In Study II, no differences were found between the time to complete the SAGAT in repeated trials (p = 0.84; Cohen’s d effect size = 0.09; ICC = 0.97, CI: 0.89 to 0.99; MDC95 = 0.12 s). Finally, in Study III the time to complete the SAGAT was significantly lower during the competition cycle when compared to the period before the preparatory cycle (p < 0.001), showing an improvement in SAGAT performance after a specific Aerobic Gymnastics training period. Taken together, these data have demonstrated that SAGAT is a specific, reliable and sensitive measurement of specific anaerobic performance in elite female Aerobic Gymnastics, presenting great potential to be largely applied in training settings. PMID:25876039

  19. Using Modeling and Simulation to Complement Testing for Increased Understanding of Weapon Subassembly Response.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wong, Michael K.; Davidson, Megan

    As part of Sandia’s nuclear deterrence mission, the B61-12 Life Extension Program (LEP) aims to modernize the aging weapon system. Modernization requires requalification and Sandia is using high performance computing to perform advanced computational simulations to better understand, evaluate, and verify weapon system performance in conjunction with limited physical testing. The Nose Bomb Subassembly (NBSA) of the B61-12 is responsible for producing a fuzing signal upon ground impact. The fuzing signal is dependent upon electromechanical impact sensors producing valid electrical fuzing signals at impact. Computer generated models were used to assess the timing between the impact sensor’s response to the deceleration of impact and damage to major components and system subassemblies. The modeling and simulation team worked alongside the physical test team to design a large-scale reverse ballistic test to not only assess system performance, but to also validate their computational models. The reverse ballistic test conducted at Sandia’s sled test facility sent a rocket sled with a representative target into a stationary B61-12 (NBSA) to characterize the nose crush and functional response of NBSA components. Data obtained from data recorders and high-speed photometrics were integrated with previously generated computer models in order to refine and validate the model’s ability to reliably simulate real-world effects. Large-scale tests are impractical to conduct for every single impact scenario. By creating reliable computer models, we can perform simulations that identify trends and produce estimates of outcomes over the entire range of required impact conditions. Sandia’s HPCs enable geometric resolution that was unachievable before, allowing for more fidelity and detail, and creating simulations that can provide insight to support evaluation of requirements and performance margins. 
As computing resources continue to improve, researchers at Sandia are hoping to improve these simulations so they provide increasingly credible analysis of the system response and performance over the full range of conditions.

  20. Assessing the validity of surface electromyography for recording muscle activation patterns from serratus anterior.

    PubMed

    Hackett, Lucien; Reed, Darren; Halaki, Mark; Ginn, Karen A

    2014-04-01

    No direct evidence exists to support the validity of using surface electrodes to record muscle activity from serratus anterior, an important and commonly investigated shoulder muscle. The aims of this study were to determine the validity of examining muscle activation patterns in serratus anterior using surface electromyography and to determine whether intramuscular electromyography is representative of serratus anterior muscle activity. Seven asymptomatic subjects performed dynamic and isometric shoulder flexion, extension, abduction, adduction and dynamic bench press plus tests. Surface electrodes were placed over serratus anterior and around intramuscular electrodes in serratus anterior. Load was ramped during isometric tests from 0% to 100% maximum load and dynamic tests were performed at 70% maximum load. EMG signals were normalised using five standard maximum voluntary contraction tests. Surface electrodes significantly underestimated serratus anterior muscle activity compared with the intramuscular electrodes during dynamic flexion, dynamic abduction, isometric flexion, isometric abduction and bench press plus tests. All other test conditions showed no significant differences including the flexion normalisation test where maximum activation was recorded from both electrode types. Low correlation between signals was recorded using surface and intramuscular electrodes during concentric phases of dynamic abduction and flexion. It is not valid to use surface electromyography to assess muscle activation levels in serratus anterior during isometric exercises where the electrodes are not placed at the angle of testing and dynamic exercises. Intramuscular electrodes are as representative of the serratus anterior muscle activity as surface electrodes. Copyright © 2014 Elsevier Ltd. All rights reserved.

  1. Construct Validation of Physical Activity Surveys in Culturally Diverse Older Adults: A Comparison of Four Commonly Used Questionnaires

    ERIC Educational Resources Information Center

    Moore, Delilah S.; Ellis, Rebecca; Allen, Priscilla D.; Cherry, Katie E.; Monroe, Pamela A.; O'Neil, Carol E.; Wood, Robert H.

    2008-01-01

    The purpose of this study was to establish validity evidence of four physical activity (PA) questionnaires in culturally diverse older adults by comparing self-report PA with performance-based physical function. Participants were 54 older adults who completed the Continuous Scale Physical Functional Performance 10-item Test (CS-PFP10), Physical…

  2. Propfan test assessment testbed aircraft stability and control/performance 1/9-scale wind tunnel tests

    NASA Technical Reports Server (NTRS)

    Little, B. H., Jr.; Tomlin, K. H.; Aljabri, A. S.; Mason, C. A.

    1988-01-01

    One-ninth scale wind tunnel model tests of the Propfan Test Assessment (PTA) aircraft were performed in three different NASA facilities. Wing and propfan nacelle static pressures, model forces and moments, and flow field at the propfan plane were measured in these tests. Tests started in June 1985 and were completed in January 1987. These data were needed to assure PTA safety of flight, predict PTA performance, and validate analytical codes that will be used to predict flow fields in which the propfan will operate.

  3. Reliability, sensitivity and validity of the assistant referee intermittent endurance test (ARIET) - a modified Yo-Yo IE2 test for elite soccer assistant referees.

    PubMed

    Castagna, Carlo; Bendiksen, Mads; Impellizzeri, Franco M; Krustrup, Peter

    2012-01-01

    We examined the reliability and validity of the assistant referee intermittent endurance test (ARIET), a modified Yo-Yo IE2 test including shuttles of sideways running. The ARIET was carried out on 198 Italian (Serie A-B, Lega-Pro and National Level) and 47 Danish elite soccer assistant referees. Reproducibility was tested for 41 assistant referees on four occasions each separated by one week. The ARIET intraclass correlation coefficients and typical error of measurement ranged from 0.96 to 0.99 and 3.1 to 5.7%, respectively. ARIET performance for Serie A and B was 23 and 25% greater than in Lega-Pro (P < 0.001). The lowest cut-off value derived from receiver operating characteristic analysis discriminating Serie A-B from Lega-Pro was 1300 m. The ARIET performance was significantly correlated with VO(2max) (r = 0.78, P < 0.001), %HR(max) after 4 min of ARIET (r = -0.81, P < 0.001) and Yo-Yo IR1 performance (r = 0.95, P < 0.001), but not sprint performance (r = -0.15; P = 0.58). The results showed that ARIET is a reproducible and valid test that is able to discriminate between assistant referees of different competitive levels. The lack of correlation with sprinting ability and close correlations with aerobic power, intermittent shuttle running and sub-maximal ARIET heart rate loading provide evidence that ARIET is a relevant test for assessment of intermittent endurance capacity of soccer assistant referees.

  4. The role of objective cognitive dysfunction in subjective cognitive complaints after stroke.

    PubMed

    van Rijsbergen, M W A; Mark, R E; Kop, W J; de Kort, P L M; Sitskoorn, M M

    2017-03-01

    Objective cognitive performance (OCP) is often impaired in patients post-stroke but the consequences of OCP for patient-reported subjective cognitive complaints (SCC) are poorly understood. We performed a detailed analysis on the association between post-stroke OCP and SCC. Assessments of OCP and SCC were obtained in 208 patients 3 months after stroke. OCP was evaluated using conventional and ecologically valid neuropsychological tests. Levels of SCC were measured using the CheckList for Cognitive and Emotional (CLCE) consequences following stroke inventory. Multivariate hierarchical regression analyses were used to evaluate the association of OCP with CLCE scores adjusting for age, sex and intelligence quotient. Analyses were performed to examine the global extent of OCP dysfunction (based on the total number of impaired neuropsychological tests, i.e. objective cognitive impairment index) and for each OCP test separately using the raw neuropsychological (sub)test scores. The objective cognitive impairment index for global OCP was positively correlated with the CLCE score (Spearman's rho = 0.22, P = 0.003), which remained significant in multivariate adjusted models (β = 0.25, P = 0.01). Results for the separate neuropsychological tests indicated that only one task (the ecologically valid Rivermead Behavioural Memory Test) was independently associated with the CLCE in multivariate adjusted models (β = -0.34, P < 0.001). Objective neuropsychological test performance, as measured by the global dysfunction index or an ecologically valid memory task, was associated with SCC. These data suggest that cumulative deficits in multiple cognitive domains contribute to subjectively experienced poor cognitive abilities in daily life in patients post-stroke. © 2016 EAN.

  5. Reliability and Validity of the Floor Transfer Test as a Measure of Readiness for Independent Living Among Older Adults.

    PubMed

    Ardali, Gunay; Brody, Lori T; States, Rebecca A; Godwin, Ellen M

    2017-10-20

    The ability to get up from the floor after a fall is a basic skill required for functional independence. Consequently, the inability to safely get down and up from the floor or to perform a floor transfer (FT) may indicate decreased mobility and/or increased frailty. A reliable and valid test of FT ability is a critical part of the clinical decision-making process. The FT test is a simple, performance-based test that can be administered quickly and easily to determine a patient's ability to safely and successfully get down and up from the floor using any movement strategy and without time restriction. The primary purpose of this cross-sectional study was to determine the intrarater reliability and validity of the FT test as a practical alternative to several widely used yet time-consuming measures of physical disability, frailty, and functional mobility. A total of 61 community-dwelling older adults (65-96 years of age) participated in the study, divided into 2 separate subsamples: 15 in the intrarater reliability part and the other 46 in the concurrent validity part. In both subsamples, the participants were stratified on the basis of the self-reported levels of FT ability as independent, assisted, and dependent. Intrarater reliability was assessed on 2 separate occasions and scores were analyzed by intraclass correlation coefficient and κ statistics. Concurrent validity of the FT test was assessed against the self-reported FT ability questionnaire, Physical Functioning Scale, Phenotype of Physical Frailty, and the Short Physical Performance Battery. Known-groups validity was tested by determining whether the FT test distinguished between (1) community-dwelling older adults with physical disabilities versus those without physical disabilities; and (2) community-dwelling older adults who were functionally dependent versus those who were independent. 
Participants were also categorized on the basis of FT test outcome as independent, assisted, or dependent. The Spearman correlation coefficients were calculated to examine the strength of the relationships between the FT test and physical status measures. The Kruskal-Wallis test was used to determine whether the FT test significantly discriminated between groups as categorized by the Physical Functioning Scale and Short Physical Performance Battery, and to examine the significance level of the sociodemographic data across the 3 FT test outcome groups. The intrarater reliabilities of the measures were good (0.73-1.00). There were statistically positive and strong correlations between the FT test and all physical status measures (ρ ranged from 0.86 to 0.93, P < .001). Older adults who passed the FT test were collectively categorized as those without physical disabilities and functionally independent, whereas older adults who failed the FT test were categorized as those with physical disabilities and functionally dependent (P < .001). The FT test is a reliable and valid measure for screening for physical disability, frailty, and functional mobility. It can determine which older adults have physical disabilities and/or functional dependence and hence may be useful in assessing readiness for independent living. Inclusion of the FT test at initial evaluation may reveal the presence of these conditions and address the safety of older adults in the community.
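
    The Spearman correlations reported in this abstract are rank-based, which suits an ordinal outcome such as independent/assisted/dependent. The following is a minimal sketch of how such a coefficient can be computed, not the authors' analysis: the function names and the data are hypothetical, and ties receive average ranks.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i..j, converted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical data (not from the study):
# FT outcome coded 0 = dependent, 1 = assisted, 2 = independent.
ft_outcome = [0, 0, 1, 1, 2, 2, 2, 1]
physical_score = [2, 4, 6, 7, 10, 11, 12, 5]
rho = spearman_rho(ft_outcome, physical_score)
print(f"rho = {rho:.2f}")
```

    A library routine such as scipy.stats.spearmanr would normally be used instead; the hand-rolled version is shown only to make the ranking step visible.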

  6. Validity of the Worth 4 Dot Test in Patients with Red-Green Color Vision Defect.

    PubMed

    Bak, Eunoo; Yang, Hee Kyung; Hwang, Jeong-Min

    2017-05-01

    The Worth four dot test uses red and green glasses for binocular dissociation, and although it has been believed that patients with red-green color vision defects cannot accurately perform the Worth four dot test, this has not been validated. Therefore, the purpose of this study was to demonstrate the validity of the Worth four dot test in patients with congenital red-green color vision defects who have normal or abnormal binocular vision. A retrospective review of medical records was performed on 30 consecutive congenital red-green color vision defect patients who underwent the Worth four dot test. The type of color vision anomaly was determined by the Hardy Rand and Rittler (HRR) pseudoisochromatic plate test, Ishihara color test, anomaloscope, and/or the 100 hue test. All patients underwent a complete ophthalmologic examination. Binocular sensory status was evaluated with the Worth four dot test and Randot stereotest. The results were interpreted according to the presence of strabismus or amblyopia. Among the 30 patients, 24 had normal visual acuity without strabismus or amblyopia and 6 patients had strabismus and/or amblyopia. The 24 patients without strabismus or amblyopia all showed binocular fusional responses by seeing four dots of the Worth four dot test. Meanwhile, the six patients with strabismus or amblyopia showed various results of fusion, suppression, and diplopia. Congenital red-green color vision defect patients of different types and variable degree of binocularity could successfully perform the Worth four dot test. They showed reliable results that were in accordance with their estimated binocular sensory status.

  7. Development and testing of the cancer multidisciplinary team meeting observational tool (MDT-MOT)

    PubMed Central

    Harris, Jenny; Taylor, Cath; Sevdalis, Nick; Jalil, Rozh; Green, James S.A.

    2016-01-01

    Objective: To develop a tool for independent observational assessment of cancer multidisciplinary team meetings (MDMs), and test criterion validity, inter-rater reliability/agreement and describe performance. Design: Clinicians and experts in teamwork used a mixed-methods approach to develop and refine the tool. Study 1 observers rated pre-determined optimal/sub-optimal MDM film excerpts and Study 2 observers independently rated video-recordings of 10 MDMs. Setting: Study 2 included 10 cancer MDMs in England. Participants: Testing was undertaken by 13 health service staff and a clinical and non-clinical observer. Intervention: None. Main Outcome Measures: Tool development, validity, reliability/agreement and variability in MDT performance. Results: Study 1: Observers were able to discriminate between optimal and sub-optimal MDM performance (P ≤ 0.05). Study 2: Inter-rater reliability was good for 3/10 domains. Percentage of absolute agreement was high (≥80%) for 4/10 domains and percentage agreement within 1 point was high for 9/10 domains. Four MDTs performed well (scored 3+ in at least 8/10 domains), 5 MDTs performed well in 6–7 domains and 1 MDT performed well in only 4 domains. Leadership and chairing of the meeting, the organization and administration of the meeting, and clinical decision-making processes all varied significantly between MDMs (P ≤ 0.01). Conclusions: MDT-MOT demonstrated good criterion validity. Agreement between clinical and non-clinical observers (within one point on the scale) was high but this was inconsistent with reliability coefficients and warrants further investigation. If further validated MDT-MOT might provide a useful mechanism for the routine assessment of MDMs by the local workforce to drive improvements in MDT performance. PMID:27084499

  8. Development and testing of the cancer multidisciplinary team meeting observational tool (MDT-MOT).

    PubMed

    Harris, Jenny; Taylor, Cath; Sevdalis, Nick; Jalil, Rozh; Green, James S A

    2016-06-01

    To develop a tool for independent observational assessment of cancer multidisciplinary team meetings (MDMs), and test criterion validity, inter-rater reliability/agreement and describe performance. Clinicians and experts in teamwork used a mixed-methods approach to develop and refine the tool. Study 1 observers rated pre-determined optimal/sub-optimal MDM film excerpts and Study 2 observers independently rated video-recordings of 10 MDMs. Study 2 included 10 cancer MDMs in England. Testing was undertaken by 13 health service staff and a clinical and non-clinical observer. None. Tool development, validity, reliability/agreement and variability in MDT performance. Study 1: Observers were able to discriminate between optimal and sub-optimal MDM performance (P ≤ 0.05). Study 2: Inter-rater reliability was good for 3/10 domains. Percentage of absolute agreement was high (≥80%) for 4/10 domains and percentage agreement within 1 point was high for 9/10 domains. Four MDTs performed well (scored 3+ in at least 8/10 domains), 5 MDTs performed well in 6-7 domains and 1 MDT performed well in only 4 domains. Leadership and chairing of the meeting, the organization and administration of the meeting, and clinical decision-making processes all varied significantly between MDMs (P ≤ 0.01). MDT-MOT demonstrated good criterion validity. Agreement between clinical and non-clinical observers (within one point on the scale) was high but this was inconsistent with reliability coefficients and warrants further investigation. If further validated MDT-MOT might provide a useful mechanism for the routine assessment of MDMs by the local workforce to drive improvements in MDT performance. © The Author 2016. Published by Oxford University Press in association with the International Society for Quality in Health Care; all rights reserved.
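
    The two agreement figures quoted in this abstract (absolute agreement and agreement within 1 point) are simple proportions over paired ratings. A minimal sketch follows, assuming integer ratings from two observers on the same meetings; the function name and the ratings are hypothetical and are not taken from the study.

```python
def percent_agreement(rater_a, rater_b, tolerance=0):
    """Percentage of paired ratings that differ by at most `tolerance` points."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    hits = sum(abs(a - b) <= tolerance for a, b in zip(rater_a, rater_b))
    return 100.0 * hits / len(rater_a)


# Hypothetical ratings of one domain across 10 meetings (not study data).
clinical = [4, 3, 5, 2, 4, 3, 4, 5, 3, 4]
non_clinical = [4, 4, 5, 3, 4, 3, 2, 5, 4, 4]

exact = percent_agreement(clinical, non_clinical)                   # absolute agreement
within_one = percent_agreement(clinical, non_clinical, tolerance=1)  # within 1 point
print(f"absolute: {exact:.0f}%, within 1 point: {within_one:.0f}%")
```

    Percentage agreement does not correct for chance and says nothing about how much the rated meetings differ from one another, which is one reason it can look high while a reliability coefficient stays modest, as the abstract notes.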

  9. Development and validation of a composite scoring system for robot-assisted surgical training--the Robotic Skills Assessment Score.

    PubMed

    Chowriappa, Ashirwad J; Shi, Yi; Raza, Syed Johar; Ahmed, Kamran; Stegemann, Andrew; Wilding, Gregory; Kaouk, Jihad; Peabody, James O; Menon, Mani; Hassett, James M; Kesavadas, Thenkurussi; Guru, Khurshid A

    2013-12-01

    A standardized scoring system does not exist in virtual reality-based assessment metrics to describe safe and crucial surgical skills in robot-assisted surgery. This study aims to develop an assessment score along with its construct validation. All subjects performed key tasks on the previously validated Fundamental Skills of Robotic Surgery curriculum, which were recorded, and metrics were stored. After an expert consensus for the purpose of content validation (Delphi), critical safety determining procedural steps were identified from the Fundamental Skills of Robotic Surgery curriculum and a hierarchical task decomposition of multiple parameters using a variety of metrics was used to develop the Robotic Skills Assessment Score (RSA-Score). Robotic Skills Assessment mainly focuses on safety in operative field, critical error, economy, bimanual dexterity, and time. The RSA-Score was then further evaluated for construct validation and feasibility. Spearman correlation tests performed between tasks using the RSA-Scores indicate no cross correlation. Wilcoxon rank sum tests were performed between the two groups. The proposed RSA-Score was evaluated on non-robotic surgeons (n = 15) and on expert-robotic surgeons (n = 12). The expert group demonstrated significantly better performance on all four tasks in comparison to the novice group. Validation of the RSA-Score in this study was carried out on the Robotic Surgical Simulator. The RSA-Score is a valid scoring system that could be incorporated in any virtual reality-based surgical simulator to achieve standardized assessment of fundamental surgical tenets during robot-assisted surgery. Copyright © 2013 Elsevier Inc. All rights reserved.

  10. Determination of the criterion-related validity of hip joint angle test for estimating hamstring flexibility using a contemporary statistical approach.

    PubMed

    Sainz de Baranda, Pilar; Rodríguez-Iniesta, María; Ayala, Francisco; Santonja, Fernando; Cejudo, Antonio

    2014-07-01

    To examine the criterion-related validity of the horizontal hip joint angle (H-HJA) test and vertical hip joint angle (V-HJA) test for estimating hamstring flexibility measured through the passive straight-leg raise (PSLR) test using contemporary statistical measures. Validity study. Controlled laboratory environment. One hundred thirty-eight professional trampoline gymnasts (61 women and 77 men). Hamstring flexibility. Each participant performed 2 trials of H-HJA, V-HJA, and PSLR tests in a randomized order. The criterion-related validity of H-HJA and V-HJA tests was measured through the estimation equation, typical error of the estimate (TEEST), validity correlation (β), and their respective confidence limits. The findings from this study suggest that although H-HJA and V-HJA tests showed moderate to high validity scores for estimating hamstring flexibility (standardized TEEST = 0.63; β = 0.80), the TEEST statistic reported for both tests was not narrow enough for clinical purposes (H-HJA = 10.3 degrees; V-HJA = 9.5 degrees). Subsequently, the predicted likely thresholds for the true values that were generated were too wide (H-HJA = predicted value ± 13.2 degrees; V-HJA = predicted value ± 12.2 degrees). The results suggest that although the HJA test showed moderate to high validity scores for estimating hamstring flexibility, the prediction intervals between the HJA and PSLR tests are not strong enough to suggest that clinicians and sport medicine practitioners should use the HJA and PSLR tests interchangeably as gold standard measurement tools to evaluate and detect short hamstring muscle flexibility.
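
    The reported thresholds (± 13.2 and ± 12.2 degrees) are numerically consistent with multiplying each typical error of the estimate by the normal quantile for an 80% two-sided interval (about 1.28). The abstract does not state the confidence level, so the 80% figure below is an inference from the numbers and should be treated as an assumption; the function name is hypothetical.

```python
from statistics import NormalDist


def likely_limits(tee_degrees, confidence=0.80):
    """Half-width of a normal-theory interval around a predicted value.

    `confidence` = 0.80 is an assumption inferred from the reported numbers,
    not a value stated in the abstract.
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # two-sided quantile
    return z * tee_degrees


# Typical errors of the estimate as reported in the abstract (degrees).
for name, tee in (("H-HJA", 10.3), ("V-HJA", 9.5)):
    print(f"{name}: predicted value ± {likely_limits(tee):.1f} degrees")
```

    Under that assumption the half-widths round to 13.2 and 12.2 degrees, matching the abstract, and show why a typical error near 10 degrees translates into limits the authors consider too wide for clinical use.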

  11. Validity evidence for the situational judgment test paradigm in emotional intelligence measurement.

    PubMed

    Libbrecht, Nele; Lievens, Filip

    2012-01-01

    To date, various measurement approaches have been proposed to assess emotional intelligence (EI). Recently, two new EI tests have been developed based on the situational judgment test (SJT) paradigm: the Situational Test of Emotional Understanding (STEU) and the Situational Test of Emotion Management (STEM). Initial attempts have been made to examine the construct-related validity of these new tests; we extend these findings by placing the tests in a broad nomological network. To this end, 850 undergraduate students completed a personality inventory, a cognitive ability test, a self-report EI test, a performance-based EI measure, the STEU, and the STEM. The SJT-based EI tests were not strongly correlated with personality and fluid cognitive ability. Regarding their relation with existing EI measures, the tests did not capture the same construct as self-report EI measures, but corresponded rather to performance-based EI measures. Overall, these results lend support for the SJT paradigm for measuring EI as an ability.

  12. Assessment without Testing: Using Performance Measures Embedded in a Technology-Based Instructional Program as Indicators of Reading Ability

    ERIC Educational Resources Information Center

    Mitchell, Alison; Baron, Lauren; Macaruso, Paul

    2018-01-01

    Screening and monitoring student reading progress can be costly and time consuming. Assessment embedded within the context of online instructional programs can capture ongoing student performance data while limiting testing time outside of instruction. This paper presents two studies that examined the validity of using performance measures from a…

  13. Validation of a field test for the non-invasive determination of badminton specific aerobic performance

    PubMed Central

    Wonisch, M; Hofmann, P; Schwaberger, G; von Duvillard, S P; Klein, W

    2003-01-01

    Aim: To develop a badminton specific test to determine on court aerobic and anaerobic performance. Method: The test was evaluated by using a lactate steady state test. Seventeen male competitive badminton players (mean (SD) age 26 (8) years, weight 74 (10) kg, height 179 (7) cm) performed an incremental field test on the badminton court to assess the heart rate turn point (HRTP) and the individual physical working capacity (PWCi) at 90% of measured maximal heart rate (HRmax). All subjects performed a 20 minute steady state test at a workload just below the PWCi. Results: Significant correlations (p<0.05) for Pearson's product moment coefficient were found between the two methods for HR (r = 0.78) and velocity (r = 0.93). The HR at the PWCi (176 (5.5) beats/min) was significantly lower than the HRTP (179 (5.5) beats/min), but no significant difference was found for velocity (1.44 (0.3) m/s, 1.38 (0.4) m/s). The constant exercise test showed steady state conditions for both HR (175 (9) beats/min) and blood lactate concentration (3.1 (1.2) mmol/l). Conclusion: The data indicate that a valid determination of specific aerobic and anaerobic exercise performance for the sport of badminton is possible without HRTP determination. PMID:12663351

  14. Embedded measures of performance validity using verbal fluency tests in a clinical sample.

    PubMed

    Sugarman, Michael A; Axelrod, Bradley N

    2015-01-01

    The objective of this study was to determine to what extent verbal fluency measures can be used as performance validity indicators during neuropsychological evaluation. Participants were clinically referred for neuropsychological evaluation in an urban-based Veterans Affairs hospital. Participants were placed into 2 groups based on their objectively evaluated effort on performance validity tests (PVTs). Individuals who exhibited credible performance (n = 431) failed 0 PVTs, and those with poor effort (n = 192) failed 2 or more PVTs. All participants completed the Controlled Oral Word Association Test (COWAT) and Animals verbal fluency measures. We evaluated how well verbal fluency scores could discriminate between the 2 groups. Raw scores and T scores for Animals discriminated between the credible performance and poor-effort groups with 90% specificity and greater than 40% sensitivity. COWAT scores had lower sensitivity for detecting poor effort. A combination of FAS and Animals scores into logistic regression models yielded acceptable group classification, with 90% specificity and greater than 44% sensitivity. Verbal fluency measures can yield adequate detection of poor effort during neuropsychological evaluation. We provide suggested cut points and logistic regression models for predicting the probability of poor effort in our clinical setting and offer suggested cutoff scores to optimize sensitivity and specificity.
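
    Embedded validity cutoffs of the kind described here are commonly chosen by fixing specificity at a floor (90% in this abstract) and then taking the cutoff that gives the highest sensitivity. A minimal sketch of that selection follows, assuming that lower fluency scores indicate poor effort; the function names and all scores are hypothetical and are not the study's data or its published cut points.

```python
def classification_stats(credible, poor_effort, cutoff):
    """Return (sensitivity, specificity) when scores <= cutoff are flagged."""
    true_pos = sum(s <= cutoff for s in poor_effort)  # poor effort, flagged
    true_neg = sum(s > cutoff for s in credible)      # credible, not flagged
    return true_pos / len(poor_effort), true_neg / len(credible)


def best_cutoff(credible, poor_effort, min_specificity=0.90):
    """Pick the cutoff with the highest sensitivity at or above the floor."""
    best = None
    for cutoff in sorted(set(credible) | set(poor_effort)):
        sens, spec = classification_stats(credible, poor_effort, cutoff)
        if spec >= min_specificity and (best is None or sens > best[1]):
            best = (cutoff, sens, spec)
    return best  # None if no cutoff reaches the specificity floor


# Hypothetical raw fluency scores (not study data).
credible_scores = [14, 16, 17, 18, 19, 20, 21, 22, 24, 9]
poor_effort_scores = [6, 8, 9, 10, 11, 12, 15, 17, 19, 21]

cutoff, sensitivity, specificity = best_cutoff(credible_scores, poor_effort_scores)
print(f"cutoff <= {cutoff}: sensitivity {sensitivity:.0%}, specificity {specificity:.0%}")
```

    Holding specificity high keeps false positives rare among credible examinees, at the cost of the modest sensitivity values the abstract reports.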

  15. Evaluating Maintenance Performance: A Video Approach to Symbolic Testing of Electronics Maintenance Tasks. Final Report.

    ERIC Educational Resources Information Center

    Shriver, Edgar L.; And Others

    This volume reports an effort to use the video media as an approach for the preparation of a battery of symbolic tests that would be empirically valid substitutes for criterion referenced Job Task Performance Tests. The graphic symbolic tests require the storage of a large amount of pictorial information which must be searched rapidly for display.…

  16. Validation of a Real Time PCR for Classical Swine Fever Diagnosis

    PubMed Central

    Dias, Natanael Lamas; Fonseca Júnior, Antônio Augusto; Oliveira, Anapolino Macedo; Sales, Érica Bravo; Alves, Bruna Rios Coelho; Dorella, Fernanda Alves

    2014-01-01

    The viral disease classical swine fever (CSF), caused by a Pestivirus, is one of the major causes of economic losses for pig farming. The aim of this work was to validate a RT-qPCR using Taqman for detection of CSF in swine tissues. The parameters for the validation followed the specifications of the Manual of Diagnostic Tests and Vaccines for Terrestrial Animals of the World Organization for Animal Health (OIE) and the guide ABNT NBR ISO/IEC 17025:2005. The analysis of the 5′NTR region of CSF virus was performed in 145 samples from 29 infected pigs and in 240 samples from 80 pigs originated in the Brazilian CSF-free zone. The tissues tested were spleen, kidney, blood, tonsils, and lymph nodes. Sequencing of the positive samples for 5′NTR region was performed to evaluate the specificity of the RT-qPCR. Tests performed for the RT-qPCR validation demonstrated that the PCR assay was efficient in detecting RNA from CSF virus in all materials from different tissues of infected animals. Furthermore, RNA from CSF virus was not detected in samples of swine originated from the Brazilian CSF-free zone. Hence, it is concluded that RT-qPCR can be used as a complementary diagnostic for CSF. PMID:24818039

  17. Validation of a real time PCR for classical Swine Fever diagnosis.

    PubMed

    Dias, Natanael Lamas; Fonseca Júnior, Antônio Augusto; Oliveira, Anapolino Macedo; Sales, Erica Bravo; Alves, Bruna Rios Coelho; Dorella, Fernanda Alves; Camargos, Marcelo Fernandes

    2014-01-01

    The viral disease classical swine fever (CSF), caused by a Pestivirus, is one of the major causes of economic losses for pig farming. The aim of this work was to validate a RT-qPCR using Taqman for detection of CSF in swine tissues. The parameters for the validation followed the specifications of the Manual of Diagnostic Tests and Vaccines for Terrestrial Animals of the World Organization for Animal Health (OIE) and the guide ABNT NBR ISO/IEC 17025:2005. The analysis of the 5'NTR region of CSF virus was performed in 145 samples from 29 infected pigs and in 240 samples from 80 pigs originated in the Brazilian CSF-free zone. The tissues tested were spleen, kidney, blood, tonsils, and lymph nodes. Sequencing of the positive samples for 5'NTR region was performed to evaluate the specificity of the RT-qPCR. Tests performed for the RT-qPCR validation demonstrated that the PCR assay was efficient in detecting RNA from CSF virus in all materials from different tissues of infected animals. Furthermore, RNA from CSF virus was not detected in samples of swine originated from the Brazilian CSF-free zone. Hence, it is concluded that RT-qPCR can be used as a complementary diagnostic for CSF.

  18. Validation of On-board Cloud Cover Assessment Using EO-1

    NASA Technical Reports Server (NTRS)

    Mandl, Dan; Miller, Jerry; Griffin, Michael; Burke, Hsiao-hua

    2003-01-01

    The purpose of this NASA Earth Science Technology Office funded effort was to flight validate an on-board cloud detection algorithm and to determine the performance that can be achieved with a Mongoose V flight computer. This validation was performed on the EO-1 satellite, which is operational, by uploading new flight code to perform the cloud detection. The algorithm was developed by MIT/Lincoln Lab and is based on the use of the Hyperion hyperspectral instrument using selected spectral bands from 0.4 to 2.5 microns. The Technology Readiness Level (TRL) of this technology at the beginning of the task was level 5 and was TRL 6 upon completion. In the final validation, an 8 second (0.75 Gbytes) Hyperion image was processed on-board and assessed for percentage cloud cover within 30 minutes. It was expected to take many hours and perhaps a day considering that the Mongoose V is only a 6-8 MIP machine in performance. To accomplish this test, the image taken had to have level 0 and level 1 processing performed on-board before the cloud algorithm was applied. For almost all of the ground test cases and all of the flight cases, the cloud assessment was within 5% of the correct value and in most cases within 1-2%.

  19. Validation and Simulation of Ares I Scale Model Acoustic Test - 3 - Modeling and Evaluating the Effect of Rainbird Water Deluge Inclusion

    NASA Technical Reports Server (NTRS)

    Strutzenberg, Louise L.; Putman, Gabriel C.

    2011-01-01

    The Ares I Scale Model Acoustics Test (ASMAT) is a series of live-fire tests of scaled rocket motors meant to simulate the conditions of the Ares I launch configuration. These tests have provided a well documented set of high fidelity measurements useful for validation including data taken over a range of test conditions and containing phenomena like Ignition Over-Pressure and water suppression of acoustics. Building on dry simulations of the ASMAT tests with the vehicle at 5 ft. elevation (100 ft. real vehicle elevation), wet simulations of the ASMAT test setup have been performed using the Loci/CHEM computational fluid dynamics software to explore the effect of rainbird water suppression inclusion on the launch platform deck. Two-phase water simulation has been performed using an energy and mass coupled Lagrangian particle system module where liquid phase emissions are segregated into clouds of virtual particles and gas phase mass transfer is accomplished through simple Weber number controlled breakup and boiling models. Comparisons have been performed to the dry 5 ft. elevation cases, using configurations with and without launch mounts. These cases have been used to explore the interaction between rainbird spray patterns and launch mount geometry and evaluate the acoustic sound pressure level knockdown achieved through above-deck rainbird deluge inclusion. This comparison has been anchored with validation from live-fire test data which showed a reduction in rainbird effectiveness with the presence of a launch mount.

  20. Development and validation of a German version of the joint protection behavior assessment in patients with rheumatoid arthritis.

    PubMed

    Niedermann, K; Forster, A; Hammond, A; Uebelhart, D; de Bie, R

    2007-03-15

    Joint protection (JP) is an important part of the treatment concept for patients with rheumatoid arthritis (RA). The Joint Protection Behavior Assessment short form (JPBA-S) assesses the use of hand JP methods by patients with RA while preparing a hot drink. The purpose of this study was to develop a German version of the JPBA-S (D-JPBA-S) and to test its validity and reliability. A manual was developed through consensus with 8 occupational therapist (OT) experts as the reference for assessing patients' JP behavior. Twenty-four patients with RA and 10 healthy individuals were videotaped while performing 10 tasks reflecting the activity of preparing instant coffee. Recordings were repeated after 3 months for test-retest analysis. One rater assessed all available patient recordings (n = 23, recorded twice) for test-retest reliability. The video recordings of 10 randomly selected patients and all healthy individuals were independently assessed for interrater reliability by 6 OTs who were explicitly asked to follow the manual. Rasch analysis was performed to test construct validity and transform ordinal raw data into interval data for reliability calculations. Nine of the 10 tasks fit the Rasch model. The D-JPBA-S, consisting of 9 valid tasks, had an intraclass correlation coefficient of 0.77 for interrater reliability and 0.71 for test-retest reliability. The D-JPBA-S provides a valid and reliable instrument for assessing JP behavior of patients with RA and can be used in German-speaking countries.

  1. Integration and validation testing for PhEDEx, DBS and DAS with the PhEDEx LifeCycle agent

    NASA Astrophysics Data System (ADS)

    Boeser, C.; Chwalek, T.; Giffels, M.; Kuznetsov, V.; Wildish, T.

    2014-06-01

    The ever-increasing amount of data handled by the CMS dataflow and workflow management tools poses new challenges for cross-validation among different systems within the CMS experiment at the LHC. To approach this problem we developed an integration test suite based on the LifeCycle agent, a tool originally conceived for stress-testing new releases of PhEDEx, the CMS data-placement tool. The LifeCycle agent provides a framework for customising the test workflow in arbitrary ways, and can scale to levels of activity well beyond those seen in normal running. This means we can run realistic performance tests at scales not likely to be seen by the experiment for some years, or with custom topologies to examine particular situations that may cause concern some time in the future. The LifeCycle agent has recently been enhanced to become a general purpose integration and validation testing tool for major CMS services. It allows cross-system integration tests of all three components to be performed in controlled environments, without interfering with production services. In this paper we discuss the design and implementation of the LifeCycle agent. We describe how it is used for small-scale debugging and validation tests, and how we extend that to large-scale tests of whole groups of sub-systems. We show how the LifeCycle agent can emulate the action of operators, physicists, or software agents external to the system under test, and how it can be scaled to large and complex systems.

  2. VALIDATION OF ANSI N42.34 AMERICAN NATIONAL STANDARD PERFORMANCE CRITERIA FOR HAND-HELD INSTRUMENTS FOR THE DETECTION AND IDENTIFICATION OF RADIONUCLIDES

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lorier, T.

    2014-09-03

    SRNL’s validation of ANSI N42.34-D6 for the Domestic Nuclear Detection Office (DNDO) was performed utilizing one hand-held instrument (or RID) – the FLIR identiFINDER 2. Each section of the standard was evaluated via a walk-through or test. NOTE: In Table 1, W = walk-through and T = test, as directed by the Domestic Nuclear Detection Office (DNDO). For a walk-through, the experiment was either setup or reviewed for setup; for a test, the N42.34-D6 procedures were followed with some exceptions and comments noted. SRNL is not fully able to evaluate a RID against Sections 7 (Environmental), 8 (Electromagnetic), and 9 (Mechanical) of N42.34, so those portions of this validation were done in collaboration with Qualtest, Inc. in Orlando, Florida. The walk-throughs and tests of Sections 7, 8, and 9 were performed in Qualtest, Inc. facilities with SRNL providing radiological sources as necessary. Where applicable, assessment results and findings of the walk-throughs and tests were recorded on datasheets and a validation summary is provided. A general comment pertained to test requirements found in another standard and referenced in N42.34-D6. For example, step 1 of the test method in section 8.1.2 states “RF test set up information can be found in IEC 61000-4-3.” It is recommended that any information from other standards necessary for conducting the tests within N42.34 should be posted in N42.34 for simplicity and to prevent the user from having to peruse other documents. Another general comment, as noted by Qualtest, is that a tolerance reference is not listed for each test in sections 7-9. Overall, the N42.34-D6 was proven to be practicable, but areas for improvement and recommendations were identified for consideration prior to final ballot submittal.

  3. 100-lbf LO2/CH4 RCS Thruster Testing and Validation

    NASA Technical Reports Server (NTRS)

    Barnes, Frank; Cannella, Matthew; Gomez, Carlos; Hand, Jeffrey; Rosenberg, David

    2009-01-01

    Describes a 100-pound-thrust liquid oxygen-methane thruster sized for RCS (Reaction Control System) applications. Innovative design characteristics include: a) Simple compact design with minimal part count; b) Gaseous or liquid propellant operation; c) Affordable and reusable; d) Greater flexibility than existing systems; e) Part of NASA's study of "Green Propellants." Hot-fire testing validated performance and functionality of the thruster. The thruster's dependence on mixture ratio has been evaluated. Data has been used to calculate performance parameters such as thrust and Isp. Data has been compared with previous test results to verify reliability and repeatability. The thruster was found to have an Isp of 131 s and 82 lbf thrust at a mixture ratio of 1.62.
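
    The reported thrust and Isp are linked by the standard definition Isp = F / (mdot * g0), so the two figures together imply a total propellant mass flow. A minimal sketch of that arithmetic follows; the function name is hypothetical, and the split into oxidizer and fuel assumes the mixture ratio is the oxidizer-to-fuel mass ratio, which the abstract does not state.

```python
G0 = 9.80665               # standard gravity, m/s^2
LBF_TO_N = 4.4482216152605  # newtons per pound-force


def mass_flow_kg_s(thrust_lbf, isp_s):
    """Total propellant mass flow implied by Isp = F / (mdot * g0)."""
    return thrust_lbf * LBF_TO_N / (isp_s * G0)


# Values reported in the abstract: 82 lbf thrust, Isp of 131 s.
mdot = mass_flow_kg_s(82.0, 131.0)

# Assumption: mixture ratio 1.62 means oxidizer mass flow / fuel mass flow.
mixture_ratio = 1.62
fuel = mdot / (1.0 + mixture_ratio)
oxidizer = mdot - fuel

print(f"total {mdot:.3f} kg/s, LO2 {oxidizer:.3f} kg/s, CH4 {fuel:.3f} kg/s")
```

    The result is roughly 0.28 kg/s of total propellant flow; these are derived values for illustration, not measurements reported in the record.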

  4. Comparative assessment of three standardized robotic surgery training methods.

    PubMed

    Hung, Andrew J; Jayaratna, Isuru S; Teruya, Kara; Desai, Mihir M; Gill, Inderbir S; Goh, Alvin C

    2013-10-01

    To evaluate three standardized robotic surgery training methods, inanimate, virtual reality and in vivo, for their construct validity. To explore the concept of cross-method validity, where the relative performance of each method is compared. Robotic surgical skills were prospectively assessed in 49 participating surgeons who were classified as follows: 'novice/trainee': urology residents, previous experience <30 cases (n = 38) and 'experts': faculty surgeons, previous experience ≥30 cases (n = 11). Three standardized, validated training methods were used: (i) structured inanimate tasks; (ii) virtual reality exercises on the da Vinci Skills Simulator (Intuitive Surgical, Sunnyvale, CA, USA); and (iii) a standardized robotic surgical task in a live porcine model with performance graded by the Global Evaluative Assessment of Robotic Skills (GEARS) tool. A Kruskal-Wallis test was used to evaluate performance differences between novices and experts (construct validity). Spearman's correlation coefficient (ρ) was used to measure the association of performance across inanimate, simulation and in vivo methods (cross-method validity). Novice and expert surgeons had previously performed a median (range) of 0 (0-20) and 300 (30-2000) robotic cases, respectively (P < 0.001). Construct validity: experts consistently outperformed residents with all three methods (P < 0.001). Cross-method validity: overall performance of inanimate tasks significantly correlated with virtual reality robotic performance (ρ = -0.7, P < 0.001) and in vivo robotic performance based on GEARS (ρ = -0.8, P < 0.0001). Virtual reality performance and in vivo tissue performance were also found to be strongly correlated (ρ = 0.6, P < 0.001). We propose the novel concept of cross-method validity, which may provide a method of evaluating the relative value of various forms of skills education and assessment. We externally confirmed the construct validity of each featured training tool. © 2013 BJU International.

  5. The Role of Integrated Modeling in the Design and Verification of the James Webb Space Telescope

    NASA Technical Reports Server (NTRS)

    Mosier, Gary E.; Howard, Joseph M.; Johnston, John D.; Parrish, Keith A.; Hyde, T. Tupper; McGinnis, Mark A.; Bluth, Marcel; Kim, Kevin; Ha, Kong Q.

    2004-01-01

    The James Webb Space Telescope (JWST) is a large, infrared-optimized space telescope scheduled for launch in 2011. System-level verification of critical optical performance requirements will rely on integrated modeling to a considerable degree. In turn, requirements for accuracy of the models are significant. The size of the lightweight observatory structure, coupled with the need to test at cryogenic temperatures, effectively precludes validation of the models and verification of optical performance with a single test in 1-g. Rather, a complex series of steps is planned by which the components of the end-to-end models are validated at various levels of subassembly, and the ultimate verification of optical performance is by analysis using the assembled models. This paper describes the critical optical performance requirements driving the integrated modeling activity, shows how the error budget is used to allocate and track contributions to total performance, and presents examples of integrated modeling methods and results that support the preliminary observatory design. Finally, the concepts for model validation and the role of integrated modeling in the ultimate verification of the observatory are described.

  6. Experimental investigation of an RNA sequence space

    NASA Technical Reports Server (NTRS)

    Lee, Youn-Hyung; Dsouza, Lisa; Fox, George E.

    1993-01-01

    Modern rRNAs are the historic consequence of an ongoing evolutionary exploration of a sequence space. These extant sequences belong to a special subset of the sequence space that is comprised only of those primary sequences that can validly perform the biological function(s) required of the particular RNA. If it were possible to readily identify all such valid sequences, stochastic predictions could be made about the relative likelihood of various evolutionary pathways available to an RNA. Herein an experimental system which can assess whether a particular sequence is likely to have validity as a eubacterial 5S rRNA is described. A total of ten naturally occurring, and hence known to be valid, sequences and two point mutants of unknown validity were used to test the usefulness of the approach. Nine of the ten valid sequences tested positive whereas both mutants tested as clearly defective. The tenth valid sequence gave results that would be interpreted as reflecting a borderline status were the answer not known. These results demonstrate that it is possible to experimentally determine which sequences in local regions of the sequence space are potentially valid 5S rRNAs.

  7. Real-Time Sensor Validation, Signal Reconstruction, and Feature Detection for an RLV Propulsion Testbed

    NASA Technical Reports Server (NTRS)

    Jankovsky, Amy L.; Fulton, Christopher E.; Binder, Michael P.; Maul, William A., III; Meyer, Claudia M.

    1998-01-01

    A real-time system for validating sensor health has been developed in support of the reusable launch vehicle program. This system was designed for use in a propulsion testbed as part of an overall effort to improve the safety, diagnostic capability, and cost of operation of the testbed. The sensor validation system was designed and developed at the NASA Lewis Research Center and integrated into a propulsion checkout and control system as part of an industry-NASA partnership, led by Rockwell International for the Marshall Space Flight Center. The system includes modules for sensor validation, signal reconstruction, and feature detection and was designed to maximize portability to other applications. Review of test data from initial integration testing verified real-time operation and showed the system to perform correctly on both hard and soft sensor failure test cases. This paper discusses the design of the sensor validation and supporting modules developed at LeRC and reviews results obtained from initial test cases.

  8. Constructing a question bank based on script concordance approach as a novel assessment methodology in surgical education.

    PubMed

    Aldekhayel, Salah A; Alselaim, Nahar A; Magzoub, Mohi Eldin; Al-Qattan, Mohammad M; Al-Namlah, Abdullah M; Tamim, Hani; Al-Khayal, Abdullah; Al-Habdan, Sultan I; Zamakhshary, Mohammed F

    2012-10-24

    Script Concordance Test (SCT) is a new assessment tool that reliably assesses clinical reasoning skills. Previous descriptions of developing SCT-question banks were merely subjective. This study addresses two gaps in the literature: 1) conducting the first phase of a multistep validation process of SCT in Plastic Surgery, and 2) providing an objective methodology to construct a question bank based on SCT. After developing a test blueprint, 52 test items were written. Five validation questions were developed and a validation survey was established online. Seven reviewers were asked to answer this survey. They were recruited from two countries, Saudi Arabia and Canada, to improve the test's external validity. Their ratings were transformed into percentages. Analysis was performed to compare reviewers' ratings by looking at correlations, ranges, means, medians, and overall scores. Scores of reviewers' ratings were between 76% and 95% (mean 86% ± 5). We found poor correlations between reviewers (Pearson's: +0.38 to -0.22). Ratings of individual validation questions ranged between 0 and 4 (on a scale 1-5). Means and medians of these ranges were computed for each test item (mean: 0.8 to 2.4; median: 1 to 3). A subset of test items comprising 27 items was generated based on a set of inclusion and exclusion criteria. This study proposes an objective methodology for validation of SCT-question bank. Analysis of validation survey is done from all angles, i.e., reviewers, validation questions, and test items. Finally, a subset of test items is generated based on a set of criteria.

  9. Validation of a Scalable Solar Sailcraft

    NASA Technical Reports Server (NTRS)

    Murphy, D. M.

    2006-01-01

    The NASA In-Space Propulsion (ISP) program sponsored intensive solar sail technology and systems design, development, and hardware demonstration activities over the past 3 years. Efforts to validate a scalable solar sail system by functional demonstration in relevant environments, together with test-analysis correlation activities on a scalable solar sail system have recently been successfully completed. A review of the program, with descriptions of the design, results of testing, and analytical model validations of component and assembly functional, strength, stiffness, shape, and dynamic behavior are discussed. The scaled performance of the validated system is projected to demonstrate the applicability to flight demonstration and important NASA road-map missions.

  10. Iridology: A systematic review.

    PubMed

    Ernst, E

    1999-02-01

    Iridologists claim to be able to diagnose medical conditions through abnormalities of pigmentation in the iris. This technique is popular in many countries. Therefore it is relevant to ask whether it is valid. To systematically review all interpretable tests of the validity of iridology as a diagnostic tool. DATA SOURCE AND EXTRACTION: Three independent literature searches were performed to identify all blinded tests. Data were extracted in a predefined, standardized fashion. Four case control studies were found. The majority of these investigations suggests that iridology is not a valid diagnostic method. The validity of iridology as a diagnostic tool is not supported by scientific evaluations. Patients and therapists should be discouraged from using this method.

  11. Fecal electrolyte testing for evaluation of unexplained diarrhea: Validation of body fluid test accuracy in the absence of a reference method.

    PubMed

    Voskoboev, Nikolay V; Cambern, Sarah J; Hanley, Matthew M; Giesen, Callen D; Schilling, Jason J; Jannetto, Paul J; Lieske, John C; Block, Darci R

    2015-11-01

    Validation of tests performed on body fluids other than blood or urine can be challenging due to the lack of a reference method to confirm accuracy. The aim of this study was to evaluate alternate assessments of accuracy that laboratories can rely on to validate body fluid tests in the absence of a reference method using the example of sodium (Na(+)), potassium (K(+)), and magnesium (Mg(2+)) testing in stool fluid. Validations of fecal Na(+), K(+), and Mg(2+) were performed on the Roche cobas 6000 c501 (Roche Diagnostics) using residual stool specimens submitted for clinical testing. Spiked recovery, mixing studies, and serial dilutions were performed and % recovery of each analyte was calculated to assess accuracy. Results were confirmed by comparison to a reference method (ICP-OES, PerkinElmer). Mean recoveries for fecal electrolytes were Na(+) upon spiking=92%, mixing=104%, and dilution=105%; K(+) upon spiking=94%, mixing=96%, and dilution=100%; and Mg(2+) upon spiking=93%, mixing=98%, and dilution=100%. When autoanalyzer results were compared to reference ICP-OES results, Na(+) had a slope=0.94, intercept=4.1, and R(2)=0.99; K(+) had a slope=0.99, intercept=0.7, and R(2)=0.99; and Mg(2+) had a slope=0.91, intercept=-4.6, and R(2)=0.91. Calculated osmotic gaps using both methods were highly correlated with slope=0.95, intercept=4.5, and R(2)=0.97. Acid pretreatment increased magnesium recovery from a subset of clinical specimens. A combination of mixing, spiking, and dilution recovery experiments is an acceptable surrogate for assessing accuracy in body fluid validations in the absence of a reference method. Copyright © 2015 The Canadian Society of Clinical Chemists. Published by Elsevier Inc. All rights reserved.
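    The abstract reports % recovery for spiking and dilution experiments without giving the formulas. The sketch below uses the commonly taught definitions, which are an assumption here and not confirmed by the source: spike recovery is the measured increase divided by the amount added, and dilution recovery is the dilution-corrected result divided by the undiluted result. The function names and the sample values are hypothetical.

```python
def spike_recovery_pct(baseline, spiked, added):
    """Percent of the added analyte that the assay actually measured.

    Assumed definition: 100 * (spiked - baseline) / added.
    """
    if added <= 0:
        raise ValueError("added amount must be positive")
    return 100.0 * (spiked - baseline) / added


def dilution_recovery_pct(neat, diluted, dilution_factor):
    """Percent agreement of a dilution-corrected result with the neat result.

    Assumed definition: 100 * (diluted * dilution_factor) / neat.
    """
    if neat <= 0 or dilution_factor <= 0:
        raise ValueError("neat result and dilution factor must be positive")
    return 100.0 * diluted * dilution_factor / neat


# Hypothetical concentrations in mmol/L (not from the study).
spike_pct = spike_recovery_pct(baseline=40.0, spiked=87.5, added=50.0)
dilution_pct = dilution_recovery_pct(neat=100.0, diluted=49.0, dilution_factor=2)
print(spike_pct, dilution_pct)
```

    Recoveries near 100% in both experiments would suggest the matrix is not distorting the measurement; the acceptable range is a laboratory decision that this sketch does not encode.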

  12. The construct validity of HPAT-Ireland for the selection of medical students: unresolved issues and future research implications.

    PubMed

    Kelly, Maureen E; O'Flynn, Siun

    2017-05-01

    Aptitude tests are widely used in selection. However, despite certain advantages their use remains controversial. This paper aims to critically appraise five sources of evidence for the construct validity of the Health Professions Admission Test (HPAT)-Ireland, an aptitude test used for selecting undergraduate medical students. The objectives are to identify gaps in the evidence, draw comparisons with other aptitude tests and outline future research directions. Our appraisal of the literature found that stakeholder feedback indicates that there is reasonable evidence for test content validity for two of the three sections of HPAT-Ireland. By contrast the Non-Verbal Reasoning section is widely criticised as having limited relevance to medical school performance and future clinical practice. In terms of concurrent validity there is a significant small to medium, negative correlation with school exit examinations, but not consistently so across all studies (r = -0.18, -0.28, 0.017). Likewise predictive validity studies vary, from negative to moderate strength correlations with examination performance during early years at medical school. Five studies indicate that HPAT-Ireland is supported in principle by the majority of stakeholders. While one consequence of its introduction is that successful applicants are now coming from more diverse academic backgrounds, there is no evidence that the socio-economic background of medical school entrants has been altered significantly. Negative perceptions of unfairness relating to gender, coaching and socio-economics remain. The evidence to date suggests that while there are slight gender differences, initially favouring males, these vary year on year. In conclusion, the attitudes towards, and performance of, HPAT-Ireland are not unlike those of other aptitude tests widely used internationally. The main justifications for its introduction have been achieved, in that Ireland no longer relies exclusively on a single measure of academic record for selection to medical school. However a number of areas require further research and exploration.

  13. Validation Test Report For The CRWMS Analysis and Logistics Visually Interactive Model Calvin Version 3.0, 10074-Vtr-3.0-00

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    S. Gillespie

    2000-07-27

    This report describes the tests performed to validate the CRWMS "Analysis and Logistics Visually Interactive" Model (CALVIN) Version 3.0 (V3.0) computer code (STN: 10074-3.0-00). To validate the code, a series of test cases was developed in the CALVIN V3.0 Validation Test Plan (CRWMS M&O 1999a) that exercises the principal calculation models and options of CALVIN V3.0. Twenty-five test cases were developed: 18 logistics test cases and 7 cost test cases. These cases test the features of CALVIN in a sequential manner, so that the validation of each test case is used to demonstrate the accuracy of the input to subsequent calculations. Where necessary, the test cases utilize reduced-size data tables to make the hand calculations used to verify the results more tractable, while still adequately testing the code's capabilities. Acceptance criteria were established for the logistics and cost test cases in the Validation Test Plan (CRWMS M&O 1999a). The Logistics test cases were developed to test the following CALVIN calculation models: Spent nuclear fuel (SNF) and reactivity calculations; Options for altering reactor life; Adjustment of commercial SNF (CSNF) acceptance rates for fiscal year calculations and mid-year acceptance start; Fuel selection, transportation cask loading, and shipping to the Monitored Geologic Repository (MGR); Transportation cask shipping to and storage at an Interim Storage Facility (ISF); Reactor pool allocation options; and Disposal options at the MGR. Two types of cost test cases were developed: cases to validate the detailed transportation costs, and cases to validate the costs associated with the Civilian Radioactive Waste Management System (CRWMS) Management and Operating Contractor (M&O) and Regional Servicing Contractors (RSCs). For each test case, values calculated using Microsoft Excel 97 worksheets were compared to CALVIN V3.0 scenarios with the same input data and assumptions. All of the test case results compare with the CALVIN V3.0 results within the bounds of the acceptance criteria. Therefore, it is concluded that the CALVIN V3.0 calculation models and options tested in this report are validated.

  14. Tier One Performance Screen Initial Operational Test and Evaluation: 2011 Annual Report

    DTIC Science & Technology

    2013-01-01

    OPERATIONAL TEST AND EVALUATION: 2011 ANNUAL REPORT EXECUTIVE SUMMARY Research Requirement: In addition to educational, physical , and...34 Table 5.4. Incremental Validity Estimates for the TAPAS and TOPS Composite Scales over the AFQT for Predicting IMT Physical Fitness Criteria by...Validity Estimates for the TAPAS and TOPS Composite Scales over the AFQT for Predicting In-Unit Physical Fitness Criteria by Education Tier

  15. Concurrent Validity and Diagnostic Accuracy of the Dynamic Indicators of Basic Early Literacy Skills and the Comprehensive Test of Phonological Processing

    ERIC Educational Resources Information Center

    Hintze, John M.; Ryan, Amanda L.; Stoner, Gary

    2003-01-01

    The purpose of this study was to (a) examine the concurrent validity of the Dynamic Indicators of Basic Early Literacy Skills (DIBELS) with the Comprehensive Test of Phonological Processing (CTOPP), and (b) explore the diagnostic accuracy of the DIBELS in predicting CTOPP performance using suggested and alternative cut-scores. Eighty-six students…

  16. The Predictive Validity of Interim Assessment Scores Based on the Full-Information Bifactor Model for the Prediction of End-of-Grade Test Performance

    ERIC Educational Resources Information Center

    Immekus, Jason C.; Atitya, Ben

    2016-01-01

    Interim tests are a central component of district-wide assessment systems, yet their technical quality to guide decisions (e.g., instructional) has been repeatedly questioned. In response, the study purpose was to investigate the validity of a series of English Language Arts (ELA) interim assessments in terms of dimensionality and prediction of…

  17. A Criterion-Related Validation Study of the Army Core Leader Competency Model

    DTIC Science & Technology

    2007-04-01

    2004). Transformational and transactional leadership: A meta-analytic test of their relative validity. Journal of Applied Psychology , 89, 755- 768...performance criteria in an attempt to adjust ratings for this influence. Leader survey materials were developed and pilot tested at Ft. Drum and Ft... psychological constructs in the behavioral science realm. Numerous theories, popular literature, websites, assessments, and competency models are

  18. Rasch Modeling of Revised Token Test Performance: Validity and Sensitivity to Change

    ERIC Educational Resources Information Center

    Hula, William; Doyle, Patrick J.; McNeil, Malcolm R.; Mikolic, Joseph M.

    2006-01-01

    The purpose of this research was to examine the validity of the 55-item Revised Token Test (RTT) and to compare traditional and Rasch-based scores in their ability to detect group differences and change over time. The 55-item RTT was administered to 108 left- and right-hemisphere stroke survivors, and the data were submitted to Rasch analysis.…

  19. Development and Initial Validation of the NyTid Test: A Movement Assessment Tool for Compulsory School Pupils

    ERIC Educational Resources Information Center

    Tidén, Anna; Lundqvist, Carolina; Nyberg, Marie

    2015-01-01

    This study presents the development process and initial validation of the NyTid test, a process-oriented movement assessment tool for compulsory school pupils. A sample of 1,260 (627 girls and 633 boys; mean age of 14.39) Swedish school children participated in the study. In the first step, exploratory factor analyses (EFAs) were performed in…

  20. Experimental validation of prototype high voltage bushing

    NASA Astrophysics Data System (ADS)

    Shah, Sejal; Tyagi, H.; Sharma, D.; Parmar, D.; M. N., Vishnudev; Joshi, K.; Patel, K.; Yadav, A.; Patel, R.; Bandyopadhyay, M.; Rotti, C.; Chakraborty, A.

    2017-08-01

    Prototype High Voltage Bushing (PHVB) is a scaled-down configuration of the DNB High Voltage Bushing (HVB) of ITER. It is designed for operation at 50 kV DC to ensure operational performance, thereby confirming the design configuration of the DNB HVB. Two concentric insulators, viz. ceramic and fiber reinforced polymer (FRP) rings, are used as a double-layered vacuum boundary for 50 kV isolation between the grounded and high voltage flanges. Stress shields are designed for smooth electric field distribution. During ceramic-to-Kovar brazing, spilling cannot be controlled, which may lead to high localized electrostatic stress. To understand the spilling phenomenon and calculate stress precisely, quantitative analysis was performed using Scanning Electron Microscopy (SEM) of a brazed sample, and a similar configuration was modeled in the Finite Element (FE) analysis. FE analysis of the PHVB is performed to find the electrical stresses on different areas of the PHVB, which are maintained similar to those of the DNB HV Bushing. With this configuration, the experiment is performed considering ITER-like vacuum and electrical parameters. The initial HV test is performed with temporary vacuum sealing arrangements using gaskets/O-rings at both ends in order to achieve the desired vacuum and keep the system maintainable. During the validation test, a 50 kV voltage withstand test is performed for one hour. A voltage withstand test at 60 kV DC (20% above the rated voltage) has also been performed without any breakdown. Successful operation of the PHVB confirms the design of the DNB HV Bushing. In this paper, the configuration of the PHVB with experimental validation data is presented.

  1. Quality of Education Predicts Performance on the Wide Range Achievement Test-4th Edition Word Reading Subtest

    PubMed Central

    Sayegh, Philip; Arentoft, Alyssa; Thaler, Nicholas S.; Dean, Andy C.; Thames, April D.

    2014-01-01

    The current study examined whether self-rated education quality predicts Wide Range Achievement Test-4th Edition (WRAT-4) Word Reading subtest and neurocognitive performance, and aimed to establish this subtest's construct validity as an educational quality measure. In a community-based adult sample (N = 106), we tested whether education quality both increased the prediction of Word Reading scores beyond demographic variables and predicted global neurocognitive functioning after adjusting for WRAT-4. As expected, race/ethnicity and education predicted WRAT-4 reading performance. Hierarchical regression revealed that when including education quality, the amount of WRAT-4's explained variance increased significantly, with race/ethnicity and both education quality and years as significant predictors. Finally, WRAT-4 scores, but not education quality, predicted neurocognitive performance. Results support WRAT-4 Word Reading as a valid proxy measure for education quality and a key predictor of neurocognitive performance. Future research should examine these findings in larger, more diverse samples to determine their robust nature. PMID:25404004

  2. Psychometric properties of the 30-m walking test in patients with degenerative cervical myelopathy: results from two prospective multicenter cohort studies.

    PubMed

    Bohm, Parker E; Fehlings, Michael G; Kopjar, Branko; Tetreault, Lindsay A; Vaccaro, Alexander R; Anderson, Karen K; Arnold, Paul M

    2017-02-01

    The timed 30-m walking test (30MWT) is used in clinical practice and in research to objectively quantify gait impairment. The psychometric properties of 30MWT have not yet been rigorously evaluated. This study aimed to determine test-retest reliability, divergent and convergent validity, and responsiveness to change of the 30MWT in patients with degenerative cervical myelopathy (DCM). A retrospective observational study was carried out. The sample consisted of patients with symptomatic DCM enrolled in the AOSpine North America or AOSpine International cervical spondylotic myelopathy studies at 26 sites. Modified Japanese Orthopaedic Association scale (mJOA), Nurick scale, 30MWT, Neck Disability Index (NDI), and Short-Form-36 (SF-36v2) physical component score (PCS) and mental component score (MCS) were the outcome measures. Data from two prospective multicenter cohort myelopathy studies were merged. Each patient was evaluated at baseline and 6 months postoperatively. Of 757 total patients, 682 (90.09%) attempted to perform the 30MWT at baseline. Of these 682 patients, 602 (88.12%) performed the 30MWT at baseline. One patient was excluded, leaving 601 in the analysis. At baseline, 81 of 682 (11.88%) patients were unable to perform the test, and their mJOA, NDI, and SF-36v2 PCS scores were lower compared with those who performed the test at baseline. In patients who performed the 30MWT at baseline, there was very high correlation among the three baseline 30MWT measurements (r=0.9569-0.9919). The 30MWT demonstrated good convergent and divergent validity. It was moderately correlated with the Nurick (r=0.4932), mJOA (r=-0.4424), and SF-36v2 PCS (r=-0.3537) (convergent validity) and poorly correlated with the NDI (r=0.2107) and SF-36v2 MCS (r=-0.1984) (divergent validity). Overall, the 30MWT was not responsive to change (standardized response mean [SRM]=0.30). However, for patients who had a baseline time above the median value of 29 seconds, the SRM was 0.45. The 30MWT shows high test-retest reliability and good divergent and convergent validity. It is responsive to change only in patients with more severe myelopathy. The 30MWT is a simple, quick, and affordable test, and should be used as an ancillary test to evaluate gait parameters in patients with DCM. Copyright © 2016 Elsevier Inc. All rights reserved.
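    Responsiveness in this abstract is summarized by the standardized response mean (SRM). A minimal sketch follows, assuming the usual definition: the mean of the paired change scores divided by the standard deviation of those change scores. The function name and the walking times are hypothetical, and change is taken as baseline minus follow-up so that a faster walk gives a positive value, which is a sign convention chosen for this example.

```python
from statistics import mean, stdev


def standardized_response_mean(baseline, follow_up):
    """SRM = mean(change) / sample SD(change), with change = baseline - follow_up."""
    if len(baseline) != len(follow_up):
        raise ValueError("baseline and follow_up must be paired")
    if len(baseline) < 2:
        raise ValueError("at least two paired observations are required")
    changes = [b - f for b, f in zip(baseline, follow_up)]
    return mean(changes) / stdev(changes)


# Hypothetical 30-m walking times in seconds (not from the study).
baseline_s = [30.0, 35.0, 28.0, 40.0]
six_month_s = [28.0, 30.0, 27.0, 34.0]
srm = standardized_response_mean(baseline_s, six_month_s)
print(round(srm, 2))
```

    Because the denominator is the spread of the change scores, a measure can show a sizeable average improvement and still have a small SRM if patients change by very different amounts.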

  3. Validity and reliability of the Omron HJ-303 tri-axial accelerometer-based pedometer.

    PubMed

    Steeves, Jeremy A; Tyo, Brian M; Connolly, Christopher P; Gregory, Douglas A; Stark, Nyle A; Bassett, David R

    2011-09-01

    This study compared the validity of a new Omron HJ-303 piezoelectric pedometer and 2 other pedometers (Sportline Traq and Yamax SW200). To examine the effect of speed, 60 subjects walked on a treadmill at 2, 3, and 4 mph. Twenty subjects also ran at 6, 7, and 8 mph. To test lifestyle activities, 60 subjects performed front-back-side-side stepping, elliptical machine and stair climbing/descending. Twenty others performed ballroom dancing. To test device reliability, 60 participants completed five 100-step trials while wearing 5 different sets of the devices. Actual steps were determined using a hand tally counter. Significant differences existed among pedometers (P < .05). For walking, the Omron pedometers were the most valid. The Sportline overestimated and the Yamax underestimated steps (P < .05). Worn on the waist or in the backpack, the Omron device and Sportline were valid for running. The Omron was valid for 3 activities (elliptical machine, ascending and descending stairs). The Sportline overestimated all of these activities, and Yamax was only valid for descending stairs. The Omron and Yamax were both valid and reliable in the 100-step trials. The Omron HJ-303, worn on the waist, appeared to be the most valid of the 3 pedometers.

  4. A Mode Propagation Database Suitable for Code Validation Utilizing the NASA Glenn Advanced Noise Control Fan and Artificial Sources

    NASA Technical Reports Server (NTRS)

    Sutliff, Daniel L.

    2014-01-01

    The NASA Glenn Research Center's Advanced Noise Control Fan (ANCF) was developed in the early 1990s to provide a convenient test bed to measure and understand fan-generated acoustics, duct propagation, and radiation to the farfield. A series of tests were performed primarily for the use of code validation and tool validation. Rotating Rake mode measurements were acquired for parametric sets of: (i) mode blockage, (ii) liner insertion loss, (iii) short ducts, and (iv) mode reflection.

  6. Integrated Resilient Aircraft Control Project Full Scale Flight Validation

    NASA Technical Reports Server (NTRS)

    Bosworth, John T.

    2009-01-01

    Objective: Provide validation of adaptive control law concepts through full scale flight evaluation. Technical Approach: a) Engage failure mode - destabilizing or frozen surface. b) Perform formation flight and air-to-air tracking tasks. Evaluate adaptive algorithm: a) Stability metrics. b) Model following metrics. Full scale flight testing provides an ability to validate different adaptive flight control approaches. Full scale flight testing adds credence to NASA's research efforts. A sustained research effort is required to remove the road blocks and provide adaptive control as a viable design solution for increased aircraft resilience.

  7. SMART empirical approaches for predicting field performance of PV modules from results of reliability tests

    NASA Astrophysics Data System (ADS)

    Hardikar, Kedar Y.; Liu, Bill J. J.; Bheemreddy, Venkata

    2016-09-01

    Gaining an understanding of degradation mechanisms and their characterization are critical in developing relevant accelerated tests to ensure PV module performance warranty over a typical lifetime of 25 years. As newer technologies are adapted for PV, including new PV cell technologies, new packaging materials, and newer product designs, the availability of field data over extended periods of time for product performance assessment cannot be expected within the typical timeframe for business decisions. In this work, to enable product design decisions and product performance assessment for PV modules utilizing newer technologies, Simulation and Mechanism based Accelerated Reliability Testing (SMART) methodology and empirical approaches to predict field performance from accelerated test results are presented. The method is demonstrated for field life assessment of flexible PV modules based on degradation mechanisms observed in two accelerated tests, namely, Damp Heat and Thermal Cycling. The method is based on design of accelerated testing scheme with the intent to develop relevant acceleration factor models. The acceleration factor model is validated by extensive reliability testing under different conditions going beyond the established certification standards. Once the acceleration factor model is validated for the test matrix a modeling scheme is developed to predict field performance from results of accelerated testing for particular failure modes of interest. Further refinement of the model can continue as more field data becomes available. While the demonstration of the method in this work is for thin film flexible PV modules, the framework and methodology can be adapted to other PV products.

  8. Grid Modernization Laboratory Consortium - Testing and Verification

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kroposki, Benjamin; Skare, Paul; Pratt, Rob

    This paper highlights some of the unique testing capabilities and projects being performed at several national laboratories as part of the U.S. Department of Energy Grid Modernization Laboratory Consortium. As part of this effort, the Grid Modernization Laboratory Consortium Testing Network is being developed to accelerate grid modernization by enabling access to a comprehensive testing infrastructure and creating a repository of validated models and simulation tools that will be publicly available. This work is key to accelerating the development, validation, standardization, adoption, and deployment of new grid technologies to help meet U.S. energy goals.

  9. The sensitivity of laboratory tests assessing driving related skills to dose-related impairment of alcohol: A literature review.

    PubMed

    Jongen, S; Vuurman, E F P M; Ramaekers, J G; Vermeeren, A

    2016-04-01

    Laboratory tests assessing driving related skills can be useful as initial screening tools to assess potential drug induced impairment as part of a standardized behavioural assessment. Unfortunately, consensus about which laboratory tests should be included to reliably assess drug induced impairment has not yet been reached. The aim of the present review was to evaluate the sensitivity of laboratory tests to the dose dependent effects of alcohol, as a benchmark, on performance parameters. In total, 179 experimental studies were included. Results show that a cued go/no-go task and a divided attention test with primary tracking and secondary visual search were consistently sensitive to the impairing effects at medium and high blood alcohol concentrations. Driving performance assessed in a simulator was less sensitive to the effects of alcohol as compared to naturalistic, on-the-road driving. In conclusion, replicating results of several potentially useful tests and their predictive validity of actual driving impairment deserve further research. In addition, driving simulators should be validated and compared head to head to naturalistic driving in order to increase construct validity. Copyright © 2016 The Authors. Published by Elsevier Ltd. All rights reserved.

  10. The Reading the Mind in the Eyes test: validation of a French version and exploration of cultural variations in a multi-ethnic city.

    PubMed

    Prevost, Marie; Carrier, Marie-Eve; Chowne, Gabrielle; Zelkowitz, Phyllis; Joseph, Lawrence; Gold, Ian

    2014-01-01

    The first aim of our study was to validate the French version of the Reading the Mind in the Eyes test, a theory of mind test. The second aim was to test whether cultural differences modulate performance on this test. A total of 109 participants completed the original English version and 97 participants completed the French version. Another group of 30 participants completed the French version twice, one week apart. We report a similar overall distribution of scores in both versions and no differences in the mean scores between them. However, 2 items in the French version did not collect a majority of responses, which differed from the results of the English version. Test-retest showed good stability of the French version. As expected, participants who do not speak French or English at home, and those born in Asia, performed worse than North American participants, and those who speak English or French at home. We report a French version with acceptable validity and good stability. The cultural differences observed support the idea that Asian culture does not use theory of mind to explain people's behaviours as much as North American people do.

  11. Development and validation of trauma surgical skills metrics: Preliminary assessment of performance after training.

    PubMed

    Shackelford, Stacy; Garofalo, Evan; Shalin, Valerie; Pugh, Kristy; Chen, Hegang; Pasley, Jason; Sarani, Babak; Henry, Sharon; Bowyer, Mark; Mackenzie, Colin F

    2015-07-01

    Maintaining trauma-specific surgical skills is an ongoing challenge for surgical training programs. An objective assessment of surgical skills is needed. We hypothesized that a validated surgical performance assessment tool could detect differences following a training intervention. We developed surgical performance assessment metrics based on discussion with expert trauma surgeons, video review of 10 experts and 10 novice surgeons performing three vascular exposure procedures and lower extremity fasciotomy on cadavers, and validated the metrics with interrater reliability testing by five reviewers blinded to level of expertise and a consensus conference. We tested these performance metrics in 12 surgical residents (Year 3-7) before and 2 weeks after vascular exposure skills training in the Advanced Surgical Skills for Exposure in Trauma (ASSET) course. Performance was assessed in three areas as follows: knowledge (anatomic, management), procedure steps, and technical skills. Time to completion of procedures was recorded, and these metrics were combined into a single performance score, the Trauma Readiness Index (TRI). Wilcoxon matched-pairs signed-ranks test compared pretraining/posttraining effects. Mean time to complete procedures decreased by 4.3 minutes (from 13.4 minutes to 9.1 minutes). The performance component most improved by the 1-day skills training was procedure steps, completion of which increased by 21%. Technical skill scores improved by 12%. Overall knowledge improved by 3%, with 18% improvement in anatomic knowledge. TRI increased significantly from 50% to 64% with ASSET training. Interrater reliability of the surgical performance assessment metrics was validated with single intraclass correlation coefficient of 0.7 to 0.98. A trauma-relevant surgical performance assessment detected improvements in specific procedure steps and anatomic knowledge taught during a 1-day course, quantified by the TRI. ASSET training reduced time to complete vascular control by one third. Future applications include assessing specific skills in a larger surgeon cohort, assessing military surgical readiness, and quantifying skill degradation with time since training.

  12. The Short Physical Performance Battery is a discriminative tool for identifying patients with COPD at risk of disability.

    PubMed

    Bernabeu-Mora, Roberto; Medina-Mirapeix, Françesc; Llamazares-Herrán, Eduardo; García-Guillamón, Gloria; Giménez-Giménez, Luz María; Sánchez-Nieto, Juan Miguel

    2015-01-01

    Limited mobility is a risk factor for developing chronic obstructive pulmonary disease (COPD)-related disabilities. Little is known about the validity of the Short Physical Performance Battery (SPPB) for identifying mobility limitations in patients with COPD. To determine the clinical validity of the SPPB summary score and its three components (standing balance, 4-meter gait speed, and five-repetition sit-to-stand) for identifying mobility limitations in patients with COPD. This cross-sectional study included 137 patients with COPD, recruited from a hospital in Spain. Muscle strength tests and SPPB were measured; then, patients were surveyed for self-reported mobility limitations. The validity of SPPB scores was analyzed by developing receiver operating characteristic curves to analyze the sensitivity and specificity for identifying patients with mobility limitations; by examining group differences in SPPB scores across categories of mobility activities; and by correlating SPPB scores to strength tests. Only the SPPB summary score and the five-repetition sit-to-stand components showed good discriminative capabilities; both showed areas under the receiver operating characteristic curves greater than 0.7. Patients with limitations had significantly lower SPPB scores than patients without limitations in nine different mobility activities. SPPB scores were moderately correlated with the quadriceps test (r>0.40), and less correlated with the handgrip test (r<0.30), which reinforced convergent and divergent validities. A SPPB summary score cutoff of 10 provided the best accuracy for identifying mobility limitations. This study provided evidence for the validity of the SPPB summary score and the five-repetition sit-to-stand test for assessing mobility in patients with COPD. These tests also showed potential as a screening test for identifying patients with COPD that have mobility limitations.
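
    The ROC-based analysis described in this abstract (sensitivity and specificity at a cutoff, and area under the curve) can be illustrated with a minimal Python sketch. The scores and labels below are invented for illustration and are not data from the study; the function names are hypothetical, the rule "limited if score <= cutoff" is an assumption, and choosing the cutoff by Youden's J is one common convention that the abstract does not name.

```python
# Invented example data: summary scores (0-12) and whether the
# patient reports a mobility limitation. NOT data from the study.
scores = [4, 6, 7, 8, 9, 9, 8, 10, 10, 11, 11, 12]
limited = [True] * 6 + [False] * 6


def sens_spec(scores, limited, cutoff):
    """Flag 'limited' when score <= cutoff; return (sensitivity, specificity)."""
    tp = sum(1 for s, y in zip(scores, limited) if y and s <= cutoff)
    fn = sum(1 for s, y in zip(scores, limited) if y and s > cutoff)
    tn = sum(1 for s, y in zip(scores, limited) if not y and s > cutoff)
    fp = sum(1 for s, y in zip(scores, limited) if not y and s <= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def auc(scores, limited):
    """Area under the ROC curve via pairwise comparison (ties count 1/2).

    Equals the probability that a randomly chosen limited patient scores
    lower than a randomly chosen non-limited patient.
    """
    pos = [s for s, y in zip(scores, limited) if y]
    neg = [s for s, y in zip(scores, limited) if not y]
    wins = sum((p < n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def best_cutoff(scores, limited):
    """Cutoff maximizing Youden's J = sensitivity + specificity - 1."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda c: sum(sens_spec(scores, limited, c)) - 1)


cutoff = best_cutoff(scores, limited)
print(cutoff, sens_spec(scores, limited, cutoff), round(auc(scores, limited), 3))
```

    With real data, an area under the curve above 0.7 would correspond to the "good discriminative capabilities" threshold the abstract reports.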

  13. The Short Physical Performance Battery is a discriminative tool for identifying patients with COPD at risk of disability

    PubMed Central

    Bernabeu-Mora, Roberto; Medina-Mirapeix, Françesc; Llamazares-Herrán, Eduardo; García-Guillamón, Gloria; Giménez-Giménez, Luz María; Sánchez-Nieto, Juan Miguel

    2015-01-01

    Background: Limited mobility is a risk factor for developing chronic obstructive pulmonary disease (COPD)-related disabilities. Little is known about the validity of the Short Physical Performance Battery (SPPB) for identifying mobility limitations in patients with COPD. Objective: To determine the clinical validity of the SPPB summary score and its three components (standing balance, 4-meter gait speed, and five-repetition sit-to-stand) for identifying mobility limitations in patients with COPD. Methods: This cross-sectional study included 137 patients with COPD, recruited from a hospital in Spain. Muscle strength tests and SPPB were measured; then, patients were surveyed for self-reported mobility limitations. The validity of SPPB scores was analyzed by developing receiver operating characteristic curves to analyze the sensitivity and specificity for identifying patients with mobility limitations; by examining group differences in SPPB scores across categories of mobility activities; and by correlating SPPB scores to strength tests. Results: Only the SPPB summary score and the five-repetition sit-to-stand components showed good discriminative capabilities; both showed areas under the receiver operating characteristic curves greater than 0.7. Patients with limitations had significantly lower SPPB scores than patients without limitations in nine different mobility activities. SPPB scores were moderately correlated with the quadriceps test (r>0.40), and less correlated with the handgrip test (r<0.30), which reinforced convergent and divergent validities. A SPPB summary score cutoff of 10 provided the best accuracy for identifying mobility limitations. Conclusion: This study provided evidence for the validity of the SPPB summary score and the five-repetition sit-to-stand test for assessing mobility in patients with COPD. These tests also showed potential as a screening test for identifying patients with COPD that have mobility limitations. PMID: 26664110

  14. ExEP yield modeling tool and validation test results

    NASA Astrophysics Data System (ADS)

    Morgan, Rhonda; Turmon, Michael; Delacroix, Christian; Savransky, Dmitry; Garrett, Daniel; Lowrance, Patrick; Liu, Xiang Cate; Nunez, Paul

    2017-09-01

    EXOSIMS is an open-source simulation tool for parametric modeling of the detection yield and characterization of exoplanets. EXOSIMS has been adopted by the Exoplanet Exploration Programs Standards Definition and Evaluation Team (ExSDET) as a common mechanism for comparison of exoplanet mission concept studies. To ensure trustworthiness of the tool, we developed a validation test plan that leverages the Python-language unit-test framework, utilizes integration tests for selected module interactions, and performs end-to-end crossvalidation with other yield tools. This paper presents the test methods and results, with the physics-based tests such as photometry and integration time calculation treated in detail and the functional tests treated summarily. The test case utilized a 4m unobscured telescope with an idealized coronagraph and an exoplanet population from the IPAC radial velocity (RV) exoplanet catalog. The known RV planets were set at quadrature to allow deterministic validation of the calculation of physical parameters, such as working angle, photon counts and integration time. The observing keepout region was tested by generating plots and movies of the targets and the keepout zone over a year. Although the keepout integration test required the interpretation of a user, the test revealed problems in the L2 halo orbit and the parameterization of keepout applied to some solar system bodies, which the development team was able to address. The validation testing of EXOSIMS was performed iteratively with the developers of EXOSIMS and resulted in a more robust, stable, and trustworthy tool that the exoplanet community can use to simulate exoplanet direct-detection missions from probe class, to WFIRST, up to large mission concepts such as HabEx and LUVOIR.
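
    The kind of deterministic, physics-based unit test this abstract describes can be sketched with Python's standard unittest framework. This is not EXOSIMS code: the function working_angle_arcsec and the test class are hypothetical names, and the sketch assumes a circular orbit so that the projected separation at quadrature equals the orbital radius.

```python
import math
import unittest

ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi
AU_PER_PC = 648000.0 / math.pi  # follows from the definition of the parsec


def working_angle_arcsec(sep_au: float, dist_pc: float) -> float:
    """Angular separation (arcsec) of a planet at quadrature.

    Hypothetical helper: sep_au is the star-planet separation in AU,
    dist_pc is the distance to the system in parsecs.
    """
    return math.atan(sep_au / (dist_pc * AU_PER_PC)) * ARCSEC_PER_RAD


class TestWorkingAngle(unittest.TestCase):
    def test_one_au_at_ten_parsec(self):
        # 1 AU seen from 10 pc subtends 0.1 arcsec in the small-angle limit.
        self.assertAlmostEqual(working_angle_arcsec(1.0, 10.0), 0.1, places=9)

    def test_scales_inversely_with_distance(self):
        # Halving the distance should double the working angle.
        near = working_angle_arcsec(5.2, 5.0)
        far = working_angle_arcsec(5.2, 10.0)
        self.assertAlmostEqual(near / far, 2.0, places=6)
```

    Saved as a module, the tests can be run with `python -m unittest`. Fixing the planet at quadrature removes the dependence on orbital phase, which is what makes an exact expected value possible.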

  15. The Brief Fear of Negative Evaluation Scale (BFNE): translation and validation study of the Iranian version.

    PubMed

    Tavoli, Azadeh; Melyani, Mahdiyeh; Bakhtiari, Maryam; Ghaedi, Gholam Hossein; Montazeri, Ali

    2009-07-09

    The Brief Fear of Negative Evaluation Scale (BFNE) is a commonly used instrument to measure social anxiety. This study aimed to translate and to test the reliability and validity of the BFNE in Iran. The English language version of the BFNE was translated into Persian (Iranian language) and was used in this study. The questionnaire was administered to a consecutive sample of 235 students with (n = 33, clinical group) and without social phobia (n = 202, non-clinical group). In addition to the BFNE, two standard instruments were used to measure social phobia severity: the Social Phobia Inventory (SPIN), and the Social Interaction Anxiety Scale (SIAS). All participants completed a brief background information questionnaire, the SPIN, the SIAS and the BFNE scales. Statistical analysis was performed to test the reliability and validity of the BFNE. In all, 235 students were studied (111 male and 124 female). The mean age for the non-clinical group was 22.2 (SD = 2.1) years and for the clinical sample it was 22.4 (SD = 1.8) years. Cronbach's alpha coefficient (to test reliability) was acceptable for both non-clinical and clinical samples (alpha = 0.90 and 0.82 respectively). In addition, 3-week test-retest reliability was assessed in the non-clinical sample and the intraclass correlation coefficient (ICC) was quite high (ICC = 0.71). Validity, assessed as convergent and discriminant validity, showed satisfactory results. The questionnaire correlated well with established measures of social phobia such as the SPIN (r = 0.43, p < 0.001) and the SIAS (r = 0.54, p < 0.001). Also the BFNE discriminated well between men and women with and without social phobia in the expected direction. Factor analysis supported a two-factor solution corresponding to positive and reverse-worded items. This validation study of the Iranian version of BFNE proved that it is an acceptable, reliable and valid measure of social phobia. However, since the scale showed a two-factor structure and this does not conform to the theoretical basis for the BFNE, we suggest the use of the BFNE-II when it becomes available in Iran. The validation study of the BFNE-II is in progress.
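
    Cronbach's alpha, the internal-consistency statistic reported above, can be computed from item-level responses as in the following minimal Python sketch. The response matrix is invented for illustration and is not data from the study; the function name cronbach_alpha is hypothetical, and sample (n - 1) variances are assumed.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a respondents-by-items score matrix.

    rows: one list of item scores per respondent (all the same length).
    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score))
    """
    k = len(rows[0])
    item_vars = [variance([r[i] for r in rows]) for i in range(k)]
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Invented responses: four respondents, two items.
responses = [[1, 2], [2, 1], [3, 4], [4, 3]]
print(round(cronbach_alpha(responses), 2))
```

    Items that rise and fall together across respondents push alpha toward 1; unrelated items push it toward 0.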

  16. Developing and Testing the Guitar Songleading Performance Scale (GSPS)

    ERIC Educational Resources Information Center

    Silverman, Michael J.

    2011-01-01

    Guitar songleading is a critical component in music education and music therapy training curricula. However, at present, there is no standardized instrument to evaluate guitar songleading performance that is both valid and reliable. The purpose of this article is to describe the construction, development, and testing of a guitar songleading…

  17. Dynamic CFD Simulations of the Supersonic Inflatable Aerodynamic Decelerator (SIAD) Ballistic Range Tests

    NASA Technical Reports Server (NTRS)

    Brock, Joseph M; Stern, Eric

    2016-01-01

    Dynamic CFD simulations of the SIAD ballistic test model were performed using the US3D flow solver. These simulations were motivated by the need to validate and verify the US3D flow solver as a viable computational tool for predicting dynamic coefficients.

  18. Using the Rasch analysis for the psychometric validation of the Irregular Word Reading Test (TeLPI): A Portuguese test for the assessment of premorbid intelligence.

    PubMed

    Freitas, Sandra; Prieto, Gerardo; Simões, Mário R; Nogueira, Joana; Santana, Isabel; Martins, Cristina; Alves, Lara

    2018-05-03

    The present study aims to analyze the psychometric characteristics of the TeLPI (Irregular Words Reading Test), a Portuguese premorbid intelligence test, using the Rasch model for dichotomous items. The results reveal an overall adequacy and a good fit of values regarding both items and persons. A high variability of cognitive performance level and a good quality of the measurements were also found. The TeLPI has proved to be a unidimensional measure with reduced DIF effects. The present findings contribute to overcome an important gap in the psychometric validity of this instrument and provide good evidence of the overall psychometric validity of TeLPI results.

  19. Thermo-mechanical evaluation of carbon-carbon primary structure for SSTO vehicles

    NASA Astrophysics Data System (ADS)

    Croop, Harold C.; Lowndes, Holland B.; Hahn, Steven E.; Barthel, Chris A.

    1998-01-01

    An advanced development program to demonstrate carbon-carbon composite structure for use as primary load carrying structure has entered the experimental validation phase. The component being evaluated is a wing torque box section for a single-stage-to-orbit (SSTO) vehicle. The validation or demonstration component features an advanced carbon-carbon design incorporating 3D woven graphite preforms, integral spars, oxidation inhibited matrix, chemical vapor deposited (CVD) oxidation protection coating, and ceramic matrix composite fasteners. The validation component represents the culmination of a four phase design and fabrication development effort. Extensive developmental testing was performed to verify material properties and integrity of basic design features before committing to fabrication of the full scale box. The wing box component is now being set up for testing in the Air Force Research Laboratory Structural Test Facility at Wright-Patterson Air Force Base, Ohio. One of the important developmental tests performed in support of the design and planned testing of the full scale box was the fabrication and test of a skin/spar trial subcomponent. The trial subcomponent incorporated critical features of the full scale wing box design. This paper discusses the results of the trial subcomponent test which served as a pathfinder for the upcoming full scale box test.

  20. Invalid before impaired: an emerging paradox of embedded validity indicators.

    PubMed

    Erdodi, Laszlo A; Lichtenstein, Jonathan D

    Embedded validity indicators (EVIs) are cost-effective psychometric tools to identify non-credible response sets during neuropsychological testing. As research on EVIs expands, assessors are faced with an emerging contradiction: the range of credible impairment disappears between the 'normal' and 'invalid' range of performance. We labeled this phenomenon as the invalid-before-impaired paradox. This study was designed to explore the origin of this psychometric anomaly, subject it to empirical investigation, and generate potential solutions. Archival data were analyzed from a mixed clinical sample of 312 (mean age = 45.2; mean education = 13.6) patients medically referred for neuropsychological assessment. The distribution of scores on eight subtests of the third and fourth editions of Wechsler Adult Intelligence Scale (WAIS) were examined in relation to the standard normal curve and two performance validity tests (PVTs). Although WAIS subtests varied in their sensitivity to non-credible responding, they were all significant predictors of performance validity. While subtests previously identified as EVIs (Digit Span, Coding, and Symbol Search) were comparably effective at differentiating credible and non-credible response sets, their classification accuracy was driven by their base rate of low scores, requiring different cutoffs to achieve comparable specificity. Invalid performance had a global effect on WAIS scores. Genuine impairment and non-credible performance can co-exist, are often intertwined, and may be psychometrically indistinguishable. A compromise between the alpha and beta bias on PVTs based on a balanced, objective evaluation of the evidence that requires concessions from both sides is needed to maintain/restore the credibility of performance validity assessment.

  1. The Validity and Reliability of the Gymaware Linear Position Transducer for Measuring Counter-Movement Jump Performance in Female Athletes

    ERIC Educational Resources Information Center

    O'Donnell, Shannon; Tavares, Francisco; McMaster, Daniel; Chambers, Samuel; Driller, Matthew

    2018-01-01

    The current study aimed to assess the validity and test-retest reliability of a linear position transducer when compared to a force plate through a counter-movement jump in female participants. Twenty-seven female recreational athletes (19 ± 2 years) performed three counter-movement jumps simultaneously using the linear position transducer and…

  2. Proposal and validation of a clinical trunk control test in individuals with spinal cord injury.

    PubMed

    Quinzaños, J; Villa, A R; Flores, A A; Pérez, R

    2014-06-01

    One of the problems that arise in spinal cord injury (SCI) is alteration in trunk control. Despite the need for standardized scales, these do not exist for evaluating trunk control in SCI. To propose and validate a trunk control test in individuals with SCI. National Institute of Rehabilitation, Mexico. The test was developed and later evaluated for reliability and criterion, content, and construct validity. We carried out 531 tests on 177 patients and found high inter- and intra-rater reliability. In terms of criterion validity, analysis of variance demonstrated a statistically significant difference in the test score of patients with adequate or inadequate trunk control according to the assessment of a group of experts. A receiver operating characteristic curve was plotted for optimizing the instrument's cutoff point, which was determined at 13 points, with a sensitivity of 98% and a specificity of 92.2%. With regard to construct validity, the correlation between the proposed test and the spinal cord independence measure (SCIM) was 0.873 (P=0.001) and that with the evolution time was 0.437 (P=0.001). For testing the hypothesis with qualitative variables, the Kruskal-Wallis test was performed, which resulted in a statistically significant difference between the scores in the proposed scale of each group defined by these variables. It was proven experimentally that the proposed trunk control test is valid and reliable. Furthermore, the test can be used for all patients with SCI regardless of the type and level of injury.

  3. Practical color vision tests for air traffic control applicants: en route center and terminal facilities.

    PubMed

    Mertens, H W; Milburn, N J; Collins, W E

    2000-12-01

    Two practical color vision tests were developed and validated for use in screening Air Traffic Control Specialist (ATCS) applicants for work at en route center or terminal facilities. The development of the tests involved careful reproduction/simulation of color-coded materials from the most demanding, safety-critical color task performed in each type of facility. The tests were evaluated using 106 subjects with normal color vision and 85 with color vision deficiency. The en route center test, named the Flight Progress Strips Test (FPST), required the identification of critical red/black coding in computer printing and handwriting on flight progress strips. The terminal option test, named the Aviation Lights Test (ALT), simulated red/green/white aircraft lights that must be identified in night ATC tower operations. Color-coding is a non-redundant source of safety-critical information in both tasks. The FPST was validated by direct comparison of responses to strip reproductions with responses to the original flight progress strips and a set of strips selected independently. Validity was high; Kappa = 0.91 with original strips as the validation criterion and 0.86 with different strips. The light point stimuli of the ALT were validated physically with a spectroradiometer. The reliabilities of the FPST and ALT were estimated with Cronbach's alpha as 0.93 and 0.98, respectively. The high job-relevance, validity, and reliability of these tests increase the effectiveness and fairness of ATCS color vision testing.

  4. An Overview of Models of Speaking Performance and Its Implications for the Development of Procedural Framework for Diagnostic Speaking Tests

    ERIC Educational Resources Information Center

    Zhao, Zhongbao

    2013-01-01

    This paper aims at developing a procedural framework for the development and validation of diagnostic speaking tests. The researcher reviews the current available models of speaking performance, analyzes the distinctive features and then points out the implications for the development of a procedural framework for diagnostic speaking tests. On…

  5. Testing Math or Testing Language? The Construct Validity of the KeyMath-Revised for Children with Intellectual Disability and Language Difficulties

    ERIC Educational Resources Information Center

    Rhodes, Katherine T.; Branum-Martin, Lee; Morris, Robin D.; Romski, MaryAnn; Sevcik, Rose A.

    2015-01-01

    Although it is often assumed that mathematics ability alone predicts mathematics test performance, linguistic demands may also predict achievement. This study examined the role of language in mathematics assessment performance for children with intellectual disability (ID) at less severe levels, on the KeyMath-Revised Inventory (KM-R) with a…

  6. NASA Ares I Launch Vehicle First Stage Roll Control System Cold Flow Development Test Program Overview

    NASA Technical Reports Server (NTRS)

    Butt, Adam; Popp, Christopher G.; Holt, Kimberly A.; Pitts, Hank M.

    2010-01-01

    The Ares I launch vehicle is the selected design, chosen to return humans to the moon, Mars, and beyond. It is configured in two inline stages: the First Stage is a Space Shuttle derived five-segment Solid Rocket Booster and the Upper Stage is powered by a Saturn V derived J-2X engine. During launch, roll control for the First Stage (FS) is handled by a dedicated Roll Control System (RoCS) located on the connecting Interstage. That system will provide the Ares I with the ability to counteract induced roll torque while any induced yaw or pitch moments are handled by vectoring of the booster nozzle. This paper provides an overview of NASA's Ares I FS RoCS cold flow development test program including detailed test objectives, types of tests run to meet those objectives, an overview of the results, and applicable lessons learned. The test article was built and tested at the NASA Marshall Space Flight Center in Huntsville, AL. The FS RoCS System Development Test Article (SDTA) is a full scale, flight representative water flow test article whose primary objective was to obtain fluid system performance data to evaluate integrated system level performance characteristics and verify analytical models. Development testing and model correlation were deemed necessary as there is little historical precedent for similar large flow, pulsing systems such as the FS RoCS. The cold flow development test program consisted of flight-similar tanks, pressure regulators, and thruster valves, as well as plumbing simulating flight geometries, combined with other facility grade components and structure. Orifices downstream of the thruster valves were used to simulate the pressure drop through the thrusters. Additional primary objectives of this test program were to: evaluate system surge pressure (waterhammer) characteristics due to thruster valve operation over a range of mission duty cycles at various feed system pressures; evaluate temperature transients and heat transfer in the pressurization system, including regulator blowdown and propellant ullage performance; measure system pressure drops for comparison to analysis of tubing and components; and validate system activation and re-activation procedures for the helium pressurant system. Secondary objectives included: validating system processes for loading, unloading, and purging; validating procedures and system response for multiple failure scenarios, including relief valve operation; and evaluating system performance for contingency scenarios. The test results of the cold flow development test program are essential in validating the performance and interaction of the Roll Control System and anchoring analysis tools and results to a Critical Design Review level of fidelity.

  7. Flight control system design factors for applying automated testing techniques

    NASA Technical Reports Server (NTRS)

    Sitz, Joel R.; Vernon, Todd H.

    1990-01-01

    Automated validation of flight-critical embedded systems is being done at ARC Dryden Flight Research Facility. The automated testing techniques are being used to perform closed-loop validation of man-rated flight control systems. The principal design features and operational experiences of the X-29 forward-swept-wing aircraft and F-18 High Alpha Research Vehicle (HARV) automated test systems are discussed. Operationally applying automated testing techniques has accentuated flight control system features that either help or hinder the application of these techniques. The paper also discusses flight control system features which foster the use of automated testing techniques.

  8. The Trunk Impairment Scale - modified to ordinal scales in the Norwegian version.

    PubMed

    Gjelsvik, Bente; Breivik, Kyrre; Verheyden, Geert; Smedal, Tori; Hofstad, Håkon; Strand, Liv Inger

    2012-01-01

    To translate the Trunk Impairment Scale (TIS), a measure of trunk control in patients after stroke, into Norwegian (TIS-NV), and to explore its construct validity, internal consistency, intertester and test-retest reliability. TIS was translated according to international guidelines. The validity study was performed on data from 201 patients with acute stroke. Fifty patients with stroke and acquired brain injury were recruited to examine intertester and test-retest reliability. Construct validity was analyzed with exploratory and confirmatory factor analysis and item response theory, internal consistency with Cronbach's alpha test, and intertester and test-retest reliability with kappa and intraclass correlation coefficient tests. The back-translated version of TIS-NV was validated by the original developer. The subscale Static sitting balance was removed. By combining items from the subscales Dynamic sitting balance and Coordination, six ordinal superitems (testlets) were constructed. The TIS-NV was renamed the modified TIS-NV (TIS-modNV). After modifications the TIS-modNV fitted well to a locally dependent unidimensional item response theory model. It demonstrated good construct validity, excellent internal consistency, and high intertester and test-retest reliability for the total score. This study supports that the TIS-modNV is a valid and reliable scale for use in clinical practice and research.

  9. Assessing reliability and validity measures in managed care studies.

    PubMed

    Montoya, Isaac D

    2003-01-01

    To review the reliability and validity literature and develop an understanding of these concepts as applied to managed care studies. Reliability is a test of how well an instrument measures the same input at varying times and under varying conditions. Validity is a test of how accurately an instrument measures what one believes is being measured. A review of reliability and validity instructional material was conducted. Studies of managed care practices and programs abound. However, many of these studies utilize measurement instruments that were developed for other purposes or for a population other than the one being sampled. In other cases, instruments have been developed without any testing of the instrument's performance. The lack of reliability and validity information may limit the value of these studies. This is particularly true when data are collected for one purpose and used for another. The usefulness of certain studies without reliability and validity measures is questionable, especially in cases where the literature contradicts itself.

  10. [Cross-cultural validated adaptation of dysfunctional voiding symptom score (DVSS) to Japanese language and cognitive linguistics in questionnaire for pediatric patients].

    PubMed

    Imamura, Masaaki; Usui, Tomoko; Johnin, Kazuyoshi; Yoshimura, Koji; Farhat, Walid; Kanematsu, Akihiro; Ogawa, Osamu

    2014-07-01

    A validated questionnaire for evaluating pediatric lower urinary tract symptoms (LUTS) is greatly needed. We performed a cross-cultural validated adaptation of the Dysfunctional Voiding Symptom Score (DVSS) to Japanese and, using a cognitive linguistic approach, assessed whether children understand and respond to the questionnaire correctly. We translated the DVSS into two Japanese versions according to a standard validation methodology: translation, synthesis, back-translation, expert review, and pre-testing. One version was written in adult language for parents, and the other in child language for children. Pre-testing was done with patients aged 5 to 15 years with normal intelligence who visited us. A specialist in cognitive linguistics acted as interviewer and observed how children and parents responded to the DVSS. When a child could not understand a question unless the parents added to or paraphrased it, this was defined as 'misidentification'. We pre-tested 2 trial versions of the DVSS before arriving at the final version. The first trial version was pre-tested with 32 patients (male to female ratio 19:13) and the second with 11 patients (male to female ratio 8:3). In the child-language DVSS, misidentification was consistently observed for expressions of time or frequency. We completed the formal validated translation by amending the problems raised in pre-testing. The cross-cultural validated adaptation of the DVSS to child and adult Japanese was completed. Since temporal perception is not fully developed in children, caution should be taken when using terms related to time or frequency in questionnaires for children.

  11. Development and psychometric testing of an abridged version of Dundee Ready Educational Environment Measure (DREEM).

    PubMed

    Jeyashree, Kathiresan; Shewade, Hemant Deepak; Kathirvel, Soundappan

    2018-04-17

    Dundee Ready Educational Environment Measure (DREEM) is a 50-item tool to assess the educational environment of medical institutions as perceived by the students. This cross-sectional study developed and validated an abridged version of the DREEM-50 with an aim to have a less resource-intensive (time, manpower), yet valid and reliable, version of DREEM-50 while also avoiding respondent fatigue. A methodology similar to that used in the development of WHO-BREF was adopted to develop the abridged version of DREEM. Medical students (n = 418) from a private teaching hospital in Madurai, India, were divided into two groups. Group I (n = 277) participated in the development of the abridged version. This was performed by domain-wise selection of items that had the highest item-total correlation. Group II (n = 141) participated in the testing of the abridged version for construct validity, internal consistency and test-retest reliability. Confirmatory factor analysis was performed to assess the construct validity of DREEM-12. The abridged version had 12 items (DREEM-12) spread over all five domains in DREEM-50. DREEM-12 explained 77.4% of the variance in DREEM-50 scores. Correlation between total scores of DREEM-50 and DREEM-12 was 0.88 (p < 0.001). Confirmatory factor analysis of DREEM-12 construct was statistically significant (LR test of model vs. saturated p = 0.0006). The internal consistency of DREEM-12 was 0.83. The test-retest reliability of DREEM-12 was 0.595, p < 0.001. DREEM-12 is a valid and reliable tool for use in educational research. Future research using DREEM-12 will establish its validity and reliability across different settings.
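The internal consistency figure reported above is the kind of statistic usually computed as Cronbach's alpha. The abstract does not say which software or formula variant was used, so the following is only a minimal sketch of the standard formula; the function name `cronbach_alpha` and the toy responses are illustrative assumptions, not study data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    # Transpose so that each tuple holds every respondent's score on one item.
    item_columns = list(zip(*responses))
    item_variances = [variance(column) for column in item_columns]
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical 4 respondents x 3 items (made-up numbers, not DREEM data).
toy_responses = [[3, 4, 3], [2, 2, 3], [4, 5, 4], [1, 2, 2]]
print(round(cronbach_alpha(toy_responses), 3))
```

Sample variances are used for both the items and the totals; using population variances throughout would give the same alpha, since the correction factor cancels in the ratio.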

  12. Suitability Screening Test for Marine Corps Air Traffic Controllers Phase 3: Non-cognitive Test Validation and Cognitive Test Prototype

    DTIC Science & Technology

    2014-06-01

    Individuals possess a variety of abilities, preferences, interests, and personal characteristics that should be useful in predicting who will be best suited... traits. Through both concurrent and predictive validity designs, scores on the NCAPS were correlated with measures of schoolhouse academic performance and...

  13. 6DOF Testing of the SLS Inertial Navigation Unit

    NASA Technical Reports Server (NTRS)

    Geohagan, Kevin; Bernard, Bill; Oliver, T. Emerson; Leggett, Jared; Strickland, Dennis

    2018-01-01

    The Navigation System on the NASA Space Launch System (SLS) Block 1 vehicle performs initial alignment of the Inertial Navigation System (INS) navigation frame through gyrocompass alignment (GCA). Because the navigation architecture for the SLS Block 1 vehicle is a purely inertial system, the accuracy of the achieved orbit relative to mission requirements is very sensitive to initial alignment accuracy. The assessment of this sensitivity and many others via simulation is a part of the SLS Model-Based Design and Model-Based Requirements approach. As a part of the aforementioned, 6DOF Monte Carlo simulation is used in large part to develop and demonstrate verification of program requirements. To facilitate this and the GN&C flight software design process, an SLS-Program-controlled Design Math Model (DMM) of the SLS INS was developed by the SLS Navigation Team. The SLS INS model implements all of the key functions of the hardware, namely GCA, inertial navigation, and FDIR (Fault Detection, Isolation, and Recovery), in support of SLS GN&C design requirements verification. Despite the strong sensitivity to initial alignment, GCA accuracy requirements were not verified by test due to program cost and schedule constraints. Instead, the system relies upon assessments performed using the SLS INS model. In order to verify SLS program requirements by analysis, the SLS INS model is verified and validated against flight hardware. In lieu of direct testing of GCA accuracy in support of requirement verification, the SLS Navigation Team proposed and conducted an engineering test to, among other things, validate the GCA performance and overall behavior of the SLS INS model through comparison with test data. This paper will detail dynamic hardware testing of the SLS INS, conducted by the SLS Navigation Team at Marshall Space Flight Center's 6DOF Table Facility, in support of GCA performance characterization and INS model validation. A 6-DOF motion platform was used to produce 6DOF pad twist and sway dynamics while a simulated SLS flight computer communicated with the INS. Tests conducted include an evaluation of GCA algorithm robustness to increasingly dynamic pad environments, an examination of GCA algorithm stability and accuracy over long durations, and a long-duration static test to gather enough data for Allan Variance analysis. Test setup, execution, and data analysis will be discussed, including analysis performed in support of SLS INS model validation.

  14. Rapid detection of generalized anxiety disorder and major depression in epilepsy: Validation of the GAD-7 as a complementary tool to the NDDI-E in a French sample.

    PubMed

    Micoulaud-Franchi, Jean-Arthur; Lagarde, Stanislas; Barkate, Gérald; Dufournet, Boris; Besancon, Cyril; Trébuchon-Da Fonseca, Agnès; Gavaret, Martine; Bartolomei, Fabrice; Bonini, Francesca; McGonigal, Aileen

    2016-04-01

    Generalized anxiety disorder (GAD) in people with epilepsy (PWE) is underdiagnosed and undertreated. The GAD-7 is a screening questionnaire to detect GAD. However, the usefulness of the GAD-7 as a screening tool in PWE remains to be validated. Thus, we aimed to: (1) validate the GAD-7 in French PWE and (2) assess its complementarity with regard to the previously validated screening tool for depression, the Neurological Disorders Depression Inventory for Epilepsy (NDDI-E). This study was performed under the auspices of the ILAE Commission on Neuropsychiatry. People with epilepsy >18 years of age were recruited from the specialist epilepsy unit in Marseille, France. The Mini-International Neuropsychiatric Interview (MINI) was performed as gold standard, and the Penn State Worry Questionnaire (PSWQ) and the NDDI-E were performed for external validity. Data were compared between PWE with/without GAD using the chi-square test and Student's t-test. Internal structural validity, external validity, and receiver operator characteristics were analyzed. A principal component factor analysis with Varimax rotation was performed on the 13 items of the GAD-7 (7 items) plus the NDDI-E (6 items). Testing was performed on 145 PWE: mean age = 39.38 years old (SD = 14.01, range: 18-75); 63.4% (92) women; 75.9% with focal epilepsy. Using the MINI, 49 (33.8%) patients had current GAD. Cronbach's alpha coefficient was 0.898, indicating satisfactory internal consistency. Correlation between GAD-7 and the PSWQ scores was high (r(145) = .549, P < .0001), indicating good external validity. Factor analysis shows that the anxiety investigated with the GAD-7 and depression investigated with the NDDI-E reflect distinct factors. Receiver operator characteristic analysis showed area under the curve of 0.899 (95% CI 0.838-0.943, P < 0.0001) indicating good capacity of the GAD-7 to detect GAD (defined by MINI). Cutoff for maximal sensitivity and specificity was 7. Mean GAD-7 score in PWE with GAD was 13.22 (SD = 3.99), and that without GAD was 5.17 (SD = 4.66). This study validates the French language version of the GAD-7 screening tool for generalized anxiety in PWE, with a cutoff score of 7/21 for GAD, and also confirms that the GAD-7 is a short and easily administered test. Factor analysis shows that the GAD-7 (screening for generalized anxiety disorder) and the NDDI-E (screening for major depression) provide complementary information. The routine use of both GAD-7 and NDDI-E should be considered in clinical evaluation of patients with epilepsy.
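A cutoff "for maximal sensitivity and specificity" is often chosen by maximizing Youden's J (sensitivity + specificity - 1) over candidate thresholds. The abstract does not state which criterion the authors applied, nor whether a score equal to the cutoff counts as positive, so the sketch below makes both choices explicit as assumptions. All names and data are hypothetical.

```python
def sensitivity_specificity(scores, has_condition, cutoff):
    """Sensitivity and specificity when 'score > cutoff' is called positive.

    Assumption: strictly greater than the cutoff counts as a positive screen.
    Both classes must be present, otherwise a division by zero occurs.
    """
    pairs = list(zip(scores, has_condition))
    true_pos = sum(1 for s, y in pairs if y and s > cutoff)
    false_neg = sum(1 for s, y in pairs if y and s <= cutoff)
    true_neg = sum(1 for s, y in pairs if not y and s <= cutoff)
    false_pos = sum(1 for s, y in pairs if not y and s > cutoff)
    return true_pos / (true_pos + false_neg), true_neg / (true_neg + false_pos)


def best_cutoff_by_youden(scores, has_condition, candidates):
    """Return the first candidate cutoff that maximizes Youden's J."""
    def youden_j(cutoff):
        sens, spec = sensitivity_specificity(scores, has_condition, cutoff)
        return sens + spec - 1
    return max(candidates, key=youden_j)


# Made-up screening totals on a 0-21 scale with a reference diagnosis.
example_scores = [2, 3, 5, 6, 9, 12, 14, 15]
example_labels = [0, 0, 0, 0, 1, 1, 1, 1]
print(best_cutoff_by_youden(example_scores, example_labels, range(0, 22)))
```

When several cutoffs tie on J, `max` returns the first one in `candidates`; a real analysis would normally also weigh the clinical cost of false negatives against false positives.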

  15. Traditional vs. Sport-Specific Vertical Jump Tests: Reliability, Validity, and Relationship With the Legs Strength and Sprint Performance in Adult and Teen Soccer and Basketball Players.

    PubMed

    Rodríguez-Rosell, David; Mora-Custodio, Ricardo; Franco-Márquez, Felipe; Yáñez-García, Juan M; González-Badillo, Juan J

    2017-01-01

    Rodríguez-Rosell, D, Mora-Custodio, R, Franco-Márquez, F, Yáñez-García, JM, González-Badillo, JJ. Traditional vs. sport-specific vertical jump tests: reliability, validity, and relationship with the legs strength and sprint performance in adult and teen soccer and basketball players. J Strength Cond Res 31(1): 196-206, 2017. The vertical jump is considered an essential motor skill in many team sports. Many protocols have been used to assess vertical jump ability. However, controversy regarding test selection still exists based on the reliability and specificity of the tests. The main aim of this study was to analyze the reliability and validity of 2 standardized (countermovement jump [CMJ] and Abalakov jump [AJ]) and 2 sport-specific (run-up with 2 [2-LEGS] or 1 leg [1-LEG] take-off jump) vertical jump tests, and their usefulness as predictors of sprint and strength performance for soccer (n = 127) and basketball (n = 59) players in 3 different categories (Under-15, Under-18, and Adults). Three attempts for each of the 4 jump tests were recorded. Twenty-meter sprint time and estimated 1 repetition maximum in full squat were also evaluated. All jump tests showed high intraclass correlation coefficients (0.969-0.995) and low coefficients of variation (1.54-4.82%), although 1-LEG was the jump test with the lowest absolute and relative reliability. All selected jump tests were significantly correlated (r = 0.580-0.983). Factor analysis resulted in the extraction of one principal component, which explained 82.90-95.79% of the variance of all jump tests. The 1-LEG test showed the lowest associations with sprint and strength performance. The results of this study suggest that CMJ and AJ are the most reliable tests for the estimation of explosive force in soccer and basketball players in different age categories.

  16. Development of Modal Test Techniques for Validation of a Solar Sail Design

    NASA Technical Reports Server (NTRS)

    Gaspar, James L.; Mann, Troy; Behun, Vaughn; Wilkie, W. Keats; Pappa, Richard

    2004-01-01

    This paper focuses on the development of modal test techniques for validation of a solar sail gossamer space structure design. The major focus is on validating and comparing the capabilities of various excitation techniques for modal testing of solar sail components. One triangular quadrant of a solar sail membrane was tested in a 1 Torr vacuum environment using various excitation techniques, including magnetic excitation and surface-bonded piezoelectric patch actuators. Results from modal tests performed on the sail using piezoelectric patches at different positions are discussed. The excitation methods were evaluated for their applicability to in-vacuum ground testing and to the development of on-orbit flight test techniques. The solar sail membrane was tested in the horizontal configuration at various tension levels to assess the variation in frequency with tension in a vacuum environment. A segment of a solar sail mast prototype was also tested in ambient atmospheric conditions using various excitation techniques, and these methods are also assessed for their ground test capabilities and on-orbit flight testing.

  17. Evidencing the association between swimming capacities and performance indicators in water polo: a multiple regression study.

    PubMed

    Kontic, Dean; Zenic, Natasa; Uljevic, Ognjen; Sekulic, Damir; Lesnik, Blaz

    2017-06-01

    Swimming capacities are hypothesized to be important determinants of water polo performance but there is an evident lack of studies examining different swimming capacities in relation to specific offensive and defensive performance variables in this sport. The aim of this study was to determine the relationship between five swimming capacities and six performance determinants in water polo. The sample comprised 79 high-level youth water polo players (all males, 17-18 years of age). The variables included six performance-related variables (agility in offence and defense, efficacy in offence and defense, polyvalence in offence and defense), and five swimming-capacity tests (water polo sprint test [15 m], swimming sprint test [25 m], short-distance [100 m], aerobic endurance [400 m] and an anaerobic lactate endurance test [4× 50 m]). First, multiple regressions were calculated for one-half of the sample of subjects which were then validated with the remaining half of the sample. The 25-m swim was not included in the regression analyses due to the multicollinearity with other predictors. The originally calculated regression models were validated for defensive agility (R=0.67 and R=0.55 for the original regression calculation and validation subsample, respectively) offensive agility (R=0.59 and R=0.61), and offensive efficacy (R=0.64 and R=0.58). Anaerobic lactate endurance is a significant predictor of offensive and defensive agility, while 15 m sprint significantly contributes to offensive efficacy. Swimming capacities are not found to be related to the polyvalence of the players. The most superior offensive performance can be expected from those players with a high level of anaerobic lactate endurance and advanced sprinting capacity, while anaerobic lactate endurance is recognized as most important quality in defensive duties. Future studies should observe players' polyvalence in relation to (theoretical) knowledge of technical and tactical tasks. 
    Results reinforce the need for cross-validation of prediction models in sport and exercise sciences.

  18. Derivation and Cross-Validation of Cutoff Scores for Patients With Schizophrenia Spectrum Disorders on WAIS-IV Digit Span-Based Performance Validity Measures.

    PubMed

    Glassmire, David M; Toofanian Ross, Parnian; Kinney, Dominique I; Nitch, Stephen R

    2016-06-01

    Two studies were conducted to identify and cross-validate cutoff scores on the Wechsler Adult Intelligence Scale-Fourth Edition Digit Span-based embedded performance validity (PV) measures for individuals with schizophrenia spectrum disorders. In Study 1, normative scores were identified on Digit Span-embedded PV measures among a sample of patients (n = 84) with schizophrenia spectrum diagnoses who had no known incentive to perform poorly and who put forth valid effort on external PV tests. Previously identified cutoff scores resulted in unacceptable false positive rates and lower cutoff scores were adopted to maintain specificity levels ≥90%. In Study 2, the revised cutoff scores were cross-validated within a sample of schizophrenia spectrum patients (n = 96) committed as incompetent to stand trial. Performance on Digit Span PV measures was significantly related to Full Scale IQ in both studies, indicating the need to consider the intellectual functioning of examinees with psychotic spectrum disorders when interpreting scores on Digit Span PV measures.
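Lowering a cutoff until specificity reaches at least 90% can be illustrated with a small search over a sample of examinees known to be performing validly. This is a sketch only: it assumes that a score at or below the cutoff counts as a failure, and the function name and scores are hypothetical rather than taken from the study.

```python
def highest_cutoff_meeting_specificity(valid_scores, min_specificity=0.90):
    """Highest observed cutoff whose false positive rate stays acceptable.

    Assumption: an examinee 'fails' the validity indicator when
    score <= cutoff, and every score in valid_scores comes from a
    credible performer, so each failure here is a false positive.
    Returns None if even the lowest observed score misses the target.
    """
    n = len(valid_scores)
    best = None
    for cutoff in sorted(set(valid_scores)):
        passed = sum(1 for s in valid_scores if s > cutoff)
        if passed / n >= min_specificity:
            best = cutoff  # still acceptable, try a higher cutoff
        else:
            break  # specificity only drops further as the cutoff rises
    return best


# Made-up scores from 10 hypothetical credible examinees.
credible_scores = [4, 5, 6, 6, 7, 7, 7, 8, 8, 9]
print(highest_cutoff_meeting_specificity(credible_scores))
```

A cutoff chosen this way fixes specificity in the derivation sample only, which is why checking it in an independent sample, as in Study 2, matters.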

  19. Optimal test selection for prediction uncertainty reduction

    DOE PAGES

    Mullins, Joshua; Mahadevan, Sankaran; Urbina, Angel

    2016-12-02

    Economic factors and experimental limitations often lead to sparse and/or imprecise data used for the calibration and validation of computational models. This paper addresses resource allocation for calibration and validation experiments, in order to maximize their effectiveness within given resource constraints. When observation data are used for model calibration, the quality of the inferred parameter descriptions is directly affected by the quality and quantity of the data. This paper characterizes parameter uncertainty within a probabilistic framework, which enables the uncertainty to be systematically reduced with additional data. The validation assessment is also uncertain in the presence of sparse and imprecise data; therefore, this paper proposes an approach for quantifying the resulting validation uncertainty. Since calibration and validation uncertainty affect the prediction of interest, the proposed framework explores the decision of cost versus importance of data in terms of the impact on the prediction uncertainty. Often, calibration and validation tests may be performed for different input scenarios, and this paper shows how the calibration and validation results from different conditions may be integrated into the prediction. Then, a constrained discrete optimization formulation that selects the number of tests of each type (calibration or validation at given input conditions) is proposed. Furthermore, the proposed test selection methodology is demonstrated on a microelectromechanical system (MEMS) example.

  20. Testing for the validity of purchasing power parity theory both in the long-run and the short-run for ASEAN-5

    NASA Astrophysics Data System (ADS)

    Choji, Niri Martha; Sek, Siok Kun

    2017-11-01

    The purchasing power parity theory says that the exchange rate between two nations ought to equal the ratio of the aggregate price levels of the two nations. For more than a decade, there has been substantial interest in empirically testing the validity of Purchasing Power Parity (PPP). This paper performs a series of tests to see if PPP is valid for ASEAN-5 nations for the period 2000-2016 using monthly data. For this purpose, we conducted four different tests of stationarity, two cointegration tests (Pedroni and Westerlund), and also the VAR model. The stationarity (unit root) tests reveal that the variables are not stationary at levels but are stationary at first difference. Cointegration test results did not reject the H0 of no cointegration, implying the absence of a long-run association among the variables, and results of the VAR model did not reveal a strong short-run relationship. Based on the data, we therefore conclude that PPP is not valid in either the long run or the short run for ASEAN-5 during 2000-2016.

  1. Uncertainty Analysis of OC5-DeepCwind Floating Semisubmersible Offshore Wind Test Campaign

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Robertson, Amy N

    This paper examines how to assess the uncertainty levels for test measurements of the Offshore Code Comparison, Continued, with Correlation (OC5)-DeepCwind floating offshore wind system, examined within the OC5 project. The goal of the OC5 project was to validate the accuracy of ultimate and fatigue load estimates from a numerical model of the floating semisubmersible using data measured during scaled tank testing of the system under wind and wave loading. The examination of uncertainty was done after the test, and it was found that the limited amount of data available did not allow for an acceptable uncertainty assessment. Therefore, this paper instead qualitatively examines the sources of uncertainty associated with this test to start a discussion of how to assess uncertainty for these types of experiments and to summarize what should be done during future testing to acquire the information needed for a proper uncertainty assessment. Foremost, future validation campaigns should initiate numerical modeling before testing to guide the test campaign, which should include a rigorous assessment of uncertainty, and perform validation during testing to ensure that the tests address all of the validation needs.

  2. Uncertainty Analysis of OC5-DeepCwind Floating Semisubmersible Offshore Wind Test Campaign: Preprint

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Robertson, Amy N

    This paper examines how to assess the uncertainty levels for test measurements of the Offshore Code Comparison, Continued, with Correlation (OC5)-DeepCwind floating offshore wind system, examined within the OC5 project. The goal of the OC5 project was to validate the accuracy of ultimate and fatigue load estimates from a numerical model of the floating semisubmersible using data measured during scaled tank testing of the system under wind and wave loading. The examination of uncertainty was done after the test, and it was found that the limited amount of data available did not allow for an acceptable uncertainty assessment. Therefore, this paper instead qualitatively examines the sources of uncertainty associated with this test to start a discussion of how to assess uncertainty for these types of experiments and to summarize what should be done during future testing to acquire the information needed for a proper uncertainty assessment. Foremost, future validation campaigns should initiate numerical modeling before testing to guide the test campaign, which should include a rigorous assessment of uncertainty, and perform validation during testing to ensure that the tests address all of the validation needs.

  3. Construct validity of the Free and Cued Selective Reminding Test in older adults with memory complaints.

    PubMed

    Clerici, Francesca; Ghiretti, Roberta; Di Pucchio, Alessandra; Pomati, Simone; Cucumo, Valentina; Marcone, Alessandra; Vanacore, Nicola; Mariani, Claudio; Cappa, Stefano Francesco

    2017-06-01

    The Free and Cued Selective Reminding Test (FCSRT) is the memory test recommended by the International Working Group on Alzheimer's disease (AD) for the detection of amnestic syndrome of the medial temporal type in prodromal AD. Assessing the construct validity and internal consistency of the Italian version of the FCSRT is thus crucial. The FCSRT was administered to 338 community-dwelling participants with memory complaints (57% females, age 74.5 ± 7.7 years), including 34 with AD, 203 with Mild Cognitive Impairment, and 101 with Subjective Memory Impairment. Internal Consistency was estimated using Cronbach's alpha coefficient. To assess convergent validity, five FCSRT scores (Immediate Free Recall, Immediate Total Recall, Delayed Free Recall, Delayed Total Recall, and Index of Sensitivity of Cueing) were correlated with three well-validated memory tests: Story Recall, Rey Auditory Verbal Learning test, and Rey Complex Figure (RCF) recall (partial correlation analysis). To assess divergent validity, a principal component analysis (an exploratory factor analysis) was performed including, in addition to the above-mentioned memory tasks, the following tests: Word Fluencies, RCF copy, Clock Drawing Test, Trail Making Test, Frontal Assessment Battery, Raven Coloured Progressive Matrices, and Stroop Colour-Word Test. Cronbach's alpha coefficients for immediate recalls (IFR and ITR) and delayed recalls (DFR and DTR) were, respectively, .84 and .81. All FCSRT scores were highly correlated with those of the three well-validated memory tests. The factor analysis showed that the FCSRT does not load on the factors saturated by non-memory tests. These findings indicate that the FCSRT has a good internal consistency and has an excellent construct validity as an episodic memory measure.

  4. Reliability and validity of the test of incremental respiratory endurance measures of inspiratory muscle performance in COPD

    PubMed Central

    Formiga, Magno F; Roach, Kathryn E; Vital, Isabel; Urdaneta, Gisel; Balestrini, Kira; Calderon-Candelario, Rafael A

    2018-01-01

    Purpose: The Test of Incremental Respiratory Endurance (TIRE) provides a comprehensive assessment of inspiratory muscle performance by measuring maximal inspiratory pressure (MIP) over time. The integration of MIP over inspiratory duration (ID) provides the sustained maximal inspiratory pressure (SMIP). Evidence on the reliability and validity of these measurements in COPD is not currently available. Therefore, we assessed the reliability, responsiveness and construct validity of the TIRE measures of inspiratory muscle performance in subjects with COPD. Patients and methods: Test–retest reliability, known-groups and convergent validity assessments were implemented simultaneously in 81 male subjects with mild to very severe COPD. TIRE measures were obtained using the portable PrO2 device, following standard guidelines. Results: All TIRE measures were found to be highly reliable, with SMIP demonstrating the strongest test–retest reliability with a nearly perfect intraclass correlation coefficient (ICC) of 0.99, while MIP and ID clustered closely together behind SMIP with ICC values of about 0.97. Our findings also demonstrated known-groups validity of all TIRE measures, with SMIP and ID yielding larger effect sizes when compared to MIP in distinguishing between subjects of different COPD status. Finally, our analyses confirmed convergent validity for both SMIP and ID, but not MIP. Conclusion: The TIRE measures of MIP, SMIP and ID have excellent test–retest reliability and demonstrated known-groups validity in subjects with COPD. SMIP and ID also demonstrated evidence of moderate convergent validity and appear to be more stable measures in this patient population than the traditional MIP. PMID:29805255
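The statement that SMIP is obtained by integrating inspiratory pressure over the inspiratory duration can be illustrated numerically. How the PrO2 device actually samples and integrates is not described in the abstract, so the trapezoidal rule, the function name, and the sample values below are all assumptions made for illustration.

```python
def pressure_time_integral(times_s, pressures):
    """Approximate the area under a pressure-time curve (trapezoidal rule).

    times_s:   sample times in seconds, strictly increasing
    pressures: inspiratory pressure at each sample time (any consistent unit)
    The result is in pressure units multiplied by seconds.
    """
    if len(times_s) != len(pressures) or len(times_s) < 2:
        raise ValueError("need two equal-length sequences with >= 2 samples")
    area = 0.0
    for i in range(1, len(times_s)):
        dt = times_s[i] - times_s[i - 1]
        if dt <= 0:
            raise ValueError("sample times must be strictly increasing")
        # Area of one trapezoid between consecutive samples.
        area += 0.5 * (pressures[i] + pressures[i - 1]) * dt
    return area


# Hypothetical single inspiration sampled once per second (made-up values).
sample_times = [0, 1, 2, 3, 4]
sample_pressures = [0, 60, 80, 50, 0]
print(pressure_time_integral(sample_times, sample_pressures))
```

In this toy trace the peak sample plays the role of MIP and the last sample time the role of ID, while the integral depends on the whole curve, which is consistent with the abstract reporting the three measures separately.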

  5. Generalizability and Validity of a Mathematics Performance Assessment.

    ERIC Educational Resources Information Center

    Lane, Suzanne; And Others

    1996-01-01

    Evidence from test results of 3,604 sixth and seventh graders is provided for the generalizability and validity of the Quantitative Understanding: Amplifying Student Achievement and Reasoning (QUASAR) Cognitive Assessment Instrument, which is designed to measure program outcomes and growth in mathematics. (SLD)

  6. When the Principal Asks: "Why are you doing Piagetian Task Testing when you have given Basal Placement Tests?"

    ERIC Educational Resources Information Center

    Harp, Bill

    1987-01-01

    Suggests ways a teacher who is interested in performing Piagetian testing might persuade a principal of the validity of the tests, and offers insight into the connection between cognitive development and reading achievement. (JC)

  7. Cognitive Testing in People at Increased Risk of Dementia Using a Smartphone App: The iVitality Proof-of-Principle Study

    PubMed Central

    Wijsman, Liselotte Willemijn; Cachucho, Ricardo; Hoevenaar-Blom, Marieke Peternella; Mooijaart, Simon Pieter; Richard, Edo

    2017-01-01

    Background: Smartphone-assisted technologies potentially provide the opportunity for large-scale, long-term, repeated monitoring of cognitive functioning at home. Objective: The aim of this proof-of-principle study was to evaluate the feasibility and validity of performing cognitive tests in people at increased risk of dementia using smartphone-based technology during a 6-month follow-up period. Methods: We used the smartphone-based app iVitality to evaluate five cognitive tests based on conventional neuropsychological tests (Memory-Word, Trail Making, Stroop, Reaction Time, and Letter-N-Back) in healthy adults. Feasibility was tested by studying adherence of all participants to perform smartphone-based cognitive tests. Validity was studied by assessing the correlation between conventional neuropsychological tests and smartphone-based cognitive tests and by studying the effect of repeated testing. Results: We included 151 participants (mean age in years=57.3, standard deviation=5.3). Mean adherence to assigned smartphone tests during 6 months was 60% (SD 24.7). There was moderate correlation between the first smartphone-based test attempt and the conventional test for the Stroop test and the Trail Making test with Spearman ρ=.3-.5 (P<.001). Correlation increased for both tests when comparing the conventional test with the mean score of all attempts a participant had made, with the highest correlation for Stroop panel 3 (ρ=.62, P<.001). Performance on the Stroop and the Trail Making tests improved over time, suggesting a learning effect, but the scores on the Letter-N-back, the Memory-Word, and the Reaction Time tests remained stable. Conclusions: Repeated smartphone-assisted cognitive testing is feasible with reasonable adherence and moderate relative validity for the Stroop and the Trail Making tests compared with conventional neuropsychological tests. Smartphone-based cognitive testing seems promising for large-scale data-collection in population studies. PMID:28546139

  8. Multi-Evaporator Miniature Loop Heat Pipe for Small Spacecraft Thermal Control. Part 1; New Technologies and Validation Approach

    NASA Technical Reports Server (NTRS)

    Ku, Jentung; Ottenstein, Laura; Douglas, Donya; Hoang, Triem

    2010-01-01

    Under NASA's New Millennium Program Space Technology 8 (ST 8) Project, four experiments (Thermal Loop, Dependable Microprocessor, SAILMAST, and UltraFlex) were conducted to advance the maturity of individual technologies from proof of concept to prototype demonstration in a relevant environment, i.e. from a technology readiness level (TRL) of 3 to a level of 6. This paper presents the new technologies and validation approach of the Thermal Loop experiment. The Thermal Loop is an advanced thermal control system consisting of a miniature loop heat pipe (MLHP) with multiple evaporators and multiple condensers designed for future small system applications requiring low mass, low power, and compactness. The MLHP retains all features of state-of-the-art loop heat pipes (LHPs) and offers additional advantages to enhance the functionality, performance, versatility, and reliability of the system. Details of the thermal loop concept, technical advances, benefits, objectives, level 1 requirements, and performance characteristics are described. Also included in the paper are descriptions of the test articles and mathematical modeling used for the technology validation. An MLHP breadboard was built and tested in the laboratory and thermal vacuum environments for TRL 4 and TRL 5 validations, and an MLHP proto-flight unit was built and tested in a thermal vacuum chamber for the TRL 6 validation. In addition, an analytical model was developed to simulate the steady state and transient behaviors of the MLHP during various validation tests. Capabilities and limitations of the analytical model are also addressed.

  9. Evolving the Principles and Practice of Validation for New Alternative Approaches to Toxicity Testing.

    PubMed

    Whelan, Maurice; Eskes, Chantra

    Validation is essential for the translation of newly developed alternative approaches to animal testing into tools and solutions suitable for regulatory applications. Formal approaches to validation have emerged over the past 20 years or so and although they have helped greatly to progress the field, it is essential that the principles and practice underpinning validation continue to evolve to keep pace with scientific progress. The modular approach to validation should be exploited to encourage more innovation and flexibility in study design and to increase efficiency in filling data gaps. With the focus now on integrated approaches to testing and assessment that are based on toxicological knowledge captured as adverse outcome pathways, and which incorporate the latest in vitro and computational methods, validation needs to adapt to ensure it adds value rather than hinders progress. Validation needs to be pursued both at the method level, to characterise the performance of in vitro methods in relation to their ability to detect any association of a chemical with a particular pathway or key toxicological event, and at the methodological level, to assess how integrated approaches can predict toxicological endpoints relevant for regulatory decision making. To facilitate this, more emphasis needs to be given to the development of performance standards that can be applied to classes of methods and integrated approaches that provide similar information. Moreover, the challenge of selecting the right reference chemicals to support validation needs to be addressed more systematically, consistently and in a manner that better reflects the state of the science. Above all, however, validation requires true partnership between the development and user communities of alternative methods and the appropriate investment of resources.

  10. Does the Finger-to-Nose Test measure upper limb coordination in chronic stroke?

    PubMed

    Rodrigues, Marcos R M; Slimovitch, Matthew; Chilingaryan, Gevorg; Levin, Mindy F

    2017-01-23

    We aimed to kinematically validate that the time to perform the Finger-to-Nose Test (FNT) assesses coordination by determining its construct, convergent and discriminant validity. Experimental, criterion standard study. Both clinical and experimental evaluations were done at a research facility in a rehabilitation hospital. Forty individuals (20 individuals with chronic stroke and 20 healthy, age- and gender-matched individuals) participated. Both groups performed two blocks of 10 to-and-fro pointing movements (non-dominant/affected arm) between a sagittal target and the nose (ReachIn, ReachOut) at a self-paced speed. Time to perform the test was the main outcome. Kinematics (Optotrak, 100 Hz) and clinical impairment/activity levels were evaluated. Spatiotemporal coordination was assessed with slope (IJC) and cross-correlation (LAG) between elbow and shoulder movements. Compared to controls, individuals with stroke (Fugl-Meyer Assessment, FMA-UE: 51.9 ± 13.2; Box & Blocks, BBT: 72.1 ± 26.9%) made more curved endpoint trajectories using less shoulder horizontal-abduction. For construct validity, shoulder range (β = 0.127), LAG (β = 0.855) and IJC (β = -0.191) explained 82% of FNT-time variance for ReachIn and LAG (β = 0.971) explained 94% for ReachOut in patients with stroke. In contrast, only LAG explained 62% (β = 0.790) and 79% (β = 0.889) of variance for ReachIn and ReachOut respectively in controls. For convergent validity, FNT-time correlated with FMA-UE (r = -0.67, p < 0.01), FMA-Arm (r = -0.60, p = 0.005), biceps spasticity (r = 0.39, p < 0.05) and BBT (r = -0.56, p < 0.01). A cut-off time of 10.6 s discriminated between mild and moderate-to-severe impairment (discriminant validity). Each additional second represented a 42% increase in the odds of greater impairment. For this version of the FNT, the time to perform the test showed construct, convergent and discriminant validity to measure UL coordination in stroke.
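
    The per-second odds figure in this abstract can be made concrete with a little arithmetic. The sketch below assumes (the abstract does not say) that the 42% figure is an odds ratio of 1.42 from a logistic-type model, in which case the effect of several extra seconds is multiplicative rather than additive. The function names are illustrative only and are not taken from the study.

```python
import math

# Assumption: "42% odds increase" per second is read as an odds ratio of 1.42.
ODDS_RATIO_PER_SECOND = 1.42


def odds_multiplier(extra_seconds, odds_ratio=ODDS_RATIO_PER_SECOND):
    """Factor by which the odds of greater impairment change after
    `extra_seconds` additional seconds of test time."""
    return odds_ratio ** extra_seconds


def log_odds_coefficient(odds_ratio=ODDS_RATIO_PER_SECOND):
    """Regression coefficient (log-odds per second) implied by the odds ratio."""
    return math.log(odds_ratio)


if __name__ == "__main__":
    for seconds in (1, 2, 3):
        print(seconds, "s ->", round(odds_multiplier(seconds), 4), "x odds")
```

    Under this reading, two additional seconds roughly double the odds (1.42 squared), which is larger than the 84% that simple addition would suggest.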

  11. Preemployment physical evaluation.

    PubMed

    Jackson, A S

    1994-01-01

    There is a growing trend toward using preemployment tests to select employees for physically demanding jobs. Women are, in increasing numbers, entering physically demanding occupations that were traditionally dominated by men. Under current Federal employment law, it is illegal to disqualify an employee for a job because of race, color, religion, sex, national origin, and with the recent passage of the Americans with Disabilities Act (ADA), handicap. Because of gender differences in strength, body composition, and VO2max, preemployment tests for physically demanding jobs tend to screen out more females than males. Employers are using preemployment tests not only to enhance worker productivity, but also to minimize the threat of litigation for discriminatory hiring practices and to reduce the risk of musculoskeletal injuries. The primary ergonomic methods used in industry to reduce the risk of back injuries are preemployment testing and job redesign. When a test results in adverse impact, the validity of the test must be established. Validity in this context means that the test represents or predicts the applicant's capacity to perform the job. Criterion-related, content, and construct validation studies are the means used to establish validity. The validity of preemployment hiring practices for physically demanding jobs has been decided in the courts. The most common reason for ruling an employment practice invalid is the failure to show that the test measured important job behaviors. Much of this litigation has involved height and weight requirements for public safety jobs. The courts have generally ruled that using height and weight standards as a criterion for employment is illegal because they were not job related. If fitness tests comprise part or all of the preemployment test, it is essential to demonstrate that the fitness component is related to job performance. Although there are many factors to consider when establishing a cut score, there is a growing trend toward establishing the cut score on the basis of the job's physical demands, defined by VO2max and strength. This literature is limited because most validation studies are not published. They more typically take the form of a technical report to the governmental agency or company that funded the project. There are published preemployment validation studies for outdoor telephone craft jobs involving pole-climbing tasks; firefighters; highway patrol officers; steel workers; underground coal miners; chemical plant workers; electrical transmission lineworkers; and various military jobs.

  12. Aerobic fitness testing in 6- to 9-year-old children: reliability and validity of a modified Yo-Yo IR1 test and the Andersen test.

    PubMed

    Ahler, T; Bendiksen, M; Krustrup, P; Wedderkopp, N

    2012-03-01

    This study analysed the reliability and validity of two intermittent running tests (the Yo-Yo IR1 test and the Andersen test) as tools for estimating VO2max in children under the age of 10. Two groups, aged 6-7 years (grade 0, n = 18) and 8-9 years (grade 2, n = 16), carried out two repetitions of a modified Yo-Yo IR1 test (2 × 16 m) and the Andersen test, as well as an incremental treadmill test, to directly determine the VO2max. No significant differences were observed in test-retest performance of the Yo-Yo IR1 test [693 ± 418 (±SD) and 670 ± 328 m, r² = 0.79, CV = 19%, p > 0.05, n = 32] and the Andersen test (988 ± 77 and 989 ± 87 m, r² = 0.86, CV = 3%, p > 0.05, n = 31). The Yo-Yo IR1 (r² = 0.47, n = 31, p < 0.002) and Andersen test performance (r² = 0.53, n = 32, p < 0.001) correlated with the VO2max. Yo-Yo IR1 performance correlated with Andersen test performance (r² = 0.74, n = 32, p < 0.0001). In conclusion, the Yo-Yo IR1 and the Andersen tests are reproducible and can be used as an indicator of aerobic fitness for 6- to 9-year-old children.

  13. 32 CFR 634.35 - Chemical testing policies and procedures.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 32 National Defense 4 2010-07-01 2010-07-01 true Chemical testing policies and procedures. 634.35... Chemical testing policies and procedures. (a) Validity of chemical testing. Results of chemical testing are... instruction manual. (iv) Perform preventive maintenance as required by the instruction manual. (c) Chemical...

  14. 32 CFR 634.35 - Chemical testing policies and procedures.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... 32 National Defense 4 2011-07-01 2011-07-01 false Chemical testing policies and procedures. 634.35... Chemical testing policies and procedures. (a) Validity of chemical testing. Results of chemical testing are... instruction manual. (iv) Perform preventive maintenance as required by the instruction manual. (c) Chemical...

  15. 40 CFR 1037.501 - General testing and modeling provisions.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... 1065 to perform valid tests. (1) For service accumulation, use the test fuel or any commercially... appropriate diesel test fuel is ultra low-sulfur diesel fuel. (3) For gasoline-fueled vehicles, use the...) AIR POLLUTION CONTROLS CONTROL OF EMISSIONS FROM NEW HEAVY-DUTY MOTOR VEHICLES Test and Modeling...

  16. 32 CFR 634.35 - Chemical testing policies and procedures.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... 32 National Defense 4 2012-07-01 2011-07-01 true Chemical testing policies and procedures. 634.35... Chemical testing policies and procedures. (a) Validity of chemical testing. Results of chemical testing are... instruction manual. (iv) Perform preventive maintenance as required by the instruction manual. (c) Chemical...

  17. 32 CFR 634.35 - Chemical testing policies and procedures.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... 32 National Defense 4 2014-07-01 2013-07-01 true Chemical testing policies and procedures. 634.35... Chemical testing policies and procedures. (a) Validity of chemical testing. Results of chemical testing are... instruction manual. (iv) Perform preventive maintenance as required by the instruction manual. (c) Chemical...

  18. 32 CFR 634.35 - Chemical testing policies and procedures.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... 32 National Defense 4 2013-07-01 2013-07-01 false Chemical testing policies and procedures. 634.35... Chemical testing policies and procedures. (a) Validity of chemical testing. Results of chemical testing are... instruction manual. (iv) Perform preventive maintenance as required by the instruction manual. (c) Chemical...

  19. Evaluating the statistical performance of less applied algorithms in classification of worldview-3 imagery data in an urbanized landscape

    NASA Astrophysics Data System (ADS)

    Ranaie, Mehrdad; Soffianian, Alireza; Pourmanafi, Saeid; Mirghaffari, Noorollah; Tarkesh, Mostafa

    2018-03-01

    In the recent decade, analysis of remotely sensed imagery has become one of the most common and widely used procedures in environmental studies, and supervised image classification techniques play a central role in it. Hence, using a high-resolution Worldview-3 image of a mixed urbanized landscape in Iran, three less applied image classification methods (Bagged CART, stochastic gradient boosting model, and neural network with feature extraction) were tested and compared with two prevalent methods: random forest and support vector machine with linear kernel. To do so, each method was run ten times, and three validation techniques were used to estimate the accuracy statistics: cross validation, independent validation, and validation with the total of the training data. Moreover, the statistical significance of differences between the classification methods was assessed using ANOVA and the Tukey test. In general, the results showed that random forest, by a marginal difference over Bagged CART and the stochastic gradient boosting model, was the best performing method, whilst based on independent validation there was no significant difference between the performances of the classification methods. Finally, it should be noted that the neural network with feature extraction and the linear support vector machine had better processing speed than the other methods.

  20. PIV Measurements of the CEV Hot Abort Motor Plume for CFD Validation

    NASA Technical Reports Server (NTRS)

    Wernet, Mark; Wolter, John D.; Locke, Randy; Wroblewski, Adam; Childs, Robert; Nelson, Andrea

    2010-01-01

    NASA's next manned launch platform for missions to the moon and Mars are the Orion and Ares systems. Many critical aspects of the launch system performance are being verified using computational fluid dynamics (CFD) predictions. The Orion Launch Abort Vehicle (LAV) consists of a tower mounted tractor rocket tasked with carrying the Crew Module (CM) safely away from the launch vehicle in the event of a catastrophic failure during the vehicle's ascent. Some of the predictions involving the launch abort system flow fields produced conflicting results, which required further investigation through ground test experiments. Ground tests were performed to acquire data from a hot supersonic jet in cross-flow for the purpose of validating CFD turbulence modeling relevant to the Orion Launch Abort Vehicle (LAV). Both 2-component axial plane Particle Image Velocimetry (PIV) and 3-component cross-stream Stereo Particle Image Velocimetry (SPIV) measurements were obtained on a model of an Abort Motor (AM). Actual flight conditions could not be simulated on the ground, so the highest temperature and pressure conditions that could be safely used in the test facility (nozzle pressure ratio 28.5 and a nozzle temperature ratio of 3) were used for the validation tests. These conditions are significantly different from those of the flight vehicle, but were sufficiently high to begin addressing turbulence modeling issues that predicated the need for the validation tests.

  1. Assessing working memory in children with ADHD: Minor administration and scoring changes may improve digit span backward's construct validity.

    PubMed

    Wells, Erica L; Kofler, Michael J; Soto, Elia F; Schaefer, Hillary S; Sarver, Dustin E

    2018-01-01

    Pediatric ADHD is associated with impairments in working memory, but these deficits often go undetected when using clinic-based tests such as digit span backward. The current study pilot-tested minor administration/scoring modifications to improve digit span backward's construct and predictive validities in a well-characterized sample of children with ADHD. WISC-IV digit span was modified to administer all trials (i.e., ignore discontinue rule) and count digits rather than trials correct. Traditional and modified scores were compared to a battery of criterion working memory (construct validity) and academic achievement tests (predictive validity) for 34 children with ADHD ages 8-13 (M=10.41; 11 girls). Traditional digit span backward scores failed to predict working memory or KTEA-2 achievement (all ns). Alternate administration/scoring of digit span backward significantly improved its associations with working memory reordering (r=.58), working memory dual-processing (r=.53), working memory updating (r=.28), and KTEA-2 achievement (r=.49). Consistent with prior work, these findings urge caution when interpreting digit span performance. Minor test modifications may address test validity concerns, and should be considered in future test revisions. Digit span backward becomes a valid measure of working memory at exactly the point that testing is traditionally discontinued. Copyright © 2017 Elsevier Ltd. All rights reserved.

  2. Comprehensive validation scheme for in situ fiber optics dissolution method for pharmaceutical drug product testing.

    PubMed

    Mirza, Tahseen; Liu, Qian Julie; Vivilecchia, Richard; Joshi, Yatindra

    2009-03-01

    There has been a growing interest during the past decade in the use of fiber optics dissolution testing. Use of this novel technology is mainly confined to research and development laboratories. It has not yet emerged as a tool for end product release testing despite its ability to generate in situ results and efficiency improvement. One potential reason may be the lack of clear validation guidelines that can be applied for the assessment of suitability of fiber optics. This article describes a comprehensive validation scheme and development of a reliable, robust, reproducible and cost-effective dissolution test using fiber optics technology. The test was successfully applied for characterizing the dissolution behavior of a 40-mg immediate-release tablet dosage form that is under development at Novartis Pharmaceuticals, East Hanover, New Jersey. The method was validated for the following parameters: linearity, precision, accuracy, specificity, and robustness. In particular, robustness was evaluated in terms of probe sampling depth and probe orientation. The in situ fiber optic method was found to be comparable to the existing manual sampling dissolution method. Finally, the fiber optic dissolution test was successfully performed by different operators on different days, to further enhance the validity of the method. The results demonstrate that the fiber optics technology can be successfully validated for end product dissolution/release testing. (c) 2008 Wiley-Liss, Inc. and the American Pharmacists Association

  3. A Cross-Validation of easyCBM Mathematics Cut Scores in Washington State: 2009-2010 Test. Technical Report #1105

    ERIC Educational Resources Information Center

    Anderson, Daniel; Alonzo, Julie; Tindal, Gerald

    2011-01-01

    In this technical report, we document the results of a cross-validation study designed to identify optimal cut-scores for the use of the easyCBM[R] mathematics test in the state of Washington. A large sample, randomly split into two groups of roughly equal size, was used for this study. Students' performance classification on the Washington state…

  4. A Cross-Validation of easyCBM[R] Mathematics Cut Scores in Oregon: 2009-2010. Technical Report #1104

    ERIC Educational Resources Information Center

    Anderson, Daniel; Alonzo, Julie; Tindal, Gerald

    2011-01-01

    In this technical report, we document the results of a cross-validation study designed to identify optimal cut-scores for the use of the easyCBM[R] mathematics test in Oregon. A large sample, randomly split into two groups of roughly equal size, was used for this study. Students' performance classification on the Oregon state test was used as the…

  5. Sensitivity and validity of psychometric tests for assessing driving impairment: effects of sleep deprivation.

    PubMed

    Jongen, Stefan; Perrier, Joy; Vuurman, Eric F; Ramaekers, Johannes G; Vermeeren, Annemiek

    2015-01-01

    To assess drug-induced driving impairment, initial screening is needed. However, no consensus has been reached about which initial screening tools have to be used. The present study aims to determine the ability of a battery of psychometric tests to detect performance impairing effects of clinically relevant levels of drowsiness as induced by one night of sleep deprivation. Twenty-four healthy volunteers participated in a 2-period crossover study in which the highway driving test was conducted twice: once after normal sleep and once after one night of sleep deprivation. The psychometric tests were conducted on 4 occasions: once after normal sleep (at 11 am) and three times during a single night of sleep deprivation (at 1 am, 5 am, and 11 am). On-the-road driving performance was significantly impaired after sleep deprivation, as measured by an increase in Standard Deviation of Lateral Position (SDLP) of 3.1 cm compared to performance after a normal night of sleep. At 5 am, performance in most psychometric tests showed significant impairment. As expected, the largest effect sizes were found on performance in the Psychomotor Vigilance Test (PVT). Large effect sizes were also found in the Divided Attention Test (DAT), the Attention Network Test (ANT), and the test for Useful Field of View (UFOV) at 5 and 11 am during sleep deprivation. Effects of sleep deprivation on SDLP correlated significantly with performance changes in the PVT and the DAT, but not with performance changes in the UFOV. From the psychometric tests used in this study, the PVT and DAT seem most promising for initial evaluation of drug impairment based on sensitivity and correlations with driving impairment. Further studies are needed to assess the sensitivity and validity of these psychometric tests after benchmark sedative drug use.

  6. Construct validity of tests that measure kick performance for young soccer players based on cluster analysis: exploring the relationship between coaches rating and actual measures.

    PubMed

    Palucci Vieira, Luiz H; de Andrade, Vitor L; Aquino, Rodrigo L; Moraes, Renato; Barbieri, Fabio A; Cunha, Sérgio A; Bedo, Bruno L; Santiago, Paulo R

    2017-12-01

    The main aim of this study was to verify the relationship between the classification of coaches and actual performance in field tests that measure the kicking performance in young soccer players, using the K-means clustering technique. Twenty-three U-14 players performed 8 tests to measure their kicking performance. Four experienced coaches provided a rating for each player as follows: 1: poor; 2: below average; 3: average; 4: very good; 5: excellent as related to three parameters (i.e. accuracy, power and ability to put spin on the ball). The score intervals established from the k-means clustering were useful for forming five groups of performance level, since ANOVA revealed significant differences between the clusters generated (P<0.01). Accuracy seems to be moderately predicted by the penalty kick, free kick, kicking the ball rolling and Wall Volley Test (0.44≤r≤0.56), while the ability to put spin on the ball can be measured by the free kick and the corner kick tests (0.52≤r≤0.61). Body measurements, age and PHV did not systematically influence the performance. The Wall Volley Test seems to be a good predictor of other tests. Five tests showed reasonable construct validity and can be used to predict the accuracy (penalty kick, free kick, kicking a rolling ball and Wall Volley Test) and ability to put spin on the ball (free kick and corner kick tests) when kicking in soccer. In contrast, the goal kick, kicking the ball when airborne and the vertical kick tests exhibited low power of discrimination and using them should be viewed with caution.
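
    The grouping step described in this abstract can be illustrated with a minimal one-dimensional k-means (Lloyd's algorithm). This is only a sketch under stated assumptions: the ratings below are hypothetical, the function name `kmeans_1d` and the deterministic starting centroids are choices made for the example, and the abstract does not specify the software or initialisation the authors used.

```python
def kmeans_1d(values, k, max_iter=100):
    """Lloyd's algorithm on scalar values; returns (centroids, labels)."""
    distinct = sorted(set(values))
    if not 1 <= k <= len(distinct):
        raise ValueError("k must be between 1 and the number of distinct values")
    # Deterministic start (an assumption for reproducibility):
    # k evenly spaced picks from the sorted distinct values.
    step = (len(distinct) - 1) / max(k - 1, 1)
    centroids = [float(distinct[round(i * step)]) for i in range(k)]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each value joins its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centroids.append(sum(members) / len(members) if members else centroids[c])
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return centroids, labels


# Hypothetical mean coach ratings on the 1-5 scale (not data from the study).
ratings = [1.0, 1.2, 2.9, 3.1, 5.0, 4.8]
centroids, labels = kmeans_1d(ratings, k=3)
```

    With real data, the resulting cluster boundaries would give the score intervals for each performance level, and a follow-up ANOVA across clusters would check that the groups actually differ.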

  7. Generic Helicopter-Based Testbed for Surface Terrain Imaging Sensors

    NASA Technical Reports Server (NTRS)

    Alexander, James; Goldberg, Hannah; Montgomery, James; Spiers, Gary; Liebe, Carl; Johnson, Andrew; Gromov, Konstantin; Konefat, Edward; Lam, Raymond; Meras, Patrick

    2008-01-01

    To be certain that a candidate sensor system will perform as expected during missions, we have developed a field test system and have executed test flights with a helicopter-mounted sensor platform over desert terrains, which simulate Lunar features. A key advantage to this approach is that different sensors can be tested and characterized in an environment relevant to the flight needs prior to flight. Testing the various sensors required the development of a field test system, including an instrument to validate the truth of the sensor system under test. The field test system was designed to be flexible enough to cover the test needs of many sensors (lidar, radar, cameras) that require an aerial test platform, including helicopters, airplanes, unmanned aerial vehicles (UAV), or balloons. To validate the performance of the sensor under test, the dynamics of the test platform must be known with sufficient accuracy to provide accurate models for input into algorithm development. The test system provides support equipment to measure the dynamics of the field test sensor platform, and allows computation of the truth position, velocity, attitude, and time.

  8. Illegal performance enhancing drugs and doping in sport: a picture-based brief implicit association test for measuring athletes' attitudes.

    PubMed

    Brand, Ralf; Heck, Philipp; Ziegler, Matthias

    2014-01-30

    Doping attitude is a key variable in predicting athletes' intention to use forbidden performance enhancing drugs. Indirect reaction-time based attitude tests, such as the implicit association test, conceal the ultimate goal of measurement from the participant better than questionnaires. Indirect tests are especially useful when socially sensitive constructs such as attitudes towards doping need to be described. The present study serves the development and validation of a novel picture-based brief implicit association test (BIAT) for testing athletes' attitudes towards doping in sport. It shall provide the basis for a transnationally compatible research instrument able to harmonize anti-doping research efforts. Following a known-group differences validation strategy, the doping attitudes of 43 athletes from bodybuilding (representative for a highly doping prone sport) and handball (as a contrast group) were compared using the picture-based doping-BIAT. The Performance Enhancement Attitude Scale (PEAS) was employed as a corresponding direct measure in order to additionally validate the results. As expected, in the group of bodybuilders, indirectly measured doping attitudes as tested with the picture-based doping-BIAT were significantly less negative (η2 = .11). The doping-BIAT and PEAS scores correlated significantly at r = .50 for bodybuilders, and not significantly at r = .36 for handball players. There was a low error rate (7%) and a satisfactory internal consistency (rtt = .66) for the picture-based doping-BIAT. The picture-based doping-BIAT constitutes a psychometrically tested method, ready to be adopted by the international research community. The test can be administered via the internet. All test material is available "open source". The test might be implemented, for example, as a new effect-measure in the evaluation of prevention programs.

  9. Illegal performance enhancing drugs and doping in sport: a picture-based brief implicit association test for measuring athletes’ attitudes

    PubMed Central

    2014-01-01

    Background: Doping attitude is a key variable in predicting athletes’ intention to use forbidden performance enhancing drugs. Indirect reaction-time based attitude tests, such as the implicit association test, conceal the ultimate goal of measurement from the participant better than questionnaires. Indirect tests are especially useful when socially sensitive constructs such as attitudes towards doping need to be described. The present study serves the development and validation of a novel picture-based brief implicit association test (BIAT) for testing athletes’ attitudes towards doping in sport. It shall provide the basis for a transnationally compatible research instrument able to harmonize anti-doping research efforts. Method: Following a known-group differences validation strategy, the doping attitudes of 43 athletes from bodybuilding (representative for a highly doping prone sport) and handball (as a contrast group) were compared using the picture-based doping-BIAT. The Performance Enhancement Attitude Scale (PEAS) was employed as a corresponding direct measure in order to additionally validate the results. Results: As expected, in the group of bodybuilders, indirectly measured doping attitudes as tested with the picture-based doping-BIAT were significantly less negative (η2 = .11). The doping-BIAT and PEAS scores correlated significantly at r = .50 for bodybuilders, and not significantly at r = .36 for handball players. There was a low error rate (7%) and a satisfactory internal consistency (rtt = .66) for the picture-based doping-BIAT. Conclusions: The picture-based doping-BIAT constitutes a psychometrically tested method, ready to be adopted by the international research community. The test can be administered via the internet. All test material is available “open source”. The test might be implemented, for example, as a new effect-measure in the evaluation of prevention programs. PMID:24479865

  10. Validity and reliability of an online visual-spatial working memory task for self-reliant administration in school-aged children.

    PubMed

    Van de Weijer-Bergsma, Eva; Kroesbergen, Evelyn H; Prast, Emilie J; Van Luit, Johannes E H

    2015-09-01

    Working memory is an important predictor of academic performance, and of math performance in particular. Most working memory tasks depend on one-to-one administration by a testing assistant, which makes the use of such tasks in large-scale studies time-consuming and costly. Therefore, an online, self-reliant visual-spatial working memory task (the Lion game) was developed for primary school children (6-12 years of age). In two studies, the validity and reliability of the Lion game were investigated. The results from Study 1 (n = 442) indicated satisfactory six-week test-retest reliability, excellent internal consistency, and good concurrent and predictive validity. The results from Study 2 (n = 5,059) confirmed the results on the internal consistency and predictive validity of the Lion game. In addition, multilevel analysis revealed that classroom membership influenced Lion game scores. We concluded that the Lion game is a valid and reliable instrument for the online computerized and self-reliant measurement of visual-spatial working memory (i.e., updating).

  11. Design, development, testing and validation of a Photonics Virtual Laboratory for the study of LEDs

    NASA Astrophysics Data System (ADS)

    Naranjo, Francisco L.; Martínez, Guadalupe; Pérez, Ángel L.; Pardo, Pedro J.

    2014-07-01

    This work presents the design, development, testing and validation of a Photonics Virtual Laboratory, highlighting the study of LEDs. The study was conducted from a conceptual, experimental and didactic standpoint, using e-learning and m-learning platforms. Specifically, teaching tools that help ensure that our students achieve meaningful learning have been developed. The scientific aspect, such as the study of LEDs, has been brought together with techniques for the generation and transfer of knowledge through the selection, hierarchization and structuring of information using concept maps. To validate the didactic materials developed, procedures with various assessment tools were used for the collection and processing of data, applied in the context of an experimental design. Additionally, a statistical analysis was performed to determine the validity of the materials developed. The assessment has been designed to validate the contributions of the new materials developed over the traditional method of teaching, and to quantify the learning achieved by students, in order to draw conclusions that serve as a reference for its application in the teaching and learning processes, and comprehensively validate the work carried out.

  12. Is there inter-procedural transfer of skills in intraocular surgery? A randomized controlled trial.

    PubMed

    Thomsen, Ann Sofia Skou; Kiilgaard, Jens Folke; la Cour, Morten; Brydges, Ryan; Konge, Lars

    2017-12-01

    To investigate how experience in simulated cataract surgery impacts and transfers to the learning curves for novices in vitreoretinal surgery. Twelve ophthalmology residents without previous experience in intraocular surgery were randomized to (1) intensive training in cataract surgery on a virtual-reality simulator until passing a test with predefined validity evidence (cataract trainees) or to (2) no cataract surgery training (novices). Possible skill transfer was assessed using a test consisting of all 11 vitreoretinal modules on the EyeSi virtual-reality simulator. All participants repeated the test of vitreoretinal surgical skills until their performance curve plateaued. Three experienced vitreoretinal surgeons also performed the test to establish validity evidence. Analysis with independent samples t-tests was performed. The vitreoretinal test on the EyeSi simulator demonstrated evidence of validity, given statistically significant differences in mean test scores for the first repetition; experienced surgeons scored higher than novices (p = 0.023) and cataract trainees (p = 0.003). Internal consistency for the 11 modules of the test was acceptable (Cronbach's α = 0.73). Our findings did not indicate a transfer effect with no significant differences found between cataract trainees and novices in their starting scores (mean ± SD 381 ± 129 points versus 455 ± 82 points, p = 0.262), time to reach maximum performance level (10.7 ± 3.0 hr versus 8.7 ± 2.8 hr, p = 0.265), or maximum scores (785 ± 162 points versus 805 ± 73 points, p = 0.791). Pretraining in cataract surgery did not demonstrate any measurable effect on vitreoretinal procedural performance. The results of this study indicate that we should not anticipate extensive transfer of surgical skills when planning training programmes in intraocular surgery. © 2017 Acta Ophthalmologica Scandinavica Foundation. Published by John Wiley & Sons Ltd.

  13. A contemporary approach to validity arguments: a practical guide to Kane's framework.

    PubMed

    Cook, David A; Brydges, Ryan; Ginsburg, Shiphra; Hatala, Rose

    2015-06-01

    Assessment is central to medical education and the validation of assessments is vital to their use. Earlier validity frameworks suffer from a multiplicity of types of validity or failure to prioritise among sources of validity evidence. Kane's framework addresses both concerns by emphasising key inferences as the assessment progresses from a single observation to a final decision. Evidence evaluating these inferences is planned and presented as a validity argument. We aim to offer a practical introduction to the key concepts of Kane's framework that educators will find accessible and applicable to a wide range of assessment tools and activities. All assessments are ultimately intended to facilitate a defensible decision about the person being assessed. Validation is the process of collecting and interpreting evidence to support that decision. Rigorous validation involves articulating the claims and assumptions associated with the proposed decision (the interpretation/use argument), empirically testing these assumptions, and organising evidence into a coherent validity argument. Kane identifies four inferences in the validity argument: Scoring (translating an observation into one or more scores); Generalisation (using the score[s] as a reflection of performance in a test setting); Extrapolation (using the score[s] as a reflection of real-world performance), and Implications (applying the score[s] to inform a decision or action). Evidence should be collected to support each of these inferences and should focus on the most questionable assumptions in the chain of inference. Key assumptions (and needed evidence) vary depending on the assessment's intended use or associated decision. Kane's framework applies to quantitative and qualitative assessments, and to individual tests and programmes of assessment. Validation focuses on evaluating the key claims, assumptions and inferences that link assessment scores with their intended interpretations and uses. The Implications and associated decisions are the most important inferences in the validity argument. © 2015 John Wiley & Sons Ltd.

  14. Portuguese Version of the Pain Beliefs and Perceptions Inventory: A Multicenter Validation Study.

    PubMed

    Azevedo, Luís Filipe; Sampaio, Rute; Camila Dias, Cláudia; Romão, José; Lemos, Laurinda; Agualusa, Luís; Vaz-Serra, Sílvia; Patto, Teresa; Costa-Pereira, Altamiro; Castro-Lopes, José Manuel

    2017-07-01

    We aimed to perform the translation, cultural adaptation, and validation of the Pain Beliefs and Perceptions Inventory (PBPI) for the European Portuguese language and chronic pain population. This is a longitudinal multicenter validation study. A Portuguese version of the PBPI (PBPI-P) was created through a process of translation, back translation, and expert panel evaluation. The PBPI-P was administered to a total of 122 patients from 13 chronic pain clinics in Portugal, at baseline and after 7 days. Internal consistency and test-retest reliability were assessed by Cronbach's alpha (α) and intraclass correlation coefficient (ICC). Construct (convergent and discriminant) validity was assessed based on a set of previously developed theoretical hypotheses about interrelations between the PBPI-P and other measures. Exploratory and confirmatory factor analyses were performed to test the theoretical structure of the PBPI-P. The internal consistency and test-retest reliability coefficients for each respective subscale were α = 0.620 and ICC = 0.801 for mystery; α = 0.744 and ICC = 0.841 for permanence; α = 0.778 and ICC = 0.791 for constancy; and α = 0.764 and ICC = 0.881 for self-blame. Exploratory and confirmatory factor analysis revealed a four-factor structure (permanence, constancy, self-blame, and mystery) that explained 63% of the variance. The construct validity of the PBPI-P was shown to be adequate, with more than 90% of the previously defined hypotheses regarding interrelations with other measures confirmed. The PBPI-P has been shown to be adequate and to have excellent reliability, internal consistency, and validity. It may contribute to a better pain assessment and is suitable for research and clinical use. © 2016 World Institute of Pain.
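
    The internal-consistency statistic reported above, Cronbach's alpha, can be illustrated with a minimal sketch. This is not the authors' analysis code: the function name and the response data are hypothetical, and the sketch assumes complete responses (no missing items) and sample variances.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    k = len(scores[0])
    items = list(zip(*scores))  # transpose rows (respondents) into item columns
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_var_sum / total_var)


if __name__ == "__main__":
    # Hypothetical 4-item subscale answered by five respondents.
    responses = [
        [2, 3, 3, 2],
        [4, 4, 5, 4],
        [1, 2, 1, 2],
        [3, 3, 4, 3],
        [5, 4, 5, 5],
    ]
    print(round(cronbach_alpha(responses), 3))
```

    Values closer to 1 indicate that the items of a subscale vary together; the test-retest ICC reported in the abstract is a separate statistic and is not covered by this sketch.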

  15. 1:50 Scale Testing of Three Floating Wind Turbines at MARIN and Numerical Model Validation Against Test Data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dagher, Habib; Viselli, Anthony; Goupee, Andrew

    The primary goal of the basin model test program discussed herein is to properly scale and accurately capture physical data of the rigid body motions, accelerations and loads for different floating wind turbine platform technologies. The intended use for this data is for performing comparisons with predictions from various aero-hydro-servo-elastic floating wind turbine simulators for calibration and validation. Of particular interest is validating the floating offshore wind turbine simulation capabilities of NREL’s FAST open-source simulation tool. Once the validation process is complete, coupled simulators such as FAST can be used with a much greater degree of confidence in design processes for commercial development of floating offshore wind turbines. The test program subsequently described in this report was performed at MARIN (Maritime Research Institute Netherlands) in Wageningen, the Netherlands. The models considered consisted of the horizontal axis, NREL 5 MW Reference Wind Turbine (Jonkman et al., 2009) with a flexible tower affixed atop three distinct platforms: a tension leg platform (TLP), a spar-buoy modeled after the OC3 Hywind (Jonkman, 2010) and a semi-submersible. The three generic platform designs were intended to cover the spectrum of currently investigated concepts, each based on proven floating offshore structure technology. The models were tested under Froude scale wind and wave loads. The high-quality wind environments, unique to these tests, were realized in the offshore basin via a novel wind machine which exhibits negligible swirl and low turbulence intensity in the flow field. Recorded data from the floating wind turbine models included rotor torque and position, tower top and base forces and moments, mooring line tensions, six-axis platform motions and accelerations at key locations on the nacelle, tower, and platform. A large number of tests were performed ranging from simple free-decay tests to complex operating conditions with irregular sea states and dynamic winds.
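
    As a rough illustration of what testing "under Froude scale" implies at 1:50, the sketch below lists the standard Froude scaling ratios between full scale and model scale. The function names are hypothetical, and the sketch assumes the same fluid density at both scales (any fresh water versus sea water correction is ignored); it is not taken from the report itself.

```python
import math


def froude_factors(scale):
    """Full-scale / model-scale ratios for a geometric scale of 1:scale.

    Froude similitude keeps U / sqrt(g * L) equal at both scales, so with
    length ratio lam: time and velocity scale with sqrt(lam), accelerations
    are unchanged, and mass and force scale with lam**3 (equal fluid density
    assumed).
    """
    lam = float(scale)
    return {
        "length": lam,
        "time": math.sqrt(lam),
        "velocity": math.sqrt(lam),
        "acceleration": 1.0,
        "mass": lam ** 3,
        "force": lam ** 3,
    }


def to_full_scale(model_value, quantity, scale=50):
    """Convert a model-scale measurement to its full-scale equivalent."""
    return model_value * froude_factors(scale)[quantity]


if __name__ == "__main__":
    # Hypothetical model-scale readings from a basin test.
    print(to_full_scale(0.12, "length"))  # wave height in m
    print(to_full_scale(1.4, "time"))     # wave period in s
    print(to_full_scale(3.0, "force"))    # mooring tension in N
```

    Because accelerations are unchanged under this scaling, accelerometer readings from the model carry over directly, while forces grow with the cube of the scale factor.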

  16. Parametric Study of the Effect of Membrane Tension on Sunshield Dynamics

    NASA Technical Reports Server (NTRS)

    Ross, Brian; Johnston, John D.; Smith, James

    2002-01-01

    The NGST sunshield is a lightweight, flexible structure consisting of pretensioned membranes supported by deployable booms. The structural dynamic behavior of the sunshield must be well understood in order to predict its influence on observatory performance. A 1/10th scale model of the sunshield has been developed for ground testing to provide data to validate modeling techniques for thin film membrane structures. The validated models can then be used to predict the behaviour of the full scale sunshield. This paper summarizes the most recent tests performed on the 1/10th scale sunshield to study the effect of membrane preload on sunshield dynamics. Topics to be covered include the test setup, procedures, and a summary of results.

  17. SAS molecular tests Salmonella detection kit. Performance tested method 021202.

    PubMed

    Bapanpally, Chandra; Montier, Laura; Khan, Shah; Kasra, Akif; Brunelle, Sharon L

    2014-01-01

    The SAS Molecular tests Salmonella Detection method, a Loop-mediated Isothermal Amplification method, performed as well as or better than the U.S. Department of Agriculture-Food Safety Inspection Service Microbiology Laboratory Guidebook and the U.S. Food and Drug Administration Bacteriological Analytical Manual reference methods for ground beef, beef trim, ground turkey, chicken carcass rinses, bagged mixed lettuce, and fresh spinach. The ground beef (30% fat, 25 g test portion), poultry matrixes and leafy greens were validated in a 6-7 h enrichment, and ground beef (30% fat, 375 g composite test portion) and beef trim (375 g composite test portion) were validated in a 16-20 h enrichment. The method performance for meat and leafy green matrixes was shown to be acceptable under conditions of co-enrichment with Escherichia coli O157. Thus, after a short 6-7 h co-enrichment step, ground beef, beef trim, lettuce, and spinach can be tested for both Salmonella and E. coli O157. Inclusivity and exclusivity testing revealed no false negatives and no false positives among the 100 Salmonella serovars and 30 non-Salmonella species examined. The method was shown to be robust when enrichment time, DNA extract hold time, and DNA volume were varied.

  18. Alcohol calibration of tests measuring skills related to car driving.

    PubMed

    Jongen, Stefan; Vuurman, Eric; Ramaekers, Jan; Vermeeren, Annemiek

    2014-06-01

    Medication and illicit drugs can have detrimental side effects which impair driving performance. A drug's impairing potential should be determined by well-validated, reliable, and sensitive tests and ideally be calibrated by benchmark drugs and doses. To date, no consensus has been reached on the issue of which psychometric tests are best suited for initial screening of a drug's driving impairment potential. The aim of this alcohol calibration study is to determine which performance tests are useful to measure drug-induced impairment. The effects of alcohol are used to compare the psychometric quality between tests and as benchmark to quantify performance changes in each test associated with potentially impairing drug effects. Twenty-four healthy volunteers participated in a double-blind, four-way crossover study. Treatments were placebo and three different doses of alcohol leading to blood alcohol concentrations (BACs) of 0.2, 0.5, and 0.8 g/L. Main effects of alcohol were found in most tests. Compared with placebo, performance in the Divided Attention Test (DAT) was significantly impaired after all alcohol doses and performance in the Psychomotor Vigilance Test (PVT) and the Balance Test was impaired with a BAC of 0.5 and 0.8 g/L. The largest effect sizes were found on postural balance with eyes open and mean reaction time in the divided attention and the psychomotor vigilance test. The preferable tests for initial screening are the DAT and the PVT, as these tests were the most sensitive to the impairing effects of alcohol and are considerably valid in assessing potential driving impairment.

  19. Validity of the modified back-saver sit-and-reach test: a comparison with other protocols.

    PubMed

    Hui, S S; Yuen, P Y

    2000-09-01

    Studies have shown that the classical sit-and-reach (CSR) test, the modified sit-and-reach (MSR), and the newly developed back-saver sit-and-reach (BS) test have poor criterion-related validity in estimating low-back flexibility but yielded moderate criterion-related validity in hamstring flexibility. The V sit-and-reach (VSR) test was found to be practical but the validity has not been established. The purpose of this study was to propose a modified back-saver sit-and-reach (MBS) test, which incorporated all advantages of the various protocols, and to compare the criterion-related validity and reliability of all these tests. 158 college students (F = 96, and M = 62; age = 20.77 +/- 2.51) performed CSR, VSR, BS (left and right leg), and MBS (left and right leg) tests in a randomized order. Scores from each test were then correlated with the criterion measures. For all sit-and-reach tests, intraclass reliability (single trial) was very high (r = 0.89-0.98). MBS yielded significant and the highest r with low-back and hamstring criterion for men (r = 0.47-0.67) and women (r = 0.23-0.54). The low-back and right hamstring validity of MBS for men were significantly (P < 0.01) higher than those from BS and CSR, whereas no differences in criterion-related validity were found between the MBS and other protocols in women. The ratings of perceived comfort among the sit-and-reach protocols were significantly different (P < 0.001) from each other. The MBS was rated the most comfortable test compared with the other protocols. The MBS test is not only a reliable test for hamstring and low-back flexibility; it is also more practical, with improved validity for hamstring and low-back flexibility in men compared with previous protocols.

  20. Practical Aspects of Designing and Conducting Validation Studies Involving Multi-study Trials.

    PubMed

    Coecke, Sandra; Bernasconi, Camilla; Bowe, Gerard; Bostroem, Ann-Charlotte; Burton, Julien; Cole, Thomas; Fortaner, Salvador; Gouliarmou, Varvara; Gray, Andrew; Griesinger, Claudius; Louhimies, Susanna; Gyves, Emilio Mendoza-de; Joossens, Elisabeth; Prinz, Maurits-Jan; Milcamps, Anne; Parissis, Nicholaos; Wilk-Zasadna, Iwona; Barroso, João; Desprez, Bertrand; Langezaal, Ingrid; Liska, Roman; Morath, Siegfried; Reina, Vittorio; Zorzoli, Chiara; Zuang, Valérie

    This chapter focuses on practical aspects of conducting prospective in vitro validation studies, and in particular, by laboratories that are members of the European Union Network of Laboratories for the Validation of Alternative Methods (EU-NETVAL) that is coordinated by the EU Reference Laboratory for Alternatives to Animal Testing (EURL ECVAM). Prospective validation studies involving EU-NETVAL, comprising a multi-study trial involving several laboratories or "test facilities", typically consist of two main steps: (1) the design of the validation study by EURL ECVAM and (2) the execution of the multi-study trial by a number of qualified laboratories within EU-NETVAL, coordinated and supported by EURL ECVAM. The approach adopted in the conduct of these validation studies adheres to the principles described in the OECD Guidance Document on the Validation and International Acceptance of new or updated test methods for Hazard Assessment No. 34 (OECD 2005). The context and scope of conducting prospective in vitro validation studies is dealt with in Chap. 4. Here we focus mainly on the processes followed to carry out a prospective validation of in vitro methods involving different laboratories with the ultimate aim of generating a dataset that can support a decision in relation to the possible development of an international test guideline (e.g. by the OECD) or the establishment of performance standards.

  1. Development and psychometric evaluation of a cardiovascular risk and disease management knowledge assessment tool.

    PubMed

    Rosneck, James S; Hughes, Joel; Gunstad, John; Josephson, Richard; Noe, Donald A; Waechter, Donna

    2014-01-01

    This article describes the systematic construction and psychometric analysis of a knowledge assessment instrument for phase II cardiac rehabilitation (CR) patients measuring risk modification disease management knowledge and behavioral outcomes derived from national standards relevant to secondary prevention and management of cardiovascular disease. First, using adult curriculum based on disease-specific learning outcomes and competencies, a systematic test item development process was completed by clinical staff. Second, a panel of educational and clinical experts used an iterative process to identify test content domain and arrive at consensus in selecting items meeting criteria. Third, the resulting 31-question instrument, the Cardiac Knowledge Assessment Tool (CKAT), was piloted in CR patients to ensure use of application. Validity and reliability analyses were performed on 3638 adults before test administrations with additional focused analyses on 1999 individuals completing both pretreatment and posttreatment administrations within 6 months. Evidence of CKAT content validity was substantiated, with 85% agreement among content experts. Evidence of construct validity was demonstrated via factor analysis identifying key underlying factors. Estimates of internal consistency, for example, Cronbach's α = .852 and Spearman-Brown split-half reliability = 0.817 on pretesting, support test reliability. Item analysis, using point biserial correlation, measured relationships between performance on single items and total score (P < .01). Analyses using item difficulty and item discrimination indices further verified item stability and validity of the CKAT. A knowledge instrument specifically designed for an adult CR population was systematically developed and tested in a large representative patient population, satisfying psychometric parameters, including validity and reliability.
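
    Two of the statistics named in this abstract, the Spearman-Brown split-half correction and the point-biserial item-total correlation, can be sketched as follows. The function names and data are hypothetical and the sketch is not the authors' analysis code; it assumes dichotomously scored items (1 = correct, 0 = incorrect).

```python
import math
from statistics import mean, pstdev


def spearman_brown(r_half, factor=2.0):
    """Predicted reliability when test length is multiplied by `factor`.

    With factor=2 this is the usual split-half correction:
    r_full = 2 * r_half / (1 + r_half).
    """
    return factor * r_half / (1 + (factor - 1) * r_half)


def point_biserial(item, total):
    """Correlation between a 0/1 item and total scores.

    r_pb = (M1 - M0) / s * sqrt(p * (1 - p)), where M1 and M0 are the mean
    totals of respondents scoring 1 and 0 on the item, s is the population
    standard deviation of the totals, and p is the proportion scoring 1.
    """
    ones = [t for i, t in zip(item, total) if i == 1]
    zeros = [t for i, t in zip(item, total) if i == 0]
    p = len(ones) / len(item)
    return (mean(ones) - mean(zeros)) / pstdev(total) * math.sqrt(p * (1 - p))


if __name__ == "__main__":
    # Hypothetical half-test correlation and item responses.
    print(round(spearman_brown(0.69), 3))
    item = [0, 1, 1, 0, 1, 1]
    total = [12, 25, 22, 15, 28, 19]
    print(round(point_biserial(item, total), 3))
```

    A higher point-biserial value means that respondents who answer the item correctly also tend to score higher overall, which is the sense in which item analysis supports item discrimination.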

  2. Identification and Validation of a Brief Test Anxiety Screening Tool

    ERIC Educational Resources Information Center

    von der Embse, Nathaniel P.; Kilgus, Stephen P.; Segool, Natasha; Putwain, Dave

    2013-01-01

    The implementation of test-based accountability policies around the world has increased the pressure placed on students to perform well on state achievement tests. Educational researchers have begun taking a closer look at the reciprocal effects of test anxiety and high-stakes testing. However, existing test anxiety assessments lack efficiency and…

  3. Latent Profile Analyses of Test Anxiety: A Pilot Study

    ERIC Educational Resources Information Center

    von der Embse, Nathaniel P.; Mata, Andrea D.; Segool, Natasha; Scott, Emma-Catherine

    2014-01-01

    In an era of test-based accountability, there has been a renewed interest in understanding the relationship between test anxiety and test performance. The development and validation of test anxiety scales have grown with the rise of test anxiety research. Research is needed to critically examine the psychometric properties of these scales prior to…

  4. The English and Chinese versions of the five-level EuroQoL Group's five-dimension questionnaire (EQ-5D) were valid and reliable and provided comparable scores in Asian breast cancer patients.

    PubMed

    Lee, Chun Fan; Ng, Raymond; Luo, Nan; Wong, Nan Soon; Yap, Yoon Sim; Lo, Soo Kien; Chia, Whay Kuang; Yee, Alethea; Krishna, Lalit; Wong, Celest; Goh, Cynthia; Cheung, Yin Bun

    2013-01-01

    To examine the measurement properties of and comparability between the English and Chinese versions of the five-level EuroQoL Group's five-dimension questionnaire (EQ-5D) in breast cancer patients in Singapore. This is an observational study of 269 patients. Known-group validity and responsiveness of the EQ-5D utility index and visual analog scale (VAS) were assessed in relation to various clinical characteristics and longitudinal change in performance status, respectively. Convergent and divergent validity was examined by correlation coefficients between the EQ-5D and a breast cancer-specific instrument. Test-retest reliability was evaluated. The two language versions were compared by multiple regression analyses. For both English and Chinese versions, the EQ-5D utility index and VAS demonstrated known-group validity and convergent and divergent validity, and presented sufficient test-retest reliability (intraclass correlation = 0.72 to 0.83). The English version was responsive to changes in performance status. The Chinese version was responsive to decline in performance status, but there was no conclusive evidence about its responsiveness to improvement in performance status. In the comparison analyses of the utility index and VAS between the two language versions, borderline results were obtained, and equivalence cannot be definitely confirmed. The five-level EQ-5D is valid, responsive, and reliable in assessing health outcome of breast cancer patients. The English and Chinese versions provide comparable measurement results.

  5. A Predictive Validity Study of Creative and Effective Managerial Performance.

    ERIC Educational Resources Information Center

    Moffie, D. J.; Goodner, Susan

    This study tests the following hypotheses concerning the job creativity of managers: (1) There is a significant relationship between psychological test scores secured on subjects 15 to 20 years ago and creative performance on the job today, (2) there is a significant relationship between biographical information secured from subjects at the time…

  6. Rater Expertise in a Second Language Speaking Assessment: The Influence of Training and Experience

    ERIC Educational Resources Information Center

    Davis, Lawrence Edward

    2012-01-01

    Speaking performance tests typically employ raters to produce scores; accordingly, variability in raters' scoring decisions has important consequences for test reliability and validity. One such source of variability is the rater's level of expertise in scoring. Therefore, it is important to understand how raters' performance is influenced by…

  7. Is viscoelastic coagulation monitoring with ROTEM or TEG validated?

    PubMed

    Solomon, Cristina; Asmis, Lars M; Spahn, Donat R

    2016-10-01

    Recent years have seen increasing worldwide interest in the use of viscoelastic coagulation monitoring tests, performed using devices such as ROTEM and TEG. The use of such tests to guide haemostatic therapy may help reduce transfusion of allogeneic blood products in bleeding patients and is supported in European guidelines for managing trauma and severe perioperative bleeding. In addition, viscoelastic tests form the basis of numerous published treatment algorithms. However, some publications have stated that viscoelastic tests are not validated. A specific definition of the term validation is lacking and regulatory requirements of the US Food and Drug Administration (FDA) and European Medicines Agency (EMA) have been fulfilled by ROTEM and TEG assays. Viscoelastic tests have been used in pivotal clinical trials, and they are approved for use in most of the world's countries. Provided that locally approved indications are adhered to, the regulatory framework for clinicians to use viscoelastic tests in routine clinical practice is in place.

  8. [Methodologic and clinical comparison of four different ergospirometry systems].

    PubMed

    Winter, U J; Fritsch, J; Gitt, A K; Pothoff, G; Berge, P G; Hilger, H H

    1994-01-01

    The clinician who uses cardio-pulmonary exercise testing (CPX) systems relies on the technical information from the device producers. In this paper, the practicability, the accuracy and the safety of four different, available CPX systems are compared in the clinical area, using clinically orientated criteria. The exercise tests were performed in healthy subjects, in patients with cardiac and/or pulmonary disease as well as in young or old people. The comparison study showed that there were partially large differences in device design and measurement accuracy. Furthermore, our investigation demonstrated that, besides repeated calibration of the CPX systems, frequent validation of the devices by means of a metabolic simulator is necessary. Problems in calibration can be caused by an inadequate performance or by unclean calibration gases. Problems in validation can be due to incompatibility of the CPX device and the validator. The comparison study of the four different systems showed that in the future standards for CPX testing should be defined.

  9. Validation of Modifications to the ANSR for Listeria Method for Improved Internal Positive Control Performance.

    PubMed

    Alles, Susan; Meister, Evan; Hosking, Edan; Tovar, Eric; Shaulis, Rebecca; Schonfeld, Mark; Zhang, Lei; Li, Lin; Biswas, Preetha; Mozola, Mark; Donofrio, Robert; Chen, Yi

    2018-03-01

    A study was conducted to validate a minor reagent formulation change to the ANSR for Listeria method, Performance Tested Method℠ 101202. This change involves increasing the master mix volume prelyophilization by 40% and addition of salmon sperm DNA (nontarget DNA) to the master mix. These changes improve the robustness of the internal positive control response and reduce the possibility of obtaining invalid results due to weak-positive control curves. When three foods (hot dogs, Mexican-style cheese, and cantaloupe) and sponge samples taken from a stainless steel surface were tested, no significant differences in performance between the ANSR and U.S. Food and Drug Administration Bacteriological Analytical Manual or U.S. Department of Agriculture-Food Safety and Inspection Service Microbiology Laboratory Guidebook reference culture procedures were observed for any of the matrixes as determined by probability of detection analysis. Inclusivity and exclusivity testing yielded 100% expected results for target and nontarget bacteria. Accelerated stability testing was carried out over a 7 week period and showed no decrease in assay performance over time.

  10. Damage Assessment of a Full-Scale Six-Story Wood-Frame Building Following Triaxial Shake Table Tests

    Treesearch

    John W. van de Lindt; Rakesh Gupta; Shiling Pei; Kazuki Tachibana; Yasuhiro Araki; Douglas Rammer; Hiroshi Isoda

    2012-01-01

    In the summer of 2009, a full-scale midrise wood-frame building was tested under a series of simulated earthquakes on the world's largest shake table in Miki City, Japan. The objective of this series of tests was to validate a performance-based seismic design approach by qualitatively and quantitatively examining the building's seismic performance in terms of...

  11. Development and Validation of High Precision Thermal, Mechanical, and Optical Models for the Space Interferometry Mission

    NASA Technical Reports Server (NTRS)

    Lindensmith, Chris A.; Briggs, H. Clark; Beregovski, Yuri; Feria, V. Alfonso; Goullioud, Renaud; Gursel, Yekta; Hahn, Inseob; Kinsella, Gary; Orzewalla, Matthew; Phillips, Charles

    2006-01-01

    SIM Planetquest (SIM) is a large optical interferometer for making microarcsecond measurements of the positions of stars, and for detecting Earth-sized planets around nearby stars. To achieve this precision, SIM requires stability of optical components to tens of picometers per hour. The combination of SIM's large size (9 meter baseline) and the high stability requirement makes it difficult and costly to measure all aspects of system performance on the ground. To reduce risks and costs and to allow for a design with fewer intermediate testing stages, the SIM project is developing an integrated thermal, mechanical and optical modeling process that will allow predictions of the system performance to be made at the required high precision. This modeling process uses commercial, off-the-shelf tools and has been validated against experimental results at the precision of the SIM performance requirements. This paper presents the description of the model development, some of the models, and their validation in the Thermo-Opto-Mechanical (TOM3) testbed which includes full scale brassboard optical components and the metrology to test them at the SIM performance requirement levels.

  12. Validation of tsunami inundation model TUNA-RP using OAR-PMEL-135 benchmark problem set

    NASA Astrophysics Data System (ADS)

    Koh, H. L.; Teh, S. Y.; Tan, W. K.; Kh'ng, X. Y.

    2017-05-01

    A standard set of benchmark problems, known as OAR-PMEL-135, is developed by the US National Tsunami Hazard Mitigation Program for tsunami inundation model validation. Any tsunami inundation model must be tested for its accuracy and capability using this standard set of benchmark problems before it can be gainfully used for inundation simulation. The authors have previously developed an in-house tsunami inundation model known as TUNA-RP. This inundation model solves the two-dimensional nonlinear shallow water equations coupled with a wet-dry moving boundary algorithm. This paper presents the validation of TUNA-RP against the solutions provided in the OAR-PMEL-135 benchmark problem set. This benchmark validation testing shows that TUNA-RP can indeed perform inundation simulation with accuracy consistent with that in the tested benchmark problem set.

  13. Digital Fly-By-Wire Flight Control Validation Experience

    NASA Technical Reports Server (NTRS)

    Szalai, K. J.; Jarvis, C. R.; Krier, G. E.; Megna, V. A.; Brock, L. D.; Odonnell, R. N.

    1978-01-01

    The experience gained in digital fly-by-wire technology through a flight test program being conducted by the NASA Dryden Flight Research Center in an F-8C aircraft is described. The system requirements are outlined, along with the requirements for flight qualification. The system is described, including the hardware components, the aircraft installation, and the system operation. The flight qualification experience is emphasized. The qualification process included the theoretical validation of the basic design, laboratory testing of the hardware and software elements, systems level testing, and flight testing. The most productive testing was performed on an iron bird aircraft, which used the actual electronic and hydraulic hardware and a simulation of the F-8 characteristics to provide the flight environment. The iron bird was used for sensor and system redundancy management testing, failure modes and effects testing, and stress testing in many cases with the pilot in the loop. The flight test program confirmed the quality of the validation process by achieving 50 flights without a known undetected failure and with no false alarms.

  14. Analytical difficulties facing today's regulatory laboratories: issues in method validation.

    PubMed

    MacNeil, James D

    2012-08-01

    The challenges facing analytical laboratories today are not unlike those faced in the past, although both the degree of complexity and the rate of change have increased. Challenges such as development and maintenance of expertise, maintenance and up-dating of equipment, and the introduction of new test methods have always been familiar themes for analytical laboratories, but international guidelines for laboratories involved in the import and export testing of food require management of such changes in a context which includes quality assurance, accreditation, and method validation considerations. Decisions as to when a change in a method requires re-validation of the method or on the design of a validation scheme for a complex multi-residue method require a well-considered strategy, based on a current knowledge of international guidance documents and regulatory requirements, as well as the laboratory's quality system requirements. Validation demonstrates that a method is 'fit for purpose', so the requirement for validation should be assessed in terms of the intended use of a method and, in the case of change or modification of a method, whether that change or modification may affect a previously validated performance characteristic. In general, method validation involves method scope, calibration-related parameters, method precision, and recovery. Any method change which may affect method scope or any performance parameters will require re-validation. Some typical situations involving change in methods are discussed and a decision process proposed for selection of appropriate validation measures. © 2012 John Wiley & Sons, Ltd.

  15. Development, validity, and reliability of a ballet-specific aerobic fitness test.

    PubMed

    Twitchett, Emily; Nevill, Alan; Angioi, Manuela; Koutedakis, Yiannis; Wyon, Matthew

    2011-09-01

    The aim of this study was to develop and assess the reliability and validity of a multi-stage, ballet-specific aerobic fitness test to be used in a dance studio setting. The test consists of five stages, each four minutes long, that increase in intensity. It uses classical ballet movement of an intermediate-level of difficulty, thus emphasizing physiological demand rather than skill. The demand of each stage was determined by calculating the mean oxygen uptake during its final minute using a portable gas analyser. After an initial familiarization period, eight female subjects performed the test twice within seven days. The results showed significant differences in oxygen consumption between stages (p < 0.001), but not between trials. Pearson correlation coefficients produced a very good linear relationship between trials (r = 0.998, p < 0.001). Bland-Altman reliability analysis revealed the 95% limits of agreement to be ± 6.2 ml·kg⁻¹·min⁻¹, showing good agreement between trials. The oxygen uptake in our subjects equated positively to previous estimates for class and performance, confirming validity. It was concluded that the test is suitable for use among classical ballet dancers, with many possible applications.
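
    The Bland-Altman 95% limits of agreement mentioned above can be sketched as the mean between-trial difference plus or minus 1.96 standard deviations of the differences. The function name and the oxygen-uptake values below are hypothetical, not data from the study, and the sketch assumes paired measurements with approximately normally distributed differences.

```python
from statistics import mean, stdev


def limits_of_agreement(trial1, trial2):
    """Bland-Altman bias and 95% limits of agreement for paired measurements.

    Returns (lower limit, bias, upper limit), where bias is the mean of the
    paired differences and the limits are bias -/+ 1.96 * SD of differences.
    """
    diffs = [a - b for a, b in zip(trial1, trial2)]
    bias = mean(diffs)
    half_width = 1.96 * stdev(diffs)
    return bias - half_width, bias, bias + half_width


if __name__ == "__main__":
    # Hypothetical oxygen uptake (ml/kg/min) for eight dancers on two trials.
    first = [38.2, 41.5, 44.0, 36.9, 47.3, 40.1, 43.6, 39.8]
    second = [37.5, 42.8, 43.1, 38.0, 46.2, 41.4, 42.9, 40.6]
    lower, bias, upper = limits_of_agreement(first, second)
    print(round(lower, 2), round(bias, 2), round(upper, 2))
```

    Narrow limits centred near zero indicate that repeated trials give similar values for the same person, which a high Pearson correlation alone cannot show.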

  16. Development and Validation of a Translation Test.

    ERIC Educational Resources Information Center

    Ghonsooly, Behzad

    1993-01-01

    Translation testing methodology has been criticized for its subjective character. No real strides have so far been made in developing an objective translation test. In this paper, certain detailed procedures including various phases of pretesting have been performed to achieve objectivity and scorability in translation testing methodology. In…

  17. Testing Integrity Symposium: Issues and Recommendations for Best Practice

    ERIC Educational Resources Information Center

    National Center for Education Statistics, 2013

    2013-01-01

    Educators, parents, and the public depend on accurate, valid, reliable, and timely information about student academic performance. Testing irregularities--breaches of test security or improper administration of academic testing--undermine efforts to use those data to improve student achievement. Unfortunately, there have been high-profile and…

  18. Validity of the Symbol Digit Modalities Test as a cognition performance outcome measure for multiple sclerosis

    PubMed Central

    Benedict, Ralph HB; DeLuca, John; Phillips, Glenn; LaRocca, Nicholas; Hudson, Lynn D; Rudick, Richard

    2017-01-01

    Cognitive and motor performance measures are commonly employed in multiple sclerosis (MS) research, particularly when the purpose is to determine the efficacy of treatment. The increasing focus of new therapies on slowing progression or reversing neurological disability makes the utilization of sensitive, reproducible, and valid measures essential. Processing speed is a basic elemental cognitive function that likely influences downstream processes such as memory. The Multiple Sclerosis Outcome Assessments Consortium (MSOAC) includes representatives from advocacy organizations, Food and Drug Administration (FDA), European Medicines Agency (EMA), National Institute of Neurological Disorders and Stroke (NINDS), academic institutions, and industry partners along with persons living with MS. Among the MSOAC goals is acceptance and qualification by regulators of performance outcomes that are highly reliable and valid, practical, cost-effective, and meaningful to persons with MS. A critical step for these neuroperformance metrics is elucidation of clinically relevant benchmarks, well-defined degrees of disability, and gradients of change that are deemed clinically meaningful. This topical review provides an overview of research on one particular cognitive measure, the Symbol Digit Modalities Test (SDMT), recognized as being particularly sensitive to slowed processing of information that is commonly seen in MS. The research in MS clearly supports the reliability and validity of this test and recently has supported a responder definition of SDMT change approximating 4 points or 10% in magnitude. PMID:28206827

  19. Validity of the Symbol Digit Modalities Test as a cognition performance outcome measure for multiple sclerosis.

    PubMed

    Benedict, Ralph HB; DeLuca, John; Phillips, Glenn; LaRocca, Nicholas; Hudson, Lynn D; Rudick, Richard

    2017-04-01

    Cognitive and motor performance measures are commonly employed in multiple sclerosis (MS) research, particularly when the purpose is to determine the efficacy of treatment. The increasing focus of new therapies on slowing progression or reversing neurological disability makes the utilization of sensitive, reproducible, and valid measures essential. Processing speed is a basic elemental cognitive function that likely influences downstream processes such as memory. The Multiple Sclerosis Outcome Assessments Consortium (MSOAC) includes representatives from advocacy organizations, Food and Drug Administration (FDA), European Medicines Agency (EMA), National Institute of Neurological Disorders and Stroke (NINDS), academic institutions, and industry partners along with persons living with MS. Among the MSOAC goals is acceptance and qualification by regulators of performance outcomes that are highly reliable and valid, practical, cost-effective, and meaningful to persons with MS. A critical step for these neuroperformance metrics is elucidation of clinically relevant benchmarks, well-defined degrees of disability, and gradients of change that are deemed clinically meaningful. This topical review provides an overview of research on one particular cognitive measure, the Symbol Digit Modalities Test (SDMT), recognized as being particularly sensitive to slowed processing of information that is commonly seen in MS. The research in MS clearly supports the reliability and validity of this test and recently has supported a responder definition of SDMT change approximating 4 points or 10% in magnitude.
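
    The responder definition mentioned above (a change in SDMT score of roughly 4 points or 10%) can be sketched as a simple check. This is only an illustration: the abstract does not say whether the two criteria are alternatives or how direction of change is handled, so combining them with "or" and treating change in either direction as qualifying are assumptions. The function name, parameters, and scores are hypothetical.

```python
def is_sdmt_responder(baseline, follow_up, abs_threshold=4.0, rel_threshold=0.10):
    """Flag a change in SDMT score as clinically meaningful.

    Assumption: a change counts if it reaches EITHER about 4 raw points
    OR about 10% of the baseline score, in either direction.
    """
    if baseline <= 0:
        raise ValueError("baseline score must be positive")
    change = abs(follow_up - baseline)
    return change >= abs_threshold or change / baseline >= rel_threshold


if __name__ == "__main__":
    # Hypothetical scores, not taken from any study.
    print(is_sdmt_responder(50, 54))  # 4-point change
    print(is_sdmt_responder(50, 52))  # 2-point change, 4% of baseline
    print(is_sdmt_responder(20, 23))  # 3-point change, 15% of baseline
```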

  20. Performance of the Colson MAM BP 3AA1-2 automatic blood pressure monitor according to the European Society of Hypertension validation protocol.

    PubMed

    Pereira, Telmo; Maldonado, João

    2005-11-01

    This study evaluated the performance of the Colson MAM BP 3AA1-2 oscillometric automatic blood pressure monitor according to the validation protocol of the European Society of Hypertension, testing its suitability for self-measurement of blood pressure. The performance of the device was assessed in relation to various clinical variables, including age, gender, body mass index, arm circumference and arterial stiffness. Thirty-three subjects (15 men and 18 women), with a mean age of 47 ± 10 years, were studied according to the procedures laid down in the European Society of Hypertension validation protocol. Sequential same-arm blood pressure measurements were made, alternating between a mercury standard and the automatic device. The differences between the test and control measurements were assessed and divided into categorization zones of 5, 10 and 15 mmHg discrepancy. Aortic pulse wave velocity was assessed in all subjects with a Complior device (Colson, Paris). The Colson MAM BP 3AA1-2 passed all three phases of the protocol for both systolic and diastolic blood pressure. The mean differences between the test and control measurements were -1.0 ± 5.0 mmHg for systolic blood pressure and -1.1 ± 4.1 mmHg for diastolic blood pressure. Both standard deviations are well below the 8 mmHg limit proposed by the Association for the Advancement of Medical Instrumentation. The predictive value of various clinical variables for the discrepancies was assessed by a regression model analysis, with no variable being found that independently undermined the performance of the monitor. In another regression analysis, we found a similar relation between test and control blood pressures and aortic pulse wave velocity, a widely recognized and validated index of target organ damage. These data show that the Colson MAM BP 3AA1-2 satisfies the quality requirements proposed by the European Society of Hypertension, demonstrating its suitability for inclusion in integrated programs of clinical surveillance based on self-measurement of blood pressure. The uniformity of its performance over a wide spectrum of clinical characteristics and the relation found with pulse wave velocity further reinforce its clinical validity.
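
    The summary statistics described in this abstract (mean ± SD of the device-minus-reference differences, and the number of differences falling within 5, 10 and 15 mmHg) can be sketched as follows. This is a minimal illustration only: the function name and readings are hypothetical, and the pass/fail thresholds of the European Society of Hypertension protocol are deliberately not reproduced because the abstract does not give them. The 8 mmHg SD limit is the one the abstract attributes to the Association for the Advancement of Medical Instrumentation.

```python
import statistics


def summarize_differences(device, reference):
    """Summarize paired device-vs-reference blood pressure readings (mmHg)."""
    if len(device) != len(reference) or len(device) < 2:
        raise ValueError("need two equally long series with at least 2 pairs")
    diffs = [d - r for d, r in zip(device, reference)]
    abs_diffs = [abs(x) for x in diffs]
    return {
        "mean": statistics.mean(diffs),   # average device-minus-reference difference
        "sd": statistics.stdev(diffs),    # sample SD of the differences
        # Cumulative counts of readings within each discrepancy zone.
        "within_5": sum(a <= 5 for a in abs_diffs),
        "within_10": sum(a <= 10 for a in abs_diffs),
        "within_15": sum(a <= 15 for a in abs_diffs),
    }


if __name__ == "__main__":
    # Hypothetical systolic readings, not data from the study.
    device = [118, 131, 142, 125, 137, 121]
    mercury = [120, 128, 145, 126, 133, 124]
    summary = summarize_differences(device, mercury)
    print(summary)
    print("SD below 8 mmHg limit:", summary["sd"] < 8)
```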
