validation studies performed: Topics by Science.gov

Sample records for validation studies performed

An empirical assessment of validation practices for molecular classifiers

PubMed Central

Castaldi, Peter J.; Dahabreh, Issa J.

2011-01-01

Proposed molecular classifiers may be overfit to idiosyncrasies of noisy genomic and proteomic data. Cross-validation methods are often used to obtain estimates of classification accuracy, but both simulations and case studies suggest that, when inappropriate methods are used, bias may ensue. Bias can be bypassed and generalizability can be tested by external (independent) validation. We evaluated 35 studies that have reported on external validation of a molecular classifier. We extracted information on study design and methodological features, and compared the performance of molecular classifiers in internal cross-validation versus external validation for 28 studies where both had been performed. We demonstrate that the majority of studies pursued cross-validation practices that are likely to overestimate classifier performance. Most studies were markedly underpowered to detect a 20% decrease in sensitivity or specificity between internal cross-validation and external validation [median power was 36% (IQR, 21–61%) and 29% (IQR, 15–65%), respectively]. The median reported classification performance for sensitivity and specificity was 94% and 98%, respectively, in cross-validation and 88% and 81% for independent validation. The relative diagnostic odds ratio was 3.26 (95% CI 2.04–5.21) for cross-validation versus independent validation. Finally, we reviewed all studies (n = 758) which cited those in our study sample, and identified only one instance of additional subsequent independent validation of these classifiers. In conclusion, these results document that many cross-validation practices employed in the literature are potentially biased and genuine progress in this field will require adoption of routine external validation of molecular classifiers, preferably in much larger studies than in current practice. PMID:21300697
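As a rough illustration of the diagnostic odds ratio (DOR) mentioned in the abstract above, the sketch below computes a DOR from one sensitivity/specificity pair using the standard definition. Plugging in the reported median values is an assumption made here for illustration only. The relative DOR of 3.26 reported in the abstract is an estimate across studies and is not expected to equal a ratio of DORs computed from medians. The function name is hypothetical.

```python
def diagnostic_odds_ratio(sensitivity: float, specificity: float) -> float:
    """Return DOR = [sens / (1 - sens)] * [spec / (1 - spec)]."""
    if not (0.0 < sensitivity < 1.0 and 0.0 < specificity < 1.0):
        # A value of exactly 0 or 1 makes the odds zero or undefined.
        raise ValueError("sensitivity and specificity must lie strictly between 0 and 1")
    return (sensitivity / (1.0 - sensitivity)) * (specificity / (1.0 - specificity))


# Median values quoted in the abstract (illustrative use only).
cv_dor = diagnostic_odds_ratio(0.94, 0.98)   # internal cross-validation
ext_dor = diagnostic_odds_ratio(0.88, 0.81)  # independent validation

print(f"DOR from cross-validation medians: {cv_dor:.1f}")
print(f"DOR from external-validation medians: {ext_dor:.1f}")
```

The odds form makes it clear why small drops in specificity near 100% translate into large changes in the DOR.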
Design Characteristics Influence Performance of Clinical Prediction Rules in Validation: A Meta-Epidemiological Study

PubMed Central

Ban, Jong-Wook; Emparanza, José Ignacio; Urreta, Iratxe; Burls, Amanda

2016-01-01

Background: Many new clinical prediction rules are derived and validated. But the design and reporting quality of clinical prediction research has been less than optimal. We aimed to assess whether design characteristics of validation studies were associated with the overestimation of clinical prediction rules’ performance. We also aimed to evaluate whether validation studies clearly reported important methodological characteristics. Methods: Electronic databases were searched for systematic reviews of clinical prediction rule studies published between 2006 and 2010. Data were extracted from the eligible validation studies included in the systematic reviews. A meta-analytic meta-epidemiological approach was used to assess the influence of design characteristics on predictive performance. From each validation study, it was assessed whether 7 design and 7 reporting characteristics were properly described. Results: A total of 287 validation studies of clinical prediction rules were collected from 15 systematic reviews (31 meta-analyses). Validation studies using case-control design produced a summary diagnostic odds ratio (DOR) 2.2 times (95% CI: 1.2–4.3) larger than validation studies using cohort design and unclear design. When differential verification was used, the summary DOR was overestimated by twofold (95% CI: 1.2–3.1) compared to complete, partial and unclear verification. The summary RDOR of validation studies with inadequate sample size was 1.9 (95% CI: 1.2–3.1) compared to studies with adequate sample size. Study site, reliability, and clinical prediction rule were adequately described in 10.1%, 9.4%, and 7.0% of validation studies, respectively. Conclusion: Validation studies with design shortcomings may overestimate the performance of clinical prediction rules. The quality of reporting among studies validating clinical prediction rules needs to be improved. PMID:26730980
Design Characteristics Influence Performance of Clinical Prediction Rules in Validation: A Meta-Epidemiological Study.

PubMed

Ban, Jong-Wook; Emparanza, José Ignacio; Urreta, Iratxe; Burls, Amanda

2016-01-01

Many new clinical prediction rules are derived and validated. But the design and reporting quality of clinical prediction research has been less than optimal. We aimed to assess whether design characteristics of validation studies were associated with the overestimation of clinical prediction rules' performance. We also aimed to evaluate whether validation studies clearly reported important methodological characteristics. Electronic databases were searched for systematic reviews of clinical prediction rule studies published between 2006 and 2010. Data were extracted from the eligible validation studies included in the systematic reviews. A meta-analytic meta-epidemiological approach was used to assess the influence of design characteristics on predictive performance. From each validation study, it was assessed whether 7 design and 7 reporting characteristics were properly described. A total of 287 validation studies of clinical prediction rules were collected from 15 systematic reviews (31 meta-analyses). Validation studies using case-control design produced a summary diagnostic odds ratio (DOR) 2.2 times (95% CI: 1.2-4.3) larger than validation studies using cohort design and unclear design. When differential verification was used, the summary DOR was overestimated by twofold (95% CI: 1.2-3.1) compared to complete, partial and unclear verification. The summary RDOR of validation studies with inadequate sample size was 1.9 (95% CI: 1.2-3.1) compared to studies with adequate sample size. Study site, reliability, and clinical prediction rule were adequately described in 10.1%, 9.4%, and 7.0% of validation studies, respectively. Validation studies with design shortcomings may overestimate the performance of clinical prediction rules. The quality of reporting among studies validating clinical prediction rules needs to be improved.
The Relationship of Aptitudes to the Performance of Skilled Technical Jobs in Engine Manufacturing. Technical Report 1982-5 [and Supplement].

ERIC Educational Resources Information Center

Daniel, Mark; And Others

A study examined the relationship of aptitudes to the performance of skilled technical jobs in engine manufacturing. During the study, several approaches were utilized, including criterion-referenced validation, taxonomic validation, construct validation, and detailed analysis of the behaviors involved in performing the jobs. The study sample…
Performance and Symptom Validity Testing as a Function of Medical Board Evaluation in U.S. Military Service Members with a History of Mild Traumatic Brain Injury.

PubMed

Armistead-Jehle, Patrick; Cole, Wesley R; Stegman, Robert L

2018-02-01

The study was designed to replicate and extend previous findings demonstrating the high rates of invalid neuropsychological testing in military service members (SMs) with a history of mild traumatic brain injury (mTBI) assessed in the context of a medical evaluation board (MEB). Two hundred thirty-one active duty SMs (61 of whom were undergoing an MEB) underwent neuropsychological assessment. Performance validity (Word Memory Test) and symptom validity (MMPI-2-RF) test data were compared across those evaluated within disability (MEB) and clinical contexts. As with previous studies, there were significantly more individuals in an MEB context that failed performance (MEB = 57%, non-MEB = 31%) and symptom validity testing (MEB = 57%, non-MEB = 22%), and performance validity testing had a notable effect on cognitive test scores. Performance and symptom validity test failure rates did not vary as a function of the reason for disability evaluation when divided into behavioral versus physical health conditions. These data are consistent with past studies, and extend those studies by including symptom validity testing and investigating the effect of reason for MEB. This and previous studies demonstrate that more than 50% of SMs seen in the context of an MEB will fail performance validity tests and over-report on symptom validity measures. These results emphasize the importance of using both performance and symptom validity testing when evaluating SMs with a history of mTBI, especially if they are being seen for disability evaluations, in order to ensure the accuracy of cognitive and psychological test data. Published by Oxford University Press 2017. This work is written by (a) US Government employee(s) and is in the public domain in the US.
Development and Initial Validation of the Performance Perfectionism Scale for Sport (PPS-S)

ERIC Educational Resources Information Center

Hill, Andrew P.; Appleton, Paul R.; Mallinson, Sarah H.

2016-01-01

Valid and reliable instruments are required to appropriately study perfectionism. With this in mind, three studies are presented that describe the development and initial validation of a new instrument designed to measure multidimensional performance perfectionism for use in sport (Performance Perfectionism Scale--Sport [PPS-S]). The instrument is…
Unremarked or Unperformed? Systematic Review on Reporting of Validation Efforts of Health Economic Decision Models in Seasonal Influenza and Early Breast Cancer.

PubMed

de Boer, Pieter T; Frederix, Geert W J; Feenstra, Talitha L; Vemer, Pepijn

2016-09-01

Transparent reporting of validation efforts of health economic models gives stakeholders better insight into the credibility of model outcomes. In this study we reviewed recently published studies on seasonal influenza and early breast cancer in order to gain insight into the reporting of model validation efforts in the overall health economic literature. A literature search was performed in PubMed and Embase to retrieve health economic modelling studies published between 2008 and 2014. Reporting on model validation was evaluated by checking for the word validation, and by using AdViSHE (Assessment of the Validation Status of Health Economic decision models), a tool containing a structured list of relevant items for validation. Additionally, we contacted corresponding authors to ask whether more validation efforts were performed other than those reported in the manuscripts. A total of 53 studies on seasonal influenza and 41 studies on early breast cancer were included in our review. The word validation was used in 16 studies (30%) on seasonal influenza and 23 studies (56%) on early breast cancer; however, in a minority of studies, this referred to a model validation technique. Fifty-seven percent of seasonal influenza studies and 71% of early breast cancer studies reported one or more validation techniques. Cross-validation of study outcomes was found most often. A limited number of studies reported on model validation efforts, although good examples were identified. Author comments indicated that more validation techniques were performed than those reported in the manuscripts. Although validation is deemed important by many researchers, this is not reflected in the reporting habits of health economic modelling studies. Systematic reporting of validation efforts would be desirable to further enhance decision makers' confidence in health economic models and their outcomes.
Psychological collectivism: a measurement validation and linkage to group member performance.

PubMed

Jackson, Christine L; Colquitt, Jason A; Wesson, Michael J; Zapata-Phelan, Cindy P

2006-07-01

The 3 studies presented here introduce a new measure of the individual-difference form of collectivism. Psychological collectivism is conceptualized as a multidimensional construct with the following 5 facets: preference for in-groups, reliance on in-groups, concern for in-groups, acceptance of in-group norms, and prioritization of in-group goals. Study 1 developed and tested the new measure in a sample of consultants. Study 2 cross-validated the measure using an alumni sample of a Southeastern university, assessing its convergent validity with other collectivism measures. Study 3 linked scores on the measure to 4 dimensions of group member performance (task performance, citizenship behavior, counterproductive behavior, and withdrawal behavior) in a computer software firm and assessed discriminant validity using the Big Five. The results of the studies support the construct validity of the measure and illustrate the potential value of collectivism as a predictor of group member performance. ((c) 2006 APA, all rights reserved).
The Contribution of Rubrics to the Validity of Performance Assessment: A Study of the Conservation-Restoration and Design Undergraduate Degrees

ERIC Educational Resources Information Center

Menéndez-Varela, José-Luis; Gregori-Giralt, Eva

2016-01-01

Rubrics have attained considerable importance in the authentic and sustainable assessment paradigm; nevertheless, few studies have examined their contribution to validity, especially outside the domain of educational studies. This empirical study used a quantitative approach to analyse the validity of a rubrics-based performance assessment. Raters…
A new framework to enhance the interpretation of external validation studies of clinical prediction models.

PubMed

Debray, Thomas P A; Vergouwe, Yvonne; Koffijberg, Hendrik; Nieboer, Daan; Steyerberg, Ewout W; Moons, Karel G M

2015-03-01

It is widely acknowledged that the performance of diagnostic and prognostic prediction models should be assessed in external validation studies with independent data from "different but related" samples as compared with that of the development sample. We developed a framework of methodological steps and statistical methods for analyzing and enhancing the interpretation of results from external validation studies of prediction models. We propose to quantify the degree of relatedness between development and validation samples on a scale ranging from reproducibility to transportability by evaluating their corresponding case-mix differences. We subsequently assess the models' performance in the validation sample and interpret the performance in view of the case-mix differences. Finally, we may adjust the model to the validation setting. We illustrate this three-step framework with a prediction model for diagnosing deep venous thrombosis using three validation samples with varying case mix. While one external validation sample merely assessed the model's reproducibility, two other samples rather assessed model transportability. The performance in all validation samples was adequate, and the model did not require extensive updating to correct for miscalibration or poor fit to the validation settings. The proposed framework enhances the interpretation of findings at external validation of prediction models. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.
40 CFR 761.392 - Preparing validation study samples.

Code of Federal Regulations, 2010 CFR

2010-07-01

... 40 Protection of Environment 30 2010-07-01 2010-07-01 false Preparing validation study samples..., AND USE PROHIBITIONS Comparison Study for Validating a New Performance-Based Decontamination Solvent Under § 761.79(d)(4) § 761.392 Preparing validation study samples. (a)(1) To validate a procedure to...
40 CFR 761.392 - Preparing validation study samples.

Code of Federal Regulations, 2011 CFR

2011-07-01

... 40 Protection of Environment 31 2011-07-01 2011-07-01 false Preparing validation study samples..., AND USE PROHIBITIONS Comparison Study for Validating a New Performance-Based Decontamination Solvent Under § 761.79(d)(4) § 761.392 Preparing validation study samples. (a)(1) To validate a procedure to...
Reliability and Validity of the Turkish Version of the Job Performance Scale Instrument.

PubMed

Harmanci Seren, Arzu Kader; Tuna, Rujnan; Eskin Bacaksiz, Feride

2018-02-01

Objective measurement of the job performance of nursing staff using valid and reliable instruments is important in the evaluation of healthcare quality. A current, valid, and reliable instrument that specifically measures the performance of nurses is required for this purpose. The aim of this study was to determine the validity and reliability of the Turkish version of the Job Performance Instrument. This study used a methodological design and a sample of 240 nurses working at different units in four hospitals in Istanbul, Turkey. A descriptive data form, the Job Performance Scale, and the Employee Performance Scale were used to collect data. Data were analyzed using IBM SPSS Statistics Version 21.0 and LISREL Version 8.51. On the basis of the data analysis, the instrument was revised. Some items were deleted, and subscales were combined. The Turkish version of the Job Performance Instrument was determined to be valid and reliable to measure the performance of nurses. The instrument is suitable for evaluating current nursing roles.
Reaction time as an indicator of insufficient effort: Development and validation of an embedded performance validity parameter.

PubMed

Stevens, Andreas; Bahlo, Simone; Licha, Christina; Liske, Benjamin; Vossler-Thies, Elisabeth

2016-11-30

Subnormal performance in attention tasks may result from various sources including lack of effort. In this report, the derivation and validation of a performance validity parameter for reaction time is described, using a set of malingering indices ("Slick criteria") and 3 independent samples of participants (total n = 893). The Slick criteria yield an estimate of the probability of malingering based on the presence of an external incentive, evidence from neuropsychological testing, from self-report and clinical data. In study (1) a validity parameter is derived using reaction time data of a sample composed of inpatients with recent severe brain lesions not involved in litigation and of litigants with and without brain lesion. In study (2) the validity parameter is tested in an independent sample of litigants. In study (3) the parameter is applied to an independent sample comprising cooperative and non-cooperative testees. Logistic regression analysis led to a derived validity parameter based on median reaction time and standard deviation. It performed satisfactorily in studies (2) and (3) (study 2 sensitivity = 0.94, specificity = 1.00; study 3 sensitivity = 0.79, specificity = 0.87). The findings suggest that median reaction time and standard deviation may be used as indicators of negative response bias. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Standards Performance Continuum: Development and Validation of a Measure of Effective Pedagogy.

ERIC Educational Resources Information Center

Doherty, R. William; Hilberg, R. Soleste; Epaloose, Georgia; Tharp, Roland G.

2002-01-01

Describes the development and validation of the Standards Performance Continuum (SPC) for assessing teacher performance of the Standards for Effective Pedagogy. Three studies involving Florida, California, and New Mexico public school teachers provided evidence of inter-rater reliability, concurrent validity, and criterion-related validity…
Derivation and Cross-Validation of Cutoff Scores for Patients With Schizophrenia Spectrum Disorders on WAIS-IV Digit Span-Based Performance Validity Measures.

PubMed

Glassmire, David M; Toofanian Ross, Parnian; Kinney, Dominique I; Nitch, Stephen R

2016-06-01

Two studies were conducted to identify and cross-validate cutoff scores on the Wechsler Adult Intelligence Scale-Fourth Edition Digit Span-based embedded performance validity (PV) measures for individuals with schizophrenia spectrum disorders. In Study 1, normative scores were identified on Digit Span-embedded PV measures among a sample of patients (n = 84) with schizophrenia spectrum diagnoses who had no known incentive to perform poorly and who put forth valid effort on external PV tests. Previously identified cutoff scores resulted in unacceptable false positive rates and lower cutoff scores were adopted to maintain specificity levels ≥90%. In Study 2, the revised cutoff scores were cross-validated within a sample of schizophrenia spectrum patients (n = 96) committed as incompetent to stand trial. Performance on Digit Span PV measures was significantly related to Full Scale IQ in both studies, indicating the need to consider the intellectual functioning of examinees with psychotic spectrum disorders when interpreting scores on Digit Span PV measures. © The Author(s) 2015.
Development and validation of a virtual reality simulator: human factors input to interventional radiology training.

PubMed

Johnson, Sheena Joanne; Guediri, Sara M; Kilkenny, Caroline; Clough, Peter J

2011-12-01

This study developed and validated a virtual reality (VR) simulator for use by interventional radiologists. Research in the area of skill acquisition reports practice as essential to become a task expert. Studies on simulation show skills learned in VR can be successfully transferred to a real-world task. Recently, with improvements in technology, VR simulators have been developed to allow complex medical procedures to be practiced without risking the patient. Three studies are reported. In Study 1, 35 consultant interventional radiologists took part in a cognitive task analysis to empirically establish the key competencies of the Seldinger procedure. In Study 2, 62 participants performed one simulated procedure, and their performance was compared by expertise. In Study 3, the transferability of simulator training to a real-world procedure was assessed with 14 trainees. Study 1 produced 23 key competencies that were implemented as performance measures in the simulator. Study 2 showed the simulator had both face and construct validity, although some issues were identified. Study 3 showed the group that had undergone simulator training received significantly higher mean performance ratings on a subsequent patient procedure. The findings of this study support the centrality of validation in the successful design of simulators and show the utility of simulators as a training device. The studies show the key elements of a validation program for a simulator. In addition to task analysis and face and construct validities, the authors highlight the importance of transfer of training in validation studies.
Validation of GC and HPLC systems for residue studies

DOE Office of Scientific and Technical Information (OSTI.GOV)

Williams, M.

1995-12-01

For residue studies, GC and HPLC system performance must be validated prior to and during use. One excellent measure of system performance is the standard curve and associated chromatograms used to construct that curve. The standard curve is a model of system response to an analyte over a specific time period, and is prima facie evidence of system performance beginning at the auto sampler and proceeding through the injector, column, detector, electronics, data-capture device, and printer/plotter. This tool measures the performance of the entire chromatographic system; its power negates most of the benefits associated with costly and time-consuming validation of individual system components. Other measures of instrument and method validation will be discussed, including quality control charts and experimental designs for method validation.
Model performance evaluation (validation and calibration) in model-based studies of therapeutic interventions for cardiovascular diseases: a review and suggested reporting framework.

PubMed

Haji Ali Afzali, Hossein; Gray, Jodi; Karnon, Jonathan

2013-04-01

Decision analytic models play an increasingly important role in the economic evaluation of health technologies. Given uncertainties around the assumptions used to develop such models, several guidelines have been published to identify and assess 'best practice' in the model development process, including general modelling approach (e.g., time horizon), model structure, input data and model performance evaluation. This paper focuses on model performance evaluation. In the absence of a sufficient level of detail around model performance evaluation, concerns regarding the accuracy of model outputs, and hence the credibility of such models, are frequently raised. Following presentation of its components, a review of the application and reporting of model performance evaluation is presented. Taking cardiovascular disease as an illustrative example, the review investigates the use of face validity, internal validity, external validity, and cross model validity. As a part of the performance evaluation process, model calibration is also discussed and its use in applied studies investigated. The review found that the application and reporting of model performance evaluation across 81 studies of treatment for cardiovascular disease was variable. Cross-model validation was reported in 55 % of the reviewed studies, though the level of detail provided varied considerably. We found that very few studies documented other types of validity, and only 6 % of the reviewed articles reported a calibration process. Considering the above findings, we propose a comprehensive model performance evaluation framework (checklist), informed by a review of best-practice guidelines. This framework provides a basis for more accurate and consistent documentation of model performance evaluation. This will improve the peer review process and the comparability of modelling studies. 
Recognising the fundamental role of decision analytic models in informing public funding decisions, the proposed framework should usefully inform guidelines for preparing submissions to reimbursement bodies.
41 CFR 60-3.7 - Use of other validity studies.

Code of Federal Regulations, 2014 CFR

2014-07-01

... study was conducted perform substantially the same major work behaviors, as shown by appropriate job analyses both on the job or group of jobs on which the validity study was performed and on the job for...

41 CFR 60-3.7 - Use of other validity studies.

Code of Federal Regulations, 2013 CFR

2013-07-01

... study was conducted perform substantially the same major work behaviors, as shown by appropriate job analyses both on the job or group of jobs on which the validity study was performed and on the job for...
Validity Evidence in Scale Development: The Application of Cross Validation and Classification-Sequencing Validation

ERIC Educational Resources Information Center

Acar, Tülin

2014-01-01

In literature, it has been observed that many enhanced criteria are limited by factor analysis techniques. Besides examinations of statistical structure and/or psychological structure, such validity studies as cross validation and classification-sequencing studies should be performed frequently. The purpose of this study is to examine cross…
Validation of a Performance Assessment Instrument in Problem-Based Learning Tutorials Using Two Cohorts of Medical Students

ERIC Educational Resources Information Center

Lee, Ming; Wimmers, Paul F.

2016-01-01

Although problem-based learning (PBL) has been widely used in medical schools, few studies have attended to the assessment of PBL processes using validated instruments. This study examined reliability and validity for an instrument assessing PBL performance in four domains: Problem Solving, Use of Information, Group Process, and Professionalism.…
40 CFR 761.395 - A validation study.

Code of Federal Regulations, 2011 CFR

2011-07-01

... 40 Protection of Environment 31 2011-07-01 2011-07-01 false A validation study. 761.395 Section... PROHIBITIONS Comparison Study for Validating a New Performance-Based Decontamination Solvent Under § 761.79(d)(4) § 761.395 A validation study. (a) Decontaminate the following prepared sample surfaces using the...
40 CFR 761.395 - A validation study.

Code of Federal Regulations, 2010 CFR

2010-07-01

... 40 Protection of Environment 30 2010-07-01 2010-07-01 false A validation study. 761.395 Section... PROHIBITIONS Comparison Study for Validating a New Performance-Based Decontamination Solvent Under § 761.79(d)(4) § 761.395 A validation study. (a) Decontaminate the following prepared sample surfaces using the...
Further examination of embedded performance validity indicators for the Conners' Continuous Performance Test and Brief Test of Attention in a large outpatient clinical sample.

PubMed

Sharland, Michael J; Waring, Stephen C; Johnson, Brian P; Taran, Allise M; Rusin, Travis A; Pattock, Andrew M; Palcher, Jeanette A

2018-01-01

Assessing test performance validity is a standard clinical practice and although studies have examined the utility of cognitive/memory measures, few have examined attention measures as indicators of performance validity beyond the Reliable Digit Span. The current study further investigates the classification probability of embedded Performance Validity Tests (PVTs) within the Brief Test of Attention (BTA) and the Conners' Continuous Performance Test (CPT-II), in a large clinical sample. This was a retrospective study of 615 patients consecutively referred for comprehensive outpatient neuropsychological evaluation. Non-credible performance was defined two ways: failure on one or more PVTs and failure on two or more PVTs. Classification probability of the BTA and CPT-II into non-credible groups was assessed. Sensitivity, specificity, positive predictive value, and negative predictive value were derived to identify clinically relevant cut-off scores. When using failure on two or more PVTs as the indicator for non-credible responding compared to failure on one or more PVTs, highest classification probability, or area under the curve (AUC), was achieved by the BTA (AUC = .87 vs. .79). CPT-II Omission, Commission, and Total Errors exhibited higher classification probability as well. Overall, these findings corroborate previous findings, extending them to a large clinical sample. BTA and CPT-II are useful embedded performance validity indicators within a clinical battery but should not be used in isolation without other performance validity indicators.
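The four accuracy measures named in the abstract above (sensitivity, specificity, positive predictive value, negative predictive value) all follow from a 2x2 table of test result against reference classification. The sketch below shows those standard formulas. The counts are hypothetical and are not taken from the study, and the class and function names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    tp: int  # non-credible by reference standard, flagged by the indicator
    fp: int  # credible by reference standard, flagged by the indicator
    fn: int  # non-credible by reference standard, not flagged
    tn: int  # credible by reference standard, not flagged


def classification_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Compute the four standard accuracy measures from 2x2 counts."""
    return {
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
    }


# Hypothetical counts at a single cut-off score (not study data).
metrics = classification_metrics(ConfusionCounts(tp=40, fp=5, fn=10, tn=95))
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Repeating the calculation at each candidate cut-off is one way to see the trade-off between sensitivity and specificity that cut-off selection involves. Note that PPV and NPV also depend on the base rate of non-credible performance in the sample.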
Simulation verification techniques study

NASA Technical Reports Server (NTRS)

Schoonmaker, P. B.; Wenglinski, T. H.

1975-01-01

Results are summarized of the simulation verification techniques study which consisted of two tasks: to develop techniques for simulator hardware checkout and to develop techniques for simulation performance verification (validation). The hardware verification task involved definition of simulation hardware (hardware units and integrated simulator configurations), survey of current hardware self-test techniques, and definition of hardware and software techniques for checkout of simulator subsystems. The performance verification task included definition of simulation performance parameters (and critical performance parameters), definition of methods for establishing standards of performance (sources of reference data or validation), and definition of methods for validating performance. Both major tasks included definition of verification software and assessment of verification data base impact. An annotated bibliography of all documents generated during this study is provided.
PLCO Ovarian Phase III Validation Study — EDRN Public Portal

Cancer.gov

Our preliminary data indicate that the performance of CA 125 as a screening test for ovarian cancer can be improved upon by additional biomarkers. With completion of one additional validation step, we will be ready to test the performance of a consensus marker panel in a phase III validation study. Given the original aims of the PLCO trial, we believe that the PLCO represents an ideal longitudinal cohort offering specimens for phase III validation of ovarian cancer biomarkers.
Simulation verification techniques study: Simulation performance validation techniques document. [for the space shuttle system]

NASA Technical Reports Server (NTRS)

Duncan, L. M.; Reddell, J. P.; Schoonmaker, P. B.

1975-01-01

Techniques and support software for the efficient performance of simulation validation are discussed. Overall validation software structure, the performance of validation at various levels of simulation integration, guidelines for check case formulation, methods for real time acquisition and formatting of data from an all up operational simulator, and methods and criteria for comparison and evaluation of simulation data are included. Vehicle subsystems modules, module integration, special test requirements, and reference data formats are also described.
Tests for the Assessment of Sport-Specific Performance in Olympic Combat Sports: A Systematic Review With Practical Recommendations.

PubMed

Chaabene, Helmi; Negra, Yassine; Bouguezzi, Raja; Capranica, Laura; Franchini, Emerson; Prieske, Olaf; Hbacha, Hamdi; Granacher, Urs

2018-01-01

The regular monitoring of physical fitness and sport-specific performance is important in elite sports to increase the likelihood of success in competition. This study aimed to systematically review and critically appraise the methodological quality, validation data, and feasibility of sport-specific performance assessment in Olympic combat sports such as amateur boxing, fencing, judo, karate, taekwondo, and wrestling. A systematic search was conducted in the electronic databases PubMed, Google Scholar, and ScienceDirect up to October 2017. Studies in combat sports were included that reported validation data (e.g., reliability, validity, sensitivity) of sport-specific tests. Overall, 39 studies were eligible for inclusion in this review. The majority of studies (74%) had sample sizes of fewer than 30 subjects. Nearly one-third of the reviewed studies lacked a sufficient description (e.g., anthropometrics, age, expertise level) of the included participants. Seventy-two percent of studies did not sufficiently report inclusion/exclusion criteria of their participants. In 62% of the included studies, the description and/or inclusion of a familiarization session(s) was either incomplete or nonexistent. Sixty percent of studies did not report any details about the stability of testing conditions. Approximately half of the studies examined reliability measures of the included sport-specific tests (intraclass correlation coefficient [ICC] = 0.43-1.00). Content validity was addressed in all included studies, and criterion validity (only its concurrent aspect) in approximately half of the studies, with correlation coefficients ranging from r = -0.41 to 0.90. Construct validity was reported in 31% of the included studies and predictive validity in only one. Test sensitivity was addressed in 13% of the included studies. The majority of studies (64%) ignored and/or provided incomplete information on test feasibility and methodological limitations of the sport-specific test. 
In 28% of the included studies, insufficient information or a complete lack of information was provided in the respective field of the test application. Several methodological gaps exist in studies that used sport-specific performance tests in Olympic combat sports. Additional research should adopt more rigorous validation procedures in the application and description of sport-specific performance tests in Olympic combat sports.
Ride qualities criteria validation/pilot performance study: Flight test results

NASA Technical Reports Server (NTRS)

Nardi, L. U.; Kawana, H. Y.; Greek, D. C.

1979-01-01

Pilot performance during a terrain following flight was studied for ride quality criteria validation. Data from manual and automatic terrain following operations conducted during low level penetrations were analyzed to determine the effect of ride qualities on crew performance. The conditions analyzed included varying levels of turbulence, terrain roughness, and mission duration with a ride smoothing system on and off. Limited validation of the B-1 ride quality criteria and some of the first order interactions between ride qualities and pilot/vehicle performance are highlighted. An earlier B-1 flight simulation program correlated well with the flight test results.
Prognostic models for complete recovery in ischemic stroke: a systematic review and meta-analysis.

PubMed

Jampathong, Nampet; Laopaiboon, Malinee; Rattanakanokchai, Siwanon; Pattanittum, Porjai

2018-03-09

Prognostic models have been increasingly developed to predict complete recovery in ischemic stroke. However, questions arise about the performance characteristics of these models. The aim of this study was to systematically review and synthesize performance of existing prognostic models for complete recovery in ischemic stroke. We searched journal publications indexed in PUBMED, SCOPUS, CENTRAL, ISI Web of Science and OVID MEDLINE from inception until 4 December, 2017, for studies designed to develop and/or validate prognostic models for predicting complete recovery in ischemic stroke patients. Two reviewers independently examined titles and abstracts, and assessed whether each study met the pre-defined inclusion criteria and also independently extracted information about model development and performance. We evaluated validation of the models by medians of the area under the receiver operating characteristic curve (AUC) or c-statistic and calibration performance. We used a random-effects meta-analysis to pool AUC values. We included 10 studies with 23 models developed from elderly patients with a moderately severe ischemic stroke, mainly in three high income countries. Sample sizes for each study ranged from 75 to 4441. Logistic regression was the only analytical strategy used to develop the models. The number of various predictors varied from one to 11. Internal validation was performed in 12 models with a median AUC of 0.80 (95% CI 0.73 to 0.84). One model reported good calibration. Nine models reported external validation with a median AUC of 0.80 (95% CI 0.76 to 0.82). Four models showed good discrimination and calibration on external validation. The pooled AUC of the two validation models of the same developed model was 0.78 (95% CI 0.71 to 0.85). The performance of the 23 models found in the systematic review varied from fair to good in terms of internal and external validation. 
Further models should be developed with internal and external validation in low and middle income countries.
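The random-effects pooling step mentioned in this abstract can be sketched with the DerSimonian-Laird estimator. The abstract does not say which estimator or which scale (raw or transformed AUC) the authors used, so the choice of DerSimonian-Laird on the raw AUC scale is an assumption, and the AUC values and standard errors below are hypothetical.

```python
import math


def pool_random_effects(estimates, std_errors):
    """DerSimonian-Laird random-effects pooling.

    Returns (pooled estimate, 95% CI lower, 95% CI upper, tau squared).
    """
    variances = [se ** 2 for se in std_errors]
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)

    # Fixed-effect estimate, needed for Cochran's Q.
    fixed = sum(w * y for w, y in zip(weights, estimates)) / sum_w
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, estimates))

    # Between-study variance, truncated at zero.
    k = len(estimates)
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0

    # Random-effects weights add tau squared to each within-study variance.
    re_weights = [1.0 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, estimates)) / sum(re_weights)
    se = math.sqrt(1.0 / sum(re_weights))
    return pooled, pooled - 1.96 * se, pooled + 1.96 * se, tau2


# Hypothetical external-validation AUCs with hypothetical standard errors.
pooled, lower, upper, tau2 = pool_random_effects([0.70, 0.90], [0.05, 0.05])
print(f"pooled AUC = {pooled:.2f} (95% CI {lower:.2f} to {upper:.2f}), tau^2 = {tau2:.4f}")
```

In practice AUCs are often pooled on a logit scale so that the confidence interval stays within 0 and 1; the raw scale is used here only to keep the sketch short.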
Isokinetic knee strength qualities as predictors of jumping performance in high-level volleyball athletes: multiple regression approach.

PubMed

Sattler, Tine; Sekulic, Damir; Spasic, Miodrag; Osmankac, Nedzad; Vicente João, Paulo; Dervisevic, Edvin; Hadzic, Vedran

2016-01-01

Previous investigations noted the potential importance of isokinetic strength in rapid muscular performances, such as jumping. This study aimed to identify the influence of isokinetic knee strength on specific jumping performance in volleyball. The secondary aim of the study was to evaluate the reliability and validity of two volleyball-specific jumping tests. The sample comprised 67 female (21.96±3.79 years; 68.26±8.52 kg; 174.43±6.85 cm) and 99 male (23.62±5.27 years; 84.83±10.37 kg; 189.01±7.21 cm) high-level volleyball players who competed in the 1st and 2nd National Division. Subjects were randomly divided into validation (N = 55 and 33 for males and females, respectively) and cross-validation subsamples (N = 54 and 34 for males and females, respectively). The set of predictors included isokinetic tests to evaluate the eccentric and concentric strength capacities of the knee extensors and flexors for the dominant and non-dominant leg. The main outcome measure for the isokinetic testing was peak torque (PT), which was later normalized for body mass and expressed as PT/kg. Block-jump and spike-jump performances were measured over three trials and served as criteria. Forward stepwise multiple regressions were calculated for the validation subsamples and then cross-validated. Cross-validation included correlations and t-test differences between observed and predicted scores, and Bland-Altman graphics. Jumping tests were found to be reliable (spike jump: ICC of 0.79 and 0.86; block-jump: ICC of 0.86 and 0.90; for males and females, respectively), and their validity was confirmed by significant t-test differences between 1st vs. 2nd division players. Isokinetic variables were found to be significant predictors of jumping performance in females, but not among males. 
In females, the isokinetic-knee measures were shown to be stronger and more valid predictors of the block-jump (42% and 64% of the explained variance for validation and cross-validation subsample, respectively) than of the spike-jump (39% and 34% of the explained variance for validation and cross-validation subsample, respectively). Differences between the prediction models calculated for males and females are mostly explained by gender-specific biomechanics of jumping. The study established the importance of isokinetic knee strength for volleyball jumping performance in female athletes. Further studies should evaluate the association between isokinetic ankle strength and volleyball-specific jumping performances. The results reinforce the need for cross-validation of prediction models in sport and exercise sciences.
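The Bland-Altman comparison of observed and predicted scores used in the cross-validation step can be sketched as follows. The jump heights are hypothetical, and the 1.96 multiplier assumes approximately normally distributed differences.

```python
from statistics import mean, stdev


def bland_altman(observed, predicted):
    """Return (bias, lower limit of agreement, upper limit of agreement)."""
    diffs = [o - p for o, p in zip(observed, predicted)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical block-jump heights (cm): measured vs. predicted by a regression model.
observed = [50, 52, 48, 55, 60]
predicted = [49, 53, 47, 56, 58]
bias, loa_lower, loa_upper = bland_altman(observed, predicted)
print(f"bias = {bias:.2f} cm, limits of agreement = {loa_lower:.2f} to {loa_upper:.2f} cm")
```

A Bland-Altman graphic plots each difference against the mean of the pair, with horizontal lines at the three values returned here.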
Geographic and temporal validity of prediction models: Different approaches were useful to examine model performance

PubMed Central

Austin, Peter C.; van Klaveren, David; Vergouwe, Yvonne; Nieboer, Daan; Lee, Douglas S.; Steyerberg, Ewout W.

2017-01-01

Objective Validation of clinical prediction models traditionally refers to the assessment of model performance in new patients. We studied different approaches to geographic and temporal validation in the setting of multicenter data from two time periods. Study Design and Setting We illustrated different analytic methods for validation using a sample of 14,857 patients hospitalized with heart failure at 90 hospitals in two distinct time periods. Bootstrap resampling was used to assess internal validity. Meta-analytic methods were used to assess geographic transportability. Each hospital was used once as a validation sample, with the remaining hospitals used for model derivation. Hospital-specific estimates of discrimination (c-statistic) and calibration (calibration intercepts and slopes) were pooled using random effects meta-analysis methods. I2 statistics and prediction interval width quantified geographic transportability. Temporal transportability was assessed using patients from the earlier period for model derivation and patients from the later period for model validation. Results Estimates of reproducibility, pooled hospital-specific performance, and temporal transportability were on average very similar, with c-statistics of 0.75. Between-hospital variation was moderate according to I2 statistics and prediction intervals for c-statistics. Conclusion This study illustrates how performance of prediction models can be assessed in settings with multicenter data at different time periods. PMID:27262237
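The "each hospital used once as a validation sample" design, together with the c-statistic, can be sketched in plain Python. The record layout (dictionaries with a `hospital` key) and the example values are hypothetical. Model fitting is deliberately left out, since the abstract does not specify the model.

```python
from itertools import product


def c_statistic(risks, outcomes):
    """Share of (event, non-event) pairs in which the event case has the higher
    predicted risk; ties count as one half. Returns None if a class is absent."""
    events = [r for r, y in zip(risks, outcomes) if y == 1]
    non_events = [r for r, y in zip(risks, outcomes) if y == 0]
    if not events or not non_events:
        return None
    score = sum(
        1.0 if e > n else 0.5 if e == n else 0.0
        for e, n in product(events, non_events)
    )
    return score / (len(events) * len(non_events))


def leave_one_hospital_out(records):
    """Yield (held-out hospital, derivation records, validation records)."""
    hospitals = sorted({r["hospital"] for r in records})
    for held_out in hospitals:
        derivation = [r for r in records if r["hospital"] != held_out]
        validation = [r for r in records if r["hospital"] == held_out]
        yield held_out, derivation, validation


# Hypothetical records; a real analysis would fit a model on `derivation`
# and score `validation` inside the loop.
records = [
    {"hospital": "A", "died": 1}, {"hospital": "A", "died": 0},
    {"hospital": "B", "died": 0}, {"hospital": "C", "died": 1},
]
splits = list(leave_one_hospital_out(records))
print([(h, len(d), len(v)) for h, d, v in splits])
print(c_statistic([0.9, 0.8, 0.3, 0.2], [1, 1, 0, 0]))
```

The hospital-specific c-statistics produced this way are the inputs that a random-effects meta-analysis would then pool.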
The Development and Validation of a Rubric to Enhance Performer Feedback for Undergraduate Vocal Solo Performance

ERIC Educational Resources Information Center

Herrell, Katherine A.

2014-01-01

This is a study of the development and validation of a rubric to enhance performer feedback for undergraduate vocal solo performance. In the literature, assessment of vocal performance is under-represented, and the value of feedback from the assessment of musical performances, from the point of view of the performer, is nonexistent. The research…
A systematic review of the asymmetric inheritance of cellular organelles in eukaryotes: A critique of basic science validity and imprecision

PubMed Central

Collins, Anne; Ross, Janine

2017-01-01

We performed a systematic review to identify all original publications describing the asymmetric inheritance of cellular organelles in normal animal eukaryotic cells and to critique the validity and imprecision of the evidence. Searches were performed in Embase, MEDLINE and PubMed up to November 2015. Screening of titles, abstracts and full papers was performed by two independent reviewers. Data extraction and validity assessment were performed by one reviewer and checked by a second reviewer. Study quality was assessed using the SYRCLE risk of bias tool for animal studies and by developing validity tools for the experimental model, organelle markers and imprecision. A narrative data synthesis was performed. We identified 31 studies (34 publications) of the asymmetric inheritance of organelles after mitotic or meiotic division. Studies for the asymmetric inheritance of centrosomes (n = 9), endosomes (n = 6), P granules (n = 4), the midbody (n = 3), mitochondria (n = 3), proteasomes (n = 2), spectrosomes (n = 2), cilia (n = 2) and endoplasmic reticulum (n = 2) were identified. Asymmetry was defined and quantified by variable methods. Assessment of the statistical reliability of the results indicated that only two studies (7%) were judged to have low concern, the majority of studies (77%) were 'unclear' and five (16%) were judged to have 'high concerns'; the main reason was a low number of technical repeats (<10). Assessment of model validity indicated that the majority of studies (61%) were judged to be valid, ten studies (32%) were unclear and two studies (7%) were judged to have 'high concerns'; both described 'stem cells' without providing experimental evidence to confirm this (pluripotency and self-renewal). Assessment of marker validity indicated that no studies had low concern and most studies were unclear (96.5%), indicating there were insufficient details to judge if the markers were appropriate. 
One study had high concern for marker validity due to the contradictory results of two markers for the same organelle. For most studies the validity and imprecision of results could not be confirmed. In particular, data were limited due to a lack of reporting of interassay variability, sample size calculations, controls and functional validation of organelle markers. An evaluation of 16 systematic reviews containing cell assays found that only 50% reported adherence to PRISMA or ARRIVE reporting guidelines and 38% reported a formal risk of bias assessment. 44% of the reviews did not consider how relevant or valid the models were to the research question. 75% of reviews did not consider how valid the markers were. 69% of reviews did not consider the impact of the statistical reliability of the results. Future systematic reviews in basic or preclinical research should ensure the rigorous reporting of the statistical reliability of the results in addition to the validity of the methods. Increased awareness of the importance of reporting guidelines and validation tools is needed in the scientific community. PMID:28562636
External validation of preexisting first trimester preeclampsia prediction models.

PubMed

Allen, Rebecca E; Zamora, Javier; Arroyo-Manzano, David; Velauthar, Luxmilar; Allotey, John; Thangaratinam, Shakila; Aquilina, Joseph

2017-10-01

To validate the increasing number of prognostic models being developed for preeclampsia using our own prospective study. A systematic review of literature that assessed biomarkers, uterine artery Doppler and maternal characteristics in the first trimester for the prediction of preeclampsia was performed and models were selected based on predefined criteria. Validation was performed by applying the regression coefficients that were published in the different derivation studies to our cohort. We assessed the models' discrimination ability and calibration. Twenty models were identified for validation. The discrimination ability observed in derivation studies (area under the curve, AUC) ranged from 0.70 to 0.96. When these models were validated against the validation cohort, the AUCs varied considerably, ranging from 0.504 to 0.833. Comparing the AUCs obtained in the derivation studies to those in the validation cohort, we found statistically significant differences in several studies. There currently isn't a definitive prediction model with adequate ability to discriminate for preeclampsia, which performs as well when applied to a different population and can differentiate well between the highest and lowest risk groups within the tested population. The pre-existing large number of models limits the value of further model development, and future research should be focussed on further attempts to validate existing models and on assessing whether implementation of these improves patient care. Crown Copyright © 2017. Published by Elsevier B.V. All rights reserved.
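Applying published regression coefficients to a new cohort, as described in this abstract, amounts to computing each patient's linear predictor and converting it to a risk. The sketch below assumes a logistic regression model, which the abstract does not state explicitly. The intercept, coefficient names and values, and the patient record are hypothetical placeholders, not values from any derivation study.

```python
import math


def predicted_risk(intercept, coefficients, patient):
    """Risk from a logistic model: 1 / (1 + exp(-(intercept + sum(beta * x))))."""
    linear_predictor = intercept + sum(
        beta * patient[name] for name, beta in coefficients.items()
    )
    return 1.0 / (1.0 + math.exp(-linear_predictor))


# Hypothetical published model (illustrative numbers only).
intercept = -4.0
coefficients = {"mean_arterial_pressure": 0.03, "nulliparous": 0.5}

# Hypothetical patient from the validation cohort.
patient = {"mean_arterial_pressure": 90.0, "nulliparous": 1}
risk = predicted_risk(intercept, coefficients, patient)
print(f"predicted risk = {risk:.3f}")
```

Discrimination and calibration in the validation cohort are then assessed by comparing these predicted risks with the observed outcomes, without re-estimating any coefficient.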
Further Validation of the Conner's Adult Attention Deficit/Hyperactivity Rating Scale Infrequency Index (CII) for Detection of Non-Credible Report of Attention Deficit/Hyperactivity Disorder Symptoms.

PubMed

Cook, Carolyn M; Bolinger, Elizabeth; Suhr, Julie

2016-06-01

Attention deficit/hyperactivity disorder (ADHD) can be easily presented in a non-credible manner, through non-credible report of ADHD symptoms and/or by non-credible performance on neuropsychological tests. While most studies have focused on detection of non-credible performance using performance validity tests, there are few studies examining the ability to detect non-credible report of ADHD symptoms. We provide further validation data for a recently developed measure of non-credible ADHD symptom report, the Conner's Adult ADHD Rating Scales (CAARS) Infrequency Index (CII). Using archival data from 86 adults referred for concerns about ADHD, we examined the accuracy of the CII in detecting extreme scores on the CAARS and invalid reporting on validity indices of the Minnesota Multiphasic Personality Inventory-2 Restructured Format (MMPI-2-RF). We also examined the accuracy of the CII in detecting non-credible performance on standalone and embedded performance validity tests. The CII was 52% sensitive to extreme scores on CAARS DSM symptom subscales (with 97% specificity) and 20%-36% sensitive to invalid responding on MMPI-2-RF validity scales (with near 90% specificity), providing further evidence for the interpretation of the CII as an indicator of non-credible ADHD symptom report. However, the CII detected only 18% of individuals who failed a standalone performance validity test (Word Memory Test), with 87.8% specificity, and was not accurate in detecting non-credible performance using embedded digit span cutoffs. Future studies should continue to examine how best to assess for non-credible symptom report in ADHD referrals. © The Author 2016. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Noncredible cognitive performance at clinical evaluation of adult ADHD: An embedded validity indicator in a visuospatial working memory test.

PubMed

Fuermaier, Anselm B M; Tucha, Oliver; Koerts, Janneke; Lange, Klaus W; Weisbrod, Matthias; Aschenbrenner, Steffen; Tucha, Lara

2017-12-01

The assessment of performance validity is an essential part of the neuropsychological evaluation of adults with attention-deficit/hyperactivity disorder (ADHD). Most available tools, however, are inaccurate regarding the identification of noncredible performance. This study describes the development of a visuospatial working memory test, including a validity indicator for noncredible cognitive performance of adults with ADHD. Visuospatial working memory of adults with ADHD (n = 48) was first compared to the test performance of healthy individuals (n = 48). Furthermore, a simulation design was performed including 252 individuals who were randomly assigned to either a control group (n = 48) or to 1 of 3 simulation groups who were requested to feign ADHD (n = 204). Additional samples of 27 adults with ADHD and 69 instructed simulators were included to cross-validate findings from the first samples. Adults with ADHD showed impaired visuospatial working memory performance of medium size as compared to healthy individuals. Simulation groups committed significantly more errors and had shorter response times as compared to patients with ADHD. Moreover, binary logistic regression analysis was carried out to derive a validity index that optimally differentiates between true and feigned ADHD. ROC analysis demonstrated high classification rates of the validity index, as shown in excellent specificity (95.8%) and adequate sensitivity (60.3%). The visuospatial working memory test as presented in this study therefore appears sensitive in indicating cognitive impairment of adults with ADHD. Furthermore, the embedded validity index revealed promising results concerning the detection of noncredible cognitive performance of adults with ADHD. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Functional performance testing of the hip in athletes: a systematic review for reliability and validity.

PubMed

Kivlan, Benjamin R; Martin, Robroy L

2012-08-01

The purpose of this study was to systematically review the literature for functional performance tests with evidence of reliability and validity that could be used for a young, athletic population with hip dysfunction. A search of the PubMed and SPORTDiscus databases was performed to identify movement, balance, hop/jump, or agility functional performance tests from the current peer-reviewed literature used to assess function of the hip in young, athletic subjects. The single-leg stance, deep squat, single-leg squat, and star excursion balance tests (SEBT) demonstrated evidence of validity and normative data for score interpretation. The single-leg stance test and SEBT have evidence of validity with association to hip abductor function. The deep squat test demonstrated evidence as a functional performance test for evaluating femoroacetabular impingement (FAI). Hop/jump tests and agility tests have no reported evidence of reliability or validity in a population of subjects with hip pathology. Use of functional performance tests in the assessment of hip dysfunction has not been well established in the current literature. Diminished squat depth and provocation of pain during the single-leg balance test have been associated with patients diagnosed with FAI and gluteal tendinopathy, respectively. The SEBT and single-leg squat tests provided evidence of convergent validity through an analysis of kinematics and muscle function in normal subjects. Reliability of functional performance tests has not been established in patients with hip dysfunction. Further study is needed to establish reliability and validity of functional performance tests that can be used in a young, athletic population with hip dysfunction. 2b (Systematic Review of Literature).

Exploring a Framework for Consequential Validity for Performance-Based Assessments

ERIC Educational Resources Information Center

Kim, Su Jung

2017-01-01

This study explores a new comprehensive framework for understanding elements of validity, specifically for performance assessments that are administered within specific and dynamic contexts. The adoption of edTPA is a good empirical case for examining the concept of consequential validity because this assessment has been implemented at the state…
Convergent Validity Evidence regarding the Validity of the Chilean Standards-Based Teacher Evaluation System

ERIC Educational Resources Information Center

Santelices, Maria Veronica; Taut, Sandy

2011-01-01

This paper describes convergent validity evidence regarding the mandatory, standards-based Chilean national teacher evaluation system (NTES). The study examined whether NTES identifies--and thereby rewards or punishes--the "right" teachers as high- or low-performing. We collected in-depth teaching performance data on a sample of 58…
A framework to assess management performance in district health systems: a qualitative and quantitative case study in Iran.

PubMed

Tabrizi, Jafar Sadegh; Gholipour, Kamal; Iezadi, Shabnam; Farahbakhsh, Mostafa; Ghiasi, Akbar

2018-01-01

The aim was to design a district health management performance framework for Iran's healthcare system. The mixed-method study was conducted between September 2015 and May 2016 in Tabriz, Iran. In this study, the indicators of district health management performance were obtained by analyzing 45 semi-structured surveys of experts in the public health system. The content validity of the performance indicators generated in the qualitative part was reviewed and confirmed based on the content validity index (CVI). In addition, the content validity ratio (CVR) was calculated using data acquired from a survey of 21 experts in the quantitative part. The results indicated that, initially, 81 indicators were considered in the framework of district health management performance and, at the end, 53 indicators were validated and confirmed. These indicators were classified into 11 categories: human resources and organizational creativity, management and leadership, rules and ethics, planning and evaluation, district managing, health resources management and economics, community participation, quality improvement, research in health system, health information management, and epidemiology and situation analysis. The designed framework model can be used to assess district health management and to facilitate performance improvement at the district level.
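The two content validity statistics named in this abstract can be illustrated with a short sketch. It assumes Lawshe's formula for the CVR and the common convention that the item-level CVI is the share of experts rating an item 3 or 4 on a 4-point relevance scale. The abstract does not state which variants the authors used, and the ratings below are hypothetical; only the panel size of 21 comes from the abstract.

```python
def content_validity_ratio(n_essential, n_experts):
    """Lawshe's CVR = (n_e - N/2) / (N/2); ranges from -1 to +1."""
    half = n_experts / 2
    return (n_essential - half) / half


def item_cvi(relevance_ratings, relevant_from=3):
    """Item-level CVI: proportion of experts rating the item as relevant
    (assumed here to mean 3 or 4 on a 4-point scale)."""
    relevant = sum(1 for rating in relevance_ratings if rating >= relevant_from)
    return relevant / len(relevance_ratings)


# Hypothetical indicator: 18 of 21 experts rate it "essential".
cvr = content_validity_ratio(n_essential=18, n_experts=21)
cvi = item_cvi([4, 3, 2, 4])  # four hypothetical relevance ratings
print(f"CVR = {cvr:.2f}, I-CVI = {cvi:.2f}")
```

An indicator is typically retained only if its CVR exceeds a critical value that depends on the panel size; that threshold is not given in the abstract and is left out here.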
Construct validity of the individual work performance questionnaire.

PubMed

Koopmans, Linda; Bernaards, Claire M; Hildebrandt, Vincent H; de Vet, Henrica C W; van der Beek, Allard J

2014-03-01

To examine the construct validity of the Individual Work Performance Questionnaire (IWPQ). A total of 1424 Dutch workers from three occupational sectors (blue, pink, and white collar) participated in the study. First, IWPQ scores were correlated with related constructs (convergent validity). Second, differences between known groups were tested (discriminative validity). First, IWPQ scores correlated weakly to moderately with absolute and relative presenteeism, and work engagement. Second, significant differences in IWPQ scores were observed for workers differing in job satisfaction, and workers differing in health. Overall, the results indicate acceptable construct validity of the IWPQ. Researchers are provided with a reliable and valid instrument to measure individual work performance comprehensively and generically, among workers from different occupational sectors, with and without health problems.
Construct Validity of Three Clerkship Performance Assessments

ERIC Educational Resources Information Center

Lee, Ming; Wimmers, Paul F.

2010-01-01

This study examined construct validity of three commonly used clerkship performance assessments: preceptors' evaluations, OSCE-type clinical performance measures, and the NBME [National Board of Medical Examiners] medicine subject examination. Six hundred and eighty-six students taking the inpatient medicine clerkship from 2003 to 2007…
Validating Performance Level Descriptors (PLDs) for the AP® Environmental Science Exam

ERIC Educational Resources Information Center

Reshetar, Rosemary; Kaliski, Pamela; Chajewski, Michael; Lionberger, Karen

2012-01-01

This presentation summarizes a pilot study conducted after the May 2011 administration of the AP Environmental Science Exam. The study used analytical methods based on scaled anchoring as input to a Performance Level Descriptor validation process that solicited systematic input from subject matter experts.
Agility performance in high-level junior basketball players: the predictive value of anthropometrics and power qualities.

PubMed

Sisic, Nedim; Jelicic, Mario; Pehar, Miran; Spasic, Miodrag; Sekulic, Damir

2016-01-01

In basketball, anthropometric status is an important factor when identifying and selecting talents, while agility is one of the most vital motor performances. The aim of this investigation was to evaluate the influence of anthropometric variables and power capacities on different preplanned agility performances. The participants were 92 high-level, junior-age basketball players (16-17 years of age; 187.6±8.72 cm in body height, 78.40±12.26 kg in body mass), randomly divided into a validation and a cross-validation subsample. The predictor set consisted of 16 anthropometric variables and three tests of power capacities (Sargent-jump, broad-jump and medicine-ball-throw). The criteria were three tests of agility: a T-Shape-Test, a Zig-Zag-Test, and a test of running with a 180-degree turn (T180). Forward stepwise multiple regressions were calculated for the validation subsample and then cross-validated. Cross-validation included correlations between observed and predicted scores, a dependent-samples t-test between predicted and observed scores, and Bland-Altman graphics. Analysis of variance identified centres as being advanced in most of the anthropometric indices and in the medicine-ball-throw (all at P<0.05), with no significant between-position differences for the other studied motor performances. Multiple regression models originally calculated for the validation subsample were then cross-validated, and confirmed for the Zig-Zag-Test (R of 0.71 and 0.72 for the validation and cross-validation subsample, respectively). Anthropometrics were not strongly related to agility performance, but leg length was found to be negatively associated with performance in basketball-specific agility. Power capacities were confirmed to be an important factor in agility. The results highlighted the importance of sport-specific tests when studying pre-planned agility performance in basketball. 
The improvement in power capacities will probably result in an improvement in agility in basketball athletes, while anthropometric indices should be used in order to identify those athletes who can achieve superior agility performance.
Review and evaluation of performance measures for survival prediction models in external validation settings.

PubMed

Rahman, M Shafiqur; Ambler, Gareth; Choodari-Oskooei, Babak; Omar, Rumana Z

2017-04-18

When developing a prediction model for survival data it is essential to validate its performance in external validation settings using appropriate performance measures. Although a number of such measures have been proposed, there is only limited guidance regarding their use in the context of model validation. This paper reviewed and evaluated a wide range of performance measures to provide some guidelines for their use in practice. An extensive simulation study based on two clinical datasets was conducted to investigate the performance of the measures in external validation settings. Measures were selected from categories that assess the overall performance, discrimination and calibration of a survival prediction model. Some of these have been modified to allow their use with validation data, and a case study is provided to describe how these measures can be estimated in practice. The measures were evaluated with respect to their robustness to censoring and ease of interpretation. All measures are implemented, or are straightforward to implement, in statistical software. Most of the performance measures were reasonably robust to moderate levels of censoring. One exception was Harrell's concordance measure which tended to increase as censoring increased. We recommend that Uno's concordance measure is used to quantify concordance when there are moderate levels of censoring. Alternatively, Gönen and Heller's measure could be considered, especially if censoring is very high, but we suggest that the prediction model is re-calibrated first. We also recommend that Royston's D is routinely reported to assess discrimination since it has an appealing interpretation. The calibration slope is useful for both internal and external validation settings and recommended to report routinely. Our recommendation would be to use any of the predictive accuracy measures and provide the corresponding predictive accuracy curves. 
In addition, we recommend investigating the characteristics of the validation data, such as the level of censoring and the distribution of the prognostic index derived in the validation setting, before choosing the performance measures.
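As a rough illustration of the concordance idea discussed in the abstract above, the following is a minimal sketch of Harrell's concordance measure for right-censored data. It is not the authors' implementation; the function name `harrell_c` and the toy data are hypothetical, and tied event times are simply skipped. A pair of subjects is treated as usable only when the subject with the shorter time actually had the event. This restriction to usable pairs is why the estimate depends on the censoring pattern.

```python
def harrell_c(times, events, risk):
    """Harrell's C: share of usable pairs in which the subject with the
    shorter observed time also has the higher predicted risk.

    times  -- observed follow-up times
    events -- 1 if the event was observed, 0 if censored
    risk   -- predicted risk scores (higher = worse prognosis)
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is usable only if subject i had the event strictly
            # before subject j's observed time (ties in time are skipped).
            if events[i] and times[i] < times[j]:
                usable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5  # tied predictions count as half
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable


# Hypothetical toy data: the third subject is censored.
times = [1, 2, 3, 4]
events = [1, 1, 0, 1]
perfect = harrell_c(times, events, [4, 3, 2, 1])
reversed_ = harrell_c(times, events, [1, 2, 3, 4])
```

Uno's measure, which the abstract recommends under moderate censoring, reweights these pairs by the inverse probability of censoring; that step is not shown here.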
The Virtual Shop: A new immersive virtual reality environment and scenario for the assessment of everyday memory.

PubMed

Ouellet, Émilie; Boller, Benjamin; Corriveau-Lecavalier, Nick; Cloutier, Simon; Belleville, Sylvie

2018-06-01

Assessing and predicting memory performance in everyday life is a common assignment for neuropsychologists. However, most traditional neuropsychological tasks are not conceived to capture everyday memory performance. The Virtual Shop is a fully immersive task developed to assess memory in a more ecological way than traditional neuropsychological assessments. Two studies were undertaken to assess the feasibility of the Virtual Shop and to appraise its ecological and construct validity. In study 1, 20 younger and 19 older adults completed the Virtual Shop task to evaluate its level of difficulty and the way the participants interacted with the VR material. The construct validity was examined with the contrasted-group method, by comparing the performance of younger and older adults. In study 2, 35 individuals with subjective cognitive decline completed the Virtual Shop task. Performance was correlated with an existing questionnaire evaluating everyday memory in order to appraise its ecological validity. To add further support to its construct validity, performance was correlated with traditional episodic memory and executive tasks. All participants successfully completed the Virtual Shop. The task had an appropriate level of difficulty that helped differentiate younger and older adults, supporting the feasibility and construct validity of the task. The performance on the Virtual Shop was significantly and moderately correlated with the performance on the questionnaire and on the traditional memory and executive tasks. Results support the feasibility and both the ecological and construct validity of the Virtual Shop. Copyright © 2018 The Authors. Published by Elsevier B.V. All rights reserved.
The predictive validity of the MCAT for medical school performance and medical board licensing examinations: a meta-analysis of the published research.

PubMed

Donnon, Tyrone; Paolucci, Elizabeth Oddone; Violato, Claudio

2007-01-01

To conduct a meta-analysis of published studies to determine the predictive validity of the MCAT on medical school performance and medical board licensing examinations. The authors included all peer-reviewed published studies reporting empirical data on the relationship between MCAT scores and medical school performance or medical board licensing exam measures. Moderator variables, participant characteristics, and medical school performance/medical board licensing exam measures were extracted and reviewed separately by three reviewers using a standardized protocol. Medical school performance measures from 11 studies and medical board licensing examinations from 18 studies, for a total of 23 studies, were selected. A random-effects model meta-analysis of weighted effects sizes (r) resulted in (1) a predictive validity coefficient for the MCAT in the preclinical years of r = 0.39 (95% confidence interval [CI], 0.21-0.54) and on the USMLE Step 1 of r = 0.60 (95% CI, 0.50-0.67); and (2) the biological sciences subtest as the best predictor of medical school performance in the preclinical years (r = 0.32 95% CI, 0.21-0.42) and on the USMLE Step 1 (r = 0.48 95% CI, 0.41-0.54). The predictive validity of the MCAT ranges from small to medium for both medical school performance and medical board licensing exam measures. The medical profession is challenged to develop screening and selection criteria with improved validity that can supplement the MCAT as an important criterion for admission to medical schools.
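The validity coefficients above are reported as correlations with 95% confidence intervals. As a hedged sketch of how such an interval is commonly obtained for a single correlation, the Fisher z-transformation can be used; the pooling step of the random-effects model is not reproduced here, and the function name `fisher_ci` and the sample size in the example are assumptions, not values from the study.

```python
import math


def fisher_ci(r, n, z_crit=1.96):
    """Approximate 95% CI for a correlation r from a sample of size n,
    using the Fisher z-transformation (standard error 1 / sqrt(n - 3))."""
    if n <= 3:
        raise ValueError("n must be greater than 3")
    z = math.atanh(r)               # transform r to the z scale
    se = 1.0 / math.sqrt(n - 3)     # standard error on the z scale
    lo = math.tanh(z - z_crit * se) # back-transform the limits
    hi = math.tanh(z + z_crit * se)
    return lo, hi


# Hypothetical example: r = 0.5 observed in 103 participants.
lo, hi = fisher_ci(0.5, 103)
```

The interval is asymmetric around r on the correlation scale because the back-transformation is nonlinear.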
Tests for the Assessment of Sport-Specific Performance in Olympic Combat Sports: A Systematic Review With Practical Recommendations

PubMed Central

Chaabene, Helmi; Negra, Yassine; Bouguezzi, Raja; Capranica, Laura; Franchini, Emerson; Prieske, Olaf; Hbacha, Hamdi; Granacher, Urs

2018-01-01

The regular monitoring of physical fitness and sport-specific performance is important in elite sports to increase the likelihood of success in competition. This study aimed to systematically review and to critically appraise the methodological quality, validation data, and feasibility of the sport-specific performance assessment in Olympic combat sports like amateur boxing, fencing, judo, karate, taekwondo, and wrestling. A systematic search was conducted in the electronic databases PubMed, Google-Scholar, and Science-Direct up to October 2017. Studies in combat sports were included that reported validation data (e.g., reliability, validity, sensitivity) of sport-specific tests. Overall, 39 studies were eligible for inclusion in this review. The majority of studies (74%) contained sample sizes <30 subjects. Nearly 1/3 of the reviewed studies lacked a sufficient description (e.g., anthropometrics, age, expertise level) of the included participants. Seventy-two percent of studies did not sufficiently report inclusion/exclusion criteria of their participants. In 62% of the included studies, the description and/or inclusion of a familiarization session(s) was either incomplete or nonexistent. Sixty percent of studies did not report any details about the stability of testing conditions. Approximately half of the studies examined reliability measures of the included sport-specific tests (intraclass correlation coefficient [ICC] = 0.43–1.00). Content validity was addressed in all included studies, criterion validity (only the concurrent aspect of it) in approximately half of the studies with correlation coefficients ranging from r = −0.41 to 0.90. Construct validity was reported in 31% of the included studies and predictive validity in only one. Test sensitivity was addressed in 13% of the included studies. The majority of studies (64%) ignored and/or provided incomplete information on test feasibility and methodological limitations of the sport-specific test.
In 28% of the included studies, insufficient information or a complete lack of information was provided in the respective field of the test application. Several methodological gaps exist in studies that used sport-specific performance tests in Olympic combat sports. Additional research should adopt more rigorous validation procedures in the application and description of sport-specific performance tests in Olympic combat sports. PMID:29692739
Towards a full integration of optimization and validation phases: An analytical-quality-by-design approach.

PubMed

Hubert, C; Houari, S; Rozet, E; Lebrun, P; Hubert, Ph

2015-05-22

When using an analytical method, defining an analytical target profile (ATP) focused on quantitative performance represents a key input, and this will drive the method development process. In this context, two case studies were selected in order to demonstrate the potential of a quality-by-design (QbD) strategy when applied to two specific phases of the method lifecycle: the pre-validation study and the validation step. The first case study focused on the improvement of a liquid chromatography (LC) coupled to mass spectrometry (MS) stability-indicating method by the means of the QbD concept. The design of experiments (DoE) conducted during the optimization step (i.e. determination of the qualitative design space (DS)) was performed a posteriori. Additional experiments were performed in order to simultaneously conduct the pre-validation study to assist in defining the DoE to be conducted during the formal validation step. This predicted protocol was compared to the one used during the formal validation. A second case study based on the LC/MS-MS determination of glucosamine and galactosamine in human plasma was considered in order to illustrate an innovative strategy allowing the QbD methodology to be incorporated during the validation phase. An operational space, defined by the qualitative DS, was considered during the validation process rather than a specific set of working conditions as conventionally performed. Results of all the validation parameters conventionally studied were compared to those obtained with this innovative approach for glucosamine and galactosamine. Using this strategy, qualitative and quantitative information were obtained. Consequently, an analyst using this approach would be able to select with great confidence several working conditions within the operational space rather than a given condition for the routine use of the method. This innovative strategy combines both a learning process and a thorough assessment of the risk involved. 
Copyright © 2015 Elsevier B.V. All rights reserved.
Validating workplace performance assessments in health sciences students: a case study from speech pathology.

PubMed

McAllister, Sue; Lincoln, Michelle; Ferguson, Allison; McAllister, Lindy

2013-01-01

Valid assessment of health science students' ability to perform in the real world of workplace practice is critical for promoting quality learning and ultimately certifying students as fit to enter the world of professional practice. Current practice in performance assessment in the health sciences field has been hampered by multiple issues regarding assessment content and process. Evidence for the validity of scores derived from assessment tools are usually evaluated against traditional validity categories with reliability evidence privileged over validity, resulting in the paradoxical effect of compromising the assessment validity and learning processes the assessments seek to promote. Furthermore, the dominant statistical approaches used to validate scores from these assessments fall under the umbrella of classical test theory approaches. This paper reports on the successful national development and validation of measures derived from an assessment of Australian speech pathology students' performance in the workplace. Validation of these measures considered each of Messick's interrelated validity evidence categories and included using evidence generated through Rasch analyses to support score interpretation and related action. This research demonstrated that it is possible to develop an assessment of real, complex, work based performance of speech pathology students, that generates valid measures without compromising the learning processes the assessment seeks to promote. The process described provides a model for other health professional education programs to trial.
Reliable and valid tools for measuring surgeons' teaching performance: residents' vs. self evaluation.

PubMed

Boerebach, Benjamin C M; Arah, Onyebuchi A; Busch, Olivier R C; Lombarts, Kiki M J M H

2012-01-01

In surgical education, there is a need for educational performance evaluation tools that yield reliable and valid data. This paper describes the development and validation of robust evaluation tools that provide surgeons with insight into their clinical teaching performance. We investigated (1) the reliability and validity of 2 tools for evaluating the teaching performance of attending surgeons in residency training programs, and (2) whether surgeons' self evaluation correlated with the residents' evaluation of those surgeons. We surveyed 343 surgeons and 320 residents as part of a multicenter prospective cohort study of faculty teaching performance in residency training programs. The reliability and validity of the SETQ (System for Evaluation Teaching Qualities) tools were studied using standard psychometric techniques. We then estimated the correlations between residents' and surgeons' evaluations. The response rate was 87% among surgeons and 84% among residents, yielding 2625 residents' evaluations and 302 self evaluations. The SETQ tools yielded reliable and valid data on 5 domains of surgical teaching performance, namely, learning climate, professional attitude towards residents, communication of goals, evaluation of residents, and feedback. The correlations between surgeons' self and residents' evaluations were low, with coefficients ranging from 0.03 for evaluation of residents to 0.18 for communication of goals. The SETQ tools for the evaluation of surgeons' teaching performance appear to yield reliable and valid data. The lack of strong correlations between surgeons' self and residents' evaluations suggest the need for using external feedback sources in informed self evaluation of surgeons. Copyright © 2012 Association of Program Directors in Surgery. Published by Elsevier Inc. All rights reserved.
Assessing Procedural Competence: Validity Considerations.

PubMed

Pugh, Debra M; Wood, Timothy J; Boulet, John R

2015-10-01

Simulation-based medical education (SBME) offers opportunities for trainees to learn how to perform procedures and to be assessed in a safe environment. However, SBME research studies often lack robust evidence to support the validity of the interpretation of the results obtained from tools used to assess trainees' skills. The purpose of this paper is to describe how a validity framework can be applied when reporting and interpreting the results of a simulation-based assessment of skills related to performing procedures. The authors discuss various sources of validity evidence as they relate to SBME. A case study is presented.
A Decade of Candidates' Performances in NECO-SSCE Mathematics in Nigeria

ERIC Educational Resources Information Center

Utibe, U. J.; Agwagah, U. N.

2015-01-01

This study investigated a decade of candidates' performances in NECO-SSCE mathematics in Nigeria. A total of 9266459 valid results were collated for the study and analyzed for zones in the country. Already validated results of NECO for 2000 to 2009 were used for the study. Three research questions guided the conduct of the study. Results showed…
The Arthroscopic Surgical Skill Evaluation Tool (ASSET)

PubMed Central

Koehler, Ryan J.; Amsdell, Simon; Arendt, Elizabeth A; Bisson, Leslie J; Braman, Jonathan P; Butler, Aaron; Cosgarea, Andrew J; Harner, Christopher D; Garrett, William E; Olson, Tyson; Warme, Winston J.; Nicandri, Gregg T.

2014-01-01

Background: Surgeries employing arthroscopic techniques are among the most commonly performed in orthopaedic clinical practice; however, valid and reliable methods of assessing the arthroscopic skill of orthopaedic surgeons are lacking. Hypothesis: The Arthroscopic Surgery Skill Evaluation Tool (ASSET) will demonstrate content validity, concurrent criterion-oriented validity, and reliability, when used to assess the technical ability of surgeons performing diagnostic knee arthroscopy on cadaveric specimens. Study Design: Cross-sectional study; Level of evidence, 3. Methods: Content validity was determined by a group of seven experts using a Delphi process. Intra-articular performance of a right and left diagnostic knee arthroscopy was recorded for twenty-eight residents and two sports medicine fellowship trained attending surgeons. Subject performance was assessed by two blinded raters using the ASSET. Concurrent criterion-oriented validity, inter-rater reliability, and test-retest reliability were evaluated. Results: Content validity: The content development group identified 8 arthroscopic skill domains to evaluate using the ASSET. Concurrent criterion-oriented validity: Significant differences in total ASSET score (p<0.05) between novice, intermediate, and advanced experience groups were identified. Inter-rater reliability: The ASSET scores assigned by each rater were strongly correlated (r=0.91, p<0.01) and the intra-class correlation coefficient between raters for the total ASSET score was 0.90. Test-retest reliability: There was a significant correlation between ASSET scores for both procedures attempted by each individual (r=0.79, p<0.01). Conclusion: The ASSET appears to be a useful, valid, and reliable method for assessing surgeon performance of diagnostic knee arthroscopy in cadaveric specimens. Studies are ongoing to determine its generalizability to other procedures as well as to the live OR and other simulated environments. PMID:23548808
The development and testing of a skin tear risk assessment tool.

PubMed

Newall, Nelly; Lewin, Gill F; Bulsara, Max K; Carville, Keryln J; Leslie, Gavin D; Roberts, Pam A

2017-02-01

The aim of the present study is to develop a reliable and valid skin tear risk assessment tool. The six characteristics identified in a previous case control study as constituting the best risk model for skin tear development were used to construct a risk assessment tool. The ability of the tool to predict skin tear development was then tested in a prospective study. Between August 2012 and September 2013, 1466 tertiary hospital patients were assessed at admission and followed up for 10 days to see if they developed a skin tear. The predictive validity of the tool was assessed using receiver operating characteristic (ROC) analysis. When the tool was found not to have performed as well as hoped, secondary analyses were performed to determine whether a potentially better performing risk model could be identified. The tool was found to have high sensitivity but low specificity and therefore have inadequate predictive validity. Secondary analysis of the combined data from this and the previous case control study identified an alternative better performing risk model. The tool developed and tested in this study was found to have inadequate predictive validity. The predictive validity of an alternative, more parsimonious model now needs to be tested. © 2015 Medicalhelplines.com Inc and John Wiley & Sons Ltd.
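The abstract above judges predictive validity through ROC analysis and reports high sensitivity with low specificity. A minimal sketch of those quantities follows, assuming binary outcomes coded 0/1; the function names and the toy data are hypothetical and do not come from the study. The area under the ROC curve is computed here through its rank interpretation: the probability that a randomly chosen case scores higher than a randomly chosen non-case.

```python
def sensitivity_specificity(y_true, y_pred):
    """Sensitivity and specificity from binary outcomes and binary predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison of cases and non-cases."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    # Each case/non-case pair scores 1 if the case ranks higher, 0.5 on a tie.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: two patients with the outcome, two without.
y_true = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed cut-off of 0.5
sens, spec = sensitivity_specificity(y_true, y_pred)
area = auc(y_true, scores)
```

Lowering the cut-off raises sensitivity at the cost of specificity, which is the trade-off the ROC curve traces out.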
Comparing current definitions of return to work: a measurement approach.

PubMed

Steenstra, I A; Lee, H; de Vroome, E M M; Busse, J W; Hogg-Johnson, S J

2012-09-01

Return-to-work (RTW) status is an often used outcome in work and health research. In low back pain, work is regarded as a normal activity a worker should return to in order to fully recover. Comparing outcomes across studies and even jurisdictions using different definitions of RTW can be challenging for readers in general and when performing a systematic review in particular. In this study, the measurement properties of previously defined RTW outcomes were examined with data from two studies from two countries. Data on RTW in low back pain (LBP) from the Canadian Early Claimant Cohort (ECC); a workers' compensation based study, and the Dutch Amsterdam Sherbrooke Evaluation (ASE) study were analyzed. Correlations between outcomes, differences in predictive validity when using different outcomes and construct validity when comparing outcomes to a functional status outcome were analyzed. In the ECC all definitions were highly correlated and performed similarly in predictive validity. When compared to functional status, RTW definitions in the ECC study performed fair to good on all time points. In the ASE study all definitions were highly correlated and performed similarly in predictive validity. The RTW definitions, however, failed to compare or compared poorly with functional status. Only one definition compared fairly on one time point. Differently defined outcomes are highly correlated, give similar results in prediction, but seem to differ in construct validity when compared to functional status depending on societal context or possibly birth cohort. Comparison of studies using different RTW definitions appears valid as long as RTW status is not considered as a measure of functional status.
Impact of External Cue Validity on Driving Performance in Parkinson's Disease

PubMed Central

Scally, Karen; Charlton, Judith L.; Iansek, Robert; Bradshaw, John L.; Moss, Simon; Georgiou-Karistianis, Nellie

2011-01-01

This study sought to investigate the impact of external cue validity on simulated driving performance in 19 Parkinson's disease (PD) patients and 19 healthy age-matched controls. Braking points and distance between deceleration point and braking point were analysed for red traffic signals preceded either by Valid Cues (correctly predicting signal), Invalid Cues (incorrectly predicting signal), and No Cues. Results showed that PD drivers braked significantly later and travelled significantly further between deceleration and braking points compared with controls for Invalid and No-Cue conditions. No significant group differences were observed for driving performance in response to Valid Cues. The benefit of Valid Cues relative to Invalid Cues and No Cues was significantly greater for PD drivers compared with controls. Trail Making Test (B-A) scores correlated with driving performance for PDs only. These results highlight the importance of external cues and higher cognitive functioning for driving performance in mild to moderate PD. PMID:21789275

FUNCTIONAL PERFORMANCE TESTING OF THE HIP IN ATHLETES: A SYSTEMATIC REVIEW FOR RELIABILITY AND VALIDITY

PubMed Central

Martin, RobRoy L.

2012-01-01

Purpose/Background: The purpose of this study was to systematically review the literature for functional performance tests with evidence of reliability and validity that could be used for a young, athletic population with hip dysfunction. Methods: A search of the PubMed and SPORTDiscus databases was performed to identify movement, balance, hop/jump, or agility functional performance tests from the current peer-reviewed literature used to assess function of the hip in young, athletic subjects. Results: The single-leg stance, deep squat, single-leg squat, and star excursion balance tests (SEBT) demonstrated evidence of validity and normative data for score interpretation. The single-leg stance test and SEBT have evidence of validity with association to hip abductor function. The deep squat test demonstrated evidence as a functional performance test for evaluating femoroacetabular impingement. Hop/Jump tests and agility tests have no reported evidence of reliability or validity in a population of subjects with hip pathology. Conclusions: Use of functional performance tests in the assessment of hip dysfunction has not been well established in the current literature. Diminished squat depth and provocation of pain during the single-leg balance test have been associated with patients diagnosed with FAI and gluteal tendinopathy, respectively. The SEBT and single-leg squat tests provided evidence of convergent validity through an analysis of kinematics and muscle function in normal subjects. Reliability of functional performance tests has not been established in patients with hip dysfunction. Further study is needed to establish reliability and validity of functional performance tests that can be used in a young, athletic population with hip dysfunction. Level of Evidence: 2b (Systematic Review of Literature) PMID:22893860
The predictive validity of a situational judgement test, a clinical problem solving test and the core medical training selection methods for performance in specialty training.

PubMed

Patterson, Fiona; Lopes, Safiatu; Harding, Stephen; Vaux, Emma; Berkin, Liz; Black, David

2017-02-01

The aim of this study was to follow up a sample of physicians who began core medical training (CMT) in 2009. This paper examines the long-term validity of CMT and GP selection methods in predicting performance in the Membership of Royal College of Physicians (MRCP(UK)) examinations. We performed a longitudinal study, examining the extent to which the GP and CMT selection methods (T1) predict performance in the MRCP(UK) examinations (T2). A total of 2,569 applicants from 2008-09 who completed CMT and GP selection methods were included in the study. Looking at MRCP(UK) part 1, part 2 written and PACES scores, both CMT and GP selection methods show evidence of predictive validity for the outcome variables, and hierarchical regressions show the GP methods add significant value to the CMT selection process. CMT selection methods predict performance in important outcomes and have good evidence of validity; the GP methods may have an additional role alongside the CMT selection methods. © Royal College of Physicians 2017. All rights reserved.
Validation conform ISO-15189 of assays in the field of autoimmunity: Joint efforts in The Netherlands.

PubMed

Mulder, Leontine; van der Molen, Renate; Koelman, Carin; van Leeuwen, Ester; Roos, Anja; Damoiseaux, Jan

2018-05-01

ISO 15189:2012 requires validation of methods used in the medical laboratory, and lists a series of performance parameters for consideration to include. Although these performance parameters are feasible for clinical chemistry analytes, application in the validation of autoimmunity tests is a challenge. Lack of gold standards or reference methods in combination with the scarcity of well-defined diagnostic samples of patients with rare diseases make validation of new assays difficult. The present manuscript describes the initiative of Dutch medical immunology laboratory specialists to combine efforts and perform multi-center validation studies of new assays in the field of autoimmunity. Validation data and reports are made available to interested Dutch laboratory specialists. Copyright © 2018 Elsevier B.V. All rights reserved.
Assessment of predictive performance in incomplete data by combining internal validation and multiple imputation.

PubMed

Wahl, Simone; Boulesteix, Anne-Laure; Zierer, Astrid; Thorand, Barbara; van de Wiel, Mark A

2016-10-26

Missing values are a frequent issue in human studies. In many situations, multiple imputation (MI) is an appropriate missing data handling strategy, whereby missing values are imputed multiple times, the analysis is performed in every imputed data set, and the obtained estimates are pooled. If the aim is to estimate (added) predictive performance measures, such as (change in) the area under the receiver-operating characteristic curve (AUC), internal validation strategies become desirable in order to correct for optimism. It is not fully understood how internal validation should be combined with multiple imputation. In a comprehensive simulation study and in a real data set based on blood markers as predictors for mortality, we compare three combination strategies: Val-MI, internal validation followed by MI on the training and test parts separately, MI-Val, MI on the full data set followed by internal validation, and MI(-y)-Val, MI on the full data set omitting the outcome followed by internal validation. Different validation strategies, including bootstrap and cross-validation, different (added) performance measures, and various data characteristics are considered, and the strategies are evaluated with regard to bias and mean squared error of the obtained performance estimates. In addition, we elaborate on the number of resamples and imputations to be used, and adapt a strategy for confidence interval construction to incomplete data. Internal validation is essential in order to avoid optimism, with the bootstrap 0.632+ estimate representing a reliable method to correct for optimism. While estimates obtained by MI-Val are optimistically biased, those obtained by MI(-y)-Val tend to be pessimistic in the presence of a true underlying effect. Val-MI provides largely unbiased estimates, with a slight pessimistic bias with increasing true effect size, number of covariates and decreasing sample size.
In Val-MI, accuracy of the estimate is more strongly improved by increasing the number of bootstrap draws rather than the number of imputations. With a simple integrated approach, valid confidence intervals for performance estimates can be obtained. When prognostic models are developed on incomplete data, Val-MI represents a valid strategy to obtain estimates of predictive performance measures.
A systematic review of the measurement properties of the European Organisation for Research and Treatment of Cancer In-patient Satisfaction with Care Questionnaire, the EORTC IN-PATSAT32.

PubMed

Neijenhuijs, Koen I; Jansen, Femke; Aaronson, Neil K; Brédart, Anne; Groenvold, Mogens; Holzner, Bernhard; Terwee, Caroline B; Cuijpers, Pim; Verdonck-de Leeuw, Irma M

2018-05-07

The EORTC IN-PATSAT32 is a patient-reported outcome measure (PROM) to assess cancer patients' satisfaction with in-patient health care. The aim of this study was to investigate whether the initial good measurement properties of the IN-PATSAT32 are confirmed in new studies. Within the scope of a larger systematic review study (Prospero ID 42017057237), a systematic search was performed of Embase, Medline, PsycINFO, and Web of Science for studies that investigated measurement properties of the IN-PATSAT32 up to July 2017. Study quality was assessed, and data were extracted and synthesized according to the COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) methodology. Nine studies were included in this review. The evidence on reliability and construct validity was rated as sufficient and the quality of the evidence as moderate. The evidence on structural validity was rated as insufficient and of low quality. The evidence on internal consistency was indeterminate. Measurement error, responsiveness, criterion validity, and cross-cultural validity were not reported in the included studies. Measurement error could be calculated for two studies and was judged indeterminate. In summary, the IN-PATSAT32 performs as expected with respect to reliability and construct validity. No firm conclusions can be made yet whether the IN-PATSAT32 also performs as well with respect to structural validity and internal consistency. Further research on these measurement properties of the PROM is therefore needed as well as on measurement error, responsiveness, criterion validity, and cross-cultural validity. For future studies, it is recommended to take the COSMIN methodology into account.
Risk prediction models of breast cancer: a systematic review of model performances.

PubMed

Anothaisintawee, Thunyarat; Teerawattananon, Yot; Wiratkapun, Chollathip; Kasamesup, Vijj; Thakkinstian, Ammarin

2012-05-01

An increasing number of risk prediction models have been developed for estimating breast cancer risk in individual women. However, the performance of those models is questionable. We therefore conducted a study with the aim of systematically reviewing previous risk prediction models. The results from this review help to identify the most reliable model and indicate the strengths and weaknesses of each model for guiding future model development. We searched MEDLINE (PubMed) from 1949 and EMBASE (Ovid) from 1974 until October 2010. Observational studies which constructed models using regression methods were selected. Information about model development and performance was extracted. Twenty-five out of 453 studies were eligible. Of these, 18 developed prediction models and 7 validated existing prediction models. Up to 13 variables were included in the models and sample sizes for each study ranged from 550 to 2,404,636. Internal validation was performed in four models, while five models had external validation. The Gail and the Rosner and Colditz models were the significant models which were subsequently modified by other scholars. Calibration performance of most models was fair to good (expected/observed ratio: 0.87-1.12), but discriminatory accuracy was poor to fair both in internal validation (concordance statistics: 0.53-0.66) and in external validation (concordance statistics: 0.56-0.63). Most models yielded relatively poor discrimination in both internal and external validation. This poor discriminatory accuracy of existing models might be because of a lack of knowledge about risk factors, heterogeneous subtypes of breast cancer, and different distributions of risk factors across populations. In addition, the concordance statistic itself is insensitive for measuring improvement in discrimination. Therefore, newer methods such as the net reclassification index should be considered to evaluate the improvement in performance of a newly developed model.
A Unified Model of Performance: Validation of its Predictions across Different Sleep/Wake Schedules

PubMed Central

Ramakrishnan, Sridhar; Wesensten, Nancy J.; Balkin, Thomas J.; Reifman, Jaques

2016-01-01

Study Objectives: Historically, mathematical models of human neurobehavioral performance developed on data from one sleep study were limited to predicting performance in similar studies, restricting their practical utility. We recently developed a unified model of performance (UMP) to predict the effects of the continuum of sleep loss—from chronic sleep restriction (CSR) to total sleep deprivation (TSD) challenges—and validated it using data from two studies of one laboratory. Here, we significantly extended this effort by validating the UMP predictions across a wide range of sleep/wake schedules from different studies and laboratories. Methods: We developed the UMP on psychomotor vigilance task (PVT) lapse data from one study encompassing four different CSR conditions (7 d of 3, 5, 7, and 9 h of sleep/night), and predicted performance in five other studies (from four laboratories), including different combinations of TSD (40 to 88 h), CSR (2 to 6 h of sleep/night), control (8 to 10 h of sleep/night), and nap (nocturnal and diurnal) schedules. Results: The UMP accurately predicted PVT performance trends across 14 different sleep/wake conditions, yielding average prediction errors between 7% and 36%, with the predictions lying within 2 standard errors of the measured data 87% of the time. In addition, the UMP accurately predicted performance impairment (average error of 15%) for schedules (TSD and naps) not used in model development. Conclusions: The unified model of performance can be used as a tool to help design sleep/wake schedules to optimize the extent and duration of neurobehavioral performance and to accelerate recovery after sleep loss. Citation: Ramakrishnan S, Wesensten NJ, Balkin TJ, Reifman J. A unified model of performance: validation of its predictions across different sleep/wake schedules. SLEEP 2016;39(1):249–262. PMID:26518594
The predictive validity of three versions of the MCAT in relation to performance in medical school, residency, and licensing examinations: a longitudinal study of 36 classes of Jefferson Medical College.

PubMed

Callahan, Clara A; Hojat, Mohammadreza; Veloski, Jon; Erdmann, James B; Gonnella, Joseph S

2010-06-01

The Medical College Admission Test (MCAT) has undergone several revisions for content and validity since its inception. With another comprehensive review pending, this study examines changes in the predictive validity of the MCAT's three recent versions. Study participants were 7,859 matriculants in 36 classes entering Jefferson Medical College between 1970 and 2005; 1,728 took the pre-1978 version of the MCAT; 3,032 took the 1978-1991 version, and 3,099 took the post-1991 version. MCAT subtest scores were the predictors, and performance in medical school, attrition, scores on the medical licensing examinations, and ratings of clinical competence in the first year of residency were the criterion measures. No significant improvement in validity coefficients was observed for performance in medical school or residency. Validity coefficients for all three versions of the MCAT in predicting Part I/Step 1 remained stable (in the mid-0.40s, P < .01). A systematic decline was observed in the validity coefficients of the MCAT versions in predicting Part II/Step 2. It started at 0.47 for the pre-1978 version, decreased to between 0.42 and 0.40 for the 1978-1991 versions, and to 0.37 for the post-1991 version. Validity coefficients for the MCAT versions in predicting Part III/Step 3 remained near 0.30. These were generally larger for women than men. Although the findings support the short- and long-term predictive validity of the MCAT, opportunities to strengthen it remain. Subsequent revisions should increase the test's ability to predict performance on United States Medical Licensing Examination Step 2 and must minimize the differential validity for gender.
Early Detection of Increased Intracranial Pressure Episodes in Traumatic Brain Injury: External Validation in an Adult and in a Pediatric Cohort.

PubMed

Güiza, Fabian; Depreitere, Bart; Piper, Ian; Citerio, Giuseppe; Jorens, Philippe G; Maas, Andrew; Schuhmann, Martin U; Lo, Tsz-Yan Milly; Donald, Rob; Jones, Patricia; Maier, Gottlieb; Van den Berghe, Greet; Meyfroidt, Geert

2017-03-01

A model for early detection of episodes of increased intracranial pressure in traumatic brain injury patients has been previously developed and validated based on retrospective adult patient data from the multicenter Brain-IT database. The purpose of the present study is to validate this early detection model in different cohorts of recently treated adult and pediatric traumatic brain injury patients. Prognostic modeling. Noninterventional, observational, retrospective study. The adult validation cohort comprised recent traumatic brain injury patients from San Gerardo Hospital in Monza (n = 50), Leuven University Hospital (n = 26), Antwerp University Hospital (n = 19), Tübingen University Hospital (n = 18), and Southern General Hospital in Glasgow (n = 8). The pediatric validation cohort comprised patients from neurosurgical and intensive care centers in Edinburgh and Newcastle (n = 79). None. The model's performance was evaluated with respect to discrimination, calibration, overall performance, and clinical usefulness. In the recent adult validation cohort, the model retained excellent performance as in the original study. In the pediatric validation cohort, the model retained good discrimination and a positive net benefit, albeit with a performance drop in the remaining criteria. The obtained external validation results confirm the robustness of the model to predict future increased intracranial pressure events 30 minutes in advance, in adult and pediatric traumatic brain injury patients. These results are a large step toward an early warning system for increased intracranial pressure that can be generally applied. Furthermore, the sparseness of this model that uses only two routinely monitored signals as inputs (intracranial pressure and mean arterial blood pressure) is an additional asset.
Geosynthetic wall performance : facing pressure and deformation : final report.

DOT National Transportation Integrated Search

2017-02-01

The objective of the study was to validate the performance of blocked-faced Geosynthetic Reinforced Soil (GRS) wall and to validate the Colorado Department of Transportation's (CDOT) decision to waive the positive block connection for closely-space...
PERFORMANCE OF OVID MEDLINE SEARCH FILTERS TO IDENTIFY HEALTH STATE UTILITY STUDIES.

PubMed

Arber, Mick; Garcia, Sonia; Veale, Thomas; Edwards, Mary; Shaw, Alison; Glanville, Julie M

2017-01-01

This study was designed to assess the sensitivity of three Ovid MEDLINE search filters developed to identify studies reporting health state utility values (HSUVs), to improve the performance of the best performing filter, and to validate resulting search filters. Three quasi-gold standard sets (QGS1, QGS2, QGS3) of relevant studies were harvested from reviews of studies reporting HSUVs. The performance of three initial filters was assessed by measuring their relative recall of studies in QGS1. The best performing filter was then developed further using QGS2. This resulted in three final search filters (FSF1, FSF2, and FSF3), which were validated using QGS3. FSF1 (sensitivity maximizing) retrieved 132/139 records (sensitivity: 95 percent) in the QGS3 validation set. FSF1 had a number needed to read (NNR) of 842. FSF2 (balancing sensitivity and precision) retrieved 128/139 records (sensitivity: 92 percent) with a NNR of 502. FSF3 (precision maximizing) retrieved 123/139 records (sensitivity: 88 percent) with a NNR of 383. We have developed and validated a search filter (FSF1) to identify studies reporting HSUVs with high sensitivity (95 percent) and two other search filters (FSF2 and FSF3) with reasonably high sensitivity (92 percent and 88 percent) but greater precision, resulting in a lower NNR. These seem to be the first validated filters available for HSUVs. The availability of filters with a range of sensitivity and precision options enables researchers to choose the filter which is most appropriate to the resources available for their specific research.
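The sensitivity figures in the abstract above follow directly from the retrieved/total counts. A minimal Python sketch of that arithmetic, assuming the common convention that the number needed to read (NNR) is the reciprocal of precision; the helper names are hypothetical, and the counts are those quoted in the abstract.

```python
def sensitivity(retrieved_relevant: int, total_relevant: int) -> float:
    """Relative recall: share of the gold-standard records the filter retrieved."""
    return retrieved_relevant / total_relevant


def precision_from_nnr(nnr: float) -> float:
    """Assumes NNR = 1 / precision, so precision = 1 / NNR."""
    return 1.0 / nnr


QGS3_SIZE = 139  # records in the QGS3 validation set
# filter name -> (records retrieved from QGS3, reported NNR)
filters = {"FSF1": (132, 842), "FSF2": (128, 502), "FSF3": (123, 383)}

for name, (hits, nnr) in filters.items():
    print(
        f"{name}: sensitivity = {sensitivity(hits, QGS3_SIZE):.0%}, "
        f"implied precision = {precision_from_nnr(nnr):.2%}"
    )
```

Under that assumption, the trade-off described in the abstract is visible in the numbers: each step from FSF1 to FSF3 gives up a few points of sensitivity in exchange for a shorter reading list per relevant record.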
The Development and Validation of a Concise Instrument for Formative Assessment of Team Leader Performance During Simulated Pediatric Resuscitations.

PubMed

Nadkarni, Lindsay D; Roskind, Cindy G; Auerbach, Marc A; Calhoun, Aaron W; Adler, Mark D; Kessler, David O

2018-04-01

The aim of this study was to assess the validity of a formative feedback instrument for leaders of simulated resuscitations. This is a prospective validation study with a fully crossed (person × scenario × rater) study design. The Concise Assessment of Leader Management (CALM) instrument was designed by pediatric emergency medicine and graduate medical education experts to be used off the shelf to evaluate and provide formative feedback to resuscitation leaders. Four experts reviewed 16 videos of in situ simulated pediatric resuscitations and scored resuscitation leader performance using the CALM instrument. The videos consisted of 4 pediatric emergency department resuscitation teams each performing in 4 pediatric resuscitation scenarios (cardiac arrest, respiratory arrest, seizure, and sepsis). We report on content and internal structure (reliability) validity of the CALM instrument. Content validity was supported by the instrument development process that involved professional experience, expert consensus, focused literature review, and pilot testing. Internal structure validity (reliability) was supported by the generalizability analysis. The main component that contributed to score variability was the person (33%), meaning that individual leaders performed differently. The rater component had almost zero (0%) contribution to variance, which implies that raters were in agreement and argues for high interrater reliability. These results provide initial evidence to support the validity of the CALM instrument as a reliable assessment instrument that can facilitate formative feedback to leaders of pediatric simulated resuscitations.
The Arthroscopic Surgical Skill Evaluation Tool (ASSET).

PubMed

Koehler, Ryan J; Amsdell, Simon; Arendt, Elizabeth A; Bisson, Leslie J; Braman, Jonathan P; Bramen, Jonathan P; Butler, Aaron; Cosgarea, Andrew J; Harner, Christopher D; Garrett, William E; Olson, Tyson; Warme, Winston J; Nicandri, Gregg T

2013-06-01

Surgeries employing arthroscopic techniques are among the most commonly performed in orthopaedic clinical practice; however, valid and reliable methods of assessing the arthroscopic skill of orthopaedic surgeons are lacking. The Arthroscopic Surgery Skill Evaluation Tool (ASSET) will demonstrate content validity, concurrent criterion-oriented validity, and reliability when used to assess the technical ability of surgeons performing diagnostic knee arthroscopic surgery on cadaveric specimens. Cross-sectional study; Level of evidence, 3. Content validity was determined by a group of 7 experts using the Delphi method. Intra-articular performance of a right and left diagnostic knee arthroscopic procedure was recorded for 28 residents and 2 sports medicine fellowship-trained attending surgeons. Surgeon performance was assessed by 2 blinded raters using the ASSET. Concurrent criterion-oriented validity, interrater reliability, and test-retest reliability were evaluated. Content validity: The content development group identified 8 arthroscopic skill domains to evaluate using the ASSET. Concurrent criterion-oriented validity: Significant differences in the total ASSET score (P < .05) between novice, intermediate, and advanced experience groups were identified. Interrater reliability: The ASSET scores assigned by each rater were strongly correlated (r = 0.91, P < .01), and the intraclass correlation coefficient between raters for the total ASSET score was 0.90. Test-retest reliability: There was a significant correlation between ASSET scores for both procedures attempted by each surgeon (r = 0.79, P < .01). The ASSET appears to be a useful, valid, and reliable method for assessing surgeon performance of diagnostic knee arthroscopic surgery in cadaveric specimens. Studies are ongoing to determine its generalizability to other procedures as well as to the live operating room and other simulated environments.
Validity of Highlighting on Text Comprehension

NASA Astrophysics Data System (ADS)

So, Joey C. Y.; Chan, Alan H. S.

2009-10-01

In this study, 38 university students performed a Chinese reading task on a light-emitting diode (LED) display under different task conditions to determine the effects of highlighting and its validity on comprehension performance. Four levels of validity (0%, 33%, 67% and 100%) and a control condition with no highlighting were tested. Each subject completed all five experimental conditions, in which different passages were read and comprehended. The results showed that the condition with 100% highlighting validity yielded better comprehension performance than the other validity levels and the condition with no highlighting. The comprehension score for the condition without highlighting was comparatively lower than those for the highlighting conditions with distracters, though not significantly so.
Development and Validation of the Basketball Offensive Game Performance Instrument

ERIC Educational Resources Information Center

Chen, Weiyun; Hendricks, Kristin; Zhu, Weimo

2013-01-01

The purpose of this study was to design and validate the Basketball Offensive Game Performance Instrument (BOGPI) that assesses an individual player's offensive game performance competency in basketball. Twelve physical education teacher education (PETE) students playing two 10-minute, 3 vs. 3 basketball games were videotaped at end of a…
Structural and Convergent Validity of the Homework Performance Questionnaire

ERIC Educational Resources Information Center

Pendergast, Laura L.; Watkins, Marley W.; Canivez, Gary L.

2014-01-01

Homework is a requirement for most school-age children, but research on the benefits and drawbacks of homework is limited by lack of psychometrically sound measurement of homework performance. This study examined the structural and convergent validity of scores from the newly developed Homework Performance Questionnaire -- Teacher Scale (HPQ-T).…
Chronic obstructive lung disease "expert system": validation of a predictive tool for assisting diagnosis.

PubMed

Braido, Fulvio; Santus, Pierachille; Corsico, Angelo Guido; Di Marco, Fabiano; Melioli, Giovanni; Scichilone, Nicola; Solidoro, Paolo

2018-01-01

The purposes of this study were the development and validation of an expert system (ES) aimed at supporting the diagnosis of chronic obstructive lung disease (COLD). A questionnaire and a WebFlex code were developed and validated in silico. An expert panel pilot validation on 60 cases and a clinical validation on 241 cases were performed. The developed questionnaire and code validated in silico resulted in a suitable tool to support the medical diagnosis. The clinical validation of the ES was performed in an academic setting that included six different reference centers for respiratory diseases. The results of the ES, expressed as a score associated with the risk of suffering from COLD, were matched and compared with the final clinical diagnoses. A set of 60 patients was evaluated by a pilot expert panel validation with the aim of calculating the sample size for the clinical validation study. The concordance analysis between these preliminary ES scores and diagnoses performed by the experts indicated that the accuracy was 94.7% when both experts and the system confirmed the COLD diagnosis and 86.3% when COLD was excluded. Based on these results, the sample size of the validation set was set at 240 patients. The clinical validation, performed on 241 patients, resulted in ES accuracy of 97.5%, with confirmed COLD diagnosis in 53.6% of the cases and excluded COLD diagnosis in 32% of the cases. In 11.2% of cases, a diagnosis of COLD was made by the experts, although the imaging results showed a potential concomitant disorder. The ES presented here (COLD ES) is a safe and robust supporting tool for COLD diagnosis in primary care settings.
Technical skills assessment toolbox: a review using the unitary framework of validity.

PubMed

Ghaderi, Iman; Manji, Farouq; Park, Yoon Soo; Juul, Dorthea; Ott, Michael; Harris, Ilene; Farrell, Timothy M

2015-02-01

The purpose of this study was to create a technical skills assessment toolbox for 35 basic and advanced skills/procedures that comprise the American College of Surgeons (ACS)/Association of Program Directors in Surgery (APDS) surgical skills curriculum and to provide a critical appraisal of the included tools, using the contemporary framework of validity. Competency-based training has become the predominant model in surgical education, and assessment of performance is an essential component. Assessment methods must produce valid results to accurately determine the level of competency. A search was performed, using PubMed and Google Scholar, to identify tools that have been developed for assessment of the targeted technical skills. A total of 23 assessment tools for the 35 ACS/APDS skills modules were identified. Some tools, such as the Operative Performance Rating System (OPRS) and the Objective Structured Assessment of Technical Skill (OSATS), have been tested for more than 1 procedure. Therefore, 30 modules had at least 1 assessment tool, with some common surgical procedures being addressed by several tools. Five modules had none. Only 3 studies used Messick's framework to design their validity studies. The remaining studies used an outdated framework on the basis of "types of validity." When analyzed using the contemporary framework, few of these studies demonstrated validity for content, internal structure, and relationship to other variables. This study provides an assessment toolbox for common surgical skills/procedures. Our review shows that few authors have used the contemporary unitary concept of validity for development of their assessment tools. As we progress toward competency-based training, future studies should provide evidence for various sources of validity using the contemporary framework.
Effort, symptom validity testing, performance validity testing and traumatic brain injury.

PubMed

Bigler, Erin D

2014-01-01

To understand the neurocognitive effects of brain injury, valid neuropsychological test findings are paramount. This review examines the research on what has been referred to as symptom validity testing (SVT). Performance above a designated cut-score signifies a 'passing' SVT performance, which is likely the best indicator of valid neuropsychological test findings. Likewise, performance substantially below the cut-point that nears chance or is at chance signifies invalid test performance. Performance significantly below chance is the sine qua non neuropsychological indicator for malingering. However, the interpretative problems with SVT performance below the cut-point yet far above chance are substantial, as pointed out in this review. This intermediate, border-zone performance on SVT measures is where substantial interpretative challenges exist. Case studies are used to highlight the many areas where additional research is needed. Historical perspectives are reviewed along with the neurobiology of effort. Reasons why performance validity testing (PVT) may be a better term than SVT are reviewed. Advances in neuroimaging techniques may be key in better understanding the meaning of border-zone SVT failure. The review demonstrates the problems with rigidity in interpretation with established cut-scores. A better understanding of how certain types of neurological, neuropsychiatric and/or even test conditions may affect SVT performance is needed.
The reliability and validity of the Complex Task Performance Assessment: A performance-based assessment of executive function.

PubMed

Wolf, Timothy J; Dahl, Abigail; Auen, Colleen; Doherty, Meghan

2017-07-01

The objective of this study was to evaluate the inter-rater reliability, test-retest reliability, concurrent validity, and discriminant validity of the Complex Task Performance Assessment (CTPA): an ecologically valid performance-based assessment of executive function. Community control participants (n = 20) and individuals with mild stroke (n = 14) participated in this study. All participants completed the CTPA and a battery of cognitive assessments at initial testing. The control participants completed the CTPA at two different times one week apart. The intra-class correlation coefficient (ICC) for inter-rater reliability for the total score on the CTPA was .991. The ICCs for all of the sub-scores of the CTPA were also high (.889-.977). The CTPA total score was significantly correlated with Condition 4 of the DKEFS Color-Word Interference Test (ρ = -.425) and the Wechsler Test of Adult Reading (ρ = -.493). Finally, there were significant differences between control subjects and individuals with mild stroke on the total score of the CTPA (p = .007) and all sub-scores except interpretation failures and total items incorrect. These results are also consistent with other current executive function performance-based assessments and indicate that the CTPA is a reliable and valid performance-based measure of executive function.

The Stroop test as a measure of performance validity in adults clinically referred for neuropsychological assessment.

PubMed

Erdodi, Laszlo A; Sagar, Sanya; Seke, Kristian; Zuccato, Brandon G; Schwartz, Eben S; Roth, Robert M

2018-06-01

This study was designed to develop performance validity indicators embedded within the Delis-Kaplan Executive Function Systems (D-KEFS) version of the Stroop task. Archival data from a mixed clinical sample of 132 patients (50% male; M Age = 43.4; M Education = 14.1) clinically referred for neuropsychological assessment were analyzed. Criterion measures included the Warrington Recognition Memory Test-Words and 2 composites based on several independent validity indicators. An age-corrected scaled score ≤6 on any of the 4 trials reliably differentiated psychometrically defined credible and noncredible response sets with high specificity (.87-.94) and variable sensitivity (.34-.71). An inverted Stroop effect was less sensitive (.14-.29), but comparably specific (.85-.90) to invalid performance. Aggregating the newly developed D-KEFS Stroop validity indicators further improved classification accuracy. Failing the validity cutoffs was unrelated to self-reported depression or anxiety. However, it was associated with elevated somatic symptom report. In addition to processing speed and executive function, the D-KEFS version of the Stroop task can function as a measure of performance validity. A multivariate approach to performance validity assessment is generally superior to univariate models. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
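The cutoff rule described in the abstract above (a scaled score ≤6 on any of the 4 trials) and its evaluation against a criterion measure can be sketched in Python. This is only an illustration of how sensitivity and specificity are computed for such a rule; the function names and the four-patient toy data set are hypothetical and do not reproduce the study's data or results.

```python
def fails_cutoff(trial_scores, cutoff=6):
    """Flag a response set if any trial's scaled score is at or below the cutoff."""
    return any(score <= cutoff for score in trial_scores)


def sensitivity_specificity(flags, criterion_invalid):
    """Compare the embedded indicator (flags) with the criterion classification."""
    pairs = list(zip(flags, criterion_invalid))
    true_pos = sum(1 for f, c in pairs if f and c)
    false_neg = sum(1 for f, c in pairs if not f and c)
    true_neg = sum(1 for f, c in pairs if not f and not c)
    false_pos = sum(1 for f, c in pairs if f and not c)
    return true_pos / (true_pos + false_neg), true_neg / (true_neg + false_pos)


# Hypothetical scaled scores on the 4 Stroop trials for four patients.
toy_scores = [[5, 8, 9, 10], [7, 7, 8, 9], [10, 11, 6, 12], [9, 9, 9, 9]]
# Hypothetical criterion classification (True = noncredible response set).
toy_criterion = [True, True, False, False]

toy_flags = [fails_cutoff(scores) for scores in toy_scores]
print(sensitivity_specificity(toy_flags, toy_criterion))
```

Lowering the cutoff would flag fewer response sets, which typically raises specificity at the cost of sensitivity; that is the trade-off reflected in the ranges the abstract reports.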
Assessing the validity and reliability of the Malagasy version of Oral Impacts on Daily Performance (OIDP): a cross-sectional study.

PubMed

Razanamihaja, Noeline; Ranivoharilanto, Eva

2017-01-01

Evaluating health needs includes measures of the impact of state of health on the quality of life. This entails evaluating the psychosocial aspects of health. To achieve this, several tools for measuring the quality of life related to oral health have been developed. However, it is vital to evaluate the psychometric properties of these tools so they can be used in a new context and on a new population. The purpose of this study was to evaluate the reliability and validity of the Malagasy version of a questionnaire for studying the impacts of oral-dental health on daily activities (Oral Impacts on Daily Performance), and analyse the interrelations between the scores obtained and the oral health indicators. A cross-sectional study was performed for the transcultural adaptation of the Oral Impacts on Daily Performance questionnaire, forward translated and back-translated from English to Malagasy and from Malagasy to English, respectively. The psychometric characteristics of the Malagasy version of the Oral Impacts on Daily Performance were then evaluated in terms of internal reliability, test-retest reliability, and construct, criterion and discriminant validity. Four hundred and six adults responded in face-to-face interviews to the Malagasy version of the Oral Impacts on Daily Performance questionnaire. Nearly 74% of the participants indicated impacts of their oral health on their performance in their daily lives during the 6 months prior to the survey. The activities most affected were: "smiling", "eating" and "sleeping and relaxing". Cronbach's alpha was 0.87. The construct validity was demonstrated by a significant association between the Oral Impacts on Daily Performance scores and the subjective evaluation of oral health (p < 0.001). Discriminant validity was demonstrated by the fact that the Oral Impacts on Daily Performance scores were significantly higher in subjects with more than ten missing teeth, compared to those with fewer than ten missing teeth (p < 0.001). The Malagasy version of the Oral Impacts on Daily Performance index is a valid and reliable measure for use in Malagasy adults over 55 years old.
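The internal reliability figure reported above is a Cronbach's alpha. A minimal Python sketch of the standard formula, alpha = k/(k-1) × (1 − sum of item variances / variance of total scores), follows. The function name and the toy responses are hypothetical; the sketch uses sample variances from the standard library and does not reproduce the study's data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """responses: one row per respondent, each row holding k item scores."""
    k = len(responses[0])
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    items = list(zip(*responses))  # transpose to one column per item
    item_variance_sum = sum(variance(column) for column in items)
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical toy responses: 3 respondents, 2 items.
toy_responses = [[1, 2], [2, 1], [3, 3]]
print(cronbach_alpha(toy_responses))
```

Alpha approaches 1 when the items rise and fall together across respondents, which is why a value such as 0.87 is read as good internal consistency.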
Predicting Performance in Higher Education Using Proximal Predictors.

PubMed

Niessen, A Susan M; Meijer, Rob R; Tendeiro, Jorge N

2016-01-01

We studied the validity of two methods for predicting academic performance and student-program fit that were proximal to important study criteria. Applicants to an undergraduate psychology program participated in a selection procedure containing a trial-studying test based on a work sample approach, and specific skills tests in English and math. Test scores were used to predict academic achievement and progress after the first year, achievement in specific course types, enrollment, and dropout after the first year. All tests showed positive significant correlations with the criteria. The trial-studying test was consistently the best predictor in the admission procedure. We found no significant differences between the predictive validity of the trial-studying test and prior educational performance, and substantial shared explained variance between the two predictors. Only applicants with lower trial-studying scores were significantly less likely to enroll in the program. In conclusion, the trial-studying test yielded predictive validities similar to that of prior educational performance and possibly enabled self-selection. In admissions aimed at student-program fit, or in admissions in which past educational performance is difficult to use, a trial-studying test is a good instrument to predict academic performance.
The western Mediterranean Sea: An area for a regional validation for TOPEX/Poseidon and a field for geophysical and oceanographic studies

NASA Technical Reports Server (NTRS)

Barlier, Francois; Balmino, G.; Boucher, Claude; Willis, P.; Biancale, R.; Menard, Yves; Vincent, P.; Bethoux, J. P.; Exertier, P.; Pierron, F.

1991-01-01

The research project has two kinds of objectives. The first is focused on the regional validation of the altimeter, orbit, and mean sea surface; it will be performed in close cooperation with the local validation performed at Lampedusa/Lampione (Italy). The second deals with the geophysical and oceanographic research of interest in this area.
Evaluation of the methodological quality of studies of the performance of diagnostic tests for bovine tuberculosis using QUADAS.

PubMed

Downs, Sara H; More, Simon J; Goodchild, Anthony V; Whelan, Adam O; Abernethy, Darrell A; Broughan, Jennifer M; Cameron, Angus; Cook, Alasdair J; Ricardo de la Rua-Domenech, R; Greiner, Matthias; Gunn, Jane; Nuñez-Garcia, Javier; Rhodes, Shelley; Rolfe, Simon; Sharp, Michael; Upton, Paul; Watson, Eamon; Welsh, Michael; Woolliams, John A; Clifton-Hadley, Richard S; Parry, Jessica E

2018-05-01

There has been little assessment of the methodological quality of studies measuring the performance (sensitivity and/or specificity) of diagnostic tests for animal diseases. In a systematic review, 190 studies of tests for bovine tuberculosis (bTB) in cattle (published 1934-2009) were assessed by at least one of 18 reviewers using the QUADAS (Quality Assessment of Diagnostic Accuracy Studies) checklist adapted for animal disease tests. VETQUADAS (VQ) included items measuring clarity in reporting (n = 3), internal validity (n = 9) and external validity (n = 2). A similar pattern for compliance was observed in studies of different diagnostic test types. Compliance significantly improved with year of publication for all items measuring clarity in reporting and external validity but only improved in four of the nine items measuring internal validity (p < 0.05). 107 references, of which 83 had performance data eligible for inclusion in a meta-analysis were reviewed by two reviewers. In these references, agreement between reviewers' responses was 71% for compliance, 32% for unsure and 29% for non-compliance. Mean compliance with reporting items was 2, 5.2 for internal validity and 1.5 for external validity. The index test result was described in sufficient detail in 80.1% of studies and was interpreted without knowledge of the reference standard test result in only 33.1%. Loss to follow-up was adequately explained in only 31.1% of studies. The prevalence of deficiencies observed may be due to inadequate reporting but may also reflect lack of attention to methodological issues that could bias the results of diagnostic test performance estimates. QUADAS was a useful tool for assessing and comparing the quality of studies measuring the performance of diagnostic tests but might be improved further by including explicit assessment of population sampling strategy. Crown Copyright © 2017. Published by Elsevier B.V. All rights reserved.
A Case Study: Follow-Up Assessment of Facilitated Communication.

ERIC Educational Resources Information Center

Simon, Elliott W.; And Others

1996-01-01

This study of an adolescent with multiple disabilities, including moderate mental retardation, who was reported to engage in validated facilitated communication (FC) found he did not engage in validated FC; performance was equivalent whether food or nonfood reinforcers were used; and the Picture Exchange Communication System was a valid and…
Display format, highlight validity, and highlight method: Their effects on search performance

NASA Technical Reports Server (NTRS)

Donner, Kimberly A.; Mckay, Tim D.; Obrien, Kevin M.; Rudisill, Marianne

1991-01-01

Display format and highlight validity were shown to affect visual display search performance; however, these studies were conducted on small, artificial displays of alphanumeric stimuli. A study manipulating these variables was conducted using realistic, complex Space Shuttle information displays. A 2x2x3 within-subjects analysis of variance found that search times were faster for items in reformatted displays than for current displays. Responses to valid applications of highlight were significantly faster than responses to non or invalidly highlighted applications. The significant format by highlight validity interaction showed that there was little difference in response time to both current and reformatted displays when the highlight validity was applied; however, under the non or invalid highlight conditions, search times were faster with reformatted displays. A separate within-subject analysis of variance of display format, highlight validity, and several highlight methods did not reveal a main effect of highlight method. In addition, observed display search times were compared to search time predicted by Tullis' Display Analysis Program. Benefits of highlighting and reformatting displays to enhance search and the necessity to consider highlight validity and format characteristics in tandem for predicting search performance are discussed.
Development and Validation of Targeted Next-Generation Sequencing Panels for Detection of Germline Variants in Inherited Diseases.

PubMed

Santani, Avni; Murrell, Jill; Funke, Birgit; Yu, Zhenming; Hegde, Madhuri; Mao, Rong; Ferreira-Gonzalez, Andrea; Voelkerding, Karl V; Weck, Karen E

2017-06-01

Context: The number of targeted next-generation sequencing (NGS) panels for genetic diseases offered by clinical laboratories is rapidly increasing. Before an NGS-based test is implemented in a clinical laboratory, appropriate validation studies are needed to determine the performance characteristics of the test. Objective: To provide examples of assay design and validation of targeted NGS gene panels for the detection of germline variants associated with inherited disorders. Data sources: The approaches used by 2 clinical laboratories for the development and validation of targeted NGS gene panels are described. Important design and validation considerations are examined. Conclusions: Clinical laboratories must validate performance specifications of each test prior to implementation. Test design specifications and validation data are provided, outlining important steps in validation of targeted NGS panels by clinical diagnostic laboratories.
EEG-neurofeedback for optimising performance. II: creativity, the performing arts and ecological validity.

PubMed

Gruzelier, John H

2014-07-01

As a continuation of a review of evidence of the validity of cognitive/affective gains following neurofeedback in healthy participants, including correlations in support of the gains being mediated by feedback learning (Gruzelier, 2014a), the focus here is on the impact on creativity, especially in the performing arts including music, dance and acting. The majority of research involves alpha/theta (A/T), sensory-motor rhythm (SMR) and heart rate variability (HRV) protocols. There is evidence of reliable benefits from A/T training with advanced musicians especially for creative performance, and reliable benefits from both A/T and SMR training for novice music performance in adults and in a school study with children with impact on creativity, communication/presentation and technique. Making the SMR ratio training context ecologically relevant for actors enhanced creativity in stage performance, with added benefits from the more immersive training context. A/T and HRV training have benefitted dancers. The neurofeedback evidence adds to the rapidly accumulating validation of neurofeedback, while performing arts studies offer an opportunity for ecological validity in creativity research for both creative process and product. Copyright © 2013 Elsevier Ltd. All rights reserved.
Development of self and peer performance assessment on iodometric titration experiment

NASA Astrophysics Data System (ADS)

Nahadi; Siswaningsih, W.; Kusumaningtyas, H.

2018-05-01

This study aims to describe the process of developing a reliable and valid assessment to measure students’ performance on iodometric titration, and the effect of self and peer assessment on students’ performance. The self and peer instrument provides valuable feedback for improving student performance. The developed assessment contains a rubric and tasks to facilitate self and peer assessment. The participants were 24 second-grade students at a vocational high school in Bandung, divided into two groups. The first 12 students were involved in the validity test of the developed assessment, while the remaining 12 students participated in the reliability test. Content validity was evaluated based on expert judgment. The results show that the developed performance assessment instrument was categorized as valid on each task, with reliability classified as very good. Analysis of the impact of implementing self and peer assessment showed that the peer instrument supported the self assessment.
Reliability and Validity of the Professional Counseling Performance Evaluation

ERIC Educational Resources Information Center

Shepherd, J. Brad; Britton, Paula J.; Kress, Victoria E.

2008-01-01

The definition and measurement of counsellor trainee competency is an issue that has received increased attention yet lacks quantitative study. This research evaluates item responses, scale reliability and intercorrelations, interrater agreement, and criterion-related validity of the Professional Performance Fitness Evaluation/Professional…
Are you interested? A meta-analysis of relations between vocational interests and employee performance and turnover.

PubMed

Van Iddekinge, Chad H; Roth, Philip L; Putka, Dan J; Lanivich, Stephen E

2011-11-01

A common belief among researchers is that vocational interests have limited value for personnel selection. However, no comprehensive quantitative summaries of interests validity research have been conducted to substantiate claims for or against the use of interests. To help address this gap, we conducted a meta-analysis of relations between interests and employee performance and turnover using data from 74 studies and 141 independent samples. Overall validity estimates (corrected for measurement error in the criterion but not for range restriction) for single interest scales were .14 for job performance, .26 for training performance, -.19 for turnover intentions, and -.15 for actual turnover. Several factors appeared to moderate interest-criterion relations. For example, validity estimates were larger when interests were theoretically relevant to the work performed in the target job. The type of interest scale also moderated validity, such that corrected validities were larger for scales designed to assess interests relevant to a particular job or vocation (e.g., .23 for job performance) than for scales designed to assess a single, job-relevant realistic, investigative, artistic, social, enterprising, or conventional (i.e., RIASEC) interest (.10) or a basic interest (.11). Finally, validity estimates were largest when studies used multiple interests for prediction, either by using a single job or vocation focused scale (which tend to tap multiple interests) or by using a regression-weighted composite of several RIASEC or basic interest scales. Overall, the results suggest that vocational interests may hold more promise for predicting employee performance and turnover than researchers may have thought. (c) 2011 APA, all rights reserved.
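The correction for measurement error in the criterion mentioned above can be sketched as follows. This is a minimal illustration assuming the classical disattenuation formula (observed correlation divided by the square root of the criterion reliability); the abstract does not state which formula the authors used, and the function name and input numbers are hypothetical.

```python
import math


def correct_for_criterion_unreliability(observed_r: float,
                                        criterion_reliability: float) -> float:
    """Disattenuate a validity coefficient for unreliability in the criterion only.

    Assumes the classical formula r_corrected = r_observed / sqrt(r_yy).
    No correction for range restriction is applied.
    """
    if not 0.0 < criterion_reliability <= 1.0:
        raise ValueError("criterion_reliability must be in (0, 1]")
    return observed_r / math.sqrt(criterion_reliability)


# Hypothetical inputs, not taken from the meta-analysis.
corrected = correct_for_criterion_unreliability(0.10, 0.64)
print(f"corrected validity: {corrected:.3f}")
```

Because the divisor is at most 1, the corrected estimate is never smaller in magnitude than the observed one.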
The construct and criterion validity of the multi-source feedback process to assess physician performance: a meta-analysis

PubMed Central

Al Ansari, Ahmed; Donnon, Tyrone; Al Khalifa, Khalid; Darwish, Abdulla; Violato, Claudio

2014-01-01

Background: The purpose of this study was to conduct a meta-analysis on the construct and criterion validity of multi-source feedback (MSF) to assess physicians and surgeons in practice. Methods: In this study, we followed the guidelines for the reporting of observational studies included in a meta-analysis. In addition to PubMed and MEDLINE databases, the CINAHL, EMBASE, and PsycINFO databases were searched from January 1975 to November 2012. All articles listed in the references of the MSF studies were reviewed to ensure that all relevant publications were identified. All 35 articles were independently coded by two authors (AA, TD), and any discrepancies (eg, effect size calculations) were reviewed by the other authors (KA, AD, CV). Results: Physician/surgeon performance measures from 35 studies were identified. A random-effects model of weighted mean effect size differences (d) resulted in: construct validity coefficients for the MSF system on physician/surgeon performance across different levels in practice ranged from d=0.14 (95% confidence interval [CI] 0.40–0.69) to d=1.78 (95% CI 1.20–2.30); construct validity coefficients for the MSF on physician/surgeon performance on two different occasions ranged from d=0.23 (95% CI 0.13–0.33) to d=0.90 (95% CI 0.74–1.10); concurrent validity coefficients for the MSF based on differences in assessor group ratings ranged from d=0.50 (95% CI 0.47–0.52) to d=0.57 (95% CI 0.55–0.60); and predictive validity coefficients for the MSF on physician/surgeon performance across different standardized measures ranged from d=1.28 (95% CI 1.16–1.41) to d=1.43 (95% CI 0.87–2.00). Conclusion: The construct and criterion validity of the MSF system is supported by small to large effect size differences based on the MSF process and physician/surgeon performance across different clinical and nonclinical domain measures. PMID:24600300
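A random-effects pooling of effect sizes of the kind described above might be sketched as follows. The abstract does not say which estimator was used; this sketch assumes the common DerSimonian-Laird estimator of between-study variance, and the function name and all inputs are hypothetical.

```python
import math


def pool_random_effects(effects, variances):
    """Pool per-study effect sizes (d) with a DerSimonian-Laird random-effects model.

    Returns (pooled effect, 95% confidence interval, tau^2).
    """
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need at least two studies with matching variances")
    # Fixed-effect (inverse-variance) weights and mean.
    weights = [1.0 / v for v in variances]
    total_w = sum(weights)
    fixed_mean = sum(w * d for w, d in zip(weights, effects)) / total_w
    # Cochran's Q and the method-of-moments estimate of between-study variance.
    q = sum(w * (d - fixed_mean) ** 2 for w, d in zip(weights, effects))
    df = len(effects) - 1
    c = total_w - sum(w * w for w in weights) / total_w
    tau2 = max(0.0, (q - df) / c)
    # Random-effects weights add tau^2 to each within-study variance.
    re_weights = [1.0 / (v + tau2) for v in variances]
    pooled = sum(w * d for w, d in zip(re_weights, effects)) / sum(re_weights)
    se = math.sqrt(1.0 / sum(re_weights))
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se), tau2


# Hypothetical study-level inputs, not taken from the meta-analysis.
pooled, ci, tau2 = pool_random_effects([0.3, 0.6, 0.9], [0.02, 0.03, 0.05])
print(f"d = {pooled:.2f}, 95% CI {ci[0]:.2f} to {ci[1]:.2f}, tau^2 = {tau2:.3f}")
```

When the studies agree closely, the between-study variance is truncated at zero and the result coincides with the fixed-effect estimate.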
Validation of a new measure of availability and accommodation of health care that is valid for rural and urban contexts.

PubMed

Haggerty, Jeannie L; Levesque, Jean-Frédéric

2017-04-01

Patients are the most valid source for evaluating the accessibility of services, but a previous study observed differential psychometric performance of instruments in rural and urban respondents. To validate a measure of organizational accessibility free of differential rural-urban performance that predicts consequences of difficult access for patient-initiated care. Sequential qualitative-quantitative study. Qualitative findings used to adapt or develop evaluative and reporting items. Quantitative validation study. Primary data by telephone from 750 urban, rural and remote respondents in Quebec, Canada; follow-up mailed questionnaire to a subset of 316. Items were developed for barriers along the care trajectory. We used common factor and confirmatory factor analysis to identify constructs and compare models. We used item response theory analysis to test for differential rural-urban performance; examine individual item performance; adjust response options; and exclude redundant or non-discriminatory items. We used logistic regression to examine predictive validity of the subscale on access difficulty (outcome). Initial factor resolution suggested geographic and organizational dimensions, plus consequences of access difficulty. After second administration, organizational accommodation and geographic indicators were integrated into a 6-item subscale of Effective Availability and Accommodation, which demonstrates good variability and internal consistency (α = 0.84) and no differential functioning by geographic area. Each unit increase predicts decreased likelihood of consequences of access difficulties (unmet need and problem aggravation). The new subscale is a practical, valid and reliable measure for patients to evaluate first-contact health services accessibility, yielding valid comparisons between urban and rural contexts. © 2016 The Authors. Health Expectations published by John Wiley & Sons Ltd.
Implementing the Science Assessment Standards: Developing and validating a set of laboratory assessment tasks in high school biology

NASA Astrophysics Data System (ADS)

Saha, Gouranga Chandra

Very often a number of factors, especially time, space and money, deter many science educators from using inquiry-based, hands-on, laboratory practical tasks as alternative assessment instruments in science. A shortage of valid inquiry-based laboratory tasks for high school biology has been cited. Driven by this need, this study addressed the following three research questions: (1) How can laboratory-based performance tasks be designed and developed that are doable by students for whom they are designed/written? (2) Do student responses to the laboratory-based performance tasks validly represent at least some of the intended process skills that new biology learning goals want students to acquire? (3) Are the laboratory-based performance tasks psychometrically consistent as individual tasks and as a set? To answer these questions, three tasks were used from the six biology tasks initially designed and developed by an iterative process of trial testing. Analyses of data from 224 students showed that performance-based laboratory tasks that are doable by all students require a careful and iterative process of development. Although the students demonstrated more skill in performing than in planning and reasoning, their performance at the item level was very poor for some items. Possible reasons for the poor performance have been discussed and suggestions on how to remediate the deficiencies have been made. Empirical evidence for the validity and reliability of the instrument has been presented from the point of view of both classical and modern validity criteria. Limitations of the study have been identified. Finally, implications of the study and directions for further research have been discussed.
Predictive Variables of Half-Marathon Performance for Male Runners

PubMed Central

Gómez-Molina, Josué; Ogueta-Alday, Ana; Camara, Jesus; Stickley, Christoper; Rodríguez-Marroyo, José A.; García-López, Juan

2017-01-01

The aims of this study were to establish and validate various predictive equations of half-marathon performance. Seventy-eight half-marathon male runners participated in two different phases. Phase 1 (n = 48) was used to establish the equations for estimating half-marathon performance, and Phase 2 (n = 30) to validate these equations. Apart from half-marathon performance, training-related and anthropometric variables were recorded, and an incremental test on a treadmill was performed, in which physiological (VO2max, speed at the anaerobic threshold, peak speed) and biomechanical variables (contact and flight times, step length and step rate) were registered. In Phase 1, half-marathon performance could be predicted to 90.3% by variables related to training and anthropometry (Equation 1), 94.9% by physiological variables (Equation 2), 93.7% by biomechanical parameters (Equation 3) and 96.2% by a general equation (Equation 4). Using these equations, in Phase 2 the predicted time was significantly correlated with performance (r = 0.78, 0.92, 0.90 and 0.95, respectively). The proposed equations and their validation showed a high prediction of half-marathon performance in long distance male runners, considered from different approaches. Furthermore, they improved the prediction performance of previous studies, which makes them a highly practical application in the field of training and performance. Key points: The present study obtained four equations involving anthropometric, training, physiological and biomechanical variables to estimate half-marathon performance. These equations were validated in a different population, demonstrating narrower ranges of prediction than previous studies and also their consistency. As a novelty, some biomechanical variables (i.e. step length and step rate at RCT, and maximal step length) have been related to half-marathon performance. PMID:28630571
Beware of external validation! - A Comparative Study of Several Validation Techniques used in QSAR Modelling.

PubMed

Majumdar, Subhabrata; Basak, Subhash C

2018-04-26

Proper validation is an important aspect of QSAR modelling. External validation is one of the widely used validation methods in QSAR where the model is built on a subset of the data and validated on the rest of the samples. However, its effectiveness for datasets with a small number of samples but large number of predictors remains suspect. Calculating hundreds or thousands of molecular descriptors using currently available software has become the norm in QSAR research, owing to computational advances in the past few decades. Thus, for n chemical compounds and p descriptors calculated for each molecule, the typical chemometric dataset today has high value of p but small n (i.e. n < p). Motivated by the evidence of inadequacies of external validation in estimating the true predictive capability of a statistical model in recent literature, this paper performs an extensive and comparative study of this method with several other validation techniques. We compared four validation methods: leave-one-out, K-fold, external and multi-split validation, using statistical models built using the LASSO regression, which simultaneously performs variable selection and modelling. We used 300 simulated datasets and one real dataset of 95 congeneric amine mutagens for this evaluation. External validation metrics have high variation among different random splits of the data, hence are not recommended for predictive QSAR models. LOO has the overall best performance among all validation methods applied in our scenario. Results from external validation are too unstable for the datasets we analyzed. Based on our findings, we recommend using the LOO procedure for validating QSAR predictive models built on high-dimensional small-sample data. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.
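The instability of a single random train/test split relative to leave-one-out (LOO) can be illustrated with a small sketch. It is not the authors' setup: it swaps the LASSO for one-predictor ordinary least squares so that it runs on the standard library alone, and the data, seed, and function names are hypothetical.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mse(model, xs, ys):
    """Mean squared prediction error of a fitted line on (xs, ys)."""
    slope, intercept = model
    return statistics.fmean([(y - (slope * x + intercept)) ** 2
                             for x, y in zip(xs, ys)])


def loo_mse(xs, ys):
    """Leave-one-out: refit without each sample, then predict that sample."""
    errors = []
    for i in range(len(xs)):
        model = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        errors.append(mse(model, [xs[i]], [ys[i]]))
    return statistics.fmean(errors)


def split_mse(xs, ys, test_fraction, rng):
    """External validation: fit on a random subset, test on the held-out rest."""
    indices = list(range(len(xs)))
    rng.shuffle(indices)
    n_test = max(1, round(len(xs) * test_fraction))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    model = fit_line([xs[i] for i in train_idx], [ys[i] for i in train_idx])
    return mse(model, [xs[i] for i in test_idx], [ys[i] for i in test_idx])


# Synthetic small-sample data (hypothetical; not the QSAR datasets).
rng = random.Random(0)
xs = [rng.gauss(0, 1) for _ in range(30)]
ys = [2 * x + rng.gauss(0, 1) for x in xs]

loo_estimate = loo_mse(xs, ys)
split_estimates = [split_mse(xs, ys, 0.3, rng) for _ in range(50)]
print("LOO estimate:", loo_estimate)
print("spread of 50 random-split estimates:", statistics.stdev(split_estimates))
```

LOO yields one deterministic estimate for a given dataset, whereas each random split gives a different number, which is the variation across splits that the abstract reports.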
VALUE - A Framework to Validate Downscaling Approaches for Climate Change Studies

NASA Astrophysics Data System (ADS)

Maraun, Douglas; Widmann, Martin; Gutiérrez, José M.; Kotlarski, Sven; Chandler, Richard E.; Hertig, Elke; Wibig, Joanna; Huth, Radan; Wilcke, Renate A. I.

2015-04-01

VALUE is an open European network to validate and compare downscaling methods for climate change research. VALUE aims to foster collaboration and knowledge exchange between climatologists, impact modellers, statisticians, and stakeholders to establish an interdisciplinary downscaling community. A key deliverable of VALUE is the development of a systematic validation framework to enable the assessment and comparison of both dynamical and statistical downscaling methods. Here, we present the key ingredients of this framework. VALUE's main approach to validation is user-focused: starting from a specific user problem, a validation tree guides the selection of relevant validation indices and performance measures. Several experiments have been designed to isolate specific points in the downscaling procedure where problems may occur: what is the isolated downscaling skill? How do statistical and dynamical methods compare? How do methods perform at different spatial scales? Do methods fail in representing regional climate change? How is the overall representation of regional climate, including errors inherited from global climate models? The framework will be the basis for a comprehensive community-open downscaling intercomparison study, but is intended also to provide general guidance for other validation studies.
VALUE: A framework to validate downscaling approaches for climate change studies

NASA Astrophysics Data System (ADS)

Maraun, Douglas; Widmann, Martin; Gutiérrez, José M.; Kotlarski, Sven; Chandler, Richard E.; Hertig, Elke; Wibig, Joanna; Huth, Radan; Wilcke, Renate A. I.

2015-01-01

VALUE is an open European network to validate and compare downscaling methods for climate change research. VALUE aims to foster collaboration and knowledge exchange between climatologists, impact modellers, statisticians, and stakeholders to establish an interdisciplinary downscaling community. A key deliverable of VALUE is the development of a systematic validation framework to enable the assessment and comparison of both dynamical and statistical downscaling methods. In this paper, we present the key ingredients of this framework. VALUE's main approach to validation is user-focused: starting from a specific user problem, a validation tree guides the selection of relevant validation indices and performance measures. Several experiments have been designed to isolate specific points in the downscaling procedure where problems may occur: what is the isolated downscaling skill? How do statistical and dynamical methods compare? How do methods perform at different spatial scales? Do methods fail in representing regional climate change? How is the overall representation of regional climate, including errors inherited from global climate models? The framework will be the basis for a comprehensive community-open downscaling intercomparison study, but is intended also to provide general guidance for other validation studies.
RBANS Validity Indices: a Systematic Review and Meta-Analysis.

PubMed

Shura, Robert D; Brearly, Timothy W; Rowland, Jared A; Martindale, Sarah L; Miskey, Holly M; Duff, Kevin

2018-05-16

Neuropsychology practice organizations have highlighted the need for thorough evaluation of performance validity as part of the neuropsychological assessment process. Embedded validity indices are derived from existing measures and expand the scope of validity assessment. The Repeatable Battery for the Assessment of Neuropsychological Status (RBANS) is a brief instrument that quickly allows a clinician to assess a variety of cognitive domains. The RBANS also contains multiple embedded validity indicators. The purpose of this study was to synthesize the utility of those indicators to assess performance validity. A systematic search was completed, resulting in 11 studies for synthesis and 10 for meta-analysis. Data were synthesized on four indices and three subtests across samples of civilians, service members, and veterans. Sufficient data for meta-analysis were only available for the Effort Index, and related analyses indicated optimal cutoff scores of ≥1 (AUC = .86) and ≥3 (AUC = .85). However, outliers and heterogeneity were present, indicating the importance of age and evaluation context. Overall, embedded validity indicators have shown adequate diagnostic accuracy across a variety of populations. Recommendations for interpreting these measures and future studies are provided.
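How an AUC and the sensitivity/specificity at a "≥ cutoff" rule relate to raw index scores can be sketched as below. The sketch assumes that higher scores flag invalid performance (consistent with the "≥" cutoffs above); the function names and the scores are hypothetical and not drawn from the reviewed studies.

```python
def auc(invalid_scores, valid_scores):
    """AUC as the probability that a randomly chosen invalid-performance case
    scores higher than a randomly chosen valid case (ties count one half)."""
    wins = 0.0
    for p in invalid_scores:
        for n in valid_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(invalid_scores) * len(valid_scores))


def sensitivity_specificity(invalid_scores, valid_scores, cutoff):
    """Classify a case as invalid when its score is >= cutoff."""
    sensitivity = sum(s >= cutoff for s in invalid_scores) / len(invalid_scores)
    specificity = sum(s < cutoff for s in valid_scores) / len(valid_scores)
    return sensitivity, specificity


# Hypothetical index scores for two small groups.
invalid_group = [0, 1, 3, 4, 6]
valid_group = [0, 0, 0, 1, 2]

print("AUC:", auc(invalid_group, valid_group))
for cutoff in (1, 3):
    print("cutoff >=", cutoff,
          sensitivity_specificity(invalid_group, valid_group, cutoff))
```

Raising the cutoff trades sensitivity for specificity, which is why a review may report more than one candidate cutoff.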

A systematic review of reliability and objective criterion-related validity of physical activity questionnaires.

PubMed

Helmerhorst, Hendrik J F; Brage, Søren; Warren, Janet; Besson, Herve; Ekelund, Ulf

2012-08-31

Physical inactivity is one of the four leading risk factors for global mortality. Accurate measurement of physical activity (PA) and in particular by physical activity questionnaires (PAQs) remains a challenge. The aim of this paper is to provide an updated systematic review of the reliability and validity characteristics of existing and more recently developed PAQs and to quantitatively compare the performance between existing and newly developed PAQs. A literature search of electronic databases was performed for studies assessing reliability and validity data of PAQs using an objective criterion measurement of PA between January 1997 and December 2011. Articles meeting the inclusion criteria were screened and data were extracted to provide a systematic overview of measurement properties. Due to differences in reported outcomes and criterion methods a quantitative meta-analysis was not possible. In total, 31 studies testing 34 newly developed PAQs, and 65 studies examining 96 existing PAQs were included. Very few PAQs showed good results on both reliability and validity. Median reliability correlation coefficients were 0.62-0.71 for existing, and 0.74-0.76 for new PAQs. Median validity coefficients ranged from 0.30-0.39 for existing, and from 0.25-0.41 for new PAQs. Although the majority of PAQs appear to have acceptable reliability, the validity is moderate at best. Newly developed PAQs do not appear to perform substantially better than existing PAQs in terms of reliability and validity. Future PAQ studies should include measures of absolute validity and the error structure of the instrument.
A systematic review of reliability and objective criterion-related validity of physical activity questionnaires

PubMed Central

2012-01-01

Physical inactivity is one of the four leading risk factors for global mortality. Accurate measurement of physical activity (PA) and in particular by physical activity questionnaires (PAQs) remains a challenge. The aim of this paper is to provide an updated systematic review of the reliability and validity characteristics of existing and more recently developed PAQs and to quantitatively compare the performance between existing and newly developed PAQs. A literature search of electronic databases was performed for studies assessing reliability and validity data of PAQs using an objective criterion measurement of PA between January 1997 and December 2011. Articles meeting the inclusion criteria were screened and data were extracted to provide a systematic overview of measurement properties. Due to differences in reported outcomes and criterion methods a quantitative meta-analysis was not possible. In total, 31 studies testing 34 newly developed PAQs, and 65 studies examining 96 existing PAQs were included. Very few PAQs showed good results on both reliability and validity. Median reliability correlation coefficients were 0.62–0.71 for existing, and 0.74–0.76 for new PAQs. Median validity coefficients ranged from 0.30–0.39 for existing, and from 0.25–0.41 for new PAQs. Although the majority of PAQs appear to have acceptable reliability, the validity is moderate at best. Newly developed PAQs do not appear to perform substantially better than existing PAQs in terms of reliability and validity. Future PAQ studies should include measures of absolute validity and the error structure of the instrument. PMID:22938557
Examining the validity of AHRQ's patient safety indicators (PSIs): is variation in PSI composite score related to hospital organizational factors?

PubMed

Shin, Marlena H; Sullivan, Jennifer L; Rosen, Amy K; Solomon, Jeffrey L; Dunn, Edward J; Shimada, Stephanie L; Hayes, Jennifer; Rivard, Peter E

2014-12-01

Increasing use of Agency for Healthcare Research and Quality's Patient Safety Indicators (PSIs) for hospital performance measurement intensifies the need to critically assess their validity. Our study examined the extent to which variation in PSI composite score is related to differences in hospital organizational structures or processes (i.e., criterion validity). In site visits to three Veterans Health Administration hospitals with high and three with low PSI composite scores ("low performers" and "high performers," respectively), we interviewed a cross-section of hospital staff. We then coded interview transcripts for evidence in 13 safety-related domains and assessed variation across high and low performers. Evidence of leadership and coordination of work/communication (organizational process domains) was predominantly favorable for high performers only. Evidence in the other domains was either mixed, or there were insufficient data to rate the domains. While we found some evidence of criterion validity, the extent to which variation in PSI rates is related to differences in hospitals' organizational structures/processes needs further study. © The Author(s) 2014.
Performing a Content Validation Study.

ERIC Educational Resources Information Center

Spool, Mark D.

Content validity is concerned with three components: (1) the job content; (2) the test content, and (3) the strength of the relationship between the two. A content validation study, to be considered adequate and defensible should include at least the following four procedures: (1) A thorough and accurate job analysis (to define the job content);…
A new scale for the assessment of performance and capacity of hand function in children with hemiplegic cerebral palsy: reliability and validity studies.

PubMed

Rosa-Rizzotto, M; Visonà Dalla Pozza, L; Corlatti, A; Luparia, A; Marchi, A; Molteni, F; Facchin, P; Pagliano, E; Fedrizzi, E

2014-10-01

In hemiplegic children, recognizing the activity limitation pattern and being able to grade its severity are relevant for clinicians when planning interventions, monitoring results, and predicting outcomes. The aim of the study is to examine the reliability and validity of the Besta Scale, an instrument used in hemiplegic children from 18 months to 12 years of age to measure both grasp on request (capacity) and spontaneous use of the upper limb (performance) in bimanual play activities and in ADL. A psychometric analysis of the reliability and validity of the Besta scale was performed. Outpatient study sample. Reliability study: a sample of 39 patients was enrolled. The administration of the Besta scale was video-recorded in a standardized manner. All videos were scored by 20 independent raters on subsequent viewing. Three raters randomly selected from the 20-rater group rescored the same video two years later for intra-rater reliability. Intra- and inter-rater reliability were calculated using the Intraclass Correlation Coefficient (ICC) and Kendall's coefficient (K), respectively. Internal consistency reliability was assessed using Cronbach's alpha coefficient. Validity study: a sample of 105 children was assessed 5 times (at t0 and 2, 3, 6 and 12 months later) by 20 independent raters. Each patient underwent QUEST and Besta scale administration and assessment at the same time. Criterion validity was calculated using Pearson's correlation coefficient. Reliability study: inter-rater reliability calculated with Kendall's coefficient was moderate (K=0.47). Intra-rater (or test-retest) reliability for 3 raters was excellent (ICC=0.927). Cronbach's alpha for internal consistency was 0.972. Validity study: the Besta scale showed good criterion validity compared to QUEST, increasing with age and severity of impairment. Pearson's correlation coefficient r was 0.81 (P<0.0001). Limitations: in infants, the Besta scale finds it hard to distinguish between mild and moderately impaired hand function. The Besta scale scoring system is a valid and reliable tool, usable in a clinical setting to monitor the evolution of unimanual and bimanual manipulation and to distinguish the hand's capacity from performance.
Personality and job performance: the Big Five revisited.

PubMed

Hurtz, G M; Donovan, J J

2000-12-01

Prior meta-analyses investigating the relation between the Big 5 personality dimensions and job performance have all contained a threat to construct validity, in that much of the data included within these analyses was not derived from actual Big 5 measures. In addition, these reviews did not address the relations between the Big 5 and contextual performance. Therefore, the present study sought to provide a meta-analytic estimate of the criterion-related validity of explicit Big 5 measures for predicting job performance and contextual performance. The results for job performance closely paralleled 2 of the previous meta-analyses, whereas analyses with contextual performance showed more complex relations among the Big 5 and performance. A more critical interpretation of the Big 5-performance relationship is presented, and suggestions for future research aimed at enhancing the validity of personality predictors are provided.
Validation of alternative methods for toxicity testing.

PubMed Central

Bruner, L H; Carr, G J; Curren, R D; Chamberlain, M

1998-01-01

Before nonanimal toxicity tests may be officially accepted by regulatory agencies, it is generally agreed that the validity of the new methods must be demonstrated in an independent, scientifically sound validation program. Validation has been defined as the demonstration of the reliability and relevance of a test method for a particular purpose. This paper provides a brief review of the development of the theoretical aspects of the validation process and updates current thinking about objectively testing the performance of an alternative method in a validation study. Validation of alternative methods for eye irritation testing is a specific example illustrating important concepts. Although discussion focuses on the validation of alternative methods intended to replace current in vivo toxicity tests, the procedures can be used to assess the performance of alternative methods intended for other uses. Images Figure 1 PMID:9599695
Linguistic validation and reliability properties are weakly investigated of most dementia-specific quality of life measurements-a systematic review.

PubMed

Dichter, Martin Nikolaus; Schwab, Christian G G; Meyer, Gabriele; Bartholomeyczik, Sabine; Halek, Margareta

2016-02-01

For people with dementia, the concept of quality of life (Qol) reflects the disease's impact on the whole person. Thus, Qol is an increasingly used outcome measure in dementia research. This systematic review was performed to identify available dementia-specific Qol measurements and to assess the quality of linguistic validations and reliability studies of these measurements (PROSPERO 2013: CRD42014008725). The MEDLINE, CINAHL, EMBASE, PsycINFO, and Cochrane Methodology Register databases were systematically searched without any date restrictions. Forward and backward citation tracking were performed on the basis of selected articles. A total of 70 articles addressing 19 dementia-specific Qol measurements were identified; nine measurements were adapted to nonorigin countries. The quality of the linguistic validations varied from insufficient to good. Internal consistency was the most frequently tested reliability property. Most of the reliability studies lacked internal validity. Qol measurements for dementia are insufficiently linguistically validated and not well tested for reliability. None of the identified measurements can be recommended without further research. The application of international guidelines and quality criteria is strongly recommended for the performance of linguistic validations and reliability studies of dementia-specific Qol measurements. Copyright © 2016 Elsevier Inc. All rights reserved.
A Model of Physical Performance for Occupational Tasks.

ERIC Educational Resources Information Center

Hogan, Joyce

This report acknowledges the problems faced by industrial/organizational psychologists who must make personnel decisions involving physically demanding jobs. The scarcity of criterion-related validation studies and the difficulty of generalizing validity are considered, and a model of physical performance that builds on Fleishman's (1984)…
Diagnostic Validity of Wechsler Subtest Scatter

ERIC Educational Resources Information Center

Watkins, Marley W.

2005-01-01

Cognitive subtest scatter has often been considered to be diagnostically significant. The current study tested the diagnostic validity of four separate operationalizations of WISC-III subtest scatter: (a) range of verbal, performance, and full-scale subtests; (b) variance of verbal, performance, and full-scale subtests; (c) number of subtests…
Derivation of a Performance Checklist for Ultrasound-Guided Arthrocentesis Using the Modified Delphi Method.

PubMed

Kunz, Derek; Pariyadath, Manoj; Wittler, Mary; Askew, Kim; Manthey, David; Hartman, Nicholas

2017-06-01

Arthrocentesis is an important skill for physicians in multiple specialties. Recent studies indicate a superior safety and performance profile for this procedure using ultrasound guidance for needle placement, and improving quality of care requires a valid measurement of competency using this modality. We endeavored to create a validated tool to assess the performance of this procedure using the modified Delphi technique and experts in multiple disciplines across the United States. We derived a 22-item checklist designed to assess competency for the completion of ultrasound-guided arthrocentesis, which demonstrated a Cronbach's alpha of 0.89, indicating an excellent degree of internal consistency. Although we were able to demonstrate content validity for this tool, further validity evidence should be acquired after the tool is used and studied in clinical and simulated contexts. © 2017 by the American Institute of Ultrasound in Medicine.
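The internal-consistency statistic reported above, Cronbach's alpha, can be sketched as follows. This is a minimal illustration of the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), and not the authors' analysis; the function name and the example ratings are hypothetical.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a table of scores.

    scores: list of rows, one row per rated performance,
            each row holding the k item scores (hypothetical layout).
    """
    k = len(scores[0])
    # Sample variance of each item (column) across all rows.
    item_vars = [variance(col) for col in zip(*scores)]
    # Sample variance of the per-row total score.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical ratings: 4 performances scored on 3 checklist items.
example = [[1, 1, 0], [1, 0, 0], [1, 1, 1], [0, 0, 0]]
print(round(cronbach_alpha(example), 3))
```

Values near 1 indicate that the items vary together; the 0.89 reported for the 22-item checklist falls in the range usually described as good to excellent.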
The importance of assessing for validity of symptom report and performance in attention deficit/hyperactivity disorder (ADHD): Introduction to the special section on noncredible presentation in ADHD.

PubMed

Suhr, Julie A; Berry, David T R

2017-12-01

Invalid self-report and invalid performance occur with high base rates in attention deficit/hyperactivity disorder (ADHD; Harrison, 2006; Musso & Gouvier, 2014). Although much research has focused on the development and validation of symptom validity tests (SVTs) and performance validity tests (PVTs) for psychiatric and neurological presentations, less attention has been given to the use of SVTs and PVTs in ADHD evaluation. This introduction to the special section describes a series of studies examining the use of SVTs and PVTs in adult ADHD evaluation. We present the series of studies in the context of prior research on noncredible presentation and call for future research using improved research methods and with a focus on assessment issues specific to ADHD evaluation. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Development and validation of a Clinical Assessment Tool for Nursing Education (CAT-NE).

PubMed

Skúladóttir, Hafdís; Svavarsdóttir, Margrét Hrönn

2016-09-01

The aim of this study was to develop a valid assessment tool to guide clinical education and evaluate students' performance in clinical nursing education. The development of the Clinical Assessment Tool for Nursing Education (CAT-NE) was based on the theory of nursing as professional caring and the Bologna learning outcomes. Benson and Clark's four steps of instrument development and validation guided the development and assessment of the tool. A mixed-methods approach with individual structured cognitive interviewing and quantitative assessments was used to validate the tool. Supervisory teachers, a pedagogical consultant, clinical expert teachers, clinical teachers, and nursing students at the University of Akureyri in Iceland participated in the process. This assessment tool is valid to assess the clinical performance of nursing students; it consists of rubrics that list the criteria for the students' expected performance. According to the students and their clinical teachers, the assessment tool clarified learning objectives, enhanced the focus of the assessment process, and made evaluation more objective. Training clinical teachers on how to assess students' performances in clinical studies and use the tool enhanced the quality of clinical assessment in nursing education. Copyright © 2016 Elsevier Ltd. All rights reserved.
Evaluating the accuracy of the Wechsler Memory Scale-Fourth Edition (WMS-IV) logical memory embedded validity index for detecting invalid test performance.

PubMed

Soble, Jason R; Bain, Kathleen M; Bailey, K Chase; Kirton, Joshua W; Marceaux, Janice C; Critchfield, Edan A; McCoy, Karin J M; O'Rourke, Justin J F

2018-01-08

Embedded performance validity tests (PVTs) allow for continuous assessment of invalid performance throughout neuropsychological test batteries. This study evaluated the utility of the Wechsler Memory Scale-Fourth Edition (WMS-IV) Logical Memory (LM) Recognition score as an embedded PVT using the Advanced Clinical Solutions (ACS) for WAIS-IV/WMS-IV Effort System. This mixed clinical sample was comprised of 97 total participants, 71 of whom were classified as valid and 26 as invalid based on three well-validated, freestanding criterion PVTs. Overall, the LM embedded PVT demonstrated poor concordance with the criterion PVTs and unacceptable psychometric properties using ACS validity base rates (42% sensitivity/79% specificity). Moreover, 15-39% of participants obtained an invalid ACS base rate despite having a normatively-intact age-corrected LM Recognition total score. Receiver operating characteristic curve analysis revealed a Recognition total score cutoff of < 61% correct improved specificity (92%) while sensitivity remained weak (31%). Thus, results indicated the LM Recognition embedded PVT is not appropriate for use from an evidence-based perspective, and that clinicians may be faced with reconciling how a normatively intact cognitive performance on the Recognition subtest could simultaneously reflect invalid performance validity.
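Sensitivity and specificity as used above can be computed from a 2x2 classification table. The sketch below is illustrative only: the abstract does not report the cell counts, so the counts used here are hypothetical, chosen merely to be consistent with the reported group sizes (26 invalid, 71 valid) and the rounded percentages.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from 2x2 counts.

    tp: criterion-invalid cases flagged by the embedded PVT
    fn: criterion-invalid cases the embedded PVT missed
    tn: criterion-valid cases the embedded PVT passed
    fp: criterion-valid cases the embedded PVT flagged
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical counts (NOT from the study): 26 invalid, 71 valid.
sens, spec = sensitivity_specificity(tp=11, fn=15, tn=56, fp=15)
print(f"sensitivity={sens:.0%}, specificity={spec:.0%}")
```

Raising the cutoff's stringency trades one quantity for the other, which is the pattern the abstract describes for the < 61% cutoff (higher specificity, lower sensitivity).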
Development of Internet-Based Tasks for the Executive Function Performance Test.

PubMed

Rand, Debbie; Lee Ben-Haim, Keren; Malka, Rachel; Portnoy, Sigal

The Executive Function Performance Test (EFPT) is a reliable and valid performance-based tool to assess executive functions (EFs). This study's objective was to develop and verify two Internet-based tasks for the EFPT. A cross-sectional study assessed the alternate-form reliability of the Internet-based bill-paying and telephone-use tasks in healthy adults and people with subacute stroke (Study 1). It also sought to establish the tasks' criterion reliability for assessing EF deficits by correlating performance with that on the Trail Making Test in five groups: healthy young adults, healthy older adults, people with subacute stroke, people with chronic stroke, and young adults with attention deficit hyperactivity disorder (Study 2). The alternative-form reliability and initial construct validity for the Internet-based bill-paying task were verified. Criterion validity was established for both tasks. The Internet-based tasks are comparable to the original EFPT tasks and can be used for assessment of EF deficits. Copyright © 2018 by the American Occupational Therapy Association, Inc.
Incremental Validity of the New MCAT.

ERIC Educational Resources Information Center

Friedman, Charles P.; Bakewell, William E., Jr.

1980-01-01

The ability of the new Medical College Admission Test (MCAT) to predict performance of first-year medical students at the University of North Carolina was studied. Its incremental validity, determined by computing the additional variance in performance explainable by the MCAT after the effects of other admissions variables were taken into account,…
Ecological Development and Validation of a Music Performance Rating Scale for Five Instrument Families

ERIC Educational Resources Information Center

Wrigley, William J.; Emmerson, Stephen B.

2013-01-01

This study investigated ways to improve the quality of music performance evaluation in an effort to address the accountability imperative in tertiary music education. An enhanced scientific methodology was employed incorporating ecological validity and using recognized qualitative methods involving grounded theory and quantitative methods…
The Development of a Secondary-Level Solo Wind Instrument Performance Rubric Using the Multifaceted Rasch Partial Credit Measurement Model

ERIC Educational Resources Information Center

Wesolowski, Brian C.; Amend, Ross M.; Barnstead, Thomas S.; Edwards, Andrew S.; Everhart, Matthew; Goins, Quentin R.; Grogan, Robert J., III; Herceg, Amanda M.; Jenkins, S. Ira; Johns, Paul M.; McCarver, Christopher J.; Schaps, Robin E.; Sorrell, Gary W.; Williams, Jonathan D.

2017-01-01

The purpose of this study was to describe the development of a valid and reliable rubric to assess secondary-level solo instrumental music performance based on principles of invariant measurement. The research questions that guided this study included (1) What is the psychometric quality (i.e., validity, reliability, and precision) of a scale…
Quantifying the foodscape: A systematic review and meta-analysis of the validity of commercially available business data

PubMed Central

Lebel, Alexandre; Daepp, Madeleine I. G.; Block, Jason P.; Walker, Renée; Lalonde, Benoît; Kestens, Yan; Subramanian, S. V.

2017-01-01

This paper reviews studies of the validity of commercially available business (CAB) data on food establishments (“the foodscape”), offering a meta-analysis of characteristics associated with CAB quality and a case study evaluating the performance of commonly-used validity indicators describing the foodscape. Existing validation studies report a broad range in CAB data quality, although most studies conclude that CAB quality is “moderate” to “substantial”. We conclude that current studies may underestimate the quality of CAB data. We recommend that future validation studies use density-adjusted and exposure measures to offer a more meaningful characterization of the relationship of data error with spatial exposure. PMID:28358819

Teamwork Assessment Tools in Obstetric Emergencies: A Systematic Review.

PubMed

Onwochei, Desire N; Halpern, Stephen; Balki, Mrinalini

2017-06-01

Team-based training and simulation can improve patient safety, by improving communication, decision making, and performance of team members. Currently, there is no general consensus on whether or not a specific assessment tool is better adapted to evaluate teamwork in obstetric emergencies. The purpose of this qualitative systematic review was to find the tools available to assess team effectiveness in obstetric emergencies. We searched Embase, Medline, PubMed, Web of Science, PsycINFO, CINAHL, and Google Scholar for prospective studies that evaluated nontechnical skills in multidisciplinary teams involving obstetric emergencies. The search included studies from 1944 until January 11, 2016. Data on reliability and validity measures were collected and used for interpretation. A descriptive analysis was performed on the data. Thirteen studies were included in the final qualitative synthesis. All the studies assessed teams in the context of obstetric simulation scenarios, but only six included anesthetists in the simulations. One study evaluated their teamwork tool using just validity measures, five using just reliability measures, and one used both. The most reliable tools identified were the Clinical Teamwork Scale, the Global Assessment of Obstetric Team Performance, and the Global Rating Scale of performance. However, they were still lacking in terms of quality and validity. More work needs to be conducted to establish the validity of teamwork tools for nontechnical skills, and the development of an ideal tool is warranted. Further studies are required to assess how outcomes, such as performance and patient safety, are influenced when using these tools.
The Minnesota Multiphasic Personality Inventory-2-RF in Treatment-Seeking Veterans with History of Mild Traumatic Brain Injury.

PubMed

Jurick, S M; Crocker, L D; Keller, A V; Hoffman, S N; Bomyea, J; Jacobson, M W; Jak, A J

2018-05-30

This study examined the Minnesota Multiphasic Personality Inventory-Second Edition-Restructured Form (MMPI-2-RF) to better understand symptom presentation in a sample of treatment-seeking Operation Enduring Freedom/Operation Iraqi Freedom (OEF/OIF) Veterans with self-reported history of mild traumatic brain injury (mTBI). Participants underwent a comprehensive clinical neuropsychological battery including performance and symptom validity measures and self-report measures of depressive, posttraumatic, and post-concussive symptomatology. Those with possible symptom exaggeration (SE+) on the MMPI-2-RF were compared with those without (SE-) with regard to injury, psychiatric, validity, and cognitive variables. Between 50% and 87% of participants demonstrated possible symptom exaggeration on one or more MMPI-2-RF validity scales, and a large majority were elevated on content scales related to cognitive, somatic, and emotional complaints. The SE+ group reported higher depressive, posttraumatic, and post-concussive symptomatology, had higher scores on symptom validity measures, and performed more poorly on neuropsychological measures compared with the SE- group. There were no group differences with regard to injury variables or performance validity measures. Participants were more likely to exhibit possible symptom exaggeration on cognitive/somatic compared with traditional psychopathological validity scales. A sizable portion of treatment-seeking OEF/OIF Veterans demonstrated possible symptom exaggeration on MMPI-2-RF validity scales, which was associated with elevated scores on self-report measures and poorer cognitive performance, but not higher rates of performance validity failure, suggesting symptom and performance validity are distinct concepts. These findings have implications for the interpretation of clinical data in the context of possible symptom exaggeration and treatment in Veterans with persistent post-concussive symptoms.
Validity and Reliability of Turkish Male Breast Self-Examination Instrument.

PubMed

Erkin, Özüm; Göl, İlknur

2018-04-01

This study aims to measure the validity and reliability of the Turkish male breast self-examination (MBSE) instrument. The methodological study was performed in 2016 at Ege University, Faculty of Nursing, İzmir, Turkey. The MBSE includes ten steps. For the validity studies, face validity, content validity, and construct validity (exploratory factor analysis) were assessed. For the reliability study, the Kuder-Richardson coefficient was calculated. The content validity index was found to be 0.94. The Kendall W coefficient was 0.80 (p=0.551). The total variance explained by the two factors was found to be 63.24%. Kuder-Richardson 21 was used for the reliability study and found to be 0.97 for the instrument. The final instrument included 10 steps and two stages. The Turkish version of the MBSE is a valid and reliable instrument for early diagnosis. The MBSE can be used in Turkish-speaking countries and cultures with two stages and 10 steps.
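The Kuder-Richardson 21 coefficient mentioned above needs only the number of dichotomous items and the mean and variance of the total scores. A minimal sketch of the usual formula, KR-21 = k/(k-1) * (1 - M(k-M)/(k*s^2)), follows; the function name and example scores are hypothetical, and the use of the population variance is an assumption (some texts use the sample variance).

```python
from statistics import mean, pvariance


def kr21(total_scores, k):
    """KR-21 reliability for k dichotomously scored items.

    total_scores: each respondent's number of correctly performed steps.
    k: number of items (10 steps in the instrument described above).
    """
    m = mean(total_scores)
    var = pvariance(total_scores)  # assumption: population variance
    return k / (k - 1) * (1 - m * (k - m) / (k * var))


# Hypothetical total scores for eight respondents on a 10-step checklist.
print(round(kr21([2, 9, 10, 1, 8, 3, 10, 0], k=10), 3))
```

KR-21 assumes items of roughly equal difficulty, so it is generally a lower bound on the KR-20 value for the same data.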
A novel cuffless device for self-measurement of blood pressure: concept, performance and clinical validation.

PubMed

Boubouchairopoulou, N; Kollias, A; Chiu, B; Chen, B; Lagou, S; Anestis, P; Stergiou, G S

2017-07-01

A pocket-size cuffless electronic device for self-measurement of blood pressure (BP) has been developed (Freescan, Maisense Inc., Zhubei, Taiwan). The device estimates BP within 10 s using three embedded electrodes and one force sensor that is applied over the radial pulse to evaluate the pulse wave. Before use, basic anthropometric characteristics are recorded on the device, and individualized initial calibration is required based on a standard BP measurement performed using an upper-arm BP monitor. The device performance in providing valid BP readings was evaluated in 313 normotensive and hypertensive adults in three study phases during which the device sensor was upgraded. A formal validation study of a prototype device against mercury sphygmomanometer was performed according to the American National Standards Institute/Association for the Advancement of Medical Instrumentation/International Organization for Standardization (ANSI/AAMI/ISO) 2013 protocol. The test device succeeded in obtaining a valid BP measurement (three successful readings within up to five attempts) in 55-72% of the participants, which reached 87% with device sensor upgrade. For the validation study, 125 adults were recruited and 85 met the protocol requirements for inclusion. The mean device-observers BP difference was 3.2±6.7 (s.d.) mm Hg for systolic and 2.6±4.6 mm Hg for diastolic BP (criterion 1). The estimated s.d. (inter-subject variability) were 5.83 and 4.17 mm Hg respectively (criterion 2). These data suggest that this prototype cuffless BP monitor provides valid self-measurements in the vast majority of adults, and satisfies the BP measurement accuracy criteria of the ANSI/AAMI/ISO 2013 validation protocol.
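Criterion 1 in the abstract above compares the mean and standard deviation of the device-minus-observer differences with fixed limits. The sketch below assumes the commonly cited limits of 5 mm Hg for the mean and 8 mm Hg for the standard deviation; these thresholds are not stated in the abstract and should be checked against the standard itself. Criterion 2, which uses a lookup table, is not reproduced. Function names and the paired readings are hypothetical.

```python
from statistics import mean, stdev


def summarize_differences(device, reference):
    """Mean and sample SD of paired device-minus-reference BP readings."""
    diffs = [d - r for d, r in zip(device, reference)]
    return mean(diffs), stdev(diffs)


def meets_criterion1(mean_diff, sd_diff, max_mean=5.0, max_sd=8.0):
    """Assumed criterion 1 limits: |mean| <= 5 mm Hg and SD <= 8 mm Hg."""
    return abs(mean_diff) <= max_mean and sd_diff <= max_sd


# Summary values reported in the abstract (systolic, then diastolic).
print(meets_criterion1(3.2, 6.7), meets_criterion1(2.6, 4.6))

# Hypothetical paired readings, for illustration only.
m, s = summarize_differences([122, 135, 118, 141], [120, 130, 119, 136])
print(round(m, 2), round(s, 2), meets_criterion1(m, s))
```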
Development and validation of a web-based questionnaire for surveying the health and working conditions of high-performance marine craft populations

PubMed Central

de Alwis, Manudul Pahansen; Lo Martire, Riccardo; Äng, Björn O; Garme, Karl

2016-01-01

Background: High-performance marine craft crews are susceptible to various adverse health conditions caused by multiple interactive factors. However, there are limited epidemiological data available for assessment of working conditions at sea. Although questionnaire surveys are widely used for identifying exposures, outcomes and associated risks with high accuracy levels, until now, no validated epidemiological tool exists for surveying occupational health and performance in these populations. Aim: To develop and validate a web-based questionnaire for epidemiological assessment of occupational and individual risk exposure pertinent to the musculoskeletal health conditions and performance in high-performance marine craft populations. Method: A questionnaire for investigating the association between work-related exposure, performance and health was initially developed by a consensus panel under four subdomains, viz. demography, lifestyle, work exposure and health and systematically validated by expert raters for content relevance and simplicity in three consecutive stages, each iteratively followed by a consensus panel revision. The item content validity index (I-CVI) was determined as the proportion of experts giving a rating of 3 or 4. The scale content validity index (S-CVI/Ave) was computed by averaging the I-CVIs for the assessment of the questionnaire as a tool. Finally, the questionnaire was pilot tested. Results: The S-CVI/Ave increased from 0.89 to 0.96 for relevance and from 0.76 to 0.94 for simplicity, resulting in 36 items in the final questionnaire. The pilot test confirmed the feasibility of the questionnaire. Conclusions: The present study shows that the web-based questionnaire fulfils previously published validity acceptance criteria and is therefore considered valid and feasible for the empirical surveying of epidemiological aspects among high-performance marine craft crews and similar populations. PMID:27324717
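The two content validity indices defined in the abstract above translate directly into code. This is a minimal sketch of those definitions (I-CVI as the proportion of experts rating an item 3 or 4, S-CVI/Ave as the mean of the I-CVIs); the function names and the example ratings are hypothetical and not taken from the study.

```python
def item_cvi(ratings_for_item):
    """I-CVI: share of experts who rated the item 3 or 4 on a 1-4 scale."""
    relevant = sum(1 for r in ratings_for_item if r >= 3)
    return relevant / len(ratings_for_item)


def scale_cvi_ave(ratings):
    """S-CVI/Ave: average of the I-CVIs over all items.

    ratings: one list of expert ratings per questionnaire item.
    """
    i_cvis = [item_cvi(item) for item in ratings]
    return sum(i_cvis) / len(i_cvis)


# Hypothetical ratings: 3 items, each rated by 4 experts.
ratings = [[4, 3, 4, 2], [3, 3, 4, 4], [4, 4, 3, 3]]
print([item_cvi(item) for item in ratings], scale_cvi_ave(ratings))
```

Because S-CVI/Ave is an average, a high scale-level value can hide individual items with low I-CVI, which is why items are typically reviewed one by one before the scale value is reported.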
Construct Validity of Fresh Frozen Human Cadaver as a Training Model in Minimal Access Surgery

PubMed Central

Macafee, David; Pranesh, Nagarajan; Horgan, Alan F.

2012-01-01

Background: The construct validity of fresh human cadaver as a training tool has not been established previously. The aims of this study were to investigate the construct validity of fresh frozen human cadaver as a method of training in minimal access surgery and determine if novices can be rapidly trained using this model to a safe level of performance. Methods: Junior surgical trainees, novices (<3 laparoscopic procedures performed) in laparoscopic surgery, performed 10 repetitions of a set of structured laparoscopic tasks on fresh frozen cadavers. Expert laparoscopists (>100 laparoscopic procedures) performed 3 repetitions of identical tasks. Performances were scored using a validated, objective Global Operative Assessment of Laparoscopic Skills scale. Scores for 3 consecutive repetitions were compared between experts and novices to determine construct validity. Furthermore, to determine if the novices reached a safe level, a trimmed mean of the experts' scores was used to define a benchmark. The Mann-Whitney U test was used for construct validity analysis and a 1-sample t test to compare performances of the novice group with the benchmark safe score. Results: Ten novices and 2 experts were recruited. Four out of 5 tasks (nondominant to dominant hand transfer; simulated appendicectomy; intracorporeal and extracorporeal knot tying) showed construct validity. Novices’ scores became comparable to benchmark scores between the eighth and tenth repetition. Conclusion: Minimal access surgical training using fresh frozen human cadavers appears to have construct validity. The laparoscopic skills of novices can be accelerated through to a safe level within 8 to 10 repetitions. PMID:23318058
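The Mann-Whitney U statistic used above for the expert-versus-novice comparison can be computed by counting pairwise wins. The sketch below shows only the U statistic, not the p-value, and is not the authors' analysis; the function name and the scores are hypothetical.

```python
def mann_whitney_u(x, y):
    """Return (U_x, U_y) for two independent samples.

    U_x counts the pairs (xi, yi) with xi > yi; ties count as one half.
    U_x + U_y always equals len(x) * len(y).
    """
    u_x = 0.0
    for xi in x:
        for yi in y:
            if xi > yi:
                u_x += 1.0
            elif xi == yi:
                u_x += 0.5
    u_y = len(x) * len(y) - u_x
    return u_x, u_y


# Hypothetical rating-scale scores, for illustration only.
novice_scores = [11, 13, 12, 14, 10]
expert_scores = [22, 24, 23]
print(mann_whitney_u(novice_scores, expert_scores))
```

A U of 0 for one group means every score in that group lies below every score in the other, the most extreme separation possible. With very small groups, such as 2 experts, only a handful of distinct U values exist, which limits how small an exact p-value can be.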
Full immersion simulation: validation of a distributed simulation environment for technical and non-technical skills training in Urology.

PubMed

Brewin, James; Tang, Jessica; Dasgupta, Prokar; Khan, Muhammad S; Ahmed, Kamran; Bello, Fernando; Kneebone, Roger; Jaye, Peter

2015-07-01

To evaluate the face, content and construct validity of the distributed simulation (DS) environment for technical and non-technical skills training in endourology. To evaluate the educational impact of DS for urology training. DS offers a portable, low-cost simulated operating room environment that can be set up in any open space. A prospective mixed methods design using established validation methodology was conducted in this simulated environment with 10 experienced and 10 trainee urologists. All participants performed a simulated prostate resection in the DS environment. Outcome measures included surveys to evaluate the DS, as well as comparative analyses of experienced and trainee urologists' performance using real-time and 'blinded' video analysis and validated performance metrics. Non-parametric statistical methods were used to compare differences between groups. The DS environment demonstrated face, content and construct validity for both non-technical and technical skills. Kirkpatrick level 1 evidence for the educational impact of the DS environment was shown. Further studies are needed to evaluate the effect of simulated operating room training on real operating room performance. This study has shown the validity of the DS environment for non-technical, as well as technical skills training. DS-based simulation appears to be a valuable addition to traditional classroom-based simulation training. © 2014 The Authors BJU International © 2014 BJU International Published by John Wiley & Sons Ltd.
Use of Latent Class Analysis to define groups based on validity, cognition, and emotional functioning.

PubMed

Morin, Ruth T; Axelrod, Bradley N

Latent Class Analysis (LCA) was used to classify a heterogeneous sample of neuropsychology data. In particular, we used measures of performance validity, symptom validity, cognition, and emotional functioning to assess and describe latent groups of functioning in these areas. A data-set of 680 neuropsychological evaluation protocols was analyzed using an LCA. Data were collected from evaluations performed for clinical purposes at an urban medical center. A four-class model emerged as the best fitting model of latent classes. The resulting classes were distinct based on measures of performance validity and symptom validity. Class A performed poorly on both performance and symptom validity measures. Class B had intact performance validity and heightened symptom reporting. The remaining two Classes performed adequately on both performance and symptom validity measures, differing only in cognitive and emotional functioning. In general, performance invalidity was associated with worse cognitive performance, while symptom invalidity was associated with elevated emotional distress. LCA appears useful in identifying groups within a heterogeneous sample with distinct performance patterns. Further, the orthogonal nature of performance and symptom validities is supported.
Evaluation of CASL boiling model for DNB performance in full scale 5x5 fuel bundle with spacer grids

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kim, Seung Jun

As one of the main tasks of the FY17 CASL-THM activity, an evaluation study on the applicability of the CASL baseline boiling model to a 5x5 DNB application was conducted, and the predictive capability of the DNB analysis is reported here. While the baseline CASL boiling model (GEN-1A) approach was successfully implemented and validated with a single-pipe application in the previous year’s task, the extended DNB validation for realistic sub-channels with detailed spacer grid configurations was tasked in FY17. The focus of the current study is to demonstrate the robustness and feasibility of the CASL baseline boiling model for DNB performance in a full 5x5 fuel bundle application. A quantitative evaluation of the DNB predictive capability is performed by comparing with corresponding experimental measurements (i.e. reference for the model validation). The reference data are provided by the Westinghouse Electric Company (WEC). The two grid configurations tested here are the Non-Mixing Vane Grid (NMVG) and the Mixing Vane Grid (MVG). Thorough validation studies with two sub-channel configurations are performed at a wide range of realistic PWR operational conditions.
Applied Chaos Level Test for Validation of Signal Conditions Underlying Optimal Performance of Voice Classification Methods.

PubMed

Liu, Boquan; Polce, Evan; Sprott, Julien C; Jiang, Jack J

2018-05-17

The purpose of this study is to introduce a chaos level test to evaluate linear and nonlinear voice type classification method performances under varying signal chaos conditions without subjective impression. Voice signals were constructed with differing degrees of noise to model signal chaos. Within each noise power, 100 Monte Carlo experiments were applied to analyze the output of jitter, shimmer, correlation dimension, and spectrum convergence ratio. The computational output of the 4 classifiers was then plotted against signal chaos level to investigate the performance of these acoustic analysis methods under varying degrees of signal chaos. A diffusive behavior detection-based chaos level test was used to investigate the performances of different voice classification methods. Voice signals were constructed by varying the signal-to-noise ratio to establish differing signal chaos conditions. Chaos level increased sigmoidally with increasing noise power. Jitter and shimmer performed optimally when the chaos level was less than or equal to 0.01, whereas correlation dimension was capable of analyzing signals with chaos levels of less than or equal to 0.0179. Spectrum convergence ratio demonstrated proficiency in analyzing voice signals with all chaos levels investigated in this study. The results of this study corroborate the performance relationships observed in previous studies and, therefore, demonstrate the validity of the validation test method. The presented chaos level validation test could be broadly utilized to evaluate acoustic analysis methods and establish the most appropriate methodology for objective voice analysis in clinical practice.
Validating use of a critical thinking test for the dental admission test.

PubMed

Tsai, Tsung-Hsun

2014-04-01

The purpose of this study was to validate the use of a test to assess dental school applicants' critical thinking abilities. The intent was to include this test on the Dental Admission Test (DAT) if it was shown to enhance the DAT's validity. Correlation and regression analyses of undergraduate and dental school performance with scores on each of the tests on the DAT battery and the California Critical Thinking Skills Test (CCTST) were performed. Data were collected from 439 third- and fourth-year dental students who consented to participate and were enrolled at one of the ten accredited dental schools included in the study. These ten dental schools were from most regions of the United States. This study concluded that including the CCTST on the DAT did not significantly enhance the DAT's validity.
Nutrition screening tools: does one size fit all? A systematic review of screening tools for the hospital setting.

PubMed

van Bokhorst-de van der Schueren, Marian A E; Guaitoli, Patrícia Realino; Jansma, Elise P; de Vet, Henrica C W

2014-02-01

Numerous nutrition screening tools for the hospital setting have been developed. The aim of this systematic review is to study construct or criterion validity and predictive validity of nutrition screening tools for the general hospital setting. A systematic review of English, French, German, Spanish, Portuguese and Dutch articles identified via MEDLINE, Cinahl and EMBASE (from inception to the 2nd of February 2012). Additional studies were identified by checking reference lists of identified manuscripts. Search terms included key words for malnutrition, screening or assessment instruments, and terms for hospital setting and adults. Data were extracted independently by 2 authors. Only studies expressing the (construct, criterion or predictive) validity of a tool were included. 83 studies (32 screening tools) were identified: 42 studies on construct or criterion validity versus a reference method and 51 studies on predictive validity on outcome (i.e. length of stay, mortality or complications). None of the tools performed consistently well to establish the patients' nutritional status. For the elderly, MNA performed fair to good, for the adults MUST performed fair to good. SGA, NRS-2002 and MUST performed well in predicting outcome in approximately half of the studies reviewed in adults, but not in older patients. Not one single screening or assessment tool is capable of adequate nutrition screening as well as predicting poor nutrition related outcome. Development of new tools seems redundant and will most probably not lead to new insights. New studies comparing different tools within one patient population are required. Copyright © 2013 Elsevier Ltd and European Society for Clinical Nutrition and Metabolism. All rights reserved.
Soldier Dimensions in Combat Models

DTIC Science & Technology

1990-05-07

and performance. Questionnaires, SQTs, and ARTEPs were often used. Many scales had estimates of reliability but few had validity data. Most studies...pending its validation. Research plans were provided for applications in simulated combat and with simulation devices, for data previously gathered...regarding reliability and validity. Lack of information following an instrument indicates neither reliability nor validity information was provided by the
A Facet-Factorial Approach towards the Development and Validation of a Jazz Rhythm Section Performance Rating Scale

ERIC Educational Resources Information Center

Wesolowski, Brian C.

2017-01-01

The purpose of this study was to develop a valid and reliable rating scale to assess jazz rhythm sections in the context of jazz big band performance. The research questions that guided this study included: (a) what central factors contribute to the assessment of a jazz rhythm section? (b) what items should be used to describe and assess a jazz…
A semi-automatic method for left ventricle volume estimate: an in vivo validation study

NASA Technical Reports Server (NTRS)

Corsi, C.; Lamberti, C.; Sarti, A.; Saracino, G.; Shiota, T.; Thomas, J. D.

2001-01-01

This study aims to validate the left ventricular (LV) volume estimates obtained by processing volumetric data with a segmentation model based on the level set technique. The validation was performed by comparing real-time volumetric echo data (RT3DE) and magnetic resonance (MRI) data. A validation protocol was defined and applied to twenty-four estimates (range 61-467 ml) obtained from normal and pathologic subjects who underwent both RT3DE and MRI. A statistical analysis was performed on each estimate and on clinical parameters such as stroke volume (SV) and ejection fraction (EF). Assuming MRI estimates (x) as a reference, an excellent correlation was found with volume measured by utilizing the segmentation procedure (y) (y=0.89x + 13.78, r=0.98). The mean error on SV was 8 ml and the mean error on EF was 2%. This study demonstrated that the segmentation technique is reliably applicable on human hearts in clinical practice.
Ethical leadership: meta-analytic evidence of criterion-related and incremental validity.

PubMed

Ng, Thomas W H; Feldman, Daniel C

2015-05-01

This study examines the criterion-related and incremental validity of ethical leadership (EL) with meta-analytic data. Across 101 samples published over the last 15 years (N = 29,620), we observed that EL demonstrated acceptable criterion-related validity with variables that tap followers' job attitudes, job performance, and evaluations of their leaders. Further, followers' trust in the leader mediated the relationships of EL with job attitudes and performance. In terms of incremental validity, we found that EL significantly, albeit weakly in some cases, predicted task performance, citizenship behavior, and counterproductive work behavior-even after controlling for the effects of such variables as transformational leadership, use of contingent rewards, management by exception, interactional fairness, and destructive leadership. The article concludes with a discussion of ways to strengthen the incremental validity of EL. (PsycINFO Database Record (c) 2015 APA, all rights reserved).
Measuring awareness of financial skills: reliability and validity of a new measure.

PubMed

Cramer, K; Tuokko, H A; Mateer, C A; Hultsch, D F

2004-03-01

This paper examines the psychometric properties of a three-part (participant, informant, and performance) Measure for assessing Awareness of Financial Skills (MAFS). The MAFS was administered to 10 seniors with dementia and 25 well-functioning seniors, and their informants. Measures of cognitive functioning, social desirability, neuroticism, and perceived control were administered to each participant to allow for an assessment of validity. Internal consistency estimates for the participant and informant questionnaires were found to be 0.92 and 0.97, respectively. Convergent validity analysis indicated that performance on this measure was related to level of cognitive functioning, with higher level of unawareness associated with decreased cognitive ability. Discriminant validity analysis showed that performance on this measure was not related to social desirability or neuroticism. This study provides evidence that the MAFS is a reliable and valid tool for assessing awareness of financial skills in older adults.
Empirical Performance of Cross-Validation With Oracle Methods in a Genomics Context.

PubMed

Martinez, Josue G; Carroll, Raymond J; Müller, Samuel; Sampson, Joshua N; Chatterjee, Nilanjan

2011-11-01

When employing model selection methods with oracle properties such as the smoothly clipped absolute deviation (SCAD) and the Adaptive Lasso, it is typical to estimate the smoothing parameter by m-fold cross-validation, for example, m = 10. In problems where the true regression function is sparse and the signals large, such cross-validation typically works well. However, in regression modeling of genomic studies involving Single Nucleotide Polymorphisms (SNP), the true regression functions, while thought to be sparse, do not have large signals. We demonstrate empirically that in such problems, the number of selected variables using SCAD and the Adaptive Lasso, with 10-fold cross-validation, is a random variable that has considerable and surprising variation. Similar remarks apply to non-oracle methods such as the Lasso. Our study strongly questions the suitability of performing only a single run of m-fold cross-validation with any oracle method, and not just the SCAD and Adaptive Lasso.
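The point about single runs of m-fold cross-validation can be explored with a small simulation. The sketch below is an assumption-laden illustration, not the authors' procedure: it uses the plain Lasso (which the abstract mentions as a non-oracle method) fitted by a simple coordinate descent, simulated data with two weak signals, a hypothetical grid of penalty values, and 5 folds to keep the pure-Python run short. It repeats the cross-validation with different random fold assignments and records how many variables are selected each time.

```python
import random


def soft_threshold(z, t):
    """Soft-thresholding operator used by the Lasso coordinate update."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(X, y, lam, n_iter=30):
    """Coordinate descent for (1/2n)*||y - Xb||^2 + lam*||b||_1 (no intercept)."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - Xb, starting from b = 0
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Covariance of column j with the residual that excludes variable j.
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta


def cv_error(X, y, lam, folds):
    """Mean squared prediction error over the held-out folds."""
    total = 0.0
    for test_idx in folds:
        held = set(test_idx)
        train = [i for i in range(len(y)) if i not in held]
        beta = lasso_cd([X[i] for i in train], [y[i] for i in train], lam)
        for i in test_idx:
            pred = sum(b * x for b, x in zip(beta, X[i]))
            total += (y[i] - pred) ** 2
    return total / len(y)


def selected_count(X, y, lambdas, m, rng):
    """One run of m-fold CV: pick the penalty, refit, count nonzero coefficients."""
    idx = list(range(len(y)))
    rng.shuffle(idx)
    folds = [idx[k::m] for k in range(m)]
    best = min(lambdas, key=lambda lam: cv_error(X, y, lam, folds))
    return sum(1 for b in lasso_cd(X, y, best) if b != 0.0)


def repeated_cv_counts(X, y, lambdas, m=5, repeats=5, seed=0):
    """Repeat the whole CV procedure with fresh random fold assignments."""
    rng = random.Random(seed)
    return [selected_count(X, y, lambdas, m, rng) for _ in range(repeats)]


# Simulated data (hypothetical): sparse truth with weak signals.
data_rng = random.Random(1)
n, p = 40, 8
X = [[data_rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
true_beta = [0.3, 0.3] + [0.0] * (p - 2)
y = [sum(b * x for b, x in zip(true_beta, row)) + data_rng.gauss(0, 1) for row in X]

counts = repeated_cv_counts(X, y, lambdas=[0.02, 0.05, 0.1, 0.2, 0.4])
print(counts)  # number of selected variables in each repeated CV run
```

If the printed counts differ between runs, the selected model depends on the random fold assignment, which is the kind of variation the abstract describes.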
Training and Assessment of Hysteroscopic Skills: A Systematic Review.

PubMed

Savran, Mona Meral; Sørensen, Stine Maya Dreier; Konge, Lars; Tolsgaard, Martin G; Bjerrum, Flemming

2016-01-01

The aim of this systematic review was to identify studies on hysteroscopic training and assessment. PubMed, Excerpta Medica, the Cochrane Library, and Web of Science were searched in January 2015. Manual screening of references and citation tracking were also performed. Studies on hysteroscopic educational interventions were selected without restrictions on study design, populations, language, or publication year. A qualitative data synthesis including the setting, study participants, training model, training characteristics, hysteroscopic skills, assessment parameters, and study outcomes was performed by 2 authors working independently. Effect sizes were calculated when possible. Overall, 2 raters independently evaluated sources of validity evidence supporting the outcomes of the hysteroscopy assessment tools. A total of 25 studies on hysteroscopy training were identified, of which 23 were performed in simulated settings. Overall, 10 studies used virtual-reality simulators and reported effect sizes for technical skills ranging from 0.31 to 2.65; 12 used inanimate models and reported effect sizes for technical skills ranging from 0.35 to 3.19. One study involved live animal models; 2 studies were performed in clinical settings. The validity evidence supporting the assessment tools used was low. Consensus between the 2 raters on the reported validity evidence was high (94%). This systematic review demonstrated large variations in the effect of different tools for hysteroscopy training. The validity evidence supporting the assessment of hysteroscopic skills was limited. Copyright © 2016 Association of Program Directors in Surgery. Published by Elsevier Inc. All rights reserved.
Air Combat Training: Good Stick Index Validation. Final Report for Period 3 April 1978-1 April 1979.

ERIC Educational Resources Information Center

Moore, Samuel B.; And Others

A study was conducted to investigate and statistically validate a performance measuring system (the Good Stick Index) in the Tactical Air Command Combat Engagement Simulator I (TAC ACES I) Air Combat Maneuvering (ACM) training program. The study utilized a twelve-week sample of eighty-nine student pilots to statistically validate the Good Stick…

Effects of Coaching on the Validity of the SAT: A Simulation Study.

ERIC Educational Resources Information Center

Baydar, Nazli

The effects of student coaching in preparation for the College Board Scholastic Aptitude Test (SAT) on the predictive validity of this test for freshman year performance were studied using data on 1985 freshman year students from four colleges. After the validity of the SAT was estimated for each school, a given proportion of students was picked,…
Threats to Validity When Using Open-Ended Items in International Achievement Studies: Coding Responses to the PISA 2012 Problem-Solving Test in Finland

ERIC Educational Resources Information Center

Arffman, Inga

2016-01-01

Open-ended (OE) items are widely used to gather data on student performance in international achievement studies. However, several factors may threaten validity when using such items. This study examined Finnish coders' opinions about threats to validity when coding responses to OE items in the PISA 2012 problem-solving test. A total of 6…
Design, development, testing and validation of a Photonics Virtual Laboratory for the study of LEDs

NASA Astrophysics Data System (ADS)

Naranjo, Francisco L.; Martínez, Guadalupe; Pérez, Ángel L.; Pardo, Pedro J.

2014-07-01

This work presents the design, development, testing and validation of a Photonic Virtual Laboratory, highlighting the study of LEDs. The study was conducted from a conceptual, experimental and didactic standpoint, using e-learning and m-learning platforms. Specifically, teaching tools that help ensure that students achieve significant learning have been developed. The scientific aspect, such as the study of LEDs, has been brought together with techniques for generating and transferring knowledge through the selection, hierarchization and structuring of information using concept maps. To validate the didactic materials developed, procedures with various assessment tools for the collection and processing of data were applied in the context of an experimental design. Additionally, a statistical analysis was performed to determine the validity of the materials developed. The assessment was designed to validate the contributions of the new materials over the traditional method of teaching, and to quantify the learning achieved by students, in order to draw conclusions that serve as a reference for their application in teaching and learning processes, and to comprehensively validate the work carried out.
A reliability and validity study of the Palliative Performance Scale

PubMed Central

Ho, Francis; Lau, Francis; Downing, Michael G; Lesperance, Mary

2008-01-01

Background The Palliative Performance Scale (PPS) was first introduced in 1996 as a new tool for measurement of performance status in palliative care. PPS has been used in many countries and has been translated into other languages. Methods This study evaluated the reliability and validity of PPS. A web-based, case scenarios study with a test-retest format was used to determine reliability. Fifty-three participants were recruited and randomly divided into two groups, each evaluating 11 cases at two time points. The validity study was based on the content validation of 15 palliative care experts conducted over telephone interviews, with discussion on five themes: PPS as clinical assessment tool, the usefulness of PPS, PPS scores affecting decision making, the problems in using PPS, and the adequacy of PPS instruction. Results The intraclass correlation coefficients for absolute agreement were 0.959 and 0.964 for Group 1, at Time-1 and Time-2; 0.951 and 0.931 for Group 2, at Time-1 and Time-2 respectively. Results showed that the participants were consistent in their scoring over the two times, with a mean Cohen's kappa of 0.67 for Group 1 and 0.71 for Group 2. In the validity study, all experts agreed that PPS is a valuable clinical assessment tool in palliative care. Many of them have already incorporated PPS as part of their practice standard. Conclusion The results of the reliability study demonstrated that PPS is a reliable tool. The validity study found that most experts did not feel a need to further modify PPS and, only two experts requested that some performance status measures be defined more clearly. Areas of PPS use include prognostication, disease monitoring, care planning, hospital resource allocation, clinical teaching and research. PPS is also a good communication tool between palliative care workers. PMID:18680590
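Cohen's kappa, used above to summarize test-retest consistency, corrects the observed proportion of identical ratings for the agreement expected by chance. A minimal sketch of the unweighted statistic follows; the scores are hypothetical PPS-style ratings for 11 cases, not the study's data, and the function name is an assumption.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two equally long lists of category ratings."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Proportion of cases with identical ratings.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance from the marginal category frequencies.
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (observed - expected) / (1.0 - expected)


# Hypothetical scores for 11 cases at two time points (illustrative only).
time1 = [30, 40, 40, 50, 60, 70, 30, 50, 60, 70, 40]
time2 = [30, 40, 50, 50, 60, 70, 30, 40, 60, 70, 40]
print(round(cohens_kappa(time1, time2), 2))
```

For ordinal scales a weighted kappa is often preferred, since it gives partial credit to near misses. The abstract does not say which variant was used.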
Addressing criticisms of existing predictive bias research: cognitive ability test scores still overpredict African Americans' job performance.

PubMed

Berry, Christopher M; Zhao, Peng

2015-01-01

Predictive bias studies have generally suggested that cognitive ability test scores overpredict job performance of African Americans, meaning these tests are not predictively biased against African Americans. However, at least 2 issues call into question existing over-/underprediction evidence: (a) a bias identified by Aguinis, Culpepper, and Pierce (2010) in the intercept test typically used to assess over-/underprediction and (b) a focus on the level of observed validity instead of operational validity. The present study developed and utilized a method of assessing over-/underprediction that draws on the math of subgroup regression intercept differences, does not rely on the biased intercept test, allows for analysis at the level of operational validity, and can use meta-analytic estimates as input values. Therefore, existing meta-analytic estimates of key parameters, corrected for relevant statistical artifacts, were used to determine whether African American job performance remains overpredicted at the level of operational validity. African American job performance was typically overpredicted by cognitive ability tests across levels of job complexity and across conditions wherein African American and White regression slopes did and did not differ. Because the present study does not rely on the biased intercept test and because appropriate statistical artifact corrections were carried out, the present study's results are not affected by the 2 issues mentioned above. The present study represents strong evidence that cognitive ability tests generally overpredict job performance of African Americans. (c) 2015 APA, all rights reserved.
Validation of the Penn Acoustic Neuroma Quality-of-Life Scale (PANQOL) for Spanish-Speaking Patients.

PubMed

Medina, Maria Del Mar; Carrillo, Alvaro; Polo, Ruben; Fernandez, Borja; Alonso, Daniel; Vaca, Miguel; Cordero, Adela; Perez, Cecilia; Muriel, Alfonso; Cobeta, Ignacio

2017-04-01

Objective To perform translation, cross-cultural adaptation, and validation of the Penn Acoustic Neuroma Quality-of-Life Scale (PANQOL) to the Spanish language. Study Design Prospective study. Setting Tertiary neurotologic referral center. Subjects and Methods PANQOL was translated and translated back, and a pretest trial was performed. The study included 27 individuals diagnosed with vestibular schwannoma. Inclusion criteria were adults with untreated vestibular schwannoma, diagnosed in the past 12 months. Feasibility, internal consistency, test-retest reliability, construct validity, and ceiling and floor effects were assessed for the present study. Results The mean overall score of the PANQOL was 69.21 (0-100 scale, lowest to highest quality of life). Cronbach's α was 0.87. Intraclass correlation coefficient was performed for each item, with an overall score of 0.92. The κ coefficient scores were between moderate and almost perfect in more than 92% of patients. Anxiety and energy domains of the PANQOL were correlated with both physical and mental components of the SF-12. Hearing, balance, and pain domains were correlated with the SF-12 physical component. Facial and general domains were not significantly correlated with any component of the SF-12. Furthermore, the overall score of the PANQOL was correlated with the physical component of the SF-12. Conclusion Feasibility, internal consistency, reliability, and construct validity outcomes in the current study support the validity of the Spanish version of the PANQOL.
Reproducibility and validity of the Dutch translation of the de Morton Mobility Index (DEMMI) used by physiotherapists in older patients with knee or hip osteoarthritis.

PubMed

Jans, Marielle P; Slootweg, Vera C; Boot, Cecile R; de Morton, Natalie A; van der Sluis, Geert; van Meeteren, Nico L

2011-11-01

To examine the reproducibility, construct validity, and unidimensionality of the Dutch translation of the de Morton Mobility Index (DEMMI), a performance-based measure of mobility for older patients. Cross-sectional study. Rehabilitation center (reproducibility study) and hospital (validity study). Patients (N=28; age >65y) after orthopedic surgery (reproducibility study) and patients (N=219; age >65y) waiting for total hip or total knee arthroplasty (validity study). Not applicable. Not applicable. The intraclass correlation coefficient for interrater reliability was high (.85; 95% confidence interval, .71-.93), and minimal detectable change with 90% confidence was 7 on the 100-point DEMMI scale. Rasch analysis identified that the Dutch translation of the DEMMI is a unidimensional measure of mobility in this population. DEMMI scores showed high correlations with scores on other performance-based measures of mobility (Timed Up and Go test, Spearman r=-.73; Chair Rise Time, r=-.69; walking test, r=.74). A lower correlation of .44 was identified with the self-report measure Western Ontario and McMaster Universities Osteoarthritis Index. The Dutch translation of the DEMMI is a reproducible and valid performance-based measure for assessing mobility in older patients with knee or hip osteoarthritis. Copyright © 2011 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.
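The minimal detectable change with 90% confidence (MDC90) is commonly derived from the standard error of measurement, SEM = SD × √(1 − ICC), as MDC90 = 1.645 × √2 × SEM. The sketch below applies that common formula. Whether the authors computed it exactly this way is an assumption, and the standard deviation used here is a hypothetical value, since the abstract does not report one.

```python
from math import sqrt

Z_90 = 1.645  # two-sided 90% confidence multiplier


def standard_error_of_measurement(sd, icc):
    """SEM from the between-subject SD and a reliability coefficient (ICC)."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie in [0, 1]")
    return sd * sqrt(1.0 - icc)


def mdc90(sd, icc):
    """Minimal detectable change at 90% confidence for a difference of two scores."""
    return Z_90 * sqrt(2.0) * standard_error_of_measurement(sd, icc)


# ICC taken from the abstract; the SD of 10 points is hypothetical.
print(round(mdc90(sd=10.0, icc=0.85), 1))
```

The √2 term reflects that a change score combines the measurement error of two assessments.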
Five-Kilometers Time Trial: Preliminary Validation of a Short Test for Cycling Performance Evaluation.

PubMed

Dantas, Jose Luiz; Pereira, Gleber; Nakamura, Fabio Yuzo

2015-09-01

The five-kilometer time trial (TT5km) has been used to assess aerobic endurance performance without further investigation of its validity. This study aimed to perform a preliminary validation of the TT5km to rank well-trained cyclists based on aerobic endurance fitness and assess changes of the aerobic endurance performance. After the incremental test, 20 cyclists (age = 31.3 ± 7.9 years; body mass index = 22.7 ± 1.5 kg/m²; maximal aerobic power = 360.5 ± 49.5 W) performed the TT5km twice, collecting performance (time to complete, absolute and relative power output, average speed) and physiological responses (heart rate and electromyography activity). The validation criteria were pacing strategy, absolute and relative reliability, validity, and sensitivity. Sensitivity index was obtained from the ratio between the smallest worthwhile change and typical error. The TT5km showed high absolute (coefficient of variation < 3%) and relative (intraclass correlation coefficient > 0.95) reliability of performance variables, whereas it presented low reliability of physiological responses. The TT5km performance variables were highly correlated with the aerobic endurance indices obtained from incremental test (r > 0.70). These variables showed adequate sensitivity index (> 1). TT5km is a valid test to rank the aerobic endurance fitness of well-trained cyclists and to differentiate changes on aerobic endurance performance. Coaches can detect performance changes through either absolute (± 17.7 W) or relative power output (± 0.3 W·kg⁻¹), the time to complete the test (± 13.4 s) and the average speed (± 1.0 km·h⁻¹). Furthermore, TT5km performance can also be used to rank the athletes according to their aerobic endurance fitness.
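The sensitivity index described above is the ratio of the smallest worthwhile change (SWC) to the typical error (TE). A minimal sketch follows, under two assumptions the abstract does not state: TE is taken as the standard deviation of the trial-to-trial differences divided by √2, and SWC as 0.2 times the between-subject standard deviation. The power values are hypothetical.

```python
from math import sqrt
from statistics import stdev


def typical_error(trial1, trial2):
    """Typical error: SD of the trial-to-trial differences divided by sqrt(2)."""
    diffs = [b - a for a, b in zip(trial1, trial2)]
    return stdev(diffs) / sqrt(2)


def smallest_worthwhile_change(trial1, trial2, factor=0.2):
    """SWC as a fraction (assumed 0.2) of the between-subject SD."""
    subject_means = [(a + b) / 2 for a, b in zip(trial1, trial2)]
    return factor * stdev(subject_means)


def sensitivity_index(trial1, trial2, factor=0.2):
    """Ratio SWC / TE; values above 1 suggest real changes exceed the noise."""
    return smallest_worthwhile_change(trial1, trial2, factor) / typical_error(trial1, trial2)


# Hypothetical mean power output (W) for six cyclists over two trials.
power_trial1 = [310.0, 355.0, 402.0, 288.0, 371.0, 340.0]
power_trial2 = [314.0, 351.0, 405.0, 290.0, 368.0, 343.0]
print(round(sensitivity_index(power_trial1, power_trial2), 2))
```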
Development and Validation of a Novel Robotic Procedure Specific Simulation Platform: Partial Nephrectomy.

PubMed

Hung, Andrew J; Shah, Swar H; Dalag, Leonard; Shin, Daniel; Gill, Inderbir S

2015-08-01

We developed a novel procedure specific simulation platform for robotic partial nephrectomy. In this study we prospectively evaluate its face, content, construct and concurrent validity. This hybrid platform features augmented reality and virtual reality. Augmented reality involves 3-dimensional robotic partial nephrectomy surgical videos overlaid with virtual instruments to teach surgical anatomy, technical skills and operative steps. Advanced technical skills are assessed with an embedded full virtual reality renorrhaphy task. Participants were classified as novice (no surgical training, 15), intermediate (less than 100 robotic cases, 13) or expert (100 or more robotic cases, 14) and prospectively assessed. Cohort performance was compared with the Kruskal-Wallis test (construct validity). Post-study questionnaire was used to assess the realism of simulation (face validity) and usefulness for training (content validity). Concurrent validity evaluated correlation between virtual reality renorrhaphy task and a live porcine robotic partial nephrectomy performance (Spearman's analysis). Experts rated the augmented reality content as realistic (median 8/10) and helpful for resident/fellow training (8.0-8.2/10). Experts rated the platform highly for teaching anatomy (9/10) and operative steps (8.5/10) but moderately for technical skills (7.5/10). Experts and intermediates outperformed novices (construct validity) in efficiency (p=0.0002) and accuracy (p=0.002). For virtual reality renorrhaphy, experts outperformed intermediates on GEARS metrics (p=0.002). Virtual reality renorrhaphy and in vivo porcine robotic partial nephrectomy performance correlated significantly (r=0.8, p <0.0001) (concurrent validity). This augmented reality simulation platform displayed face, content and construct validity. Performance in the procedure specific virtual reality task correlated highly with a porcine model (concurrent validity). Future efforts will integrate procedure specific virtual reality tasks and their global assessment. Copyright © 2015 American Urological Association Education and Research, Inc. Published by Elsevier Inc. All rights reserved.
Development and testing of the cancer multidisciplinary team meeting observational tool (MDT-MOT)

PubMed Central

Harris, Jenny; Taylor, Cath; Sevdalis, Nick; Jalil, Rozh; Green, James S.A.

2016-01-01

Abstract Objective To develop a tool for independent observational assessment of cancer multidisciplinary team meetings (MDMs), and test criterion validity, inter-rater reliability/agreement and describe performance. Design Clinicians and experts in teamwork used a mixed-methods approach to develop and refine the tool. Study 1 observers rated pre-determined optimal/sub-optimal MDM film excerpts and Study 2 observers independently rated video-recordings of 10 MDMs. Setting Study 2 included 10 cancer MDMs in England. Participants Testing was undertaken by 13 health service staff and a clinical and non-clinical observer. Intervention None. Main Outcome Measures Tool development, validity, reliability/agreement and variability in MDT performance. Results Study 1: Observers were able to discriminate between optimal and sub-optimal MDM performance (P ≤ 0.05). Study 2: Inter-rater reliability was good for 3/10 domains. Percentage of absolute agreement was high (≥80%) for 4/10 domains and percentage agreement within 1 point was high for 9/10 domains. Four MDTs performed well (scored 3+ in at least 8/10 domains), 5 MDTs performed well in 6–7 domains and 1 MDT performed well in only 4 domains. Leadership and chairing of the meeting, the organization and administration of the meeting, and clinical decision-making processes all varied significantly between MDMs (P ≤ 0.01). Conclusions MDT-MOT demonstrated good criterion validity. Agreement between clinical and non-clinical observers (within one point on the scale) was high but this was inconsistent with reliability coefficients and warrants further investigation. If further validated MDT-MOT might provide a useful mechanism for the routine assessment of MDMs by the local workforce to drive improvements in MDT performance. PMID:27084499
Development and testing of the cancer multidisciplinary team meeting observational tool (MDT-MOT).

PubMed

Harris, Jenny; Taylor, Cath; Sevdalis, Nick; Jalil, Rozh; Green, James S A

2016-06-01

To develop a tool for independent observational assessment of cancer multidisciplinary team meetings (MDMs), and test criterion validity, inter-rater reliability/agreement and describe performance. Clinicians and experts in teamwork used a mixed-methods approach to develop and refine the tool. Study 1 observers rated pre-determined optimal/sub-optimal MDM film excerpts and Study 2 observers independently rated video-recordings of 10 MDMs. Study 2 included 10 cancer MDMs in England. Testing was undertaken by 13 health service staff and a clinical and non-clinical observer. None. Tool development, validity, reliability/agreement and variability in MDT performance. Study 1: Observers were able to discriminate between optimal and sub-optimal MDM performance (P ≤ 0.05). Study 2: Inter-rater reliability was good for 3/10 domains. Percentage of absolute agreement was high (≥80%) for 4/10 domains and percentage agreement within 1 point was high for 9/10 domains. Four MDTs performed well (scored 3+ in at least 8/10 domains), 5 MDTs performed well in 6-7 domains and 1 MDT performed well in only 4 domains. Leadership and chairing of the meeting, the organization and administration of the meeting, and clinical decision-making processes all varied significantly between MDMs (P ≤ 0.01). MDT-MOT demonstrated good criterion validity. Agreement between clinical and non-clinical observers (within one point on the scale) was high but this was inconsistent with reliability coefficients and warrants further investigation. If further validated MDT-MOT might provide a useful mechanism for the routine assessment of MDMs by the local workforce to drive improvements in MDT performance. © The Author 2016. Published by Oxford University Press in association with the International Society for Quality in Health Care; all rights reserved.
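The two agreement figures reported for MDT-MOT (percentage of absolute agreement, and percentage agreement within 1 point) can be computed from paired observer ratings as sketched below. The ratings are hypothetical scores for one domain across 10 meetings, not the study's data, and the function name is an assumption.

```python
def percent_agreement(rater_a, rater_b, tolerance=0):
    """Percentage of paired ratings that differ by at most `tolerance` points."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    hits = sum(abs(a - b) <= tolerance for a, b in zip(rater_a, rater_b))
    return 100.0 * hits / len(rater_a)


# Hypothetical ratings of one domain across 10 meetings (illustrative only).
clinical_observer = [3, 4, 2, 5, 3, 4, 4, 2, 3, 5]
non_clinical_observer = [3, 3, 2, 4, 3, 4, 5, 2, 1, 5]

print(percent_agreement(clinical_observer, non_clinical_observer))     # exact agreement
print(percent_agreement(clinical_observer, non_clinical_observer, 1))  # within 1 point
```

Percentage agreement is not corrected for chance, which is one reason it can look high while reliability coefficients remain modest, as the abstract notes.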
External validation of the diffuse intrinsic pontine glioma survival prediction model: a collaborative report from the International DIPG Registry and the SIOPE DIPG Registry.

PubMed

Veldhuijzen van Zanten, Sophie E M; Lane, Adam; Heymans, Martijn W; Baugh, Joshua; Chaney, Brooklyn; Hoffman, Lindsey M; Doughman, Renee; Jansen, Marc H A; Sanchez, Esther; Vandertop, William P; Kaspers, Gertjan J L; van Vuurden, Dannis G; Fouladi, Maryam; Jones, Blaise V; Leach, James

2017-08-01

We aimed to perform external validation of the recently developed survival prediction model for diffuse intrinsic pontine glioma (DIPG), and discuss its utility. The DIPG survival prediction model was developed in a cohort of patients from the Netherlands, United Kingdom and Germany, registered in the SIOPE DIPG Registry, and includes age <3 years, longer symptom duration and receipt of chemotherapy as favorable predictors, and presence of ring-enhancement on MRI as unfavorable predictor. Model performance was evaluated by analyzing the discrimination and calibration abilities. External validation was performed using an unselected cohort from the International DIPG Registry, including patients from United States, Canada, Australia and New Zealand. Basic comparison with the results of the original study was performed using descriptive statistics, and univariate- and multivariable regression analyses in the validation cohort. External validation was assessed following a variety of analyses described previously. Baseline patient characteristics and results from the regression analyses were largely comparable. Kaplan-Meier curves of the validation cohort reproduced separated groups of standard (n = 39), intermediate (n = 125), and high-risk (n = 78) patients. This discriminative ability was confirmed by similar values for the hazard ratios across these risk groups. The calibration curve in the validation cohort showed a symmetric underestimation of the predicted survival probabilities. In this external validation study, we demonstrate that the DIPG survival prediction model has acceptable cross-cohort calibration and is able to discriminate patients with short, average, and increased survival. We discuss how this clinico-radiological model may serve a useful role in current clinical practice.
Validation and clinical utility of the executive function performance test in persons with traumatic brain injury.

PubMed

Baum, C M; Wolf, T J; Wong, A W K; Chen, C H; Walker, K; Young, A C; Carlozzi, N E; Tulsky, D S; Heaton, R K; Heinemann, A W

2017-07-01

This study examined the relationships between the Executive Function Performance Test (EFPT), the NIH Toolbox Cognitive Function tests, and neuropsychological executive function measures in 182 persons with traumatic brain injury (TBI) and 46 controls to evaluate construct, discriminant, and predictive validity. Construct validity: There were moderate correlations between the EFPT and the NIH Toolbox Crystallized (r = -.479), Fluid Tests (r = -.420), and Total Composite Scores (r = -.496). Discriminant validity: Significant differences were found in the EFPT total and sequence scores across control, complicated mild/moderate, and severe TBI groups. We found differences in the organisation score between control and severe, and between mild and severe TBI groups. Both TBI groups had significantly lower scores in safety and judgement than controls. Compared to the controls, the severe TBI group demonstrated significantly lower performance on all instrumental activities of daily living (IADL) tasks. Compared to the mild TBI group, the controls performed better on the medication task, the severe TBI group performed worse in the cooking and telephone tasks. Predictive validity: The EFPT predicted the self-perception of independence measured by the TBI-QOL (beta = -0.49, p < .001) for the severe TBI group. Overall, these data support the validity of the EFPT for use in individuals with TBI.
Implementation and application of an interactive user-friendly validation software for RADIANCE

NASA Astrophysics Data System (ADS)

Sundaram, Anand; Boonn, William W.; Kim, Woojin; Cook, Tessa S.

2012-02-01

RADIANCE extracts CT dose parameters from dose sheets using optical character recognition and stores the data in a relational database. To facilitate validation of RADIANCE's performance, a simple user interface was initially implemented and about 300 records were evaluated. Here, we extend this interface to achieve a wider variety of functions and perform a larger-scale validation. The validator uses some data from the RADIANCE database to prepopulate quality-testing fields, such as correspondence between calculated and reported total dose-length product. The interface also displays relevant parameters from the DICOM headers. A total of 5,098 dose sheets were used to test the performance accuracy of RADIANCE in dose data extraction. Several search criteria were implemented. All records were searchable by accession number, study date, or dose parameters beyond chosen thresholds. Validated records were searchable according to additional criteria from validation inputs. An error rate of 0.303% was demonstrated in the validation. Dose monitoring is increasingly important and RADIANCE provides an open-source solution with a high level of accuracy. The RADIANCE validator has been updated to enable users to test the integrity of their installation and verify that their dose monitoring is accurate and effective.
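One of the quality checks mentioned above, the correspondence between the calculated and the reported total dose-length product (DLP), can be sketched as a simple consistency test together with an error-rate summary. This is not RADIANCE's code: the function names, the 1% relative tolerance, and the example values are all assumptions made for illustration.

```python
from math import isclose


def dlp_consistent(series_dlps, reported_total, rel_tol=0.01):
    """True if the summed per-series DLP matches the reported total within rel_tol."""
    return isclose(sum(series_dlps), reported_total, rel_tol=rel_tol)


def error_rate_percent(n_errors, n_checked):
    """Error rate as a percentage of the items checked."""
    if n_checked <= 0:
        raise ValueError("n_checked must be positive")
    return 100.0 * n_errors / n_checked


# Hypothetical extracted records: (per-series DLP values in mGy*cm, reported total).
records = [
    ([120.4, 310.2], 430.6),
    ([88.0, 45.5, 200.1], 333.6),
    ([150.0, 150.0], 390.0),  # mismatch, e.g. a misread digit
]
flagged = sum(not dlp_consistent(series, total) for series, total in records)
print(f"{flagged} of {len(records)} records flagged "
      f"({error_rate_percent(flagged, len(records)):.1f}%)")
```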
Performance Validation Approach for the GTX Air-Breathing Launch Vehicle

NASA Technical Reports Server (NTRS)

Trefny, Charles J.; Roche, Joseph M.

2002-01-01

The primary objective of the GTX effort is to determine whether or not air-breathing propulsion can enable a launch vehicle to achieve orbit in a single stage. Structural weight, vehicle aerodynamics, and propulsion performance must be accurately known over the entire flight trajectory in order to make a credible assessment. Structural, aerodynamic, and propulsion parameters are strongly interdependent, which necessitates a system approach to design, evaluation, and optimization of a single-stage-to-orbit concept. The GTX reference vehicle serves this purpose, by allowing design, development, and validation of components and subsystems in a system context. The reference vehicle configuration (including propulsion) was carefully chosen so as to provide high potential for structural and volumetric efficiency, and to allow the high specific impulse of air-breathing propulsion cycles to be exploited. Minor evolution of the configuration has occurred as analytical and experimental results have become available. With this development process comes increasing validation of the weight and performance levels used in system performance determination. This paper presents an overview of the GTX reference vehicle and the approach to its performance validation. Subscale test rigs and numerical studies used to develop and validate component performance levels and unit structural weights are outlined. The sensitivity of the equivalent, effective specific impulse to key propulsion component efficiencies is presented. The role of flight demonstration in development and validation is discussed.
Validation of the breast evaluation questionnaire for breast hypertrophy and breast reduction.

PubMed

Lewin, Richard; Elander, Anna; Lundberg, Jonas; Hansson, Emma; Thorarinsson, Andri; Claudelin, Malin; Bladh, Helena; Lidén, Mattias

2018-06-13

There is a lack of published, validated questionnaires for evaluating psychosocial morbidity in patients with breast hypertrophy undergoing breast reduction surgery. The aim was to validate the breast evaluation questionnaire (BEQ), originally developed for the assessment of breast augmentation patients, for the assessment of psychosocial morbidity in patients with breast hypertrophy undergoing breast reduction surgery. Design: Validation study. Subjects: Women with macromastia. Methods: The validation of the BEQ, adapted to breast reduction, was performed in several steps. Content validity, reliability, construct validity and responsiveness were assessed. The original version was adjusted according to the results for content validity, which resulted in item reduction and a modified BEQ (mBEQ) that was then assessed for reliability, construct validity and responsiveness. Internal and external validation was performed for the modified BEQ. Convergent validity was tested against Breast-Q (reduction) and discriminant validity was tested against the SF-36. Known-groups validation revealed significant differences between the normal population and patients undergoing breast reduction surgery. The BEQ showed good reliability by test-retest analysis and high responsiveness. The modified BEQ may be a reliable, valid and responsive instrument for assessing women who undergo breast reduction.
Validation of uniaxial and triaxial accelerometers for the assessment of physical activity in preschool children

USDA-ARS's Scientific Manuscript database

Given the unique physical activity patterns of preschoolers, wearable electronic devices for quantitative assessment of physical activity require validation in this population. Study objective was to validate uniaxial and triaxial accelerometers in preschoolers. Room calorimetry was performed over 3...
The interplay between academic performance and quality of life among preclinical students.

PubMed

Shareef, Mohammad Abrar; AlAmodi, Abdulhadi A; Al-Khateeb, Abdulrahman A; Abudan, Zainab; Alkhani, Mohammed A; Zebian, Sanderlla I; Qannita, Ahmed S; Tabrizi, Mariam J

2015-10-31

The high academic performance of medical students greatly influences their professional competence over a long-term career. Meanwhile, medical students greatly value a good quality of life that can help them sustain their medical career. This study examines the validity and reliability of the tool among preclinical students and tests the influence of their scholastic performance, along with gender and academic year, on their quality of life. A cross-sectional study was conducted by distributing the World Health Organization Quality of Life (WHOQOL-BREF) survey among medical students of years one to three at Alfaisal University. For validity, item discriminant validity (IDV) and confirmatory factor analysis were measured, and for reliability, Cronbach's α test and internal item consistency (IIC) were examined. The association of GPA, gender and academic year with all major domains was assessed using Pearson's correlation, independent samples t-test and one-way ANOVA, respectively. A total of 335 preclinical students responded to this questionnaire. The construct demonstrated adequate validity and good reliability. High academic performance of students correlated positively with physical health (r = 0.23, p < 0.001), psychological health (r = 0.29, p < 0.001), social relations (r = 0.11, p = 0.03) and environment (r = 0.23, p < 0.001). Male students scored higher than their female peers in physical and psychological health. This study has identified a direct relationship between the academic performance of preclinical students and their quality of life. The WHOQOL-BREF is a valid and reliable tool among preclinical students, and the positive association of high academic performance with greater QOL suggests that academic achievers attain higher satisfaction and that poor achievers need special attention for the improvement of their quality of life.
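For readers unfamiliar with the reliability statistic named above, the following is a minimal sketch of Cronbach's α using its standard formula, α = k/(k-1) × (1 - Σ item variances / variance of total scores). The response data are hypothetical and are not taken from the study.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of items, each a list of respondent scores.

    items[i][j] is the score of respondent j on item i (sample variances used).
    """
    k = len(items)
    item_variances = sum(variance(scores) for scores in items)
    # Total score per respondent, summed across all items.
    totals = [sum(respondent) for respondent in zip(*items)]
    return (k / (k - 1)) * (1 - item_variances / variance(totals))


# Hypothetical 5-point responses: 3 items, 6 respondents.
responses = [
    [4, 3, 5, 2, 4, 3],
    [4, 2, 5, 3, 4, 3],
    [5, 3, 4, 2, 4, 2],
]
alpha = cronbach_alpha(responses)
```

Values closer to 1 indicate that the items vary together; which threshold counts as "good" reliability is a convention that differs between fields.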
Spatio-temporal modeling of chronic PM10 exposure for the Nurses' Health Study

NASA Astrophysics Data System (ADS)

Yanosky, Jeff D.; Paciorek, Christopher J.; Schwartz, Joel; Laden, Francine; Puett, Robin; Suh, Helen H.

2008-06-01

Chronic epidemiological studies of airborne particulate matter (PM) have typically characterized the chronic PM exposures of their study populations using city- or county-wide ambient concentrations, which limit the studies to areas where nearby monitoring data are available and which ignore within-city spatial gradients in ambient PM concentrations. To provide more spatially refined and precise chronic exposure measures, we used a Geographic Information System (GIS)-based spatial smoothing model to predict monthly outdoor PM10 concentrations in the northeastern and midwestern United States. This model included monthly smooth spatial terms and smooth regression terms of GIS-derived and meteorological predictors. Using cross-validation and other pre-specified selection criteria, terms for distance to road by road class, urban land use, block group and county population density, point- and area-source PM10 emissions, elevation, wind speed, and precipitation were found to be important determinants of PM10 concentrations and were included in the final model. Final model performance was strong (cross-validation R² = 0.62), with little bias (-0.4 μg m⁻³) and high precision (6.4 μg m⁻³). The final model (with monthly spatial terms) performed better than a model with seasonal spatial terms (cross-validation R² = 0.54). The addition of GIS-derived and meteorological predictors improved predictive performance over spatial smoothing (cross-validation R² = 0.51) or inverse distance weighted interpolation (cross-validation R² = 0.29) methods alone and increased the spatial resolution of predictions. The model performed well in both rural and urban areas, across seasons, and across the entire time period. The strong model performance demonstrates its suitability as a means to estimate individual-specific chronic PM10 exposures for large populations.
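The cross-validation statistics quoted above can be illustrated with a small sketch. It is not the authors' spatial smoothing model: a one-predictor least-squares line stands in for the model, the data are hypothetical, and the definitions of bias (mean prediction error) and precision (mean absolute prediction error) are assumptions about how such summaries are commonly computed.

```python
def loo_predictions(x, y):
    """Leave-one-out predictions from a simple least-squares line."""
    preds = []
    for i in range(len(x)):
        xs, ys = x[:i] + x[i + 1:], y[:i] + y[i + 1:]
        mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
        sxx = sum((a - mx) ** 2 for a in xs)
        sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
        slope = sxy / sxx
        # Predict the held-out point from a model that never saw it.
        preds.append(my + slope * (x[i] - mx))
    return preds


def cv_summary(observed, predicted):
    """Cross-validation R^2, bias, and mean absolute error of held-out predictions."""
    n = len(observed)
    errors = [p - o for p, o in zip(predicted, observed)]
    mean_obs = sum(observed) / n
    sse = sum(e * e for e in errors)
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return {
        "r2": 1 - sse / sst,
        "bias": sum(errors) / n,
        "mean_abs_error": sum(abs(e) for e in errors) / n,
    }


# Hypothetical predictor (e.g. a GIS covariate) and monitored concentrations.
covariate = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
monitored = [11.8, 14.1, 15.9, 18.3, 19.7, 22.4]
summary = cv_summary(monitored, loo_predictions(covariate, monitored))
```

Because every prediction comes from a fit that excluded the point being predicted, the resulting R² is typically lower than the in-sample value, which is the reason for reporting it.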
Validity and reliability of an online visual-spatial working memory task for self-reliant administration in school-aged children.

PubMed

Van de Weijer-Bergsma, Eva; Kroesbergen, Evelyn H; Prast, Emilie J; Van Luit, Johannes E H

2015-09-01

Working memory is an important predictor of academic performance, and of math performance in particular. Most working memory tasks depend on one-to-one administration by a testing assistant, which makes the use of such tasks in large-scale studies time-consuming and costly. Therefore, an online, self-reliant visual-spatial working memory task (the Lion game) was developed for primary school children (6-12 years of age). In two studies, the validity and reliability of the Lion game were investigated. The results from Study 1 (n = 442) indicated satisfactory six-week test-retest reliability, excellent internal consistency, and good concurrent and predictive validity. The results from Study 2 (n = 5,059) confirmed the results on the internal consistency and predictive validity of the Lion game. In addition, multilevel analysis revealed that classroom membership influenced Lion game scores. We concluded that the Lion game is a valid and reliable instrument for the online computerized and self-reliant measurement of visual-spatial working memory (i.e., updating).

Development and validation of a composite scoring system for robot-assisted surgical training--the Robotic Skills Assessment Score.

PubMed

Chowriappa, Ashirwad J; Shi, Yi; Raza, Syed Johar; Ahmed, Kamran; Stegemann, Andrew; Wilding, Gregory; Kaouk, Jihad; Peabody, James O; Menon, Mani; Hassett, James M; Kesavadas, Thenkurussi; Guru, Khurshid A

2013-12-01

A standardized scoring system does not exist in virtual reality-based assessment metrics to describe safe and crucial surgical skills in robot-assisted surgery. This study aims to develop an assessment score along with its construct validation. All subjects performed key tasks on the previously validated Fundamental Skills of Robotic Surgery curriculum, which were recorded, and metrics were stored. After an expert consensus for the purpose of content validation (Delphi), critical safety-determining procedural steps were identified from the Fundamental Skills of Robotic Surgery curriculum, and a hierarchical task decomposition of multiple parameters using a variety of metrics was used to develop the Robotic Skills Assessment Score (RSA-Score). Robotic Skills Assessment mainly focuses on safety in the operative field, critical error, economy, bimanual dexterity, and time. Subsequently, the RSA-Score was further evaluated for construct validation and feasibility. Spearman correlation tests performed between tasks using the RSA-Scores indicate no cross correlation. Wilcoxon rank sum tests were performed between the two groups. The proposed RSA-Score was evaluated on non-robotic surgeons (n = 15) and on expert-robotic surgeons (n = 12). The expert group demonstrated significantly better performance on all four tasks in comparison to the novice group. Validation of the RSA-Score in this study was carried out on the Robotic Surgical Simulator. The RSA-Score is a valid scoring system that could be incorporated in any virtual reality-based surgical simulator to achieve standardized assessment of fundamental surgical tenets during robot-assisted surgery. Copyright © 2013 Elsevier Inc. All rights reserved.
Achievement-Relevant Personality: Relations with the Big Five and Validation of an Efficient Instrument

PubMed Central

Briley, Daniel A.; Domiteaux, Matthew; Tucker-Drob, Elliot M.

2014-01-01

Many achievement-relevant personality measures (APMs) have been developed, but the interrelations among APMs or associations with the broader personality landscape are not well-known. In Study 1, 214 participants were measured on 36 APMs and a measure of the Big Five. Factor analytic results supported the convergent and discriminant validity of five latent dimensions: performance, mastery, self-doubt, effort, and intellectual investment. Conscientiousness, neuroticism, and openness to experience had the most consistent associations with APMs. We constructed a more efficient scale, the Multidimensional Achievement-Relevant Personality Scale (MAPS). In Study 2, we replicated the factor structure and external correlates of the MAPS in a sample of 359 individuals. Finally, we validated the MAPS with four indicators of academic performance and demonstrated incremental validity. PMID:24839374
Development, Testing, and Validation of a Model-Based Tool to Predict Operator Responses in Unexpected Workload Transitions

NASA Technical Reports Server (NTRS)

Sebok, Angelia; Wickens, Christopher; Sargent, Robert

2015-01-01

One human factors challenge is predicting operator performance in novel situations. Approaches such as drawing on relevant previous experience, and developing computational models to predict operator performance in complex situations, offer potential methods to address this challenge. A few concerns with modeling operator performance are that models need to be realistic, and they need to be tested empirically and validated. In addition, many existing human performance modeling tools are complex and require that an analyst gain significant experience to be able to develop models for meaningful data collection. This paper describes an effort to address these challenges by developing an easy-to-use model-based tool, using models that were developed from a review of existing human performance literature and targeted experimental studies, and performing an empirical validation of key model predictions.
The UKCAT-12 study: educational attainment, aptitude test performance, demographic and socio-economic contextual factors as predictors of first year outcome in a cross-sectional collaborative study of 12 UK medical schools.

PubMed

McManus, I C; Dewberry, Chris; Nicholson, Sandra; Dowell, Jonathan S

2013-11-14

Most UK medical schools use aptitude tests during student selection, but large-scale studies of predictive validity are rare. This study assesses the United Kingdom Clinical Aptitude Test (UKCAT), and its four sub-scales, along with measures of educational attainment, individual and contextual socio-economic background factors, as predictors of performance in the first year of medical school training. A prospective study of 4,811 students in 12 UK medical schools taking the UKCAT from 2006 to 2008 as a part of the medical school application, for whom first year medical school examination results were available in 2008 to 2010. UKCAT scores and educational attainment measures (General Certificate of Education (GCE): A-levels, and so on; or Scottish Qualifications Authority (SQA): Scottish Highers, and so on) were significant predictors of outcome. UKCAT predicted outcome better in female students than male students, and better in mature than non-mature students. Incremental validity of UKCAT taking educational attainment into account was significant, but small. Medical school performance was also affected by sex (male students performing less well), ethnicity (non-White students performing less well), and a contextual measure of secondary schooling, students from secondary schools with greater average attainment at A-level (irrespective of public or private sector) performing less well. Multilevel modeling showed no differences between medical schools in predictive ability of the various measures. UKCAT sub-scales predicted similarly, except that Verbal Reasoning correlated positively with performance on Theory examinations, but negatively with Skills assessments. This collaborative study in 12 medical schools shows the power of large-scale studies of medical education for answering previously unanswerable but important questions about medical student selection, education and training. 
UKCAT has predictive validity as a predictor of medical school outcome, particularly in mature applicants to medical school. UKCAT offers small but significant incremental validity which is operationally valuable where medical schools are making selection decisions based on incomplete measures of educational attainment. The study confirms the validity of using all the existing measures of educational attainment in full at the time of selection decision-making. Contextual measures provide little additional predictive value, except that students from high attaining secondary schools perform less well, an effect previously shown for UK universities in general.
The UKCAT-12 study: educational attainment, aptitude test performance, demographic and socio-economic contextual factors as predictors of first year outcome in a cross-sectional collaborative study of 12 UK medical schools

PubMed Central

2013-01-01

Background: Most UK medical schools use aptitude tests during student selection, but large-scale studies of predictive validity are rare. This study assesses the United Kingdom Clinical Aptitude Test (UKCAT), and its four sub-scales, along with measures of educational attainment, individual and contextual socio-economic background factors, as predictors of performance in the first year of medical school training. Methods: A prospective study of 4,811 students in 12 UK medical schools taking the UKCAT from 2006 to 2008 as a part of the medical school application, for whom first year medical school examination results were available in 2008 to 2010. Results: UKCAT scores and educational attainment measures (General Certificate of Education (GCE): A-levels, and so on; or Scottish Qualifications Authority (SQA): Scottish Highers, and so on) were significant predictors of outcome. UKCAT predicted outcome better in female students than male students, and better in mature than non-mature students. Incremental validity of UKCAT taking educational attainment into account was significant, but small. Medical school performance was also affected by sex (male students performing less well), ethnicity (non-White students performing less well), and a contextual measure of secondary schooling, students from secondary schools with greater average attainment at A-level (irrespective of public or private sector) performing less well. Multilevel modeling showed no differences between medical schools in predictive ability of the various measures. UKCAT sub-scales predicted similarly, except that Verbal Reasoning correlated positively with performance on Theory examinations, but negatively with Skills assessments. Conclusions: This collaborative study in 12 medical schools shows the power of large-scale studies of medical education for answering previously unanswerable but important questions about medical student selection, education and training. 
UKCAT has predictive validity as a predictor of medical school outcome, particularly in mature applicants to medical school. UKCAT offers small but significant incremental validity which is operationally valuable where medical schools are making selection decisions based on incomplete measures of educational attainment. The study confirms the validity of using all the existing measures of educational attainment in full at the time of selection decision-making. Contextual measures provide little additional predictive value, except that students from high attaining secondary schools perform less well, an effect previously shown for UK universities in general. PMID:24229380
Risk prediction models for graft failure in kidney transplantation: a systematic review.

PubMed

Kaboré, Rémi; Haller, Maria C; Harambat, Jérôme; Heinze, Georg; Leffondré, Karen

2017-04-01

Risk prediction models are useful for identifying kidney recipients at high risk of graft failure, thus optimizing clinical care. Our objective was to systematically review the models that have been recently developed and validated to predict graft failure in kidney transplantation recipients. We used PubMed and Scopus to search for English, German and French language articles published in 2005-15. We selected studies that developed and validated a new risk prediction model for graft failure after kidney transplantation, or validated an existing model with or without updating the model. Data on recipient characteristics and predictors, as well as modelling and validation methods were extracted. In total, 39 articles met the inclusion criteria. Of these, 34 developed and validated a new risk prediction model and 5 validated an existing one with or without updating the model. The most frequently predicted outcome was graft failure, defined as dialysis, re-transplantation or death with functioning graft. Most studies used the Cox model. There was substantial variability in predictors used. In total, 25 studies used predictors measured at transplantation only, and 14 studies used predictors also measured after transplantation. Discrimination performance was reported in 87% of studies, while calibration was reported in 56%. Performance indicators were estimated using both internal and external validation in 13 studies, and using external validation only in 6 studies. Several prediction models for kidney graft failure in adults have been published. Our study highlights the need to better account for competing risks when applicable in such studies, and to adequately account for post-transplant measures of predictors in studies aiming at improving monitoring of kidney transplant recipients. © The Author 2017. Published by Oxford University Press on behalf of ERA-EDTA. All rights reserved.
Does virtual reality simulation have a role in training trauma and orthopaedic surgeons?

PubMed

Bartlett, J D; Lawrence, J E; Stewart, M E; Nakano, N; Khanduja, V

2018-05-01

Aims: The aim of this study was to assess the current evidence relating to the benefits of virtual reality (VR) simulation in orthopaedic surgical training, and to identify areas of future research. Materials and Methods: A literature search using the MEDLINE, Embase, and Google Scholar databases was performed. The results' titles, abstracts, and references were examined for relevance. Results: A total of 31 articles published between 2004 and 2016 and relating to the objective validity and efficacy of specific virtual reality orthopaedic surgical simulators were identified. We found 18 studies demonstrating the construct validity of 16 different orthopaedic virtual reality simulators by comparing expert and novice performance. Eight studies have demonstrated skill acquisition on a simulator by showing improvements in performance with repeated use. A further five studies have demonstrated measurable improvements in operating theatre performance following a period of virtual reality simulator training. Conclusion: The demonstration of 'real-world' benefits from the use of VR simulation in knee and shoulder arthroscopy is promising. However, evidence supporting its utility in other forms of orthopaedic surgery is lacking. Further studies of validity and utility should be combined with robust analyses of the cost efficiency of validated simulators to justify the financial investment required for their use in orthopaedic training. Cite this article: Bone Joint J 2018;100-B:559-65.
Validity and Reliability of Accelerometers in Patients With COPD: A SYSTEMATIC REVIEW.

PubMed

Gore, Shweta; Blackwood, Jennifer; Guyette, Mary; Alsalaheen, Bara

2018-05-01

Reduced physical activity is associated with poor prognosis in chronic obstructive pulmonary disease (COPD). Accelerometers have greatly improved quantification of physical activity by providing information on step counts, body positions, energy expenditure, and magnitude of force. The purpose of this systematic review was to compare the validity and reliability of accelerometers used in patients with COPD. An electronic database search of MEDLINE and CINAHL was performed. Study quality was assessed with the Strengthening the Reporting of Observational Studies in Epidemiology checklist while methodological quality was assessed using the modified Quality Appraisal Tool for Reliability Studies. The search yielded 5392 studies; 25 met inclusion criteria. The SenseWear Pro armband reported high criterion validity under controlled conditions (r = 0.75-0.93) and high reliability (ICC = 0.84-0.86) for step counts. The DynaPort MiniMod demonstrated highest concurrent validity for step count using both video and manual methods. Validity of the SenseWear Pro armband varied between studies especially in free-living conditions, slower walking speeds, and with addition of weights during gait. A high degree of variability was found in the outcomes used and statistical analyses performed between studies, indicating a need for further studies to measure reliability and validity of accelerometers in COPD. The SenseWear Pro armband is the most commonly used accelerometer in COPD, but measurement properties are limited by gait speed variability and assistive device use. DynaPort MiniMod and Stepwatch accelerometers demonstrated high validity in patients with COPD but lack reliability data.
Validity analysis on merged and averaged data using within and between analysis: focus on effect of qualitative social capital on self-rated health.

PubMed

Shin, Sang Soo; Shin, Young-Jeon

2016-01-01

With an increasing number of studies highlighting regional social capital (SC) as a determinant of health, many studies are using multi-level analysis with merged and averaged scores of community residents' survey responses calculated from community SC data. Sufficient examination is required to validate if the merged and averaged data can represent the community. Therefore, this study analyzes the validity of the selected indicators and their applicability in multi-level analysis. Within and between analysis (WABA) was performed after creating community variables using merged and averaged data of community residents' responses from the 2013 Community Health Survey in Korea, using subjective self-rated health assessment as a dependent variable. Further analysis was performed following the model suggested by WABA result. Both E-test results (1) and WABA results (2) revealed that single-level analysis needs to be performed using qualitative SC variable with cluster mean centering. Through single-level multivariate regression analysis, qualitative SC with cluster mean centering showed positive effect on self-rated health (0.054, p<0.001), although there was no substantial difference in comparison to analysis using SC variables without cluster mean centering or multi-level analysis. As modification in qualitative SC was larger within the community than between communities, we validate that relational analysis of individual self-rated health can be performed within the group, using cluster mean centering. Other tests besides the WABA can be performed in the future to confirm the validity of using community variables and their applicability in multi-level analysis.
Critical validation studies of neurofeedback.

PubMed

Gruzelier, John; Egner, Tobias

2005-01-01

The field of neurofeedback training has proceeded largely without validation. In this article the authors review studies directed at validating sensory motor rhythm, beta and alpha-theta protocols for improving attention, memory, and music performance in healthy participants. Importantly, benefits were demonstrable with cognitive and neurophysiologic measures that were predicted on the basis of regression models of learning to enhance sensory motor rhythm and beta activity. The first evidence of operant control over the alpha-theta ratio is provided, together with remarkable improvements in artistic aspects of music performance equivalent to two class grades in conservatory students. These are initial steps in providing a much needed scientific basis to neurofeedback.
Simulation verification techniques study. Subsystem simulation validation techniques

NASA Technical Reports Server (NTRS)

Duncan, L. M.; Reddell, J. P.; Schoonmaker, P. B.

1974-01-01

Techniques for validation of software modules which simulate spacecraft onboard systems are discussed. An overview of the simulation software hierarchy for a shuttle mission simulator is provided. A set of guidelines for the identification of subsystem/module performance parameters and critical performance parameters is presented. Various sources of reference data to serve as standards of performance for simulation validation are identified. Environment, crew station, vehicle configuration, and vehicle dynamics simulation software are briefly discussed from the point of view of their interfaces with subsystem simulation modules. A detailed presentation of results in the area of vehicle subsystems simulation modules is included. A list of references, conclusions and recommendations are also given.
Virtual reality simulation training in Otolaryngology.

PubMed

Arora, Asit; Lau, Loretta Y M; Awad, Zaid; Darzi, Ara; Singh, Arvind; Tolley, Neil

2014-01-01

To conduct a systematic review of the validity data for the virtual reality surgical simulator platforms available in Otolaryngology. Ovid and Embase databases searched July 13, 2013. Four hundred and nine abstracts were independently reviewed by 2 authors. Thirty-six articles which fulfilled the search criteria were retrieved and viewed in full text. These articles were assessed for quantitative data on at least one aspect of face, content, construct or predictive validity. Papers were stratified by simulator, sub-specialty and further classified by the validation method used. There were 21 articles reporting applications for temporal bone surgery (n = 12), endoscopic sinus surgery (n = 6) and myringotomy (n = 3). Four different simulator platforms were validated for temporal bone surgery and two for each of the other surgical applications. Face/content validation represented the most frequent study type (9/21). Construct validation studies performed on temporal bone and endoscopic sinus surgery simulators showed that performance measures reliably discriminated between different experience levels. Simulation training improved cadaver temporal bone dissection skills and operating room performance in sinus surgery. Several simulator platforms particularly in temporal bone surgery and endoscopic sinus surgery are worthy of incorporation into training programmes. Standardised metrics are necessary to guide curriculum development in Otolaryngology. Copyright © 2013 Surgical Associates Ltd. Published by Elsevier Ltd. All rights reserved.
Reliability and validity of generalizable skills instruments for students who are deaf, blind, or visually impaired.

PubMed

Loeding, B L; Greenan, J P

1998-12-01

The study examined the validity and reliability of four assessments, with three instruments per domain. Domains included generalizable mathematics, communication, interpersonal relations, and reasoning skills. Participants were deaf, legally blind, or visually impaired students enrolled in vocational classes at residential secondary schools. The researchers estimated the internal consistency reliability, test-retest reliability, and construct validity correlations of three subinstruments: student self-ratings, teacher ratings, and performance assessments. The data suggest that these instruments are highly internally consistent measures of generalizable vocational skills. Four performance assessments have high-to-moderate test-retest reliability estimates, and were generally considered to possess acceptable validity and reliability.
Ruggedness testing and validation of a practical analytical method for > 100 veterinary drug residues in bovine muscle by ultrahigh performance liquid chromatography – tandem mass spectrometry

USDA-ARS's Scientific Manuscript database

In this study, optimization, extension, and validation of a streamlined, qualitative and quantitative multiclass, multiresidue method was conducted to monitor greater than 100 veterinary drug residues in meat using ultrahigh-performance liquid chromatography – tandem mass spectrometry (UHPLC-MS/MS). I...
Development and Validation of a Rating Scale for Wind Jazz Improvisation Performance

ERIC Educational Resources Information Center

Smith, Derek T.

2009-01-01

The purpose of this study was to construct and validate a rating scale for collegiate wind jazz improvisation performance. The 14-item Wind Jazz Improvisation Evaluation Scale (WJIES) was constructed and refined through a facet-rational approach to scale development. Five wind jazz students and one professional jazz educator were asked to record…
Applying the APA/AERA/NCME "Standards": Evidence for the Validity and Reliability of Three Statewide Teaching Assessment Instruments.

ERIC Educational Resources Information Center

Rothenberg, Lori; Hessling, Peter A.

The statewide teaching performance assessment instruments being used in Georgia, North Carolina, and Florida were examined. Forty-one reliability and validity studies regarding the instruments in use in each state were collected from state departments and universities. Georgia uses the Georgia Teacher Performance Assessment Instrument. North…
Validity of the Optometry Admission Test in Predicting Performance in Schools and Colleges of Optometry.

ERIC Educational Resources Information Center

Kramer, Gene A.; Johnston, JoElle

1997-01-01

A study examined the relationship between Optometry Admission Test scores and pre-optometry or undergraduate grade point average (GPA) with first and second year performance in optometry schools. The test's predictive validity was limited but significant, and comparable to those reported for other admission tests. In addition, the scores…
Dynamic Time Warping compared to established methods for validation of musculoskeletal models.

PubMed

Gaspar, Martin; Welke, Bastian; Seehaus, Frank; Hurschler, Christof; Schwarze, Michael

2017-04-11

By means of Multi-Body musculoskeletal simulation, important variables such as internal joint forces and moments can be estimated which cannot be measured directly. Validation can be carried out by qualitative or by quantitative methods. Especially when comparing time-dependent signals, many methods do not perform well and validation is often limited to qualitative approaches. The aim of the present study was to investigate the capabilities of the Dynamic Time Warping (DTW) algorithm for comparing time series, which can quantify phase as well as amplitude errors. We contrast the sensitivity of DTW with other established metrics: the Pearson correlation coefficient, cross-correlation, the metric according to Geers, RMSE and normalized RMSE. This study is based on two data sets, where one data set represents direct validation and the other represents indirect validation. Direct validation was performed in the context of clinical gait-analysis on trans-femoral amputees fitted with a 6-component force-moment sensor. Measured forces and moments from amputees' socket-prosthesis are compared to simulated forces and moments. Indirect validation was performed in the context of surface EMG measurements on a cohort of healthy subjects with measurements taken of seven muscles of the leg, which were compared to simulated muscle activations. Regarding direct validation, a positive linear relation between results of RMSE and nRMSE to DTW can be seen. For indirect validation, a negative linear relation exists between Pearson correlation and cross-correlation. We propose the DTW algorithm for use in both direct and indirect quantitative validation as it correlates well with methods that are most suitable for one of the tasks. However, in direct validation it should be used together with methods resulting in a dimensional error value, so that results can be interpreted more comprehensibly. Copyright © 2017 Elsevier Ltd. All rights reserved.
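As a rough illustration of why DTW and RMSE can disagree on time-dependent signals, the following is a minimal sketch of the textbook dynamic-programming DTW with an absolute-difference local cost. It is not the implementation used in the study; the function names and the toy signals are assumptions made only for this example.

```python
import math

def dtw_distance(a, b):
    """Textbook DTW: minimal accumulated |a_i - b_j| over all monotone alignments."""
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # repeat b[j-1]
                                     cost[i][j - 1],      # repeat a[i-1]
                                     cost[i - 1][j - 1])  # advance both
    return cost[n][m]

def rmse(a, b):
    """Root-mean-square error between two equally long, sample-aligned signals."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

# Hypothetical toy signals: same shape, "simulated" lags "measured" by one sample.
measured = [0.0, 1.0, 2.0, 1.0, 0.0, 0.0]
simulated = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0]

print(dtw_distance(measured, simulated))  # the shift is absorbed by warping
print(rmse(measured, simulated))          # the shift is penalized sample by sample
```

In this toy case the pure time shift costs nothing under DTW but yields a non-zero RMSE. This plain DTW value also carries no separate phase and amplitude components, which is one reason pairing it with a dimensional error measure, as the abstract recommends for direct validation, can aid interpretation.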
[Balanced scorecard for performance measurement of a nursing organization in a Korean hospital].

PubMed

Hong, Yoonmi; Hwang, Kyung Ja; Kim, Mi Ja; Park, Chang Gi

2008-02-01

The purpose of this study was to develop a balanced scorecard (BSC) for performance measurement of a Korean hospital nursing organization and to evaluate the validity and reliability of performance measurement indicators. Two hundred fifty-nine nurses in a Korean hospital participated in a survey questionnaire that included 29-item performance evaluation indicators developed by investigators of this study based on Kaplan and Norton's BSC (1992). Cronbach's alpha was used to test the reliability of the BSC. Exploratory and confirmatory factor analysis with a structural equation model (SEM) was applied to assess the construct validity of the BSC. Cronbach's alpha of 29 items was .948. Factor analysis of the BSC showed 5 principal components (eigenvalue >1.0) which explained 62.7% of the total variance, and it included a new one, community service. The SEM analysis results showed that 5 components were significant for the hospital BSC tool. The high degree of reliability and validity of this BSC suggests that it may be used for performance measurements of a Korean hospital nursing organization. Future studies may consider including a balanced number of nurse managers and staff nurses in the study. Further data analysis on the relationships among factors is recommended.
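For readers unfamiliar with the reliability statistic reported above, the following is a minimal sketch of the standard Cronbach's alpha formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The response matrix is hypothetical and unrelated to the study's data; the function name is an assumption for this example.

```python
from statistics import variance

def cronbach_alpha(responses):
    """responses: list of respondents, each a list of k item scores."""
    k = len(responses[0])
    # Sample variance of each item across respondents
    item_vars = [variance([row[i] for row in responses]) for i in range(k)]
    # Sample variance of each respondent's total score
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical 5 respondents x 3 items (illustrative only)
scores = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [5, 5, 4],
    [3, 3, 3],
]
alpha = cronbach_alpha(scores)
print(round(alpha, 3))
```

Values near 1 indicate that the items vary together, which is the sense in which the abstract describes an alpha of .948 as a high degree of reliability.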
Victoria Symptom Validity Test performance in children and adolescents with neurological disorders.

PubMed

Brooks, Brian L

2012-12-01

It is becoming increasingly important to study, use, and promote the utility of measures that are designed to detect non-compliance with testing (i.e., poor effort, symptom non-validity, response bias) as part of neuropsychological assessments with children and adolescents. Several measures have evidence for use in pediatrics, but there is a paucity of published support for the Victoria Symptom Validity Test (VSVT) in this population. The purpose of this study was to examine the performance on the VSVT in a sample of pediatric patients with known neurological disorders. The sample consisted of 100 consecutively referred children and adolescents between the ages of 6 and 19 years (mean = 14.0, SD = 3.1) with various neurological diagnoses. On the VSVT total items, 95% of the sample had performance in the "valid" range, with 5% being deemed "questionable" and 0% deemed "invalid". On easy items, 97% were "valid", 2% were "questionable", and 1% was "invalid." For difficult items, 84% were "valid," 16% were "questionable," and 0% was "invalid." For those patients given two effort measures (i.e., VSVT and Test of Memory Malingering; n = 65), none was identified as having poor test-taking compliance on both measures. VSVT scores were significantly correlated with age, intelligence, processing speed, and functional ratings of daily abilities (attention, executive functioning, and adaptive functioning), but not with objective performance on measures of sustained attention, verbal memory, or visual memory. The VSVT has potential to be used in neuropsychological assessments with pediatric patients.

Cultural factors affecting the differential performance of Israeli and Palestinian children on the Loewenstein Occupational Therapy Cognitive Assessment.

PubMed

Josman, Naomi; Abdallah, Taisir M; Engel-Yeger, Batya

2010-01-01

Cognitive performance is essential for children's functioning and may also predict school readiness. The suitability of Western standardized assessments for cognitive performance among children from different cultures needs to be elaborated. This study examined differences in cognitive performance between and within groups of Israeli and Palestinian children from the Middle East on the Loewenstein Occupational Therapy Cognitive Assessment (LOTCA), by elucidating cultural effects on the construct validity of the LOTCA using factor analysis. Participants included 101 Israeli and 125 Palestinian children from kindergarten, first and second grade who underwent the LOTCA. Factor analysis revealed four factors underlying items on the LOTCA, explaining the differences found between Israeli and Palestinian children in most of the LOTCA subtests. Culture may affect the construct validity of the LOTCA and may explain the difference in performance between the two cultural groups. The LOTCA's validity, as well as the validity of other instruments on which norms and decisions regarding the child's development and performance are made, should be further evaluated among children from different cultural backgrounds. 2010 Elsevier Ltd. All rights reserved.
Performance assessment instrument to assess the senior high students' psychomotor for the salt hydrolysis material

NASA Astrophysics Data System (ADS)

Nahadi, Firman, Harry; Yulina, Erlis

2016-02-01

The purpose of this study was to develop a performance assessment instrument for assessing the psychomotor competence of high school students on salt hydrolysis concepts. The design used in this study was Research & Development, which consists of three phases: development, testing and application of instruments. Subjects in this study were 93 high school students in class XI science. In the development phase, seven validators validated the 17 tasks of the instrument. In the test phase, 19 students were divided into three groups that took the performance test in the salt hydrolysis lab work at different times, observed by six raters. The first, second, and third groups consisted of five, six, and eight students, respectively. In the application phase, two raters observed the performance of 74 students in the salt hydrolysis lab work on several occasions. The results showed that 16 of the 17 tasks of the performance assessment instrument can be considered valid, with CVR values of 1.00 and 0.714, while the remaining task was not valid, with a CVR value of 0.429, below the critical value (0.622). In the test phase, the reliability values obtained for the instrument were 0.951 for the five-student group, 0.806 for the six-student group and 0.743 for the eight-student group. In interviews, teachers strongly agreed with the performance instrument developed. They stated that the instrument was feasible to use with a maximum of six students in a single observation.
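The CVR values quoted above are consistent with Lawshe's content validity ratio, CVR = (n_e - N/2) / (N/2), where n_e is the number of panelists endorsing a task and N is the panel size. The abstract does not name the formula, so treating it as Lawshe's is an assumption; the sketch below simply shows that a seven-member panel reproduces the reported figures. The function and constant names are hypothetical.

```python
def content_validity_ratio(n_endorsing, n_panelists):
    """Lawshe's CVR: ranges from -1 (no endorsement) to +1 (full endorsement)."""
    half = n_panelists / 2
    return (n_endorsing - half) / half

CRITICAL_VALUE = 0.622  # threshold quoted in the abstract for seven validators

# Assumed endorsement counts that reproduce the reported CVR values
for endorsing in (7, 6, 5):
    cvr = content_validity_ratio(endorsing, 7)
    verdict = "valid" if cvr >= CRITICAL_VALUE else "not valid"
    print(f"{endorsing}/7 validators: CVR = {cvr:.3f} ({verdict})")
```

Under this reading, tasks endorsed by all seven or by six validators pass the threshold, while a task endorsed by only five falls below it.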
Instruments to assess patients with rotator cuff pathology: a systematic review of measurement properties.

PubMed

Longo, Umile Giuseppe; Saris, Daniël; Poolman, Rudolf W; Berton, Alessandra; Denaro, Vincenzo

2012-10-01

The aims of this study were to obtain an overview of the methodological quality of studies on the measurement properties of rotator cuff questionnaires and to describe how well various aspects of the design and statistical analyses of studies on measurement properties are performed. A systematic review of published studies on the measurement properties of rotator cuff questionnaires was performed. Two investigators independently rated the quality of the studies using the Consensus-based Standards for the selection of health Measurement Instruments checklist. This checklist was developed in an international Delphi consensus study. Sixteen studies were included, in which two measurement instruments were evaluated, namely the Western Ontario Rotator Cuff Index and the Rotator Cuff Quality-of-Life Measure. The methodological quality of the included studies was adequate on some properties (construct validity, reliability, responsiveness, internal consistency, and translation) but needs to be improved on other aspects. The most important methodological aspects that need to be developed are as follows: measurement error, content validity, structural validity, cross-cultural validity, criterion validity, and interpretability. Considering the importance of adequate measurement properties, it is concluded that, in the field of rotator cuff pathology, there is room for improvement in the methodological quality of studies on measurement properties. II.
A multicenter prospective cohort study on camera navigation training for key user groups in minimally invasive surgery.

PubMed

Graafland, Maurits; Bok, Kiki; Schreuder, Henk W R; Schijven, Marlies P

2014-06-01

Untrained laparoscopic camera assistants in minimally invasive surgery (MIS) may cause suboptimal view of the operating field, thereby increasing risk for errors. Camera navigation is often performed by the least experienced member of the operating team, such as inexperienced surgical residents, operating room nurses, and medical students. The operating room nurses and medical students are currently not included as key user groups in structured laparoscopic training programs. A new virtual reality laparoscopic camera navigation (LCN) module was specifically developed for these key user groups. This multicenter prospective cohort study assesses face validity and construct validity of the LCN module on the Simendo virtual reality simulator. Face validity was assessed through a questionnaire on resemblance to reality and perceived usability of the instrument among experts and trainees. Construct validity was assessed by comparing scores of groups with different levels of experience on outcome parameters of speed and movement proficiency. The results obtained show uniform and positive evaluation of the LCN module among expert users and trainees, signifying face validity. Experts and intermediate experience groups performed significantly better in task time and camera stability during three repetitions, compared to the less experienced user groups (P < .007). Comparison of learning curves showed significant improvement of proficiency in time and camera stability for all groups during three repetitions (P < .007). The results of this study show face validity and construct validity of the LCN module. The module is suitable for use in training curricula for operating room nurses and novice surgical trainees, aimed at improving team performance in minimally invasive surgery. © The Author(s) 2013.
Developing and validating risk prediction models in an individual participant data meta-analysis

PubMed Central

2014-01-01

Background: Risk prediction models estimate the risk of developing future outcomes for individuals based on one or more underlying characteristics (predictors). We review how researchers develop and validate risk prediction models within an individual participant data (IPD) meta-analysis, in order to assess the feasibility and conduct of the approach. Methods: A qualitative review of the aims, methodology, and reporting in 15 articles that developed a risk prediction model using IPD from multiple studies. Results: The IPD approach offers many opportunities but methodological challenges exist, including: unavailability of requested IPD, missing patient data and predictors, and between-study heterogeneity in methods of measurement, outcome definitions and predictor effects. Most articles develop their model using IPD from all available studies and perform only an internal validation (on the same set of data). Ten of the 15 articles did not allow for any study differences in baseline risk (intercepts), potentially limiting their model’s applicability and performance in some populations. Only two articles used external validation (on different data), including a novel method which develops the model on all but one of the IPD studies, tests performance in the excluded study, and repeats by rotating the omitted study. Conclusions: An IPD meta-analysis offers unique opportunities for risk prediction research. Researchers can make more of this by allowing separate model intercept terms for each study (population) to improve generalisability, and by using ‘internal-external cross-validation’ to simultaneously develop and validate their model. Methodological challenges can be reduced by prospectively planned collaborations that share IPD for risk prediction. PMID:24397587
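The rotation scheme described in this abstract (develop on all but one study, test in the excluded study, repeat with each study omitted in turn) can be sketched as a leave-one-study-out loop. The example below is a toy under stated assumptions: the "model" is just the pooled event rate, the performance measure is the absolute gap between predicted and observed event rates, and all names and data are hypothetical rather than taken from any reviewed article.

```python
def internal_external_cv(studies, fit, evaluate):
    """studies: dict mapping study id -> list of binary outcomes (1 = event).

    For each study, fit on the pooled records of all other studies and
    evaluate on the held-out study. Returns {study id: performance}.
    """
    results = {}
    for held_out in studies:
        training = [y for sid, ys in studies.items() if sid != held_out for y in ys]
        model = fit(training)
        results[held_out] = evaluate(model, studies[held_out])
    return results

def fit_pooled_risk(outcomes):
    """Toy intercept-only 'model': the pooled event rate."""
    return sum(outcomes) / len(outcomes)

def calibration_gap(predicted_risk, outcomes):
    """Absolute difference between predicted and observed event rates."""
    return abs(sum(outcomes) / len(outcomes) - predicted_risk)

# Hypothetical IPD from three studies with different baseline risks
studies = {
    "A": [0, 1, 0, 0],  # observed rate 0.25
    "B": [1, 1, 0, 0],  # observed rate 0.50
    "C": [1, 1, 1, 0],  # observed rate 0.75
}
gaps = internal_external_cv(studies, fit_pooled_risk, calibration_gap)
print(gaps)
```

The large gaps for the studies with atypical baseline risk show, in miniature, why the abstract recommends allowing study-specific intercepts: a single pooled intercept transfers poorly to populations whose baseline risk differs.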
Statistical considerations on prognostic models for glioma

PubMed Central

Molinaro, Annette M.; Wrensch, Margaret R.; Jenkins, Robert B.; Eckel-Passow, Jeanette E.

2016-01-01

Given the lack of beneficial treatments in glioma, there is a need for prognostic models for therapeutic decision making and life planning. Recently several studies defining subtypes of glioma have been published. Here, we review the statistical considerations of how to build and validate prognostic models, explain the models presented in the current glioma literature, and discuss advantages and disadvantages of each model. The 3 statistical considerations to establishing clinically useful prognostic models are: study design, model building, and validation. Careful study design helps to ensure that the model is unbiased and generalizable to the population of interest. During model building, a discovery cohort of patients can be used to choose variables, construct models, and estimate prediction performance via internal validation. Via external validation, an independent dataset can assess how well the model performs. It is imperative that published models properly detail the study design and methods for both model building and validation. This provides readers the information necessary to assess the bias in a study, compare other published models, and determine the model's clinical usefulness. As editors, reviewers, and readers of the relevant literature, we should be cognizant of the needed statistical considerations and insist on their use. PMID:26657835
Measurement of predictive validity in violence risk assessment studies: a second-order systematic review.

PubMed

Singh, Jay P; Desmarais, Sarah L; Van Dorn, Richard A

2013-01-01

The objective of the present review was to examine how predictive validity is analyzed and reported in studies of instruments used to assess violence risk. We reviewed 47 predictive validity studies published between 1990 and 2011 of 25 instruments that were included in two recent systematic reviews. Although all studies reported receiver operating characteristic curve analyses and the area under the curve (AUC) performance indicator, this methodology was defined inconsistently and findings often were misinterpreted. In addition, there was between-study variation in benchmarks used to determine whether AUCs were small, moderate, or large in magnitude. Though virtually all of the included instruments were designed to produce categorical estimates of risk - through the use of either actuarial risk bins or structured professional judgments - only a minority of studies calculated performance indicators for these categorical estimates. In addition to AUCs, other performance indicators, such as correlation coefficients, were reported in 60% of studies, but were infrequently defined or interpreted. An investigation of sources of heterogeneity did not reveal significant variation in reporting practices as a function of risk assessment approach (actuarial vs. structured professional judgment), study authorship, geographic location, type of journal (general vs. specialized audience), sample size, or year of publication. Findings suggest a need for standardization of predictive validity reporting to improve comparison across studies and instruments. Copyright © 2013 John Wiley & Sons, Ltd.
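Because the abstract notes that the AUC was defined inconsistently across studies, it may help to state the standard definition: the AUC equals the probability that a randomly chosen individual with the outcome receives a higher risk score than a randomly chosen individual without it, with ties counted as one half. The sketch below computes that quantity directly; the scores are hypothetical and the function name is an assumption.

```python
def auc(scores_with_outcome, scores_without_outcome):
    """AUC as the proportion of (positive, negative) pairs ranked correctly.

    Ties contribute 0.5, which matches the area under the empirical ROC curve.
    """
    wins = 0.0
    for p in scores_with_outcome:
        for n in scores_without_outcome:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_with_outcome) * len(scores_without_outcome))

# Hypothetical risk scores (illustrative only)
violent = [0.9, 0.8, 0.4]
non_violent = [0.7, 0.3, 0.4]
print(round(auc(violent, non_violent), 3))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking. Note that the AUC summarizes ranking across all thresholds and says nothing by itself about how well categorical risk bins perform, which is the gap the review highlights.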
Implementing Lumberjacks and Black Swans Into Model-Based Tools to Support Human-Automation Interaction.

PubMed

Sebok, Angelia; Wickens, Christopher D

2017-03-01

The objectives were to (a) implement theoretical perspectives regarding human-automation interaction (HAI) into model-based tools to assist designers in developing systems that support effective performance and (b) conduct validations to assess the ability of the models to predict operator performance. Two key concepts in HAI, the lumberjack analogy and black swan events, have been studied extensively. The lumberjack analogy describes the effects of imperfect automation on operator performance. In routine operations, an increased degree of automation supports performance, but in failure conditions, increased automation results in more significantly impaired performance. Black swans are the rare and unexpected failures of imperfect automation. The lumberjack analogy and black swan concepts have been implemented into three model-based tools that predict operator performance in different systems. These tools include a flight management system, a remotely controlled robotic arm, and an environmental process control system. Each modeling effort included a corresponding validation. In one validation, the software tool was used to compare three flight management system designs, which were ranked in the same order as predicted by subject matter experts. The second validation compared model-predicted operator complacency with empirical performance in the same conditions. The third validation compared model-predicted and empirically determined time to detect and repair faults in four automation conditions. The three model-based tools offer useful ways to predict operator performance in complex systems. The three tools offer ways to predict the effects of different automation designs on operator performance.
Analysis of Carbamate Pesticides: Validation of Semi-Volatile Analysis by HPLC-MS/MS by EPA Method MS666

DOE Office of Scientific and Technical Information (OSTI.GOV)

Owens, J; Koester, C

The Environmental Protection Agency's (EPA) Region 5 Chicago Regional Laboratory (CRL) developed a method for analysis of aldicarb, bromadiolone, carbofuran, oxamyl, and methomyl in water by high performance liquid chromatography tandem mass spectrometry (HPLC-MS/MS), titled Method EPA MS666. This draft standard operating procedure (SOP) was distributed to multiple EPA laboratories and to Lawrence Livermore National Laboratory, which was tasked to serve as a reference laboratory for EPA's Environmental Reference Laboratory Network (ERLN) and to develop and validate analytical procedures. The primary objective of this study was to validate and verify the analytical procedures described in MS666 for analysis of carbamate pesticides in aqueous samples. The gathered data from this validation study will be used to: (1) demonstrate analytical method performance; (2) generate quality control acceptance criteria; and (3) revise the SOP to provide a validated method that would be available for use during a homeland security event. The data contained in this report will be compiled, by EPA CRL, with data generated by other EPA Regional laboratories so that performance metrics of Method EPA MS666 can be determined.
A Unified Model of Performance: Validation of its Predictions across Different Sleep/Wake Schedules.

PubMed

Ramakrishnan, Sridhar; Wesensten, Nancy J; Balkin, Thomas J; Reifman, Jaques

2016-01-01

Historically, mathematical models of human neurobehavioral performance developed on data from one sleep study were limited to predicting performance in similar studies, restricting their practical utility. We recently developed a unified model of performance (UMP) to predict the effects of the continuum of sleep loss, from chronic sleep restriction (CSR) to total sleep deprivation (TSD) challenges, and validated it using data from two studies of one laboratory. Here, we significantly extended this effort by validating the UMP predictions across a wide range of sleep/wake schedules from different studies and laboratories. We developed the UMP on psychomotor vigilance task (PVT) lapse data from one study encompassing four different CSR conditions (7 d of 3, 5, 7, and 9 h of sleep/night), and predicted performance in five other studies (from four laboratories), including different combinations of TSD (40 to 88 h), CSR (2 to 6 h of sleep/night), control (8 to 10 h of sleep/night), and nap (nocturnal and diurnal) schedules. The UMP accurately predicted PVT performance trends across 14 different sleep/wake conditions, yielding average prediction errors between 7% and 36%, with the predictions lying within 2 standard errors of the measured data 87% of the time. In addition, the UMP accurately predicted performance impairment (average error of 15%) for schedules (TSD and naps) not used in model development. The unified model of performance can be used as a tool to help design sleep/wake schedules to optimize the extent and duration of neurobehavioral performance and to accelerate recovery after sleep loss. © 2016 Associated Professional Sleep Societies, LLC.
Simulated Driving Assessment (SDA) for Teen Drivers: Results from a Validation Study

PubMed Central

McDonald, Catherine C.; Kandadai, Venk; Loeb, Helen; Seacrist, Thomas S.; Lee, Yi-Ching; Winston, Zachary; Winston, Flaura K.

2015-01-01

Background: Driver error and inadequate skill are common critical reasons for novice teen driver crashes, yet few validated, standardized assessments of teen driving skills exist. The purpose of this study was to evaluate the construct and criterion validity of a newly developed Simulated Driving Assessment (SDA) for novice teen drivers. Methods: The SDA's 35-minute simulated drive incorporates 22 variations of the most common teen driver crash configurations. Driving performance was compared for 21 inexperienced teens (age 16–17 years, provisional license ≤90 days) and 17 experienced adults (age 25–50 years, license ≥5 years, drove ≥100 miles per week, no collisions or moving violations ≤3 years). SDA driving performance (Error Score) was based on driving safety measures derived from simulator and eye-tracking data. Negative driving outcomes included simulated collisions or run-off-the-road incidents. A professional driving evaluator/instructor reviewed videos of SDA performance (DEI Score). Results: The SDA demonstrated construct validity: (1) Teens had a higher Error Score than adults (30 vs. 13, p=0.02); (2) For each additional error committed, the relative risk of a participant's propensity for a simulated negative driving outcome increased by 8% (95% CI: 1.05–1.10, p<0.01). The SDA demonstrated criterion validity: Error Score was correlated with DEI Score (r=−0.66, p<0.001). Conclusions: This study supports the concept of validated simulated driving tests like the SDA to assess novice driver skill in complex and hazardous driving scenarios. The SDA, as a standard protocol to evaluate teen driver performance, has the potential to facilitate screening and assessment of teen driving readiness and could be used to guide targeted skill training. PMID:25740939
Reliability and validity of two self-report measures of impairment and disability for MS. North American Research Consortium on Multiple Sclerosis Outcomes Study Group.

PubMed

Schwartz, C E; Vollmer, T; Lee, H

1999-01-01

To describe the results of a multicenter study that validated two new patient-reported measures of neurologic impairment and disability for use in MS clinical research. Self-reported data can provide a cost-effective means to assess patient functioning, and can be useful for screening patients who require additional evaluation. Thirteen MS centers from the United States and Canada implemented a cross-sectional validation study of two new measures of neurologic function. The Symptom Inventory is a measure of neurologic impairment with six subscales designed to correlate with localization of brain lesion. The Performance Scales measure disability in eight domains of function: mobility, hand function, vision, fatigue, cognition, bladder/bowel, sensory, and spasticity. Measures given for comparison included a neurologic examination (Expanded Disability Status Scale, Ambulation Index, Disease Steps) as well as the patient-reported Health Status Questionnaire and the Quality of Well-being Index. Participants included 274 MS patients and 296 healthy control subjects who were matched to patients on age, gender, and education. Both the Symptom Inventory and the Performance Scales showed high test-retest and internal consistency reliability. Correlational analyses supported the construct validity of both measures. Discriminant function analysis reduced the Symptom Inventory to 29 items without sacrificing reliability and increased its discriminant validity. The Performance Scales explained more variance in clinical outcomes and global quality of life than the Symptom Inventory, and there was some evidence that the two measures complemented each other in predicting Quality of Well-being Index scores. The Symptom Inventory and the Performance Scales are reliable and valid measures.
Validation of a new mortality risk prediction model for people 65 years and older in northwest Russia: The Crystal risk score.

PubMed

Turusheva, Anna; Frolova, Elena; Bert, Vaes; Hegendoerfer, Eralda; Degryse, Jean-Marie

2017-07-01

Prediction models help to make decisions about further management in clinical practice. This study aims to develop a mortality risk score based on previously identified risk predictors and to perform internal and external validations. In a population-based prospective cohort study of 611 community-dwelling individuals aged 65+ in St. Petersburg (Russia), all-cause mortality risks over 2.5 years of follow-up were determined based on the results obtained from anthropometry, medical history, physical performance tests, spirometry and laboratory tests. C-statistic, risk reclassification analysis, integrated discrimination improvement analysis, decision curve analysis, internal validation and external validation were performed. Older adults were at higher risk for mortality [HR (95%CI)=4.54 (3.73-5.52)] when two or more of the following components were present: poor physical performance, low muscle mass, poor lung function, and anemia. When anemia was combined with high C-reactive protein (CRP) and high B-type natriuretic peptide (BNP) was added, the HR (95%CI) was slightly higher [5.81 (4.73-7.14)], even after adjusting for age, sex and comorbidities. Our models were validated in an external population of adults 80+. The extended model had a better predictive capacity for cardiovascular mortality [HR (95%CI)=5.05 (2.23-11.44)] compared to the baseline model [HR (95%CI)=2.17 (1.18-4.00)] in the external population. We developed and validated a new risk prediction score that may be used to identify older adults at higher risk for mortality in Russia. Additional studies need to determine which targeted interventions improve the outcomes of these at-risk individuals. Copyright © 2017 Elsevier B.V. All rights reserved.
Simulated training in colonoscopic stenting of colonic strictures: validation of a cadaver model.

PubMed

Iordache, F; Bucobo, J C; Devlin, D; You, K; Bergamaschi, R

2015-07-01

There are currently no available simulation models for training in colonoscopic stent deployment. The aim of this study was to validate a cadaver model for simulation training in colonoscopy with stent deployment for colonic strictures. This was a prospective study enrolling surgeons at a single institution. Participants performed colonoscopic stenting on a cadaver model. Their performance was assessed by two independent observers. Measurements were performed for quantitative analysis (time to identify stenosis, time for deployment, accuracy) and a weighted score was devised for assessment. The Mann-Whitney U-test and Student's t-test were used for nonparametric and parametric data, respectively. Cohen's kappa coefficient was used for reliability. Twenty participants performed a colonoscopy with deployment of a self-expandable metallic stent in two cadavers (groups A and B) with 20 strictures overall. The median time was 206 s. The model was able to differentiate between experts and novices (P = 0.013). The results showed a good consensus estimate of reliability, with kappa = 0.571 (P < 0.0001). The cadaver model described in this study has content, construct and concurrent validity for simulation training in colonoscopic deployment of self-expandable stents for colonic strictures. Further studies are needed to evaluate the predictive validity of this model in terms of skill transfer to clinical practice. Colorectal Disease © 2014 The Association of Coloproctology of Great Britain and Ireland.
The Edinburgh Postnatal Depression Scale (EPDS): translation and validation study of the Iranian version.

PubMed

Montazeri, Ali; Torkan, Behnaz; Omidvari, Sepideh

2007-04-04

The Edinburgh Postnatal Depression Scale (EPDS) is a widely used instrument to measure postnatal depression. This study aimed to translate and to test the reliability and validity of the EPDS in Iran. The English language version of the EPDS was translated into Persian (Iranian language) and was used in this study. The questionnaire was administered to a consecutive sample of 100 women with normal (n = 50) and caesarean section (n = 50) deliveries at two points in time: 6 to 8 weeks and 12 to 14 weeks after delivery. Statistical analysis was performed to test the reliability and validity of the EPDS. Overall 22% of women at time 1 and 18% at time 2 reported experiencing postpartum depression. In general, the Iranian version of the EPDS was found to be acceptable to almost all women. Cronbach's alpha coefficient (to test reliability) was found to be 0.77 at time 1 and 0.86 at time 2. In addition, test-retest reliability was assessed and the intraclass correlation coefficient was found to be 0.80. Validity, assessed using known-groups comparison, showed satisfactory results. The questionnaire discriminated well between sub-groups of women differing in mode of delivery in the expected direction. The factor analysis indicated a three-factor structure that jointly accounted for 58% of the variance. This preliminary validation study of the Iranian version of the EPDS proved that it is an acceptable, reliable and valid measure of postnatal depression. It seems that the EPDS not only measures postpartum depression but also may be measuring something more.
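Internal consistency in the abstract above is summarised with Cronbach's alpha. A minimal sketch of the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), is given below. The function name `cronbach_alpha` and the response matrix are hypothetical and are not data from the study.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha; `responses` is one list of item scores per respondent."""
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in responses]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical scores: five respondents, three items scored 0-3.
responses = [
    [2, 3, 3],
    [1, 1, 2],
    [0, 1, 1],
    [3, 3, 2],
    [2, 2, 3],
]
print(f"alpha = {cronbach_alpha(responses):.2f}")
```

The same variance estimator is used in the numerator and the denominator, so switching to population variances would not change the ratio.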
Validity and reliability of the Short Physical Performance Battery (SPPB): a pilot study on mobility in the Colombian Andes.

PubMed

Gómez, José Fernando; Curcio, Carmen-Lucía; Alvarado, Beatriz; Zunzunegui, María Victoria; Guralnik, Jack

2013-07-01

To assess the validity (convergent and construct) and reliability of the Short Physical Performance Battery (SPPB) among non-disabled adults between 65 and 74 years of age residing in the Andes Mountains of Colombia. Design: Validation study; 150 subjects aged 65 to 74 years recruited from elderly associations (day-centers) in Manizales, Colombia. The SPPB tests of balance, including time to walk 4 meters and time required to stand from a chair 5 times were administered to all participants. Reliability was analyzed with a 7-day interval between assessments and use of repeated ANOVA testing. Construct validity was assessed using factor analysis and by testing the relationship between SPPB and depressive symptoms, cognitive function, and self rated health (SRH), while the concurrent validity was measured through relationships with mobility limitations and disability in Activities of Daily Living (ADL). ANOVA tests were used to establish these associations. Test-retest reliability of the SPPB was high: 0.87 (CI95%: 0.77-0.96). A one factor solution was found with three SPPB tests. SPPB was related to self-rated health, limitations in walking and climbing steps and to indicators of disability, as well as to cognitive function and depression. There was a graded decrease in the mean SPPB score with increasing disability and poor health. The Spanish version of SPPB is reliable and valid to assess physical performance among older adults from our region. Future studies should establish their clinical applications and explore usage in population studies.
Validity and reliability of sleep time questionnaires in children and adolescents: A systematic review and meta-analysis.

PubMed

Nascimento-Ferreira, Marcus V; Collese, Tatiana S; de Moraes, Augusto César F; Rendo-Urteaga, Tara; Moreno, Luis A; Carvalho, Heráclito B

2016-12-01

Sleep duration has been associated with several health outcomes in children and adolescents. As an extensive number of questionnaires are currently used to investigate sleep schedule or sleep time, we performed a systematic review of criterion validation of sleep time questionnaires for children and adolescents, considering accelerometers as the reference method. We found a strong correlation between questionnaires and accelerometers for weeknights and a moderate correlation for weekend nights. When considering only studies performing a reliability assessment of the used questionnaires, a significant increase in the correlations for both weeknights and weekend nights was observed. In conclusion, moderate to strong criterion validity of sleep time questionnaires was observed; however, the reliability assessment of the questionnaires showed strong validation performance. Copyright © 2015 Elsevier Ltd. All rights reserved.
Construct validity of the ovine model in endoscopic sinus surgery training.

PubMed

Awad, Zaid; Taghi, Ali; Sethukumar, Priya; Tolley, Neil S

2015-03-01

To demonstrate construct validity of the ovine model as a tool for training in endoscopic sinus surgery (ESS). Prospective, cross-sectional evaluation study. Over 18 consecutive months, trainees and experts were evaluated in their ability to perform a range of tasks (based on previous face validation and descriptive studies conducted by the same group) relating to ESS on the sheep-head model. Anonymized randomized video recordings of the above were assessed by two independent and blinded assessors. A validated assessment tool utilizing a five-point Likert scale was employed. Construct validity was calculated by comparing scores across training levels and experts using mean and interquartile range of global and task-specific scores. Subgroup analysis of the intermediate group ascertained previous experience. Nonparametric descriptive statistics were used, and analysis was carried out using SPSS version 21 (IBM, Armonk, NY). Reliability of the assessment tool was confirmed. The model discriminated well between different levels of expertise in global and task-specific scores. A positive correlation was noted between year in training and both global and task-specific scores (P < .001). Experience of the intermediate group was variable, and the number of ESS procedures performed under supervision had the highest impact on performance. This study describes an alternative model for ESS training and assessment. It is also the first to demonstrate construct validity of the sheep-head model for ESS training. © 2014 The American Laryngological, Rhinological and Otological Society, Inc.
Convergent validity and sex differences in healthy elderly adults for performance on 3D virtual reality navigation learning and 2D hidden maze tasks.

PubMed

Tippett, William J; Lee, Jang-Han; Mraz, Richard; Zakzanis, Konstantine K; Snyder, Peter J; Black, Sandra E; Graham, Simon J

2009-04-01

This study assessed the convergent validity of a virtual environment (VE) navigation learning task, the Groton Maze Learning Test (GMLT), and selected traditional neuropsychological tests performed in a group of healthy elderly adults (n = 24). The cohort was divided equally between males and females to explore performance variability due to sex differences, which were subsequently characterized and reported as part of the analysis. To facilitate performance comparisons, specific "efficiency" scores were created for both the VE navigation task and the GMLT. Men reached peak performance more rapidly than women during VE navigation and on the GMLT and significantly outperformed women on the first learning trial in the VE. Results suggest reasonable convergent validity across the VE task, GMLT, and selected neuropsychological tests for assessment of spatial memory.
Development and validation of a web-based questionnaire for surveying the health and working conditions of high-performance marine craft populations.

PubMed

de Alwis, Manudul Pahansen; Lo Martire, Riccardo; Äng, Björn O; Garme, Karl

2016-06-20

High-performance marine craft crews are susceptible to various adverse health conditions caused by multiple interactive factors. However, there are limited epidemiological data available for assessment of working conditions at sea. Although questionnaire surveys are widely used for identifying exposures, outcomes and associated risks with high accuracy levels, until now, no validated epidemiological tool exists for surveying occupational health and performance in these populations. To develop and validate a web-based questionnaire for epidemiological assessment of occupational and individual risk exposure pertinent to the musculoskeletal health conditions and performance in high-performance marine craft populations. A questionnaire for investigating the association between work-related exposure, performance and health was initially developed by a consensus panel under four subdomains, viz. demography, lifestyle, work exposure and health and systematically validated by expert raters for content relevance and simplicity in three consecutive stages, each iteratively followed by a consensus panel revision. The item content validity index (I-CVI) was determined as the proportion of experts giving a rating of 3 or 4. The scale content validity index (S-CVI/Ave) was computed by averaging the I-CVIs for the assessment of the questionnaire as a tool. Finally, the questionnaire was pilot tested. The S-CVI/Ave increased from 0.89 to 0.96 for relevance and from 0.76 to 0.94 for simplicity, resulting in 36 items in the final questionnaire. The pilot test confirmed the feasibility of the questionnaire. The present study shows that the web-based questionnaire fulfils previously published validity acceptance criteria and is therefore considered valid and feasible for the empirical surveying of epidemiological aspects among high-performance marine craft crews and similar populations. Published by the BMJ Publishing Group Limited. 
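The abstract above defines the item content validity index (I-CVI) as the proportion of experts giving a rating of 3 or 4, and the scale index (S-CVI/Ave) as the average of the I-CVIs. Those two definitions can be sketched as follows. The function names and the expert ratings are hypothetical, and a 4-point rating scale is assumed.

```python
def item_cvi(ratings):
    """I-CVI: proportion of experts rating the item 3 or 4 (4-point scale assumed)."""
    if not ratings:
        raise ValueError("at least one expert rating is required")
    return sum(1 for r in ratings if r >= 3) / len(ratings)


def scale_cvi_ave(ratings_by_item):
    """S-CVI/Ave: mean of the I-CVIs over all items of the questionnaire."""
    i_cvis = [item_cvi(ratings) for ratings in ratings_by_item]
    return sum(i_cvis) / len(i_cvis)


# Hypothetical relevance ratings: three items, five expert raters each.
ratings_by_item = [
    [4, 4, 3, 4, 3],  # all five experts rate 3 or 4
    [4, 3, 2, 4, 3],  # four of five
    [2, 3, 4, 4, 1],  # three of five
]
s_cvi = scale_cvi_ave(ratings_by_item)
print(f"S-CVI/Ave = {s_cvi:.2f}")
```

In an iterative process like the one described, items with a low I-CVI would be the natural candidates for revision before the next rating round.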
Improving the accuracy of blood pressure measurement: the influence of the European Society of Hypertension International Protocol (ESH-IP) for the validation of blood pressure measuring devices and future perspectives.

PubMed

Stergiou, George S; Asmar, Roland; Myers, Martin; Palatini, Paolo; Parati, Gianfranco; Shennan, Andrew; Wang, Jiguang; O'Brien, Eoin

2018-03-01

The European Society of Hypertension (ESH) International Protocol (ESH-IP) for the validation of blood pressure (BP) measuring devices was published in 2002, with the main objective of simplifying the validation procedures, so that more BP monitors would be subjected to independent validation. This article provides an overview of the international impact of the ESH-IP and of the lessons learned from its use, to be able to justify further developments in validation protocols. A review of published (PubMed) validation studies from 2002 to 2017 was performed. One hundred and seventy-seven validation studies using the ESH-IP, 59 using the British Hypertension Society protocol, 46 using the Association for the Advancement of Medical Instrumentation (AAMI) standard and 23 using the International Organization for Standardization (ISO) standard were identified. Lists of validated office-clinic, home and ambulatory BP monitors are provided. Of the ESH-IP studies, 93% tested oscillometric devices, 80% upper arm, 71% home, 25% office and 7% ambulatory monitors (some had more than one function). The original goal of the ESH-IP has been fulfilled in that in the last decade the number of published validation studies has more than doubled. It is now recognized that the provision of accurate devices would be best served by having a universal protocol. An international initiative has been put in place by AAMI, ESH and ISO experts aiming to reach consensus for a universal validation protocol to be accepted worldwide, which will allow a more thorough evaluation of the accuracy and performance of future BP monitors.
Training and Validation of Standardized Patients for Unannounced Assessment of Physicians' Management of Depression

ERIC Educational Resources Information Center

Shirazi, Mandana; Sadeghi, Majid; Emami, A.; Kashani, A. Sabouri; Parikh, Sagar; Alaeddini, F.; Arbabi, Mohammad; Wahlstrom, Rolf

2011-01-01

Objective: Standardized patients (SPs) have been developed to measure practitioner performance in actual practice settings, but results have not been fully validated for psychiatric disorders. This study describes the process of creating reliable and valid SPs for unannounced assessment of general-practitioners' management of depression disorders…
Validity Issues in Assessing Dispositions: The Confirmatory Factor Analysis of a Teacher Dispositions Form

ERIC Educational Resources Information Center

Niu, Chunling; Everson, Kimberlee; Dietrich, Sylvia; Zippay, Cassie

2017-01-01

Critics against the inclusion of dispositions as part of the teacher education accreditation focus on the dearth of empirical literature on reliably and validly accessing dispositions (Borko, Liston, & Whitcomb, 2007). In this study, a confirmatory factor analysis (CFA) was performed to test the factorial validity of a teacher dispositions…
Interlaboratory validation of an improved U.S. Food and Drug Administration method for detection of Cyclospora cayetanensis in produce using TaqMan real-time PCR

USDA-ARS's Scientific Manuscript database

A collaborative validation study was performed to evaluate the performance of a new U.S. Food and Drug Administration method developed for detection of the protozoan parasite, Cyclospora cayetanensis, on cilantro and raspberries. The method includes a sample preparation step in which oocysts are re...
Construct Validation of Physical Activity Surveys in Culturally Diverse Older Adults: A Comparison of Four Commonly Used Questionnaires

ERIC Educational Resources Information Center

Moore, Delilah S.; Ellis, Rebecca; Allen, Priscilla D.; Cherry, Katie E.; Monroe, Pamela A.; O'Neil, Carol E.; Wood, Robert H.

2008-01-01

The purpose of this study was to establish validity evidence of four physical activity (PA) questionnaires in culturally diverse older adults by comparing self-report PA with performance-based physical function. Participants were 54 older adults who completed the Continuous Scale Physical Functional Performance 10-item Test (CS-PFP10), Physical…
Predicting Job Performance for the Visually Impaired: Validity of the Fine Finger Dexterity Work Task.

ERIC Educational Resources Information Center

Giesen, J. Martin; And Others

The study was designed to determine the reliability and criterion validity of a psychomotor performance test (the Fine Finger Dexterity Work Task Unit) with 40 partially or totally blind adults. Reliability was established by using the test-retest method. A supervisory rating was developed and the reliability established by using the split-half…
Evidences of Validity of a Scale for Mapping Professional as Defining Competences and Performance by Brazilian Tutors

ERIC Educational Resources Information Center

Coelho, Francisco Antonio, Jr.; Ferreira, Rodrigo Rezende; Paschoal, Tatiane; Faiad, Cristiane; Meneses, Paulo Murce

2015-01-01

The purpose of this study was twofold: to assess evidences of construct validity of the Brazilian Scale of Tutors Competences in the field of Open and Distance Learning and to examine if variables such as professional experience, perception of the student's learning performance and prior experience influence the development of technical and…
Improving the Validity of the PSNI in Assessing the Performance of Deaf Parents of Hearing Children.

ERIC Educational Resources Information Center

Mallory, Barbara L.; And Others

1992-01-01

This study, involving 15 deaf parents, their hearing children, and the children's hearing grandparents, examined the content validity of the Parental Strengths and Needs Inventory for evaluating the child-rearing performance of deaf adults. The inventory was found to be inadequate for assessing the strengths and needs of deaf parents. (Author/JDD)
Validity and Reliability of the Clinical Competency Evaluation Instrument for Use among Physiotherapy Students: Pilot study.

PubMed

Muhamad, Zailani; Ramli, Ayiesah; Amat, Salleh

2015-05-01

The aim of this study was to determine the content validity, internal consistency, test-retest reliability and inter-rater reliability of the Clinical Competency Evaluation Instrument (CCEVI) in assessing the clinical performance of physiotherapy students. This study was carried out between June and September 2013 at University Kebangsaan Malaysia (UKM), Kuala Lumpur, Malaysia. A panel of 10 experts was identified to establish content validity by evaluating and rating each of the items used in the CCEVI with regard to their relevance in measuring students' clinical competency. A total of 50 UKM undergraduate physiotherapy students were assessed throughout their clinical placement to determine the construct validity of these items. The instrument's reliability was determined through a cross-sectional study involving a clinical performance assessment of 14 final-year undergraduate physiotherapy students. The content validity index of the entire CCEVI was 0.91, while the proportion of agreement on the content validity indices ranged from 0.83-1.00. The CCEVI construct validity was established with factor loading of ≥0.6, while internal consistency (Cronbach's alpha) overall was 0.97. Test-retest reliability of the CCEVI was confirmed with a Pearson's correlation range of 0.91-0.97 and an intraclass correlation coefficient range of 0.95-0.98. Inter-rater reliability of the CCEVI domains ranged from 0.59 to 0.97 on initial and subsequent assessments. This pilot study confirmed the content validity of the CCEVI. It showed high internal consistency, thereby providing evidence that the CCEVI has moderate to excellent inter-rater reliability. However, additional refinement in the wording of the CCEVI items, particularly in the domains of safety and documentation, is recommended to further improve the validity and reliability of the instrument.
Measurement properties of depression questionnaires in patients with diabetes: a systematic review.

PubMed

van Dijk, Susan E M; Adriaanse, Marcel C; van der Zwaan, Lennart; Bosmans, Judith E; van Marwijk, Harm W J; van Tulder, Maurits W; Terwee, Caroline B

2018-06-01

To conduct a systematic review on measurement properties of questionnaires measuring depressive symptoms in adult patients with type 1 or type 2 diabetes. A systematic review of the literature in MEDLINE, EMbase and PsycINFO was performed. Full text, original articles, published in any language up to October 2016 were included. Eligibility for inclusion was independently assessed by three reviewers who worked in pairs. Methodological quality of the studies was evaluated by two independent reviewers using the COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) checklist. Quality of the questionnaires was rated per measurement property, based on the number and quality of the included studies and the reported results. Of 6286 unique hits, 21 studies met our criteria evaluating nine different questionnaires in multiple settings and languages. The methodological quality of the included studies was variable for the different measurement properties: 9/15 studies scored 'good' or 'excellent' on internal consistency, 2/5 on reliability, 0/1 on content validity, 10/10 on structural validity, 8/11 on hypothesis testing, 1/5 on cross-cultural validity, and 4/9 on criterion validity. For the CES-D, there was strong evidence for good internal consistency, structural validity, and construct validity; moderate evidence for good criterion validity; and limited evidence for good cross-cultural validity. The PHQ-9 and WHO-5 also performed well on several measurement properties. However, the evidence for structural validity of the PHQ-9 was inconclusive. The WHO-5 was less extensively researched and originally not developed to measure depression. Currently, the CES-D is best supported for measuring depressive symptoms in diabetes patients.
Validating the Assessment for Measuring Indonesian Secondary School Students Performance in Ecology

NASA Astrophysics Data System (ADS)

Rachmatullah, A.; Roshayanti, F.; Ha, M.

2017-09-01

The aims of the current study are to validate the American Association for the Advancement of Science (AAAS) Ecology assessment and to examine the performance of Indonesian secondary school students on the assessment. A total of 611 Indonesian secondary school students (218 middle school students and 393 high school students) participated in the study. Forty-five items of the AAAS assessment on the topic of Interdependence in Ecosystems were divided into two versions, each of which has 21 similar items. The linking-item method was used to combine those two versions of the assessment, and Rasch analyses were then used to validate the instrument. An independent-samples t-test was also run to compare the performance of Indonesian students and American students based on the mean of item difficulty. We found that, of the total of 45 items, three were identified as misfitting items. We also found that both Indonesian middle and high school students performed significantly lower than American students, with very large and medium effect sizes. We will discuss our findings with regard to validation issues and the connection to Indonesian students' science literacy.
Vacuum decay container closure integrity leak test method development and validation for a lyophilized product-package system.

PubMed

Patel, Jayshree; Mulhall, Brian; Wolf, Heinz; Klohr, Steven; Guazzo, Dana Morton

2011-01-01

A leak test performed according to ASTM F2338-09 Standard Test Method for Nondestructive Detection of Leaks in Packages by Vacuum Decay Method was developed and validated for container-closure integrity verification of a lyophilized product in a parenteral vial package system. This nondestructive leak test method is intended for use in manufacturing as an in-process package integrity check, and for testing product stored on stability in lieu of sterility tests. Method development and optimization challenge studies incorporated artificially defective packages representing a range of glass vial wall and sealing surface defects, as well as various elastomeric stopper defects. Method validation required 3 days of random-order replicate testing of a test sample population of negative-control, no-defect packages and positive-control, with-defect packages. Positive-control packages were prepared using vials each with a single hole laser-drilled through the glass vial wall. Hole creation and hole size certification was performed by Lenox Laser. Validation study results successfully demonstrated the vacuum decay leak test method's ability to accurately and reliably detect those packages with laser-drilled holes greater than or equal to approximately 5 μm in nominal diameter. All development and validation studies were performed at Whitehouse Analytical Laboratories in Whitehouse, NJ, under the direction of consultant Dana Guazzo of RxPax, LLC, using a VeriPac 455 Micro Leak Test System by Packaging Technologies & Inspection (Tuckahoe, NY). Bristol Myers Squibb (New Brunswick, NJ) fully subsidized all work.
A leak test performed according to ASTM F2338-09 Standard Test Method for Nondestructive Detection of Leaks in Packages by Vacuum Decay Method was developed and validated to detect defects in stoppered vial packages containing lyophilized product for injection. This nondestructive leak test method is intended for use in manufacturing as an in-process package integrity check, and for testing product stored on stability in lieu of sterility tests. Test method validation study results proved the method capable of detecting holes laser-drilled through the glass vial wall greater than or equal to 5 μm in nominal diameter. Total test time is less than 1 min per package. All method development and validation studies were performed at Whitehouse Analytical Laboratories in Whitehouse, NJ, under the direction of consultant Dana Guazzo of RxPax, LLC, using a VeriPac 455 Micro Leak Test System by Packaging Technologies & Inspection (Tuckahoe, NY). Bristol Myers Squibb (New Brunswick, NJ) fully subsidized all work.
Validation of the Behavioral Risk Factor Surveillance System Sleep Questions

PubMed Central

Jungquist, Carla R.; Mund, Jaime; Aquilina, Alan T.; Klingman, Karen; Pender, John; Ochs-Balcom, Heather; van Wijngaarden, Edwin; Dickerson, Suzanne S.

2016-01-01

Study Objective: Sleep problems may constitute a risk for health problems, including cardiovascular disease, depression, diabetes, poor work performance, and motor vehicle accidents. The primary purpose of this study was to assess the validity of the current Behavioral Risk Factor Surveillance System (BRFSS) sleep questions by establishing the sensitivity and specificity for detection of sleep/wake disturbance. Methods: Repeated cross-sectional assessment of 300 community dwelling adults over the age of 18 who did not wear CPAP or oxygen during sleep. Reliability and validity testing of the BRFSS sleep questions was performed by comparing BRFSS responses to data from a home sleep study, actigraphy for 14 days, Insomnia Severity Index, Epworth Sleepiness Scale, and PROMIS-57. Results: Only two of the five BRFSS sleep questions were found valid and reliable in determining total sleep time and excessive daytime sleepiness. Conclusions: Refinement of the BRFSS questions is recommended. Citation: Jungquist CR, Mund J, Aquilina AT, Klingman K, Pender J, Ochs-Balcom H, van Wijngaarden E, Dickerson SS. Validation of the behavioral risk factor surveillance system sleep questions. J Clin Sleep Med 2016;12(3):301–310. PMID:26446246
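The validity assessment above rests on sensitivity and specificity against reference measurements. A minimal sketch of how those two quantities follow from paired results is shown below. The function name and the boolean data are hypothetical, and treating a single reference measure as the ground truth is an assumption made only for this illustration.

```python
def sensitivity_specificity(test_positive, reference_positive):
    """Return (sensitivity, specificity) for paired screening and reference results."""
    pairs = [(bool(t), bool(r)) for t, r in zip(test_positive, reference_positive)]
    tp = sum(1 for t, r in pairs if t and r)          # true positives
    fn = sum(1 for t, r in pairs if not t and r)      # false negatives
    tn = sum(1 for t, r in pairs if not t and not r)  # true negatives
    fp = sum(1 for t, r in pairs if t and not r)      # false positives
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one reference-positive and one reference-negative case")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical data: questionnaire flag vs. reference measurement for ten adults.
questionnaire = [True, True, False, True, False, False, True, False, False, False]
reference = [True, True, True, False, False, False, True, False, False, True]
sens, spec = sensitivity_specificity(questionnaire, reference)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```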
Predictive Variables of Half-Marathon Performance for Male Runners.

PubMed

Gómez-Molina, Josué; Ogueta-Alday, Ana; Camara, Jesus; Stickley, Christoper; Rodríguez-Marroyo, José A; García-López, Juan

2017-06-01

The aims of this study were to establish and validate various predictive equations of half-marathon performance. Seventy-eight half-marathon male runners participated in two different phases. Phase 1 (n = 48) was used to establish the equations for estimating half-marathon performance, and Phase 2 (n = 30) to validate these equations. Apart from half-marathon performance, training-related and anthropometric variables were recorded, and an incremental test on a treadmill was performed, in which physiological (VO2max, speed at the anaerobic threshold, peak speed) and biomechanical variables (contact and flight times, step length and step rate) were registered. In Phase 1, half-marathon performance could be predicted to 90.3% by variables related to training and anthropometry (Equation 1), 94.9% by physiological variables (Equation 2), 93.7% by biomechanical parameters (Equation 3) and 96.2% by a general equation (Equation 4). Using these equations, in Phase 2 the predicted time was significantly correlated with performance (r = 0.78, 0.92, 0.90 and 0.95, respectively). The proposed equations and their validation showed a high prediction of half-marathon performance in long distance male runners, considered from different approaches. Furthermore, they improved the prediction performance of previous studies, which makes them a highly practical application in the field of training and performance.
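The two-phase design above (derive an equation in one sample, then correlate its predictions with observed performance in a separate sample) can be illustrated with a deliberately simplified sketch. It assumes a single predictor and ordinary least squares, which is not the study's actual set of equations; the function names and all numbers are hypothetical.

```python
from math import sqrt


def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x; returns (slope, intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    var_a = sum((ai - mean_a) ** 2 for ai in a)
    var_b = sum((bi - mean_b) ** 2 for bi in b)
    return cov / sqrt(var_a * var_b)


# Phase 1 (derivation): hypothetical peak treadmill speed (km/h) vs. race time (min).
phase1_speed = [16.0, 17.5, 18.0, 19.0, 20.5, 21.0]
phase1_time = [112.0, 101.0, 98.0, 92.0, 83.0, 80.0]
slope, intercept = fit_line(phase1_speed, phase1_time)

# Phase 2 (validation): apply the frozen equation to runners not used for fitting.
phase2_speed = [16.5, 18.5, 19.5, 20.0]
phase2_time = [108.0, 95.0, 90.0, 85.0]
predicted = [intercept + slope * s for s in phase2_speed]
r = pearson_r(predicted, phase2_time)
print(f"validation r = {r:.2f}")
```

The key design point is that the coefficients are estimated only on Phase 1 data and are not re-tuned on Phase 2.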
Validation of a novel basic virtual reality simulator, the LAP-X, for training basic laparoscopic skills.

PubMed

Kawaguchi, Koji; Egi, Hiroyuki; Hattori, Minoru; Sawada, Hiroyuki; Suzuki, Takahisa; Ohdan, Hideki

2014-10-01

Virtual reality surgical simulators are becoming popular as a means of providing trainees with an opportunity to practice laparoscopic skills. The Lap-X (Epona Medical, Rotterdam, the Netherlands) is a novel VR simulator for training basic skills in laparoscopic surgery. The objective of this study was to validate the LAP-X laparoscopic virtual reality simulator by assessing the face and construct validity in order to determine whether the simulator is adequate for basic skills training. The face and content validity were evaluated using a structured questionnaire. To assess the construct validity, the participants, nine expert surgeons (median age: 40 (32-45)) (>100 laparoscopic procedures) and 11 novices performed three basic laparoscopic tasks using the Lap-X. The participants reported a high level of content validity. No significant differences were found between the expert surgeons and the novices (Ps > 0.246). The performance of the expert surgeons on the three tasks was significantly better than that of the novices in all parameters (Ps < 0.05). This study demonstrated the face, content and construct validity of the Lap-X. The Lap-X holds real potential as a home and hospital training device.
Development, construct validity and test-retest reliability of a field-based wheelchair mobility performance test for wheelchair basketball.

PubMed

de Witte, Annemarie M H; Hoozemans, Marco J M; Berger, Monique A M; van der Slikke, Rienk M A; van der Woude, Lucas H V; Veeger, Dirkjan H E J

2018-01-01

The aim of this study was to develop and describe a wheelchair mobility performance test in wheelchair basketball and to assess its construct validity and reliability. To mimic mobility performance of wheelchair basketball matches in a standardised manner, a test was designed based on observation of wheelchair basketball matches and expert judgement. Forty-six players performed the test to determine its validity and 23 players performed the test twice for reliability. Independent-samples t-tests were used to assess whether the times needed to complete the test were different for classifications, playing standards and sex. Intraclass correlation coefficients (ICC) were calculated to quantify reliability of performance times. Males performed better than females (P < 0.001, effect size [ES] = -1.26) and international men performed better than national men (P < 0.001, ES = -1.62). Performance time of low (≤2.5) and high (≥3.0) classification players was borderline not significant with a moderate ES (P = 0.06, ES = 0.58). The reliability was excellent for overall performance time (ICC = 0.95). These results show that the test can be used as a standardised mobility performance test to validly and reliably assess the capacity in mobility performance of elite wheelchair basketball athletes. Furthermore, the described methodology of development is recommended for use in other sports to develop sport-specific tests.
Virtual reality simulator training for laparoscopic colectomy: what metrics have construct validity?

PubMed

Shanmugan, Skandan; Leblanc, Fabien; Senagore, Anthony J; Ellis, C Neal; Stein, Sharon L; Khan, Sadaf; Delaney, Conor P; Champagne, Bradley J

2014-02-01

Virtual reality simulation for laparoscopic colectomy has been used for training of surgical residents and has been considered as a model for technical skills assessment of board-eligible colorectal surgeons. However, construct validity (the ability to distinguish between skill levels) must be confirmed before widespread implementation. This study was designed to specifically determine which metrics for laparoscopic sigmoid colectomy have evidence of construct validity. General surgeons that had performed fewer than 30 laparoscopic colon resections and laparoscopic colorectal experts (>200 laparoscopic colon resections) performed laparoscopic sigmoid colectomy on the LAP Mentor model. All participants received a 15-minute instructional warm-up and had never used the simulator before the study. Performance was then compared between each group for 21 metrics (procedural, 14; intraoperative errors, 7) to determine specifically which measurements demonstrate construct validity. Performance was compared with the Mann-Whitney U-test (p < 0.05 was significant). Fifty-three surgeons; 29 general surgeons, and 24 colorectal surgeons enrolled in the study. The virtual reality simulators for laparoscopic sigmoid colectomy demonstrated construct validity for 8 of 14 procedural metrics by distinguishing levels of surgical experience (p < 0.05). The most discriminatory procedural metrics (p < 0.01) favoring experts were reduced instrument path length, accuracy of the peritoneal/medial mobilization, and dissection of the inferior mesenteric artery. Intraoperative errors were not discriminatory for most metrics and favored general surgeons for colonic wall injury (general surgeons, 0.7; colorectal surgeons, 3.5; p = 0.045). Individual variability within the general surgeon and colorectal surgeon groups was not accounted for. The virtual reality simulators for laparoscopic sigmoid colectomy demonstrated construct validity for 8 procedure-specific metrics. 
However, using virtual reality simulator metrics to detect intraoperative errors did not discriminate between groups. If the virtual reality simulator continues to be used for the technical assessment of trainees and board-eligible surgeons, the evaluation of performance should be limited to procedural metrics.
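The group comparison described above relies on the Mann-Whitney U-test. As a minimal sketch of how such a comparison could be computed, the function below ranks the pooled observations (using mid-ranks for ties) and derives a two-sided p-value from a normal approximation without tie correction. The function name and the path-length values are hypothetical and are not taken from the study.

```python
import math


def mann_whitney_u(x, y):
    """Return (U for sample x, two-sided p-value).

    Uses mid-ranks for tied values and a normal approximation
    without tie or continuity correction (a simplifying assumption).
    """
    # Pool both samples, remembering which group each value came from.
    pooled = sorted((v, g) for g, sample in enumerate((x, y)) for v in sample)
    n = len(pooled)
    rank_sum_x = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values occupy ranks i+1 .. j; each gets their average.
        mid_rank = (i + 1 + j) / 2
        rank_sum_x += mid_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n1, n2 = len(x), len(y)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u_x - mean_u) / sd_u
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return u_x, p_two_sided


# Hypothetical instrument path lengths (arbitrary units) for two groups.
experts = [410, 455, 470, 498, 520]
generalists = [530, 560, 610, 645, 700]
u, p = mann_whitney_u(experts, generalists)
print(f"U = {u}, p = {p:.4f}")
```

With small samples an exact test is usually preferable to the normal approximation used here.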
The Study Skills Questionnaire (SSQUES): Preliminary Validation of a Measure for Assessing Students' Perceived Areas of Weakness.

ERIC Educational Resources Information Center

McCombs, Barbara L.; Dobrovolny, Jacqueline L.

The potential reliability and construct and predictive validity of a 30-item Study Skills Questionnaire (SSQUES) was evaluated for its ability to: (1) predict student performance in a self-paced, individualized, or computer-managed instructional environment, and (2) identify students needing some type of study skills remediation. The study was…
Teacher Evaluation Project. The Beginning Teacher Program, Intellectual Skills Development, Validity Studies of the Evaluation System, Special Instrument Development. Report for 1984-1985.

ERIC Educational Resources Information Center

Florida Coalition for the Development of a Performance Measurement System, Tallahassee.

Reports, summaries, and recommendations are presented on the following research studies: (1) Beginning Teacher Studies; (2) Instructional Skills for Teaching Higher Order Thinking; (3) Development of the Conferential Observation Instrument; (4) Predictive Validity Studies Conducted to Test the Relationship Between Teacher Performance as Measured…
Empirical Performance of Cross-Validation With Oracle Methods in a Genomics Context

PubMed Central

Martinez, Josue G.; Carroll, Raymond J.; Müller, Samuel; Sampson, Joshua N.; Chatterjee, Nilanjan

2012-01-01

When employing model selection methods with oracle properties such as the smoothly clipped absolute deviation (SCAD) and the Adaptive Lasso, it is typical to estimate the smoothing parameter by m-fold cross-validation, for example, m = 10. In problems where the true regression function is sparse and the signals large, such cross-validation typically works well. However, in regression modeling of genomic studies involving Single Nucleotide Polymorphisms (SNP), the true regression functions, while thought to be sparse, do not have large signals. We demonstrate empirically that in such problems, the number of selected variables using SCAD and the Adaptive Lasso, with 10-fold cross-validation, is a random variable that has considerable and surprising variation. Similar remarks apply to non-oracle methods such as the Lasso. Our study strongly questions the suitability of performing only a single run of m-fold cross-validation with any oracle method, and not just the SCAD and Adaptive Lasso. PMID:22347720

Objective assessment based on motion-related metrics and technical performance in laparoscopic suturing.

PubMed

Sánchez-Margallo, Juan A; Sánchez-Margallo, Francisco M; Oropesa, Ignacio; Enciso, Silvia; Gómez, Enrique J

2017-02-01

The aim of this study is to present the construct and concurrent validity of a motion-tracking method of laparoscopic instruments based on an optical pose tracker and determine its feasibility as an objective assessment tool of psychomotor skills during laparoscopic suturing. A group of novice ([Formula: see text] laparoscopic procedures), intermediate (11-100 laparoscopic procedures) and experienced ([Formula: see text] laparoscopic procedures) surgeons performed three intracorporeal sutures on an ex vivo porcine stomach. Motion analysis metrics were recorded using the proposed tracking method, which employs an optical pose tracker to determine the laparoscopic instruments' position. Construct validation was measured for all 10 metrics across the three groups and between pairs of groups. Concurrent validation was measured against a previously validated suturing checklist. Checklists were completed by two independent surgeons over blinded video recordings of the task. Eighteen novices, 15 intermediates and 11 experienced surgeons took part in this study. Execution time and path length travelled by the laparoscopic dissector presented construct validity. Experienced surgeons required significantly less time ([Formula: see text]), travelled less distance using both laparoscopic instruments ([Formula: see text]) and made more efficient use of the work space ([Formula: see text]) compared with novice and intermediate surgeons. Concurrent validation showed strong correlation between both the execution time and path length and the checklist score ([Formula: see text] and [Formula: see text], [Formula: see text]). The suturing performance was successfully assessed by the motion analysis method. Construct and concurrent validity of the motion-based assessment method has been demonstrated for the execution time and path length metrics. This study demonstrates the efficacy of the presented method for objective evaluation of psychomotor skills in laparoscopic suturing. 
However, this method does not take into account the quality of the suture. Thus, future works will focus on developing new methods combining motion analysis and qualitative outcome evaluation to provide a complete performance assessment to trainees.
Factor structure and validation of the Attentional Control Scale.

PubMed

Judah, Matt R; Grant, DeMond M; Mills, Adam C; Lechner, William V

2014-04-01

The Attentional Control Scale (ACS; Derryberry & Reed, 2002) has been used to assess executive control over attention in numerous studies, but no published data have examined the factor structure of the English version. The current studies addressed this need and tested the predictive and convergent validity of the ACS subscales. In Study 1, exploratory factor analysis yielded a two-factor model with Focusing and Shifting subscales. In Study 2, confirmatory factor analysis supported this model and suggested superior fit compared to the factor structure of the Icelandic version (Ólafsson et al., 2011). Study 3 examined correlations between the ACS subscales and measures of working memory, anxiety, and cognitive control. Study 4 examined correlations between the subscales and reaction times on a mixed-antisaccade task, revealing positive correlations for antisaccade performance and prosaccade latency with Focusing scores and between switch trial performance and Shifting scores. Additionally, the findings partially supported unique relationships between Focusing and trait anxiety and between Shifting and depression that have been noted in recent research. Although the results generally support the validity of the ACS, additional research using performance-based tasks is needed.
Select Methodology for Validating Advanced Satellite Measurement Systems

NASA Technical Reports Server (NTRS)

Larar, Allen M.; Zhou, Daniel K.; Liu, Xi; Smith, William L.

2008-01-01

Advanced satellite sensors are tasked with improving global measurements of the Earth's atmosphere, clouds, and surface to enable enhancements in weather prediction, climate monitoring capability, and environmental change detection. Measurement system validation is crucial to achieving this goal and maximizing research and operational utility of resultant data. Field campaigns including satellite under-flights with well calibrated FTS sensors aboard high-altitude aircraft are an essential part of the validation task. This presentation focuses on an overview of validation methodology developed for assessment of high spectral resolution infrared systems, and includes results of preliminary studies performed to investigate the performance of the Infrared Atmospheric Sounding Interferometer (IASI) instrument aboard the MetOp-A satellite.
Analytic Validation of Immunohistochemical Assays: A Comparison of Laboratory Practices Before and After Introduction of an Evidence-Based Guideline.

PubMed

Fitzgibbons, Patrick L; Goldsmith, Jeffrey D; Souers, Rhona J; Fatheree, Lisa A; Volmar, Keith E; Stuart, Lauren N; Nowak, Jan A; Astles, J Rex; Nakhleh, Raouf E

2017-09-01

- Laboratories must demonstrate analytic validity before any test can be used clinically, but studies have shown inconsistent practices in immunohistochemical assay validation. - To assess changes in immunohistochemistry analytic validation practices after publication of an evidence-based laboratory practice guideline. - A survey on current immunohistochemistry assay validation practices and on the awareness and adoption of a recently published guideline was sent to subscribers enrolled in one of 3 relevant College of American Pathologists proficiency testing programs and to additional nonsubscribing laboratories that perform immunohistochemical testing. The results were compared with an earlier survey of validation practices. - Analysis was based on responses from 1085 laboratories that perform immunohistochemical staining. Of 1057 responses, 65.4% (691) were aware of the guideline recommendations before this survey was sent and 79.9% (550 of 688) of those have already adopted some or all of the recommendations. Compared with the 2010 survey, a significant number of laboratories now have written validation procedures for both predictive and nonpredictive marker assays and specifications for the minimum numbers of cases needed for validation. There was also significant improvement in compliance with validation requirements, with 99% (100 of 102) having validated their most recently introduced predictive marker assay, compared with 74.9% (326 of 435) in 2010. The difficulty in finding validation cases for rare antigens and resource limitations were cited as the biggest challenges in implementing the guideline. - Dissemination of the 2014 evidence-based guideline validation practices had a positive impact on laboratory performance; some or all of the recommendations have been adopted by nearly 80% of respondents.
The Brief Fear of Negative Evaluation Scale (BFNE): translation and validation study of the Iranian version.

PubMed

Tavoli, Azadeh; Melyani, Mahdiyeh; Bakhtiari, Maryam; Ghaedi, Gholam Hossein; Montazeri, Ali

2009-07-09

The Brief Fear of Negative Evaluation Scale (BFNE) is a commonly used instrument to measure social anxiety. This study aimed to translate and to test the reliability and validity of the BFNE in Iran. The English language version of the BFNE was translated into Persian (Iranian language) and was used in this study. The questionnaire was administered to a consecutive sample of 235 students with (n = 33, clinical group) and without social phobia (n = 202, non-clinical group). In addition to the BFNE, two standard instruments were used to measure social phobia severity: the Social Phobia Inventory (SPIN), and the Social Interaction Anxiety Scale (SIAS). All participants completed a brief background information questionnaire, the SPIN, the SIAS and the BFNE scales. Statistical analysis was performed to test the reliability and validity of the BFNE. In all 235 students were studied (111 male and 124 female). The mean age for non-clinical group was 22.2 (SD = 2.1) years and for clinical sample it was 22.4 (SD = 1.8) years. Cronbach's alpha coefficient (to test reliability) was acceptable for both non-clinical and clinical samples (alpha = 0.90 and 0.82 respectively). In addition, 3-week test-retest reliability was performed in non-clinical sample and the intraclass correlation coefficient (ICC) was quite high (ICC = 0.71). Validity as performed using convergent and discriminant validity showed satisfactory results. The questionnaire correlated well with established measures of social phobia such as the SPIN (r = 0.43, p < 0.001) and the SIAS (r = 0.54, p < 0.001). Also the BFNE discriminated well between men and women with and without social phobia in the expected direction. Factor analysis supported a two-factor solution corresponding to positive and reverse-worded items. This validation study of the Iranian version of BFNE proved that it is an acceptable, reliable and valid measure of social phobia. 
However, since the scale showed a two-factor structure, which does not conform to the theoretical basis for the BFNE, we suggest the use of the BFNE-II when it becomes available in Iran. The validation study of the BFNE-II is in progress.
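The reliability figures reported above are Cronbach's alpha coefficients. As an illustrative sketch only, alpha can be computed from a respondents-by-items score matrix as k/(k-1) times one minus the ratio of summed item variances to the variance of the total score. The function name and the tiny score matrix are hypothetical and do not reproduce the study's data.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(scores[0])
    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses: 5 respondents, 4 items on a 1-5 scale.
responses = [
    [2, 3, 2, 3],
    [4, 4, 5, 4],
    [1, 2, 1, 2],
    [5, 4, 5, 5],
    [3, 3, 3, 2],
]
print(round(cronbach_alpha(responses), 3))
```

Reverse-worded items would need to be recoded before applying this formula.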
Exploring rationality in schizophrenia.

PubMed

Revsbech, Rasmus; Mortensen, Erik Lykke; Owen, Gareth; Nordgaard, Julie; Jansson, Lennart; Sæbye, Ditte; Flensborg-Madsen, Trine; Parnas, Josef

2015-06-01

Empirical studies of rationality (syllogisms) in patients with schizophrenia have obtained different results. One study found that patients reason more logically if the syllogism is presented through an unusual content. To explore syllogism-based rationality in schizophrenia. Thirty-eight first-admitted patients with schizophrenia and 38 healthy controls solved 29 syllogisms that varied in presentation content (ordinary v. unusual) and validity (valid v. invalid). Statistical tests were made of unadjusted and adjusted group differences in models adjusting for intelligence and neuropsychological test performance. Controls outperformed patients on all syllogism types, but the difference between the two groups was only significant for valid syllogisms presented with unusual content. However, when adjusting for intelligence and neuropsychological test performance, all group differences became non-significant. When taking intelligence and neuropsychological performance into account, patients with schizophrenia and controls perform similarly on syllogism tests of rationality. © The Royal College of Psychiatrists 2015. This is an open access article distributed under the terms of the Creative Commons Non-Commercial, No Derivatives (CC BY-NC-ND) licence.
What Do HPT Consultants Do for Performance Analysis?

ERIC Educational Resources Information Center

Kang, Sung

2017-01-01

This study was conducted to contribute to the field of Human Performance Technology (HPT) through the validation of the performance analysis process of the International Society for Performance Improvement (ISPI) HPT model, the most representative and frequently utilized process model in the HPT field. The study was conducted using content…
Measuring the effect of inter-study variability on estimating prediction error.

PubMed

Ma, Shuyi; Sung, Jaeyun; Magis, Andrew T; Wang, Yuliang; Geman, Donald; Price, Nathan D

2014-01-01

The biomarker discovery field is replete with molecular signatures that have not translated into the clinic despite ostensibly promising performance in predicting disease phenotypes. One widely cited reason is lack of classification consistency, largely due to failure to maintain performance from study to study. This failure is widely attributed to variability in data collected for the same phenotype among disparate studies, due to technical factors unrelated to phenotypes (e.g., laboratory settings resulting in "batch-effects") and non-phenotype-associated biological variation in the underlying populations. These sources of variability persist in new data collection technologies. Here we quantify the impact of these combined "study-effects" on a disease signature's predictive performance by comparing two types of validation methods: ordinary randomized cross-validation (RCV), which extracts random subsets of samples for testing, and inter-study validation (ISV), which excludes an entire study for testing. Whereas RCV hardwires an assumption of training and testing on identically distributed data, this key property is lost in ISV, yielding systematic decreases in performance estimates relative to RCV. Measuring the RCV-ISV difference as a function of number of studies quantifies influence of study-effects on performance. As a case study, we gathered publicly available gene expression data from 1,470 microarray samples of 6 lung phenotypes from 26 independent experimental studies and 769 RNA-seq samples of 2 lung phenotypes from 4 independent studies. We find that the RCV-ISV performance discrepancy is greater in phenotypes with few studies, and that the ISV performance converges toward RCV performance as data from additional studies are incorporated into classification. 
We show that by examining how fast ISV performance approaches RCV as the number of studies is increased, one can estimate when "sufficient" diversity has been achieved for learning a molecular signature likely to translate without significant loss of accuracy to new clinical settings.
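The two validation schemes compared above differ only in how test sets are drawn. The following sketch, which is not the authors' code, contrasts the two splitters: randomized cross-validation shuffles all samples into folds, whereas inter-study validation holds out every sample from one study at a time. The function names and the study labels are assumptions made for illustration.

```python
import random


def randomized_cv_splits(n_samples, n_folds, seed=0):
    """Yield (train, test) index lists for ordinary randomized cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for fold in folds:
        held_out = set(fold)
        train = sorted(i for i in indices if i not in held_out)
        yield train, sorted(fold)


def inter_study_splits(study_ids):
    """Yield (train, test) index lists, holding out one whole study per split."""
    for held_out_study in sorted(set(study_ids)):
        test = [i for i, s in enumerate(study_ids) if s == held_out_study]
        train = [i for i, s in enumerate(study_ids) if s != held_out_study]
        yield train, test


# Hypothetical study label for each of six samples.
study_ids = ["A", "A", "B", "B", "B", "C"]
for train, test in inter_study_splits(study_ids):
    print("ISV train:", train, "test:", test)
for train, test in randomized_cv_splits(len(study_ids), n_folds=3):
    print("RCV train:", train, "test:", test)
```

In the randomized scheme a test fold can share a study with the training data, which is the property that inter-study validation removes.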
Assessing Meritorious Teacher Performance: A Differential Validity Study.

ERIC Educational Resources Information Center

Ellett, Chad D; Capie, William

The Teacher Assessment and Development System (TADS) - Meritorious Teacher Program (MTP) FORM instrument is used in the Dade County Public Schools, Miami, Florida, to evaluate teachers. Its validity for decisions concerning merit pay for master teachers was examined in this study. Specifically, its ability to discriminate between high performing…
Validation of the NCC Code for Staged Transverse Injection and Computations for a RBCC Combustor

NASA Technical Reports Server (NTRS)

Ajmani, Kumud; Liu, Nan-Suey

2005-01-01

The NCC code was validated for a case involving staged transverse injection into Mach 2 flow behind a rearward-facing step, using comparisons with experimental data and with solutions from the FPVortex code. The code was then used to perform computations to study fuel-air mixing for the combustor of a candidate rocket-based combined cycle engine geometry. Comparisons with a one-dimensional analysis and a three-dimensional code (VULCAN) were performed to assess the qualitative and quantitative performance of the NCC solver.
Early Identification of Children at Risk for Academic Difficulties Using Standardized Assessment: Stability and Predictive Validity of Preschool Math and Language Scores

ERIC Educational Resources Information Center

Frans, Niek; Post, Wendy J.; Huisman, Mark; Oenema-Mostert, Ineke C. E.; Keegstra, Anne L.; Minnaert, Alexander E. M. G.

2017-01-01

Despite the claim by several researchers that variability in performance may complicate the identification of "at-risk" children, variability in the academic performance of young children remains an undervalued area of research. The goal of this study is to examine the predictive validity for future scores and the score stability of two…
Control Performance, Aerodynamic Modeling, and Validation of Coupled Simulation Techniques for Guided Projectile Roll Dynamics

DTIC Science & Technology

2014-11-01

39–44) has been explored in depth in the literature. Of particular interest for this study are investigations into roll control. Isolating the... By Jubaraj Sahu, Frank Fresconi, and Karen R. Heavey, Weapons and Materials Research
The Validity and Reliability of the Gymaware Linear Position Transducer for Measuring Counter-Movement Jump Performance in Female Athletes

ERIC Educational Resources Information Center

O'Donnell, Shannon; Tavares, Francisco; McMaster, Daniel; Chambers, Samuel; Driller, Matthew

2018-01-01

The current study aimed to assess the validity and test-retest reliability of a linear position transducer when compared to a force plate through a counter-movement jump in female participants. Twenty-seven female recreational athletes (19 ± 2 years) performed three counter-movement jumps simultaneously using the linear position transducer and…
Development and Validation of Chemistry Self-Efficacy Scale for College Students

ERIC Educational Resources Information Center

Uzuntiryaki, Esen; Aydin, Yesim Capa

2009-01-01

This study described the process of developing and validating the College Chemistry Self-Efficacy Scale (CCSS) that can be used to assess college students' beliefs in their ability to perform essential tasks in chemistry. In the first phase, data collected from 363 college students provided evidence for the validity and reliability of the new…
Tracing Professional Proficiencies Back to Curricula in Professional Education: A Content Validation Based on the Convergent Validity of Multiple Job Analysis Measures.

ERIC Educational Resources Information Center

Burgar, Paul S.

A study was commissioned by a large petrochemical concern in order to validate professional degrees as a job entry requirement. The investigations considered two issues: (1) "Are activities performed by professionals (chemists and engineers) measurably different from the activities of subordinate technical personnel?" and "What…
Establishing the reliability and concurrent validity of physical performance tests using virtual reality equipment for community-dwelling healthy elders.

PubMed

Griswold, David; Rockwell, Kyle; Killa, Carri; Maurer, Michael; Landgraff, Nancy; Learman, Ken

2015-01-01

The aim of this study was to determine the reliability and concurrent validity of commonly used physical performance tests using the OmniVR Virtual Rehabilitation System for healthy community-dwelling elders. Participants (N = 40) were recruited by the authors and were screened for eligibility. The initial method of measurement was randomized to either virtual reality (VR) or clinically based measures (CM). Physical performance tests included the five times sit to stand, Timed Up and Go (TUG), Forward Functional Reach (FFR) and 30-s stand test. A random number generator determined the testing order. The test-re-test reliability for the VR and CM was determined. Furthermore, concurrent validity was determined using a Pearson product moment correlation (Pearson r). The VR demonstrated excellent reliability for 5 × STS intraclass correlation coefficient (ICC) = 0.931(3,1), FFR ICC = 0.846(3,1) and the TUG ICC = 0.944(3,1). The concurrent validity data for the VR and CM (ICC 3, k) were moderate for FFR ICC = 0.682, excellent 5 × STS ICC = 0.889 and excellent for the TUG ICC = 0.878. The concurrent validity of the 30-s stand test was good ICC = 0.735(3,1). This study supports the use of VR equipment for measuring physical performance tests in the clinic for healthy community-dwelling elders. Virtual reality equipment is not only used to treat balance impairments but it is also used to measure and determine physical impairments through the use of physical performance tests. Virtual reality equipment is a reliable and valid tool for collecting physical performance data for the 5 × STS, FFR, TUG and 30-s stand test for healthy community-dwelling elders.
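The abstract above names the Pearson product moment correlation as one way of quantifying agreement between the two measurement methods. A minimal sketch of that calculation is given below. The function name and the paired timing values are hypothetical, and the study's ICC figures are not reproduced by this code.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations about the means.
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)


# Hypothetical Timed Up and Go times (seconds) from two measurement methods.
virtual_reality = [8.1, 9.4, 7.6, 10.2, 8.8]
clinical = [8.4, 9.1, 7.9, 10.6, 8.5]
print(round(pearson_r(virtual_reality, clinical), 3))
```

A high Pearson r indicates a linear relationship but not absolute agreement, which is one reason agreement-type coefficients such as the ICC are often reported alongside it.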
Assessing the validity of sales self-efficacy: a cautionary tale.

PubMed

Gupta, Nina; Ganster, Daniel C; Kepes, Sven

2013-07-01

We developed a focused, context-specific measure of sales self-efficacy and assessed its incremental validity against the broad Big 5 personality traits with department store salespersons, using (a) both a concurrent and a predictive design and (b) both objective sales measures and supervisory ratings of performance. We found that in the concurrent study, sales self-efficacy predicted objective and subjective measures of job performance more than did the Big 5 measures. Significant differences between the predictability of subjective and objective measures of performance were not observed. Predictive validity coefficients were generally lower than concurrent validity coefficients. The results suggest that there are different dynamics operating in concurrent and predictive designs and between broad and contextualized measures; they highlight the importance of distinguishing between these designs and measures in meta-analyses. The results also point to the value of focused, context-specific personality predictors in selection research. PsycINFO Database Record (c) 2013 APA, all rights reserved.
Validation of X1 motorcycle model in industrial plant layout by using WITNESS™ simulation software

NASA Astrophysics Data System (ADS)

Hamzas, M. F. M. A.; Bareduan, S. A.; Zakaria, M. Z.; Tan, W. J.; Zairi, S.

2017-09-01

This paper demonstrates a case study on simulation, modelling and analysis for the X1 Motorcycles Model. In this research, a motorcycle assembly plant has been selected as the main site of the study. Simulation techniques using Witness software were applied to evaluate the performance of the existing manufacturing system. The main objective is to validate the data and find out the significant impact on the overall performance of the system for future improvement. The process of validation starts when the layout of the assembly line is identified. All components are evaluated to validate whether the data is significant for future improvement. Machine and labor statistics are among the parameters that were evaluated for process improvement. Average total cycle time for given workstations is used as the criterion for comparison of possible variants. From the simulation process, the data used are appropriate and meet the criteria for two-sided assembly line problems.
Beyond the hype: deep neural networks outperform established methods using a ChEMBL bioactivity benchmark set.

PubMed

Lenselink, Eelke B; Ten Dijke, Niels; Bongers, Brandon; Papadatos, George; van Vlijmen, Herman W T; Kowalczyk, Wojtek; IJzerman, Adriaan P; van Westen, Gerard J P

2017-08-14

The increase of publicly available bioactivity data in recent years has fueled and catalyzed research in chemogenomics, data mining, and modeling approaches. As a direct result, over the past few years a multitude of different methods have been reported and evaluated, such as target fishing, nearest neighbor similarity-based methods, and Quantitative Structure Activity Relationship (QSAR)-based protocols. However, such studies are typically conducted on different datasets, using different validation strategies, and different metrics. In this study, different methods were compared using one single standardized dataset obtained from ChEMBL, which is made available to the public, using standardized metrics (BEDROC and Matthews Correlation Coefficient). Specifically, the performance of Naïve Bayes, Random Forests, Support Vector Machines, Logistic Regression, and Deep Neural Networks was assessed using QSAR and proteochemometric (PCM) methods. All methods were validated using both a random split validation and a temporal validation, with the latter being a more realistic benchmark of expected prospective execution. Deep Neural Networks are the top performing classifiers, highlighting the added value of Deep Neural Networks over other more conventional methods. Moreover, the best method ('DNN_PCM') performed significantly better at almost one standard deviation higher than the mean performance. Furthermore, Multi-task and PCM implementations were shown to improve performance over single task Deep Neural Networks. Conversely, target prediction performed almost two standard deviations under the mean performance. Random Forests, Support Vector Machines, and Logistic Regression performed around mean performance. Finally, using an ensemble of DNNs, alongside additional tuning, enhanced the relative performance by another 27% (compared with unoptimized 'DNN_PCM'). 
Here, a standardized set to test and evaluate different machine learning algorithms in the context of multi-task learning is offered by providing the data and the protocols.
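Two ideas from the abstract above lend themselves to a short illustration: the Matthews Correlation Coefficient as a metric, and temporal validation as a splitting strategy. The sketch below is not the authors' protocol. The record layout (a "year" key) and the cutoff value are assumptions, and the zero-denominator convention of returning 0.0 is one common choice.

```python
import math


def matthews_corrcoef(tp, fp, tn, fn):
    """Matthews Correlation Coefficient from confusion-matrix counts."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # convention when any marginal total is zero
    return (tp * tn - fp * fn) / denominator


def temporal_split(records, cutoff_year):
    """Train on records before cutoff_year, test on records from it onward."""
    train = [r for r in records if r["year"] < cutoff_year]
    test = [r for r in records if r["year"] >= cutoff_year]
    return train, test


# Hypothetical bioactivity records tagged with a publication year.
records = [
    {"id": 1, "year": 2010},
    {"id": 2, "year": 2011},
    {"id": 3, "year": 2013},
    {"id": 4, "year": 2014},
]
train, test = temporal_split(records, cutoff_year=2013)
print(len(train), "training records,", len(test), "test records")
print(round(matthews_corrcoef(tp=40, fp=10, tn=45, fn=5), 3))
```

Unlike a random split, the temporal split never lets the model see data published after the test compounds, which is why it is described as closer to prospective use.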
Impact of Cognitive Abilities and Prior Knowledge on Complex Problem Solving Performance - Empirical Results and a Plea for Ecologically Valid Microworlds.

PubMed

Süß, Heinz-Martin; Kretzschmar, André

2018-01-01

The original aim of complex problem solving (CPS) research was to bring the cognitive demands of complex real-life problems into the lab in order to investigate problem solving behavior and performance under controlled conditions. Up until now, the validity of psychometric intelligence constructs has been scrutinized with regard to its importance for CPS performance. At the same time, different CPS measurement approaches competing for the title of the best way to assess CPS have been developed. In the first part of the paper, we investigate the predictability of CPS performance on the basis of the Berlin Intelligence Structure Model and Cattell's investment theory as well as an elaborated knowledge taxonomy. In the first study, 137 students managed a simulated shirt factory (Tailorshop; i.e., a complex real life-oriented system) twice, while in the second study, 152 students completed a forestry scenario (FSYS; i.e., a complex artificial world system). The results indicate that reasoning - specifically numerical reasoning (Studies 1 and 2) and figural reasoning (Study 2) - are the only relevant predictors among the intelligence constructs. We discuss the results with reference to the Brunswik symmetry principle. Path models suggest that reasoning and prior knowledge influence problem solving performance in the Tailorshop scenario mainly indirectly. In addition, different types of system-specific knowledge independently contribute to predicting CPS performance. The results of Study 2 indicate that working memory capacity, assessed as an additional predictor, has no incremental validity beyond reasoning. We conclude that (1) cognitive abilities and prior knowledge are substantial predictors of CPS performance, and (2) in contrast to former and recent interpretations, there is insufficient evidence to consider CPS a unique ability construct. 
In the second part of the paper, we discuss our results in light of recent CPS research, which predominantly utilizes the minimally complex systems (MCS) measurement approach. We suggest ecologically valid microworlds as an indispensable tool for future CPS research and applications.

Field validation of the DNPH method for aldehydes and ketones. Final report

DOE Office of Scientific and Technical Information (OSTI.GOV)

Workman, G.S.; Steger, J.L.

1996-04-01

A stationary source emission test method for selected aldehydes and ketones has been validated. The method employs a sampling train with impingers containing 2,4-dinitrophenylhydrazine (DNPH) to derivatize the analytes. The resulting hydrazones are recovered and analyzed by high performance liquid chromatography. Nine analytes were studied; the method was validated for formaldehyde, acetaldehyde, propionaldehyde, acetophenone and isophorone. Acrolein, methyl ethyl ketone, methyl isobutyl ketone, and quinone did not meet the validation criteria. The study employed the validation techniques described in EPA method 301, which uses train spiking to determine bias, and collocated sampling trains to determine precision. The studies were carried out at a plywood veneer dryer and a polyester manufacturing plant.
External model validation of binary clinical risk prediction models in cardiovascular and thoracic surgery.

PubMed

Hickey, Graeme L; Blackstone, Eugene H

2016-08-01

Clinical risk-prediction models serve an important role in healthcare. They are used for clinical decision-making and measuring the performance of healthcare providers. To establish confidence in a model, external model validation is imperative. When designing such an external model validation study, thought must be given to patient selection, risk factor and outcome definitions, missing data, and the transparent reporting of the analysis. In addition, there are a number of statistical methods available for external model validation. Execution of a rigorous external validation study rests in proper study design, application of suitable statistical methods, and transparent reporting. Copyright © 2016 The American Association for Thoracic Surgery. Published by Elsevier Inc. All rights reserved.
Adaptation of Organizational Justice in Sport Scale into Turkish Language: Validity and Reliability Study

ERIC Educational Resources Information Center

Sayin, Ayfer; Sahin, Mustafa Yasar

2017-01-01

The present study aimed to provide a Turkish adaptation of the Organizational Justice in Sport Scale and perform reliability and validity studies. Answers provided by 260 participants who work as football, male basketball and female basketball coaches in National Collegiate Athletic Association (NCAA) were analysed using the original scale that…
Basic School Skills Inventory-3: Validity and Reliability Study

ERIC Educational Resources Information Center

Yildiz, F. Ülkü; Çagdas, Aysel; Kayili, Gökhan

2017-01-01

The purpose of this study is to perform the validity-reliability analysis of the three subtests of Basic School Skills Inventory 3--Mathematics, Classroom Behavior and Daily Life skills--and do its adaptation for four to six year-old Turkish children. The sample of the study included 595 four to six year-old Turkish children attending public and…
A Systematic Review of Validated Methods for Identifying Cerebrovascular Accident or Transient Ischemic Attack Using Administrative Data

PubMed Central

Andrade, Susan E.; Harrold, Leslie R.; Tjia, Jennifer; Cutrona, Sarah L.; Saczynski, Jane S.; Dodd, Katherine S.; Goldberg, Robert J.; Gurwitz, Jerry H.

2012-01-01

Purpose: To perform a systematic review of the validity of algorithms for identifying cerebrovascular accidents (CVAs) or transient ischemic attacks (TIAs) using administrative and claims data. Methods: PubMed and Iowa Drug Information Service (IDIS) searches of the English language literature were performed to identify studies published between 1990 and 2010 that evaluated the validity of algorithms for identifying CVAs (ischemic and hemorrhagic strokes, intracranial hemorrhage and subarachnoid hemorrhage) and/or TIAs in administrative data. Two study investigators independently reviewed the abstracts and articles to determine relevant studies according to pre-specified criteria. Results: A total of 35 articles met the criteria for evaluation. Of these, 26 articles provided data to evaluate the validity of stroke, 7 reported the validity of TIA, 5 reported the validity of intracranial bleeds (intracerebral hemorrhage and subarachnoid hemorrhage), and 10 studies reported the validity of algorithms to identify the composite endpoints of stroke/TIA or cerebrovascular disease. Positive predictive values (PPVs) varied depending on the specific outcomes and algorithms evaluated. Specific algorithms to evaluate the presence of stroke and intracranial bleeds were found to have high PPVs (80% or greater). Algorithms to evaluate TIAs in adult populations were generally found to have PPVs of 70% or greater. Conclusions: The algorithms and definitions to identify CVAs and TIAs using administrative and claims data differ greatly in the published literature. The choice of the algorithm employed should be determined by the stroke subtype of interest. PMID:22262598
Batch Effect Confounding Leads to Strong Bias in Performance Estimates Obtained by Cross-Validation

PubMed Central

Delorenzi, Mauro

2014-01-01

Background: With the large amount of biological data that is currently publicly available, many investigators combine multiple data sets to increase the sample size and potentially also the power of their analyses. However, technical differences (“batch effects”) as well as differences in sample composition between the data sets may significantly affect the ability to draw generalizable conclusions from such studies. Focus: The current study focuses on the construction of classifiers, and the use of cross-validation to estimate their performance. In particular, we investigate the impact of batch effects and differences in sample composition between batches on the accuracy of the classification performance estimate obtained via cross-validation. The focus on estimation bias is a main difference compared to previous studies, which have mostly focused on the predictive performance and how it relates to the presence of batch effects. Data: We work on simulated data sets. To have realistic intensity distributions, we use real gene expression data as the basis for our simulation. Random samples from this expression matrix are selected and assigned to group 1 (e.g., ‘control’) or group 2 (e.g., ‘treated’). We introduce batch effects and select some features to be differentially expressed between the two groups. We consider several scenarios for our study, most importantly different levels of confounding between groups and batch effects. Methods: We focus on well-known classifiers: logistic regression, Support Vector Machines (SVM), k-nearest neighbors (kNN) and Random Forests (RF). Feature selection is performed with the Wilcoxon test or the lasso. Parameter tuning and feature selection, as well as the estimation of the prediction performance of each classifier, are performed within a nested cross-validation scheme. The estimated classification performance is then compared to what is obtained when applying the classifier to independent data. PMID:24967636
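Editorial illustration (not from the cited paper): the nested cross-validation scheme mentioned in the abstract can be sketched at the level of sample indices. The function names `k_fold` and `nested_splits`, the fold counts, and the interleaved fold assignment are all assumptions made for this sketch; the abstract does not specify them. The point shown is only that tuning and feature selection see the inner splits, which never contain the outer test fold.

```python
from typing import Iterator, List, Tuple

Split = Tuple[List[int], List[int]]


def k_fold(indices: List[int], k: int) -> Iterator[Split]:
    """Yield (train, test) index lists for k interleaved folds."""
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def nested_splits(n_samples: int, outer_k: int = 5, inner_k: int = 3):
    """Yield (inner_splits, outer_train, outer_test) for each outer fold.

    Inner splits are drawn only from the outer training indices, so any
    tuning or feature selection done on them cannot see the outer test fold.
    """
    for outer_train, outer_test in k_fold(list(range(n_samples)), outer_k):
        inner = list(k_fold(outer_train, inner_k))
        yield inner, outer_train, outer_test


if __name__ == "__main__":
    for inner, outer_train, outer_test in nested_splits(20):
        print(len(inner), len(outer_train), len(outer_test))
```

In a full analysis, a classifier would be tuned on each set of inner splits, refit on `outer_train`, and scored once on `outer_test`.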
Measurement of Functional Cognition and Complex Everyday Activities in Older Adults with Mild Cognitive Impairment and Mild Dementia: Validity of the Large Allen's Cognitive Level Screen.

PubMed

Wesson, Jacqueline; Clemson, Lindy; Crawford, John D; Kochan, Nicole A; Brodaty, Henry; Reppermund, Simone

2017-05-01

To explore the validity of the Large Allen's Cognitive Level Screen-5 (LACLS-5) as a performance-based measure of functional cognition, representing an ability to perform complex everyday activities in older adults with mild cognitive impairment (MCI) and mild dementia living in the community. Using cross-sectional data from the Sydney Memory and Ageing Study, 160 community-dwelling older adults with normal cognition (CN; N = 87), MCI (N = 43), or dementia (N = 30) were studied. Functional cognition (LACLS-5), complex everyday activities (Disability Assessment for Dementia [DAD], Assessment of Motor and Process Skills [AMPS]), and neuropsychological measures were used. Participants with dementia performed worse than CN on all clinical measures, and MCI participants were intermediate. Correlational analyses showed that LACLS-5 was most strongly related to AMPS Process scores, DAD instrumental activities of daily living subscale, Mini-Mental State Exam, Block Design, Logical Memory, and Trail Making Test B. Multiple regression analysis indicated that both cognitive (Block Design) and functional measures (AMPS Process score) and sex predicted LACLS-5 performance. Finally, LACLS-5 was able to adequately discriminate between CN and dementia and between MCI and dementia but was unable to reliably distinguish between CN and MCI. Construct validity, including convergent and discriminative validity, was supported. LACLS-5 is a valid performance-based measure for evaluating functional cognition. Discriminative validity is acceptable for identifying mild dementia but requires further refinement for detecting MCI. Copyright © 2017 American Association for Geriatric Psychiatry. Published by Elsevier Inc. All rights reserved.
Supervisor Health and Safety Support: Scale Development and Validation

PubMed Central

Butts, Marcus M.; Hurst, Carrie S.; Eby, Lillian T.

2013-01-01

Executive Summary Two studies were conducted to develop a psychometrically sound measure of supervisor health and safety support (SHSS). We identified three dimensions of supervisor support (physical health, psychological health, safety) and used Study 1 to develop items and establish content validity. Study 2 was used to establish the dimensionality of the new measure and provide criterion-related and discriminant validity evidence of the measure using supervisor and subordinate data. The measure had incremental validity in predicting employee performance and psychological strain outcomes above and beyond general work support variables. Implications of these findings and for workplace support theory and practice are discussed. PMID:24771991
Assessing the Validity of Self-Rated Health with the Short Physical Performance Battery: A Cross-Sectional Analysis of the International Mobility in Aging Study.

PubMed

Pérez-Zepeda, Mario U; Belanger, Emmanuelle; Zunzunegui, Maria-Victoria; Phillips, Susan; Ylli, Alban; Guralnik, Jack

2016-01-01

The aim of this study was to explore the validity of self-rated health across different populations of older adults, when compared to the Short Physical Performance Battery. Cross-sectional analysis of the International Mobility in Aging Study. Five locations: Saint-Hyacinthe and Kingston (Canada), Tirana (Albania), Manizales (Colombia), and Natal (Brazil). Older adults between 65 and 74 years old (n = 1,995). The Short Physical Performance Battery (SPPB) was used to measure physical performance. Self-rated health was assessed with one single five-point question. Linear trends between SPPB scores and self-rated health were tested separately for men and women at each of the five international study sites. Poor physical performance (independent variable) (SPPB less than 8) was used in logistic regression models of self-rated health (dependent variable), adjusting for potential covariates. All analyses were stratified by gender and site of origin. A significant linear association was found between the mean scores of the Short Physical Performance Battery and ordinal categories of self-rated health across research sites and gender groups. After extensive control for objective physical and mental health indicators and socio-demographic variables, these graded associations became non-significant in some research sites. These findings further confirm the validity of SRH as a measure of overall health status in older adults.
Alberta infant motor scale: reliability and validity when used on preterm infants in Taiwan.

PubMed

Jeng, S F; Yau, K I; Chen, L C; Hsiao, S F

2000-02-01

The goal of this study was to examine the reliability and validity of measurements obtained with the Alberta Infant Motor Scale (AIMS) for evaluation of preterm infants in Taiwan. Two independent groups of preterm infants were used to investigate the reliability (n=45) and validity (n=41) for the AIMS. In the reliability study, the AIMS was administered to the infants by a physical therapist, and infant performance was videotaped. The performance was then rescored by the same therapist and by 2 other therapists to examine the intrarater and interrater reliability. In the validity study, the AIMS and the Bayley Motor Scale were administered to the infants at 6 and 12 months of age to examine criterion-related validity. Intraclass correlation coefficients (ICCs) for intrarater and interrater reliability of measurements obtained with the AIMS were high (ICC=.97-.99). The AIMS scores correlated with the Bayley Motor Scale scores at 6 and 12 months (r=.78 and .90), although the AIMS scores at 6 months were only moderately predictive of the motor function at 12 months (r=.56). The results suggest that measurements obtained with the AIMS have acceptable reliability and concurrent validity but limited predictive value for evaluating preterm Taiwanese infants.
Validity and reliability of the Short Physical Performance Battery (SPPB)

PubMed Central

Curcio, Carmen-Lucía; Alvarado, Beatriz; Zunzunegui, María Victoria; Guralnik, Jack

2013-01-01

Objectives: To assess the validity (convergent and construct) and reliability of the Short Physical Performance Battery (SPPB) among non-disabled adults between 65 to 74 years of age residing in the Andes Mountains of Colombia. Methods: Design Validation study; Participants: 150 subjects aged 65 to 74 years recruited from elderly associations (day-centers) in Manizales, Colombia. Measurements: The SPPB tests of balance, including time to walk 4 meters and time required to stand from a chair 5 times were administered to all participants. Reliability was analyzed with a 7-day interval between assessments and use of repeated ANOVA testing. Construct validity was assessed using factor analysis and by testing the relationship between SPPB and depressive symptoms, cognitive function, and self rated health (SRH), while the concurrent validity was measured through relationships with mobility limitations and disability in Activities of Daily Living (ADL). ANOVA tests were used to establish these associations. Results: Test-retest reliability of the SPPB was high: 0.87 (CI95%: 0.77-0.96). A one factor solution was found with three SPPB tests. SPPB was related to self-rated health, limitations in walking and climbing steps and to indicators of disability, as well as to cognitive function and depression. There was a graded decrease in the mean SPPB score with increasing disability and poor health. Conclusion: The Spanish version of SPPB is reliable and valid to assess physical performance among older adults from our region. Future studies should establish their clinical applications and explore usage in population studies. PMID:24892614
A neural networks application for the study of the influence of transport conditions on the working performance

NASA Astrophysics Data System (ADS)

Anghel, D.-C.; Ene, A.; Ştirbu, C.; Sicoe, G.

2017-10-01

This paper presents a study of the factors that influence the working performance of workers in the automotive industry. These factors mainly concern transportation conditions, given that a large number of workers live far away from the enterprise. The quantitative data obtained from this study will be generalized using a software-simulated neural network. The neural network is able to estimate the performance of workers even for combinations of input factors that were not recorded by the study. The experimental data obtained from the study will be divided into two classes. The first class, which contains approximately 80% of the data, will be used by the Java software to train the neural network. The weights resulting from the training process will be saved in a text file. The other class, which contains the remaining 20% of the experimental data, will be used to validate the neural network. The training and validation of the network are performed in Java software (the TrainAndValidate Java class). We designed another Java class, Test.java, that will be used with new input data, for new situations. The paper describes the experimental data collected from the study, the software that simulates the neural network, and the software that estimates working performance when new situations are met. This application is useful for the human resources department of an enterprise. The output results are not quantitative but qualitative (from low performance to high performance, divided into five classes).
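Editorial illustration (not from the cited paper): the roughly 80%/20% division into training and validation data described in the abstract might look like the sketch below. The paper's software is written in Java; Python is used here only for brevity. The function name `split_train_validation`, the random shuffling, and the fixed seed are assumptions, since the abstract does not say how records were assigned to the two classes.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_train_validation(
    records: Sequence[T], train_fraction: float = 0.8, seed: int = 0
) -> Tuple[List[T], List[T]]:
    """Shuffle the records and split them into training and validation lists.

    Shuffling with a fixed seed is an assumption of this sketch; it makes
    the split reproducible and avoids ordering effects in the raw data.
    """
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


if __name__ == "__main__":
    train, validation = split_train_validation(range(100))
    print(len(train), len(validation))
```

The validation list is held back from training and used only to check how well the trained network generalizes.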
Advanced Concept Studies for Supersonic Commercial Transports Entering Service in the 2018 to 2020 Period

NASA Technical Reports Server (NTRS)

Morgenstern, John; Norstrud, Nicole; Sokhey, Jack; Martens, Steve; Alonso, Juan J.

2013-01-01

Lockheed Martin Aeronautics Company (LM), working in conjunction with General Electric Global Research (GE GR), Rolls-Royce Liberty Works (RRLW), and Stanford University, herein presents results from the "N+2 Supersonic Validations" contract's initial 22-month phase, addressing the NASA solicitation "Advanced Concept Studies for Supersonic Commercial Transports Entering Service in the 2018 to 2020 Period." This report version adds documentation of an additional three-month low boom test task. The key technical objective of this effort was to validate integrated airframe and propulsion technologies and design methodologies. These capabilities aspired to produce a viable supersonic vehicle design with environmental and performance characteristics. Supersonic testing of both airframe and propulsion technologies (including LM3: 97-023 low boom testing and April-June nozzle acoustic testing) verified LM's supersonic low-boom design methodologies and both GE and RRLW's nozzle technologies for future implementation. The N+2 program is aligned with NASA's Supersonic Project and is focused on providing system-level solutions capable of overcoming the environmental and performance/efficiency barriers to practical supersonic flight. NASA proposed "Initial Environmental Targets and Performance Goals for Future Supersonic Civil Aircraft". The LM N+2 studies are built upon LM's prior N+3 100 passenger design studies. The LM N+2 program addresses low boom design and methodology validations with wind tunnel testing, performance and efficiency goals with system level analysis, and low noise validations with two nozzle (GE and RRLW) acoustic tests.
Development and co-validation of porcine insulin certified reference material by high-performance liquid chromatography-isotope dilution mass spectrometry.

PubMed

Wu, Liqing; Takatsu, Akiko; Park, Sang-Ryoul; Yang, Bin; Yang, Huaxin; Kinumi, Tomoya; Wang, Jing; Bi, Jiaming; Wang, Yang

2015-04-01

This article concerns the development and co-validation of a porcine insulin (pINS) certified reference material (CRM) produced by the National Institute of Metrology, People's Republic of China. Each CRM unit contained about 15 mg of purified solid pINS. The moisture content, amount of ignition residue, molecular mass, and purity of the pINS were measured. Both high-performance liquid chromatography-isotope dilution mass spectrometry and a purity deduction method were used to determine the mass fraction of the pINS. Fifteen units were selected to study the between-bottle homogeneity, and no inhomogeneity was observed. A stability study concluded that the CRM was stable for at least 12 months at -20 °C. The certified value of the CRM was (0.892 ± 0.036) g/g. A co-validation of the CRM was performed among Chinese, Japanese, and Korean laboratories under the framework of the Asian Collaboration on Reference Materials. The co-validation results agreed well with the certified value of the CRM. Consequently, the pINS CRM may be used as a calibration material or as a validation standard for pharmaceutical purposes to improve the quality of pharmaceutical products.
Validity, Reliability, and Sensitivity of a Volleyball Intermittent Endurance Test.

PubMed

Rodríguez-Marroyo, Jose A; Medina-Carrillo, Javier; García-López, Juan; Morante, Juan C; Villa, José G; Foster, Carl

2017-03-01

To analyze the concurrent and construct validity of a volleyball intermittent endurance test (VIET). The VIET's test-retest reliability and sensitivity to assess seasonal changes was also studied. During the preseason, 71 volleyball players of different competitive levels took part in this study. All performed the VIET and a graded treadmill test with gas-exchange measurement (GXT). Thirty-one of the players performed an additional VIET to analyze the test-retest reliability. To test the VIET's sensitivity, 28 players repeated the VIET and GXT at the end of their season. Significant (P < .001) relationships between VIET distance and maximal oxygen uptake (r = .74) and GXT maximal speed (r = .78) were observed. There were no significant differences between the VIET performance test and retest (1542.1 ± 338.1 vs 1567.1 ± 358.2 m). Significant (P < .001) relationships and intraclass correlation coefficient (ICC) were found (r = .95, ICC = .96) for VIET performance. VIET performance increased significantly (P < .001) with player performance level and was sensitive to fitness changes across the season (1458.8 ± 343.5 vs 1581.1 ± 334.0 m, P < .01). The VIET may be considered a valid, reliable, and sensitive test to assess the aerobic endurance in volleyball players.
Modern modeling techniques had limited external validity in predicting mortality from traumatic brain injury.

PubMed

van der Ploeg, Tjeerd; Nieboer, Daan; Steyerberg, Ewout W

2016-10-01

Prediction of medical outcomes may potentially benefit from using modern statistical modeling techniques. We aimed to externally validate modeling strategies for prediction of 6-month mortality of patients suffering from traumatic brain injury (TBI) with predictor sets of increasing complexity. We analyzed individual patient data from 15 different studies including 11,026 TBI patients. We consecutively considered a core set of predictors (age, motor score, and pupillary reactivity), an extended set with computed tomography scan characteristics, and a further extension with two laboratory measurements (glucose and hemoglobin). With each of these sets, we predicted 6-month mortality using default settings with five statistical modeling techniques: logistic regression (LR), classification and regression trees, random forests (RFs), support vector machines (SVM) and neural nets. For external validation, a model developed on one of the 15 data sets was applied to each of the 14 remaining sets. This process was repeated 15 times for a total of 630 validations. The area under the receiver operating characteristic curve (AUC) was used to assess the discriminative ability of the models. For the most complex predictor set, the LR models performed best (median validated AUC value, 0.757), followed by RF and support vector machine models (median validated AUC value, 0.735 and 0.732, respectively). With each predictor set, the classification and regression trees models showed poor performance (median validated AUC value, <0.7). The variability in performance across the studies was smallest for the RF- and LR-based models (inter quartile range for validated AUC values from 0.07 to 0.10). In the area of predicting mortality from TBI, nonlinear and nonadditive effects are not pronounced enough to make modern prediction methods beneficial. Copyright © 2016 Elsevier Inc. All rights reserved.
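Editorial illustration (not from the cited paper): the abstract does not spell out how the total of 630 validations arises. One reading consistent with the stated figures is 15 development data sets, each validated on the 14 remaining sets, repeated for the three predictor sets. The short sketch below only checks that arithmetic; the predictor-set labels are hypothetical, and the reading itself is an assumption.

```python
from itertools import permutations

N_DATASETS = 15
# Hypothetical labels for the three predictor sets named in the abstract.
PREDICTOR_SETS = ["core", "core+CT", "core+CT+lab"]

# Each ordered pair is (development data set, validation data set).
pairs = list(permutations(range(N_DATASETS), 2))
total = len(pairs) * len(PREDICTOR_SETS)

if __name__ == "__main__":
    print(len(pairs), total)
```

Under this reading the count would apply per modeling technique, which the abstract does not state explicitly.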
Assessing reliability and validity measures in managed care studies.

PubMed

Montoya, Isaac D

2003-01-01

To review the reliability and validity literature and develop an understanding of these concepts as applied to managed care studies. Reliability is a test of how well an instrument measures the same input at varying times and under varying conditions. Validity is a test of how accurately an instrument measures what one believes is being measured. A review of reliability and validity instructional material was conducted. Studies of managed care practices and programs abound. However, many of these studies utilize measurement instruments that were developed for other purposes or for a population other than the one being sampled. In other cases, instruments have been developed without any testing of the instrument's performance. The lack of reliability and validity information may limit the value of these studies. This is particularly true when data are collected for one purpose and used for another. The usefulness of certain studies without reliability and validity measures is questionable, especially in cases where the literature contradicts itself.
Image quality validation of Sentinel 2 Level-1 products: performance status at the beginning of the constellation routine phase

NASA Astrophysics Data System (ADS)

Francesconi, Benjamin; Neveu-VanMalle, Marion; Espesset, Aude; Alhammoud, Bahjat; Bouzinac, Catherine; Clerc, Sébastien; Gascon, Ferran

2017-09-01

Sentinel-2 is an Earth Observation mission developed by the European Space Agency (ESA) in the frame of the Copernicus program of the European Commission. The mission is based on a constellation of two satellites: Sentinel-2A, launched in June 2015, and Sentinel-2B, launched in March 2017. It offers an unprecedented combination of systematic global coverage of land and coastal areas, a high revisit of five days at the equator and two days at mid-latitudes under the same viewing conditions, high spatial resolution, and a wide field of view for multispectral observations from 13 bands in the visible, near infrared and short wave infrared range of the electromagnetic spectrum. The mission performances are routinely and closely monitored by the S2 Mission Performance Centre (MPC), including a consortium of Expert Support Laboratories (ESL). This publication focuses on the Sentinel-2 Level-1 product quality validation activities performed by the MPC. It presents an up-to-date status of the Level-1 mission performances at the beginning of the constellation routine phase. Level-1 performance validations routinely performed cover Level-1 Radiometric Validation (Equalisation Validation, Absolute Radiometry Vicarious Validation, Absolute Radiometry Cross-Mission Validation, Multi-temporal Relative Radiometry Vicarious Validation and SNR Validation), and Level-1 Geometric Validation (Geolocation Uncertainty Validation, Multi-spectral Registration Uncertainty Validation and Multi-temporal Registration Uncertainty Validation). Overall, the Sentinel-2 mission is proving very successful in terms of product quality, thereby fulfilling the promises of the Copernicus program.
Evaluating physician performance at individualizing care: a pilot study tracking contextual errors in medical decision making.

PubMed

Weiner, Saul J; Schwartz, Alan; Yudkowsky, Rachel; Schiff, Gordon D; Weaver, Frances M; Goldberg, Julie; Weiss, Kevin B

2007-01-01

Clinical decision making requires 2 distinct cognitive skills: the ability to classify patients' conditions into diagnostic and management categories that permit the application of research evidence and the ability to individualize or, more specifically, to contextualize care for patients whose circumstances and needs require variation from the standard approach to care. The purpose of this study was to develop and test a methodology for measuring physicians' performance at contextualizing care and compare it to their performance at planning biomedically appropriate care. First, the authors drafted 3 cases, each with 4 variations, 3 of which are embedded with biomedical and/or contextual information that is essential to planning care. Once the cases were validated as instruments for assessing physician performance, 54 internal medicine residents were then presented with opportunities to make these preidentified biomedical or contextual errors, and data were collected on information elicitation and error making. The case validation process was successful in that, in the final iteration, the physicians who received the contextual variant of cases proposed an alternate plan of care to those who received the baseline variant 100% of the time. The subsequent piloting of these validated cases unmasked previously unmeasured differences in physician performance at contextualizing care. The findings, which reflect the performance characteristics of the study population, are presented. This pilot study demonstrates a methodology for measuring physician performance at contextualizing care and illustrates the contribution of such information to an overall assessment of physician practice.
Establishing Content Validity for a Literacy Coach Performance Appraisal Instrument

ERIC Educational Resources Information Center

Lane, Mae; Robbins, Mary; Price, Debra

2013-01-01

This study's purpose was to determine whether or not the Literacy Coach Appraisal Instrument developed for use in evaluating literacy coaches had content validity. The study, a fully mixed concurrent equal status design conducted from a pragmatist philosophy, collected qualitative and quantitative data from literacy experts about the elements of…

Feasibility and validity of the structured attention module among economically disadvantaged preschool-age children.

PubMed

Bush, Hillary H; Eisenhower, Abbey; Briggs-Gowan, Margaret; Carter, Alice S

2015-01-01

Rooted in the theory of attention put forth by Mirsky, Anthony, Duncan, Ahearn, and Kellam (1991), the Structured Attention Module (SAM) is a developmentally sensitive, computer-based performance task designed specifically to assess sustained selective attention among 3- to 6-year-old children. The current study addressed the feasibility and validity of the SAM among 64 economically disadvantaged preschool-age children (mean age = 58 months; 55% female); a population known to be at risk for attention problems and adverse math performance outcomes. Feasibility was demonstrated by high completion rates and strong associations between SAM performance and age. Principal Factor Analysis with rotation produced robust support for a three-factor model (Accuracy, Speed, and Endurance) of SAM performance, which largely corresponded with existing theorized models of selective and sustained attention. Construct validity was evidenced by positive correlations between SAM Composite scores and all three SAM factors and IQ, and between SAM Accuracy and sequential memory. Value-added predictive validity was not confirmed through main effects of SAM on math performance above and beyond age and IQ; however, significant interactions by child sex were observed: Accuracy and Endurance both interacted with child sex to predict math performance. In both cases, the SAM factors predicted math performance more strongly for girls than for boys. There were no overall sex differences in SAM performance. In sum, the current findings suggest that interindividual variation in sustained selective attention, and potentially other aspects of attention and executive function, among young, high-risk children can be captured validly with developmentally sensitive measures.
Incremental Validity of WISC-IV[superscript UK] Factor Index Scores with a Referred Irish Sample: Predicting Performance on the WIAT-II[superscript UK]

ERIC Educational Resources Information Center

Canivez, Gary L.; Watkins, Marley W.; James, Trevor; Good, Rebecca; James, Kate

2014-01-01

Background: Subtest and factor scores have typically provided little incremental predictive validity beyond the omnibus IQ score. Aims: This study examined the incremental validity of Wechsler Intelligence Scale for Children-Fourth UK Edition (WISC-IV[superscript UK]; Wechsler, 2004a, "Wechsler Intelligence Scale for Children-Fourth UK…
Investigation of Malaysian Higher Education Quality Culture and Workforce Performance

ERIC Educational Resources Information Center

Ali, Hairuddin Mohd; Musah, Mohammed Borhandden

2012-01-01

Purpose: The purpose of this study is to examine the relationship between the quality culture and workforce performance in the Malaysian higher education sector. The study also aims to test and validate the psychometric properties of the quality culture and workforce performance instruments used in the study. Design/methodology/approach: A total…
Effectively Coping With Task Stress: A Study of the Validity of the Trait Emotional Intelligence Questionnaire-Short Form (TEIQue-SF).

PubMed

O'Connor, Peter; Nguyen, Jessica; Anglim, Jeromy

2017-01-01

In this study, we investigated the validity of the Trait Emotional Intelligence Questionnaire-Short Form (TEIQue-SF; Petrides, 2009) in the context of task-induced stress. We used a total sample of 225 volunteers to investigate (a) the incremental validity of the TEIQue-SF over other predictors of coping with task-induced stress, and (b) the construct validity of the TEIQue-SF by examining the mechanisms via which scores from the TEIQue-SF predict coping outcomes. Results demonstrated that the TEIQue-SF possessed incremental validity over the Big Five personality traits in the prediction of emotion-focused coping. Results also provided support for the construct validity of the TEIQue-SF by demonstrating that this measure predicted adaptive coping via emotion-focused channels. Specifically, results showed that, following a task stressor, the TEIQue-SF predicted low negative affect and high task performance via high levels of emotion-focused coping. Consistent with the purported theoretical nature of the trait emotional intelligence (EI) construct, trait EI as assessed by the TEIQue-SF primarily enhances affect and performance in stressful situations by regulating negative emotions.
Criteria of Police Officer Performance.

ERIC Educational Resources Information Center

Baehr, Melany E.

The indifferent success in achieving the goal of reliable and valid police officer work performance measures has been attributed to the complexity of the job. This study explored reasons for unsatisfactory police performance evaluation, reviewing and integrating previous studies and new recently collected but not yet published data. Performance…
Expression signature as a biomarker for prenatal diagnosis of trisomy 21.

PubMed

Volk, Marija; Maver, Aleš; Lovrečić, Luca; Juvan, Peter; Peterlin, Borut

2013-01-01

A universal biomarker panel with the potential to predict high-risk pregnancies or adverse pregnancy outcomes does not exist. Transcriptome analysis is a powerful tool to capture differentially expressed genes (DEG), which can be used as a biomarker-diagnostic-predictive tool for various conditions in the prenatal setting. In search of a biomarker set for predicting high-risk pregnancies, we performed global expression profiling to find DEG in trisomy 21 (Ts21). Subsequently, we performed targeted validation and diagnostic performance evaluation on a larger group of case and control samples. Initially, transcriptomic profiles of 10 cultivated amniocyte samples with Ts21 and 9 with normal euploid constitution were determined using expression microarrays. Datasets from Ts21 transcriptomic studies from the GEO repository were incorporated. DEG were discovered using linear regression modelling and validated using RT-PCR quantification on an independent sample of 16 cases with Ts21 and 32 controls. Classification of Ts21 status based on expression profiling was performed using a supervised machine learning algorithm and evaluated using a leave-one-out cross-validation approach. Global gene expression profiling revealed significant expression changes between normal and Ts21 samples, which, in combination with data from previously performed Ts21 transcriptomic studies, were used to generate a multi-gene biomarker for Ts21 comprising 9 gene expression profiles. In addition to the biomarker's high performance in discriminating samples from global expression profiling, we were also able to show its discriminatory performance on a larger sample set 2, validated using an RT-PCR experiment (AUC=0.97), while its performance on data from previously published studies reached discriminatory AUC values of 1.00. Our results show that transcriptomic changes might potentially be used to discriminate trisomy of chromosome 21 in the prenatal setting. 
As expressional alterations reflect both causal and reactive cellular mechanisms, transcriptomic changes may thus have future potential in the diagnosis of a wide array of heterogeneous diseases that result from genetic disturbances.
RELIABILITY AND VALIDITY OF A MODIFIED ISOMETRIC DYNAMOMETER IN THE ASSESSMENT OF MUSCULAR PERFORMANCE IN INDIVIDUALS WITH ANTERIOR CRUCIATE LIGAMENT RECONSTRUCTION

PubMed Central

de Vasconcelos, Rodrigo Antunes; Bevilaqua-Grossi, Débora; Shimano, Antonio Carlos; Paccola, Cleber Jansen; Salvini, Tânia Fátima; Prado, Christiane Lanatovits; Junior, Wilson A. Mello

2015-01-01

Objectives: The aim of this study was to evaluate the reliability and validity of a modified isometric dynamometer (MID) in assessing performance deficits of the knee extensor and flexor muscles in normal individuals and in those with ACL reconstructions. Methods: Sixty male subjects were invited to participate in the study and were divided into three groups of 20 subjects each: a control group (GC), a group of individuals with ACL reconstruction with patellar tendon graft (GTP), and a group of individuals with ACL reconstruction with hamstrings graft (GTF). All individuals performed isometric tests on the MID; the muscular strength deficits collected were subsequently compared with those from tests performed on the Biodex System 3 operating in the isometric and isokinetic modes at speeds of 60°/s and 180°/s. Intraclass correlation coefficient (ICC) calculations were done to assess MID reliability; specificity, sensitivity, and Kappa consistency coefficient calculations were done to assess the MID's validity in detecting muscular deficits; and intra- and intergroup comparisons across the four strength tests were made using ANOVA. Results: The modified isometric dynamometer (MID) showed excellent reliability and good validity in the assessment of the performance of the knee extensor and flexor muscle groups. In the comparison between groups, the GTP group showed significantly greater deficits than the GTF and GC groups. Conclusion: Isometric dynamometers connected to mechanotherapy equipment could be an alternative option for collecting data on performance deficits of the knee extensor and flexor muscle groups in subjects with ACL reconstruction. PMID:27004175
Training safer orthopedic surgeons. Construct validation of a virtual-reality simulator for hip fracture surgery.

PubMed

Akhtar, Kashif; Sugand, Kapil; Sperrin, Matthew; Cobb, Justin; Standfield, Nigel; Gupte, Chinmay

2015-01-01

Virtual-reality (VR) simulation in orthopedic training is still in its infancy, and much of the work has been focused on arthroscopy. We evaluated the construct validity of a new VR trauma simulator for performing dynamic hip screw (DHS) fixation of a trochanteric femoral fracture. 30 volunteers were divided into 3 groups according to the number of postgraduate (PG) years and the amount of clinical experience: novice (1-4 PG years; less than 10 DHS procedures); intermediate (5-12 PG years; 10-100 procedures); expert (> 12 PG years; > 100 procedures). Each participant performed a DHS procedure and objective performance metrics were recorded. These data were analyzed with each performance metric taken as the dependent variable in 3 regression models. There were statistically significant differences in performance between groups for (1) number of attempts at guide-wire insertion, (2) total fluoroscopy time, (3) tip-apex distance, (4) probability of screw cutout, and (5) overall simulator score. The intermediate group performed the procedure most quickly, with the lowest fluoroscopy time, the lowest tip-apex distance, the lowest probability of cutout, and the highest simulator score, which correlated with their frequency of exposure to running the trauma lists for hip fracture surgery. This study demonstrates the construct validity of a haptic VR trauma simulator with surgeons undertaking the procedure most frequently performing best on the simulator. VR simulation may be a means of addressing restrictions on working hours and allows trainees to practice technical tasks without putting patients at risk. The VR DHS simulator evaluated in this study may provide valid assessment of technical skill.
Gathering Validity Evidence for Surgical Simulation: A Systematic Review.

PubMed

Borgersen, Nanna Jo; Naur, Therese M H; Sørensen, Stine M D; Bjerrum, Flemming; Konge, Lars; Subhi, Yousif; Thomsen, Ann Sofia S

2018-06-01

To identify current trends in the use of validity frameworks in surgical simulation, to provide an overview of the evidence behind the assessment of technical skills in all surgical specialties, and to present recommendations and guidelines for future validity studies. Validity evidence for assessment tools used in the evaluation of surgical performance is of paramount importance to ensure valid and reliable assessment of skills. We systematically reviewed the literature by searching 5 databases (PubMed, EMBASE, Web of Science, PsycINFO, and the Cochrane Library) for studies published from January 1, 2008, to July 10, 2017. We included original studies evaluating simulation-based assessments of health professionals in surgical specialties and extracted data on surgical specialty, simulator modality, participant characteristics, and the validity framework used. Data were synthesized qualitatively. We identified 498 studies with a total of 18,312 participants. Publications involving validity assessments in surgical simulation more than doubled from 2008 to 2010 (∼30 studies/year) to 2014 to 2016 (∼70 to 90 studies/year). Only 6.6% of the studies used the recommended contemporary validity framework (Messick). The majority of studies used outdated frameworks such as face validity. Significant differences were identified across surgical specialties. The evaluated assessment tools were mostly inanimate or virtual reality simulation models. An increasing number of studies have gathered validity evidence for simulation-based assessments in surgical specialties, but the use of outdated frameworks remains common. To address the current practice, this paper presents guidelines on how to use the contemporary validity framework when designing validity studies.
The Development of the Functional Literacy Experience Scale Based upon Ecological Theory (FLESBUET) and Validity-Reliability Study

ERIC Educational Resources Information Center

Özenç, Emine Gül; Dogan, M. Cihangir

2014-01-01

This study aims to perform a validity-reliability test by developing the Functional Literacy Experience Scale based upon Ecological Theory (FLESBUET) for primary education students. The study group includes 209 fifth grade students at Sabri Taskin Primary School in the Kartal District of Istanbul, Turkey during the 2010-2011 academic year.…
An extended protocol for usability validation of medical devices: Research design and reference model.

PubMed

Schmettow, Martin; Schnittker, Raphaela; Schraagen, Jan Maarten

2017-05-01

This paper proposes and demonstrates an extended protocol for usability validation testing of medical devices. A review of currently used methods for the usability evaluation of medical devices revealed two main shortcomings: first, a lack of methods to closely trace the interaction sequences and derive performance measures; second, a prevailing focus on cross-sectional validation studies, ignoring the issues of learnability and training. The U.S. Food and Drug Administration's recent proposal for a validation testing protocol for medical devices is then extended to address these shortcomings: (1) a novel process measure, 'normative path deviations', is introduced that is useful for both quantitative and qualitative usability studies, and (2) a longitudinal, completely within-subject study design is presented that assesses learnability and training effects and allows analysis of the diversity of users. A reference regression model is introduced to analyze data from this and similar studies, drawing upon generalized linear mixed-effects models and a Bayesian estimation approach. The extended protocol is implemented and demonstrated in a study comparing a novel syringe infusion pump prototype to an existing design with a sample of 25 healthcare professionals. Strong performance differences between designs were observed with a variety of usability measures, as well as varying training-on-the-job effects. We discuss our findings with regard to validation testing guidelines, reflect on the extensions and discuss the perspectives they add to the validation process. Copyright © 2017 Elsevier Inc. All rights reserved.
Validation on milk and sprouts of EN ISO 16654:2001 - Microbiology of food and animal feeding stuffs - Horizontal method for the detection of Escherichia coli O157.

PubMed

Tozzoli, Rosangela; Maugliani, Antonella; Michelacci, Valeria; Minelli, Fabio; Caprioli, Alfredo; Morabito, Stefano

2018-05-08

In 2006, the European Committee for Standardisation (CEN)/Technical Committee 275 - Food analysis - Horizontal methods/Working Group 6 - Microbiology of the food chain (TC275/WG6) launched the project of validating the method ISO 16654:2001 for the detection of Escherichia coli O157 in foodstuffs by evaluating its performance, in terms of sensitivity and specificity, through collaborative studies. Previously, a validation study had been conducted to assess the performance of Method No 164 developed by the Nordic Committee for Food Analysis (NMKL), which also aims at detecting E. coli O157 in food and is based on a procedure equivalent to that of the ISO 16654:2001 standard. Therefore, CEN established that the validation data obtained for NMKL Method 164 could be exploited for the ISO 16654:2001 validation project, integrated with new data obtained through two additional interlaboratory studies on milk and sprouts, run in the framework of CEN mandate No. M381. The ISO 16654:2001 validation project was led by the European Union Reference Laboratory for Escherichia coli including VTEC (EURL-VTEC), which organized the collaborative validation study on milk in 2012, with 15 participating laboratories, and that on sprouts in 2014, with 14 participating laboratories. In both studies, a total of 24 samples were tested by each laboratory. Test materials were spiked with different concentrations of E. coli O157, and the 24 samples corresponded to eight replicates of three levels of contamination: zero, low and high spiking level. The results submitted by the participating laboratories were analyzed to evaluate the sensitivity and specificity of the ISO 16654:2001 method when applied to milk and sprouts. The performance characteristics calculated on the data of the collaborative validation studies run under CEN mandate No. M381 returned sensitivity and specificity of 100% and 94.4%, respectively, for the milk study. 
For the sprouts matrix, sensitivity was 75.9% in samples with the low level of contamination and 96.4% in samples spiked with the high level of E. coli O157, and specificity was 99.1%. Copyright © 2018 Elsevier B.V. All rights reserved.
Field assessment of balance in 10 to 14 year old children, reproducibility and validity of the Nintendo Wii board.

PubMed

Larsen, Lisbeth Runge; Jørgensen, Martin Grønbech; Junge, Tina; Juul-Kristensen, Birgit; Wedderkopp, Niels

2014-06-10

Because body proportions in childhood are different to those in adulthood, children have a relatively higher centre of mass location. This biomechanical difference and the fact that children's movements have not yet fully matured result in different sway performances in children and adults. When assessing static balance, it is essential to use objective, sensitive tools, and these types of measurement have previously been performed in laboratory settings. However, the emergence of technologies like the Nintendo Wii Board (NWB) might allow balance assessment in field settings. As the NWB has only been validated and tested for reproducibility in adults, the purpose of this study was to examine reproducibility and validity of the NWB in a field setting, in a population of children. Fifty-four 10-14 year-olds from the CHAMPS-Study DK performed four different balance tests: bilateral stance with eyes open (1), unilateral stance on dominant (2) and non-dominant leg (3) with eyes open, and bilateral stance with eyes closed (4). Three rounds of the four tests were completed with the NWB and with a force platform (AMTI). To assess reproducibility, an intra-day test-retest design was applied with a two-hour break between sessions. Bland-Altman plots supplemented by Minimum Detectable Change (MDC) and concordance correlation coefficient (CCC) demonstrated satisfactory reproducibility for the NWB and the AMTI (MDC: 26.3-28.2%, CCC: 0.76-0.86) using Centre Of Pressure path Length as measurement parameter. Bland-Altman plots demonstrated satisfactory concurrent validity between the NWB and the AMTI, supplemented by satisfactory CCC in all tests (CCC: 0.74-0.87). The ranges of the limits of agreement in the validity study were comparable to the limits of agreement of the reproducibility study. Both NWB and AMTI have satisfactory reproducibility for testing static balance in a population of children. Concurrent validity of NWB compared with AMTI was satisfactory. 
Furthermore, the results from the concurrent validity study were comparable to the reproducibility results of the NWB and the AMTI. Thus, NWB has the potential to replace the AMTI in field settings in studies including children. Future studies are needed to examine intra-subject variability and to test the predictive validity of NWB.
Field assessment of balance in 10 to 14 year old children, reproducibility and validity of the Nintendo Wii board

PubMed Central

2014-01-01

Background: Because body proportions in childhood are different to those in adulthood, children have a relatively higher centre of mass location. This biomechanical difference and the fact that children’s movements have not yet fully matured result in different sway performances in children and adults. When assessing static balance, it is essential to use objective, sensitive tools, and these types of measurement have previously been performed in laboratory settings. However, the emergence of technologies like the Nintendo Wii Board (NWB) might allow balance assessment in field settings. As the NWB has only been validated and tested for reproducibility in adults, the purpose of this study was to examine reproducibility and validity of the NWB in a field setting, in a population of children. Methods: Fifty-four 10–14 year-olds from the CHAMPS-Study DK performed four different balance tests: bilateral stance with eyes open (1), unilateral stance on dominant (2) and non-dominant leg (3) with eyes open, and bilateral stance with eyes closed (4). Three rounds of the four tests were completed with the NWB and with a force platform (AMTI). To assess reproducibility, an intra-day test-retest design was applied with a two-hour break between sessions. Results: Bland-Altman plots supplemented by Minimum Detectable Change (MDC) and concordance correlation coefficient (CCC) demonstrated satisfactory reproducibility for the NWB and the AMTI (MDC: 26.3-28.2%, CCC: 0.76-0.86) using Centre Of Pressure path Length as measurement parameter. Bland-Altman plots demonstrated satisfactory concurrent validity between the NWB and the AMTI, supplemented by satisfactory CCC in all tests (CCC: 0.74-0.87). The ranges of the limits of agreement in the validity study were comparable to the limits of agreement of the reproducibility study. Conclusion: Both NWB and AMTI have satisfactory reproducibility for testing static balance in a population of children. 
Concurrent validity of NWB compared with AMTI was satisfactory. Furthermore, the results from the concurrent validity study were comparable to the reproducibility results of the NWB and the AMTI. Thus, NWB has the potential to replace the AMTI in field settings in studies including children. Future studies are needed to examine intra-subject variability and to test the predictive validity of NWB. PMID:24913461
The prone bridge test: Performance, validity, and reliability among older and younger adults.

PubMed

Bohannon, Richard W; Steffl, Michal; Glenney, Susan S; Green, Michelle; Cashwell, Leah; Prajerova, Kveta; Bunn, Jennifer

2018-04-01

The prone bridge maneuver, or plank, has been viewed as a potential alternative to curl-ups for assessing trunk muscle performance. The purpose of this study was to assess prone bridge test performance, validity, and reliability among younger and older adults. Sixty younger (20-35 years old) and 60 older (60-79 years old) participants completed this study. Groups were evenly divided by sex. Participants completed surveys regarding physical activity and abdominal exercise participation. Height, weight, body mass index (BMI), and waist circumference were measured. On two occasions, 5-9 days apart, participants held a prone bridge until volitional exhaustion or until repeated technique failure. Validity was examined using data from the first session: convergent validity by calculating correlations between survey responses, anthropometrics, and prone bridge time; known-groups validity by using an ANOVA comparing bridge times of younger and older adults and of men and women. Test-retest reliability was examined by using a paired t-test to compare prone bridge times for Session 1 and Session 2. Furthermore, an intraclass correlation coefficient (ICC) was used to characterize relative reliability, and minimal detectable change (MDC95%) was used to describe absolute reliability. The mean prone bridge time was 145.3 ± 71.5 s, and was positively correlated with physical activity participation (p ≤ 0.001) and negatively correlated with BMI and waist circumference (p ≤ 0.003). Younger participants had significantly longer plank times than older participants (p = 0.003). The ICC between testing sessions was 0.915. The prone bridge test is a valid and reliable measure for evaluating abdominal performance in both younger and older adults. Copyright © 2017 Elsevier Ltd. All rights reserved.
S007--Preliminary Evaluation of the Pattern Cutting and the Ligating Loop Virtual Laparoscopic Trainers

PubMed Central

Chellali, A.; Ahn, W.; Sankaranarayanan, G.; Flinn, J. T.; Schwaitzberg, S. D.; Jones, D.B.; De, Suvranu; Cao, C.G.L.

2014-01-01

Introduction: The Fundamentals of Laparoscopic Surgery (FLS) trainer is currently the standard for training and evaluating basic laparoscopic skills. However, its manual scoring system is time-consuming and subjective. The Virtual Basic Laparoscopic Skill Trainer (VBLaST©) is the virtual version of the FLS trainer which allows automatic and real time assessment of skill performance, as well as force feedback. In this study, the VBLaST© pattern cutting (VBLaST-PC©) and ligating loop (VBLaST-LL©) tasks were evaluated as part of a validation study. We hypothesized that performance would be similar on the FLS and VBLaST© trainers, and that subjects with more experience would perform better than those with less experience on both trainers. Methods: Fifty-five subjects with varying surgical experience were recruited at the Learning Center during the 2013 SAGES annual meeting and were divided into two groups: experts (PGY 5, surgical fellows and surgical attendings) and novices (PGY 1–4). They were asked to perform the pattern cutting or the ligating loop task on the FLS and the VBLaST© trainers. Their performance scores for each trainer were calculated and compared. Results: There were no significant differences between the FLS and VBLaST© scores for either the pattern cutting or the ligating loop task. Experts’ scores were significantly higher than the scores for novices on both trainers. Conclusion: This study showed that the subjects’ performance on the VBLaST© trainer was similar to the FLS performance for both tasks. Both the VBLaST-PC© and the VBLaST-LL© tasks permitted discrimination between the novice and expert groups. Though concurrent and discriminant validity has been established, further studies to establish convergent and predictive validity are needed. Once validated as a training system for laparoscopic skills, the system is expected to overcome the current limitations of the FLS trainer. PMID:25159626
The validation of Huffaz Intelligence Test (HIT)

NASA Astrophysics Data System (ADS)

Rahim, Mohd Azrin Mohammad; Ahmad, Tahir; Awang, Siti Rahmah; Safar, Ajmain

2017-08-01

In general, a hafiz, a person who has memorized the Quran, has many special abilities, especially with respect to academic performance. In this study, the theory of multiple intelligences introduced by Howard Gardner is embedded in a newly developed psychometric instrument, the Huffaz Intelligence Test (HIT). This paper presents the validation and reliability of the HIT among tahfiz students in Malaysian Islamic schools. A pilot study was conducted involving 87 huffaz who were randomly selected to answer the items in the HIT. The analysis used Partial Least Squares (PLS) to assess reliability and convergent and discriminant validity. The study validated nine intelligences. The findings also indicated that the composite reliabilities for the nine types of intelligence are greater than 0.8. Thus, the HIT is a valid and reliable instrument for measuring multiple intelligences among huffaz.
MIMoSA: An Automated Method for Intermodal Segmentation Analysis of Multiple Sclerosis Brain Lesions.

PubMed

Valcarcel, Alessandra M; Linn, Kristin A; Vandekar, Simon N; Satterthwaite, Theodore D; Muschelli, John; Calabresi, Peter A; Pham, Dzung L; Martin, Melissa Lynne; Shinohara, Russell T

2018-03-08

Magnetic resonance imaging (MRI) is crucial for in vivo detection and characterization of white matter lesions (WMLs) in multiple sclerosis. While WMLs have been studied for over two decades using MRI, automated segmentation remains challenging. Although the majority of statistical techniques for the automated segmentation of WMLs are based on single imaging modalities, recent advances have used multimodal techniques for identifying WMLs. Complementary modalities emphasize different tissue properties, which help identify interrelated features of lesions. Method for Inter-Modal Segmentation Analysis (MIMoSA), a fully automatic lesion segmentation algorithm that utilizes novel covariance features from intermodal coupling regression in addition to mean structure to model the probability that a lesion is contained in each voxel, is proposed. MIMoSA was validated by comparison with both expert manual and other automated segmentation methods in two datasets. The first included 98 subjects imaged at Johns Hopkins Hospital, in which bootstrap cross-validation was used to compare the performance of MIMoSA against OASIS and LesionTOADS, two popular automatic segmentation approaches. For a secondary validation, publicly available data from a segmentation challenge were used for performance benchmarking. In the Johns Hopkins study, MIMoSA yielded an average Sørensen-Dice coefficient (DSC) of .57 and a partial AUC of .68 calculated with false positive rates up to 1%. This was superior to performance using OASIS and LesionTOADS. The proposed method also performed competitively in the segmentation challenge dataset. MIMoSA resulted in statistically significant improvements in lesion segmentation performance compared with LesionTOADS and OASIS, and performed competitively in an additional validation study. Copyright © 2018 by the American Society of Neuroimaging.
The predictive validity of selection for entry into postgraduate training in general practice: evidence from three longitudinal studies

PubMed Central

Patterson, Fiona; Lievens, Filip; Kerrin, Máire; Munro, Neil; Irish, Bill

2013-01-01

Background: The selection methodology for UK general practice is designed to accommodate several thousand applicants per year and targets six core attributes identified in a multi-method job-analysis study. Aim: To evaluate the predictive validity of selection methods for entry into postgraduate training, comprising a clinical problem-solving test, a situational judgement test, and a selection centre. Design and setting: A three-part longitudinal predictive validity study of selection into training for UK general practice. Method: In sample 1, participants were junior doctors applying for training in general practice (n = 6824). In sample 2, participants were GP registrars 1 year into training (n = 196). In sample 3, participants were GP registrars sitting the licensing examination after 3 years, at the end of training (n = 2292). The outcome measures include: assessor ratings of performance in a selection centre comprising job simulation exercises (sample 1); supervisor ratings of trainee job performance 1 year into training (sample 2); and licensing examination results, including an applied knowledge examination and a 12-station clinical skills objective structured clinical examination (OSCE; sample 3). Results: Performance ratings at selection predicted subsequent supervisor ratings of job performance 1 year later. Selection results also significantly predicted performance on both the clinical skills OSCE and applied knowledge examination for licensing at the end of training. Conclusion: In combination, these longitudinal findings provide good evidence of the predictive validity of the selection methods, and are the first reported for entry into postgraduate training. Results show that the best predictor of work performance and training outcomes is a combination of a clinical problem-solving test, a situational judgement test, and a selection centre. Implications for selection methods for all postgraduate specialties are considered. PMID:24267856
The predictive validity of selection for entry into postgraduate training in general practice: evidence from three longitudinal studies.

PubMed

Patterson, Fiona; Lievens, Filip; Kerrin, Máire; Munro, Neil; Irish, Bill

2013-11-01

The selection methodology for UK general practice is designed to accommodate several thousand applicants per year and targets six core attributes identified in a multi-method job-analysis study. To evaluate the predictive validity of selection methods for entry into postgraduate training, comprising a clinical problem-solving test, a situational judgement test, and a selection centre. A three-part longitudinal predictive validity study of selection into training for UK general practice. In sample 1, participants were junior doctors applying for training in general practice (n = 6824). In sample 2, participants were GP registrars 1 year into training (n = 196). In sample 3, participants were GP registrars sitting the licensing examination after 3 years, at the end of training (n = 2292). The outcome measures include: assessor ratings of performance in a selection centre comprising job simulation exercises (sample 1); supervisor ratings of trainee job performance 1 year into training (sample 2); and licensing examination results, including an applied knowledge examination and a 12-station clinical skills objective structured clinical examination (OSCE; sample 3). Performance ratings at selection predicted subsequent supervisor ratings of job performance 1 year later. Selection results also significantly predicted performance on both the clinical skills OSCE and applied knowledge examination for licensing at the end of training. In combination, these longitudinal findings provide good evidence of the predictive validity of the selection methods, and are the first reported for entry into postgraduate training. Results show that the best predictor of work performance and training outcomes is a combination of a clinical problem-solving test, a situational judgement test, and a selection centre. Implications for selection methods for all postgraduate specialties are considered.

Word Memory Test Performance Across Cognitive Domains, Psychiatric Presentations, and Mild Traumatic Brain Injury.

PubMed

Rowland, Jared A; Miskey, Holly M; Brearly, Timothy W; Martindale, Sarah L; Shura, Robert D

2017-05-01

The current study addressed two aims: (i) determine how Word Memory Test (WMT) performance relates to test performance across numerous cognitive domains and (ii) evaluate how current psychiatric disorders or mild traumatic brain injury (mTBI) history affects performance on the WMT after excluding participants with poor symptom validity. Participants were 235 Iraq and Afghanistan-era veterans (mean age = 35.5) who completed a comprehensive neuropsychological battery. Participants were divided into two groups based on WMT performance (Pass = 193, Fail = 42). Tests were grouped into cognitive domains and an average z-score was calculated for each domain. Significant differences were found between those who passed and those who failed the WMT on the memory, attention, executive function, and motor output domain z-scores. WMT failure was associated with a larger performance decrement in the memory domain than in the sensation or visuospatial-construction domains. Participants with a current psychiatric diagnosis or mTBI history were significantly more likely to fail the WMT, even after removing participants with poor symptom validity. Results suggest that the WMT is most appropriate for assessing validity in the domains of attention, executive function, motor output and memory, with little relationship to performance in domains of sensation or visuospatial-construction. Comprehensive cognitive batteries would benefit from inclusion of additional performance validity tests in these domains. Additionally, symptom validity did not explain higher rates of WMT failure in individuals with a current psychiatric diagnosis or mTBI history. Further research is needed to better understand how these conditions may affect WMT performance. Published by Oxford University Press 2016. This work is written by (a) US Government employee(s) and is in the public domain in the US.
Freezing of gait and fall detection in Parkinson's disease using wearable sensors: a systematic review.

PubMed

Silva de Lima, Ana Lígia; Evers, Luc J W; Hahn, Tim; Bataille, Lauren; Hamilton, Jamie L; Little, Max A; Okuma, Yasuyuki; Bloem, Bastiaan R; Faber, Marjan J

2017-08-01

Despite the large number of studies that have investigated the use of wearable sensors to detect gait disturbances such as freezing of gait (FOG) and falls, there is little consensus regarding appropriate methodologies for how to optimally apply such devices. Here, an overview of the use of wearable systems to assess FOG and falls in Parkinson's disease (PD) and validation performance is presented. A systematic search in the PubMed and Web of Science databases was performed using a group of concept key words. The final search was performed in January 2017, and articles were selected based upon a set of eligibility criteria. In total, 27 articles were selected. Of those, 23 related to FOG and 4 to falls. FOG studies were performed in either laboratory or home settings, with sample sizes ranging from 1 PD up to 48 PD presenting Hoehn and Yahr stage from 2 to 4. The shin was the most common sensor location and accelerometer was the most frequently used sensor type. Validity measures ranged from 73-100% for sensitivity and 67-100% for specificity. Falls and fall risk studies were all home-based, including sample sizes of 1 PD up to 107 PD, mostly using one sensor containing accelerometers, worn at various body locations. Despite the promising validation initiatives reported in these studies, they were all performed in relatively small sample sizes, and there was a significant variability in outcomes measured and results reported. Given these limitations, the validation of sensor-derived assessments of PD features would benefit from more focused research efforts, increased collaboration among researchers, aligning data collection protocols, and sharing data sets.
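The validity measures reported in this review, sensitivity and specificity, are simple ratios of detection counts. The sketch below shows the standard definitions. The counts are hypothetical and chosen only for illustration; they are not taken from any of the reviewed studies.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of real events (e.g. FOG episodes) that the sensor detected."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of event-free windows that the sensor correctly left unflagged."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts from comparing sensor output with video annotation.
tp, fn = 73, 27   # annotated episodes detected / missed
tn, fp = 67, 33   # event-free windows correctly ignored / falsely flagged

sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
```

With very small samples (the review includes studies with a single participant), both ratios rest on few events, which is one reason the reported ranges are so wide.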
Cultural Adaptation and Validation of the Cultural Self-Efficacy Scale for Colombian Nursing Professionals.

PubMed

Herrero-Hahn, Raquel; Rojas, Juan Guillermo; Ospina-Díaz, Juan Manuel; Montoya-Juárez, Rafael; Restrepo-Medrano, Juan Carlos; Hueso-Montoro, César

2017-03-01

The level of cultural self-efficacy indicates the degree of confidence nursing professionals possess for their ability to provide culturally competent care. Cultural adaptation and validation of the Cultural Self-Efficacy Scale was performed for nursing professionals in Colombia. A scale validation study was conducted. Cultural adaptation and validation of the Cultural Self-Efficacy Scale was performed using a sample of 190 nurses in Colombia, between September 2013 and April 2014. This sample was chosen via systematic random sampling from a finite population. The scale was culturally adapted. Cronbach's alpha for the revised scale was .978. Factor analysis revealed the existence of six factors grouped in three dimensions that explained 68% of the variance. The results demonstrated that the version of the Cultural Self-Efficacy Scale adapted to the Colombian context is a valid and reliable instrument for determining the level of cultural self-efficacy of nursing professionals.
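Cronbach's alpha, the reliability statistic reported in this and several other records, can be computed from an item-response matrix with the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The sketch below is a minimal illustration; the response data are hypothetical and unrelated to the scale validated in this study.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for rows = respondents, columns = items.

    Uses sample variances (n - 1 denominator) throughout.
    """
    k = len(responses[0])                      # number of items
    items = list(zip(*responses))              # transpose: one tuple per item
    item_var_sum = sum(variance(item) for item in items)
    total_var = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - item_var_sum / total_var)


# Hypothetical 5 respondents x 3 items on a 1-5 scale.
responses = [
    [4, 5, 4],
    [3, 3, 2],
    [5, 5, 5],
    [2, 3, 2],
    [4, 4, 5],
]
alpha = cronbach_alpha(responses)
```

Values close to 1 indicate that the items vary together. Very high values, such as those above .97, are sometimes read as a sign of redundant items, although the abstract does not discuss this.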
Multicenter validation of a bedside antisaccade task as a measure of executive function

PubMed Central

Hellmuth, J.; Mirsky, J.; Heuer, H.W.; Matlin, A.; Jafari, A.; Garbutt, S.; Widmeyer, M.; Berhel, A.; Sinha, L.; Miller, B.L.; Kramer, J.H.

2012-01-01

Objective: To create and validate a simple, standardized version of the antisaccade (AS) task that requires no specialized equipment for use as a measure of executive function in multicenter clinical studies. Methods: The bedside AS (BAS) task consisted of 40 pseudorandomized AS trials presented on a laptop computer. BAS performance was compared with AS performance measured using an infrared eye tracker in normal elders (NE) and individuals with mild cognitive impairment (MCI) or dementia (n = 33). The neuropsychological domain specificity of the BAS was then determined in a cohort of NE, MCI, and dementia (n = 103) at UCSF, and the BAS was validated as a measure of executive function in a 6-center cohort (n = 397) of normal adults and patients with a variety of brain diseases. Results: Performance on the BAS and laboratory AS task was strongly correlated and BAS performance was most strongly associated with neuropsychological measures of executive function. Even after controlling for disease severity and processing speed, BAS performance was associated with multiple assessments of executive function, most strongly the informant-based Frontal Systems Behavior Scale. Conclusions: The BAS is a simple, valid measure of executive function in aging and neurologic disease. PMID:22573640
Uncertainty estimates of purity measurements based on current information: toward a "live validation" of purity methods.

PubMed

Apostol, Izydor; Kelner, Drew; Jiang, Xinzhao Grace; Huang, Gang; Wypych, Jette; Zhang, Xin; Gastwirt, Jessica; Chen, Kenneth; Fodor, Szilan; Hapuarachchi, Suminda; Meriage, Dave; Ye, Frank; Poppe, Leszek; Szpankowski, Wojciech

2012-12-01

To predict precision and other performance characteristics of chromatographic purity methods, which represent the most widely used form of analysis in the biopharmaceutical industry. We have conducted a comprehensive survey of purity methods, and show that all performance characteristics fall within narrow measurement ranges. This observation was used to develop a model called Uncertainty Based on Current Information (UBCI), which expresses these performance characteristics as a function of the signal and noise levels, hardware specifications, and software settings. We applied the UBCI model to assess the uncertainty of purity measurements, and compared the results to those from conventional qualification. We demonstrated that the UBCI model is suitable to dynamically assess method performance characteristics, based on information extracted from individual chromatograms. The model provides an opportunity for streamlining qualification and validation studies by implementing a "live validation" of test results utilizing UBCI as a concurrent assessment of measurement uncertainty. Therefore, UBCI can potentially mitigate the challenges associated with laborious conventional method validation and facilitates the introduction of more advanced analytical technologies during the method lifecycle.
Organisational Climate as a Predictor of Workforce Performance in the Malaysian Higher Education Institutions

ERIC Educational Resources Information Center

Musah, Mohammed Borhandden; Ali, Hairuddin Mohd; al-Hudawi, Shafeeq Hussain Vazhathodi; Tahir, Lokman Mohd; Binti Daud, Khadijah; Bin Said, Hamdan; Kamil, Naail Mohammed

2016-01-01

Purpose: This study aims to investigate whether organisational climate (OC) predicts academic staff performance at Malaysian higher education institutions (HEIs). The study equally aims at validating the psychometric properties of OC and workforce performance (WFP) constructs. Design/methodology/approach: Survey questionnaires were administered to…
Simulated Driving Assessment (SDA) for teen drivers: results from a validation study.

PubMed

McDonald, Catherine C; Kandadai, Venk; Loeb, Helen; Seacrist, Thomas S; Lee, Yi-Ching; Winston, Zachary; Winston, Flaura K

2015-06-01

Driver error and inadequate skill are common critical reasons for novice teen driver crashes, yet few validated, standardised assessments of teen driving skills exist. The purpose of this study is to evaluate the construct and criterion validity of a newly developed Simulated Driving Assessment (SDA) for novice teen drivers. The SDA's 35 min simulated drive incorporates 22 variations of the most common teen driver crash configurations. Driving performance was compared for 21 inexperienced teens (age 16-17 years, provisional license ≤90 days) and 17 experienced adults (age 25-50 years, license ≥5 years, drove ≥100 miles per week, no collisions or moving violations ≤3 years). SDA driving performance (Error Score) was based on driving safety measures derived from simulator and eye-tracking data. Negative driving outcomes included simulated collisions or run-off-the-road incidents. A professional driving evaluator/instructor (DEI Score) reviewed videos of SDA performance. The SDA demonstrated construct validity: (1) teens had a higher Error Score than adults (30 vs. 13, p=0.02); (2) for each additional error committed, the RR of a participant's propensity for a simulated negative driving outcome increased by 8% (95% CI 1.05 to 1.10, p<0.01). The SDA demonstrated criterion validity: Error Score was correlated with DEI Score (r=-0.66, p<0.001). This study supports the concept of validated simulated driving tests like the SDA to assess novice driver skill in complex and hazardous driving scenarios. The SDA, as a standard protocol to evaluate teen driver performance, has the potential to facilitate screening and assessment of teen driving readiness and could be used to guide targeted skill training. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.
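To give a feel for what an 8% increase in relative risk per additional error implies, the sketch below compounds the per-error RR over several errors. It assumes a multiplicative (log-linear) model, which is the usual reading of a per-unit RR but is not stated in the abstract. The function name and the example of a 17-error gap (the difference between the reported Error Scores of 30 and 13) are illustrative only.

```python
def compounded_relative_risk(extra_errors, rr_per_error=1.08):
    """Relative risk after `extra_errors` additional errors.

    Assumes each additional error multiplies the risk by `rr_per_error`
    (a log-linear model; an assumption, not confirmed by the abstract).
    """
    return rr_per_error ** extra_errors


# One additional error: the reported 8% increase.
rr_one = compounded_relative_risk(1)

# Hypothetical comparison: a 17-error gap between two drivers.
rr_gap = compounded_relative_risk(17)
```

Under this assumption a 17-error gap corresponds to a relative risk of roughly 3.7, which shows how a modest per-error effect accumulates across the range of scores observed.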
The development of performance-based practical assessment model at civil engineering workshop in state polytechnic

NASA Astrophysics Data System (ADS)

Kristinayanti, W. S.; Mas Pertiwi, I. G. A. I.; Evin Yudhi, S.; Lokantara, W. D.

2018-01-01

Assessment is an important element in education that should oversee students’ competence not only in terms of the cognitive aspect but also the students’ psychomotor aspect in a comprehensive way. The Civil Engineering Department at Bali State Polytechnic, as a vocational education institution, emphasizes not only the theoretical foundation of the study but also its application through practicum in workshop-based learning. We are aware of a need for performance-based assessment for these students, which would be essential for the students’ all-round performance in their studies. We try to develop a performance-based practicum assessment model that is needed to assess students’ ability in workshop-based learning. This research was conducted in three stages: 1) learning needs analysis, 2) instrument development, and 3) testing of instruments. The study uses a rubrics set-up to test students’ competence in the workshop and to test validity. We obtained 34 valid statements out of 35, with a Cronbach’s alpha of 0.977. In the expert test we obtained a CVI of 0.75, which means that the drafted assessment is empirically valid within the trial group.
Practical Aspects of Designing and Conducting Validation Studies Involving Multi-study Trials.

PubMed

Coecke, Sandra; Bernasconi, Camilla; Bowe, Gerard; Bostroem, Ann-Charlotte; Burton, Julien; Cole, Thomas; Fortaner, Salvador; Gouliarmou, Varvara; Gray, Andrew; Griesinger, Claudius; Louhimies, Susanna; Gyves, Emilio Mendoza-de; Joossens, Elisabeth; Prinz, Maurits-Jan; Milcamps, Anne; Parissis, Nicholaos; Wilk-Zasadna, Iwona; Barroso, João; Desprez, Bertrand; Langezaal, Ingrid; Liska, Roman; Morath, Siegfried; Reina, Vittorio; Zorzoli, Chiara; Zuang, Valérie

This chapter focuses on practical aspects of conducting prospective in vitro validation studies, and in particular, by laboratories that are members of the European Union Network of Laboratories for the Validation of Alternative Methods (EU-NETVAL) that is coordinated by the EU Reference Laboratory for Alternatives to Animal Testing (EURL ECVAM). Prospective validation studies involving EU-NETVAL, comprising a multi-study trial involving several laboratories or "test facilities", typically consist of two main steps: (1) the design of the validation study by EURL ECVAM and (2) the execution of the multi-study trial by a number of qualified laboratories within EU-NETVAL, coordinated and supported by EURL ECVAM. The approach adopted in the conduct of these validation studies adheres to the principles described in the OECD Guidance Document on the Validation and International Acceptance of new or updated test methods for Hazard Assessment No. 34 (OECD 2005). The context and scope of conducting prospective in vitro validation studies is dealt with in Chap. 4. Here we focus mainly on the processes followed to carry out a prospective validation of in vitro methods involving different laboratories with the ultimate aim of generating a dataset that can support a decision in relation to the possible development of an international test guideline (e.g. by the OECD) or the establishment of performance standards.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Rainer, Leo I.; Hoeschele, Marc A.; Apte, Michael G.

This report addresses the results of detailed monitoring completed under Program Element 6 of Lawrence Berkeley National Laboratory's High Performance Commercial Building Systems (HPCBS) PIER program. The purpose of the Energy Simulations and Projected State-Wide Energy Savings project is to develop reasonable energy performance and cost models for high performance relocatable classrooms (RCs) across California climates. A key objective of the energy monitoring was to validate DOE2 simulations for comparison to initial DOE2 performance projections. The validated DOE2 model was then used to develop statewide savings projections by modeling base case and high performance RC operation in the 16 California climate zones. The primary objective of this phase of work was to utilize detailed field monitoring data to modify DOE2 inputs and generate performance projections based on a validated simulation model. Additional objectives include the following: (1) Obtain comparative performance data on base case and high performance HVAC systems to determine how they are operated, how they perform, and how the occupants respond to the advanced systems. This was accomplished by installing both HVAC systems side-by-side (i.e., one per module of a standard two module, 24 ft by 40 ft RC) on the study RCs and switching HVAC operating modes on a weekly basis. (2) Develop projected statewide energy and demand impacts based on the validated DOE2 model. (3) Develop cost effectiveness projections for the high performance HVAC system in the 16 California climate zones.
Development of a proficiency-based virtual reality simulation training curriculum for laparoscopic appendicectomy.

PubMed

Sirimanna, Pramudith; Gladman, Marc A

2017-10-01

Proficiency-based virtual reality (VR) training curricula improve intraoperative performance, but have not been developed for laparoscopic appendicectomy (LA). This study aimed to develop an evidence-based training curriculum for LA. A total of 10 experienced (>50 LAs), eight intermediate (10-30 LAs) and 20 inexperienced (<10 LAs) operators performed guided and unguided LA tasks on a high-fidelity VR simulator using internationally relevant techniques. The ability to differentiate levels of experience (construct validity) was measured using simulator-derived metrics. Learning curves were analysed. Proficiency benchmarks were defined by the performance of the experienced group. Intermediate and experienced participants completed a questionnaire to evaluate the realism (face validity) and relevance (content validity). Of 18 surgeons, 16 (89%) considered the VR model to be visually realistic and 17 (95%) believed that it was representative of actual practice. All 'guided' modules demonstrated construct validity (P < 0.05), with learning curves that plateaued between sessions 6 and 9 (P < 0.01). When comparing inexperienced to intermediates to experienced, the 'unguided' LA module demonstrated construct validity for economy of motion (5.00 versus 7.17 versus 7.84, respectively; P < 0.01) and task time (864.5 s versus 477.2 s versus 352.1 s, respectively, P < 0.01). Construct validity was also confirmed for number of movements, path length and idle time. Validated modules were used for curriculum construction, with proficiency benchmarks used as performance goals. A VR LA model was realistic and representative of actual practice and was validated as a training and assessment tool. Consequently, the first evidence-based internationally applicable training curriculum for LA was constructed, which facilitates skill acquisition to proficiency. © 2017 Royal Australasian College of Surgeons.
Development and validation of a music performance anxiety inventory for gifted adolescent musicians.

PubMed

Osborne, Margaret S; Kenny, Dianna T

2005-01-01

Music performance anxiety (MPA) is a distressing experience for musicians of all ages, yet the empirical investigation of MPA in adolescents has received little attention to date. No measures specifically targeting MPA in adolescents have been empirically validated. This article presents findings of an initial study into the psychometric properties and validation of the Music Performance Anxiety Inventory for Adolescents (MPAI-A), a new self-report measure of MPA for this group. Data from 381 elite young musicians aged 12-19 years was used to investigate the factor structure, internal reliability, construct and divergent validity of the MPAI-A. Cronbach's alpha for the full measure was .91. Factor analysis identified three factors, which together accounted for 53% of the variance. Construct validity was demonstrated by significant positive relationships with social phobia (measured using the Social Phobia Anxiety Inventory [Beidel, D. C., Turner, S. M., & Morris, T. L. (1995). A new inventory to assess childhood social anxiety and phobia: The Social Phobia and Anxiety Inventory for Children. Psychological Assessment, 7(1), 73-79; Beidel, D. C., Turner, S. M., & Morris, T. L. (1998). Social Phobia and Anxiety Inventory for Children (SPAI-C). North Tonawanda, NY: Multi-Health Systems Inc.]) and trait anxiety (measured using the State Trait Anxiety Inventory [Spielberger, C. D. (1983). State-Trait Anxiety Inventory STAI (Form Y). Palo Alto, CA: Consulting Psychologists Press, Inc.]). The MPAI-A demonstrated convergent validity by a moderate to strong positive correlation with an adult measure of MPA. Discriminant validity was established by a weaker positive relationship with depression, and no relationship with externalizing behavior problems. It is hoped that the MPAI-A, as the first empirically validated measure of adolescent musicians' performance anxiety, will enhance and promote phenomenological and treatment research in this area.
Daily functioning profile of children with attention deficit hyperactive disorder: A pilot study using an ecological assessment.

PubMed

Rosenblum, Sara; Frisch, Carmit; Deutsh-Castel, Tsofia; Josman, Naomi

2015-01-01

Children with attention-deficit hyperactivity disorder (ADHD) often present with activities of daily living (ADL) performance deficits. This study aimed to compare the performance characteristics of children with ADHD to those of controls based on the Do-Eat assessment tool, and to establish the tool's validity. Participants were 23 children with ADHD and 24 matched controls, aged 6-9 years. In addition to the Do-Eat, the Children Activity Scale-Parent (ChAS-P) and the Behavioral Rating Inventory of Executive Function (BRIEF) were used to measure sensorimotor abilities and executive function (EF). Significant differences were found in the Do-Eat scores between children with ADHD and controls. Significant moderate correlations were found between the Do-Eat sensorimotor scores, the ChAS-P and the BRIEF scores in the ADHD group. Significant correlations were found between performance on the Do-Eat and the ChAS-P questionnaire scores, verifying the tool's ecological validity. A single discriminant function described primarily by four Do-Eat variables, correctly classified 95.5% of the study participants into their respective study groups, establishing the tool's predictive validity within this population. These preliminary findings indicate that the Do-Eat may serve as a reliable and valid tool that provides insight into the daily functioning characteristics of children with ADHD. However, further research on larger samples is indicated.
Validity of the Agency for Health Care Research and Quality Patient Safety Indicators and the Centers for Medicare and Medicaid Hospital-acquired Conditions: A Systematic Review and Meta-Analysis.

PubMed

Winters, Bradford D; Bharmal, Aamir; Wilson, Renee F; Zhang, Allen; Engineer, Lilly; Defoe, Deidre; Bass, Eric B; Dy, Sydney; Pronovost, Peter J

2016-12-01

The Agency for Health Care Research and Quality Patient Safety Indicators (PSIs) and Centers for Medicare and Medicaid Services Hospital-acquired Conditions (HACs) are increasingly being used for pay-for-performance and public reporting despite concerns over their validity. Given the potential for these measures to misinform patients, misclassify hospitals, and misapply financial and reputational harm to hospitals, these need to be rigorously evaluated. We performed a systematic review and meta-analysis to assess PSI and HAC measure validity. We searched MEDLINE and the gray literature from January 1, 1990 through January 14, 2015 for studies that addressed the validity of the HAC measures and PSIs. Secondary outcomes included the effects of present on admission (POA) modifiers, and the most common reasons for discrepancies. We developed pooled results for measures evaluated by ≥3 studies. We propose a threshold of 80% for positive predictive value or sensitivity for pay-for-performance and public reporting suitability. Only 5 measures, Iatrogenic Pneumothorax (PSI 6/HAC 17), Central Line-associated Bloodstream Infections (PSI 7), Postoperative hemorrhage/hematoma (PSI 9), Postoperative deep vein thrombosis/pulmonary embolus (PSI 12), and Accidental Puncture/Laceration (PSI 15), had sufficient data for pooled meta-analysis. Only PSI 15 (Accidental Puncture and Laceration) met our proposed threshold for validity (positive predictive value only) but this result was weakened by considerable heterogeneity. Coding errors were the most common reasons for discrepancies between medical record review and administrative databases. POA modifiers may improve the validity of some measures. This systematic review finds that there is limited validity for the PSI and HAC measures when measured against the reference standard of a medical chart review. Their use, as they currently exist, for public reporting and pay-for-performance, should be publicly reevaluated in light of these findings.
Exploring rationality in schizophrenia

PubMed Central

Mortensen, Erik Lykke; Owen, Gareth; Nordgaard, Julie; Jansson, Lennart; Sæbye, Ditte; Flensborg-Madsen, Trine; Parnas, Josef

2015-01-01

Background Empirical studies of rationality (syllogisms) in patients with schizophrenia have obtained different results. One study found that patients reason more logically if the syllogism is presented through an unusual content. Aims To explore syllogism-based rationality in schizophrenia. Method Thirty-eight first-admitted patients with schizophrenia and 38 healthy controls solved 29 syllogisms that varied in presentation content (ordinary v. unusual) and validity (valid v. invalid). Statistical tests were made of unadjusted and adjusted group differences in models adjusting for intelligence and neuropsychological test performance. Results Controls outperformed patients on all syllogism types, but the difference between the two groups was only significant for valid syllogisms presented with unusual content. However, when adjusting for intelligence and neuropsychological test performance, all group differences became non-significant. Conclusions When taking intelligence and neuropsychological performance into account, patients with schizophrenia and controls perform similarly on syllogism tests of rationality. Declaration of interest None. Copyright and usage © The Royal College of Psychiatrists 2015. This is an open access article distributed under the terms of the Creative Commons Non-Commercial, No Derivatives (CC BY-NC-ND) licence. PMID:27703730
U.S.-MEXICO BORDER PROGRAM ARIZONA BORDER STUDY--STANDARD OPERATING PROCEDURE FOR PERFORMANCE OF COMPUTER SOFTWARE: VERIFICATION AND VALIDATION (IIT-A-2.0)

EPA Science Inventory

The purpose of this SOP is to define the procedures for the initial and periodic verification and validation of computer programs. The programs are used during the Arizona NHEXAS project and Border study at the Illinois Institute of Technology (IIT) site. Keywords: computers; s...
Conceptualization of Approaches and Thought Processes Emerging in Validating of Model in Mathematical Modeling in Technology Aided Environment

ERIC Educational Resources Information Center

Hidiroglu, Çaglar Naci; Bukova Güzel, Esra

2013-01-01

The aim of the present study is to conceptualize the approaches displayed for validation of model and thought processes provided in mathematical modeling process performed in technology-aided learning environment. The participants of this grounded theory study were nineteen secondary school mathematics student teachers. The data gathered from the…
NHEXAS PHASE I ARIZONA STUDY--STANDARD OPERATING PROCEDURE FOR PERFORMANCE OF COMPUTER SOFTWARE: VERIFICATION AND VALIDATION (UA-D-2.0)

EPA Science Inventory

The purpose of this SOP is to define the procedures used for the initial and periodic verification and validation of computer programs used during the Arizona NHEXAS project and the "Border" study. Keywords: Computers; Software; QA/QC.
The National Human Exposure Assessment Sur...
The Construction and Validation of an Abridged Version of the Autism-Spectrum Quotient (AQ-Short)

ERIC Educational Resources Information Center

Hoekstra, Rosa A.; Vinkhuyzen, Anna A. E.; Wheelwright, Sally; Bartels, Meike; Boomsma, Dorret I.; Baron-Cohen, Simon; Posthuma, Danielle; van der Sluis, Sophie

2011-01-01

This study reports on the development and validation of an abridged version of the 50-item Autism-Spectrum Quotient (AQ), a self-report measure of autistic traits. We aimed to reduce the number of items whilst retaining high validity and a meaningful factor structure. The item reduction procedure was performed on data from 1,263 Dutch students and…
Validation approach for a fast and simple targeted screening method for 75 antibiotics in meat and aquaculture products using LC-MS/MS.

PubMed

Dubreil, Estelle; Gautier, Sophie; Fourmond, Marie-Pierre; Bessiral, Mélaine; Gaugain, Murielle; Verdon, Eric; Pessel, Dominique

2017-04-01

An approach is described to validate a fast and simple targeted screening method for antibiotic analysis in meat and aquaculture products by LC-MS/MS. The strategy of validation was applied for a panel of 75 antibiotics belonging to different families, i.e., penicillins, cephalosporins, sulfonamides, macrolides, quinolones and phenicols. The samples were extracted once with acetonitrile, concentrated by evaporation and injected into the LC-MS/MS system. The approach chosen for the validation was based on the Community Reference Laboratory (CRL) guidelines for the validation of screening qualitative methods. The aim of the validation was to prove sufficient sensitivity of the method to detect all the targeted antibiotics at the level of interest, generally the maximum residue limit (MRL). A robustness study was also performed to test the influence of different factors. The validation showed that the method is valid to detect and identify 73 antibiotics of the 75 antibiotics studied in meat and aquaculture products at the validation levels.

Performance Evaluation of a Data Validation System

NASA Technical Reports Server (NTRS)

Wong, Edmond (Technical Monitor); Sowers, T. Shane; Santi, L. Michael; Bickford, Randall L.

2005-01-01

Online data validation is a performance-enhancing component of modern control and health management systems. It is essential that performance of the data validation system be verified prior to its use in a control and health management system. A new Data Qualification and Validation (DQV) Test-bed application was developed to provide a systematic test environment for this performance verification. The DQV Test-bed was used to evaluate a model-based data validation package known as the Data Quality Validation Studio (DQVS). DQVS was employed as the primary data validation component of a rocket engine health management (EHM) system developed under NASA's NGLT (Next Generation Launch Technology) program. In this paper, the DQVS and DQV Test-bed software applications are described, and the DQV Test-bed verification procedure for this EHM system application is presented. Test-bed results are summarized and implications for EHM system performance improvements are discussed.
Does True Neurocognitive Dysfunction Contribute to Minnesota Multiphasic Personality Inventory-2nd Edition-Restructured Form Cognitive Validity Scale Scores?

PubMed

Martin, Phillip K; Schroeder, Ryan W; Heinrichs, Robin J; Baade, Lyle E

2015-08-01

Previous research has demonstrated RBS and FBS-r to identify non-credible reporters of cognitive symptoms, but the extent that these scales might be influenced by true neurocognitive dysfunction has not been previously studied. The present study examined the relationship between these cognitive validity scales and neurocognitive performance across seven domains of cognitive functioning, both before and after controlling for PVT status in 120 individuals referred for neuropsychological evaluations. Variance in RBS, but not FBS-r, was significantly accounted for by neurocognitive test performance across most cognitive domains. After controlling for PVT status, however, relationships between neurocognitive test performance and validity scales were no longer significant for RBS, and remained non-significant for FBS-r. Additionally, PVT failure accounted for a significant proportion of the variance in both RBS and FBS-r. Results support both the convergent and discriminant validity of RBS and FBS-r. As neither scale was impacted by true neurocognitive dysfunction, these findings provide further support for the use of RBS and FBS-r in neuropsychological evaluations. © The Author 2015. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Embedded performance validity testing in neuropsychological assessment: Potential clinical tools.

PubMed

Rickards, Tyler A; Cranston, Christopher C; Touradji, Pegah; Bechtold, Kathleen T

2018-01-01

The article aims to suggest clinically useful tools in neuropsychological assessment for efficient use of embedded measures of performance validity. To accomplish this, we integrated available validity-related and statistical research from the literature, consensus statements, and survey-based data from practicing neuropsychologists. We provide recommendations for 1) cutoffs for embedded performance validity tests including Reliable Digit Span, California Verbal Learning Test (Second Edition) Forced Choice Recognition, Rey-Osterrieth Complex Figure Test Combination Score, Wisconsin Card Sorting Test Failure to Maintain Set, and the Finger Tapping Test; 2) selecting the number of performance validity measures to administer in an assessment; and 3) hypothetical clinical decision-making models for use of performance validity testing in a neuropsychological assessment, collectively considering behavior, patient reporting, and data indicating invalid or noncredible performance. Performance validity testing helps inform the clinician about an individual's general approach to tasks: response to failure, task engagement and persistence, and compliance with task demands. Data-driven clinical suggestions provide a resource for clinicians and are intended to instigate conversation within the field, promote more uniform, testable decisions, and guide future research in this area.
The Brief Fear of Negative Evaluation Scale (BFNE): translation and validation study of the Iranian version

PubMed Central

Tavoli, Azadeh; Melyani, Mahdiyeh; Bakhtiari, Maryam; Ghaedi, Gholam Hossein; Montazeri, Ali

2009-01-01

Background The Brief Fear of Negative Evaluation Scale (BFNE) is a commonly used instrument to measure social anxiety. This study aimed to translate and to test the reliability and validity of the BFNE in Iran. Methods The English language version of the BFNE was translated into Persian (Iranian language) and was used in this study. The questionnaire was administered to a consecutive sample of 235 students with (n = 33, clinical group) and without social phobia (n = 202, non-clinical group). In addition to the BFNE, two standard instruments were used to measure social phobia severity: the Social Phobia Inventory (SPIN), and the Social Interaction Anxiety Scale (SIAS). All participants completed a brief background information questionnaire, the SPIN, the SIAS and the BFNE scales. Statistical analysis was performed to test the reliability and validity of the BFNE. Results In all 235 students were studied (111 male and 124 female). The mean age for non-clinical group was 22.2 (SD = 2.1) years and for clinical sample it was 22.4 (SD = 1.8) years. Cronbach's alpha coefficient (to test reliability) was acceptable for both non-clinical and clinical samples (α = 0.90 and 0.82 respectively). In addition, 3-week test-retest reliability was performed in non-clinical sample and the intraclass correlation coefficient (ICC) was quite high (ICC = 0.71). Validity as performed using convergent and discriminant validity showed satisfactory results. The questionnaire correlated well with established measures of social phobia such as the SPIN (r = 0.43, p < 0.001) and the SIAS (r = 0.54, p < 0.001). Also the BFNE discriminated well between men and women with and without social phobia in the expected direction. Factor analysis supported a two-factor solution corresponding to positive and reverse-worded items. Conclusion This validation study of the Iranian version of BFNE proved that it is an acceptable, reliable and valid measure of social phobia. 
However, because the scale showed a two-factor structure that does not conform to the theoretical basis for the BFNE, we suggest using the BFNE-II when it becomes available in Iran. The validation study of the BFNE-II is in progress. PMID:19589161
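As an aside on the reliability statistic reported in this record, the following is a minimal sketch of how Cronbach's alpha can be computed from item-level responses. The function name `cronbach_alpha` and the toy data are illustrative assumptions; they are not taken from the study, which does not describe its software or raw scores.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    k = len(scores[0])
    if k < 2:
        raise ValueError("at least two items are required")
    # Variance of each item (column) across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Variance of each respondent's total score.
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical toy data: 3 respondents answering 2 items.
toy = [[1, 2], [2, 4], [3, 6]]
print(round(cronbach_alpha(toy), 3))
```

The sketch assumes complete data and a non-zero variance of total scores; real questionnaires would also need reverse-worded items recoded before the calculation.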
The LEAP™ Gesture Interface Device and Take-Home Laparoscopic Simulators: A Study of Construct and Concurrent Validity.

PubMed

Partridge, Roland W; Brown, Fraser S; Brennan, Paul M; Hennessey, Iain A M; Hughes, Mark A

2016-02-01

To assess the potential of the LEAP™ infrared motion tracking device to map laparoscopic instrument movement in a simulated environment. Simulator training is optimized when augmented by objective performance feedback. We explore the potential LEAP has to provide this in a way compatible with affordable take-home simulators. LEAP and the previously validated InsTrac visual tracking tool mapped expert and novice performances of a standardized simulated laparoscopic task. Ability to distinguish between the 2 groups (construct validity) and correlation between techniques (concurrent validity) were the primary outcome measures. Forty-three expert and 38 novice performances demonstrated significant differences in LEAP-derived metrics for instrument path distance (P < .001), speed (P = .002), acceleration (P < .001), motion smoothness (P < .001), and distance between the instruments (P = .019). Only instrument path distance demonstrated a correlation between LEAP and InsTrac tracking methods (novices: r = .663, P < .001; experts: r = .536, P < .001). Consistency of LEAP tracking was poor (average % time hands not tracked: 31.9%). The LEAP motion device is able to track the movement of hands using instruments in a laparoscopic box simulator. Construct validity is demonstrated by its ability to distinguish novice from expert performances. Only time and instrument path distance demonstrated concurrent validity with an existing tracking method however. A number of limitations to the tracking method used by LEAP have been identified. These need to be addressed before it can be considered an alternative to visual tracking for the delivery of objective performance metrics in take-home laparoscopic simulators. © The Author(s) 2015.
Assessing the Validity of Self-Rated Health with the Short Physical Performance Battery: A Cross-Sectional Analysis of the International Mobility in Aging Study

PubMed Central

Belanger, Emmanuelle; Zunzunegui, Maria–Victoria; Phillips, Susan; Ylli, Alban; Guralnik, Jack

2016-01-01

Objective The aim of this study was to explore the validity of self-rated health across different populations of older adults, when compared to the Short Physical Performance Battery. Design Cross-sectional analysis of the International Mobility in Aging Study. Setting Five locations: Saint-Hyacinthe and Kingston (Canada), Tirana (Albania), Manizales (Colombia), and Natal (Brazil). Participants Older adults between 65 and 74 years old (n = 1,995). Methods The Short Physical Performance Battery (SPPB) was used to measure physical performance. Self-rated health was assessed with one single five-point question. Linear trends between SPPB scores and self-rated health were tested separately for men and women at each of the five international study sites. Poor physical performance (independent variable) (SPPB less than 8) was used in logistic regression models of self-rated health (dependent variable), adjusting for potential covariates. All analyses were stratified by gender and site of origin. Results A significant linear association was found between the mean scores of the Short Physical Performance Battery and ordinal categories of self-rated health across research sites and gender groups. After extensive control for objective physical and mental health indicators and socio-demographic variables, these graded associations became non-significant in some research sites. Conclusion These findings further confirm the validity of SRH as a measure of overall health status in older adults. PMID:27089219
Prediction models for successful external cephalic version: a systematic review.

PubMed

Velzel, Joost; de Hundt, Marcella; Mulder, Frederique M; Molkenboer, Jan F M; Van der Post, Joris A M; Mol, Ben W; Kok, Marjolein

2015-12-01

To provide an overview of existing prediction models for successful ECV, and to assess their quality, development and performance. We searched MEDLINE, EMBASE and the Cochrane Library to identify all articles reporting on prediction models for successful ECV published from inception to January 2015. We extracted information on study design, sample size, model-building strategies and validation. We evaluated the phases of model development and summarized their performance in terms of discrimination, calibration and clinical usefulness. We collected different predictor variables together with their defined significance, in order to identify important predictor variables for successful ECV. We identified eight articles reporting on seven prediction models. All models were subjected to internal validation. Only one model was also validated in an external cohort. Two prediction models had a low overall risk of bias, of which only one showed promising predictive performance at internal validation. This model also completed the phase of external validation. The impact on clinical practice was not evaluated for any of the models. The most important predictor variables for successful ECV described in the selected articles were parity, placental location, breech engagement and the fetal head being palpable. One model was assessed for discrimination and calibration in internal (AUC 0.71) and external validation (AUC 0.64), while two other models were assessed with discrimination and calibration, respectively. We found one prediction model for breech presentation that was validated in an external cohort and had acceptable predictive performance. This model should be used to counsel women considering ECV. Copyright © 2015. Published by Elsevier Ireland Ltd.
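The discrimination figures quoted in this record are areas under the ROC curve (AUC). As a rough illustration only, the AUC equals the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case, which can be sketched as below. The function name `auc` and the example scores are hypothetical and unrelated to the reviewed models.

```python
def auc(scores_pos, scores_neg):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the positive
    case has the higher score; ties count as half a win.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical predicted probabilities for successful and failed attempts.
print(auc([0.8, 0.4], [0.6, 0.2]))
```

Computing the same quantity once on the development data and once on an independent cohort is what distinguishes internal from external validation of discrimination.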
Factors affecting unsafe behavior in construction projects: development and validation of a new questionnaire.

PubMed

Asilian-Mahabadi, Hassan; Khosravi, Yahya; Hassanzadeh-Rangi, Narmin; Hajizadeh, Ebrahim; Behzadan, Amir H

2018-02-05

Occupational safety in general, and construction safety in particular, is a complex phenomenon. This study was designed to develop a new valid measure to evaluate factors affecting unsafe behavior in the construction industry. A new questionnaire was generated from qualitative research according to the principles of grounded theory. Key measurement properties (face validity, content validity, construct validity, reliability and discriminative validity) were examined using qualitative and quantitative approaches. The receiver operating characteristic curve was used to estimate the discriminating power and the optimal cutoff score. Construct validity revealed an interpretable 12-factor structure which explained 61.87% of variance. Good internal consistency (Cronbach's α = 0.94) and stability (intra-class correlation coefficient = 0.93) were found for the new instrument. The area under the curve, sensitivity and specificity were 0.80, 0.80 and 0.75, respectively. The new instrument also discriminated safety performance among the construction sites with different workers' accident histories (F = 6.40, p < 0.05). The new instrument appears to be a valid, reliable and sensitive instrument that will contribute to investigating the root causes of workers' unsafe behaviors, thus promoting safety performance in the construction industry.
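The abstract above reports an optimal cutoff score with its sensitivity and specificity but does not say which criterion was used to pick it. One common choice is Youden's J (sensitivity + specificity - 1); the sketch below assumes that criterion, and the function name `best_cutoff` and the data are purely illustrative.

```python
def best_cutoff(scores, labels):
    """Return (cutoff, sensitivity, specificity) maximizing Youden's J.

    labels: 1 = case, 0 = non-case. A score >= cutoff is called positive.
    Assumes at least one case and one non-case are present.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = None
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < cutoff and y == 0)
        sens = tp / n_pos
        spec = tn / n_neg
        j = sens + spec - 1
        # Keep the first (lowest) cutoff on ties.
        if best is None or j > best[0]:
            best = (j, cutoff, sens, spec)
    return best[1], best[2], best[3]


# Hypothetical questionnaire totals and outcome labels.
print(best_cutoff([1, 2, 3, 4, 5, 6], [0, 0, 0, 1, 1, 1]))
```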
Translation, cultural adaptation and validation of the Diabetes Attitudes Scale - third version into Brazilian Portuguese

PubMed Central

Vieira, Gisele de Lacerda Chaves; Pagano, Adriana Silvino; Reis, Ilka Afonso; Rodrigues, Júlia Santos Nunes; Torres, Heloísa de Carvalho

2018-01-01

ABSTRACT Objective: to perform the translation, adaptation and validation of the Diabetes Attitudes Scale - third version instrument into Brazilian Portuguese. Methods: methodological study carried out in six stages: initial translation, synthesis of the initial translation, back-translation, evaluation of the translated version by the Committee of Judges (27 Linguists and 29 health professionals), pre-test and validation. The pre-test and validation (test-retest) steps included 22 and 120 health professionals, respectively. The Content Validity Index, the analyses of internal consistency and reproducibility were performed using the R statistical program. Results: in the content validation, the instrument presented good acceptance among the Judges with a mean Content Validity Index of 0.94. The scale presented acceptable internal consistency (Cronbach’s alpha = 0.60), while the correlation of the total score at the test and retest moments was considered high (Polychoric Correlation Coefficient = 0.86). The Intra-class Correlation Coefficient, for the total score, presented a value of 0.65. Conclusion: the Brazilian version of the instrument (Escala de Atitudes dos Profissionais em relação ao Diabetes Mellitus) was considered valid and reliable for application by health professionals in Brazil. PMID:29319739
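The abstract does not state how its Content Validity Index was calculated. A widely used convention, assumed here, is that each judge rates an item's relevance on a 4-point scale, the item-level index is the proportion of judges giving a 3 or 4, and the scale-level index is the mean of the item-level values. The function names and ratings below are hypothetical.

```python
def item_cvi(ratings, relevant_min=3):
    """Proportion of judges rating the item as relevant (>= relevant_min)."""
    return sum(1 for r in ratings if r >= relevant_min) / len(ratings)


def scale_cvi_average(item_ratings):
    """Mean of the item-level indices across all items."""
    cvis = [item_cvi(ratings) for ratings in item_ratings]
    return sum(cvis) / len(cvis)


# Hypothetical ratings: two items, four judges each, 4-point relevance scale.
ratings = [[4, 4, 3, 2], [4, 4, 4, 4]]
print(scale_cvi_average(ratings))
```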
Validation of nomograms for overall survival, cancer-specific survival, and recurrence in carcinoma of the major salivary glands.

PubMed

Hay, Ashley; Migliacci, Jocelyn; Zanoni, Daniella Karassawa; Patel, Snehal; Yu, Changhong; Kattan, Michael W; Ganly, Ian

2018-05-01

The purpose of this study was to investigate the performance of the Memorial Sloan Kettering Cancer Center salivary carcinoma nomograms predicting overall survival, cancer-specific survival, and recurrence with an external validation dataset. The validation dataset comprised 123 patients treated between 2010 and 2015 at our institution. The nomograms were evaluated by assessing discrimination (concordance index [C-index]) and calibration (plotting predicted vs actual probabilities for quintiles). The validation cohort (n = 123) showed some differences from the original cohort (n = 301). The validation cohort had fewer high-grade cancers (P = .006), less lymphovascular invasion (LVI; P < .001) and shorter follow-up of 19 months versus 45.6 months. Validation showed a C-index of 0.833 (95% confidence interval [CI] 0.758-0.908), 0.807 (95% CI 0.717-0.898), and 0.844 (95% CI 0.768-0.920) for overall survival, cancer-specific survival, and recurrence, respectively. The 3 salivary gland nomograms performed well using a contemporary validation dataset, despite limitations related to sample size, follow-up, and differences in clinical and pathology characteristics between the original and validation cohorts. © 2018 Wiley Periodicals, Inc.
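To make the discrimination measure in this record concrete, here is a minimal sketch of a concordance index for right-censored survival data in the style of Harrell's C. It is an illustration under simplifying assumptions (tied event times are skipped, no confidence interval), not the authors' procedure; the name `c_index` and the toy data are hypothetical.

```python
def c_index(times, events, risks):
    """Concordance index for right-censored data.

    times:  follow-up time per patient
    events: 1 if the event was observed, 0 if censored
    risks:  predicted risk (higher means earlier expected event)

    A pair (i, j) is usable when patient i had an observed event before
    patient j's follow-up time. It is concordant when i has the higher risk;
    tied risks count as half.
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] and times[i] < times[j]:
                usable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    return concordant / usable


# Hypothetical cohort of four patients, one of them censored.
print(c_index([2, 4, 6, 8], [1, 0, 1, 1], [0.9, 0.1, 0.5, 0.7]))
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which is the scale on which the reported values of roughly 0.81 to 0.84 should be read.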
Validation of NOViSE.

PubMed

Korzeniowski, Przemyslaw; Brown, Daniel C; Sodergren, Mikael H; Barrow, Alastair; Bello, Fernando

2017-02-01

The goal of this study was to establish face, content, and construct validity of NOViSE, the first force-feedback enabled virtual reality (VR) simulator for natural orifice transluminal endoscopic surgery (NOTES). Fourteen surgeons and surgical trainees performed 3 simulated hybrid transgastric cholecystectomies using a flexible endoscope on NOViSE. Four of them were classified as "NOTES experts" who had independently performed 10 or more simulated or human NOTES procedures. Seven participants were classified as "Novices" and 3 as "Gastroenterologists" with no or minimal NOTES experience. A standardized 5-point Likert-type scale questionnaire was administered to assess the face and content validity. NOViSE showed good overall face and content validity. In 14 out of 15 statements pertaining to face validity (graphical appearance, endoscope and tissue behavior, overall realism), ≥50% of responses were "agree" or "strongly agree." In terms of content validity, 85.7% of participants agreed or strongly agreed that NOViSE is a useful training tool for NOTES and 71.4% that they would recommend it to others. Construct validity was established by comparing a number of performance metrics such as task completion times, path lengths, applied forces, and so on. NOViSE demonstrated early signs of construct validity. Experts were faster and used a shorter endoscopic path length than novices in all but one task. The results indicate that NOViSE authentically recreates a transgastric hybrid cholecystectomy and sets promising foundations for the further development of a VR training curriculum for NOTES without compromising patient safety or requiring expensive animal facilities.
Validity and Reliability of the 8-Item Work Limitations Questionnaire.

PubMed

Walker, Timothy J; Tullar, Jessica M; Diamond, Pamela M; Kohl, Harold W; Amick, Benjamin C

2017-12-01

Purpose To evaluate factorial validity, scale reliability, test-retest reliability, convergent validity, and discriminant validity of the 8-item Work Limitations Questionnaire (WLQ) among employees from a public university system. Methods A secondary analysis using de-identified data from employees who completed an annual Health Assessment between the years 2009-2015 tested research aims. Confirmatory factor analysis (CFA) (n = 10,165) tested the latent structure of the 8-item WLQ. Scale reliability was determined using a CFA-based approach while test-retest reliability was determined using the intraclass correlation coefficient. Convergent/discriminant validity was tested by evaluating relations between the 8-item WLQ with health/performance variables for convergent validity (health-related work performance, number of chronic conditions, and general health) and demographic variables for discriminant validity (gender and institution type). Results A 1-factor model with three correlated residuals demonstrated excellent model fit (CFI = 0.99, TLI = 0.99, RMSEA = 0.03, and SRMR = 0.01). The scale reliability was acceptable (0.69, 95% CI 0.68-0.70) and the test-retest reliability was very good (ICC = 0.78). Low-to-moderate associations were observed between the 8-item WLQ and the health/performance variables while weak associations were observed between the demographic variables. Conclusions The 8-item WLQ demonstrated sufficient reliability and validity among employees from a public university system. Results suggest the 8-item WLQ is a usable alternative for studies when the more comprehensive 25-item WLQ is not available.
Face validity, construct validity and training benefits of a virtual reality TURP simulator.

PubMed

Bright, Elizabeth; Vine, Samuel; Wilson, Mark R; Masters, Rich S W; McGrath, John S

2012-01-01

To assess face validity, construct validity and the training benefits of a virtual reality TURP simulator. 11 novices (no TURP experience) and 7 experts (>200 TURPs) completed a virtual reality median lobe prostate resection task on the TURPsim™ (Simbionix USA Corp., Cleveland, OH). Performance indicators (percentage of prostate resected (PR), percentage of capsular resection (CR) and time diathermy loop active without tissue contact (TAWC)) were recorded via the TURPsim™ and compared between novices and experts to assess construct validity. Verbal comments provided by experts following task completion were used to assess face validity. Repeated attempts of the task by the novices were analysed to assess the training benefits of the TURPsim™. Experts resected a significantly greater percentage of prostate per minute (p < 0.01) and had significantly less active diathermy time without tissue contact (p < 0.01) than novices. After practice, novices were able to perform the simulation more effectively, with significant improvement in all measured parameters. Improvement in performance was noted in novices following repetitive training, as evidenced by improved TAWC scores that were not significantly different from the expert group (p = 0.18). This study has established face and construct validity for the TURPsim™. The potential benefit in using this tool to train novices has also been demonstrated. Copyright © 2012 Surgical Associates Ltd. Published by Elsevier Ltd. All rights reserved.
Progress Towards a Microgravity CFD Validation Study Using the ISS SPHERES-SLOSH Experiment

NASA Technical Reports Server (NTRS)

Storey, Jedediah M.; Kirk, Daniel; Marsell, Brandon (Editor); Schallhorn, Paul (Editor)

2017-01-01

Understanding, predicting, and controlling fluid slosh dynamics is critical to safety and improving performance of space missions when a significant percentage of the spacecraft's mass is a liquid. Computational fluid dynamics simulations can be used to predict the dynamics of slosh, but these programs require extensive validation. Many CFD programs have been validated by slosh experiments using various fluids in earth gravity, but prior to the ISS SPHERES-Slosh experiment, little experimental data for long-duration, zero-gravity slosh existed. This paper presents the current status of an ongoing CFD validation study using the ISS SPHERES-Slosh experimental data.
Progress Towards a Microgravity CFD Validation Study Using the ISS SPHERES-SLOSH Experiment

NASA Technical Reports Server (NTRS)

Storey, Jed; Kirk, Daniel (Editor); Marsell, Brandon (Editor); Schallhorn, Paul (Editor)

2017-01-01

Understanding, predicting, and controlling fluid slosh dynamics is critical to safety and improving performance of space missions when a significant percentage of the spacecraft's mass is a liquid. Computational fluid dynamics simulations can be used to predict the dynamics of slosh, but these programs require extensive validation. Many CFD programs have been validated by slosh experiments using various fluids in earth gravity, but prior to the ISS SPHERES-Slosh experiment, little experimental data for long-duration, zero-gravity slosh existed. This paper presents the current status of an ongoing CFD validation study using the ISS SPHERES-Slosh experimental data.
The relationship between external and internal validity of randomized controlled trials: A sample of hypertension trials from China.

PubMed

Zhang, Xin; Wu, Yuxia; Ren, Pengwei; Liu, Xueting; Kang, Deying

2015-10-30

To explore the relationship between the external validity and the internal validity of hypertension RCTs conducted in China. Comprehensive literature searches were performed in Medline, Embase, Cochrane Central Register of Controlled Trials (CCTR), CBMdisc (Chinese biomedical literature database), CNKI (China National Knowledge Infrastructure/China Academic Journals Full-text Database) and VIP (Chinese scientific journals database), and advanced search strategies were used to locate hypertension RCTs. The risk of bias in RCTs was assessed with a modified scale and the Jadad scale, and studies with grading scores of 3 or more were then included for the evaluation of external validity. A data extraction form including 4 domains and 25 items was used to explore the relationship between external validity and internal validity. Statistical analyses were performed using SPSS software, version 21.0 (SPSS, Chicago, IL). 226 hypertension RCTs were included for final analysis. RCTs conducted in university affiliated hospitals (P < 0.001) or secondary/tertiary hospitals (P < 0.001) scored higher on internal validity. Multi-center studies (median = 4.0, IQR = 2.0) had higher internal validity scores than single-center studies (median = 3.0, IQR = 1.0) (P < 0.001). Funding-supported trials had better methodological quality (P < 0.001). In addition, the reporting of inclusion criteria also leads to better internal validity (P = 0.004). Multivariate regression indicated that sample size, industry funding, quality of life (QOL) taken as a measure and a university affiliated hospital as the trial setting were statistically significant (P < 0.001, P < 0.001, P = 0.001, P = 0.006 respectively). Several components related to the external validity of RCTs are associated with internal validity, although the two do not stand in an easy relationship to each other. 
Given the poor reporting, other possible links between the two variables need to be traced in future methodological research.
The interference of introversion-extraversion and depressive symptomatology with reasoning performance: a behavioural study.

PubMed

Papageorgiou, Charalabos; Rabavilas, Andreas D; Stachtea, Xanthy; Giannakakis, Giorgos A; Kyprianou, Miltiades; Papadimitriou, George N; Stefanis, Costas N

2012-04-01

The objective of this study was to investigate the link between the Eysenck Personality Questionnaire (EPQ) scores and depressive symptomatology with reasoning performance induced by a task including valid and invalid Aristotelian syllogisms. The EPQ and the Zung Depressive Scale (ZDS) were completed by 48 healthy subjects (27 male, 21 female) aged 33.5 ± 9.0 years. Additionally, the subjects engaged in two reasoning tasks (valid vs. invalid syllogisms). Analysis showed that judging invalid syllogisms is a more difficult task than judging valid ones (65.1% vs. 74.6% of correct judgments respectively, p < 0.01). In both conditions, the subjects' degree of confidence is significantly higher when they make a correct judgment than when they make an incorrect judgment (83.8 ± 11.2 vs. 75.3 ± 17.3, p < 0.01). Subjects with extraversion as measured by EPQ and high sexual desire as rated by the relevant ZDS subscale are more prone to make incorrect judgments in the valid syllogisms, while, at the same time, they are more confident in their responses. The effects of extraversion/introversion and sexual desire on the outcome measures of the valid condition are not commutative but additive. These findings indicate that extraversion/introversion and sexual desire variations may have a detrimental effect on reasoning performance.
Risk management in technovigilance: construction and validation of a medical-hospital product evaluation instrument.

PubMed

Kuwabara, Cleuza Catsue Takeda; Evora, Yolanda Dora Martinez; de Oliveira, Márcio Mattos Borges

2010-01-01

With the continuous incorporation of health technologies, hospital risk management should be implemented to systemize the monitoring of adverse effects, performing actions to control and eliminate their damage. As part of these actions, Technovigilance is active in the procedures of acquisition, use and quality control of health products and equipment. This study aimed to construct and validate an instrument to evaluate medical-hospital products. This is a quantitative, exploratory, longitudinal and methodological development study, based on the Six Sigma quality management model, which has as its principle basis the component stages of the DMAIC Cycle. For data collection and content validation, the Delphi technique was used with professionals from the Brazilian Sentinel Hospital Network. It was concluded that the instrument developed permitted the evaluation of the product, differentiating between the results of the tested brands, in line with the initial study goal of qualifying the evaluations performed.
Validity and reliability of a novel measure of activity performance and participation.

PubMed

Murgatroyd, Phil; Karimi, Leila

2016-01-01

To develop and evaluate an innovative clinician-rated measure, which produces global numerical ratings of activity performance and participation. Repeated measures study with 48 community-dwelling participants investigating clinical sensibility, comprehensiveness, practicality, inter-rater reliability, responsiveness, sensitivity and concurrent validity with Barthel Index. Important clinimetric characteristics including comprehensiveness and ease of use were rated >8/10 by clinicians. Inter-rater reliability was excellent on the summary scores (intraclass correlation of 0.95-0.98). There was good evidence that the new outcome measure distinguished between known high and low functional scoring groups, including both responsiveness to change and sensitivity at the same time point in numerous tests. Concurrent validity with the Barthel Index was fair to high (Spearman Rank Order Correlation 0.32-0.85, p > 0.05). The new measure's summary scores were nearly twice as responsive to change compared with the Barthel Index. Other more detailed data could also be generated by the new measure. The Activity Performance Measure is an innovative outcome instrument that showed good clinimetric qualities in this initial study. Some of the results were strong, given the sample size, and further trial and evaluation is appropriate. Implications for Rehabilitation The Activity Performance Measure is an innovative outcome measure covering activity performance and participation. In an initial evaluation, it showed good clinimetric qualities including responsiveness to change, sensitivity, practicality, clinical sensibility, item coverage, inter-rater reliability and concurrent validity with the Barthel Index. Further trial and evaluation is appropriate.
Validation and refinement of mixture volumetric material properties identified in superpave monitoring project II : phase II.

DOT National Transportation Integrated Search

2015-02-01

This study was initiated to validate and refine mixture volumetric material properties identified in the Superpave Monitoring Project II. It has been found that differences in performance are primarily controlled by differences in gradation and r...

Validity of FAA-approved color vision tests for class II and class III aeromedical screening.

DOT National Transportation Integrated Search

1993-09-01

All clinical color vision tests currently used in the medical examination of pilots were studied regarding validity for prediction of performance on practical tests of ability to discriminate the aviation signal colors, red, green, and white given un...
Development of a new instrument for determining the level of chewing function in children.

PubMed

Serel Arslan, S; Demir, N; Barak Dolgun, A; Karaduman, A A

2016-07-01

This study aimed to develop a chewing performance scale that classifies chewing from normal to severely impaired and to investigate its validity and reliability. The study included the developmental phase and reported the content, structural, criterion validity, interobserver and intra-observer reliability of the chewing performance scale, which was called the Karaduman Chewing Performance Scale (KCPS). A dysphagia literature review, other questionnaires and clinical experiences were used in the developmental phase. Seven experts assessed the steps for content validity over two Delphi rounds. To test structural, criterion validity, interobserver and intra-observer reliability, two swallowing therapists evaluated chewing videos of 144 children (Group I: 61 healthy children without chewing disorders, mean age of 42·38 ± 9·36 months; Group II: 83 children with cerebral palsy who have chewing disorders, mean age of 39·09 ± 22·95 months) using KCPS. The Behavioral Pediatrics Feeding Assessment Scale (BPFAS) was used for criterion validity. The KCPS steps arranged between 0-4 were found to be necessary. The content validity index was 0·885. The KCPS levels were found to be different between groups I and II (χ(2) = 123·286, P < 0·001). A moderately strong positive correlation was found between the KCPS and the subscales of the BPFAS (r = 0·444-0·773, P < 0·001). An excellent positive correlation was detected between two swallowing therapists and between two examinations of one swallowing therapist (r = 0·962, P < 0·001; r = 0·990, P < 0·001, respectively). The KCPS is a valid, reliable, quick and clinically easy-to-use functional instrument for determining the level of chewing function in children. © 2016 John Wiley & Sons Ltd.
A validated UPLC-MS/MS method for flibanserin in plasma and its pharmacokinetic interaction with bosentan in rats.

PubMed

Iqbal, Muzaffar; Ezzeldin, Essam; Rezk, Naser L; Bajrai, Amal A; Al-Rashood, Khalid A

2018-04-25

The purpose of this study was the development, validation and application of an ultra-performance liquid chromatography (UPLC)-ESI-MS/MS method for quantitation of flibanserin in plasma samples. After extraction of analyte from plasma by diethyl ether, separation was performed on a UPLC C18 column using a mobile phase composition of 10 mM ammonium formate-acetonitrile (30:70, v/v) with isocratic elution at 0.3 ml/min. The multiple reaction monitoring transitions of m/z 391.13→ 161.04 and 384.20→ 253.06 were used for detection of analyte and internal standard (quetiapine), respectively. The calibration curves were linear (r ≥0.995) between 0.22 and 555 ng/ml concentration and all validation results were within the acceptable range as per US FDA guidelines. The assay procedure was fully validated and successfully applied in a pharmacokinetic interaction study of flibanserin with bosentan in rats.
Assessment of communication, professionalism, and surgical skills in an objective structured performance-related examination (OSPRE): a psychometric study.

PubMed

Ponton-Carss, Alicia; Hutchison, Carol; Violato, Claudio

2011-10-01

The purpose of this study was to investigate the reliability and validity of a performance assessment of communication, professionalism, and surgical skills competencies for surgery residents. Fourteen residents from the general surgery program of the University of Calgary were assessed in 7 surgical simulation stations that included communication and professionalism skills. The internal consistency reliability of the checklists and global rating scales combined was adequate for communication (α = .75-.92) and surgical skills (α = .86-.96), but not for professionalism (α = 0). There was evidence of validity as surgical skills performance improved as a function of postgraduate year level but not for the professionalism checklist. Surgical skills and communication correlated in the 2 stations assessed (r = .55 and .57; P < .05). There is evidence for both reliability and validity for simultaneously assessing surgical skills and communication skills. Further instrument development is required to assess professionalism in a structured examination context. Copyright © 2011 Elsevier Inc. All rights reserved.
Exploring discrepancies between quantitative validation results and the geomorphic plausibility of statistical landslide susceptibility maps

NASA Astrophysics Data System (ADS)

Steger, Stefan; Brenning, Alexander; Bell, Rainer; Petschko, Helene; Glade, Thomas

2016-06-01

Empirical models are frequently applied to produce landslide susceptibility maps for large areas. Subsequent quantitative validation results are routinely used as the primary criteria to infer the validity and applicability of the final maps or to select one of several models. This study hypothesizes that such direct deductions can be misleading. The main objective was to explore discrepancies between the predictive performance of a landslide susceptibility model and the geomorphic plausibility of subsequent landslide susceptibility maps while a particular emphasis was placed on the influence of incomplete landslide inventories on modelling and validation results. The study was conducted within the Flysch Zone of Lower Austria (1,354 km2) which is known to be highly susceptible to landslides of the slide-type movement. Sixteen susceptibility models were generated by applying two statistical classifiers (logistic regression and generalized additive model) and two machine learning techniques (random forest and support vector machine) separately for two landslide inventories of differing completeness and two predictor sets. The results were validated quantitatively by estimating the area under the receiver operating characteristic curve (AUROC) with single holdout and spatial cross-validation technique. The heuristic evaluation of the geomorphic plausibility of the final results was supported by findings of an exploratory data analysis, an estimation of odds ratios and an evaluation of the spatial structure of the final maps. The results showed that maps generated by different inventories, classifiers and predictors appeared differently while holdout validation revealed similar high predictive performances. Spatial cross-validation proved useful to expose spatially varying inconsistencies of the modelling results while additionally providing evidence for slightly overfitted machine learning-based models. 
However, the highest predictive performances were obtained for maps that explicitly expressed geomorphically implausible relationships indicating that the predictive performance of a model might be misleading in the case a predictor systematically relates to a spatially consistent bias of the inventory. Furthermore, we observed that random forest-based maps displayed spatial artifacts. The most plausible susceptibility map of the study area showed smooth prediction surfaces while the underlying model revealed a high predictive capability and was generated with an accurate landslide inventory and predictors that did not directly describe a bias. However, none of the presented models was found to be completely unbiased. This study showed that high predictive performances cannot be equated with a high plausibility and applicability of subsequent landslide susceptibility maps. We suggest that greater emphasis should be placed on identifying confounding factors and biases in landslide inventories. A joint discussion between modelers and decision makers of the spatial pattern of the final susceptibility maps in the field might increase their acceptance and applicability.
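As a minimal sketch of the quantitative validation described above, the Python below computes AUROC as the probability that a randomly chosen landslide cell scores higher than a randomly chosen non-landslide cell, with ties counted as one half. It also shows one simple way to group observations into spatial blocks so that whole blocks can be held out together. The square-grid blocking, the function names and the example values are assumptions for illustration; the abstract does not describe the study's own partitioning scheme.

```python
def auroc(scores, labels):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a correct pair.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def spatial_block_ids(coords, block_size):
    """Map each (x, y) point to a square grid block; blocks can serve as CV folds."""
    return [(int(x // block_size), int(y // block_size)) for x, y in coords]


# Hypothetical susceptibility scores and labels (illustrative only).
example_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
example_labels = [1, 1, 0, 1, 0, 0]
example_auc = auroc(example_scores, example_labels)

# Nearby points fall into the same block and are therefore held out together.
example_blocks = spatial_block_ids([(10, 20), (150, 20), (160, 30)], 100)
```

Holding out whole blocks keeps spatially autocorrelated neighbours out of the training data, which is why spatial cross-validation tends to give a more conservative estimate than a random holdout.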
The Tube 3 module designed for practicing vesicourethral anastomosis in a virtual reality robotic simulator: determination of face, content, and construct validity.

PubMed

Kang, Sung Gu; Cho, Seok; Kang, Seok Ho; Haidar, Abdul Muhsin; Samavedi, Srinivas; Palmer, Kenneth J; Patel, Vipul R; Cheon, Jun

2014-08-01

To better use virtual reality robotic simulators and offer surgeons more practical exercises, we developed the Tube 3 module for practicing vesicourethral anastomosis (VUA), one of the most complex steps in the robot-assisted radical prostatectomy procedure. Herein, we describe the principle of the Tube 3 module and evaluate its face, content, and construct validity. Residents and attending surgeons participated in a prospective study approved by the institutional review board. We divided subjects into 2 groups, those with experience and novices. Each subject performed a simulated VUA using the Tube 3 module. A built-in scoring algorithm recorded the data from each performance. After completing the Tube 3 module exercise, each subject answered a questionnaire to provide data to be used for face and content validation. The novice group consisted of 10 residents. The experienced subjects (n = 10) had each previously performed at least 10 robotic surgeries. The experienced group outperformed the novice group in most variables, including task time, total score, total economy of motion, and number of instrument collisions (P <.05). Additionally, 80% of the experienced surgeons agreed that the module reflects the technical skills required to perform VUA and would be a useful training tool. We describe the Tube 3 module for practicing VUA, which showed excellent face, content, and construct validity. The task needs to be refined in the future to reflect VUA under real operating conditions, and concurrent and predictive validity studies are currently underway. Copyright © 2014 Elsevier Inc. All rights reserved.
MetaKTSP: a meta-analytic top scoring pair method for robust cross-study validation of omics prediction analysis.

PubMed

Kim, SungHwan; Lin, Chien-Wei; Tseng, George C

2016-07-01

Supervised machine learning is widely applied to transcriptomic data to predict disease diagnosis, prognosis or survival. Robust and interpretable classifiers with high accuracy are usually favored for their clinical and translational potential. The top scoring pair (TSP) algorithm is an example that applies a simple rank-based algorithm to identify rank-altered gene pairs for classifier construction. Although many classification methods perform well in cross-validation of single expression profile, the performance usually greatly reduces in cross-study validation (i.e. the prediction model is established in the training study and applied to an independent test study) for all machine learning methods, including TSP. The failure of cross-study validation has largely diminished the potential translational and clinical values of the models. The purpose of this article is to develop a meta-analytic top scoring pair (MetaKTSP) framework that combines multiple transcriptomic studies and generates a robust prediction model applicable to independent test studies. We proposed two frameworks, by averaging TSP scores or by combining P-values from individual studies, to select the top gene pairs for model construction. We applied the proposed methods in simulated data sets and three large-scale real applications in breast cancer, idiopathic pulmonary fibrosis and pan-cancer methylation. The result showed superior performance of cross-study validation accuracy and biomarker selection for the new meta-analytic framework. In conclusion, combining multiple omics data sets in the public domain increases robustness and accuracy of the classification model that will ultimately improve disease understanding and clinical treatment decisions to benefit patients. An R package MetaKTSP is available online (http://tsenglab.biostat.pitt.edu/software.htm). Contact: ctseng@pitt.edu. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. 
All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Validation of thermal effects of LED package by using Elmer finite element simulation method

NASA Astrophysics Data System (ADS)

Leng, Lai Siang; Retnasamy, Vithyacharan; Mohamad Shahimin, Mukhzeer; Sauli, Zaliman; Taniselass, Steven; Bin Ab Aziz, Muhamad Hafiz; Vairavan, Rajendaran; Kirtsaeng, Supap

2017-02-01

The overall performance of a light-emitting diode (LED) package is critically affected by heat. In this study, the open source software Elmer FEM was used for thermal analysis of the LED package. To perform a complete simulation study, Salome and ParaView were used as the pre- and postprocessor, respectively. The thermal effect of the LED package was evaluated with this software. The result was validated against commercially licensed software based on previous work. The percentage difference between the two simulation results is less than 5%, which is tolerable and comparable.
Validation studies and proficiency testing.

PubMed

Anklam, Elke; Heinze, Petra; Kay, Simon; Van den Eede, Guy; Popping, Bert

2002-01-01

Genetically modified organisms (GMOs) entered the European food market in 1996. Current legislation demands the labeling of food products if they contain >1% GMO, as assessed for each ingredient of the product. To create confidence in the testing methods and to complement enforcement requirements, there is an urgent need for internationally validated methods, which could serve as reference methods. To date, several methods have been submitted to validation trials at an international level; approaches now exist that can be used in different circumstances and for different food matrixes. Moreover, the requirement for the formal validation of methods is clearly accepted; several national and international bodies are active in organizing studies. Further validation studies, especially on the quantitative polymerase chain reaction methods, need to be performed to cover the rising demand for new extraction methods and other background matrixes, as well as for novel GMO constructs.
Virtual temporal bone dissection system: OSU virtual temporal bone system: development and testing.

PubMed

Wiet, Gregory J; Stredney, Don; Kerwin, Thomas; Hittle, Bradley; Fernandez, Soledad A; Abdel-Rasoul, Mahmoud; Welling, D Bradley

2012-03-01

The objective of this project was to develop a virtual temporal bone dissection system that would provide an enhanced educational experience for the training of otologic surgeons. A randomized, controlled, multi-institutional, single-blinded validation study. The project encompassed four areas of emphasis: structural data acquisition, integration of the system, dissemination of the system, and validation. Structural acquisition was performed on multiple imaging platforms. Integration achieved a cost-effective system. Dissemination was achieved on different levels including casual interest, downloading of software, and full involvement in development and validation studies. A validation study was performed at eight different training institutions across the country using a two-arm randomized trial where study subjects were randomized to a 2-week practice session using either the virtual temporal bone or standard cadaveric temporal bones. Eighty subjects were enrolled and randomized to one of the two treatment arms; 65 completed the study. There was no difference between the two groups using a blinded rating tool to assess performance after training. A virtual temporal bone dissection system has been developed and compared to cadaveric temporal bones for practice using a multicenter trial. There was no statistical difference between practice on the current simulator compared to practice on human cadaveric temporal bones. Further refinements in structural acquisition and interface design have been identified, which can be implemented prior to full incorporation into training programs and used for objective skills assessment. Copyright © 2012 The American Laryngological, Rhinological, and Otological Society, Inc.
Prognostic indices for early mortality in ischaemic stroke - meta-analysis.

PubMed

Mattishent, K; Kwok, C S; Mahtani, A; Pelpola, K; Myint, P K; Loke, Y K

2016-01-01

Several models have been developed to predict mortality in ischaemic stroke. We aimed to evaluate systematically the performance of published stroke prognostic scores. We searched MEDLINE and EMBASE in February 2014 for prognostic models (published between 2003 and 2014) used in predicting early mortality (<6 months) after ischaemic stroke. We evaluated discriminant ability of the tools through meta-analysis of the area under the receiver operating characteristic curve (AUROC) or c-statistic. We evaluated the following components of study validity: collection of prognostic variables, neuroimaging, treatment pathways and missing data. We identified 18 articles (involving 163 240 patients) reporting on the performance of prognostic models for mortality in ischaemic stroke, with 15 articles providing AUC for meta-analysis. Most studies were either retrospective, or post hoc analyses of prospectively collected data; all but three reported validation data. The iSCORE had the largest number of validation cohorts (five) within our systematic review and showed good performance in four different countries, pooled AUC 0.84 (95% CI 0.82-0.87). We identified other potentially useful prognostic tools that have yet to be as extensively validated as iSCORE - these include SOAR (2 studies, pooled AUC 0.79, 95% CI 0.78-0.80), GWTG (2 studies, pooled AUC 0.72, 95% CI 0.72-0.72) and PLAN (1 study, pooled AUC 0.85, 95% CI 0.84-0.87). Our meta-analysis has identified and summarized the performance of several prognostic scores with modest to good predictive accuracy for early mortality in ischaemic stroke, with the iSCORE having the broadest evidence base. © 2015 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Comparative assessment of three standardized robotic surgery training methods.

PubMed

Hung, Andrew J; Jayaratna, Isuru S; Teruya, Kara; Desai, Mihir M; Gill, Inderbir S; Goh, Alvin C

2013-10-01

To evaluate three standardized robotic surgery training methods, inanimate, virtual reality and in vivo, for their construct validity. To explore the concept of cross-method validity, where the relative performance of each method is compared. Robotic surgical skills were prospectively assessed in 49 participating surgeons who were classified as follows: 'novice/trainee': urology residents, previous experience <30 cases (n = 38) and 'experts': faculty surgeons, previous experience ≥30 cases (n = 11). Three standardized, validated training methods were used: (i) structured inanimate tasks; (ii) virtual reality exercises on the da Vinci Skills Simulator (Intuitive Surgical, Sunnyvale, CA, USA); and (iii) a standardized robotic surgical task in a live porcine model with performance graded by the Global Evaluative Assessment of Robotic Skills (GEARS) tool. A Kruskal-Wallis test was used to evaluate performance differences between novices and experts (construct validity). Spearman's correlation coefficient (ρ) was used to measure the association of performance across inanimate, simulation and in vivo methods (cross-method validity). Novice and expert surgeons had previously performed a median (range) of 0 (0-20) and 300 (30-2000) robotic cases, respectively (P < 0.001). Construct validity: experts consistently outperformed residents with all three methods (P < 0.001). Cross-method validity: overall performance of inanimate tasks significantly correlated with virtual reality robotic performance (ρ = -0.7, P < 0.001) and in vivo robotic performance based on GEARS (ρ = -0.8, P < 0.0001). Virtual reality performance and in vivo tissue performance were also found to be strongly correlated (ρ = 0.6, P < 0.001). We propose the novel concept of cross-method validity, which may provide a method of evaluating the relative value of various forms of skills education and assessment. We externally confirmed the construct validity of each featured training tool. 
© 2013 BJU International.
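The cross-method validity analysis above rests on Spearman's rank correlation. A minimal sketch in Python follows: both variables are converted to ranks (tied values share the mean of their positions) and the Pearson correlation of the ranks is returned. The function names and the example numbers are hypothetical and are not taken from the study.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical data: task completion times in seconds (lower is better)
# and simulator scores (higher is better) for five surgeons.
task_times = [120, 95, 150, 80, 110]
simulator_scores = [60, 75, 40, 70, 90]
rho = spearman_rho(task_times, simulator_scores)
```

In this illustration ρ is negative because one measure is scored so that lower is better and the other so that higher is better. Whether the same reasoning accounts for the negative coefficients reported above is not stated in the abstract.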
Individual safety performance in the construction industry: development and validation of two short scales.

PubMed

DeArmond, Sarah; Smith, April E; Wilson, Christina L; Chen, Peter Y; Cigularov, Konstantin P

2011-05-01

In the current research a short measure of safety performance is developed for use in the construction industry and the relationships between different components of safety performance and safety outcomes (e.g., occupational injuries and work-related pain) are explored within the construction context. This research consists of two field studies. In the first, comprehensive measures of safety compliance and safety participation were shortened and modified to be appropriate for use in construction. Evidence of reliability and validity is provided. Both safety compliance and safety participation were negatively related to occupational injuries, yet these two correlations were not statistically different. In the second study, we investigated the relationships between these two components of safety performance and work-related pain frequency, in addition to replicating Study 1. Safety compliance had a stronger negative relationship with pain than safety participation. Implications for research are discussed. Copyright © 2010 Elsevier Ltd. All rights reserved.
Aircraft Wake Vortex Spacing System (AVOSS) Performance Update and Validation Study

NASA Technical Reports Server (NTRS)

Rutishauser, David K.; OConnor, Cornelius J.

2001-01-01

An analysis has been performed on data generated from the two most recent field deployments of the Aircraft Wake VOrtex Spacing System (AVOSS). The AVOSS provides reduced aircraft spacing criteria for wake vortex avoidance as compared to the FAA spacing applied under Instrument Flight Rules (IFR). Several field deployments culminating in a system demonstration at Dallas Fort Worth (DFW) International Airport in the summer of 2000 were successful in showing a sound operational concept and the system's potential to provide a significant benefit to airport operations. For DFW, a predicted average throughput increase of 6% was observed. This increase implies 6 or 7 more aircraft on the ground in a one-hour period for DFW operations. Several studies of performance correlations to system configuration options, design options, and system inputs are also reported. The studies focus on the validation performance of the system.
Validation of the da Vinci Surgical Skill Simulator across three surgical disciplines: A pilot study

PubMed Central

Alzahrani, Tarek; Haddad, Richard; Alkhayal, Abdullah; Delisle, Josée; Drudi, Laura; Gotlieb, Walter; Fraser, Shannon; Bergman, Simon; Bladou, Frank; Andonian, Sero; Anidjar, Maurice

2013-01-01

Objective: In this paper, we evaluate face, content and construct validity of the da Vinci Surgical Skills Simulator (dVSSS) across 3 surgical disciplines. Methods: In total, 48 participants from urology, gynecology and general surgery participated in the study as novices (0 robotic cases performed), intermediates (1–74) or experts (≥75). Each participant completed 9 tasks (Peg board level 2, match board level 2, needle targeting, ring and rail level 2, dots and needles level 1, suture sponge level 2, energy dissection level 1, ring walk level 3 and tubes). The Mimic Technologies software scored each task from 0 (worst) to 100 (best) using several predetermined metrics. Face and content validity were evaluated by a questionnaire administered after task completion. The Wilcoxon test was used to perform pairwise comparisons. Results: The expert group comprised 6 attending surgeons. The intermediate group included 4 attending surgeons, 3 fellows and 5 residents. The novices included 1 attending surgeon, 1 fellow, 13 residents, 13 medical students and 2 research assistants. The median numbers of robotic cases performed by experts and intermediates were 250 and 9, respectively. The median overall realistic score (face validity) was 8/10. Experts rated the usefulness of the simulator as a training tool for residents (content validity) as 8.5/10. For construct validity, experts outperformed novices in all 9 tasks (p < 0.05). Intermediates outperformed novices in 7 of 9 tasks (p < 0.05); there were no significant differences in the energy dissection and ring walk tasks. Finally, experts scored significantly better than intermediates in only 3 of 9 tasks (matchboard, dots and needles and energy dissection) (p < 0.05). Conclusions: This study confirms the face, content and construct validities of the dVSSS across urology, gynecology and general surgery. Larger sample size and more complex tasks are needed to further differentiate intermediates from experts. PMID:23914275
Validity, Reliability, and Performance Determinants of a New Job-Specific Anaerobic Work Capacity Test for the Norwegian Navy Special Operations Command.

PubMed

Angeltveit, Andreas; Paulsen, Gøran; Solberg, Paul A; Raastad, Truls

2016-02-01

Operators in Special Operation Forces (SOF) have a particularly demanding profession where physical and psychological capacities can be challenged to the extremes. The diversity of physical capacities needed depends on the mission. Consequently, tests used to monitor SOF operators' physical fitness should cover a broad range of physical capacities. Whereas tests for strength and aerobic endurance are established, there is no test for specific anaerobic work capacity described in the literature. The purpose of this study was therefore to evaluate the reliability and validity, and to identify the performance determinants, of a new test developed for testing specific anaerobic work capacity in SOF operators. Nineteen active young students were included in the concurrent validity part of the study. The students performed the evacuation (EVAC) test 3 times and the results were compared for reliability and with performance in the Wingate cycle test, 300-m sprint, and a maximal accumulated oxygen deficit (MAOD) test. In part II of the study, 21 Norwegian Navy Special Operations Command operators conducted the EVAC test, anthropometric measurements, a dual x-ray absorptiometry scan, leg press, isokinetic knee extensions, maximal oxygen uptake test, and countermovement jump (CMJ) test. The EVAC test showed good reliability after 1 familiarization trial (intraclass correlation = 0.89; coefficient of variation = 3.7%). The EVAC test correlated well with the Wingate test (r = -0.68), 300-m sprint time (r = 0.51), and 300-m mean power (W) (r = -0.67). No significant correlation was found with the MAOD test. In part II of the study, height, body mass, lean body mass, isokinetic knee extension torque, maximal oxygen uptake, and maximal power in a CMJ were significantly correlated with performance in the EVAC test. 
The EVAC test is a reliable and valid test for anaerobic work capacity for SOF operators, and muscle mass, leg strength, and leg power seem to be the most important determinants of performance.
Evidencing the association between swimming capacities and performance indicators in water polo: a multiple regression study.

PubMed

Kontic, Dean; Zenic, Natasa; Uljevic, Ognjen; Sekulic, Damir; Lesnik, Blaz

2017-06-01

Swimming capacities are hypothesized to be important determinants of water polo performance but there is an evident lack of studies examining different swimming capacities in relation to specific offensive and defensive performance variables in this sport. The aim of this study was to determine the relationship between five swimming capacities and six performance determinants in water polo. The sample comprised 79 high-level youth water polo players (all males, 17-18 years of age). The variables included six performance-related variables (agility in offence and defense, efficacy in offence and defense, polyvalence in offence and defense), and five swimming-capacity tests (water polo sprint test [15 m], swimming sprint test [25 m], short-distance [100 m], aerobic endurance [400 m] and an anaerobic lactate endurance test [4× 50 m]). First, multiple regressions were calculated for one-half of the sample of subjects which were then validated with the remaining half of the sample. The 25-m swim was not included in the regression analyses due to the multicollinearity with other predictors. The originally calculated regression models were validated for defensive agility (R=0.67 and R=0.55 for the original regression calculation and validation subsample, respectively) offensive agility (R=0.59 and R=0.61), and offensive efficacy (R=0.64 and R=0.58). Anaerobic lactate endurance is a significant predictor of offensive and defensive agility, while 15 m sprint significantly contributes to offensive efficacy. Swimming capacities are not found to be related to the polyvalence of the players. The most superior offensive performance can be expected from those players with a high level of anaerobic lactate endurance and advanced sprinting capacity, while anaerobic lactate endurance is recognized as most important quality in defensive duties. Future studies should observe players' polyvalence in relation to (theoretical) knowledge of technical and tactical tasks. 
Results reinforce the need for the cross-validation of the prediction-models in sport and exercise sciences.
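The split-sample procedure described above (fit the regression on one half of the sample, then check it on the other half) can be sketched as follows. This is a minimal illustration under stated assumptions: ordinary least squares solved through the normal equations, a random split, and the correlation between predicted and observed values used as R in both halves. The function names and the synthetic data are hypothetical; the abstract does not specify how the authors computed the validation R.

```python
import random


def fit_ols(X, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]


def predict(coefs, X):
    """Apply fitted coefficients (intercept first) to each row of X."""
    return [coefs[0] + sum(c * v for c, v in zip(coefs[1:], r)) for r in X]


def pearson_r(a, b):
    """Pearson correlation between two equally long sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    ss_a = sum((u - ma) ** 2 for u in a)
    ss_b = sum((v - mb) ** 2 for v in b)
    return cov / (ss_a * ss_b) ** 0.5


def split_half_validation(X, y, seed=0):
    """Fit on a random half; return (R in fitting half, R in validation half)."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    half = len(idx) // 2
    fit_idx, val_idx = idx[:half], idx[half:]
    coefs = fit_ols([X[i] for i in fit_idx], [y[i] for i in fit_idx])
    r_fit = pearson_r(predict(coefs, [X[i] for i in fit_idx]),
                      [y[i] for i in fit_idx])
    r_val = pearson_r(predict(coefs, [X[i] for i in val_idx]),
                      [y[i] for i in val_idx])
    return r_fit, r_val


# Synthetic data (illustrative only): two predictors plus noise.
rng = random.Random(1)
X_demo = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(80)]
y_demo = [2.0 * a - 1.0 * b + rng.gauss(0, 1) for a, b in X_demo]
r_fit, r_val = split_half_validation(X_demo, y_demo)
```

A validation R close to the fitting R, as reported for the three models above, suggests the regression is not merely fitted to noise in the first half of the sample.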
The methodological quality of three foundational law enforcement Drug Influence Evaluation validation studies.

PubMed

Kane, Greg

2013-11-04

A Drug Influence Evaluation (DIE) is a formal assessment of an impaired driving suspect, performed by a trained law enforcement officer who uses circumstantial facts, questioning, searching, and a physical exam to form an unstandardized opinion as to whether a suspect's driving was impaired by drugs. This paper first identifies the scientific studies commonly cited in American criminal trials as evidence of DIE accuracy, and second, uses the QUADAS tool to investigate whether the methodologies used by these studies allow them to correctly quantify the diagnostic accuracy of the DIEs currently administered by US law enforcement. Three studies were selected for analysis. For each study, the QUADAS tool identified biases that distorted reported accuracies. The studies were subject to spectrum bias, selection bias, misclassification bias, verification bias, differential verification bias, incorporation bias, and review bias. The studies quantified DIE performance with prevalence-dependent accuracy statistics that are internally but not externally valid. The accuracies reported by these studies do not quantify the accuracy of the DIE process now used by US law enforcement. These studies do not validate current DIE practice.
Performance Validity Testing in Neuropsychology: Scientific Basis and Clinical Application-A Brief Review.

PubMed

Greher, Michael R; Wodushek, Thomas R

2017-03-01

Performance validity testing refers to neuropsychologists' methodology for determining whether neuropsychological test performances completed in the course of an evaluation are valid (ie, the results of true neurocognitive function) or invalid (ie, overly impacted by the patient's effort/engagement in testing). This determination relies upon the use of either standalone tests designed for this sole purpose, or specific scores/indicators embedded within traditional neuropsychological measures that have demonstrated this utility. In response to a greater appreciation for the critical role that performance validity issues play in neuropsychological testing and the need to measure this variable to the best of our ability, the scientific base for performance validity testing has expanded greatly over the last 20 to 30 years. As such, the majority of current day neuropsychologists in the United States use a variety of measures for the purpose of performance validity testing as part of everyday forensic and clinical practice and address this issue directly in their evaluations. The following is the first article of a 2-part series that will address the evolution of performance validity testing in the field of neuropsychology, both in terms of the science as well as the clinical application of this measurement technique. The second article of this series will review performance validity tests in terms of methods for development of these measures, and maximizing of diagnostic accuracy.
Development of the Academic Performance Perception Scale

ERIC Educational Resources Information Center

Gur, Recep

2017-01-01

Purpose: While numerous studies about academic performance have focused on only one factor, studies aiming to measure academicians' perceptions across many factors have not been observed in the literature. The current study aims to fill this gap and become a resource for upcoming studies. The aim of this study is to develop a valid and reliable…

Incremental Validity of Useful Field of View Subtests for the Prediction of Instrumental Activities of Daily Living

PubMed Central

Aust, Frederik; Edwards, Jerri D.

2015-01-01

Introduction: The Useful Field of View Test (UFOV®) is a cognitive measure that predicts older adults’ ability to perform a range of everyday activities. However, little is known about the individual contribution of each subtest to these predictions and the underlying constructs of UFOV performance remain a topic of debate. Method: We investigated the incremental validity of UFOV subtests for the prediction of Instrumental Activities of Daily Living (IADL) performance in two independent datasets, the SKILL (n = 828) and ACTIVE (n = 2426) studies. We then explored the cognitive and visual abilities assessed by UFOV using a range of neuropsychological and vision tests administered in the SKILL study. Results: In the four-subtest variant of UFOV, only subtests 2 and 3 consistently made independent contributions to the prediction of IADL performance across three different behavioral measures. In all cases, the incremental validity of UFOV subtests 1 and 4 was negligible. Furthermore, we found that UFOV was related to processing speed, general non-speeded cognition, and visual function; the omission of subtests 1 and 4 from the test score did not affect these associations. Conclusions: UFOV subtests 1 and 4 appear to be of limited use to predict IADL and possibly other everyday activities. Future experimental research should investigate if shortening the UFOV by omitting these subtests is a reliable and valid assessment approach. PMID:26782018
Epidemiology of bruxism in adults: a systematic review of the literature.

PubMed

Manfredini, Daniele; Winocur, Ephraim; Guarda-Nardini, Luca; Paesani, Daniel; Lobbezoo, Frank

2013-01-01

To perform a systematic review of the literature dealing with the prevalence of bruxism in adult populations. A systematic search of the medical literature was performed to identify all peer-reviewed English-language papers dealing with the prevalence assessment of either awake or sleep bruxism at the general population level by the adoption of questionnaires, clinical assessments, and polysomnographic (PSG) or electromyographic (EMG) recordings. Quality assessment of the reviewed papers was performed according to the Methodological evaluation of Observational REsearch (MORE) checklist, which enables the identification of flaws in the external and internal validity. Cut-off criteria for an acceptable external validity were established to select studies for the discussion of prevalence data. For each included study, the sample features, diagnostic strategy, and prevalence of bruxism in relation to age, sex, and circadian rhythm, if available, were recorded. Thirty-five publications were included in the review. Several methodological problems limited the external validity of findings in most studies, and prevalence data extraction was performed only on seven papers. Of those, only one paper had a flawless external validity, whilst internal validity was low in all the selected papers due to their self-reported bruxism diagnosis alone, mainly based on only one or two questionnaire items. No epidemiologic data were available from studies adopting other diagnostic strategies (eg, PSG, EMG). Generically identified "bruxism" was assessed in two studies reporting an 8% to 31.4% prevalence, awake bruxism was investigated in two studies describing a 22.1% to 31% prevalence, and prevalence of sleep bruxism was found to be more consistent across the three studies investigating the report of "frequent" bruxism (12.8% ± 3.1%). Bruxism activities were found to be unrelated to sex, and a decrease with age was described in elderly people. 
The present systematic review described variable prevalence data for bruxism activities. Findings must be interpreted with caution due to the poor methodological quality of the reviewed literature and to potential diagnostic bias related with having to rely on an individual's self-report of bruxism.
Reliability and validity of a Turkish version of the Global Pelvic Floor Bother Questionnaire.

PubMed

Doğan, Hanife; Özengin, Nuriye; Bakar, Yeşim; Duran, Bülent

2016-10-01

The aim of this study was to translate the Global Pelvic Floor Bother Questionnaire (GPFBQ) into Turkish and to assess its validity and reliability. The Turkish adaptation of the GPFBQ was created by following the stages of the intercultural adaptation process. A test-retest interval of 1 week was used to assess the reliability, which was examined by the intraclass correlation coefficient. The validity of the GPFBQ was assessed and compared with the Pelvic Floor Distress Inventory-20 (PFDI-20) and the Pelvic Floor Impact Questionnaire-7 (PFIQ-7) using Spearman's rank correlation coefficients. For construct validity, confirmatory factor analysis was performed. A total of 131 women, whose mean age was 46.83 years, were included in the study. The test-retest reliability of the GPFBQ was excellent (0.998, p < 0.0001). The GPFBQ correlated significantly with the PFDI-20 (r = 0.860, p = 0.00) and PFIQ-7 (r = 0.802, p = 0.00). Confirmatory factor analysis was performed to determine construct validity, and it was found that it had four dimensions. The Turkish version of the GPFBQ is a valid and reliable tool for assessing the symptoms of bother and severity in Turkish-speaking women with pelvic floor dysfunction.
An investigation of new toxicity test method performance in validation studies: 1. Toxicity test methods that have predictive capacity no greater than chance.

PubMed

Bruner, L H; Carr, G J; Harbell, J W; Curren, R D

2002-06-01

An approach commonly used to measure new toxicity test method (NTM) performance in validation studies is to divide toxicity results into positive and negative classifications, and then identify true positive (TP), true negative (TN), false positive (FP) and false negative (FN) results. After this step is completed, the contingent probability statistics (CPS), sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) are calculated. Although these statistics are widely used and often the only statistics used to assess the performance of toxicity test methods, there is little specific guidance in the validation literature on what values for these statistics indicate adequate performance. The purpose of this study was to begin developing data-based answers to this question by characterizing the CPS obtained from an NTM whose data have a completely random association with a reference test method (RTM). Determining the CPS of this worst-case scenario is useful because it provides a lower baseline from which the performance of an NTM can be judged in future validation studies. It also provides an indication of relationships in the CPS that help identify random or near-random relationships in the data. The results from this study of randomly associated tests show that the values obtained for the statistics vary significantly depending on the cut-offs chosen, that high values can be obtained for individual statistics, and that the different measures cannot be considered independently when evaluating the performance of an NTM. When the association between results of an NTM and RTM is random the sum of the complementary pairs of statistics (sensitivity + specificity, NPV + PPV) is approximately 1, and the prevalence (i.e., the proportion of toxic chemicals in the population of chemicals) and PPV are equal.
Given that combinations of high sensitivity-low specificity or low sensitivity-high specificity (i.e., the sum of the sensitivity and specificity equal to approximately 1) indicate lack of predictive capacity, an NTM having these performance characteristics should be considered no better for predicting toxicity than chance alone.
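The relationships described in this abstract can be illustrated with a small simulation. The following is a minimal sketch, not the authors' method: it assumes a hypothetical set of chemicals in which the reference test (RTM) and the new test (NTM) produce positive calls independently of each other. The function names, the prevalence of 0.3, and the NTM positive rate of 0.8 are illustrative assumptions.

```python
import random


def contingency_stats(ntm, rtm):
    """Compute the contingent probability statistics for paired boolean calls.

    ntm: positive/negative calls from the new test method
    rtm: positive/negative calls from the reference test method
    """
    tp = sum(1 for n, r in zip(ntm, rtm) if n and r)
    tn = sum(1 for n, r in zip(ntm, rtm) if not n and not r)
    fp = sum(1 for n, r in zip(ntm, rtm) if n and not r)
    fn = sum(1 for n, r in zip(ntm, rtm) if not n and r)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "prevalence": (tp + fn) / len(rtm),
    }


def simulate_random_ntm(n_chemicals=100_000, prevalence=0.3,
                        positive_rate=0.8, seed=0):
    """Simulate an NTM whose calls are statistically independent of the RTM.

    All parameter values are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    rtm = [rng.random() < prevalence for _ in range(n_chemicals)]
    ntm = [rng.random() < positive_rate for _ in range(n_chemicals)]
    return contingency_stats(ntm, rtm)


if __name__ == "__main__":
    for name, value in simulate_random_ntm().items():
        print(f"{name}: {value:.3f}")
```

Under these assumptions the sensitivity is high simply because the NTM calls most chemicals positive, yet sensitivity + specificity and PPV + NPV both stay close to 1, and PPV stays close to the prevalence, which is the pattern the abstract associates with a test that predicts no better than chance.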
Impact of Cognitive Abilities and Prior Knowledge on Complex Problem Solving Performance – Empirical Results and a Plea for Ecologically Valid Microworlds

PubMed Central

Süß, Heinz-Martin; Kretzschmar, André

2018-01-01

The original aim of complex problem solving (CPS) research was to bring the cognitive demands of complex real-life problems into the lab in order to investigate problem solving behavior and performance under controlled conditions. Up until now, the validity of psychometric intelligence constructs has been scrutinized with regard to its importance for CPS performance. At the same time, different CPS measurement approaches competing for the title of the best way to assess CPS have been developed. In the first part of the paper, we investigate the predictability of CPS performance on the basis of the Berlin Intelligence Structure Model and Cattell’s investment theory as well as an elaborated knowledge taxonomy. In the first study, 137 students managed a simulated shirt factory (Tailorshop; i.e., a complex real life-oriented system) twice, while in the second study, 152 students completed a forestry scenario (FSYS; i.e., a complex artificial world system). The results indicate that reasoning – specifically numerical reasoning (Studies 1 and 2) and figural reasoning (Study 2) – are the only relevant predictors among the intelligence constructs. We discuss the results with reference to the Brunswik symmetry principle. Path models suggest that reasoning and prior knowledge influence problem solving performance in the Tailorshop scenario mainly indirectly. In addition, different types of system-specific knowledge independently contribute to predicting CPS performance. The results of Study 2 indicate that working memory capacity, assessed as an additional predictor, has no incremental validity beyond reasoning. We conclude that (1) cognitive abilities and prior knowledge are substantial predictors of CPS performance, and (2) in contrast to former and recent interpretations, there is insufficient evidence to consider CPS a unique ability construct. 
In the second part of the paper, we discuss our results in light of recent CPS research, which predominantly utilizes the minimally complex systems (MCS) measurement approach. We suggest ecologically valid microworlds as an indispensable tool for future CPS research and applications. PMID:29867627
Assessment study of insight ARTHRO VR (®) arthroscopy virtual training simulator: face, content, and construct validities.

PubMed

Bayona, Sofía; Fernández-Arroyo, José Manuel; Martín, Isaac; Bayona, Pilar

2008-09-01

The aims of this study were to test the face, content, and construct validities of a virtual-reality haptic arthroscopy simulator and to validate four assessment hypotheses. The participants in our study were 94 arthroscopists attending an international conference on arthroscopy. The interviewed surgeons had been performing arthroscopies for a mean of 8.71 years (σ = 6.94 years). We explained the operation, functionality, instructions for use, and the exercises provided by the simulator. They performed a trial exercise and then an exercise in which performance was recorded. After using it, the arthroscopists answered a questionnaire. The simulator was classified as one of the best training methods (over phantoms), and obtained a mark of 7.10 out of 10 as an evaluation tool. The simulator was considered more useful for inexperienced surgeons than for surgeons with experience (mean difference 1.88 out of 10, P value < 0.001). The participants valued the simulator at 8.24 as a tool for learning skills, its fidelity at 7.41, the quality of the platform at 7.54, and the content of the exercises at 7.09. It obtained a global score of 7.82. Of the subjects, 30.8% said they would practise with the simulator more than 6 h per week. Of the surgeons, 89.4% affirmed that they would recommend the simulator to their colleagues. The data gathered support the first three hypotheses, as well as face and content validities. Results show statistically significant differences between experts and novices, thus supporting the construct validity, but studies with a larger sample must be carried out to verify this. We propose concrete solutions and an equation to calculate economy of movement. Analogously, we analyze competence measurements and propose an equation to provide a single measurement that contains them all and that, according to the surgeons' criteria, is as reliable as the judgment of experts observing the performance of an apprentice.
Validity of the Medical College Admission Test for Predicting MD-PhD Student Outcomes

ERIC Educational Resources Information Center

Bills, James L.; VanHouten, Jacob; Grundy, Michelle M.; Chalkley, Roger; Dermody, Terence S.

2016-01-01

The Medical College Admission Test (MCAT) is a quantitative metric used by MD and MD-PhD programs to evaluate applicants for admission. This study assessed the validity of the MCAT in predicting training performance measures and career outcomes for MD-PhD students at a single institution. The study population consisted of 153 graduates of the…
Measuring Emotional Intelligence in Early Adolescence with the MSCEIT-YV: Psychometric Properties and Relationship with Academic Performance and Psychosocial Functioning

ERIC Educational Resources Information Center

Rivers, Susan E.; Brackett, Marc A.; Reyes, Maria R.; Mayer, John D.; Caruso, David R.; Salovey, Peter

2012-01-01

Emotional intelligence (EI) theory provides a framework to study the role of emotion skills in social, personal, and academic functioning. Reporting data validating the importance of EI among youth have been limited due to a dearth of measurement instruments. In two studies, the authors examined the reliability and validity of the…
The Study of Validity and Reliability of the Perceived Value Scale of Prospective Teachers in Terms of Teaching Profession

ERIC Educational Resources Information Center

Demir, Engin; Budak, Yusuf; Demir, Cennet Gologlu

2017-01-01

The aim of this study was to develop "Perceived Value Scale in regard to Teaching Profession of Prospective Teachers." The validity and reliability analysis of the scale, developed for prospective elementary school teachers, was performed. In order to determine the values of the teaching profession, first of all, the related literature…
U.S.-MEXICO BORDER PROGRAM ARIZONA BORDER STUDY--STANDARD OPERATING PROCEDURE FOR PERFORMANCE OF COMPUTER SOFTWARE: VERIFICATION AND VALIDATION (UA-D-2.0)

EPA Science Inventory

The purpose of this SOP is to define the procedures used for the initial and periodic verification and validation of computer programs used during the Arizona NHEXAS project and the Border study. Keywords: Computers; Software; QA/QC.
The U.S.-Mexico Border Program is sponsored ...
The reliability and validity of a short food frequency questionnaire among 9–11-year olds: a multinational study on three middle-income and high-income countries

PubMed Central

Saloheimo, T; González, S A; Erkkola, M; Milauskas, D M; Meisel, J D; Champagne, C M; Tudor-Locke, C; Sarmiento, O; Katzmarzyk, P T; Fogelholm, M

2015-01-01

Objective: The main aim of this study was to assess the reliability and validity of a food frequency questionnaire with 23 food groups (I-FFQ) among a sample of 9–11-year-old children from three different countries that differ on economical development and income distribution, and to assess differences between country sites. Furthermore, we assessed factors associated with I-FFQ's performance. Methods: This was an ancillary study of the International Study of Childhood Obesity, Lifestyle and the Environment. Reliability (n=321) and validity (n=282) components of this study had the same participants. Participation rates were 95% and 70%, respectively. Participants completed two I-FFQs with a mean interval of 4.9 weeks to assess reliability. A 3-day pre-coded food diary (PFD) was used as the reference method in the validity analyses. Wilcoxon signed-rank tests, intraclass correlation coefficients and cross-classifications were used to assess the reliability of I-FFQ. Spearman correlation coefficients, percentage difference and cross-classifications were used to assess the validity of I-FFQ. A logistic regression model was used to assess the relation of selected variables with the estimate of validity. Analyses based on information in the PFDs were performed to assess how participants interpreted food groups. Results: Reliability correlation coefficients ranged from 0.37 to 0.78 and gross misclassification for all food groups was <5%. Validity correlation coefficients were below 0.5 for 22/23 food groups, and they differed among country sites. For validity, gross misclassification was <5% for 22/23 food groups. Over- or underestimation did not appear for 19/23 food groups. Logistic regression showed that country of participation and parental education were associated (P⩽0.05) with the validity of I-FFQ. Analyses of children's interpretation of food groups suggested that the meaning of most food groups was understood by the children. 
Conclusion: I-FFQ is a moderately reliable method and its validity ranged from low to moderate, depending on food group and country site. PMID:27152180
Validation of Welding Curriculum Manual.

ERIC Educational Resources Information Center

Stone, Sheila D.

A study was conducted to validate the welding curriculum materials developed and published by the Oklahoma State Department of Vocational and Technical Education. Twelve instructors collected achievement data (unit tests, assignment sheets, and evaluation forms) concerning the performance of 280 students on a total of 46 instructional units. Item…
Development, calibration, and validation of performance prediction models for the Texas M-E flexible pavement design system.

DOT National Transportation Integrated Search

2010-08-01

This study was intended to recommend future directions for the development of TxDOT's Mechanistic-Empirical (TexME) design system. For stress predictions, a multi-layer linear elastic system was evaluated and its validity was verified by compar...
Comprehension of Written Grammar Test: Reliability and Known-Groups Validity Study With Hearing and Deaf and Hard-of-Hearing Students.

PubMed

Cannon, Joanna E; Hubley, Anita M; Millhoff, Courtney; Mazlouman, Shahla

2016-01-01

The aim of the current study was to gather validation evidence for the Comprehension of Written Grammar (CWG; Easterbrooks, 2010) receptive test of 26 grammatical structures of English print for use with children who are deaf and hard of hearing (DHH). Reliability and validity data were collected for 98 participants (49 DHH and 49 hearing) in Grades 2-6. The objectives were to: (a) examine 4-week test-retest reliability data; and (b) provide evidence of known-groups validity by examining expected differences between the groups on the CWG vocabulary pretest and main test, as well as selected structures. Results indicated excellent test-retest reliability estimates for CWG test scores. DHH participants performed statistically significantly lower on the CWG vocabulary pretest and main test than the hearing participants. Significantly lower performance by DHH participants on most expected grammatical structures (e.g., basic sentence patterns, auxiliary "be" singular/plural forms, tense, comparatives, and complementation) also provided known groups evidence. Overall, the findings of this study showed strong evidence of the reliability of scores and known group-based validity of inferences made from the CWG. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Simulated ventriculostomy training with conventional neuronavigational equipment used clinically in the operating room: prospective validation study.

PubMed

Kirkman, Matthew A; Muirhead, William; Sevdalis, Nick; Nandi, Dipankar

2015-01-01

Simulation is gaining increasing interest as a method of delivering high-quality, time-effective, and safe training to neurosurgical residents. However, most current simulators are purpose-built for simulation, being relatively expensive and inaccessible to many residents. The purpose of this study was to provide the first comprehensive validity assessment of ventriculostomy performance metrics from the Medtronic StealthStation S7 Surgical Navigation System, a neuronavigational tool widely used in the clinical setting, as a training tool for simulated ventriculostomy while concomitantly reporting on stress measures. A prospective study where participants performed 6 simulated ventriculostomy attempts on a model head with StealthStation-coregistered imaging. The performance measures included distance of the ventricular catheter tip to the foramen of Monro and presence of the catheter tip in the ventricle. Data on objective and self-reported stress and workload measures were also collected. The operating rooms of the National Hospital for Neurology and Neurosurgery, Queen Square, London. A total of 31 individuals with varying levels of prior ventriculostomy experience, varying in seniority from medical student to senior resident. Performance at simulated ventriculostomy improved significantly over subsequent attempts, irrespective of previous ventriculostomy experience. Performance improved whether or not the StealthStation display monitor was used for real-time visual feedback, but performance was optimal when it was. Further, performance was inversely correlated with both objective and self-reported measures of stress (traditionally referred to as concurrent validity). Stress and workload measures were well-correlated with each other, and they also correlated with technical performance. 
These initial data support the use of the StealthStation as a training tool for simulated ventriculostomy, providing a safe environment for repeated practice with immediate feedback. Although the potential implications are profound for neurosurgical education and training, further research following this proof-of-concept study is required on a larger scale for full validation and proof that training translates into improved long-term simulated and patient outcomes. Copyright © 2015 Association of Program Directors in Surgery. Published by Elsevier Inc. All rights reserved.
Identification of biomarkers for lung cancer in never smokers — EDRN Public Portal

Cancer.gov

The overall goal of this project is to identify, verify and apply biomarkers for the early diagnosis or risk assessment of lung cancer in never smokers. The first year will be regarded as a year of discovery. After successful demonstration of the feasibility of the approach for novel marker discovery, funding will be applied for to perform confirmation and preclinical studies on the biomarkers and validation studies (specific aims 2 and 3, to be performed in years two and three). Year two can be regarded as the year of confirmation and year three as the year of validation.
Assessing Cognitive Performance in Badminton Players: A Reproducibility and Validity Study

PubMed Central

van de Water, Tanja; Faber, Irene; Elferink-Gemser, Marije

2017-01-01

Fast reaction and good inhibitory control are associated with elite sports performance. To evaluate the reproducibility and validity of a newly developed Badminton Reaction Inhibition Test (BRIT), fifteen elite (25 ± 4 years) and nine non-elite (24 ± 4 years) Dutch male badminton players participated in the study. The BRIT measured four components: domain-general reaction time, badminton-specific reaction time, domain-general inhibitory control and badminton-specific inhibitory control. Five participants were retested within three weeks on the badminton-specific components. Reproducibility was acceptable for badminton-specific reaction time (ICC = 0.626, CV = 6%) and for badminton-specific inhibitory control (ICC = 0.317, CV = 13%). Good construct validity was shown for badminton-specific reaction time discriminating between elite and non-elite players (F = 6.650, p < 0.05). Elite players did not outscore non-elite players on domain-general reaction time nor on both components of inhibitory control (p > 0.05). Concurrent validity for domain-general reaction time was good, as it was associated with a national ranking for elite (ρ = 0.70, p < 0.01) and non-elite (ρ = 0.70, p < 0.05) players. No relationship was found between the national ranking and badminton-specific reaction time, nor both components of inhibitory control (p > 0.05). In conclusion, reproducibility and validity of inhibitory control assessment was not confirmed; however, the BRIT appears a reproducible and valid measure of reaction time in badminton players. Reaction time measured with the BRIT may provide input for training programs aiming to improve badminton players’ performance. PMID:28210347
Assessing Cognitive Performance in Badminton Players: A Reproducibility and Validity Study.

PubMed

van de Water, Tanja; Huijgen, Barbara; Faber, Irene; Elferink-Gemser, Marije

2017-01-01

Fast reaction and good inhibitory control are associated with elite sports performance. To evaluate the reproducibility and validity of a newly developed Badminton Reaction Inhibition Test (BRIT), fifteen elite (25 ± 4 years) and nine non-elite (24 ± 4 years) Dutch male badminton players participated in the study. The BRIT measured four components: domain-general reaction time, badminton-specific reaction time, domain-general inhibitory control and badminton-specific inhibitory control. Five participants were retested within three weeks on the badminton-specific components. Reproducibility was acceptable for badminton-specific reaction time (ICC = 0.626, CV = 6%) and for badminton-specific inhibitory control (ICC = 0.317, CV = 13%). Good construct validity was shown for badminton-specific reaction time discriminating between elite and non-elite players (F = 6.650, p < 0.05). Elite players did not outscore non-elite players on domain-general reaction time nor on both components of inhibitory control (p > 0.05). Concurrent validity for domain-general reaction time was good, as it was associated with a national ranking for elite (ρ = 0.70, p < 0.01) and non-elite (ρ = 0.70, p < 0.05) players. No relationship was found between the national ranking and badminton-specific reaction time, nor both components of inhibitory control (p > 0.05). In conclusion, reproducibility and validity of inhibitory control assessment was not confirmed; however, the BRIT appears a reproducible and valid measure of reaction time in badminton players. Reaction time measured with the BRIT may provide input for training programs aiming to improve badminton players' performance.
Reliable and valid assessment of point-of-care ultrasonography.

PubMed

Todsen, Tobias; Tolsgaard, Martin Grønnebæk; Olsen, Beth Härstedt; Henriksen, Birthe Merete; Hillingsø, Jens Georg; Konge, Lars; Jensen, Morten Lind; Ringsted, Charlotte

2015-02-01

To explore the reliability and validity of the Objective Structured Assessment of Ultrasound Skills (OSAUS) scale for point-of-care ultrasonography (POC US) performance. POC US is increasingly used by clinicians and is an essential part of the management of acute surgical conditions. However, the quality of performance is highly operator-dependent. Therefore, reliable and valid assessment of trainees' ultrasonography competence is needed to ensure patient safety. Twenty-four physicians, representing novices, intermediates, and experts in POC US, scanned 4 different surgical patient cases in a controlled set-up. All ultrasound examinations were video-recorded and assessed by 2 blinded radiologists using OSAUS. Reliability was examined using generalizability theory. Construct validity was examined by comparing performance scores between the groups and by correlating physicians' OSAUS scores with diagnostic accuracy. The generalizability coefficient was high (0.81) and a D-study demonstrated that 1 assessor and 5 cases would result in similar reliability. The construct validity of the OSAUS scale was supported by a significant difference in the mean scores between the novice group (17.0; SD 8.4) and the intermediate group (30.0; SD 10.1), P = 0.007, as well as between the intermediate group and the expert group (72.9; SD 4.4), P = 0.04, and by a high correlation between OSAUS scores and diagnostic accuracy (Spearman ρ correlation coefficient = 0.76; P < 0.001). This study demonstrates high reliability as well as evidence of construct validity of the OSAUS scale for assessment of POC US competence. Hence, the OSAUS scale may be suitable for both in-training as well as end-of-training assessment.
Development of Airport Surface Required Navigation Performance (RNP)

NASA Technical Reports Server (NTRS)

Cassell, Rick; Smith, Alex; Hicok, Dan

1999-01-01

The U.S. and international aviation communities have adopted the Required Navigation Performance (RNP) process for defining aircraft performance when operating in the en-route, approach and landing phases of flight. RNP consists primarily of the following key parameters: accuracy, integrity, continuity, and availability. The processes and analytical techniques employed to define en-route, approach and landing RNP have been applied in the development of RNP for the airport surface. To validate the proposed RNP requirements several methods were used. Operational and flight demonstration data were analyzed for conformance with proposed requirements, as were several aircraft flight simulation studies. The pilot failure risk component was analyzed through several hypothetical scenarios. Additional simulator studies are recommended to better quantify crew reactions to failures as well as additional simulator and field testing to validate achieved accuracy performance. This research was performed in support of the NASA Low Visibility Landing and Surface Operations Programs.

An Examination of Coach and Player Relationships According to the Adapted LMX 7 Scale: A Validity and Reliability Study

ERIC Educational Resources Information Center

Caliskan, Gokhan

2015-01-01

The current study aims to test the reliability and validity of the Leader-Member Exchange (LMX 7) scale with regard to coach--player relationships in sports settings. A total of 330 professional soccer players from the Turkish Super League as well as from the First and Second Leagues participated in this study. Factor analyses were performed to…
Red flags in the clinical interview may forecast invalid neuropsychological testing.

PubMed

Keesler, Michael E; McClung, Kirstie; Meredith-Duliba, Tawny; Williams, Kelli; Swirsky-Sacchetti, Thomas

2017-04-01

Evaluating assessment validity is expected in neuropsychological evaluation, particularly in cases with identified secondary gain, where malingering or somatization may be present. Assessed with standalone measures and embedded indices, all within the testing portion of the examination, research on validity of self-report in the clinical interview is limited. Based on experience with litigation-involved examinees recovering from mild traumatic brain injury (mTBI), it was hypothesized that inconsistently reported date of injury (DOI) and/or loss of consciousness (LOC) might predict invalid performance on neurocognitive testing. This archival study examined cases of litigation-involved mTBI patients seen at an outpatient neuropsychological practice in Philadelphia, PA. Coded data included demographic variables, performance validity measures, and consistency between self-report and medicolegal records. A significant relationship was found between the consistency of examinees' self-report with records and their scores on performance validity testing, χ²(1, N = 84) = 24.18, p < .01, Φ = .49. Post hoc testing revealed significant between-group differences in three of four comparisons, with medium to large effect sizes. A final post hoc analysis found significance between the number of performance validity tests (PVTs) failed and the extent to which an examinee incorrectly reported DOI, r(83) = .49, p < .01. Using inconsistently reported LOC and/or DOI to predict an examinee's performance as invalid had a 75% sensitivity and a 75% specificity. Examinees whose reported DOI or LOC differs from records may be more likely to fail one or more PVTs, suggesting possible symptom exaggeration and/or underperformance on cognitive testing.
Invalid before impaired: an emerging paradox of embedded validity indicators.

PubMed

Erdodi, Laszlo A; Lichtenstein, Jonathan D

Embedded validity indicators (EVIs) are cost-effective psychometric tools to identify non-credible response sets during neuropsychological testing. As research on EVIs expands, assessors are faced with an emerging contradiction: the range of credible impairment disappears between the 'normal' and 'invalid' range of performance. We labeled this phenomenon as the invalid-before-impaired paradox. This study was designed to explore the origin of this psychometric anomaly, subject it to empirical investigation, and generate potential solutions. Archival data were analyzed from a mixed clinical sample of 312 (M Age = 45.2; M Education = 13.6) patients medically referred for neuropsychological assessment. The distribution of scores on eight subtests of the third and fourth editions of Wechsler Adult Intelligence Scale (WAIS) were examined in relation to the standard normal curve and two performance validity tests (PVTs). Although WAIS subtests varied in their sensitivity to non-credible responding, they were all significant predictors of performance validity. While subtests previously identified as EVIs (Digit Span, Coding, and Symbol Search) were comparably effective at differentiating credible and non-credible response sets, their classification accuracy was driven by their base rate of low scores, requiring different cutoffs to achieve comparable specificity. Invalid performance had a global effect on WAIS scores. Genuine impairment and non-credible performance can co-exist, are often intertwined, and may be psychometrically indistinguishable. A compromise between the alpha and beta bias on PVTs based on a balanced, objective evaluation of the evidence that requires concessions from both sides is needed to maintain/restore the credibility of performance validity assessment.
Validation of the INCEPT: A Multisource Feedback Tool for Capturing Different Perspectives on Physicians' Professional Performance.

PubMed

van der Meulen, Mirja W; Boerebach, Benjamin C M; Smirnova, Alina; Heeneman, Sylvia; Oude Egbrink, Mirjam G A; van der Vleuten, Cees P M; Arah, Onyebuchi A; Lombarts, Kiki M J M H

2017-01-01

Multisource feedback (MSF) instruments are used to and must feasibly provide reliable and valid data on physicians' performance from multiple perspectives. The "INviting Co-workers to Evaluate Physicians Tool" (INCEPT) is a multisource feedback instrument used to evaluate physicians' professional performance as perceived by peers, residents, and coworkers. In this study, we report on the validity, reliability, and feasibility of the INCEPT. The performance of 218 physicians was assessed by 597 peers, 344 residents, and 822 coworkers. Using explorative and confirmatory factor analyses, multilevel regression analyses between narrative and numerical feedback, item-total correlations, interscale correlations, Cronbach's α and generalizability analyses, the psychometric qualities and feasibility of the INCEPT were investigated. For all respondent groups, three factors were identified, although constructed slightly differently: "professional attitude," "patient-centeredness," and "organization and (self)-management." Internal consistency was high for all constructs (Cronbach's α ≥ 0.84 and item-total correlations ≥ 0.52). Confirmatory factor analyses indicated acceptable to good fit. Further validity evidence was given by the associations between narrative and numerical feedback. For reliable total INCEPT scores, three peer, two resident and three coworker evaluations were needed; for subscale scores, evaluations of three peers, three residents and three to four coworkers were sufficient. The INCEPT instrument provides physicians with performance feedback in a valid and reliable way. The number of evaluations to establish reliable scores is achievable in a regular clinical department. When interpreting feedback, physicians should consider that respondent groups' perceptions differ as indicated by the different item clustering per performance factor.
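Internal consistency figures such as the Cronbach's α values reported above follow from a standard formula, α = k/(k − 1) × (1 − Σ item variances / variance of total scores), where k is the number of items. The following is a minimal sketch of that textbook formula; the function name and the example scores are hypothetical and are not taken from the INCEPT data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    Uses sample variances throughout. Raises ValueError when alpha is
    undefined (fewer than two items or respondents, or zero total variance).
    """
    if len(responses) < 2 or len(responses[0]) < 2:
        raise ValueError("need at least two respondents and two items")
    k = len(responses[0])
    if any(len(row) != k for row in responses):
        raise ValueError("every respondent must answer every item")

    # Transpose to get one tuple of scores per item.
    item_variances = [variance(item) for item in zip(*responses)]
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")

    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


if __name__ == "__main__":
    # Hypothetical scores: four respondents, two items.
    example = [[1, 1], [2, 3], [3, 2], [4, 4]]
    print(f"alpha = {cronbach_alpha(example):.3f}")
```

The sketch treats every respondent as having answered every item; real multisource feedback data usually contain missing responses, which would need to be handled before applying the formula.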
The reliability and validity of fatigue measures during multiple-sprint work: an issue revisited.

PubMed

Glaister, Mark; Howatson, Glyn; Pattison, John R; McInnes, Gill

2008-09-01

The ability to repeatedly produce a high-power output or sprint speed is a key fitness component of most field and court sports. The aim of this study was to evaluate the validity and reliability of eight different approaches to quantify this parameter in tests of multiple-sprint performance. Ten physically active men completed two trials of each of two multiple-sprint running protocols with contrasting recovery periods. Protocol 1 consisted of 12 x 30-m sprints repeated every 35 seconds; protocol 2 consisted of 12 x 30-m sprints repeated every 65 seconds. All testing was performed in an indoor sports facility, and sprint times were recorded using twin-beam photocells. All but one of the formulae showed good construct validity, as evidenced by similar within-protocol fatigue scores. However, the assumptions on which many of the formulae were based, combined with poor or inconsistent test-retest reliability (coefficient of variation range: 0.8-145.7%; intraclass correlation coefficient range: 0.09-0.75), suggested many problems regarding logical validity. In line with previous research, the results support the percentage decrement calculation as the most valid and reliable method of quantifying fatigue in tests of multiple-sprint performance.
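A minimal Python sketch of the percentage decrement calculation named in the abstract above, for illustration only. The abstract does not spell out the formula; the form used here (total sprint time relative to an "ideal" total in which every sprint equals the fastest one) is the commonly cited version and is an assumption, as are the function name and the sample times.

```python
def percentage_decrement(sprint_times):
    """Fatigue score in percent: 100 * (total time / ideal time) - 100.

    The ideal time assumes every sprint was run at the fastest recorded
    sprint time (assumed formula; not stated in the abstract).
    """
    if not sprint_times:
        raise ValueError("need at least one sprint time")
    total_time = sum(sprint_times)
    ideal_time = min(sprint_times) * len(sprint_times)
    return 100.0 * total_time / ideal_time - 100.0


# Hypothetical times (seconds) for a 12 x 30-m protocol.
times = [4.50, 4.55, 4.60, 4.70, 4.75, 4.80,
         4.85, 4.90, 4.95, 5.00, 5.05, 5.10]
print(f"fatigue: {percentage_decrement(times):.2f}%")
```

A score of 0% means no slowing across sprints; larger values indicate greater fatigue.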
Methodology, Methods, and Metrics for Testing and Evaluating Augmented Cognition Systems

DOE Office of Scientific and Technical Information (OSTI.GOV)

Greitzer, Frank L.

The augmented cognition research community seeks cognitive neuroscience-based solutions to improve warfighter performance by applying and managing mitigation strategies to reduce workload and improve the throughput and quality of decisions. The focus of augmented cognition mitigation research is to define, demonstrate, and exploit neuroscience and behavioral measures that support inferences about the warfighter’s cognitive state that prescribe the nature and timing of mitigation. A research challenge is to develop valid evaluation methodologies, metrics and measures to assess the impact of augmented cognition mitigations. Two considerations are external validity, which is the extent to which the results apply to operational contexts; and internal validity, which reflects the reliability of performance measures and the conclusions based on analysis of results. The scientific rigor of the research methodology employed in conducting empirical investigations largely affects the validity of the findings. External validity requirements also compel us to demonstrate operational significance of mitigations. Thus it is important to demonstrate effectiveness of mitigations under specific conditions. This chapter reviews some cognitive science and methodological considerations in designing augmented cognition research studies and associated human performance metrics and analysis methods to assess the impact of augmented cognition mitigations.
Analysis of Aurora's Performance Simulation Engine for Three Systems

DOE Office of Scientific and Technical Information (OSTI.GOV)

Freeman, Janine; Simon, Joseph

2015-07-07

Aurora Solar Inc. is building a cloud-based optimization platform to automate the design, engineering, and permit generation process of solar photovoltaic (PV) installations. They requested that the National Renewable Energy Laboratory (NREL) validate the performance of the PV system performance simulation engine of Aurora Solar’s solar design platform, Aurora. In previous work, NREL performed a validation of multiple other PV modeling tools, so this study builds upon that work by examining all of the same fixed-tilt systems with available module datasheets that NREL selected and used in the aforementioned study. Aurora Solar set up these three operating PV systems in their modeling platform using NREL-provided system specifications and concurrent weather data. NREL then verified the setup of these systems, ran the simulations, and compared the Aurora-predicted performance data to measured performance data for those three systems, as well as to performance data predicted by other PV modeling tools.
Validity and reliability of an instrumented leg-extension machine for measuring isometric muscle strength of the knee extensors.

PubMed

Ruschel, Caroline; Haupenthal, Alessandro; Jacomel, Gabriel Fernandes; Fontana, Heiliane de Brito; Santos, Daniela Pacheco dos; Scoz, Robson Dias; Roesler, Helio

2015-05-20

Isometric muscle strength of knee extensors has been assessed for estimating performance, evaluating progress during physical training, and investigating the relationship between isometric and dynamic/functional performance. To assess the validity and reliability of an adapted leg-extension machine for measuring isometric knee extensor force. Validity (concurrent approach) and reliability (test and test-retest approach) study. University laboratory. 70 healthy men and women aged between 20 and 30 y (39 in the validity study and 31 in the reliability study). Intraclass correlation coefficient (ICC) values calculated for the maximum voluntary isometric torque of knee extensors at 30°, 60°, and 90°, measured with the prototype and with an isokinetic dynamometer (ICC2,1, validity study) and measured with the prototype in test and retest sessions, scheduled from 48 h to 72 h apart (ICC1,1, reliability study). In the validity analysis, the prototype showed good agreement for measurements at 30° (ICC2,1 = .75, SEM = 18.2 Nm) and excellent agreement for measurements at 60° (ICC2,1 = .93, SEM = 9.6 Nm) and at 90° (ICC2,1 = .94, SEM = 8.9 Nm). Regarding the reliability analysis, between-days' ICC1,1 were good to excellent, ranging from .88 to .93. Standard error of measurement and minimal detectable difference based on test-retest ranged from 11.7 Nm to 18.1 Nm and 32.5 Nm to 50.1 Nm, respectively, for the 3 analyzed knee angles. The validity and repeatability of the prototype for measuring isometric muscle strength were shown to be good or excellent, depending on the knee joint angle analyzed. The new instrument, which has a relatively low cost and is easier to transport than an isokinetic dynamometer, is valid and provides consistent data concerning isometric strength of knee extensors and, for this reason, can be used for practical, clinical, and research purposes.
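The relationship between the standard error of measurement (SEM) and the minimal detectable difference (MDD) reported above can be sketched in Python as follows. The abstract does not state which formulas were used; the conventional ones (SEM = SD × √(1 − ICC) and MDD at 95% confidence = 1.96 × √2 × SEM) are assumed here, and the function names are hypothetical.

```python
import math


def sem_from_icc(sd, icc):
    """Standard error of measurement from a between-subject SD and an ICC
    (assumed conventional formula)."""
    return sd * math.sqrt(1.0 - icc)


def minimal_detectable_difference(sem, z=1.96):
    """Smallest change exceeding measurement error at the given z level
    (assumed conventional formula: z * sqrt(2) * SEM)."""
    return z * math.sqrt(2.0) * sem


# SEM bounds quoted in the abstract (Nm).
for sem in (11.7, 18.1):
    mdd = minimal_detectable_difference(sem)
    print(f"SEM = {sem} Nm -> MDD = {mdd:.1f} Nm")
```

Under these assumptions the quoted SEM range of 11.7 to 18.1 Nm yields MDD values close to the reported 32.5 to 50.1 Nm; the small differences are consistent with rounding of the published SEM values.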
Concurrent and convergent validity of the mobility- and multidimensional-hierarchical disability categorization models with physical performance in community older adults.

PubMed

Hu, Ming-Hsia; Yeh, Chih-Jun; Chen, Tou-Rong; Wang, Ching-Yi

2014-01-01

A valid, time-efficient and easy-to-use instrument is important for busy clinical settings, large scale surveys, or community screening use. The purpose of this study was to validate the mobility hierarchical disability categorization model (an abbreviated model) by investigating its concurrent validity with the multidimensional hierarchical disability categorization model (a comprehensive model) and triangulating both models with physical performance measures in older adults. A total of 604 community-dwelling older adults aged at least 60 years volunteered to participate. Self-reported function in the mobility, instrumental activities of daily living (IADL) and activities of daily living (ADL) domains was recorded, and disability status was then determined based on both the multidimensional hierarchical categorization model and the mobility hierarchical categorization model. The physical performance measures, consisting of grip strength and usual and fastest gait speeds (UGS, FGS), were collected on the same day. The two categorization models showed high correlation (γs = 0.92, p < 0.001) and agreement (kappa = 0.61, p < 0.0001). Physical performance measures demonstrated significantly different group means among the disability subgroups based on both categorization models. The results of multiple regression analysis indicated that both models individually explain a similar amount of variance in all physical performance measures, with adjustments for age, sex, and number of comorbidities. Our results indicate that the mobility hierarchical disability categorization model is a valid and time-efficient tool for large survey or screening use.
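As an illustration of the agreement statistic quoted above, the following Python sketch computes an unweighted Cohen's kappa for two sets of category labels. The abstract does not say whether a weighted or unweighted kappa was used, so the unweighted form is an assumption; the function name and the example labels are hypothetical.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Unweighted Cohen's kappa: (observed - expected) / (1 - expected)."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Proportion of cases where both classifications agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance from the marginal category frequencies.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical disability categories assigned by two models.
model_1 = ["none", "none", "mild", "mild", "severe", "severe"]
model_2 = ["none", "mild", "mild", "mild", "severe", "none"]
print(f"kappa = {cohens_kappa(model_1, model_2):.2f}")
```

For ordered disability categories a weighted kappa, which penalizes near-misses less than distant disagreements, may be the more appropriate choice.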
DNA Commission of the International Society for Forensic Genetics: Recommendations on the validation of software programs performing biostatistical calculations for forensic genetics applications.

PubMed

Coble, M D; Buckleton, J; Butler, J M; Egeland, T; Fimmers, R; Gill, P; Gusmão, L; Guttman, B; Krawczak, M; Morling, N; Parson, W; Pinto, N; Schneider, P M; Sherry, S T; Willuweit, S; Prinz, M

2016-11-01

The use of biostatistical software programs to assist in data interpretation and calculate likelihood ratios is essential to forensic geneticists and part of the daily case work flow for both kinship and DNA identification laboratories. Previous recommendations issued by the DNA Commission of the International Society for Forensic Genetics (ISFG) covered the application of bio-statistical evaluations for STR typing results in identification and kinship cases, and this is now being expanded to provide best practices regarding validation and verification of the software required for these calculations. With larger multiplexes, more complex mixtures, and increasing requests for extended family testing, laboratories are relying more than ever on specific software solutions, and sufficient validation, training and extensive documentation are of utmost importance. Here, we present recommendations for the minimum requirements to validate bio-statistical software to be used in forensic genetics. We distinguish between developmental validation and the responsibilities of the software developer or provider, and the internal validation studies to be performed by the end user. Recommendations for the software provider address, for example, the documentation of the underlying models used by the software, validation data expectations, version control, implementation and training support, as well as continuity and user notifications. For the internal validations the recommendations include: creating a validation plan, requirements for the range of samples to be tested, Standard Operating Procedure development, and internal laboratory training and education. To ensure that all laboratories have access to a wide range of samples for validation and training purposes the ISFG DNA commission encourages collaborative studies and public repositories of STR typing results. Published by Elsevier Ireland Ltd.
A semi-automated volumetric software for segmentation and perfusion parameter quantification of brain tumors using 320-row multidetector computed tomography: a validation study.

PubMed

Chae, Soo Young; Suh, Sangil; Ryoo, Inseon; Park, Arim; Noh, Kyoung Jin; Shim, Hackjoon; Seol, Hae Young

2017-05-01

We developed a semi-automated volumetric software, NPerfusion, to segment brain tumors and quantify perfusion parameters on whole-brain CT perfusion (WBCTP) images. The purpose of this study was to assess the feasibility of the software and to validate its performance compared with manual segmentation. Twenty-nine patients with pathologically proven brain tumors who underwent preoperative WBCTP between August 2012 and February 2015 were included. Three perfusion parameters, arterial flow (AF), equivalent blood volume (EBV), and Patlak flow (PF, which is a measure of permeability of capillaries), of brain tumors were generated by a commercial software and then quantified volumetrically by NPerfusion, which also semi-automatically segmented tumor boundaries. The quantification was validated by comparison with that of manual segmentation in terms of the concordance correlation coefficient and Bland-Altman analysis. With NPerfusion, we successfully performed segmentation and quantified whole volumetric perfusion parameters of all 29 brain tumors that showed consistent perfusion trends with previous studies. The validation of the perfusion parameter quantification exhibited almost perfect agreement with manual segmentation, with Lin concordance correlation coefficients (ρc) for AF, EBV, and PF of 0.9988, 0.9994, and 0.9976, respectively. On Bland-Altman analysis, most differences between this software and manual segmentation on the commercial software were within the limit of agreement. NPerfusion successfully performs segmentation of brain tumors and calculates perfusion parameters of brain tumors. We validated this semi-automated segmentation software by comparing it with manual segmentation. NPerfusion can be used to calculate volumetric perfusion parameters of brain tumors from WBCTP.
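The two agreement measures named in the abstract above can be sketched in Python as follows. This is an illustrative sketch rather than the authors' implementation: it assumes the standard definition of Lin's concordance correlation coefficient (with 1/n variances) and the usual Bland-Altman 95% limits of agreement (mean difference ± 1.96 SD of the differences). Function names and the sample values are hypothetical.

```python
from statistics import mean, stdev


def lin_ccc(x, y):
    """Lin's concordance correlation coefficient:
    2*s_xy / (s_x^2 + s_y^2 + (mean_x - mean_y)^2), using 1/n moments."""
    n = len(x)
    mx, my = mean(x), mean(y)
    s_xx = sum((a - mx) ** 2 for a in x) / n
    s_yy = sum((b - my) ** 2 for b in y) / n
    s_xy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return 2.0 * s_xy / (s_xx + s_yy + (mx - my) ** 2)


def bland_altman_limits(x, y, z=1.96):
    """Return (lower limit, bias, upper limit) of agreement for paired data."""
    diffs = [a - b for a, b in zip(x, y)]
    bias = mean(diffs)
    spread = z * stdev(diffs)  # sample SD of the paired differences
    return bias - spread, bias, bias + spread


# Hypothetical paired measurements: semi-automated vs manual segmentation.
software = [10.0, 12.0, 14.0, 16.0]
manual = [9.0, 13.0, 13.0, 17.0]
print(f"CCC = {lin_ccc(software, manual):.4f}")
print("limits of agreement:", bland_altman_limits(software, manual))
```

Unlike the Pearson correlation, the concordance coefficient is reduced by any systematic offset between the two methods, which is why it is often paired with a Bland-Altman analysis.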
Assessment of performance validity in the Stroop Color and Word Test in mild traumatic brain injury patients: a criterion-groups validation design.

PubMed

Guise, Brian J; Thompson, Matthew D; Greve, Kevin W; Bianchini, Kevin J; West, Laura

2014-03-01

The current study assessed performance validity on the Stroop Color and Word Test (Stroop) in mild traumatic brain injury (TBI) using criterion-groups validation. The sample consisted of 77 patients with a reported history of mild TBI. Data from 42 moderate-severe TBI and 75 non-head-injured patients with other clinical diagnoses were also examined. TBI patients were categorized on the basis of Slick, Sherman, and Iverson (1999) criteria for malingered neurocognitive dysfunction (MND). Classification accuracy is reported for three indicators (Word, Color, and Color-Word residual raw scores) from the Stroop across a range of injury severities. With false-positive rates set at approximately 5%, sensitivity was as high as 29%. The clinical implications of these findings are discussed. © 2012 The British Psychological Society.
The Children's Social Understanding Scale: construction and validation of a parent-report measure for assessing individual differences in children's theories of mind.

PubMed

Tahiroglu, Deniz; Moses, Louis J; Carlson, Stephanie M; Mahy, Caitlin E V; Olofson, Eric L; Sabbagh, Mark A

2014-11-01

Children's theory of mind (ToM) is typically measured with laboratory assessments of performance. Although these measures have generated a wealth of informative data concerning developmental progressions in ToM, they may be less useful as the sole source of information about individual differences in ToM and their relation to other facets of development. In the current research, we aimed to expand the repertoire of methods available for measuring ToM by developing and validating a parent-report ToM measure: the Children's Social Understanding Scale (CSUS). We present 3 studies assessing the psychometric properties of the CSUS. Study 1 describes item analysis, internal consistency, test-retest reliability, and relation of the scale to children's performance on laboratory ToM tasks. Study 2 presents cross-validation data for the scale in a different sample of preschool children with a different set of ToM tasks. Study 3 presents further validation data for the scale with a slightly older age group and a more advanced ToM task, while controlling for several other relevant cognitive abilities. The findings indicate that the CSUS is a reliable and valid measure of individual differences in children's ToM that may be of great value as a complement to standard ToM tasks in many different research contexts. (PsycINFO Database Record (c) 2014 APA, all rights reserved).
Development and validation of the Perceived Game-Specific Soccer Competence Scale.

PubMed

Forsman, Hannele; Gråstén, Arto; Blomqvist, Minna; Davids, Keith; Liukkonen, Jarmo; Konttinen, Niilo

2016-07-01

The objective of this study was to create a valid, self-reported, game-specific soccer competence scale. A structural model of perceived competence, performance measures and motivation was tested as the basis for the scale. A total of 1321 soccer players (261 females, 1060 males) ranging from 12 to 15 years (13.4 ± 1.0 years) participated in the study. They completed the Perceived Game-Specific Soccer Competence Scale (PGSSCS), self-assessments of tactical skills and motivation, as well as technical and speed and agility tests. Results of factor analyses, tests of internal consistency and correlations between PGSSCS subscales, performance measures and motivation supported the reliability and validity of the PGSSCS. The scale can be considered a suitable instrument to assess perceived game-specific competence among young soccer players.
Validity and test-retest reliability of an at-work production loss instrument.

PubMed

Aboagye, E; Jensen, I; Bergström, G; Hagberg, J; Axén, I; Lohela-Karlsson, M

2016-07-01

Besides causing ill health, a poor work environment may contribute to production loss. Production loss assessment instruments emphasize health-related consequences but there is no instrument to measure reduced work performance related to the work environment. To examine convergent validity and test-retest reliability of health-related production loss (HRPL) and work environment-related production loss (WRPL) against a valid comparable instrument, the Health and Work Performance Questionnaire (HPQ). Cross-sectional study of employees, not on sick leave, who were asked to self-rate their work performance and production losses. Using the Pearson correlation and Bland and Altman's Test of Agreement, convergent validity was examined. Subgroup analyses were performed for employees recording problem-specific reduced work performance. Consistency of pairs of HRPL and WRPL for samples responding to both assessments was expressed using Intraclass Correlation Coefficient (ICC) and tests of repeatability. A total of 88 employees participated and 44 responded to both assessments. Test of agreement between measurements estimates a mean difference of 0.34 for HRPL and -0.03 for WRPL compared with work performance. This indicates that the production loss questions are valid and moderately associated with work performance for the total sample and subgroups. ICC for paired HRPL assessments was 0.90 and 0.91 for WRPL, i.e. the test-retest reliability was good and suggests stability in the instrument. HRPL and WRPL can be used to measure production loss due to health-related and work environment-related problems. These results may have implications for advancing methods of assessing production loss, which represents an important cost to employers. © The Author 2016. Published by Oxford University Press on behalf of the Society of Occupational Medicine. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Design and validation of the INICIARE instrument, for the assessment of dependency level in acutely ill hospitalised patients.

PubMed

Morales-Asencio, José Miguel; Porcel-Gálvez, Ana María; Oliveros-Valenzuela, Rosa; Rodríguez-Gómez, Susana; Sánchez-Extremera, Lucrecia; Serrano-López, Francisco Andrés; Aranda-Gallardo, Marta; Canca-Sánchez, José Carlos; Barrientos-Trigo, Sergio

2015-03-01

The aim of this study was to establish the validity and reliability of an instrument (Inventario del NIvel de Cuidados mediante IndicAdores de clasificación de Resultados de Enfermería) used to assess the dependency level in acutely hospitalised patients. This instrument is novel, and it is based on the Nursing Outcomes Classification. Multiple existing instruments for needs assessment have been poorly validated and based predominately on interventions. Standardised Nursing Languages offer an ideal framework to develop nursing sensitive instruments. A cross-sectional validation study in two acute care hospitals in Spain. This study was implemented in two phases. First, the research team developed the instrument to be validated. In the second phase, the validation process was performed by experts, and the data analysis was conducted to establish the psychometric properties of the instrument. Seven hundred and sixty-one patient ratings performed by nurses were collected during the course of the research study. Data analysis yielded a Cronbach's alpha of 0·91. An exploratory factorial analysis identified three factors (Physiological, Instrumental and Cognitive-behavioural), which explained 74% of the variance. Inventario del NIvel de Cuidados mediante IndicAdores de clasificación de Resultados de Enfermería was demonstrated to be a valid and reliable instrument based on its use in acutely hospitalised patients to assess the level of dependency. Inventario del NIvel de Cuidados mediante IndicAdores de clasificación de Resultados de Enfermería can be used as an assessment tool in hospitalised patients during the nursing process throughout the entire hospitalisation period. It contributes information to support decisions on nursing diagnoses, interventions and outcomes. It also enables data codification in large databases. © 2014 John Wiley & Sons Ltd.
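Several records in this listing report Cronbach's alpha as evidence of internal consistency. A minimal Python sketch of the standard formula, alpha = k/(k − 1) × (1 − Σ item variances / variance of total scores), is given here for illustration; the function name and the example ratings are hypothetical and are not taken from any of the studies.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    if len(scores) < 2:
        raise ValueError("need at least two respondents")
    k = len(scores[0])
    if k < 2 or any(len(row) != k for row in scores):
        raise ValueError("each respondent needs the same k >= 2 item scores")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1.0 - sum(item_vars) / total_var)


# Hypothetical ratings: 4 respondents x 3 items.
ratings = [
    [3, 4, 3],
    [4, 5, 5],
    [2, 2, 3],
    [5, 5, 4],
]
print(f"alpha = {cronbach_alpha(ratings):.2f}")
```

Values near 1 indicate that the items vary together; the formula is undefined when all total scores are identical.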
A systematic review of a functional assessment Tool: UCSD Performance-based skill assessment (UPSA).

PubMed

Becattini-Oliveira, Ana Claudia; Dutra, Douglas de Farias; Spenciere de Oliveira Campos, Bárbara; de Araujo, Verônica Carvalho; Charchat-Fichman, Helenice

2018-05-18

Performance-based assessment instruments have been employed in functional capacity measurement of mental disorders. The aim of this systematic review was to identify the psychometric properties of the UCSD Performance-based Skill Assessment (UPSA). A search was conducted using the PRISMA protocol and 'UPSA' as the keyword on electronic databases, with a date range for articles published from 2001-2017. Published studies involving community-dwelling adults were included. Pharmacological and/or clinical interventions involving clinical outcomes and/or institutionalized samples were excluded. Data related to construct validity, test-retest reliability and sensitivity/specificity were extracted, summarized and analyzed according to UPSA versions and psychiatric disorders. Fifty-eight studies including 8782 community-dwelling adults met selection criteria. Data supporting the construct and known-groups validity were extracted from 41 studies involving schizophrenia and schizoaffective disorders and 17 studies involving other mental illnesses. The UPSA was culturally adapted to 8 different languages and employed in 17 countries. Few studies reported sensitivity and specificity and the cut-off points could not be generalized. Moderate to strong evidence of construct validity and test-retest reliability was found. Few studies proposed cut-off points. The UPSA showed good psychometric properties in different versions including those culturally adapted. Copyright © 2018 Elsevier B.V. All rights reserved.
Cross-cultural validity and measurement invariance of the Organizational Stressor Indicator for Sport Performers (OSI-SP) across three countries.

PubMed

Arnold, R; Ponnusamy, V; Zhang, C-Q; Gucciardi, D F

2017-08-01

Organizational stressors are a universal phenomenon which can be particularly prevalent and problematic for sport performers. In view of their global existence, it is surprising that no studies have examined cross-cultural differences in organizational stressors. One explanation for this is that the Organizational Stressor Indicator for Sport Performers (OSI-SP; Arnold, Fletcher, & Daniels, 2013), which can comprehensively measure the organizational pressures that sport performers have encountered, has not yet been translated from English into any other languages nor scrutinized cross-culturally. The first purpose of this study, therefore, was to examine the cross-cultural validity of the OSI-SP. In addition, the study aimed to test the equivalence of the OSI-SP's factor structure across cultures. British (n = 379), Chinese (n = 335), and Malaysian (n = 444) sport performers completed the OSI-SP. Confirmatory factor analyses confirmed the cross-cultural validity of the factorial model for the British and Malaysian samples; however, the overall model fit for the Chinese data did not meet all guideline values. Support was provided for the equality of factor loadings, variances, and covariances on the OSI-SP across the British and Malaysian cultures. These findings advance knowledge and understanding on the cross-cultural existence, conceptualization, and operationalization of organizational stressors. © 2016 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Validating Analytical Protocols to Determine Selected Pesticides and PCBs Using Routine Samples.

PubMed

Pindado Jiménez, Oscar; García Alonso, Susana; Pérez Pastor, Rosa María

2017-01-01

This study aims at providing recommendations concerning the validation of analytical protocols by using routine samples. It is intended to provide a case study on how to validate analytical methods in different environmental matrices. In order to analyze the selected compounds (pesticides and polychlorinated biphenyls) in two different environmental matrices, the current work has performed and validated two analytical procedures by GC-MS. A description is given of the validation of the two protocols by the analysis of more than 30 samples of water and sediments collected over nine months. The present work also examines the uncertainty associated with both analytical protocols. Specifically, the uncertainty for the water samples was estimated through a conventional approach. For the sediment matrix, however, the estimation of proportional/constant bias is also included because of its inhomogeneity. Results for the sediment matrix are reliable, showing a range of 25-35% of analytical variability associated with intermediate conditions. The analytical methodology for the water matrix determines the selected compounds with acceptable recoveries, and the combined uncertainty ranges between 20 and 30%. Analyzing routine samples is rarely applied to assess the trueness of novel analytical methods, and up to now this methodology has not been focused on organochlorine compounds in environmental matrices.
Validation of the tool assessment of clinical education (AssCE): A study using Delphi method and clinical experts.

PubMed

Löfmark, Anna; Mårtensson, Gunilla

2017-03-01

The aim of the present study was to establish the validity of the tool Assessment of Clinical Education (AssCE). The tool is widely used in Sweden and some Nordic countries for assessing nursing students' performance in clinical education. It is important that the tools in use be subjected to regular audit and critical reviews. The validation process, performed in two stages, was concluded with a high level of congruence. In the first stage, Delphi technique was used to elaborate the AssCE tool using a group of 35 clinical nurse lecturers. After three rounds, we reached consensus. In the second stage, a group of 46 clinical nurse lecturers representing 12 universities in Sweden and Norway audited the revised version of the AssCE in relation to learning outcomes from the last clinical course at their respective institutions. Validation of the revised AssCE was established with high congruence between the factors in the AssCE and examined learning outcomes. The revised AssCE tool seems to meet its objective to be a validated assessment tool for use in clinical nursing education. Copyright © 2016 Elsevier Ltd. All rights reserved.

Analysis of Ethanolamines: Validation of Semi-Volatile Analysis by HPLC-MS/MS by EPA Method MS888

DOE Office of Scientific and Technical Information (OSTI.GOV)

Owens, J; Vu, A; Koester, C

The Environmental Protection Agency's (EPA) Region 5 Chicago Regional Laboratory (CRL) developed a method titled 'Analysis of Diethanolamine, Triethanolamine, n-Methyldiethanolamine, and n-Ethyldiethanolamine in Water by Single Reaction Monitoring Liquid Chromatography/Tandem Mass Spectrometry (LC/MS/MS): EPA Method MS888'. This draft standard operating procedure (SOP) was distributed to multiple EPA laboratories and to Lawrence Livermore National Laboratory, which was tasked to serve as a reference laboratory for EPA's Environmental Reference Laboratory Network (ERLN) and to develop and validate analytical procedures. The primary objective of this study was to validate and verify the analytical procedures described in 'EPA Method MS888' for analysis of the listed ethanolamines in aqueous samples. The gathered data from this validation study will be used to: (1) demonstrate analytical method performance; (2) generate quality control acceptance criteria; and (3) revise the SOP to provide a validated method that would be available for use during a homeland security event. The data contained in this report will be compiled, by EPA CRL, with data generated by other EPA Regional laboratories so that performance metrics of 'EPA Method MS888' can be determined.
The performance of seven QPrediction risk scores in an independent external sample of patients from general practice: a validation study

PubMed Central

Hippisley-Cox, Julia; Coupland, Carol; Brindle, Peter

2014-01-01

Objectives To validate the performance of a set of risk prediction algorithms developed using the QResearch database, in an independent sample from general practices contributing to the Clinical Research Data Link (CPRD). Setting Prospective open cohort study using practices contributing to the CPRD database and practices contributing to the QResearch database. Participants The CPRD validation cohort consisted of 3.3 million patients, aged 25–99 years registered at 357 general practices between 1 Jan 1998 and 31 July 2012. The validation statistics for QResearch were obtained from the original published papers which used a one-third sample of practices separate to those used to derive the score. A cohort from QResearch was used to compare incidence rates and baseline characteristics and consisted of 6.8 million patients from 753 practices registered between 1 Jan 1998 and until 31 July 2013. Outcome measures Incident events relating to seven different risk prediction scores: QRISK2 (cardiovascular disease); QStroke (ischaemic stroke); QDiabetes (type 2 diabetes); QFracture (osteoporotic fracture and hip fracture); QKidney (moderate and severe kidney failure); QThrombosis (venous thromboembolism); QBleed (intracranial bleed and upper gastrointestinal haemorrhage). Measures of discrimination and calibration were calculated. Results Overall, the baseline characteristics of the CPRD and QResearch cohorts were similar though QResearch had higher recording levels for ethnicity and family history. The validation statistics for each of the risk prediction scores were very similar in the CPRD cohort compared with the published results from QResearch validation cohorts. For example, in women, the QDiabetes algorithm explained 50% of the variation within CPRD compared with 51% on QResearch and the receiver operator curve value was 0.85 on both databases. The scores were well calibrated in CPRD. 
Conclusions Each of the algorithms performed practically as well in the external independent CPRD validation cohorts as they had in the original published QResearch validation cohorts. PMID:25168040
Delphi Method Validation of a Procedural Performance Checklist for Insertion of an Ultrasound-Guided Internal Jugular Central Line.

PubMed

Hartman, Nicholas; Wittler, Mary; Askew, Kim; Manthey, David

2016-01-01

Placement of ultrasound-guided central lines is a critical skill for physicians in several specialties. Improving the quality of care delivered surrounding this procedure demands rigorous measurement of competency, and validated tools to assess performance are essential. Using the iterative, modified Delphi technique and experts in multiple disciplines across the United States, the study team created a 30-item checklist designed to assess competency in the placement of ultrasound-guided internal jugular central lines. Cronbach α was .94, indicating an excellent degree of internal consistency. Further validation of this checklist will require its implementation in simulated and clinical environments. © The Author(s) 2014.
An Experimental Study of the Internal Consistency of Judgments Made in Bookmark Standard Setting

ERIC Educational Resources Information Center

Clauser, Brian E.; Baldwin, Peter; Margolis, Melissa J.; Mee, Janet; Winward, Marcia

2017-01-01

Validating performance standards is challenging and complex. Because of the difficulties associated with collecting evidence related to external criteria, validity arguments rely heavily on evidence related to internal criteria--especially evidence that expert judgments are internally consistent. Given its importance, it is somewhat surprising…
Validating MMI Scores: Are We Measuring Multiple Attributes?

ERIC Educational Resources Information Center

Oliver, Tom; Hecker, Kent; Hausdorf, Peter A.; Conlon, Peter

2014-01-01

The multiple mini-interview (MMI) used in health professional schools' admission processes is reported to assess multiple non-cognitive constructs such as ethical reasoning, oral communication, or problem evaluation. Though validation studies have been performed with total MMI scores, there is a paucity of information regarding how well MMI…
Simulator validation results and proposed reporting format from flight testing a software model of a complex, high-performance airplane.

DOT National Transportation Integrated Search

2008-01-01

Computer simulations are often used in aviation studies. These simulation tools may require complex, high-fidelity aircraft models. Since many of the flight models used are third-party developed products, independent validation is desired prior to im...
Domestic violence on children: development and validation of an instrument to evaluate knowledge of health professionals

PubMed Central

Oliveira, Lanuza Borges; Soares, Fernanda Amaral; Silveira, Marise Fagundes; de Pinho, Lucinéia; Caldeira, Antônio Prates; Leite, Maísa Tavares de Souza

2016-01-01

ABSTRACT Objective: to develop and validate an instrument to evaluate the knowledge of health professionals about domestic violence on children. Method: this was a study conducted with 194 physicians, nurses and dentists. A literature review was performed for preparation of the items and identification of the dimensions. Apparent and content validation was performed using analysis of three experts and 27 professors of the pediatric health discipline. For construct validation, Cronbach's alpha was used, and the Kappa test was applied to verify reproducibility. The criterion validation was conducted using the Student's t-test. Results: the final instrument included 56 items; the Cronbach alpha was 0.734, the Kappa test showed a correlation greater than 0.6 for most items, and the Student t-test showed a statistically significant value to the level of 5% for the two selected variables: years of education and using the Family Health Strategy. Conclusion: the instrument is valid and can be used as a promising tool to develop or direct actions in public health and evaluate knowledge about domestic violence on children. PMID:27556878
Methodological Issues in Curriculum-Based Reading Assessment.

ERIC Educational Resources Information Center

Fuchs, Lynn S.; And Others

1984-01-01

Three studies involving elementary students examined methodological issues in curriculum-based reading assessment. Results indicated that (1) whereas sample duration did not affect concurrent validity, increasing duration reduced performance instability and increased performance slopes and (2) domain size was related inversely to performance slope…
Validation of the Narrowing Beam Walking Test in Lower Limb Prosthesis Users.

PubMed

Sawers, Andrew; Hafner, Brian

2018-04-11

To evaluate the content, construct, and discriminant validity of the Narrowing Beam Walking Test (NBWT), a performance-based balance test for lower limb prosthesis users. Cross-sectional study. Research laboratory and prosthetics clinic. Unilateral transtibial and transfemoral prosthesis users (N=40). Not applicable. Content validity was examined by quantifying the percentage of participants receiving maximum or minimum scores (ie, ceiling and floor effects). Convergent construct validity was examined using correlations between participants' NBWT scores and scores or times on existing clinical balance tests regularly administered to lower limb prosthesis users. Known-groups construct validity was examined by comparing NBWT scores between groups of participants with different fall histories, amputation levels, amputation etiologies, and functional levels. Discriminant validity was evaluated by analyzing the area under each test's receiver operating characteristic (ROC) curve. No minimum or maximum scores were recorded on the NBWT. NBWT scores demonstrated strong correlations (ρ=.70‒.85) with scores/times on performance-based balance tests (timed Up and Go test, Four Square Step Test, and Berg Balance Scale) and a moderate correlation (ρ=.49) with the self-report Activities-specific Balance Confidence scale. NBWT performance was significantly lower among participants with a history of falls (P=.003), transfemoral amputation (P=.011), and a lower mobility level (P<.001). The NBWT also had the largest area under the ROC curve (.81) and was the only test to exhibit an area that was statistically significantly >.50 (ie, chance). The results provide strong evidence of content, construct, and discriminant validity for the NBWT as a performance-based test of balance ability. The evidence supports its use to assess balance impairments and fall risk in unilateral transtibial and transfemoral prosthesis users. Copyright © 2018 American Congress of Rehabilitation Medicine. 
Published by Elsevier Inc. All rights reserved.
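The area under the ROC curve reported above can be read as the probability that a randomly chosen member of one group scores higher than a randomly chosen member of the other, with ties counted as one half. The following is a minimal sketch of that pairwise (Mann-Whitney) computation; the function name and scores are hypothetical, and it assumes higher scores indicate the "positive" group, so scores should be negated first when lower scores indicate the group of interest.

```python
def roc_auc(positive_scores, negative_scores):
    """AUC as the share of positive/negative pairs ranked correctly (ties = 0.5)."""
    if not positive_scores or not negative_scores:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical balance-test scores for two groups.
auc = roc_auc([0.9, 0.4], [0.5, 0.1])
print(auc)
```

An area of .50 corresponds to chance-level discrimination, which is the benchmark the abstract compares each test against.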
Development and validation of a new questionnaire for the assessment of subjective physical performance in adult patients with haemophilia--the HEP-Test-Q.

PubMed

von Mackensen, S; Czepa, D; Herbsleb, M; Hilberg, T

2010-01-01

Specific research studies for the investigation of physical performance in haemophilic patients are rare. However, such instruments are becoming increasingly important for evaluating therapeutic treatments. Within the frame of the Haemophilia & Exercise Project (HEP), a new questionnaire, namely HEP-Test-Q, has been developed for the assessment of subjective physical performance in haemophilic adults. In this article, the development and validation of the HEP-Test-Q is described. The development consisted of different phases including item collection, pilot testing and field testing. The preliminary version was pilot-tested in 24 German HEP-participants. Following evaluation and preliminary psychometric analysis, the HEP-Test-Q was revised. The final version consists of 25 items pertaining to the domains 'mobility', 'strength & coordination', 'endurance' and 'body perception', which was administered to 43 German haemophilic patients (43.8 ± 11.2 years). Psychometric analysis included reliability and validity testing. Convergent validity was tested correlating the HEP-Test-Q with SF-36, Haem-A-QoL, HAL and the Orthopaedic Joint Score. Discriminant validity tested different clinical subgroups. Patients accepted the questionnaire and found it easy to fill in. Psychometric testing revealed good values for reliability in terms of internal consistency (Cronbach's alpha = 0.96) and test-retest reliability (r = 0.90) as well as for convergent validity correlating highly with Haem-A-QoL, HAL and SF-36. Discriminant validity testing showed significant differences for age, hepatitis A and hepatitis B and the number of target joints. HEP-Test-Q is a short and well-accepted questionnaire, assessing subjective physical performance of haemophiliacs, which might be combined with objective assessments to reveal aspects that cannot be measured objectively, such as body perception.
Using Patient Feedback to Optimize the Design of a Certolizumab Pegol Electromechanical Self-Injection Device: Insights from Human Factors Studies.

PubMed

Domańska, Barbara; Stumpp, Oliver; Poon, Steven; Oray, Serkan; Mountian, Irina; Pichon, Clovis

2018-01-01

We incorporated patient feedback from human factors studies (HFS) in the patient-centric design and validation of ava®, an electromechanical device (e-Device) for self-injecting the anti-tumor necrosis factor certolizumab pegol (CZP). Healthcare professionals, caregivers, healthy volunteers, and patients with rheumatoid arthritis, psoriatic arthritis, ankylosing spondylitis, or Crohn's disease participated in 11 formative HFS to optimize the e-Device design through intended user feedback; nine studies involved simulated injections. Formative participant questionnaire feedback was collected following e-Device prototype handling. Validation HFS (one EU study and one US study) assessed the safe and effective setup and use of the e-Device using 22 predefined critical tasks. Task outcomes were categorized as "failures" if participants did not succeed within three attempts. Two hundred eighty-three participants entered formative (163) and validation (120) HFS; 260 participants performed one or more simulated e-Device self-injections. Design changes following formative HFS included alterations to buttons and the graphical user interface screen. All validation HFS participants completed critical tasks necessary for CZP dose delivery, with minimal critical task failures (12 of 572 critical tasks, 2.1%, in the EU study, and 2 of 5310 critical tasks, less than 0.1%, in the US study). CZP e-Device development was guided by intended user feedback through HFS, ensuring the final design addressed patients' needs. In both validation studies, participants successfully performed all critical tasks, demonstrating safe and effective e-Device self-injections. UCB Pharma. Plain language summary available on the journal website.
Measuring activities and participation in persons with haemophilia: A systematic review of commonly used instruments.

PubMed

Timmer, M A; Gouw, S C; Feldman, B M; Zwagemaker, A; de Kleijn, P; Pisters, M F; Schutgens, R E G; Blanchette, V; Srivastava, A; David, J A; Fischer, K; van der Net, J

2018-03-01

Monitoring clinical outcome in persons with haemophilia (PWH) is essential in order to provide optimal treatment for individual patients and compare effectiveness of treatment strategies. Experience with measurement of activities and participation in haemophilia is limited and consensus on preferred tools is lacking. The aim of this study was to give a comprehensive overview of the measurement properties of a selection of commonly used tools developed to assess activities and participation in PWH. Electronic databases were searched for articles that reported on reliability, validity or responsiveness of predetermined measurement tools (5 self-reported and 4 performance based measurement tools). Methodological quality of the studies was assessed according to the COSMIN checklist. Best evidence synthesis was used to summarize evidence on the measurement properties. The search resulted in 3453 unique hits. Forty-two articles were included. The self-reported Haemophilia Activities List (HAL), Pediatric HAL (PedHAL) and the performance based Functional Independence Score in Haemophilia (FISH) were studied most extensively. Methodological quality of the studies was limited. Measurement error, cross-cultural validity and responsiveness have been insufficiently evaluated. Albeit based on limited evidence, the measurement properties of the PedHAL, HAL and FISH are currently considered most satisfactory. Further research needs to focus on measurement error, responsiveness, interpretability and cross-cultural validity of the self-reported tools and validity of performance based tools which are able to assess limitations in sports and leisure activities. © 2018 The Authors. Haemophilia Published by John Wiley & Sons Ltd.
[Development and validation of the Family Vulnerability Index to Disability and Dependence (FVI-DD)].

PubMed

Amendola, Fernanda; Alvarenga, Márcia Regina Martins; Latorre, Maria do Rosário Dias de Oliveira; Oliveira, Maria Amélia de Campos

2014-02-01

This exploratory, descriptive, cross-sectional, and quantitative study aimed to develop and validate an index of family vulnerability to disability and dependence (FVI-DD). This study was adapted from the Family Development Index, with the addition of social and health indicators of disability and dependence. The instrument was applied to 248 families in the city of Sao Paulo, followed by exploratory factor analysis. Factor validation was performed using the concurrent and discriminant validity of the Lawton scale and Katz Index. The descriptive level adopted for the study was p < 0.05. The final vulnerability index comprised 50 questions classified into seven factors contemplating social and health dimensions, and this index exhibited good internal consistency (Cronbach's alpha = 0.82). FVI-DD was validated using both the Lawton scale and Katz Index. We conclude that FVI-DD can accurately and reliably assess family vulnerability to disability and dependence.
Further Validation of a CFD Code for Calculating the Performance of Two-Stage Light Gas Guns

NASA Technical Reports Server (NTRS)

Bogdanoff, David W.

2017-01-01

Earlier validations of a higher-order Godunov code for modeling the performance of two-stage light gas guns are reviewed. These validation comparisons were made between code predictions and experimental data from the NASA Ames 1.5" and 0.28" guns and covered muzzle velocities of 6.5 to 7.2 km/s. In the present report, five more series of code validation comparisons involving experimental data from the Ames 0.22" (1.28" pump tube diameter), 0.28", 0.50", 1.00" and 1.50" guns are presented. The total muzzle velocity range of the validation data presented herein is 3 to 11.3 km/s. The agreement between the experimental data and CFD results is judged to be very good. Muzzle velocities were predicted within 0.35 km/s for 74% of the cases studied with maximum differences being 0.5 km/s and for 4 out of 50 cases, 0.5 - 0.7 km/s.
Validating a biometric authentication system: sample size requirements.

PubMed

Dass, Sarat C; Zhu, Yongfang; Jain, Anil K

2006-12-01

Authentication systems based on biometric features (e.g., fingerprint impressions, iris scans, human face images, etc.) are increasingly gaining widespread use and popularity. Often, vendors and owners of these commercial biometric systems claim impressive performance that is estimated based on some proprietary data. In such situations, there is a need to independently validate the claimed performance levels. System performance is typically evaluated by collecting biometric templates from n different subjects, and for convenience, acquiring multiple instances of the biometric for each of the n subjects. Very little work has been done in 1) constructing confidence regions based on the ROC curve for validating the claimed performance levels and 2) determining the required number of biometric samples needed to establish confidence regions of prespecified width for the ROC curve. To simplify the analysis that addresses these two problems, several previous studies have assumed that multiple acquisitions of the biometric entity are statistically independent. This assumption is too restrictive and is generally not valid. We have developed a validation technique based on multivariate copula models for correlated biometric acquisitions. Based on the same model, we also determine the minimum number of samples required to achieve confidence bands of desired width for the ROC curve. We illustrate the estimation of the confidence bands as well as the required number of biometric samples using a fingerprint matching system that is applied on samples collected from a small population.
Validating Savings Claims of Cold Climate Zero Energy Ready Homes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Williamson, J.; Puttagunta, S.

This study was intended to validate the actual performance of three ZERHs in the Northeast against energy models created in REM/Rate v14.5 (one of the certified software programs used to generate a HERS Index) and the National Renewable Energy Laboratory’s Building Energy Optimization (BEopt™) v2.3 E+ (a more sophisticated hourly energy simulation software). This report details the validation methods used to analyze energy consumption at each home.
Sensor data validation and reconstruction. Phase 1: System architecture study

NASA Technical Reports Server (NTRS)

1991-01-01

The sensor validation and data reconstruction task reviewed relevant literature and selected applicable validation and reconstruction techniques for further study; analyzed the selected techniques and emphasized those which could be used for both validation and reconstruction; analyzed Space Shuttle Main Engine (SSME) hot fire test data to determine statistical and physical relationships between various parameters; developed statistical and empirical correlations between parameters to perform validation and reconstruction tasks, using a computer aided engineering (CAE) package; and conceptually designed an expert system based knowledge fusion tool, which allows the user to relate diverse types of information when validating sensor data. The host hardware for the system is intended to be a Sun SPARCstation, but could be any RISC workstation with a UNIX operating system and a windowing/graphics system such as Motif or Dataviews. The information fusion tool is intended to be developed using the NEXPERT Object expert system shell, and the C programming language.
Simulation-based assessment to identify critical gaps in safe anesthesia resident performance.

PubMed

Blum, Richard H; Boulet, John R; Cooper, Jeffrey B; Muret-Wagstaff, Sharon L

2014-01-01

Valid methods are needed to identify anesthesia resident performance gaps early in training. However, many assessment tools in medicine have not been properly validated. The authors designed and tested use of a behaviorally anchored scale, as part of a multiscenario simulation-based assessment system, to identify high- and low-performing residents with regard to domains of greatest concern to expert anesthesiology faculty. An expert faculty panel derived five key behavioral domains of interest by using a Delphi process: (1) Synthesizes information to formulate a clear anesthetic plan; (2) Implements a plan based on changing conditions; (3) Demonstrates effective interpersonal and communication skills with patients and staff; (4) Identifies ways to improve performance; and (5) Recognizes own limits. Seven simulation scenarios spanning pre-to-postoperative encounters were used to assess performances of 22 first-year residents and 8 fellows from two institutions. Two of 10 trained faculty raters blinded to trainee program and training level scored each performance independently by using a behaviorally anchored rating scale. Residents, fellows, facilitators, and raters completed surveys. Evidence supporting the reliability and validity of the assessment scores was procured, including a high generalizability coefficient (ρ = 0.81) and expected performance differences between first-year resident and fellow participants. A majority of trainees, facilitators, and raters judged the assessment to be useful, realistic, and representative of critical skills required for safe practice. The study provides initial evidence to support the validity of a simulation-based performance assessment system for identifying critical gaps in safe anesthesia resident performance early in training.
Need for cognition and cognitive performance from a cross-cultural perspective: examples of academic success and solving anagrams.

PubMed

Gülgöz, S

2001-01-01

The cross-cultural validity of the Need for Cognition Scale and its relationship with cognitive performance were investigated in two studies. In the first study, the relationships between the scale and university entrance scores, course grades, study skills, and social desirability were examined. Using the short form of the Turkish version of the Need for Cognition Scale (S. Gülgöz & C. J. Sadowski, 1995) no correlation with academic performance was found but there was significant correlation with a study skills scale and a social desirability scale created for this study. When regression analysis was used to predict grade point average, the Need for Cognition Scale was a significant predictor. In the second study, participants low or high in need for cognition solved multiple-solution anagrams. The instructions preceding the task set the participants' expectations regarding task difficulty. An interaction between expectation and need for cognition indicated that participants with low need for cognition performed worse when they expected difficult problems. Results of the two studies showed that need for cognition has cross-cultural validity and that its effect on cognitive performance was mediated by other variables.
Validation of the Filovirus Plaque Assay for Use in Preclinical Studies

DTIC Science & Technology

2016-09-02

filoviruses in virus stocks, prepared viral challenge inocula and samples from research animals has recently been fully characterized and standardized for...and robust for filovirus titration in samples associated with the performance of GLP animal model studies. Keywords: Plaque assay; filovirus; Ebola...ebolavirus; marburgvirus; Marburg virus; Vero E6 cells; GLP compliant; validation; animal rule DISTRIBUTION STATEMENT A: Approved for public

Validating Proposed Learning Progressions on Force and Motion Using the Force Concept Inventory: Findings from Singapore Secondary Schools

ERIC Educational Resources Information Center

Fulmer, Gavin W.

2015-01-01

This study examines the validity of 2 proposed learning progressions on the force concept when tested using items from the Force Concept Inventory (FCI). This is the first study to compare students' performance with respect to learning progressions both for force and motion and for Newton's third law in parallel. It is also among the first studies…
Assessing the Relationship Between Observed Teaching Practice and Reading Growth in First Grade English Learners: A Validation Study

ERIC Educational Resources Information Center

Baker, Scott K.; Gersten, Russell; Haager, Diane; Dingle, Mary; Goldenberg, Claude

2005-01-01

Validation of a classroom observation measure for use with English Learners (ELs) in Grade 1 is the focus of this study. Fourteen teachers were observed during reading and language arts instruction with an instrument used to generate overall ratings of instructional quality on a number of dimensions. In these classrooms, the reading performance of…
Development and Initial Validation of the NyTid Test: A Movement Assessment Tool for Compulsory School Pupils

ERIC Educational Resources Information Center

Tidén, Anna; Lundqvist, Carolina; Nyberg, Marie

2015-01-01

This study presents the development process and initial validation of the NyTid test, a process-oriented movement assessment tool for compulsory school pupils. A sample of 1,260 (627 girls and 633 boys; mean age of 14.39) Swedish school children participated in the study. In the first step, exploratory factor analyses (EFAs) were performed in…
Performance review of the ROMI-RIP rough mill simulator

Treesearch

Edward Thomas; Urs Buehlmann

2003-01-01

The USDA Forest Service's ROMI-RIP version 2.0 (RR2) rough mill rip-first simulation program was validated in a recent study. The validation study found that when RR2 was set to search for optimum yield without considering actual rough mill strip solutions, it produced yields that were as much as 7 percent higher (71.1% versus 64.0%) than the actual rough mill....
Validity and reliability of the Omron HJ-303 tri-axial accelerometer-based pedometer.

PubMed

Steeves, Jeremy A; Tyo, Brian M; Connolly, Christopher P; Gregory, Douglas A; Stark, Nyle A; Bassett, David R

2011-09-01

This study compared the validity of a new Omron HJ-303 piezoelectric pedometer and 2 other pedometers (Sportline Traq and Yamax SW200). To examine the effect of speed, 60 subjects walked on a treadmill at 2, 3, and 4 mph. Twenty subjects also ran at 6, 7, and 8 mph. To test lifestyle activities, 60 subjects performed front-back-side-side stepping, elliptical machine and stair climbing/descending. Twenty others performed ballroom dancing. To test device reliability, 60 participants completed five 100-step trials while wearing 5 different sets of the devices. Actual steps were determined using a hand tally counter. Significant differences existed among pedometers (P < .05). For walking, the Omron pedometers were the most valid. The Sportline overestimated and the Yamax underestimated steps (P < .05). Worn on the waist or in the backpack, the Omron device and Sportline were valid for running. The Omron was valid for 3 activities (elliptical machine, ascending and descending stairs). The Sportline overestimated all of these activities, and Yamax was only valid for descending stairs. The Omron and Yamax were both valid and reliable in the 100-step trials. The Omron HJ-303, worn on the waist, appeared to be the most valid of the 3 pedometers.
Validating the Use of Performance Risk Indices for System-Level Risk and Maturity Assessments

NASA Astrophysics Data System (ADS)

Holloman, Sherrica S.

With pressure on the U.S. Defense Acquisition System (DAS) to reduce cost overruns and schedule delays, system engineers' performance is only as good as their tools. Recent literature details a need for 1) objective, analytical risk quantification methodologies over traditional subjective qualitative methods, such as expert judgment, and 2) mathematically rigorous system-level maturity assessments. The Mahafza, Componation, and Tippett (2005) Technology Performance Risk Index (TPRI) ties the assessment of technical performance to the quantification of risk of unmet performance; however, it is structured for component-level data as input. This study's aim is to establish a modified TPRI with systems-level data as model input, and then validate the modified index with actual system-level data from the Department of Defense's (DoD) Major Defense Acquisition Programs (MDAPs). This work's contribution is the establishment and validation of the System-level Performance Risk Index (SPRI). With the introduction of the SPRI, system-level metrics are better aligned, allowing for better assessment, tradeoff and balance of time, performance and cost constraints. This will allow system engineers and program managers to ultimately make better-informed system-level technical decisions throughout the development phase.
Analysis of UAS DAA Alerting in Fast-Time Simulations without DAA Mitigation

NASA Technical Reports Server (NTRS)

Thipphavong, David P.; Santiago, Confesor; Isaacson, Douglas R.; Lee, Seung Man; Park, Chunki; Refai, Mohamad Said; Snow, James

2015-01-01

Realization of the expected proliferation of Unmanned Aircraft System (UAS) operations in the National Airspace System (NAS) depends on the development and validation of performance standards for UAS Detect and Avoid (DAA) Systems. The RTCA Special Committee 228 is charged with leading the development of draft Minimum Operational Performance Standards (MOPS) for UAS DAA Systems. NASA, as a participating member of RTCA SC-228 is committed to supporting the development and validation of draft requirements for DAA alerting system performance. A recent study conducted using NASA's ACES (Airspace Concept Evaluation System) simulation capability begins to address questions surrounding the development of draft MOPS for DAA alerting systems. ACES simulations were conducted to study the performance of alerting systems proposed by the SC-228 DAA Alerting sub-group. Analysis included but was not limited to: 1) correct alert (and timeliness), 2) false alert (and severity and duration), 3) missed alert, and 4) probability of an alert type at the time of loss of well clear. The performance of DAA alerting systems when using intent vs. dead-reckoning for UAS ownship trajectories was also compared. The results will be used by SC-228 to inform decisions about the surveillance standards of UAS DAA systems and future requirements development and validation efforts.
A Malay version of the Child Oral Impacts on Daily Performances (Child-OIDP) index: assessing validity and reliability.

PubMed

Yusof, Zamros Y M; Jaafar, Nasruddin

2012-06-08

The study aimed to develop and test a Malay version of the Child-OIDP index, evaluate its psychometric properties and report on the prevalence of oral impacts on eight daily performances in a sample of 11-12 year old Malaysian schoolchildren. The Child-OIDP index was translated from English into Malay. The Malay version was tested for reliability and validity on a non-random sample of 132, 11-12 year old schoolchildren from two urban schools in Kuala Lumpur. Psychometric analysis of the Malay Child-OIDP involved face, content, criterion and construct validity tests as well as internal and test-retest reliability. Non-parametric statistical methods were used to assess relationships between Child-OIDP scores and other subjective outcome measures. The standardised Cronbach's alpha was 0.80 and the weighted Kappa was 0.84 (intraclass correlation = 0.79). The index showed significant associations with different subjective measures viz. perceived satisfaction with mouth, perceived needs for dental treatment, perceived oral health status and toothache experience in the previous 3 months (p < 0.05). Two-thirds (66.7%) of the sample had oral impacts affecting one or more performances in the past 3 months. The three most frequently affected performances were cleaning teeth (36.4%), eating foods (34.8%) and maintaining emotional stability (26.5%). In terms of severity of impact, the ability to relax was most severely affected by their oral conditions, followed by ability to socialise and doing schoolwork. Almost three-quarters (74.2%) of schoolchildren with oral impacts had up to three performances affected by their oral conditions. This study indicated that the Malay Child-OIDP index is a valid and reliable instrument to measure the oral impacts of daily performances in 11-12 year old urban schoolchildren in Malaysia.
Validation of an obstetric comorbidity index in an external population.

PubMed

Metcalfe, A; Lix, L M; Johnson, J-A; Currie, G; Lyon, A W; Bernier, F; Tough, S C

2015-12-01

An obstetric comorbidity index has been developed recently with superior performance characteristics relative to general comorbidity measures in an obstetric population. This study aimed to externally validate this index and to examine the impact of including hospitalisation/delivery records only when estimating comorbidity prevalence and discriminative performance of the obstetric comorbidity index. Validation study. Alberta, Canada. Pregnant women who delivered a live or stillborn infant in hospital (n = 5995). Administrative databases were linked to create a population-based cohort. Comorbid conditions were identified from diagnoses for the delivery hospitalisation, all hospitalisations and all healthcare contacts (i.e. hospitalisations, emergency room visits and physician visits) that occurred during pregnancy and 3 months pre-conception. Logistic regression was used to test the discriminative performance of the comorbidity index. Maternal end-organ damage and extended length of stay for delivery. Although prevalence estimates for comorbid conditions were consistently lower in delivery records and hospitalisation data than in data for all healthcare contacts, the discriminative performance of the comorbidity index was constant for maternal end-organ damage [all healthcare contacts area under the receiver operating characteristic curve (AUC) = 0.70; hospitalisation data AUC = 0.67; delivery data AUC = 0.65] and extended length of stay for delivery (all healthcare contacts AUC = 0.60; hospitalisation data AUC = 0.58; delivery data AUC = 0.58). The obstetric comorbidity index shows similar performance characteristics in an external population and is a valid measure of comorbidity in an obstetric population. Furthermore, the discriminative performance of the comorbidity index was similar for comorbidities ascertained at the time of delivery, in hospitalisation data or through all healthcare contacts. © 2015 The Authors. 
BJOG An International Journal of Obstetrics and Gynaecology published by John Wiley & Sons Ltd on behalf of Royal College of Obstetricians and Gynaecologists.
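The AUC values quoted in this record can be read as the probability that a randomly chosen woman with the outcome receives a higher index score than a randomly chosen woman without it. The sketch below computes that quantity directly from pairs; the scores and labels are hypothetical illustration data, not values from the study, and the function name `auc` is an assumption introduced here.

```python
def auc(scores, labels):
    """Rank-based AUC: share of (positive, negative) pairs in which the
    positive case has the higher score; tied pairs count as half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical comorbidity index scores and outcome flags (1 = outcome occurred).
scores = [0, 1, 1, 2, 3, 0, 2, 4]
labels = [0, 0, 1, 0, 1, 0, 1, 1]
print(auc(scores, labels))
```

An AUC of 0.5 corresponds to chance-level discrimination, which puts the reported values of 0.58 to 0.70 in context.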
Development and psychometric evaluation of the nursing instructors' clinical teaching performance inventory.

PubMed

A Farahani, Mansoureh; Emamzadeh Ghasemi, Hormat Sadat; Nikpaima, Nasrin; Fereidooni, Zhila; Rasoli, Maryam

2014-10-29

Evaluation of nursing instructors' clinical teaching performance is a prerequisite to the quality assurance of nursing education. One of the most common procedures for this purpose is using student evaluations. This study aimed to develop and evaluate the psychometric properties of the Nursing Instructors' Clinical Teaching Performance Inventory (NICTPI). The primary items of the inventory were generated by reviewing the published literature and the existing questionnaires as well as consulting with the members of the Faculties Evaluation Committee of the study setting. Psychometric properties were assessed by calculating the inventory's content validity ratio and index and its test-retest correlation coefficient, as well as by conducting an exploratory factor analysis and an internal consistency assessment. The content validity ratios and indices of the items were higher than 0.85 and 0.79, respectively. The final version of the inventory consisted of 25 items, and in the exploratory factor analysis, items loaded on three factors which jointly accounted for 72.85% of the total variance. The test-retest correlation coefficient and the Cronbach's alpha of the inventory were 0.93 and 0.973, respectively. The results revealed that the developed inventory is an appropriate, valid, and reliable instrument for evaluating nursing instructors' clinical teaching performance.
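Two of the statistics named in this record have short closed forms. The sketch below assumes Lawshe's formula for the content validity ratio and the standard variance-based formula for Cronbach's alpha; the abstract does not say which variants the authors used, and the ratings are hypothetical.

```python
from statistics import variance


def content_validity_ratio(n_essential, n_experts):
    """Lawshe's CVR: (n_e - N/2) / (N/2), ranging from -1 to 1."""
    half = n_experts / 2
    return (n_essential - half) / half


def cronbach_alpha(rows):
    """rows: one list of item scores per respondent (all the same length).
    alpha = k/(k-1) * (1 - sum of item variances / variance of total score)."""
    k = len(rows[0])
    item_vars = [variance([r[j] for r in rows]) for j in range(k)]
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical: 9 of 10 experts rate an item as essential.
print(content_validity_ratio(9, 10))

# Hypothetical 3-item ratings from 4 students; the items move in lockstep,
# so internal consistency is at its maximum.
ratings = [[1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5, 6]]
print(cronbach_alpha(ratings))
```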
A Case Study on a Combination NDVI Forecasting Model Based on the Entropy Weight Method

DOE Office of Scientific and Technical Information (OSTI.GOV)

Huang, Shengzhi; Ming, Bo; Huang, Qiang

Accurate prediction of NDVI (Normalized Difference Vegetation Index) is important because it helps guide regional ecological remediation and environmental management. In this study, a combination forecasting model (CFM) was proposed to improve the performance of NDVI predictions in the Yellow River Basin (YRB) based on three individual forecasting models, i.e., the Multiple Linear Regression (MLR), Artificial Neural Network (ANN), and Support Vector Machine (SVM) models. The entropy weight method was employed to determine the weight coefficient for each individual model depending on its predictive performance. Results showed that: (1) ANN exhibits the highest fitting capability among the four forecasting models in the calibration period, whilst its generalization ability becomes weak in the validation period; MLR has a poor performance in both calibration and validation periods; the predicted results of CFM in the calibration period have the highest stability; (2) CFM generally outperforms all individual models in the validation period, and can improve the reliability and stability of predicted results through combining the strengths while reducing the weaknesses of individual models; (3) the performances of all forecasting models are better in dense vegetation areas than in sparse vegetation areas.
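The abstract does not give the exact entropy weight formulation used. As a rough illustration, the sketch below follows one formulation that appears in the combination-forecasting literature: each model's relative errors are normalised into shares, the entropy of those shares is computed, and models whose errors vary more across periods receive less weight. The function names and the error values are assumptions made for this example only.

```python
import math


def entropy_weights(errors):
    """errors: dict of model name -> list of non-negative relative errors,
    one per period. Assumes at least two models, equal-length lists, and a
    nonzero error total for each model."""
    names = list(errors)
    m = len(names)
    n = len(errors[names[0]])
    variation = {}
    for name in names:
        total = sum(errors[name])
        shares = [e / total for e in errors[name]]
        # Normalised entropy in [0, 1]; zero shares contribute nothing.
        entropy = -sum(p * math.log(p) for p in shares if p > 0) / math.log(n)
        variation[name] = 1.0 - entropy
    total_variation = sum(variation.values())
    if total_variation <= 1e-12:
        # All models have evenly spread errors: fall back to equal weights.
        return {name: 1.0 / m for name in names}
    return {
        name: (1.0 - variation[name] / total_variation) / (m - 1)
        for name in names
    }


def combine(forecasts, weights):
    """Weighted sum of the individual forecasts for a single period."""
    return sum(weights[name] * value for name, value in forecasts.items())


# Hypothetical relative errors over four calibration periods.
errors = {
    "MLR": [0.30, 0.05, 0.25, 0.02],
    "ANN": [0.10, 0.12, 0.09, 0.11],
    "SVM": [0.20, 0.05, 0.15, 0.08],
}
weights = entropy_weights(errors)
print(weights)
print(combine({"MLR": 0.42, "ANN": 0.47, "SVM": 0.45}, weights))
```

Under this formulation the weights sum to one, so the combined forecast always lies between the smallest and largest individual forecasts.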
External validation of the Intensive Care National Audit & Research Centre (ICNARC) risk prediction model in critical care units in Scotland.

PubMed

Harrison, David A; Lone, Nazir I; Haddow, Catriona; MacGillivray, Moranne; Khan, Angela; Cook, Brian; Rowan, Kathryn M

2014-01-01

Risk prediction models are used in critical care for risk stratification, summarising and communicating risk, supporting clinical decision-making and benchmarking performance. However, they require validation before they can be used with confidence, ideally using independently collected data from a different source to that used to develop the model. The aim of this study was to validate the Intensive Care National Audit & Research Centre (ICNARC) model using independently collected data from critical care units in Scotland. Data were extracted from the Scottish Intensive Care Society Audit Group (SICSAG) database for the years 2007 to 2009. Recoding and mapping of variables was performed, as required, to apply the ICNARC model (2009 recalibration) to the SICSAG data using standard computer algorithms. The performance of the ICNARC model was assessed for discrimination, calibration and overall fit and compared with that of the Acute Physiology And Chronic Health Evaluation (APACHE) II model. There were 29,626 admissions to 24 adult, general critical care units in Scotland between 1 January 2007 and 31 December 2009. After exclusions, 23,269 admissions were included in the analysis. The ICNARC model outperformed APACHE II on measures of discrimination (c index 0.848 versus 0.806), calibration (Hosmer-Lemeshow chi-squared statistic 18.8 versus 214) and overall fit (Brier's score 0.140 versus 0.157; Shapiro's R 0.652 versus 0.621). Model performance was consistent across the three years studied. The ICNARC model performed well when validated in an external population to that in which it was developed, using independently collected data.
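Two of the measures reported here, the Brier score and the Hosmer-Lemeshow chi-squared statistic, can be sketched in a few lines. The version below uses equal-sized risk groups and hypothetical predictions; it is a simplified illustration under those assumptions, not the study's analysis code, and it assumes no group has an expected event count of zero or of the full group size.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between predicted risk and the 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def hosmer_lemeshow(probs, outcomes, groups=10):
    """Sum over risk groups of (O - E)^2 / (E * (1 - E / n_g))."""
    pairs = sorted(zip(probs, outcomes), key=lambda t: t[0])
    n = len(pairs)
    size = n // groups
    stat = 0.0
    for g in range(groups):
        start = g * size
        # The last group absorbs any remainder.
        end = n if g == groups - 1 else start + size
        chunk = pairs[start:end]
        expected = sum(p for p, _ in chunk)
        observed = sum(y for _, y in chunk)
        stat += (observed - expected) ** 2 / (expected * (1 - expected / len(chunk)))
    return stat


# Hypothetical, well-calibrated predictions: 10% risk with 1 death in 10,
# and 90% risk with 9 deaths in 10.
probs = [0.1] * 10 + [0.9] * 10
outcomes = [1] + [0] * 9 + [1] * 9 + [0]
print(brier_score(probs, outcomes))
print(hosmer_lemeshow(probs, outcomes, groups=2))
```

Lower values indicate better fit on both measures, which is the direction of the ICNARC versus APACHE II comparison above.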
Environmental fate model for ultra-low-volume insecticide applications used for adult mosquito management

USGS Publications Warehouse

Schleier, Jerome J.; Peterson, Robert K.D.; Irvine, Kathryn M.; Marshall, Lucy M.; Weaver, David K.; Preftakes, Collin J.

2012-01-01

One of the more effective ways of managing high densities of adult mosquitoes that vector human and animal pathogens is ultra-low-volume (ULV) aerosol applications of insecticides. The U.S. Environmental Protection Agency uses models that are not validated for ULV insecticide applications and exposure assumptions to perform their human and ecological risk assessments. Currently, there is no validated model that can accurately predict deposition of insecticides applied using ULV technology for adult mosquito management. In addition, little is known about the deposition and drift of small droplets like those used under conditions encountered during ULV applications. The objective of this study was to perform field studies to measure environmental concentrations of insecticides and to develop a validated model to predict the deposition of ULV insecticides. The final regression model was selected by minimizing the Bayesian Information Criterion and its prediction performance was evaluated using k-fold cross validation. Density of the formulation and the density and CMD interaction coefficients were the largest in the model. The results showed that as density of the formulation decreases, deposition increases. The interaction of density and CMD showed that higher density formulations and larger droplets resulted in greater deposition. These results are supported by the aerosol physics literature. A k-fold cross validation demonstrated that the mean square error of the selected regression model is not biased, and the mean square error and mean square prediction error indicated good predictive ability.
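The k-fold cross validation mentioned in this record estimates prediction error by refitting the model with each fold held out in turn. The sketch below does this for a one-predictor least-squares line with interleaved folds; the real model had several predictors and interaction terms, and the data here are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def kfold_mspe(xs, ys, k):
    """Mean square prediction error from k-fold cross validation.
    Fold f holds out observations f, f + k, f + 2k, ... (no shuffling)."""
    n = len(xs)
    squared_errors = []
    for fold in range(k):
        held_out = set(range(fold, n, k))
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        slope, intercept = fit_line(train_x, train_y)
        for i in held_out:
            squared_errors.append((ys[i] - (slope * xs[i] + intercept)) ** 2)
    return sum(squared_errors) / n


# Hypothetical noise-free data: held-out points are predicted almost exactly.
xs = [float(i) for i in range(10)]
ys = [2.0 * x + 1.0 for x in xs]
print(kfold_mspe(xs, ys, k=5))
```

Comparing this held-out error with the in-sample mean square error is one way to check whether the in-sample figure is optimistic.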
Comparison and validation of injury risk classifiers for advanced automated crash notification systems.

PubMed

Kusano, Kristofer; Gabler, Hampton C

2014-01-01

The odds of death for a seriously injured crash victim are drastically reduced if he or she receives care at a trauma center. Advanced automated crash notification (AACN) algorithms are postcrash safety systems that use data measured by the vehicles during the crash to predict the likelihood of occupants being seriously injured. The accuracy of these models is crucial to the success of an AACN. The objective of this study was to compare the predictive performance of competing injury risk models and algorithms: logistic regression, random forest, AdaBoost, naïve Bayes, support vector machine, and classification k-nearest neighbors. This study compared machine learning algorithms to the widely adopted logistic regression modeling approach. Machine learning algorithms have not been commonly studied in the motor vehicle injury literature. Machine learning algorithms may have higher predictive power than logistic regression, despite the drawback of lacking the ability to perform statistical inference. To evaluate the performance of these algorithms, data on 16,398 vehicles involved in non-rollover collisions were extracted from the NASS-CDS. Vehicles with any occupants having an Injury Severity Score (ISS) of 15 or greater were defined as those requiring victims to be treated at a trauma center. The performance of each model was evaluated using cross-validation. Cross-validation assesses how a model will perform in the future given new data not used for model training. The crash ΔV (change in velocity during the crash), damage side (struck side of the vehicle), seat belt use, vehicle body type, number of events, occupant age, and occupant sex were used as predictors in each model. Logistic regression slightly outperformed the machine learning algorithms based on sensitivity and specificity of the models.
Previous studies on AACN risk curves used the same data to train and test the power of the models and as a result had higher sensitivity compared to the cross-validated results from this study. Future studies should account for future data, for example by using cross-validation, or risk presenting optimistic predictions of field performance. Past algorithms have been criticized for relying on age and sex, which are difficult to measure with vehicle sensors, and for inaccuracies in classifying damage side. The models with accurate damage side and including age/sex did outperform models with less accurate damage side and without age/sex, but the differences were small, suggesting that the success of AACN is not reliant on these predictors.
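The optimism that arises when the same data are used to train and test a model can be shown with a deliberately extreme case. The sketch below uses a 1-nearest-neighbour classifier on hypothetical one-dimensional data (a stand-in for crash ΔV and a serious-injury flag): tested on its own training data it is always right, while leave-one-out cross-validation gives a much lower figure. This is an illustration of the general point, not a reproduction of the study's models.

```python
def nearest_label(x, train):
    """Label of the training point closest to x (1-nearest neighbour)."""
    return min(train, key=lambda point: abs(point[0] - x))[1]


def resubstitution_accuracy(data):
    """Train and test on the same data: each point is its own neighbour."""
    hits = sum(nearest_label(x, data) == y for x, y in data)
    return hits / len(data)


def leave_one_out_accuracy(data):
    """Hold each point out in turn and predict it from the remaining points."""
    hits = 0
    for i, (x, y) in enumerate(data):
        train = data[:i] + data[i + 1:]
        hits += nearest_label(x, train) == y
    return hits / len(data)


# Hypothetical (predictor, serious injury) pairs with noisy labels.
data = [(1.0, 0), (2.1, 0), (3.3, 1), (4.6, 0), (6.0, 1), (7.5, 1)]
print(resubstitution_accuracy(data))
print(leave_one_out_accuracy(data))
```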
Validity threats: overcoming interference with proposed interpretations of assessment data.

PubMed

Downing, Steven M; Haladyna, Thomas M

2004-03-01

Factors that interfere with the ability to interpret assessment scores or ratings in the proposed manner threaten validity. To be interpreted in a meaningful manner, all assessments in medical education require sound, scientific evidence of validity. The purpose of this essay is to discuss 2 major threats to validity: construct under-representation (CU) and construct-irrelevant variance (CIV). Examples of each type of threat for written, performance and clinical performance examinations are provided. The CU threat to validity refers to undersampling the content domain. Using too few items, cases or clinical performance observations to adequately generalise to the domain represents CU. Variables that systematically (rather than randomly) interfere with the ability to meaningfully interpret scores or ratings represent CIV. Issues such as flawed test items written at inappropriate reading levels or statistically biased questions represent CIV in written tests. For performance examinations, such as standardised patient examinations, flawed cases or cases that are too difficult for student ability contribute CIV to the assessment. For clinical performance data, systematic rater error, such as halo or central tendency error, represents CIV. The term face validity is rejected as representative of any type of legitimate validity evidence, although the fact that the appearance of the assessment may be an important characteristic other than validity is acknowledged. There are multiple threats to validity in all types of assessment in medical education. Methods to eliminate or control validity threats are suggested.
Prediction models for intracranial hemorrhage or major bleeding in patients on antiplatelet therapy: a systematic review and external validation study.

PubMed

Hilkens, N A; Algra, A; Greving, J P

2016-01-01

ESSENTIALS: Prediction models may help to identify patients at high risk of bleeding on antiplatelet therapy. We identified existing prediction models for bleeding and validated them in patients with cerebral ischemia. Five prediction models were identified, all of which had some methodological shortcomings. Performance in patients with cerebral ischemia was poor. Background Antiplatelet therapy is widely used in secondary prevention after a transient ischemic attack (TIA) or ischemic stroke. Bleeding is the main adverse effect of antiplatelet therapy and is potentially life threatening. Identification of patients at increased risk of bleeding may help target antiplatelet therapy. This study sought to identify existing prediction models for intracranial hemorrhage or major bleeding in patients on antiplatelet therapy and evaluate their performance in patients with cerebral ischemia. We systematically searched PubMed and Embase for existing prediction models up to December 2014. The methodological quality of the included studies was assessed with the CHARMS checklist. Prediction models were externally validated in the European Stroke Prevention Study 2, comprising 6602 patients with a TIA or ischemic stroke. We assessed discrimination and calibration of included prediction models. Five prediction models were identified, of which two were developed in patients with previous cerebral ischemia. Three studies assessed major bleeding, one studied intracerebral hemorrhage and one gastrointestinal bleeding. None of the studies met all criteria of good quality. External validation showed poor discriminative performance, with c-statistics ranging from 0.53 to 0.64 and poor calibration. A limited number of prediction models is available that predict intracranial hemorrhage or major bleeding in patients on antiplatelet therapy. The methodological quality of the models varied, but was generally low. Predictive performance in patients with cerebral ischemia was poor. 
In order to reliably predict the risk of bleeding in patients with cerebral ischemia, development of a prediction model according to current methodological standards is needed. © 2015 International Society on Thrombosis and Haemostasis.
Validation of an Alzheimer’s disease assessment battery in Asian participants with mild to moderate Alzheimer’s disease

PubMed Central

Shen, Joan HQ; Shen, Qi; Yu, Holly; Lai, Jin-Shei; Beaumont, Jennifer L; Zhang, Zhenxin; Wang, Huali; Kim, Seong Yoon; Chen, Christopher; Kwok, Timothy; Wang, Shuu-Jiun; Lee, Dong Young; Harrison, John; Cummings, Jeffrey

2014-01-01

There is a lack of validated tools for assessing Alzheimer’s disease (AD) across Asia. This study evaluates the psychometric properties of the Alzheimer’s Disease Assessment Scale-Cognitive Subscale (ADAS-Cog), Disability Assessment for Dementia (DAD), and Neuropsychological Test Battery (NTB) in Asian participants. Participants with mild to moderate AD (n=251) and healthy controls (n=51) from Mainland China, Taiwan, Singapore, Hong Kong, and South Korea completed selected instruments at several time points. Test-retest reliability was better than 0.70 for all tests. AD participants performed significantly more poorly than controls on every score. Within the AD group, greater disease severity corresponded to significantly poorer performance. The AD group test performance worsened over time and there was a trend for worse performance in AD compared to healthy controls over time. The ADAS-Cog, DAD, and NTB are reliable, valid, and responsive measures in this population and could be used for clinical trials across Asian countries/regions. PMID:25628967
Psychometric properties of virtual reality vignette performance measures: a novel approach for assessing adolescents' social competency skills.

PubMed

Paschall, Mallie J; Fishbein, Diana H; Hubal, Robert C; Eldreth, Diana

2005-02-01

This study examined the psychometric properties of performance measures for three novel, interactive virtual reality vignette exercises developed to assess social competency skills of at-risk adolescents. Performance data were collected from 117 African-American male 15-17 year olds. Data for 18 performance measures were obtained, based on adolescents' interaction with a provocative virtual teenage character. Twelve of the 18 performance measures loaded on two factors corresponding to emotional control and interpersonal communication skills, providing support for their factorial validity. The internal reliability coefficients for the two multi-item measures were 0.88 and 0.91, respectively. Additional analyses with established measures of three psychosocial factors (beliefs supporting aggression, aggressive conflict-resolution style and hostility) and behavioral criteria (e.g., self-reported behavioral misconduct and drug use) provided limited support for the construct and criterion-related validity of the performance measures. Study findings suggest that the virtual reality vignette exercises may represent a promising approach for assessing adolescents' social competency skills.
Validation of asthma recording in electronic health records: a systematic review

PubMed Central

Nissen, Francis; Quint, Jennifer K; Wilkinson, Samantha; Mullerova, Hana; Smeeth, Liam; Douglas, Ian J

2017-01-01

Objective To describe the methods used to validate asthma diagnoses in electronic health records and summarize the results of the validation studies. Background Electronic health records are increasingly being used for research on asthma to inform health services and health policy. Validation of the recording of asthma diagnoses in electronic health records is essential to use these databases for credible epidemiological asthma research. Methods We searched EMBASE and MEDLINE databases for studies that validated asthma diagnoses detected in electronic health records up to October 2016. Two reviewers independently assessed the full text against the predetermined inclusion criteria. Key data including author, year, data source, case definitions, reference standard, and validation statistics (including sensitivity, specificity, positive predictive value [PPV], and negative predictive value [NPV]) were summarized in two tables. Results Thirteen studies met the inclusion criteria. Most studies demonstrated a high validity using at least one case definition (PPV >80%). Ten studies used a manual validation as the reference standard; each had at least one case definition with a PPV of at least 63%, up to 100%. We also found two studies using a second independent database to validate asthma diagnoses. The PPVs of the best performing case definitions ranged from 46% to 58%. We found one study which used a questionnaire as the reference standard to validate a database case definition; the PPV of the case definition algorithm in this study was 89%. Conclusion Attaining high PPVs (>80%) is possible using each of the discussed validation methods. Identifying asthma cases in electronic health records is possible with high sensitivity, specificity or PPV, by combining multiple data sources, or by focusing on specific test measures. 
Studies testing a range of case definitions show wide variation in the validity of each definition, suggesting this may be important for obtaining asthma definitions with optimal validity. PMID:29238227
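The validation statistics summarised in this review all derive from a two-by-two table that compares a database case definition with the reference standard. A minimal sketch, using hypothetical counts and an illustrative function name:

```python
def validation_statistics(tp, fp, fn, tn):
    """tp/fp/fn/tn: true positives, false positives, false negatives and
    true negatives of a case definition against the reference standard."""
    return {
        "sensitivity": tp / (tp + fn),  # true cases that the definition finds
        "specificity": tn / (tn + fp),  # non-cases correctly left out
        "ppv": tp / (tp + fp),          # flagged records that are true cases
        "npv": tn / (tn + fn),          # unflagged records that are non-cases
    }


# Hypothetical counts for one asthma case definition.
stats = validation_statistics(tp=80, fp=20, fn=10, tn=890)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

Unlike sensitivity and specificity, PPV depends on how common asthma is in the validated sample, which is one reason PPVs from different databases are hard to compare directly.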
Validation of Sea levels from coastal altimetry waveform retracking expert system: a case study around the Prince William Sound in Alaska

NASA Astrophysics Data System (ADS)

Idris, N. H.; Deng, X.; Idris, N. H.

2017-05-01

This paper presents the validation of the Coastal Altimetry Waveform Retracking Expert System (CAWRES), a novel method to optimize the Jason satellite altimetric sea levels from multiple retracking solutions. The validation is conducted over the region of Prince William Sound in Alaska, USA, where altimetric waveforms are perturbed by emerged land and sea states. Validation is twofold: first, comparison with existing retrackers (i.e. MLE4 and Ice) from the Sensor Geophysical Data Records (SGDR), and second, comparison with in-situ tide gauge data. From the first validation assessment, in general, CAWRES outperforms the MLE4 and Ice retrackers. In 4 out of 6 cases, the value of improvement percentage (standard deviation of difference) is higher (lower) than those of the SGDR retrackers. CAWRES also presents the best performance in producing valid observations, and has the lowest noise when compared to the SGDR retrackers. From the second assessment with tide gauge, CAWRES retracked sea level anomalies (SLAs) are consistent with those of the tide gauge. The accuracy of CAWRES retracked SLAs is slightly better than that of MLE4. However, the performance of the Ice retracker is better than that of CAWRES and MLE4, suggesting the empirical-based retracker is more effective. The results demonstrate that CAWRES has the potential to be applied to coastal regions elsewhere.

The English and Chinese versions of the five-level EuroQoL Group's five-dimension questionnaire (EQ-5D) were valid and reliable and provided comparable scores in Asian breast cancer patients.

PubMed

Lee, Chun Fan; Ng, Raymond; Luo, Nan; Wong, Nan Soon; Yap, Yoon Sim; Lo, Soo Kien; Chia, Whay Kuang; Yee, Alethea; Krishna, Lalit; Wong, Celest; Goh, Cynthia; Cheung, Yin Bun

2013-01-01

To examine the measurement properties of and comparability between the English and Chinese versions of the five-level EuroQoL Group's five-dimension questionnaire (EQ-5D) in breast cancer patients in Singapore. This is an observational study of 269 patients. Known-group validity and responsiveness of the EQ-5D utility index and visual analog scale (VAS) were assessed in relation to various clinical characteristics and longitudinal change in performance status, respectively. Convergent and divergent validity was examined by correlation coefficients between the EQ-5D and a breast cancer-specific instrument. Test-retest reliability was evaluated. The two language versions were compared by multiple regression analyses. For both English and Chinese versions, the EQ-5D utility index and VAS demonstrated known-group validity and convergent and divergent validity, and presented sufficient test-retest reliability (intraclass correlation = 0.72 to 0.83). The English version was responsive to changes in performance status. The Chinese version was responsive to decline in performance status, but there was no conclusive evidence about its responsiveness to improvement in performance status. In the comparison analyses of the utility index and VAS between the two language versions, borderline results were obtained, and equivalence cannot be definitely confirmed. The five-level EQ-5D is valid, responsive, and reliable in assessing health outcome of breast cancer patients. The English and Chinese versions provide comparable measurement results.
Cross-cultural adaptation and psychometric testing of the Quality of Dying and Death Questionnaire for the Spanish population.

PubMed

Gutiérrez Sánchez, Daniel; Cuesta-Vargas, Antonio I

2018-04-01

Many measurements have been developed to assess the quality of death (QoD). Among these, the Quality of Dying and Death Questionnaire (QODD) is the most widely studied and best validated. Informal carers and health professionals who care for the patient during their last days of life can complete this assessment tool. The aim of the study is to carry out a cross-cultural adaptation and a psychometric analysis of the QODD for the Spanish population. The translation was performed using a double forward and backward method. An expert panel evaluated the content validity. The questionnaire was tested in a sample of 72 Spanish-speaking adult carers of deceased cancer patients. A psychometric analysis was performed to evaluate internal consistency, divergent criterion-related validity with the Mini-Suffering State Examination (MSSE) and concurrent criterion-related validity with the Palliative Outcome Scale (POS). Some items were deleted and modified to create the Spanish version of the QODD (QODD-ESP-26). The instrument was readable and acceptable. The content validity index was 0.96, suggesting that all items are relevant for the measure of the QoD. This questionnaire showed high internal consistency (Cronbach's α coefficient = 0.88). Divergent validity with MSSE (r = -0.64) and convergent validity with POS (r = -0.61) were also demonstrated. The QODD-ESP-26 is a valid and reliable instrument for the assessment of the QoD of deceased cancer patients that can be used in a clinical and research setting. Copyright © 2018 Elsevier Ltd. All rights reserved.
Development of a specific anaerobic field test for aerobic gymnastics.

PubMed

Alves, Christiano Robles Rodrigues; Borelli, Marcello Tadeu Caetano; Paineli, Vitor de Salles; Azevedo, Rafael de Almeida; Borelli, Claudia Cristine Gomes; Lancha Junior, Antônio Herbert; Gualano, Bruno; Artioli, Guilherme Giannini

2015-01-01

The current investigation aimed to develop a valid specific field test to evaluate anaerobic physical performance in Aerobic Gymnastics athletes. We first designed the Specific Aerobic Gymnast Anaerobic Test (SAGAT), which included gymnastics-specific elements performed in maximal repeated sprint fashion, with a total duration of 80-90 s. In order to validate the SAGAT, three independent sub-studies were performed to evaluate the concurrent validity (Study I, n=8), the reliability (Study II, n=10) and the sensitivity (Study III, n=30) of the test in elite female athletes. In Study I, a positive correlation was shown between lower-body Wingate test and SAGAT performance (Mean power: p = 0.03, r = -0.69, CI: -0.94 to 0.03 and Peak power: p = 0.02, r = -0.72, CI: -0.95 to -0.04) and between upper-body Wingate test and SAGAT performance (Mean power: p = 0.03, r = -0.67, CI: -0.94 to 0.02 and Peak power: p = 0.03, r = -0.69, CI: -0.94 to 0.03). Additionally, plasma lactate was similarly increased in response to SAGAT (p = 0.002), lower-body Wingate Test (p = 0.021) and a simulated competition (p = 0.007). In Study II, no differences were found between the time to complete the SAGAT in repeated trials (p = 0.84; Cohen's d effect size = 0.09; ICC = 0.97, CI: 0.89 to 0.99; MDC95 = 0.12 s). Finally, in Study III the time to complete the SAGAT was significantly lower during the competition cycle when compared to the period before the preparatory cycle (p < 0.001), showing an improvement in SAGAT performance after a specific Aerobic Gymnastics training period. Taken together, these data have demonstrated that SAGAT is a specific, reliable and sensitive measurement of specific anaerobic performance in elite female Aerobic Gymnastics, presenting great potential to be largely applied in training settings.
Development of a Specific Anaerobic Field Test for Aerobic Gymnastics

PubMed Central

Paineli, Vitor de Salles; Azevedo, Rafael de Almeida; Borelli, Claudia Cristine Gomes; Lancha Junior, Antônio Herbert; Gualano, Bruno; Artioli, Guilherme Giannini

2015-01-01

The current investigation aimed to develop a valid specific field test to evaluate anaerobic physical performance in Aerobic Gymnastics athletes. We first designed the Specific Aerobic Gymnast Anaerobic Test (SAGAT), which included gymnastics-specific elements performed in maximal repeated sprint fashion, with a total duration of 80-90 s. In order to validate the SAGAT, three independent sub-studies were performed to evaluate the concurrent validity (Study I, n=8), the reliability (Study II, n=10) and the sensitivity (Study III, n=30) of the test in elite female athletes. In Study I, a positive correlation was shown between lower-body Wingate test and SAGAT performance (Mean power: p = 0.03, r = -0.69, CI: -0.94 to 0.03 and Peak power: p = 0.02, r = -0.72, CI: -0.95 to -0.04) and between upper-body Wingate test and SAGAT performance (Mean power: p = 0.03, r = -0.67, CI: -0.94 to 0.02 and Peak power: p = 0.03, r = -0.69, CI: -0.94 to 0.03). Additionally, plasma lactate was similarly increased in response to SAGAT (p = 0.002), lower-body Wingate Test (p = 0.021) and a simulated competition (p = 0.007). In Study II, no differences were found between the time to complete the SAGAT in repeated trials (p = 0.84; Cohen’s d effect size = 0.09; ICC = 0.97, CI: 0.89 to 0.99; MDC95 = 0.12 s). Finally, in Study III the time to complete the SAGAT was significantly lower during the competition cycle when compared to the period before the preparatory cycle (p < 0.001), showing an improvement in SAGAT performance after a specific Aerobic Gymnastics training period. Taken together, these data have demonstrated that SAGAT is a specific, reliable and sensitive measurement of specific anaerobic performance in elite female Aerobic Gymnastics, presenting great potential to be largely applied in training settings. PMID:25876039
Validity of linear encoder measurement of sit-to-stand performance power in older people.

PubMed

Lindemann, U; Farahmand, P; Klenk, J; Blatzonis, K; Becker, C

2015-09-01

To investigate construct validity of linear encoder measurement of sit-to-stand performance power in older people by showing associations with relevant functional performance and physiological parameters. Cross-sectional study. Movement laboratory of a geriatric rehabilitation clinic. Eighty-eight community-dwelling, cognitively unimpaired older women (mean age 78 years). Sit-to-stand performance power and leg power were assessed using a linear encoder and the Nottingham Power Rig, respectively. Gait speed was measured on an instrumented walkway. Maximum quadriceps and hand grip strength were assessed using dynamometers. Mid-thigh muscle cross-sectional area of both legs was measured using magnetic resonance imaging. Associations of sit-to-stand performance power with power assessed by the Nottingham Power Rig, maximum gait speed and muscle cross-sectional area were r=0.646, r=0.536 and r=0.514, respectively. A linear regression model explained 50% of the variance in sit-to-stand performance power including muscle cross-sectional area (p=0.001), maximum gait speed (p=0.002), and power assessed by the Nottingham Power Rig (p=0.006). Construct validity of linear encoder measurement of sit-to-stand power was shown at functional level and morphological level for older women. This measure could be used in routine clinical practice as well as in large-scale studies. DRKS00003622. Copyright © 2015 Chartered Society of Physiotherapy. Published by Elsevier Ltd. All rights reserved.
The Comprehensive Care Project: Measuring Physician Performance in Ambulatory Practice

PubMed Central

Holmboe, Eric S; Weng, Weifeng; Arnold, Gerald K; Kaplan, Sherrie H; Normand, Sharon-Lise; Greenfield, Sheldon; Hood, Sarah; Lipner, Rebecca S

2010-01-01

Objective To investigate the feasibility, reliability, and validity of comprehensively assessing physician-level performance in ambulatory practice. Data Sources/Study Setting Ambulatory-based general internists in 13 states participated in the assessment. Study Design We assessed physician-level performance, adjusted for patient factors, on 46 individual measures, an overall composite measure, and composite measures for chronic, acute, and preventive care. Between- versus within-physician variation was quantified by intraclass correlation coefficients (ICC). External validity was assessed by correlating performance on a certification exam. Data Collection/Extraction Methods Medical records for 236 physicians were audited for seven chronic and four acute care conditions, and six age- and gender-appropriate preventive services. Principal Findings Performance on the individual and composite measures varied substantially within (range 5–86 percent compliance on 46 measures) and between physicians (ICC range 0.12–0.88). Reliabilities for the composite measures were robust: 0.88 for chronic care and 0.87 for preventive services. Higher certification exam scores were associated with better performance on the overall (r = 0.19; p <.01), chronic care (r = 0.14, p = .04), and preventive services composites (r = 0.17, p = .01). Conclusions Our results suggest that reliable and valid comprehensive assessment of the quality of chronic and preventive care can be achieved by creating composite measures and by sampling feasible numbers of patients for each condition. PMID:20819110
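The intraclass correlation coefficient used here expresses how much of the variation in a measure lies between physicians rather than within them. The sketch below computes a one-way random-effects ICC for a balanced design from hypothetical scores. The study adjusted for patient factors, so its estimates would have come from a more elaborate model; this is only the unadjusted textbook form.

```python
def icc_oneway(groups):
    """One-way random-effects ICC for a balanced design.
    groups: one list of scores per physician, all of the same length k.
    ICC = (MSB - MSW) / (MSB + (k - 1) * MSW)."""
    n = len(groups)
    k = len(groups[0])
    grand_mean = sum(sum(g) for g in groups) / (n * k)
    means = [sum(g) / k for g in groups]
    # Between-physician and within-physician mean squares.
    msb = k * sum((m - grand_mean) ** 2 for m in means) / (n - 1)
    msw = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)


# Hypothetical: three physicians whose patients score identically within
# each physician, so all variation is between physicians.
print(icc_oneway([[1, 1], [2, 2], [3, 3]]))

# Hypothetical: within-physician spread exceeds between-physician spread.
print(icc_oneway([[1, 3], [2, 4]]))
```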
Creativity, Emotional Intelligence, and School Performance in Children

ERIC Educational Resources Information Center

Hansenne, Michel; Legrand, Jessica

2012-01-01

Previous studies have shown that both creativity and emotional intelligence (EI) are related to children's school performance. In this study, we investigated the incremental validity of EI over creativity in an elementary school setting. Seventy-three children aged from 9 to 12 years old were recruited to participate in the study. Verbal and…
Development and External Validation of a Melanoma Risk Prediction Model Based on Self-assessed Risk Factors.

PubMed

Vuong, Kylie; Armstrong, Bruce K; Weiderpass, Elisabete; Lund, Eiliv; Adami, Hans-Olov; Veierod, Marit B; Barrett, Jennifer H; Davies, John R; Bishop, D Timothy; Whiteman, David C; Olsen, Catherine M; Hopper, John L; Mann, Graham J; Cust, Anne E; McGeechan, Kevin

2016-08-01

Identifying individuals at high risk of melanoma can optimize primary and secondary prevention strategies. To develop and externally validate a risk prediction model for incident first-primary cutaneous melanoma using self-assessed risk factors. We used unconditional logistic regression to develop a multivariable risk prediction model. Relative risk estimates from the model were combined with Australian melanoma incidence and competing mortality rates to obtain absolute risk estimates. A risk prediction model was developed using the Australian Melanoma Family Study (629 cases and 535 controls) and externally validated using 4 independent population-based studies: the Western Australia Melanoma Study (511 case-control pairs), Leeds Melanoma Case-Control Study (960 cases and 513 controls), Epigene-QSkin Study (44 544 participants, of whom 766 had melanoma), and Swedish Women's Lifestyle and Health Cohort Study (49 259 women, of whom 273 had melanoma). We validated model performance internally and externally by assessing discrimination using the area under the receiver operating characteristic curve (AUC). Additionally, using the Swedish Women's Lifestyle and Health Cohort Study, we assessed model calibration and clinical usefulness. The risk prediction model included hair color, nevus density, first-degree family history of melanoma, previous nonmelanoma skin cancer, and lifetime sunbed use. On internal validation, the AUC was 0.70 (95% CI, 0.67-0.73). On external validation, the AUC was 0.66 (95% CI, 0.63-0.69) in the Western Australia Melanoma Study, 0.67 (95% CI, 0.65-0.70) in the Leeds Melanoma Case-Control Study, 0.64 (95% CI, 0.62-0.66) in the Epigene-QSkin Study, and 0.63 (95% CI, 0.60-0.67) in the Swedish Women's Lifestyle and Health Cohort Study. Model calibration showed close agreement between predicted and observed numbers of incident melanomas across all deciles of predicted risk. In the external validation setting, there was higher net benefit when using the risk prediction model to classify individuals as high risk compared with classifying all individuals as high risk. The melanoma risk prediction model performs well and may be useful in prevention interventions reliant on a risk assessment using self-assessed risk factors.
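The discrimination results above are reported as AUC values. As a minimal sketch (not the authors' code), the AUC can be read as the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen control, with ties counted as one half. The function name `auc` and the toy risk scores below are illustrative assumptions.

```python
def auc(case_scores, control_scores):
    """Rank-based AUC: P(case score > control score), ties count as 0.5."""
    if not case_scores or not control_scores:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical predicted risks, for illustration only (not study data).
cases = [0.9, 0.7, 0.6, 0.4]
controls = [0.8, 0.5, 0.3, 0.2]
print(auc(cases, controls))
```

The pairwise loop is quadratic in sample size, which is fine for a sketch; a rank-sum formulation would scale better for cohorts of the size described.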
Further evaluation of the EORTC QLQ-C30 psychometric properties in a large Brazilian cancer patient cohort as a function of their educational status.

PubMed

Paiva, Carlos Eduardo; Carneseca, Estela Cristina; Barroso, Eliane Marçon; de Camargos, Mayara Goulart; Alfano, Ana Camila Callado; Rugno, Fernanda Capella; Paiva, Bianca Sakamoto Ribeiro

2014-08-01

The European Organization for Research and Treatment of Cancer Core Quality of Life Questionnaire (EORTC QLQ-C30) is considered a valid instrument for use in Brazil. However, the previous Brazilian validation study included only 30 lung cancer patients and only measured test-retest reliability. The aim of this study was to evaluate the psychometric properties of the EORTC QLQ-C30 in a sample of cancer patients at different educational levels who completed the instrument administered by an interviewer. Data from six prospective studies conducted by the same group of researchers were combined in this study (N = 986). Reliability was assessed using Cronbach's alpha coefficient, all values of which were >0.7, with the exception of cognitive functioning, social functioning, and nausea and vomiting (α = 0.57, α = 0.69, and α = 0.68, respectively). In multi-trait scaling analysis, convergent and divergent validity were considered adequate (validity indices were 91.6 and 97.4%). In general, moderate to strong correlations were found between the subscales of the EORTC QLQ-C30 and its respective dimensions from the WHOQOL-bref, the hospital anxiety and depression scale, and the Edmonton Symptom Assessment System (ESAS) instruments. In addition, the EORTC QLQ-C30 was able to differentiate groups of patients with distinct performance statuses and types of treatment (known-group validation). Statistical analyses were also performed on educational status, yielding similar results. Detailed psychometric property data using the EORTC QLQ-C30 in Brazil are added by this study. In addition, we demonstrated that this instrument is in general reliable and valid regardless of the patient educational level.
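Reliability in the abstract above is summarized with Cronbach's alpha. The following is a minimal sketch of the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), assuming complete responses with no missing items. The function name and the example responses are hypothetical.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a scale.

    rows: one list of item scores per respondent, all of equal length.
    """
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of the summed scale score.
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical answers of four respondents to a three-item subscale.
responses = [[2, 3, 3], [4, 4, 5], [1, 2, 2], [5, 5, 4]]
print(round(cronbach_alpha(responses), 3))
```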
Numerical studies and metric development for validation of magnetohydrodynamic models on the HIT-SI experiment

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hansen, C.; Victor, B.

We present application of three scalar metrics derived from the Biorthogonal Decomposition (BD) technique to evaluate the level of agreement between macroscopic plasma dynamics in different data sets. BD decomposes large data sets, as produced by distributed diagnostic arrays, into principal mode structures without assumptions on spatial or temporal structure. These metrics have been applied to validation of the Hall-MHD model using experimental data from the Helicity Injected Torus with Steady Inductive helicity injection experiment. Each metric provides a measure of correlation between mode structures extracted from experimental data and simulations for an array of 192 surface-mounted magnetic probes. Numerical validation studies have been performed using the NIMROD code, where the injectors are modeled as boundary conditions on the flux conserver, and the PSI-TET code, where the entire plasma volume is treated. Initial results from a comprehensive validation study of high performance operation with different injector frequencies are presented, illustrating application of the BD method. Using a simplified (constant, uniform density and temperature) Hall-MHD model, simulation results agree with experimental observation for two of the three defined metrics when the injectors are driven with a frequency of 14.5 kHz.
Timed activity performance in persons with upper limb amputation: A preliminary study.

PubMed

Resnik, Linda; Borgia, Mathew; Acluche, Frantzy

55 subjects with upper limb amputation were administered the T-MAP twice within one week. To develop a timed measure of activity performance for persons with upper limb amputation (T-MAP); examine the measure's internal consistency, test-retest reliability and validity; and compare scores by prosthesis use. Measures of activity performance for persons with upper limb amputation are needed. The time required to perform daily activities is a meaningful metric that has implications for participation in life roles. Internal consistency and test-retest reliability were evaluated. Construct validity was examined by comparing scores by amputation level. Exploratory analyses compared sub-group scores, and examined correlations with other measures. Scale alpha was 0.77, ICC was 0.93. Timed scores differed by amputation level. Subjects using a prosthesis took longer to perform all tasks. T-MAP was not correlated with other measures of dexterity or activity, but was correlated with pain for non-prosthesis users. The timed scale had adequate internal consistency and excellent test-retest reliability. Analyses support reliability and construct validity of the T-MAP. 2c "outcomes" research. Published by Elsevier Inc.
Electric Ground Support Equipment Advanced Battery Technology Demonstration Project at the Ontario Airport

DOE Office of Scientific and Technical Information (OSTI.GOV)

Tyler Gray; Jeremy Diez; Jeffrey Wishart

2013-07-01

The intent of the electric Ground Support Equipment (eGSE) demonstration is to evaluate the day-to-day vehicle performance of electric baggage tractors using two advanced battery technologies to demonstrate possible replacements for the flooded lead-acid (FLA) batteries utilized throughout the industry. These advanced battery technologies have the potential to resolve barriers to the widespread adoption of eGSE deployment. Validation testing had not previously been performed within fleet operations to determine if the performance of current advanced batteries is sufficient to withstand the duty cycle of electric baggage tractors. This report summarizes the work performed and data accumulated during this demonstration in an effort to validate the capabilities of advanced battery technologies. The demonstration project also grew the relationship with Southwest Airlines (SWA), our demonstration partner at Ontario International Airport (ONT), located in Ontario, California. The results of this study have encouraged a proposal for a future demonstration project with SWA.
Measuring affective temperaments: a systematic review of validation studies of the Temperament Evaluation in Memphis Pisa and San Diego (TEMPS) instruments.

PubMed

Elias, Liana R; Köhler, Cristiano A; Stubbs, Brendon; Maciel, Beatriz R; Cavalcante, Lígia M; Vale, Antonio M O; Gonda, Xénia; Quevedo, João; Hyphantis, Thomas N; Soares, Jair C; Vieta, Eduard; Carvalho, André F

2017-04-01

The assessment of affective temperaments has provided useful insights for the psychopathological understanding of affective disorders and for the conceptualization of bipolar spectrum disorders. The Temperament in Memphis Pisa and San Diego (TEMPS) instrument has been widely used in research, yet its psychometric properties and optimal factor structure are unclear. The PubMed/MEDLINE, PsycINFO, and EMBASE electronic databases were searched from inception until March 15th, 2016. Validation peer-reviewed studies of different versions of the TEMPS performed in adult samples were considered for inclusion. Twenty-seven studies (N=20,787) met inclusion criteria. Several versions of the TEMPS have been validated in 14 languages across 15 countries. The 110-item self-reported version of the TEMPS has been the most studied version. Most studies (50%) supported a five factor solution although few studies performed confirmatory factor analyses. A five-factor solution has consistently been reported for the 39-item version of the TEMPS-A. Overall, evidence indicates that different versions of the TEMPS have adequate internal consistency reliability, while the TEMPS-A-110 version has acceptable test-retest reliability. The methodological quality of included studies varied. A meta-analysis could not be performed due to the heterogeneity of settings and versions of the TEMPS utilized. Different versions of the TEMPS have been validated across different cultures. The short 39-item version of the TEMPS-A holds promise and merits further investigation. Culture-bound factors may influence the expression and/or assessment of affective temperaments with the TEMPS. Copyright © 2017 Elsevier B.V. All rights reserved.
VDA, a Method of Choosing a Better Algorithm with Fewer Validations

PubMed Central

Kluger, Yuval

2011-01-01

The multitude of bioinformatics algorithms designed for performing a particular computational task presents end-users with the problem of selecting the most appropriate computational tool for analyzing their biological data. The choice of the best available method is often based on expensive experimental validation of the results. We propose an approach to design validation sets for method comparison and performance assessment that are effective in terms of cost and discrimination power. Validation Discriminant Analysis (VDA) is a method for designing a minimal validation dataset to allow reliable comparisons between the performances of different algorithms. Implementation of our VDA approach achieves this reduction by selecting predictions that maximize the minimum Hamming distance between algorithmic predictions in the validation set. We show that VDA can be used to correctly rank algorithms according to their performances. These results are further supported by simulations and by realistic algorithmic comparisons in silico. VDA is a novel, cost-efficient method for minimizing the number of validation experiments necessary for reliable performance estimation and fair comparison between algorithms. Our VDA software is available at http://sourceforge.net/projects/klugerlab/files/VDA/ PMID:22046256
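The selection criterion described above (maximizing the minimum Hamming distance between algorithms' predictions on the validation set) can be illustrated with a simple greedy heuristic. This is only a sketch under stated assumptions: predictions are binary, and the greedy search with a total-distance tie-break is a stand-in chosen for brevity, not necessarily the procedure implemented in the VDA software. All function names and the toy data are hypothetical.

```python
from itertools import combinations


def min_pairwise_hamming(preds, chosen):
    """Smallest Hamming distance between any two algorithms on the chosen items."""
    return min(
        sum(a[i] != b[i] for i in chosen)
        for a, b in combinations(preds, 2)
    )


def total_pairwise_hamming(preds, chosen):
    """Sum of Hamming distances over all algorithm pairs (used to break ties)."""
    return sum(
        sum(a[i] != b[i] for i in chosen)
        for a, b in combinations(preds, 2)
    )


def greedy_validation_set(preds, size):
    """Greedily pick `size` item indices that keep the algorithms far apart.

    preds: one equal-length list of 0/1 predictions per algorithm.
    """
    n_items = len(preds[0])
    chosen = []
    for _ in range(min(size, n_items)):
        remaining = [i for i in range(n_items) if i not in chosen]
        best = max(
            remaining,
            key=lambda i: (
                min_pairwise_hamming(preds, chosen + [i]),
                total_pairwise_hamming(preds, chosen + [i]),
            ),
        )
        chosen.append(best)
    return chosen


# Hypothetical predictions of three algorithms on six candidate items.
predictions = [
    [1, 1, 0, 0, 1, 0],
    [1, 0, 1, 0, 1, 0],
    [1, 1, 1, 1, 1, 0],
]
print(greedy_validation_set(predictions, 3))
```

In this toy case the items on which all three algorithms agree are never selected, since validating them experimentally would not help to tell the algorithms apart.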
Validation of a wireless modular monitoring system for structures

NASA Astrophysics Data System (ADS)

Lynch, Jerome P.; Law, Kincho H.; Kiremidjian, Anne S.; Carryer, John E.; Kenny, Thomas W.; Partridge, Aaron; Sundararajan, Arvind

2002-06-01

A wireless sensing unit for use in a Wireless Modular Monitoring System (WiMMS) has been designed and constructed. Drawing upon advanced technological developments in the areas of wireless communications, low-power microprocessors and micro-electro mechanical system (MEMS) sensing transducers, the wireless sensing unit represents a high-performance yet low-cost solution to monitoring the short-term and long-term performance of structures. A sophisticated reduced instruction set computer (RISC) microcontroller is placed at the core of the unit to accommodate on-board computations, measurement filtering and data interrogation algorithms. The functionality of the wireless sensing unit is validated through various experiments involving multiple sensing transducers interfaced to the sensing unit. In particular, MEMS-based accelerometers are used as the primary sensing transducer in this study's validation experiments. A five degree of freedom scaled test structure mounted upon a shaking table is employed for system validation.
Does IQ Really Predict Job Performance?

PubMed Central

Richardson, Ken; Norgate, Sarah H.

2015-01-01

IQ has played a prominent part in developmental and adult psychology for decades. In the absence of a clear theoretical model of internal cognitive functions, however, construct validity for IQ tests has always been difficult to establish. Test validity, therefore, has always been indirect, by correlating individual differences in test scores with what are assumed to be other criteria of intelligence. Job performance has, for several reasons, been one such criterion. Correlations of around 0.5 have been regularly cited as evidence of test validity, and as justification for the use of the tests in developmental studies, in educational and occupational selection and in research programs on sources of individual differences. Here, those correlations are examined together with the quality of the original data and the many corrections needed to arrive at them. It is concluded that considerable caution needs to be exercised in citing such correlations for test validation purposes. PMID:26405429
On the Validity of Beer-Lambert Law and its Significance for Sunscreens.

PubMed

Herzog, Bernd; Schultheiss, Amélie; Giesinger, Jochen

2018-03-01

The sun protection factor (SPF) is the most important quantity to characterize the performance of sunscreens. As the standard method for its determination is based on clinical trials involving irradiation of human volunteers, calculations of sunscreen performance have become quite popular to reduce the number of in vivo studies. Such simulations imply the calculation of UV transmittance of the sunscreen film using the amounts and spectroscopic properties of the UV absorbers employed, and presuppose the validity of the Beer-Lambert law. As sunscreen films on human skin can contain considerable concentrations of UV absorbers, it is questioned whether the Beer-Lambert law is still valid for these systems. The results of this work show that the validity of the Beer-Lambert law is still given at the high concentrations at which UV absorbers occur in sunscreen films on human skin. © 2017 The American Society of Photobiology.
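The transmittance calculation that such simulations presuppose follows from the Beer-Lambert law: at a given wavelength the absorbance of a mixture is A = d * sum(eps_i * c_i), and the transmittance is T = 10^(-A). Below is a minimal single-wavelength sketch that assumes a uniform film; the function name, units and numerical values are illustrative assumptions, not data from the study.

```python
def transmittance(absorbers, path_cm):
    """Beer-Lambert transmittance of a mixture at one wavelength.

    absorbers: list of (molar absorptivity in L/(mol*cm), concentration in mol/L).
    path_cm: optical path length (film thickness) in cm.
    """
    # Absorbances of independent absorbers add linearly.
    absorbance = path_cm * sum(eps * conc for eps, conc in absorbers)
    return 10 ** (-absorbance)


# Hypothetical values chosen so the arithmetic is easy to follow:
# A = 1.0 * (1000 * 0.001 + 500 * 0.002) = 2, hence T = 10**-2.
print(transmittance([(1000.0, 0.001), (500.0, 0.002)], 1.0))
```

The linear dependence of absorbance on concentration is the assumption that the abstract reports as still holding at the high absorber concentrations found in sunscreen films.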
External validation and comparison of three prediction tools for risk of osteoporotic fractures using data from population based electronic health records: retrospective cohort study

PubMed Central

Cohen-Stavi, Chandra; Leventer-Roberts, Maya; Balicer, Ran D

2017-01-01

Objective To directly compare the performance and externally validate the three most studied prediction tools for osteoporotic fractures (QFracture, FRAX, and Garvan) using data from electronic health records. Design Retrospective cohort study. Setting Payer provider healthcare organisation in Israel. Participants 1 054 815 members aged 50 to 90 years for comparison between tools and cohorts of different age ranges, corresponding to those in each tool's development study, for tool specific external validation. Main outcome measure First diagnosis of a major osteoporotic fracture (for QFracture and FRAX tools) and hip fractures (for all three tools) recorded in electronic health records from 2010 to 2014. Observed fracture rates were compared to probabilities predicted retrospectively as of 2010. Results The observed five year hip fracture rate was 2.7% and the rate for major osteoporotic fractures was 7.7%. The areas under the receiver operating characteristic curve (AUC) for hip fracture prediction were 82.7% for QFracture, 81.5% for FRAX, and 77.8% for Garvan. For major osteoporotic fractures, AUCs were 71.2% for QFracture and 71.4% for FRAX. All the tools underestimated the fracture risk, but the average observed to predicted ratios and the calibration slopes of FRAX were closest to 1. Tool specific validation analyses yielded hip fracture prediction AUCs of 88.0% for QFracture (among those aged 30-100 years), 81.5% for FRAX (50-90 years), and 71.2% for Garvan (60-95 years). Conclusions Both QFracture and FRAX had high discriminatory power for hip fracture prediction, with QFracture performing slightly better. This performance gap was more pronounced in previous studies, likely because of broader age inclusion criteria for QFracture validations. The simpler FRAX performed almost as well as QFracture for hip fracture prediction, and may have advantages if some of the input data required for QFracture are not available. However, both tools require calibration before implementation. PMID:28104610
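The calibration finding above rests on observed-to-predicted ratios. As a minimal sketch, assuming the ratio is taken as the observed event rate divided by the mean predicted probability, a value above 1 means the tool underestimates risk. The function name and the toy cohort are hypothetical.

```python
def observed_to_predicted(outcomes, predicted):
    """Ratio of the observed event rate to the mean predicted probability.

    outcomes: 0/1 indicator per person (1 = fracture observed).
    predicted: predicted probability per person, in the same order.
    """
    if len(outcomes) != len(predicted) or not outcomes:
        raise ValueError("need two non-empty sequences of equal length")
    observed_rate = sum(outcomes) / len(outcomes)
    mean_predicted = sum(predicted) / len(predicted)
    return observed_rate / mean_predicted


# Hypothetical cohort: 2 fractures in 10 people, each predicted at 10% risk.
outcomes = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0]
predicted = [0.1] * 10
print(observed_to_predicted(outcomes, predicted))
```

A ratio near 2 in this toy cohort would indicate that predicted risks are about half of what was observed, which is the direction of miscalibration the abstract reports for all three tools.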
Validation of an organizational communication climate assessment toolkit.

PubMed

Wynia, Matthew K; Johnson, Megan; McCoy, Thomas P; Griffin, Leah Passmore; Osborn, Chandra Y

2010-01-01

Effective communication is critical to providing quality health care and can be affected by a number of modifiable organizational factors. The authors performed a prospective multisite validation study of an organizational communication climate assessment tool in 13 geographically and ethnically diverse health care organizations. Communication climate was measured across 9 discrete domains. Patient and staff surveys with matched items in each domain were developed using a national consensus process, which then underwent psychometric field testing and assessment of domain coherence. The authors found meaningful within-site and between-site performance score variability in all domains. In multivariable models, most communication domains were significant predictors of patient-reported quality of care and trust. The authors conclude that these assessment tools provide a valid empirical assessment of organizational communication climate in 9 domains. Assessment results may be useful to track organizational performance, to benchmark, and to inform tailored quality improvement interventions.
Validation of Ion Chromatographic Method for Determination of Standard Inorganic Anions in Treated and Untreated Drinking Water

NASA Astrophysics Data System (ADS)

Ivanova, V.; Surleva, A.; Koleva, B.

2018-06-01

An ion chromatographic method for determination of fluoride, chloride, nitrate and sulphate in untreated and treated drinking waters is described. An automated 850 IC Professional, Metrohm system equipped with conductivity detector and Metrosep A Supp 7-250 (250 x 4 mm) column was used. The validation of the method was performed for simultaneous determination of all studied analytes and the results showed that the validated method fits the requirements of the current water legislation. The main analytical characteristics were estimated for each of the studied analytes: limits of detection, limits of quantification, working and linear ranges, repeatability and intermediate precision, recovery. The trueness of the method was estimated by analysis of certified reference material for soft drinking water. A recovery test was performed on spiked drinking water samples. The measurement uncertainty was estimated. The method was applied for analysis of drinking waters before and after chlorination.

Psychosocial correlates of disordered eating in female collegiate athletes: validation of the ATHLETE questionnaire.

PubMed

Hinton, Pamela S; Kubas, Karen L

2005-01-01

Female athletes may be at greater risk for disordered eating than their nonathletic peers, but the psychological antecedents of this dysfunctional behavior in athletes have yet to be elucidated. The objective of this study was to develop an athletics-oriented measure of psychological predictors of disordered eating and to test its initial reliability and validity. Female athletes from 3 National Collegiate Athletics Association (NCAA) Division I universities completed the ATHLETE, a written questionnaire designed to assess psychosocial factors associated with disordered eating in athletes. Five distinct and internally consistent factors (Drive for Thinness and Performance, Social Pressure on Eating, Performance Perfectionism, Social Pressure on Body Shape, and Team Trust) were positively associated with and predictive of disordered eating behaviors in female athletes. The ATHLETE is a reliable and valid measure of psychological predictors of disordered eating in athletics and will be useful in studying the etiology of disordered eating in female athletes.
Patient perspective on quality of geriatric care and rehabilitation--development and psychometric testing of a questionnaire.

PubMed

Wressle, Ewa; Eriksson, Lennart; Fahlander, Amie; Rasmusson, Ing-Marie; Tedemalm, Ulla; Tängmark, Karin

2006-06-01

The aim was to develop and test a questionnaire for use in telephone interviews concerning patient evaluation of geriatric care and rehabilitation. Instrument development was performed comprising qualitative interviews, construction of items, content validation, pilot study and data collection for evaluation of care and rehabilitation, clinical utility, reliability and construct validity. Qualitative interviews were performed with 12 elderly participants. The qualitative interviews formed the basis for the construction of 45 items. An expert panel performed a content validation of the questionnaire resulting in a revised version. A pilot study comprised 29 participants recently discharged from geriatric wards and the main data collection comprised 221 participants. Inclusion criteria were being able to perform a telephone interview and willingness to participate. Clinical utility was examined through questions to the interviewers, answered in writing. Cronbach's alpha coefficient was 0.79. According to a factor analysis and the evaluation of clinical utility, the underlying dimensions of the final revised questionnaire concern 'Respect and safety', 'Information and participation' and 'Rehabilitation interventions', scored in 18 items. In addition, one global item concerns satisfaction with care, resulting in 19 items in total. The revised questionnaire was named PaPeR, Patient Perspective on care and Rehabilitation. The questionnaire is considered valid, reliable and judged to have good clinical utility. The time consumption for the telephone interview is about 10-20 minutes. The questionnaire is useful in defining areas for potential quality improvement in geriatric wards.
Cross-cultural adaptation and validation of the Italian version of the Kerlan-Jobe Orthopaedic Clinic Shoulder and Elbow score.

PubMed

Merolla, Giovanni; Corona, Katia; Zanoli, Gustavo; Cerciello, Simone; Giannotti, Stefano; Porcellini, Giuseppe

2017-12-01

The Kerlan-Jobe Orthopaedic Clinic (KJOC) Shoulder and Elbow score is a reliable and sensitive tool to measure the performance of overhead athletes. The purpose of this study was to carry out a cross-cultural adaptation and validation of the KJOC questionnaire in Italian and to assess its reliability, validity, and responsiveness. Ninety professional athletes with a painful shoulder were included in this study and were assigned to the "injury group" (n = 32) or the "overuse group" (n = 58); 65 were managed conservatively and 25 were treated by arthroscopic surgery. To assess the reliability of the KJOC score, patients were asked to fill in the questionnaire at baseline and after 2 weeks. To test the construct validity, KJOC scores were compared to those obtained with the Italian version of the Disabilities of the Arm, Shoulder, and Hand (DASH) scale, and with the DASH sports/performing arts module. To test KJOC score responsiveness, the follow-up KJOC scores of the participants treated conservatively were compared to those of the patients treated by arthroscopic surgery. Statistical analysis demonstrated that the KJOC questionnaire is reliable in terms of the single items and the overall score (ICC 0.95-0.99); that it has high construct validity (r_s = -0.697; p < 0.01); and that it is responsive to clinical differences in shoulder function (p < 0.0001). The Italian version of the KJOC Shoulder and Elbow score performed in a similar way to the English version and demonstrated good validity, reliability, and responsiveness after conservative and surgical treatment. II.
Portuguese Version of the Pain Beliefs and Perceptions Inventory: A Multicenter Validation Study.

PubMed

Azevedo, Luís Filipe; Sampaio, Rute; Camila Dias, Cláudia; Romão, José; Lemos, Laurinda; Agualusa, Luís; Vaz-Serra, Sílvia; Patto, Teresa; Costa-Pereira, Altamiro; Castro-Lopes, José Manuel

2017-07-01

We aimed to perform the translation, cultural adaptation, and validation of the Pain Beliefs and Perceptions Inventory (PBPI) for the European Portuguese language and chronic pain population. This is a longitudinal multicenter validation study. A Portuguese version of the PBPI (PBPI-P) was created through a process of translation, back translation, and expert panel evaluation. The PBPI-P was administered to a total of 122 patients from 13 chronic pain clinics in Portugal, at baseline and after 7 days. Internal consistency and test-retest reliability were assessed by Cronbach's alpha (α) and intraclass correlation coefficient (ICC). Construct (convergent and discriminant) validity was assessed based on a set of previously developed theoretical hypotheses about interrelations between the PBPI-P and other measures. Exploratory and confirmatory factor analyses were performed to test the theoretical structure of the PBPI-P. The internal consistency and test-retest reliability coefficients for each respective subscale were α = 0.620 and ICC = 0.801 for mystery; α = 0.744 and ICC = 0.841 for permanence; α = 0.778 and ICC = 0.791 for constancy; and α = 0.764 and ICC = 0.881 for self-blame. Exploratory and confirmatory factor analysis revealed a four-factor structure (permanence, constancy, self-blame, and mystery) that explained 63% of the variance. The construct validity of the PBPI-P was shown to be adequate, with more than 90% of the previously defined hypotheses regarding interrelations with other measures confirmed. The PBPI-P has been shown to be adequate and to have excellent reliability, internal consistency, and validity. It may contribute to a better pain assessment and is suitable for research and clinical use. © 2016 World Institute of Pain.
Validation of the GreenLight™ Simulator and development of a training curriculum for photoselective vaporisation of the prostate.

PubMed

Aydin, Abdullatif; Muir, Gordon H; Graziano, Manuela E; Khan, Muhammad Shamim; Dasgupta, Prokar; Ahmed, Kamran

2015-06-01

To assess face, content and construct validity, and feasibility and acceptability of the GreenLight™ Simulator as a training tool for photoselective vaporisation of the prostate (PVP), and to establish learning curves and develop an evidence-based training curriculum. This prospective, observational and comparative study, recruited novice (25 participants), intermediate (14) and expert-level urologists (seven) from the UK and Europe at the 28th European Association of Urological Surgeons Annual Meeting 2013. A group of novices (12 participants) performed 10 sessions of subtask training modules followed by a long operative case, whereas a second group (13) performed five sessions of a given case module. Intermediate and expert groups performed all training modules once, followed by one operative case. The outcome measures for learning curves and construct validity were time to task, coagulation time, vaporisation time, average sweep speed, average laser distance, blood loss, operative errors, and instrument cost. Face and content validity, feasibility and acceptability were addressed through a quantitative survey. Construct validity was demonstrated in two of five training modules (P = 0.038; P = 0.018) and in a considerable number of case metrics (P = 0.034). Learning curves were seen in all five training modules (P < 0.001) and significant reduction in case operative time (P < 0.001) and error (P = 0.017) were seen. An evidence-based training curriculum, to help trainees acquire transferable skills, was produced using the results. This study has shown the GreenLight Simulator to be a valid and useful training tool for PVP. It is hoped that by using the training curriculum for the GreenLight Simulator, novice trainees can acquire skills and knowledge to a predetermined level of proficiency. © 2014 The Authors. BJU International © 2014 BJU International.
Analysis of Flowfields over Four-Engine DC-X Rockets

NASA Technical Reports Server (NTRS)

Wang, Ten-See; Cornelison, Joni

1996-01-01

The objective of this study is to validate a computational methodology for the aerodynamic performance of an advanced conical launch vehicle configuration. The computational methodology is based on a three-dimensional, viscous flow, pressure-based computational fluid dynamics formulation. Both wind-tunnel and ascent flight-test data are used for validation. Emphasis is placed on multiple-engine power-on effects. Computational characterization of the base drag in the critical subsonic regime is the focus of the validation effort; until recently, almost no multiple-engine data existed for a conical launch vehicle configuration. Parametric studies using high-order difference schemes are performed for the cold-flow tests, whereas grid studies are conducted for the flight tests. The computed vehicle axial force coefficients, forebody, aftbody, and base surface pressures compare favorably with those of tests. The results demonstrate that with adequate grid density and proper distribution, a high-order difference scheme, finite rate afterburning kinetics to model the plume chemistry, and a suitable turbulence model to describe separated flows, plume/air mixing, and boundary layers, computational fluid dynamics is a tool that can be used to predict the low-speed aerodynamic performance for rocket design and operations.
Development and validation of a pre-hospital "Red Flag" alert for activation of intra-hospital haemorrhage control response in blunt trauma.

PubMed

Hamada, Sophie Rym; Rosa, Anne; Gauss, Tobias; Desclefs, Jean-Philippe; Raux, Mathieu; Harrois, Anatole; Follin, Arnaud; Cook, Fabrice; Boutonnet, Mathieu; Attias, Arie; Ausset, Sylvain; Boutonnet, Mathieu; Dhonneur, Gilles; Duranteau, Jacques; Langeron, Olivier; Paugam-Burtz, Catherine; Pirracchio, Romain; de St Maurice, Guillaume; Vigué, Bernard; Rouquette, Alexandra; Duranteau, Jacques

2018-05-05

Haemorrhagic shock is the leading cause of early preventable death in severe trauma. Delayed treatment is a recognized prognostic factor that can be prevented by efficient organization of care. This study aimed to develop and validate Red Flag, a binary alert identifying blunt trauma patients at high risk of severe haemorrhage (SH), to be used by the pre-hospital trauma team in order to trigger an adequate intra-hospital standardized haemorrhage control response: massive transfusion protocol and/or immediate haemostatic procedures. A multicentre retrospective study of prospectively collected data from a trauma registry (Traumabase®) was performed. SH was defined as: packed red blood cell (RBC) transfusion in the trauma room, or transfusion ≥ 4 RBC in the first 6 h, or lactate ≥ 5 mmol/L, or immediate haemostatic surgery, or interventional radiology and/or death from haemorrhagic shock. Pre-hospital characteristics were selected using a multiple logistic regression model in a derivation cohort to develop a Red Flag binary alert whose performance was confirmed in a validation cohort. Among the 3675 patients of the derivation cohort, 672 (18%) had SH. The final prediction model included five pre-hospital variables: Shock Index ≥ 1, mean arterial blood pressure ≤ 70 mmHg, point of care haemoglobin ≤ 13 g/dl, unstable pelvis and pre-hospital intubation. The Red Flag alert was triggered by the presence of any combination of at least two criteria. Its predictive performance was sensitivity 75% (72-79%), specificity 79% (77-80%) and area under the receiver operating characteristic curve 0.83 (0.81-0.84) in the derivation cohort, and was not significantly different in the independent validation cohort of 2999 patients. The Red Flag alert developed and validated in this study has high performance to accurately predict or exclude SH.
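The "at least two of five criteria" rule in the abstract above can be sketched in a few lines. This is only an illustration, not the authors' implementation: the function and parameter names are hypothetical, and Shock Index is assumed to be heart rate divided by systolic blood pressure, its usual definition, which the abstract does not spell out.

```python
def red_flag_alert(heart_rate, systolic_bp, mean_arterial_bp,
                   haemoglobin_g_dl, unstable_pelvis, prehospital_intubation):
    """Return True when at least two of the five pre-hospital criteria are met.

    All names are hypothetical; thresholds are those quoted in the abstract.
    """
    # Assumption: Shock Index = heart rate / systolic blood pressure.
    shock_index = heart_rate / systolic_bp
    criteria = [
        shock_index >= 1,              # Shock Index >= 1
        mean_arterial_bp <= 70,        # mean arterial pressure <= 70 mmHg
        haemoglobin_g_dl <= 13,        # point-of-care haemoglobin <= 13 g/dl
        bool(unstable_pelvis),         # unstable pelvis
        bool(prehospital_intubation),  # pre-hospital intubation
    ]
    # Booleans sum as 0/1, so this counts how many criteria are satisfied.
    return sum(criteria) >= 2
```

For example, a patient with heart rate 120, systolic pressure 100 and mean arterial pressure 65 meets two criteria and would trigger the alert, whereas a single low haemoglobin reading on its own would not.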
Reliability and validity of the Assessment of Daily Activity Performance (ADAP) in community-dwelling older women.

PubMed

de Vreede, Paul L; Samson, Monique M; van Meeteren, Nico L; Duursma, Sijmen A; Verhaar, Harald J

2006-08-01

The Assessment of Daily Activity Performance (ADAP) test was developed, and modeled after the Continuous-scale Physical Functional Performance (CS-PFP) test, to provide a quantitative assessment of older adults' physical functional performance. The aim of this study was to determine the intra-examiner reliability and construct validity of the ADAP in a community-living older population, and to identify the importance of tester experience. Forty-three community-dwelling, older women (mean age 75 yr +/-4.3) were randomized to the test-retest reliability study (n=19) or validation study (n=24). The intra-examiner reliability of an experienced (tester 1) and an inexperienced tester (tester 2) was assessed by comparing test and retest scores of 19 participants. Construct validity was assessed by comparing the ADAP scores of 24 participants with self-perceived function by the SF-36 Health Survey, muscle function tests, and the Timed Up and Go test (TUG). Tester 1 had good consistency and reliability scores (mean difference between test and retest scores (DIF), -1.05+/-1.99; 95% confidence interval (CI), -2.58 to 0.48; Cronbach's alpha (alpha) range, 0.83 to 0.98; intraclass correlation (ICC) range, 0.75 to 0.96; Limits of Agreement (LoA), -2.58 to 4.95). Tester 2 had lower reliability scores (DIF, -2.45+/-4.36; 95% CI, -5.56 to 0.67; alpha range, 0.53 to 0.94; ICC range, 0.36 to 0.90; LoA, -6.09 to 10.99), with a systematic difference between test and retest scores for the ADAP domain lower-body strength (-3.81; 95% CI, -6.09 to -1.54). ADAP correlated with the SF-36 Physical Functioning scale (r=0.67), the TUG test (r=-0.91) and isometric knee extensor strength (r=0.80). The ADAP test is a reliable and valid instrument. Our results suggest that testers should practise using the test, to improve reliability, before applying it to clinical settings.
Liver Full Reference Set Application: David Lubman - Univ of Michigan (2011) — EDRN Public Portal

Cancer.gov

In this work we will perform the next step in the biomarker development and validation. This step will be the Phase 2 validation of glycoproteins that have passed Phase 1 blinded validation using ELISA kits based on target glycoproteins selected based on our previous work. This will be done in a large Phase 2 sample set obtained in a multicenter study funded by the EDRN. The assays will be performed in our research lab located in the Center for Cancer Proteomics in the University of Michigan Medical Center. This study will include patients in whom serum was stored for future validation and includes samples from early HCC (n = 158), advanced cases (n=214) and cirrhotic controls (n = 417). These samples will be supplied by the EDRN (per Dr. Jo Ann Rinaudo) and will be analyzed in a blinded fashion by Dr. Feng from the Fred Hutchinson Cancer Center. This phase 2 study was designed to have above 90% power at one-sided 5% type-I error for comparing the joint sensitivity and specificity for differentiating early stage HCC from cirrhotic patients between AFP and a new marker. Sample sizes of 200 for early stage HCC and 400 for cirrhotics were required to achieve the stated power (14). We will select our candidates for this larger phase validation set based on the results of previous work. These will include HGF and CD14 and the results of these assays will be used to evaluate the performance of each of these markers and combinations of HGF and CD14 and AFP and HGF. It is expected that each assay will be repeated three times for each marker and will also be performed for AFP as the standard for comparison. 250 uL of each sample is requested for analysis.
Evaluation of the Performance of Routine Information System Management (PRISM) framework: evidence from Uganda.

PubMed

Hotchkiss, David R; Aqil, Anwer; Lippeveld, Theo; Mukooyo, Edward

2010-07-03

Sound policy, resource allocation and day-to-day management decisions in the health sector require timely information from routine health information systems (RHIS). In most low- and middle-income countries, the RHIS is viewed as being inadequate in providing quality data and continuous information that can be used to help improve health system performance. In addition, there is limited evidence on the effectiveness of RHIS strengthening interventions in improving data quality and use. The purpose of this study is to evaluate the usefulness of the newly developed Performance of Routine Information System Management (PRISM) framework, which consists of a conceptual framework and associated data collection and analysis tools to assess, design, strengthen and evaluate RHIS. The specific objectives of the study are: a) to assess the reliability and validity of the PRISM instruments and b) to assess the validity of the PRISM conceptual framework. Facility- and worker-level data were collected from 110 health care facilities in twelve districts in Uganda in 2004 and 2007 using records reviews, structured interviews and self-administered questionnaires. The analysis procedures include Cronbach's alpha to assess internal consistency of selected instruments, test-retest analysis to assess the reliability and sensitivity of the instruments, and bivariate and multivariate statistical techniques to assess validity of the PRISM instruments and conceptual framework. Cronbach's alpha analysis suggests high reliability (0.7 or greater) for the indices measuring a promotion of a culture of information, RHIS tasks self-efficacy and motivation. The study results also suggest that a promotion of a culture of information influences RHIS tasks self-efficacy, RHIS tasks competence and motivation, and that self-efficacy and the presence of RHIS staff have a direct influence on the use of RHIS information, a key aspect of RHIS performance. 
The study results provide some empirical support for the reliability and validity of the PRISM instruments and the validity of the PRISM conceptual framework, suggesting that the PRISM approach can be effectively used by RHIS policy makers and practitioners to assess the RHIS and evaluate RHIS strengthening interventions. However, additional studies with larger sample sizes are needed to further investigate the value of the PRISM instruments in exploring the linkages between RHIS data quality and use, and health systems performance.
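Several records in this listing, including the PRISM study above, report Cronbach's alpha as a measure of internal consistency. As a hedged illustration of what that statistic computes, the sketch below uses the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The function name and the data layout (one list of item scores per respondent) are assumptions made for this example, not something taken from the study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    Hypothetical helper; uses sample variances (n - 1 denominator).
    """
    k = len(scores[0])                       # number of items in the scale
    items = list(zip(*scores))               # transpose: one tuple per item
    item_var_sum = sum(variance(item) for item in items)
    total_var = variance([sum(row) for row in scores])  # variance of sum scores
    return k / (k - 1) * (1 - item_var_sum / total_var)
```

When every item ranks respondents identically, the item variances account for as little of the total variance as possible and alpha reaches 1; values of 0.7 or greater are the threshold the abstract treats as high reliability.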
Improving the Performance of the Listening Competency Scale: Revision and Validation

ERIC Educational Resources Information Center

Mickelson, William T.; Welch, S. A.

2013-01-01

Measuring latent traits is central to quantitative listening research and has been the focus of many studies. One such prominent measurement instrument, based on the Wolvin and Coakley (1993) listening taxonomy, was developed by Ford, Wolvin, and Chung (2000). Subsequent validation research (Mickelson & Welch, 2012) called for revisiting and…
Validity Evidence for the Measurement of the Strength of Motivation for Medical School

ERIC Educational Resources Information Center

Kusurkar, Rashmi; Croiset, Gerda; Kruitwagen, Cas; ten Cate, Olle

2011-01-01

The Strength of Motivation for Medical School (SMMS) questionnaire is designed to determine the strength of motivation of students particularly for medical study. This research was performed to establish the validity evidence for measuring strength of motivation for medical school. Internal structure and relations to other variables were used as…
A Comparison between SRSS-IE and SSiS-PSG Scores: Examining Convergent Validity

ERIC Educational Resources Information Center

Lane, Kathleen Lynne; Oakes, Wendy Peia; Common, Eric Alan; Zorigian, Kris; Brunsting, Nelson C.; Schatschneider, Christopher

2015-01-01

We report findings of a validation study comparing two screening tools: the Student Risk Screening Scale-Internalizing and Externalizing (SRSS-IE, an adapted version of the Student Risk Screening Scale) and the Social Skills Improvement System-Performance Screening Guide (SSiS-PSG). Participants included 458 kindergarten through fifth-grade…
Additional Evidence of Convergent Validity between SRSS-IE and SSiS-PSG Scores

ERIC Educational Resources Information Center

Lane, Kathleen Lynne; Oakes, Wendy Peia; Ennis, Robin Parks; Royer, David James

2015-01-01

We report findings of a validity study comparing two screening tools: the Student Risk Screening Scale-Internalizing and Externalizing (SRSS-IE) and the Social Skills Improvement System-Performance Screening Guide (SSiS-PSG; Elliott & Gresham, 2007). Participants were 1,680 kindergarten through sixth-grade elementary students from three…
Validity and Generalizability of Measuring Student Engaged Time in Physical Education.

ERIC Educational Resources Information Center

Silverman, Stephen; Zotos, Connee

The validity of interval and time sampling methods of measuring student engaged time was investigated in a study estimating the actual time students spent engaged in relevant motor performance in physical education classes. Two versions of the interval Academic Learning Time in Physical Education (ALT-PE) instrument and an equivalent time sampling…
Validity Evidence for Games as Assessment Environments. CRESST Report 773

ERIC Educational Resources Information Center

Delacruz, Girlie C.; Chung, Gregory K. W. K.; Baker, Eva L.

2010-01-01

This study provides empirical evidence of a highly specific use of games in education--the assessment of the learner. Linear regressions were used to examine the predictive and convergent validity of a math game as assessment of mathematical understanding. Results indicate that prior knowledge significantly predicts game performance. Results also…
Development and Validation of a Mathematics Anxiety Scale for Students

ERIC Educational Resources Information Center

Ko, Ho Kyoung; Yi, Hyun Sook

2011-01-01

This study developed and validated a Mathematics Anxiety Scale for Students (MASS) that can be used to measure the level of mathematics anxiety that students experience in school settings and help them overcome anxiety and perform better in mathematics achievement. We conducted a series of preliminary analyses and panel reviews to evaluate quality…
Validating Cognitive Models of Task Performance in Algebra on the SAT®. Research Report No. 2009-3

ERIC Educational Resources Information Center

Gierl, Mark J.; Leighton, Jacqueline P.; Wang, Changjiang; Zhou, Jiawen; Gokiert, Rebecca; Tan, Adele

2009-01-01

The purpose of the study is to present research focused on validating the four algebra cognitive models in Gierl, Wang, et al., using student response data collected with protocol analysis methods to evaluate the knowledge structures and processing skills used by a sample of SAT test takers.
Participation in Occupational Performance: Reliability and Validity of the Activity Card Sort.

ERIC Educational Resources Information Center

Katz, Noomi; Karpin, Hanah; Lak, Arit; Furman, Tania; Hartman-Maeir, Adina

2003-01-01

A study assessed the reliability and validity of the Activity Card Sort (ACS) within different adult groups (n=263): healthy adults, healthy older adults, Alzheimer's caregivers, multiple sclerosis patients, and stroke survivors. Found that the ACS had high internal consistency for daily living and social-cultural activities and a lower…
Readability Level of Standardized Test Items and Student Performance: The Forgotten Validity Variable

ERIC Educational Resources Information Center

Hewitt, Margaret A.; Homan, Susan P.

2004-01-01

Test validity issues considered by test developers and school districts rarely include individual item readability levels. In this study, items from a major standardized test were examined for individual item readability level and item difficulty. The Homan-Hewitt Readability Formula was applied to items across three grade levels. Results of…

Adolescent Time Attitude Scale: Adaptation into Turkish

ERIC Educational Resources Information Center

Çelik, Eyüp; Sahranç, Ümit; Kaya, Mehmet; Turan, Mehmet Emin

2017-01-01

This research is aimed at examining the validity and reliability of the Turkish version of the Time Attitude Scale. Data were collected from 433 adolescents; 206 males and 227 females participated in the study. Confirmatory factor analysis was performed to examine the structural validity of the scale. The internal consistency method was used for…
Validity of the MicroDYN Approach: Complex Problem Solving Predicts School Grades beyond Working Memory Capacity

ERIC Educational Resources Information Center

Schweizer, Fabian; Wustenberg, Sascha; Greiff, Samuel

2013-01-01

This study examines the validity of the complex problem solving (CPS) test MicroDYN by investigating a) the relation between its dimensions--rule identification (exploration strategy), rule knowledge (acquired knowledge), rule application (control performance)--and working memory capacity (WMC), and b) whether CPS predicts school grades in…
Is Teacher Assessment Reliable or Valid for High School Students under a Web-Based Portfolio Environment?

ERIC Educational Resources Information Center

Chang, Chi-Cheng; Wu, Bing-Hong

2012-01-01

This study explored the reliability and validity of teacher assessment under a Web-based portfolio assessment environment (or Web-based teacher portfolio assessment). Participants were 72 eleventh graders taking the "Computer Application" course. The students perform portfolio creation, inspection, self- and peer-assessment using the Web-based…
Screening for Social, Emotional, and Behavioral Problems at Kindergarten Entry: Utility and Incremental Validity of Parent Report

ERIC Educational Resources Information Center

Owens, Julie Sarno; Storer, Jennifer; Holdaway, Alex S.; Serrano, Verenea J.; Watabe, Yuko; Himawan, Lina K.; Krelko, Rebecca E.; Vause, Katherine J.; Girio-Herrera, Erin; Andrews, Nina

2015-01-01

The current study examined the utility and incremental validity of parent ratings on the Strengths and Difficulties Questionnaire and Disruptive Behavior Disorders rating scale completed at kindergarten registration in identifying risk status as defined by important criterion variables (teacher ratings, daily behavioral performance, and quarterly…
Dynamic testing in schizophrenia: does training change the construct validity of a test?

PubMed

Wiedl, Karl H; Schöttke, Henning; Green, Michael F; Nuechterlein, Keith H

2004-01-01

Dynamic testing typically involves specific interventions for a test to assess the extent to which test performance can be modified, beyond level of baseline (static) performance. This study used a dynamic version of the Wisconsin Card Sorting Test (WCST) that is based on cognitive remediation techniques within a test-training-test procedure. From results of previous studies with schizophrenia patients, we concluded that the dynamic and static versions of the WCST should have different construct validity. This hypothesis was tested by examining the patterns of correlations with measures of executive functioning, secondary verbal memory, and verbal intelligence. Results demonstrated a specific construct validity of WCST dynamic (i.e., posttest) scores as an index of problem solving (Tower of Hanoi) and secondary verbal memory and learning (Auditory Verbal Learning Test), whereas the impact of general verbal capacity and selective attention (Verbal IQ, Stroop Test) was reduced. It is concluded that the construct validity of the test changes with dynamic administration and that this difference helps to explain why the dynamic version of the WCST predicts functional outcome better than the static version.
Coverage of the Test of Memory Malingering, Victoria Symptom Validity Test, and Word Memory Test on the Internet: is test security threatened?

PubMed

Bauer, Lyndsey; McCaffrey, Robert J

2006-01-01

In forensic neuropsychological settings, maintaining test security has become critically important, especially in regard to symptom validity tests (SVTs). Coaching, which can entail providing patients or litigants with information about the cognitive sequelae of head injury, or teaching them test-taking strategies to avoid detection of symptom dissimulation, has been examined experimentally in many research studies. Emerging evidence suggests that coaching strategies affect psychological and neuropsychological test performance to differing degrees depending on the coaching paradigm and the tests administered. The present study sought to examine Internet coverage of SVTs because it is potentially another source of coaching, or information that is readily available. Google searches were performed on the Test of Memory Malingering, the Victoria Symptom Validity Test, and the Word Memory Test. Results indicated that there is a variable amount of information available about each test that could threaten test security and validity should inappropriately interested parties find it. Steps that could be taken to improve this situation and limitations to this exploration are discussed.
Performance Ratings: Designs for Evaluating Their Validity and Accuracy.

DTIC Science & Technology

1986-07-01

ratees with substantial validity and with little bias due to the method for rating. Convergent validity and discriminant validity account for approximately... The expanded research design suggests that purpose for the ratings has little influence on the multitrait-multimethod properties of the ratings... Convergent and discriminant validity again account for substantial differences in the ratings of performance. Little method bias is present; both methods of
Evaluating the Effects of Executive Learning and Development on Organisational Performance: Implications for Developing Senior Manager and Executive Capabilities

ERIC Educational Resources Information Center

Akrofi, Solomon

2016-01-01

In spite of decades of research into high-performance work systems, very few studies have examined the relationship between executive learning and development and organisational performance. In an attempt to close this gap, this study explores the effects of a validated four-dimensional executive learning and development measure on a composite…
The Development of Performance Indicators for Prison Libraries.

ERIC Educational Resources Information Center

Lithgow, Susan D.

This report describes a study to investigate the improved efficiency and effectiveness of prison library provision in England and Wales, through the development and validation of relevant performance indicators to be used as part of quality assurance programs. Prison libraries perform important educational, rehabilitative, and recreational…
Blast effect on the lower extremities and its mitigation: a computational study.

PubMed

Dong, Liqiang; Zhu, Feng; Jin, Xin; Suresh, Mahi; Jiang, Binhui; Sevagan, Gopinath; Cai, Yun; Li, Guangyao; Yang, King H

2013-12-01

A series of computational studies was performed to investigate the response of the lower extremities of mounted soldiers under landmine detonation. A numerical human body model newly developed at Wayne State University was used to simulate two types of experimental studies, and the model predictions were validated against test data in terms of the tibia axial force as well as bone fracture pattern. Based on the validated model, the minimum axial force causing tibia fracture was found. Then a series of parametric studies was conducted to determine the critical velocity (peak velocity of the floor plate) causing tibia fracture at different upper/lower leg angles. In addition, to limit the load transmission through the vehicular floor, two types of energy absorbing materials, namely IMPAXX® foam and aluminum alloy honeycomb, were selected for floor matting. Their performances in terms of blast effect mitigation were compared using the validated numerical model, and it was found that honeycomb is a more efficient material for blast injury prevention under the loading conditions studied. © 2013 Elsevier Ltd. All rights reserved.
Development and Validation of a Safety Climate Scale for Manufacturing Industry

PubMed Central

Ghahramani, Abolfazl; Khalkhali, Hamid R.

2015-01-01

Background This paper describes the development of a scale for measuring safety climate. Methods This study was conducted in six manufacturing companies in Iran. The scale was developed by conducting a literature review about the safety climate and constructing a question pool. The number of items was reduced to 71 after performing a screening process. Results The result of content validity analysis showed that 59 items had excellent item content validity index (≥ 0.78) and content validity ratio (> 0.38). The exploratory factor analysis resulted in eight safety climate dimensions. The reliability value for the final 45-item scale was 0.96. The result of confirmatory factor analysis showed that the safety climate model is satisfactory. Conclusion This study produced a valid and reliable scale for measuring safety climate in manufacturing companies. PMID:26106508
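The abstract above reports an item content validity index and a content validity ratio without giving formulas. The sketch below shows the definitions that are commonly used for these two statistics: Lawshe's ratio, CVR = (n_e - N/2) / (N/2), and the item-level index as the share of experts rating an item as relevant. It is an assumption that the study used exactly these forms; the function names, the 4-point relevance scale and the cut-off of 3 are illustrative.

```python
def content_validity_ratio(n_essential, n_experts):
    """Lawshe's CVR: ranges from -1 to 1; 0 when half the panel says 'essential'."""
    half = n_experts / 2
    return (n_essential - half) / half


def item_content_validity_index(relevance_ratings, relevant_from=3):
    """I-CVI: proportion of experts rating the item at or above `relevant_from`.

    Assumes a 4-point relevance scale where 3 and 4 count as relevant.
    """
    relevant = sum(1 for rating in relevance_ratings if rating >= relevant_from)
    return relevant / len(relevance_ratings)
```

With a ten-member panel, an item judged essential by eight experts has a CVR of 0.6, and an item rated 3 or 4 by four of five experts has an I-CVI of 0.8, which would clear the 0.78 threshold quoted in the abstract.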
Reliability and validity of the McDonald Play Inventory.

PubMed

McDonald, Ann E; Vigen, Cheryl

2012-01-01

This study examined the ability of a two-part self-report instrument, the McDonald Play Inventory, to reliably and validly measure the play activities and play styles of 7- to 11-yr-old children and to discriminate between the play of neurotypical children and children with known learning and developmental disabilities. A total of 124 children ages 7-11 recruited from a sample of convenience and a subsample of 17 parents participated in this study. Reliability estimates yielded moderate correlations for internal consistency, total test intercorrelations, and test-retest reliability. Validity estimates were established for content and construct validity. The results suggest that a self-report instrument yields reliable and valid measures of a child's perceived play performance and discriminates between the play of children with and without disabilities. Copyright © 2012 by the American Occupational Therapy Association, Inc.
Imputation of missing data in time series for air pollutants

NASA Astrophysics Data System (ADS)

Junger, W. L.; Ponce de Leon, A.

2015-02-01

Missing data are a major concern in epidemiological studies of the health effects of environmental air pollutants. This article presents an imputation-based method that is suitable for multivariate time series data, which uses the EM algorithm under the assumption of normal distribution. Different approaches are considered for filtering the temporal component. A simulation study was performed to assess the validity and performance of the proposed method in comparison with some frequently used methods. Simulations showed that when the amount of missing data was as low as 5%, the complete data analysis yielded satisfactory results regardless of the generating mechanism of the missing data, whereas the validity began to degenerate when the proportion of missing values exceeded 10%. The proposed imputation method exhibited good accuracy and precision in different settings with respect to the patterns of missing observations. Most of the imputations obtained valid results, even under missing not at random. The methods proposed in this study are implemented as a package called mtsdi for the statistical software system R.
Validity of GRE General Test scores and TOEFL scores for graduate admission to a technical university in Western Europe

NASA Astrophysics Data System (ADS)

Zimmermann, Judith; von Davier, Alina A.; Buhmann, Joachim M.; Heinimann, Hans R.

2018-01-01

Graduate admission has become a critical process in tertiary education, whereby selecting valid admissions instruments is key. This study assessed the validity of Graduate Record Examination (GRE) General Test scores for admission to Master's programmes at a technical university in Europe. We investigated the indicative value of GRE scores for the Master's programme grade point average (GGPA) with and without the addition of the undergraduate GPA (UGPA) and the TOEFL score, and of GRE scores for study completion and Master's thesis performance. GRE scores explained 20% of the variation in the GGPA, while an additional 7% was explained by the TOEFL score and 3% by the UGPA. Contrary to common belief, the GRE quantitative reasoning score showed only little explanatory power. GRE scores were also weakly related to study progress but not to thesis performance. Nevertheless, GRE and TOEFL scores were found to be sensible admissions instruments. Rigorous methodology was used to obtain highly reliable results.
Performance Tested Method multiple laboratory validation study of ELISA-based assays for the detection of peanuts in food.

PubMed

Park, Douglas L; Coates, Scott; Brewer, Vickery A; Garber, Eric A E; Abouzied, Mohamed; Johnson, Kurt; Ritter, Bruce; McKenzie, Deborah

2005-01-01

Performance Tested Method multiple laboratory validations for the detection of peanut protein in 4 different food matrixes were conducted under the auspices of the AOAC Research Institute. In this blind study, 3 commercially available ELISA test kits were validated: Neogen Veratox for Peanut, R-Biopharm RIDASCREEN FAST Peanut, and Tepnel BioKits for Peanut Assay. The food matrixes used were breakfast cereal, cookies, ice cream, and milk chocolate spiked at 0 and 5 ppm peanut. Analyses of the samples were conducted by laboratories representing industry and international and U.S. governmental agencies. All 3 commercial test kits successfully identified spiked and peanut-free samples. The validation study required 60 analyses on test samples at the target level 5 µg peanut/g food and 60 analyses at a peanut-free level, which was designed to ensure that the lower 95% confidence limit for the sensitivity and specificity would not be <90%. The probability that a test sample contains an allergen given a prevalence rate of 5% and a positive test result using a single test kit analysis with 95% sensitivity and 95% specificity, which was demonstrated for these test kits, would be 50%. When 2 test kits are run simultaneously on all samples, the probability becomes 95%. It is therefore recommended that all field samples be analyzed with at least 2 of the validated kits.
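The 50% and 95% figures quoted in the abstract above follow from Bayes' rule. The sketch below reproduces that arithmetic under two assumptions that the abstract does not state explicitly: that "2 test kits run simultaneously" means both kits return a positive result, and that the kits err independently of each other given the true status of the sample. The function and parameter names are hypothetical.

```python
def positive_predictive_value(prevalence, sensitivity, specificity, n_tests=1):
    """P(allergen present | all n_tests kits positive).

    Assumes the kits are conditionally independent given the true sample status.
    """
    # Probability that a contaminated sample is flagged by every kit.
    true_pos = prevalence * sensitivity ** n_tests
    # Probability that a clean sample is (wrongly) flagged by every kit.
    false_pos = (1 - prevalence) * (1 - specificity) ** n_tests
    return true_pos / (true_pos + false_pos)
```

With a prevalence of 0.05 and sensitivity and specificity of 0.95, one kit gives a value of about 0.50 and two kits give about 0.95, which matches the abstract. The gain comes from the false-positive term shrinking from 5% to 0.25% of clean samples while the true-positive term falls only slightly.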
External validation of a Cox prognostic model: principles and methods

PubMed Central

2013-01-01

Background A prognostic model should not enter clinical practice unless it has been demonstrated that it performs a useful role. External validation denotes evaluation of model performance in a sample independent of that used to develop the model. Unlike for logistic regression models, external validation of Cox models is sparsely treated in the literature. Successful validation of a model means achieving satisfactory discrimination and calibration (prediction accuracy) in the validation sample. Validating Cox models is not straightforward because event probabilities are estimated relative to an unspecified baseline function. Methods We describe statistical approaches to external validation of a published Cox model according to the level of published information, specifically (1) the prognostic index only, (2) the prognostic index together with Kaplan-Meier curves for risk groups, and (3) the first two plus the baseline survival curve (the estimated survival function at the mean prognostic index across the sample). The most challenging task, requiring level 3 information, is assessing calibration, for which we suggest a method of approximating the baseline survival function. Results We apply the methods to two comparable datasets in primary breast cancer, treating one as derivation and the other as validation sample. Results are presented for discrimination and calibration. We demonstrate plots of survival probabilities that can assist model evaluation. Conclusions Our validation methods are applicable to a wide range of prognostic studies and provide researchers with a toolkit for external validation of a published Cox model. PMID:23496923
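The role of the baseline survival curve in assessing calibration, mentioned in the abstract above, can be illustrated with the standard proportional hazards relation S(t | PI) = S0(t) ^ exp(PI - mean PI), where S0(t) is the survival function at the mean prognostic index. This is a minimal sketch of that textbook relation, not the approximation method the authors propose; the function and argument names are hypothetical.

```python
import math


def predicted_survival(baseline_survival, prognostic_index, mean_prognostic_index):
    """Survival probability at one time point for a given prognostic index.

    baseline_survival is S0(t), the estimated survival at the mean prognostic
    index, so the index is centred on that mean before exponentiating.
    """
    hazard_ratio = math.exp(prognostic_index - mean_prognostic_index)
    return baseline_survival ** hazard_ratio
```

A patient whose prognostic index equals the sample mean simply receives the baseline value, and a patient whose index is higher by ln 2 (a doubled hazard) receives the baseline value squared, for example 0.64 when the baseline is 0.8. Comparing such predicted probabilities with observed Kaplan-Meier estimates in the validation sample is one way to inspect calibration.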
The methodological quality of three foundational law enforcement drug influence evaluation validation studies

PubMed Central

2013-01-01

Background A Drug Influence Evaluation (DIE) is a formal assessment of an impaired driving suspect, performed by a trained law enforcement officer who uses circumstantial facts, questioning, searching, and a physical exam to form an unstandardized opinion as to whether a suspect’s driving was impaired by drugs. This paper first identifies the scientific studies commonly cited in American criminal trials as evidence of DIE accuracy, and second, uses the QUADAS tool to investigate whether the methodologies used by these studies allow them to correctly quantify the diagnostic accuracy of the DIEs currently administered by US law enforcement. Results Three studies were selected for analysis. For each study, the QUADAS tool identified biases that distorted reported accuracies. The studies were subject to spectrum bias, selection bias, misclassification bias, verification bias, differential verification bias, incorporation bias, and review bias. The studies quantified DIE performance with prevalence-dependent accuracy statistics that are internally but not externally valid. Conclusion The accuracies reported by these studies do not quantify the accuracy of the DIE process now used by US law enforcement. These studies do not validate current DIE practice. PMID:24188398
Validation of a physically based catchment model for application in post-closure radiological safety assessments of deep geological repositories for solid radioactive wastes.

PubMed

Thorne, M C; Degnan, P; Ewen, J; Parkin, G

2000-12-01

The physically based river catchment modelling system SHETRAN incorporates components representing water flow, sediment transport and radionuclide transport both in solution and bound to sediments. The system has been applied to simulate hypothetical future catchments in the context of post-closure radiological safety assessments of a potential site for a deep geological disposal facility for intermediate and certain low-level radioactive wastes at Sellafield, west Cumbria. In order to have confidence in the application of SHETRAN for this purpose, various blind validation studies have been undertaken. In earlier studies, the validation was undertaken against uncertainty bounds in model output predictions set by the modelling team on the basis of how well they expected the model to perform. However, validation can also be carried out with bounds set on the basis of how well the model is required to perform in order to constitute a useful assessment tool. Herein, such an assessment-based validation exercise is reported. This exercise related to a field plot experiment conducted at Calder Hollow, west Cumbria, in which the migration of strontium and lanthanum in subsurface Quaternary deposits was studied on a length scale of a few metres. Blind predictions of tracer migration were compared with experimental results using bounds set by a small group of assessment experts independent of the modelling team. Overall, the SHETRAN system performed well, failing only two out of seven of the imposed tests. Furthermore, of the five tests that were not failed, three were positively passed even when a pessimistic view was taken as to how measurement errors should be taken into account. It is concluded that the SHETRAN system, which is still being developed further, is a powerful tool for application in post-closure radiological safety assessments.
Validity and reliability of global operative assessment of laparoscopic skills (GOALS) in novice trainees performing a laparoscopic cholecystectomy.

PubMed

Kramp, Kelvin H; van Det, Marc J; Hoff, Christiaan; Lamme, Bas; Veeger, Nic J G M; Pierie, Jean-Pierre E N

2015-01-01

Global Operative Assessment of Laparoscopic Skills (GOALS) assessment has been designed to evaluate skills in laparoscopic surgery. A longitudinal blinded study of randomized video fragments was conducted to estimate the validity and reliability of GOALS in novice trainees. In total, 10 trainees each performed 6 consecutive laparoscopic cholecystectomies. Sixty procedures were recorded on video. Video fragments of (1) opening of the peritoneum; (2) dissection of Calot's triangle and achievement of critical view of safety; and (3) dissection of the gallbladder from the liver bed were blinded, randomized, and rated by 2 consultant surgeons using GOALS. Also, a grade was given for overall competence. The correlation of GOALS with live observation Objective Structured Assessment of Technical Skills (OSATS) scores was calculated. Construct validity was estimated using the Friedman 2-way analysis of variance by ranks and the Wilcoxon signed-rank test. The interrater reliability was calculated using the absolute and consistency agreement 2-way random-effects model intraclass correlation coefficient. A high correlation was found between mean GOALS score (r = 0.879, p = 0.021) and mean OSATS score. The GOALS score increased significantly across the 6 procedures (p = 0.002). The trainees performed significantly better on their sixth when compared with their first cholecystectomy (p = 0.004). The consistency agreement interrater reliability was 0.37 for the mean GOALS score (p = 0.002) and 0.55 for overall competence (p < 0.001) of the 3 video fragments. The validity observed in this randomized blinded longitudinal study supports the existing evidence that GOALS is a valid tool for assessment of novice trainees. A relatively low reliability was found in this study. Copyright © 2014 Association of Program Directors in Surgery. Published by Elsevier Inc. All rights reserved.
Validation of a Multimarker Model for Assessing Risk of Type 2 Diabetes from a Five-Year Prospective Study of 6784 Danish People (Inter99)

PubMed Central

Urdea, Mickey; Kolberg, Janice; Wilber, Judith; Gerwien, Robert; Moler, Edward; Rowe, Michael; Jorgensen, Paul; Hansen, Torben; Pedersen, Oluf; Jørgensen, Torben; Borch-Johnsen, Knut

2009-01-01

Background Improved identification of subjects at high risk for development of type 2 diabetes would allow preventive interventions to be targeted toward individuals most likely to benefit. In previous research, predictive biomarkers were identified and used to develop multivariate models to assess an individual's risk of developing diabetes. Here we describe the training and validation of the PreDx™ Diabetes Risk Score (DRS) model in a clinical laboratory setting using baseline serum samples from subjects in the Inter99 cohort, a population-based primary prevention study of cardiovascular disease. Methods Among 6784 subjects free of diabetes at baseline, 215 subjects progressed to diabetes (converters) during five years of follow-up. A nested case-control study was performed using serum samples from 202 converters and 597 randomly selected nonconverters. Samples were randomly assigned to equally sized training and validation sets. Seven biomarkers were measured using assays developed for use in a clinical reference laboratory. Results The PreDx DRS model performed better on the training set (area under the curve [AUC] = 0.837) than fasting plasma glucose alone (AUC = 0.779). When applied to the sequestered validation set, the PreDx DRS showed the same performance (AUC = 0.838), thus validating the model. This model had a better AUC than any other single measure from a fasting sample. Moreover, the model provided further risk stratification among high-risk subpopulations with impaired fasting glucose or metabolic syndrome. Conclusions The PreDx DRS provides the absolute risk of diabetes conversion in five years for subjects identified to be “at risk” using the clinical factors. PMID:20144324
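The AUC values reported above can be read as the probability that a randomly chosen converter receives a higher risk score than a randomly chosen nonconverter. A minimal sketch of that pairwise definition follows; the scores are made up for illustration and the function name is an assumption, not part of the PreDx DRS method.

```python
def auc(converter_scores, nonconverter_scores):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (converter, nonconverter) pairs in which the
    converter has the higher score; ties count as half a win.
    """
    if not converter_scores or not nonconverter_scores:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for c in converter_scores:
        for n in nonconverter_scores:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(converter_scores) * len(nonconverter_scores))


# Hypothetical risk scores, not data from the Inter99 cohort.
example_auc = auc([0.9, 0.8, 0.4], [0.5, 0.3, 0.2])
```

Computing the same statistic separately on the training and the sequestered validation set, as the study did, shows whether performance holds up on data the model has not seen.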

Validation of a multimarker model for assessing risk of type 2 diabetes from a five-year prospective study of 6784 Danish people (Inter99).

PubMed

Urdea, Mickey; Kolberg, Janice; Wilber, Judith; Gerwien, Robert; Moler, Edward; Rowe, Michael; Jorgensen, Paul; Hansen, Torben; Pedersen, Oluf; Jørgensen, Torben; Borch-Johnsen, Knut

2009-07-01

Improved identification of subjects at high risk for development of type 2 diabetes would allow preventive interventions to be targeted toward individuals most likely to benefit. In previous research, predictive biomarkers were identified and used to develop multivariate models to assess an individual's risk of developing diabetes. Here we describe the training and validation of the PreDx Diabetes Risk Score (DRS) model in a clinical laboratory setting using baseline serum samples from subjects in the Inter99 cohort, a population-based primary prevention study of cardiovascular disease. Among 6784 subjects free of diabetes at baseline, 215 subjects progressed to diabetes (converters) during five years of follow-up. A nested case-control study was performed using serum samples from 202 converters and 597 randomly selected nonconverters. Samples were randomly assigned to equally sized training and validation sets. Seven biomarkers were measured using assays developed for use in a clinical reference laboratory. The PreDx DRS model performed better on the training set (area under the curve [AUC] = 0.837) than fasting plasma glucose alone (AUC = 0.779). When applied to the sequestered validation set, the PreDx DRS showed the same performance (AUC = 0.838), thus validating the model. This model had a better AUC than any other single measure from a fasting sample. Moreover, the model provided further risk stratification among high-risk subpopulations with impaired fasting glucose or metabolic syndrome. The PreDx DRS provides the absolute risk of diabetes conversion in five years for subjects identified to be "at risk" using the clinical factors. Copyright 2009 Diabetes Technology Society.
Risk assessment model for development of advanced age-related macular degeneration.

PubMed

Klein, Michael L; Francis, Peter J; Ferris, Frederick L; Hamon, Sara C; Clemons, Traci E

2011-12-01

To design a risk assessment model for development of advanced age-related macular degeneration (AMD) incorporating phenotypic, demographic, environmental, and genetic risk factors. We evaluated longitudinal data from 2846 participants in the Age-Related Eye Disease Study. At baseline, these individuals had all levels of AMD, ranging from none to unilateral advanced AMD (neovascular or geographic atrophy). Follow-up averaged 9.3 years. We performed a Cox proportional hazards analysis with demographic, environmental, phenotypic, and genetic covariates and constructed a risk assessment model for development of advanced AMD. Performance of the model was evaluated using the C statistic and the Brier score and externally validated in participants in the Complications of Age-Related Macular Degeneration Prevention Trial. The final model included the following independent variables: age, smoking history, family history of AMD (first-degree member), phenotype based on a modified Age-Related Eye Disease Study simple scale score, and genetic variants CFH Y402H and ARMS2 A69S. The model did well on performance measures, with very good discrimination (C statistic = 0.872) and excellent calibration and overall performance (Brier score at 5 years = 0.08). Successful external validation was performed, and a risk assessment tool was designed for use with or without the genetic component. We constructed a risk assessment model for development of advanced AMD. The model performed well on measures of discrimination, calibration, and overall performance and was successfully externally validated. This risk assessment tool is available for online use.
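For readers unfamiliar with the Brier score, the sketch below computes it as the mean squared difference between predicted probabilities and observed binary outcomes, where lower is better. It deliberately ignores censoring, which a survival analysis at 5 years would normally have to handle, so it is a simplification and not the study's procedure; the inputs are hypothetical.

```python
def brier_score(predicted_probs, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes.

    Simplifying assumption: every subject's outcome at the time horizon is
    known (no censoring).
    """
    if len(predicted_probs) != len(outcomes) or not outcomes:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - y) ** 2 for p, y in zip(predicted_probs, outcomes)]
    return sum(squared_errors) / len(squared_errors)


# Hypothetical predictions and outcomes (1 = progressed to advanced AMD).
example_brier = brier_score([0.1, 0.9, 0.2, 0.7], [0, 1, 0, 1])
```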
A meta-analytic review of self-reported, clinician-rated, and performance-based motivation measures in schizophrenia: Are we measuring the same "stuff"?

PubMed

Luther, Lauren; Firmin, Ruth L; Lysaker, Paul H; Minor, Kyle S; Salyers, Michelle P

2018-04-07

An array of self-reported, clinician-rated, and performance-based measures has been used to assess motivation in schizophrenia; however, the convergent validity evidence for these motivation assessment methods is mixed. The current study is a series of meta-analyses that summarize the relationships between methods of motivation measurement in 45 studies of people with schizophrenia. The overall mean effect size between self-reported and clinician-rated motivation measures (r = 0.27, k = 33) was significant, positive, and approaching medium in magnitude, and the overall effect size between performance-based and clinician-rated motivation measures (r = 0.21, k = 11) was positive, significant, and small in magnitude. The overall mean effect size between self-reported and performance-based motivation measures was negligible and non-significant (r = -0.001, k = 2), but this meta-analysis was underpowered. Findings suggest modest convergent validity between clinician-rated and both self-reported and performance-based motivation measures, but additional work is needed to clarify the convergent validity between self-reported and performance-based measures. Further, there is likely more variability than similarity in the underlying construct that is being assessed across the three methods, particularly between the performance-based and other motivation measurement types. These motivation assessment methods should not be used interchangeably, and measures should be more precisely described as the specific motivational construct or domain they are capturing. Copyright © 2018 Elsevier Ltd. All rights reserved.
Comparison of performance-based assessment and real world skill in people with serious mental illness: Ecological validity of the Test of Grocery Shopping Skills.

PubMed

Faith, Laura A; Rempfer, Melisa V

2018-05-07

Valid functional measures are essential for clinical and research efforts that address recovery and community functioning in people with serious mental illness. Although there is a great deal of interest in functional assessment, there is limited research supporting how well current evaluation methods provide a true assessment of real world functioning or naturalistic behavior. To address this gap in the literature, the present study examined the performance of individuals with serious mental illness (i.e., diagnosis of schizophrenia-spectrum, bipolar disorder, or other depression/anxiety diagnoses and accompanying functional disability) on the Test of Grocery Shopping Skills (TOGSS), a performance-based naturalistic task. We compared TOGSS performance to two dimensions of real world functioning: directly observed real world grocery shopping and ratings of community functioning. Results indicated that the TOGSS was significantly associated with real life grocery shopping, in terms of both shopping accuracy (r = 0.424) and time (r = 0.491). Further, self-report and observer-rated methods of assessing real world shopping behaviors were significantly correlated (r = 0.455). To our knowledge, this is one of the first studies to directly compare a performance-based naturalistic skill assessment with carefully observed real world performance of that skill in people with serious mental illness. These findings support the feasibility and ecological validity of performance-based naturalistic assessment with the TOGSS. Copyright © 2018 Elsevier B.V. All rights reserved.
Active imaging system performance model for target acquisition

NASA Astrophysics Data System (ADS)

Espinola, Richard L.; Teaney, Brian; Nguyen, Quang; Jacobs, Eddie L.; Halford, Carl E.; Tofsted, David H.

2007-04-01

The U.S. Army RDECOM CERDEC Night Vision & Electronic Sensors Directorate has developed a laser-range-gated imaging system performance model for the detection, recognition, and identification of vehicle targets. The model is based on the established US Army RDECOM CERDEC NVESD sensor performance models of the human system response through an imaging system. The Java-based model, called NVLRG, accounts for the effect of active illumination, atmospheric attenuation, and turbulence effects relevant to LRG imagers, such as speckle and scintillation, and for the critical sensor and display components. This model can be used to assess the performance of recently proposed active SWIR systems through various trade studies. This paper will describe the NVLRG model in detail, discuss the validation of recent model components, present initial trade study results, and outline plans to validate and calibrate the end-to-end model with field data through human perception testing.
A Cross-Validation of easyCBM Mathematics Cut Scores in Washington State: 2009-2010 Test. Technical Report #1105

ERIC Educational Resources Information Center

Anderson, Daniel; Alonzo, Julie; Tindal, Gerald

2011-01-01

In this technical report, we document the results of a cross-validation study designed to identify optimal cut-scores for the use of the easyCBM[R] mathematics test in the state of Washington. A large sample, randomly split into two groups of roughly equal size, was used for this study. Students' performance classification on the Washington state…
A Cross-Validation of easyCBM[R] Mathematics Cut Scores in Oregon: 2009-2010. Technical Report #1104

ERIC Educational Resources Information Center

Anderson, Daniel; Alonzo, Julie; Tindal, Gerald

2011-01-01

In this technical report, we document the results of a cross-validation study designed to identify optimal cut-scores for the use of the easyCBM[R] mathematics test in Oregon. A large sample, randomly split into two groups of roughly equal size, was used for this study. Students' performance classification on the Oregon state test was used as the…
Validating a work group climate assessment tool for improving the performance of public health organizations

PubMed Central

Perry, Cary; LeMay, Nancy; Rodway, Greg; Tracy, Allison; Galer, Joan

2005-01-01

Background This article describes the validation of an instrument to measure work group climate in public health organizations in developing countries. The instrument, the Work Group Climate Assessment Tool (WCA), was applied in Brazil, Mozambique, and Guinea to assess the intermediate outcomes of a program to develop leadership for performance improvement. Data were collected from 305 individuals in 42 work groups, who completed a self-administered questionnaire. Methods The WCA was initially validated using Cronbach's alpha reliability coefficient and exploratory factor analysis. This article presents the results of a second validation study to refine the initial analyses to account for nested data, to provide item-level psychometrics, and to establish construct validity. Analyses included eigenvalue decomposition analysis, confirmatory factor analysis, and validity and reliability analyses. Results This study confirmed the validity and reliability of the WCA across work groups with different demographic characteristics (gender, education, management level, and geographical location). The study showed that there is agreement between the theoretical construct of work climate and the items in the WCA tool across different populations. The WCA captures a single perception of climate rather than individual sub-scales of clarity, support, and challenge. Conclusion The WCA is useful for comparing the climates of different work groups, tracking the changes in climate in a single work group over time, or examining differences among individuals' perceptions of their work group climate. Application of the WCA before and after a leadership development process can help work groups hold a discussion about current climate and select a target for improvement. The WCA provides work groups with a tool to take ownership of their own group climate through a process that is simple and objective and that protects individual confidentiality. PMID:16223447
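Cronbach's alpha, used in the initial validation mentioned above, can be sketched as follows using the usual formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The response matrix is invented for illustration; the function name is an assumption and the code does not account for the nested (work group) structure that the second validation study addressed.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents' item scores.

    responses: list of rows, one row per respondent, one column per item.
    Uses sample variances (n - 1 denominator) throughout.
    """
    n_items = len(responses[0])
    if n_items < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    item_columns = list(zip(*responses))
    sum_item_variances = sum(variance(col) for col in item_columns)
    total_variance = variance([sum(row) for row in responses])
    return n_items / (n_items - 1) * (1 - sum_item_variances / total_variance)


# Hypothetical questionnaire data: 4 respondents, 3 items.
example_alpha = cronbach_alpha([
    [3, 4, 3],
    [4, 5, 4],
    [2, 2, 3],
    [5, 5, 5],
])
```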
Physical examination tests of the shoulder: a systematic review and meta-analysis of diagnostic test performance.

PubMed

Gismervik, Sigmund Ø; Drogset, Jon O; Granviken, Fredrik; Rø, Magne; Leivseth, Gunnar

2017-01-25

Physical examination tests of the shoulder (PETS) are clinical examination maneuvers designed to aid the assessment of shoulder complaints. Despite more than 180 PETS described in the literature, evidence of their validity and usefulness in diagnosing the shoulder is questioned. This meta-analysis aims to use diagnostic odds ratio (DOR) to evaluate how much PETS shift overall probability and to rank the test performance of single PETS in order to aid the clinician's choice of which tests to use. This study adheres to the principles outlined in the Cochrane guidelines and the PRISMA statement. A fixed effect model was used to assess the overall diagnostic validity of PETS by pooling DOR for different PETS with similar biomechanical rationale when possible. Single PETS were assessed and ranked by DOR. Clinical performance was assessed by sensitivity, specificity, accuracy and likelihood ratio. Six thousand nine hundred abstracts and 202 full-text articles were assessed for eligibility; 20 articles were eligible and data from 11 articles could be included in the meta-analysis. All PETS for SLAP (superior labral anterior posterior) lesions pooled gave a DOR of 1.38 [1.13, 1.69]. The Supraspinatus test for any full thickness rotator cuff tear obtained the highest DOR of 9.24 (sensitivity was 0.74, specificity 0.77). Compression-Rotation test obtained the highest DOR (6.36) among single PETS for SLAP lesions (sensitivity 0.43, specificity 0.89) and Hawkins test obtained the highest DOR (2.86) for impingement syndrome (sensitivity 0.58, specificity 0.67). No single PETS showed superior clinical test performance. The clinical performance of single PETS is limited. However, when the different PETS for SLAP lesions were pooled, we found a statistically significant change in post-test probability indicating an overall statistical validity. We suggest that clinicians choose their PETS from among those with the highest pooled DOR and, to assess validity in their own specific clinical settings, review the inclusion criteria of the included primary studies. We further propose that future studies on the validity of PETS use randomized research designs rather than the accuracy design relying less on well-established gold standard reference tests and efficient treatment options.
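The relationship between sensitivity, specificity, likelihood ratios, and the diagnostic odds ratio can be sketched as below, using the textbook identity DOR = LR+ / LR-. The function names are assumptions. Feeding in the rounded sensitivity and specificity figures from the abstract gives values close to, but not identical with, the reported DORs, presumably because the published figures come from pooled study data and not from the rounded summary values.

```python
def likelihood_ratios(sensitivity: float, specificity: float):
    """Return (LR+, LR-) for a diagnostic test."""
    if not (0.0 < sensitivity < 1.0 and 0.0 < specificity < 1.0):
        raise ValueError("sensitivity and specificity must be strictly between 0 and 1")
    lr_positive = sensitivity / (1.0 - specificity)
    lr_negative = (1.0 - sensitivity) / specificity
    return lr_positive, lr_negative


def diagnostic_odds_ratio(sensitivity: float, specificity: float) -> float:
    """DOR = LR+ / LR-, the odds of a positive test in diseased vs non-diseased."""
    lr_positive, lr_negative = likelihood_ratios(sensitivity, specificity)
    return lr_positive / lr_negative


# Rounded values quoted in the abstract for the Supraspinatus test.
dor_supraspinatus = diagnostic_odds_ratio(0.74, 0.77)
```

A DOR of 1 means the test does not discriminate at all, which is why a pooled DOR of 1.38 indicates only a small shift in probability.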
Measuring striving for understanding and learning value of geometry: a validity study

NASA Astrophysics Data System (ADS)

Ubuz, Behiye; Aydınyer, Yurdagül

2017-11-01

The current study aimed to construct a questionnaire that measures students' personality traits related to striving for understanding and learning value of geometry and then examine its psychometric properties. Through the use of multiple methods on two independent samples of 402 and 521 middle school students, two studies were performed to address this issue to provide support for its validity. In Study 1, exploratory factor analysis indicated the two-factor model. In Study 2, confirmatory factor analysis indicated the better fit of two-factor model compared to one or three-factor model. Convergent and discriminant validity evidence provided insight into the distinctiveness of the two factors. Subgroup validity evidence revealed gender differences for striving for understanding geometry trait favouring girls and grade level differences for learning value of geometry trait favouring the sixth- and seventh-grade students. Predictive validity evidence demonstrated that the striving for understanding geometry trait but not learning value of geometry trait was significantly correlated with prior mathematics achievement. In both studies, each factor and the entire questionnaire showed satisfactory reliability. In conclusion, the questionnaire was psychometrically sound.
Modeling the Relationship between Safety Climate and Safety Performance in a Developing Construction Industry: A Cross-Cultural Validation Study

PubMed Central

Zahoor, Hafiz; Chan, Albert P. C.; Utama, Wahyudi P.; Gao, Ran; Zafar, Irfan

2017-01-01

This study attempts to validate a safety performance (SP) measurement model in the cross-cultural setting of a developing country. In addition, it highlights the variations in investigating the relationship between safety climate (SC) factors and SP indicators. The data were collected from forty under-construction multi-storey building projects in Pakistan. Based on the results of exploratory factor analysis, a SP measurement model was hypothesized. It was tested and validated by conducting confirmatory factor analysis on calibration and validation sub-samples respectively. The study confirmed the significant positive impact of SC on safety compliance and safety participation, and negative impact on number of self-reported accidents/injuries. However, number of near-misses could not be retained in the final SP model because it attained a lower standardized path coefficient value. Moreover, instead of safety participation, safety compliance established a stronger impact on SP. The study uncovered safety enforcement and promotion as a novel SC factor, whereas safety rules and work practices was identified as the most neglected factor. The study contributed to the body of knowledge by unveiling the deviations in existing dimensions of SC and SP. The refined model is expected to concisely measure the SP in the Pakistani construction industry, however, caution must be exercised while generalizing the study results to other developing countries. PMID:28350366
Modeling the Relationship between Safety Climate and Safety Performance in a Developing Construction Industry: A Cross-Cultural Validation Study.

PubMed

Zahoor, Hafiz; Chan, Albert P C; Utama, Wahyudi P; Gao, Ran; Zafar, Irfan

2017-03-28

This study attempts to validate a safety performance (SP) measurement model in the cross-cultural setting of a developing country. In addition, it highlights the variations in investigating the relationship between safety climate (SC) factors and SP indicators. The data were collected from forty under-construction multi-storey building projects in Pakistan. Based on the results of exploratory factor analysis, a SP measurement model was hypothesized. It was tested and validated by conducting confirmatory factor analysis on calibration and validation sub-samples respectively. The study confirmed the significant positive impact of SC on safety compliance and safety participation, and negative impact on number of self-reported accidents/injuries. However, number of near-misses could not be retained in the final SP model because it attained a lower standardized path coefficient value. Moreover, instead of safety participation, safety compliance established a stronger impact on SP. The study uncovered safety enforcement and promotion as a novel SC factor, whereas safety rules and work practices was identified as the most neglected factor. The study contributed to the body of knowledge by unveiling the deviations in existing dimensions of SC and SP. The refined model is expected to concisely measure the SP in the Pakistani construction industry, however, caution must be exercised while generalizing the study results to other developing countries.
Traditional Chinese version of the Mayer Salovey Caruso Emotional Intelligence Test (MSCEIT-TC): Its validation and application to schizophrenic individuals.

PubMed

Mao, Wei-Chung; Chen, Li-Fen; Chi, Chia-Hsing; Lin, Ching-Hung; Kao, Yu-Chen; Hsu, Wen-Yau; Lane, Hsien-Yuan; Hsieh, Jen-Chuen

2016-09-30

Schizophrenia is an illness that impairs a person's social cognition. The Mayer Salovey Caruso Emotional Intelligence Test (MSCEIT) is the most well-known test used to measure emotional intelligence (EI), which is a major component of social cognition. Given the absence of EI ability-based scales adapted to Chinese speakers, we translated the MSCEIT into a Traditional Chinese version (MSCEIT-TC) and validated this scale for use in schizophrenia studies. The specific aims were to validate the MSCEIT-TC, to develop a norm for the MSCEIT-TC, and to use this norm to explore the EI performance of schizophrenic individuals. We included in our study seven hundred twenty-eight healthy controls and seventy-six individuals with schizophrenia. The results suggest that the MSCEIT-TC is reliable and valid when assessing EI. The results showed good discrimination and validity when comparing the two study groups. Impairment was the greatest for two branches, Understanding and Managing Emotions, which implies that the deficits of individuals with schizophrenia involve ToM (theory of mind) tasks. Deficits involving the negative scale of schizophrenia were related to impaired performance when the MSCEIT-TC was used (in branches 2, 3, and 4, and the Strategic area). Our findings suggest that the MSCEIT-TC can be used for emotional studies in healthy Chinese and in clinical settings for investigating schizophrenic individuals. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Validation of a novel virtual reality simulator for robotic surgery.

PubMed

Schreuder, Henk W R; Persson, Jan E U; Wolswijk, Richard G H; Ihse, Ingmar; Schijven, Marlies P; Verheijen, René H M

2014-01-01

With the increase in robotic-assisted laparoscopic surgery there is a concomitant rising demand for training methods. The objective was to establish face and construct validity of a novel virtual reality simulator (dV-Trainer, Mimic Technologies, Seattle, WA) for the use in training of robot-assisted surgery. A comparative cohort study was performed. Participants (n = 42) were divided into three groups according to their robotic experience. To determine construct validity, participants performed three different exercises twice. Performance parameters were measured. To determine face validity, participants filled in a questionnaire after completion of the exercises. Experts outperformed novices in most of the measured parameters. The most discriminative parameters were "time to complete" and "economy of motion" (P < 0.001). The training capacity of the simulator was rated 4.6 ± 0.5 SD on a 5-point Likert scale. The realism of the simulator in general, visual graphics, movements of instruments, interaction with objects, and the depth perception were all rated as being realistic. The simulator is considered to be a very useful training tool for residents and medical specialists starting with robotic surgery. Face and construct validity for the dV-Trainer could be established. The virtual reality simulator is a useful tool for training robotic surgery.
Validation of a Novel Virtual Reality Simulator for Robotic Surgery

PubMed Central

Schreuder, Henk W. R.; Persson, Jan E. U.; Wolswijk, Richard G. H.; Ihse, Ingmar; Schijven, Marlies P.; Verheijen, René H. M.

2014-01-01

Objective. With the increase in robotic-assisted laparoscopic surgery there is a concomitant rising demand for training methods. The objective was to establish face and construct validity of a novel virtual reality simulator (dV-Trainer, Mimic Technologies, Seattle, WA) for the use in training of robot-assisted surgery. Methods. A comparative cohort study was performed. Participants (n = 42) were divided into three groups according to their robotic experience. To determine construct validity, participants performed three different exercises twice. Performance parameters were measured. To determine face validity, participants filled in a questionnaire after completion of the exercises. Results. Experts outperformed novices in most of the measured parameters. The most discriminative parameters were “time to complete” and “economy of motion” (P < 0.001). The training capacity of the simulator was rated 4.6 ± 0.5 SD on a 5-point Likert scale. The realism of the simulator in general, visual graphics, movements of instruments, interaction with objects, and the depth perception were all rated as being realistic. The simulator is considered to be a very useful training tool for residents and medical specialists starting with robotic surgery. Conclusions. Face and construct validity for the dV-Trainer could be established. The virtual reality simulator is a useful tool for training robotic surgery. PMID:24600328
Validation of Computerized Automatic Calculation of the Sequential Organ Failure Assessment Score

PubMed Central

Harrison, Andrew M.; Pickering, Brian W.; Herasevich, Vitaly

2013-01-01

Purpose. To validate the use of a computer program for the automatic calculation of the sequential organ failure assessment (SOFA) score, as compared to the gold standard of manual chart review. Materials and Methods. Adult admissions (age > 18 years) to the medical ICU with a length of stay greater than 24 hours were studied in the setting of an academic tertiary referral center. A retrospective cross-sectional analysis was performed using a derivation cohort to compare automatic calculation of the SOFA score to the gold standard of manual chart review. After critical appraisal of sources of disagreement, another analysis was performed using an independent validation cohort. Then, a prospective observational analysis was performed using an implementation of this computer program in AWARE Dashboard, which is an existing real-time patient EMR system for use in the ICU. Results. Good agreement between the manual and automatic SOFA calculations was observed for both the derivation (N=94) and validation (N=268) cohorts: 0.02 ± 2.33 and 0.29 ± 1.75 points, respectively. These results were validated in AWARE (N=60). Conclusion. This EMR-based automatic tool accurately calculates SOFA scores and can facilitate ICU decisions without the need for manual data collection. This tool can also be employed in a real-time electronic environment. PMID:23936639
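The agreement figures above (for example 0.02 ± 2.33 points) are read here as the mean and standard deviation of the paired differences between automatic and manual scores; the abstract does not state this explicitly, so treat that reading as an assumption. Under it, the summary could be computed as in the sketch below, with invented scores and a hypothetical function name.

```python
from statistics import mean, stdev


def paired_agreement(manual_scores, automatic_scores):
    """Mean and sample SD of (automatic - manual) score differences."""
    if len(manual_scores) != len(automatic_scores) or len(manual_scores) < 2:
        raise ValueError("need two equally long lists with at least two pairs")
    differences = [a - m for m, a in zip(manual_scores, automatic_scores)]
    return mean(differences), stdev(differences)


# Hypothetical SOFA scores for four admissions, not data from the study.
bias, spread = paired_agreement([4, 6, 8, 10], [5, 6, 7, 12])
```

A mean difference near zero suggests no systematic over- or under-scoring, while the standard deviation indicates how far individual automatic scores may stray from the chart-review value.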
Evaluating the statistical performance of less applied algorithms in classification of worldview-3 imagery data in an urbanized landscape

NASA Astrophysics Data System (ADS)

Ranaie, Mehrdad; Soffianian, Alireza; Pourmanafi, Saeid; Mirghaffari, Noorollah; Tarkesh, Mostafa

2018-03-01

Over the past decade, analysis of remotely sensed imagery has become one of the most common and widely used procedures in environmental studies, and supervised image classification techniques play a central role in it. Using a high-resolution WorldView-3 image of a mixed urbanized landscape in Iran, three less commonly applied image classification methods (bagged CART, stochastic gradient boosting, and a neural network with feature extraction) were tested and compared with two prevalent methods: random forest and support vector machine with a linear kernel. Each method was run ten times, and three validation techniques were used to estimate the accuracy statistics: cross-validation, independent validation, and validation with the full training data. Moreover, ANOVA and the Tukey test were used to assess whether the differences between the classification methods were statistically significant. In general, the results showed that random forest was the best-performing method, by a marginal difference over bagged CART and stochastic gradient boosting, whereas under independent validation there was no significant difference between the performances of the classification methods. Finally, the neural network with feature extraction and the linear support vector machine had better processing speed than the other methods.
The psychometric properties of the WHOQOL-BREF in Japanese couples

PubMed Central

Sun, Yi; Sugawara, Masumi; Matsumoto, Satoko; Sakai, Atsushi; Takaoka, Junko; Goto, Noriko

2015-01-01

This study investigated the psychometric properties of the Japanese version of the WHOQOL-BREF among 10,693 community-based married Japanese men and women (4376 couples) who were either expecting or raising a child. Analyses of item-response distributions, internal consistency, criterion validity, and discriminant validity indicated that the scale had acceptable reliability and performed well in preliminary tests of validity. Furthermore, dyadic confirmatory factor analysis revealed that the theoretical factor structure was valid and similar across partners, suggesting that men and women define and value quality of life in a similar way. PMID:28070365
Design, implementation, and psychometric analysis of a scoring instrument for simulated pediatric resuscitation: a report from the EXPRESS pediatric investigators.

PubMed

Donoghue, Aaron; Ventre, Kathleen; Boulet, John; Brett-Fleegler, Marisa; Nishisaki, Akira; Overly, Frank; Cheng, Adam

2011-04-01

Robustly tested instruments for quantifying clinical performance during pediatric resuscitation are lacking. The Examining Pediatric Resuscitation Education through Simulation and Scripting (EXPRESS) Collaborative was established to conduct multicenter trials of simulation education in pediatric resuscitation, evaluating performance with multiple instruments, one of which is the Clinical Performance Tool (CPT). We hypothesize that the CPT will measure clinical performance during simulated pediatric resuscitation in a reliable and valid manner. Using a pediatric resuscitation scenario as a basis, a scoring system was designed based on Pediatric Advanced Life Support algorithms comprising 21 tasks. Each task was scored as follows: task not performed (0 points); task performed partially, incorrectly, or late (1 point); and task performed completely, correctly, and within the recommended time frame (2 points). Study teams at 14 children's hospitals went through the scenario twice (PRE and POST) with an interposed 20-minute debriefing. Both scenarios for each of eight study teams were scored by multiple raters. A generalizability study, based on the PRE scores, was conducted to investigate the sources of measurement error in the CPT total scores. Inter-rater reliability was estimated based on the variance components. Validity was assessed by repeated measures analysis of variance comparing PRE and POST scores. Sixteen resuscitation scenarios were reviewed and scored by seven raters. Inter-rater reliability for the overall CPT score was 0.63. POST scores were found to be significantly improved compared with PRE scores when controlled for within-subject covariance (F(1,15) = 4.64, P < 0.05). The variance component ascribable to rater was 2.4%. Reliable and valid measures of performance in simulated pediatric resuscitation can be obtained from the CPT.
Future studies should examine the applicability of trichotomous scoring instruments to other clinical scenarios, as well as performance during actual resuscitations.
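The trichotomous scoring rule described in this abstract (21 tasks, each scored 0, 1, or 2) can be sketched in Python as below. This is only an illustration of the arithmetic: the enum, function name, and example ratings are hypothetical, and the actual CPT task list is not given in the abstract.

```python
from enum import IntEnum


class TaskScore(IntEnum):
    NOT_PERFORMED = 0  # task not performed
    PARTIAL = 1        # performed partially, incorrectly, or late
    COMPLETE = 2       # performed completely, correctly, and on time


N_TASKS = 21  # number of tasks stated in the abstract
MAX_TOTAL = N_TASKS * TaskScore.COMPLETE  # 42 points


def total_score(task_scores):
    """Sum per-task scores after checking the count and the allowed values."""
    if len(task_scores) != N_TASKS:
        raise ValueError(f"expected {N_TASKS} task scores, got {len(task_scores)}")
    # TaskScore(s) raises ValueError for anything other than 0, 1, or 2.
    return sum(int(TaskScore(s)) for s in task_scores)


# Hypothetical ratings from one rater for one scenario (not study data).
ratings = [2] * 10 + [1] * 6 + [0] * 5
print(total_score(ratings), "of", MAX_TOTAL)  # prints "26 of 42"
```

Totals computed this way for each rater and scenario would be the inputs to the generalizability and repeated-measures analyses the abstract mentions, which are not reproduced here.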
Prevalence of Invalid Performance on Baseline Testing for Sport-Related Concussion by Age and Validity Indicator.

PubMed

Abeare, Christopher A; Messa, Isabelle; Zuccato, Brandon G; Merker, Bradley; Erdodi, Laszlo

2018-03-12

Estimated base rates of invalid performance on baseline testing (base rates of failure) for the management of sport-related concussion range from 6.1% to 40.0%, depending on the validity indicator used. The instability of this key measure represents a challenge in the clinical interpretation of test results that could undermine the utility of baseline testing. The objective was to determine the prevalence of invalid performance on baseline testing and to assess whether the prevalence varies as a function of age and validity indicator. This retrospective, cross-sectional study included data collected between January 1, 2012, and December 31, 2016, from a clinical referral center in the Midwestern United States. Participants included 7897 consecutively tested, equivalently proportioned male and female athletes aged 10 to 21 years, who completed baseline neurocognitive testing for the purpose of concussion management. Baseline assessment was conducted with the Immediate Postconcussion Assessment and Cognitive Testing (ImPACT), a computerized neurocognitive test designed for assessment of concussion. Base rates of failure on published ImPACT validity indicators were compared within and across age groups. Hypotheses were developed after data collection but prior to analyses. Of the 7897 study participants, 4086 (51.7%) were male, mean (SD) age was 14.71 (1.78) years, 7820 (99.0%) were primarily English speaking, and the mean (SD) educational level was 8.79 (1.68) years. The base rate of failure ranged from 6.4% to 47.6% across individual indicators. Most of the sample (55.7%) failed at least 1 of 4 validity indicators. The base rate of failure varied considerably across age groups (117 of 140 [83.6%] for those aged 10 years to 14 of 48 [29.2%] for those aged 21 years), representing a risk ratio of 2.86 (95% CI, 2.60-3.16; P < .001).
The results for base rate of failure were surprisingly high overall and varied widely depending on the specific validity indicator and the age of the examinee. The strong age association, with 3 of 4 participants aged 10 to 12 years failing validity indicators, suggests that the clinical interpretation and utility of baseline testing in this age group is questionable. These findings underscore the need for close scrutiny of performance validity indicators on baseline testing across age groups.
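The risk ratio quoted in this abstract can be checked as a point estimate from the counts it reports (117 of 140 at age 10 versus 14 of 48 at age 21). The Python sketch below does only that. The function name is hypothetical, and the confidence interval is not reproduced because the abstract does not say how it was computed.

```python
def risk_ratio(events_a, total_a, events_b, total_b):
    """Ratio of the event proportion in group A to that in group B."""
    if total_a <= 0 or total_b <= 0:
        raise ValueError("group totals must be positive")
    if events_b == 0:
        raise ValueError("risk ratio is undefined when group B has no events")
    return (events_a / total_a) / (events_b / total_b)


# Counts as reported in the abstract: failures / athletes tested per age group.
rr = risk_ratio(117, 140, 14, 48)
print(f"risk at age 10 = {117 / 140:.1%}, risk at age 21 = {14 / 48:.1%}")
print(f"risk ratio = {rr:.3f}")
```

From the raw counts the ratio comes out at about 2.87, while dividing the rounded percentages (83.6 / 29.2) gives about 2.86, in line with the reported value.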
