Discriminant Validity Assessment: Use of Fornell & Larcker criterion versus HTMT Criterion
NASA Astrophysics Data System (ADS)
Hamid, M. R. Ab; Sami, W.; Mohmad Sidek, M. H.
2017-09-01
Assessment of discriminant validity is a must in any research that involves latent variables, in order to prevent multicollinearity issues. The Fornell and Larcker criterion is the most widely used method for this purpose. However, a newer method has emerged for assessing discriminant validity: the heterotrait-monotrait (HTMT) ratio of correlations. This article therefore presents the results of discriminant validity assessment using both methods. Data from a previous study involving 429 respondents were used for empirical validation of a value-based excellence model in higher education institutions (HEI) in Malaysia. From the analysis, convergent, divergent and discriminant validity were established using the Fornell and Larcker criterion. However, discriminant validity was an issue when the HTMT criterion was employed. This shows that the latent variables under study faced multicollinearity and should be examined in further detail. It also implies that the HTMT criterion is a stringent measure that can detect a possible lack of discrimination among the latent variables. In conclusion, the instrument, which consisted of six latent variables, was still lacking in discriminant validity and should be explored further.
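The contrast between the two criteria can be illustrated with a minimal sketch. It assumes an item correlation matrix with positive correlations, standardized loadings, and a commonly cited HTMT cutoff of 0.85; the function names and all toy numbers are hypothetical and are not taken from the study.

```python
import numpy as np

def htmt(corr, items_a, items_b):
    """HTMT ratio for two constructs, given an item correlation matrix.

    Numerator: mean correlation between items of different constructs.
    Denominator: geometric mean of the average within-construct
    (off-diagonal) item correlations of the two constructs.
    """
    corr = np.asarray(corr, dtype=float)
    hetero = corr[np.ix_(items_a, items_b)].mean()

    def mono(items):
        block = corr[np.ix_(items, items)]
        return block[np.triu_indices(len(items), k=1)].mean()

    return hetero / np.sqrt(mono(items_a) * mono(items_b))

def fornell_larcker_ok(loadings_a, loadings_b, construct_corr):
    """True if sqrt(AVE) of both constructs exceeds their correlation."""
    ave_a = np.mean(np.square(loadings_a))
    ave_b = np.mean(np.square(loadings_b))
    return bool(min(np.sqrt(ave_a), np.sqrt(ave_b)) > abs(construct_corr))

# Hypothetical toy data: items 0-1 measure construct A, items 2-3 construct B.
R = np.array([
    [1.00, 0.60, 0.55, 0.55],
    [0.60, 1.00, 0.55, 0.55],
    [0.55, 0.55, 1.00, 0.60],
    [0.55, 0.55, 0.60, 1.00],
])
ratio = htmt(R, [0, 1], [2, 3])
print("HTMT:", round(ratio, 3), "flagged:", ratio > 0.85)  # 0.85 is an assumed cutoff
print("Fornell-Larcker passes:", fornell_larcker_ok([0.8, 0.8], [0.8, 0.8], 0.7))
```

In this toy case the Fornell and Larcker check passes while the HTMT ratio exceeds the assumed cutoff, which mirrors the kind of disagreement the abstract reports.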
Schiffman, Eric L.; Truelove, Edmond L.; Ohrbach, Richard; Anderson, Gary C.; John, Mike T.; List, Thomas; Look, John O.
2011-01-01
AIMS The purpose of the Research Diagnostic Criteria for Temporomandibular Disorders (RDC/TMD) Validation Project was to assess the diagnostic validity of this examination protocol. An overview is presented, including Axis I and II methodology and descriptive statistics for the study participant sample. This paper details the development of reliable methods to establish the reference standards for assessing criterion validity of the Axis I RDC/TMD diagnoses. Validity testing for the Axis II biobehavioral instruments was based on previously validated reference standards. METHODS The Axis I reference standards were based on the consensus of 2 criterion examiners independently performing a comprehensive history, clinical examination, and evaluation of imaging. Intersite reliability was assessed annually for criterion examiners and radiologists. Criterion exam reliability was also assessed within study sites. RESULTS Study participant demographics were comparable to those of participants in previous studies using the RDC/TMD. Diagnostic agreement of the criterion examiners with each other and with the consensus-based reference standards was excellent with all kappas ≥ 0.81, except for osteoarthrosis (moderate agreement, k = 0.53). Intrasite criterion exam agreement with reference standards was excellent (k ≥ 0.95). Intersite reliability of the radiologists for detecting computed tomography-disclosed osteoarthrosis and magnetic resonance imaging-disclosed disc displacement was good to excellent (k = 0.71 and 0.84, respectively). CONCLUSION The Validation Project study population was appropriate for assessing the reliability and validity of the RDC/TMD Axis I and II. The reference standards used to assess the validity of Axis I TMD were based on reliable and clinically credible methods. PMID:20213028
Schiffman, Eric L; Truelove, Edmond L; Ohrbach, Richard; Anderson, Gary C; John, Mike T; List, Thomas; Look, John O
2010-01-01
The purpose of the Research Diagnostic Criteria for Temporomandibular Disorders (RDC/TMD) Validation Project was to assess the diagnostic validity of this examination protocol. The aim of this article is to provide an overview of the project's methodology, descriptive statistics, and data for the study participant sample. This article also details the development of reliable methods to establish the reference standards for assessing criterion validity of the Axis I RDC/TMD diagnoses. The Axis I reference standards were based on the consensus of two criterion examiners independently performing a comprehensive history, clinical examination, and evaluation of imaging. Intersite reliability was assessed annually for criterion examiners and radiologists. Criterion examination reliability was also assessed within study sites. Study participant demographics were comparable to those of participants in previous studies using the RDC/TMD. Diagnostic agreement of the criterion examiners with each other and with the consensus-based reference standards was excellent with all kappas ≥ 0.81, except for osteoarthrosis (moderate agreement, k = 0.53). Intrasite criterion examiner agreement with reference standards was excellent (k ≥ 0.95). Intersite reliability of the radiologists for detecting computed tomography-disclosed osteoarthrosis and magnetic resonance imaging-disclosed disc displacement was good to excellent (k = 0.71 and 0.84, respectively). The Validation Project study population was appropriate for assessing the reliability and validity of the RDC/TMD Axis I and II. The reference standards used to assess the validity of Axis I TMD were based on reliable and clinically credible methods.
The Validation of a Case-Based, Cumulative Assessment and Progressions Examination
Coker, Adeola O.; Copeland, Jeffrey T.; Gottlieb, Helmut B.; Horlen, Cheryl; Smith, Helen E.; Urteaga, Elizabeth M.; Ramsinghani, Sushma; Zertuche, Alejandra; Maize, David
2016-01-01
Objective. To assess content and criterion validity, as well as reliability of an internally developed, case-based, cumulative, high-stakes third-year Annual Student Assessment and Progression Examination (P3 ASAP Exam). Methods. Content validity was assessed through the writing-reviewing process. Criterion validity was assessed by comparing student scores on the P3 ASAP Exam with the nationally validated Pharmacy Curriculum Outcomes Assessment (PCOA). Reliability was assessed with psychometric analysis comparing student performance over four years. Results. The P3 ASAP Exam showed content validity through representation of didactic courses and professional outcomes. Similar scores on the P3 ASAP Exam and PCOA with Pearson correlation coefficient established criterion validity. Consistent student performance using Kuder-Richardson coefficient (KR-20) since 2012 reflected reliability of the examination. Conclusion. Pharmacy schools can implement internally developed, high-stakes, cumulative progression examinations that are valid and reliable using a robust writing-reviewing process and psychometric analyses. PMID:26941435
ERIC Educational Resources Information Center
Fidler, James R.
1993-01-01
Criterion-related validities of 2 laboratory practitioner certification examinations for medical technologists (MTs) and medical laboratory technicians (MLTs) were assessed for 81 MT and 70 MLT examinees. Validity coefficients are presented for both measures. Overall, summative ratings yielded stronger validity coefficients than ratings based on…
Criterion-Related Validity: Assessing the Value of Subscores
ERIC Educational Resources Information Center
Davison, Mark L.; Davenport, Ernest C., Jr.; Chang, Yu-Feng; Vue, Kory; Su, Shiyang
2015-01-01
Criterion-related profile analysis (CPA) can be used to assess whether subscores of a test or test battery account for more criterion variance than does a single total score. Application of CPA to subscore evaluation is described, compared to alternative procedures, and illustrated using SAT data. Considerations other than validity and reliability…
Palm, Peter; Josephson, Malin; Mathiassen, Svend Erik; Kjellberg, Katarina
2016-06-01
We evaluated the intra- and inter-observer reliability and criterion validity of an observation protocol, developed in an iterative process involving practicing ergonomists, for assessment of working technique during cash register work for the purpose of preventing upper extremity symptoms. Two ergonomists independently assessed 17 15-min videos of cash register work on two occasions each, as a basis for examining reliability. Criterion validity was assessed by comparing these assessments with meticulous video-based analyses by researchers. Intra-observer reliability was acceptable (i.e. proportional agreement >0.7 and kappa >0.4) for 10/10 questions. Inter-observer reliability was acceptable for only 3/10 questions. An acceptable inter-observer reliability combined with an acceptable criterion validity was obtained only for one working technique aspect, 'Quality of movements'. Thus, major elements of the cashiers' working technique could not be assessed with an acceptable accuracy from short periods of observations by one observer, such as often desired by practitioners. Practitioner Summary: We examined an observation protocol for assessing working technique in cash register work. It was feasible in use, but inter-observer reliability and criterion validity were generally not acceptable when working technique aspects were assessed from short periods of work. We recommend the protocol to be used for educational purposes only.
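The acceptability rule quoted in this abstract (proportional agreement >0.7 and kappa >0.4) can be sketched as follows. This is a minimal illustration using the standard unweighted Cohen's kappa; the function name and the ratings are hypothetical, and the abstract does not state which kappa variant the authors used.

```python
from collections import Counter

def agreement_and_kappa(r1, r2):
    """Return (proportional agreement, Cohen's kappa) for two raters.

    Kappa is undefined (division by zero) when chance agreement equals 1,
    i.e. when both raters use a single identical category throughout.
    """
    n = len(r1)
    po = sum(a == b for a, b in zip(r1, r2)) / n          # observed agreement
    c1, c2 = Counter(r1), Counter(r2)
    pe = sum(c1[k] * c2[k] for k in set(c1) | set(c2)) / (n * n)  # chance agreement
    return po, (po - pe) / (1 - pe)

# Hypothetical ratings of ten videos on one yes/no question.
rater_1 = [1, 1, 1, 0, 0, 0, 1, 0, 1, 1]
rater_2 = [1, 1, 0, 0, 0, 0, 1, 0, 1, 1]
po, kappa = agreement_and_kappa(rater_1, rater_2)
acceptable = po > 0.7 and kappa > 0.4   # thresholds as quoted in the abstract
print(po, kappa, acceptable)
```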
Evidence for the Criterion Validity and Clinical Utility of the Pathological Narcissism Inventory
ERIC Educational Resources Information Center
Thomas, Katherine M.; Wright, Aidan G. C.; Lukowitsky, Mark R.; Donnellan, M. Brent; Hopwood, Christopher J.
2012-01-01
In this study, the authors evaluated aspects of criterion validity and clinical utility of the grandiosity and vulnerability components of the Pathological Narcissism Inventory (PNI) using two undergraduate samples (N = 299 and 500). Criterion validity was assessed by evaluating the correlations of narcissistic grandiosity and narcissistic…
MacKillop, James; Acker, John D; Bollinger, Jared; Clifton, Allan; Miller, Joshua D; Campbell, W Keith; Goodie, Adam S
2013-09-01
Alcohol misuse is substantially influenced by social factors, but systematic assessments of social network drinking are typically lengthy. The goal of the present study was to provide further validation of a brief measure of social network alcohol use, the Brief Alcohol Social Density Assessment (BASDA), in a sample of emerging adults. Specifically, the study sought to examine the BASDA's convergent, criterion, and incremental validity in relation to well-established measures of drinking motives and problematic drinking. Participants were 354 undergraduates who were assessed using the BASDA, the Alcohol Use Disorders Identification Test (AUDIT), and the Drinking Motives Questionnaire. Significant associations were observed between the BASDA index of alcohol-related social density and alcohol misuse, social motives, and conformity motives, supporting convergent validity. Criterion-related validity was supported by evidence that significantly greater alcohol involvement was present in the social networks of individuals scoring at or above an AUDIT score of 8, a validated criterion for hazardous drinking. Finally, the BASDA index was significantly associated with alcohol misuse above and beyond drinking motives in relation to AUDIT scores, supporting incremental validity. Taken together, these findings provide further support for the BASDA as an efficient measure of drinking in an individual's social network. Methodological considerations as well as recommendations for future investigations in this area are discussed.
Validity of the Eating Attitudes Test and the Eating Disorders Inventory in Bulimia Nervosa.
ERIC Educational Resources Information Center
Gross, Janet; And Others
1986-01-01
Assessed criterion and concurrent validity of the Eating Attitudes Test and the Eating Disorder Inventory in 82 women with bulimia nervosa. Both tests demonstrated criterion validity by discriminating bulimia nervosa subjects from normals. Only weak support was found for concurrent validity within bulimia subjects. Recommends combination of…
Dahlke, Jeffrey A; Kostal, Jack W; Sackett, Paul R; Kuncel, Nathan R
2018-05-03
We explore potential explanations for validity degradation using a unique predictive validation data set containing up to four consecutive years of high school students' cognitive test scores and four complete years of those students' college grades. This data set permits analyses that disentangle the effects of predictor-score age and timing of criterion measurements on validity degradation. We investigate the extent to which validity degradation is explained by criterion dynamism versus the limited shelf-life of ability scores. We also explore whether validity degradation is attributable to fluctuations in criterion variability over time and/or GPA contamination from individual differences in course-taking patterns. Analyses of multiyear predictor data suggest that changes to the determinants of performance over time have much stronger effects on validity degradation than does the shelf-life of cognitive test scores. The age of predictor scores had only a modest relationship with criterion-related validity when the criterion measurement occasion was held constant. Practical implications and recommendations for future research are discussed. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
ERIC Educational Resources Information Center
Willoughby, Michael T.; Blair, Clancy B.; Wirth, R. J.; Greenberg, Mark
2010-01-01
In this study, the authors examined the psychometric properties and criterion validity of a newly developed battery of tasks that were designed to assess executive function (EF) abilities in early childhood. The battery was included in the 36-month assessment of the Family Life Project (FLP), a prospective longitudinal study of 1,292 children…
Criterion-Referenced Testing for College-Level General Education: Some Problems and Recommendations.
ERIC Educational Resources Information Center
Benoist, Howard
1979-01-01
The adoption of a criterion-referenced assessment system and the resulting disadvantages of this form of evaluation for the college general education program are discussed, including problems in identifying assessment validation procedures. (RAO)
Convergent, discriminant, and criterion validity of DSM-5 traits.
Yalch, Matthew M; Hopwood, Christopher J
2016-10-01
Section III of the Diagnostic and Statistical Manual of Mental Disorders (5th ed.; DSM-5; American Psychiatric Association, 2013) contains a system for diagnosing personality disorder based in part on assessing 25 maladaptive traits. Initial research suggests that this aspect of the system improves the validity and clinical utility of the Section II Model. The Computer Adaptive Test of Personality Disorder (CAT-PD; Simms et al., 2011) contains many similar traits as the DSM-5, as well as several additional traits seemingly not covered in the DSM-5. In this study we evaluate the convergent and discriminant validity between the DSM-5 traits, as assessed by the Personality Inventory for DSM-5 (PID-5; Krueger et al., 2012), and CAT-PD in an undergraduate sample, and test whether traits included in the CAT-PD but not the DSM-5 provide incremental validity in association with clinically relevant criterion variables. Results supported the convergent and discriminant validity of the PID-5 and CAT-PD scales in their assessment of 23 out of 25 DSM-5 traits. DSM-5 traits were consistently associated with 11 criterion variables, despite our having intentionally selected clinically relevant criterion constructs not directly assessed by DSM-5 traits. However, the additional CAT-PD traits provided incremental information above and beyond the DSM-5 traits for all criterion variables examined. These findings support the validity of pathological trait models in general and the DSM-5 and CAT-PD models in particular, while also suggesting that the CAT-PD may include additional traits for consideration in future iterations of the DSM-5 system. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
Ando, Yukako; Kataoka, Tsuyoshi; Okamura, Hitoshi; Tanaka, Katsutoshi; Kobayashi, Toshio
2013-12-01
The purpose of this research is to verify the reliability and validity of a job stressor scale for nurses caring for patients with intractable neurological diseases. A mail survey was conducted using a self-report questionnaire. The subjects were 263 nurses and assistant nurses working in wards specializing in intractable neurological diseases. The response rate was 71.9% (valid response rate, 66.2%). With regard to reliability, internal consistency and stability were assessed. Internal consistency was examined via Cronbach's alpha. For stability, the test-retest method was performed and stability was examined via intraclass correlation coefficients. With regard to validity, factor validity, criterion-related validity, and content validity were assessed. Exploratory factor analysis was used for factor validity. For criterion-related validity, an existing scale was used as an external criterion; concurrent validity was examined via Spearman's rank correlation coefficients. As a result of analysis, there were 26 items in the scale created with an eight-factor structure. Cronbach's alpha for the 26 items was 0.90; with the exception of two factors, alpha for all of the individual sub-factors was high at 0.7 or higher. The intraclass correlation coefficient for the 26 items was 0.89 (p < 0.001). With regard to criterion-related validity, concurrent validity was confirmed and the correlation coefficient with an external criterion was 0.73 (p < 0.001). For content validity, subjects who responded that "The questionnaire represents a stressor well or to a degree" accounted for 81% of the total responses. Reliability and validity were confirmed, so the scale created in the current research is a usable scale.
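Internal consistency of the kind reported here is usually computed with the standard Cronbach's alpha formula. A minimal sketch, assuming a respondents-by-items score matrix with no missing values; the function name and the toy scores are hypothetical and unrelated to the study's data.

```python
import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha for a (respondents x items) score matrix."""
    x = np.asarray(scores, dtype=float)
    k = x.shape[1]                               # number of items
    item_var = x.var(axis=0, ddof=1).sum()       # sum of item variances
    total_var = x.sum(axis=1).var(ddof=1)        # variance of total scores
    return k / (k - 1) * (1 - item_var / total_var)

# Hypothetical responses of four people to two items.
toy = [[1, 2], [2, 1], [3, 4], [4, 3]]
print(cronbach_alpha(toy))
```

A value of 0.7 or higher is the level the abstract treats as high for the sub-factors.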
Vanwolleghem, Griet; Van Dyck, Delfien; Ducheyne, Fabian; De Bourdeaudhuij, Ilse; Cardon, Greet
2014-06-10
Google Street View provides a valuable and efficient alternative to observe the physical environment compared to on-site fieldwork. However, studies on the use, reliability and validity of Google Street View in a cycling-to-school context are lacking. We aimed to study the intra-, inter-rater reliability and criterion validity of EGA-Cycling (Environmental Google Street View Based Audit - Cycling to school), a newly developed audit using Google Street View to assess the physical environment along cycling routes to school. Parents (n = 52) of 11-to-12-year old Flemish children, who mostly cycled to school, completed a questionnaire and identified their child's cycling route to school on a street map. Fifty cycling routes of 11-to-12-year olds were identified and physical environmental characteristics along the identified routes were rated with EGA-Cycling (5 subscales; 37 items), based on Google Street View. To assess reliability, two researchers performed the audit. Criterion validity of the audit was examined by comparing the ratings based on Google Street View with ratings through on-site assessments. Intra-rater reliability was high (kappa range 0.47-1.00). Large variations in the inter-rater reliability (kappa range -0.03-1.00) and criterion validity scores (kappa range -0.06-1.00) were reported, with acceptable inter-rater reliability values for 43% of all items and acceptable criterion validity for 54% of all items. EGA-Cycling can be used to assess physical environmental characteristics along cycling routes to school. However, to assess the micro-environment specifically related to cycling, on-site assessments have to be added.
Yee, Chee-Seng; Farewell, Vernon; Isenberg, David A; Rahman, Anisur; Teh, Lee-Suan; Griffiths, Bridget; Bruce, Ian N; Ahmad, Yasmeen; Prabu, Athiveeraramapandian; Akil, Mohammed; McHugh, Neil; D'Cruz, David; Khamashta, Munther A; Maddison, Peter; Gordon, Caroline
2007-01-01
Objective To determine the construct and criterion validity of the British Isles Lupus Assessment Group 2004 (BILAG-2004) index for assessing disease activity in systemic lupus erythematosus (SLE). Methods Patients with SLE were recruited into a multicenter cross-sectional study. Data on SLE disease activity (scores on the BILAG-2004 index, Classic BILAG index, and Systemic Lupus Erythematosus Disease Activity Index 2000 [SLEDAI-2K]), investigations, and therapy were collected. Overall BILAG-2004 and overall Classic BILAG scores were determined by the highest score achieved in any of the individual systems in the respective index. Erythrocyte sedimentation rates (ESRs), C3 levels, C4 levels, anti–double-stranded DNA (anti-dsDNA) levels, and SLEDAI-2K scores were used in the analysis of construct validity, and increase in therapy was used as the criterion for active disease in the analysis of criterion validity. Statistical analyses were performed using ordinal logistic regression for construct validity and logistic regression for criterion validity. Sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) were calculated. Results Of the 369 patients with SLE, 92.7% were women, 59.9% were white, 18.4% were Afro-Caribbean and 18.4% were South Asian. Their mean ± SD age was 41.6 ± 13.2 years and mean disease duration was 8.8 ± 7.7 years. More than 1 assessment was obtained on 88.6% of the patients, and a total of 1,510 assessments were obtained. Increasing overall scores on the BILAG-2004 index were associated with increasing ESRs, decreasing C3 levels, decreasing C4 levels, elevated anti-dsDNA levels, and increasing SLEDAI-2K scores (all P < 0.01). Increase in therapy was observed more frequently in patients with overall BILAG-2004 scores reflecting higher disease activity. 
Scores indicating active disease (overall BILAG-2004 scores of A and B) were significantly associated with increase in therapy (odds ratio [OR] 19.3, P < 0.01). The BILAG-2004 and Classic BILAG indices had comparable sensitivity, specificity, PPV, and NPV. Conclusion These findings show that the BILAG-2004 index has construct and criterion validity. PMID:18050213
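The diagnostic accuracy measures named in this abstract follow directly from a 2x2 table of index result against criterion. A minimal sketch with hypothetical counts; note that the odds ratio in the abstract came from logistic regression, whereas the value below is only the crude 2x2 odds ratio.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Accuracy measures from a 2x2 table.

    tp/fn: criterion-positive cases with a positive/negative index result.
    fp/tn: criterion-negative cases with a positive/negative index result.
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "odds_ratio": (tp * tn) / (fp * fn),   # crude, unadjusted
    }

# Hypothetical counts: index = active-disease score, criterion = increase in therapy.
m = diagnostic_metrics(tp=30, fp=20, fn=10, tn=140)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```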
ERIC Educational Resources Information Center
Harris, Larry P.; Wolf, Steven R.
1979-01-01
The article focuses on the controversy over norm-referenced vs. criterion-referenced measures (CRM) in assessment of learning disorders. The authors contend that while the reliability of CRMs is generally indisputable, the validity of measures designed from local curricula is still dependent on the intuitive judgments of teachers. (Author/SBH)
Toro, Brigitte; Nester, Christopher J; Farren, Pauline C
2007-03-01
To develop the construct, content, and criterion validity of the Salford Gait Tool (SF-GT) and to evaluate agreement between gait observations using the SF-GT and kinematic gait data. Tool development and comparative evaluation. University in the United Kingdom. For designing construct and content validity, convenience samples of 10 children with hemiplegic, diplegic, and quadriplegic cerebral palsy (CP) and 152 physical therapy students and 4 physical therapists were recruited. For developing criterion validity, kinematic gait data of 13 gait clusters containing 56 children with hemiplegic, diplegic, and quadriplegic CP and 11 neurologically intact children was used. For clinical evaluation, a convenience sample of 23 pediatric physical therapists participated. We developed a sagittal plane observational gait assessment tool through a series of design, test, and redesign iterations. The tool's grading system was calibrated using kinematic gait data of 13 gait clusters and was evaluated by comparing the agreement of gait observations using the SF-GT with kinematic gait data. Criterion standard kinematic gait data. There was 58% mean agreement based on grading categories and 80% mean agreement based on degree estimations evaluated with the least significant difference method. The new SF-GT has good concurrent criterion validity.
Shmulewitz, D.; Wall, M.M.; Aharonovich, E.; Spivak, B.; Weizman, A.; Frisch, A.; Grant, B. F.; Hasin, D.
2013-01-01
Background The fifth edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5) proposes aligning nicotine use disorder (NUD) criteria with those for other substances, by including the current DSM fourth edition (DSM-IV) nicotine dependence (ND) criteria, three abuse criteria (neglect roles, hazardous use, interpersonal problems) and craving. Although NUD criteria indicate one latent trait, evidence is lacking on: (1) validity of each criterion; (2) validity of the criteria as a set; (3) comparative validity between DSM-5 NUD and DSM-IV ND criterion sets; and (4) NUD prevalence. Method Nicotine criteria (DSM-IV ND, abuse and craving) and external validators (e.g. smoking soon after awakening, number of cigarettes per day) were assessed with a structured interview in 734 lifetime smokers from an Israeli household sample. Regression analysis evaluated the association between validators and each criterion. Receiver operating characteristic analysis assessed the association of the validators with the DSM-5 NUD set (number of criteria endorsed) and tested whether DSM-5 or DSM-IV provided the most discriminating criterion set. Changes in prevalence were examined. Results Each DSM-5 NUD criterion was significantly associated with the validators, with strength of associations similar across the criteria. As a set, DSM-5 criteria were significantly associated with the validators, were significantly more discriminating than DSM-IV ND criteria, and led to increased prevalence of binary NUD (two or more criteria) over ND. Conclusions All findings address previous concerns about the DSM-IV nicotine diagnosis and its criteria and support the proposed changes for DSM-5 NUD, which should result in improved diagnosis of nicotine disorders. PMID:23312475
An evidence-based decision assistance model for predicting training outcome in juvenile guide dogs.
Harvey, Naomi D; Craigon, Peter J; Blythe, Simon A; England, Gary C W; Asher, Lucy
2017-01-01
Working dog organisations, such as Guide Dogs, need to regularly assess the behaviour of the dogs they train. In this study we developed a questionnaire-style behaviour assessment completed by training supervisors of juvenile guide dogs aged 5, 8 and 12 months old (n = 1,401), and evaluated aspects of its reliability and validity. Specifically, internal reliability, temporal consistency, construct validity, predictive criterion validity (comparing against later training outcome) and concurrent criterion validity (comparing against a standardised behaviour test) were evaluated. Thirty-nine questions were sourced either from previously published literature or created to meet requirements identified via Guide Dogs staff surveys and staff feedback. Internal reliability analyses revealed seven reliable and interpretable trait scales named according to the questions within them as: Adaptability; Body Sensitivity; Distractibility; Excitability; General Anxiety; Trainability and Stair Anxiety. Intra-individual temporal consistency of the scale scores between 5-8, 8-12 and 5-12 months was high. All scales excepting Body Sensitivity showed some degree of concurrent criterion validity. Predictive criterion validity was supported for all seven scales, since associations were found with training outcome at at least one age. Thresholds of z-scores on the scales were identified that were able to distinguish later training outcome by identifying 8.4% of all dogs withdrawn for behaviour and 8.5% of all qualified dogs, with 84% and 85% specificity. The questionnaire assessment was reliable and could detect traits that are consistent within individuals over time, despite juvenile dogs undergoing development during the study period. By applying thresholds to scores produced from the questionnaire this assessment could prove to be a highly valuable decision-making tool for Guide Dogs.
This is the first questionnaire-style assessment of juvenile dogs that has shown value in predicting the training outcome of individual working dogs.
Milian, Monika; Kreitschmann-Andermahr, Ilonka; Siegel, Sonja; Kleist, Bernadette; Führer-Sakel, Dagmar; Honegger, Juergen; Buchfelder, Michael; Psaras, Tsambika
2015-01-01
To evaluate the construct and criterion validity of the Tuebingen Cushing's disease quality of life inventory (Tuebingen CD-25) for application in patients treated for Cushing's disease (CD). A total of 176 patients with adrenocorticotropin hormone-dependent CD (144 of them female, overall mean age 46.1 ± 13.7 years) treated at 3 large tertiary referral centers in Germany were studied. Construct validity was assessed by hypothesis testing (self-perceived symptom reduction assessment) and contrasted groups (patients with vs. without hypercortisolism). For this purpose, already existing data from 55 CD patients were used, representing the hypercortisolemic group. Criterion validity (concurrent validity) was assessed in relation to the Cushing's quality of life questionnaire (CushingQoL), the Short Form 36 health survey (SF-36), and the body mass index (BMI). Patients with self-perceived remarkable symptom reduction had significantly lower Tuebingen CD-25 scores (i.e. better health-related quality of life) than patients with self-perceived insufficient symptom reduction (p < 0.05). Similarly, the mean scores of the Tuebingen CD-25 scales were lower in patients without hypercortisolism (total score 27.0 ± 17.2) compared to those with hypercortisolism (total score 45.3 ± 22.1; each p < 0.05), providing evidence for construct validity. Criterion validity was confirmed by the correlations between the Tuebingen CD-25 total score and the CushingQoL (Spearman's coefficient -0.733), as well as all scales of the SF-36 (Spearman's coefficient between -0.447 and -0.700). The analyses presented in this large-sample study provide robust evidence for the construct and criterion validity of the Tuebingen CD-25. © 2015 S. Karger AG, Basel.
Pagliarin, Karina Carlesso; Ortiz, Karin Zazo; Barreto, Simone dos Santos; Pimenta Parente, Maria Alice de Mattos; Nespoulous, Jean-Luc; Joanette, Yves; Fonseca, Rochele Paz
2015-10-15
The Montreal-Toulouse Language Assessment Battery - Brazilian version (MTL-BR) provides a general description of language processing and related components in adults with brain injury. The present study aimed at verifying the criterion-related validity of the Montreal-Toulouse Language Assessment Battery - Brazilian version (MTL-BR) by assessing its ability to discriminate between individuals with unilateral brain damage with and without aphasia. The investigation was carried out in a Brazilian community-based sample of 104 adults, divided into four groups: 26 participants with left hemisphere damage (LHD) with aphasia, 25 participants with right hemisphere damage (RHD), 28 with LHD non-aphasic, and 25 healthy adults. There were significant differences between patients with aphasia and the other groups on most total and subtotal scores on MTL-BR tasks. The results showed strong criterion-related validity evidence for the MTL-BR Battery, and provided important information regarding hemispheric specialization and interhemispheric cooperation. Future research is required to search for additional evidence of sensitivity, specificity and validity of the MTL-BR in samples with different types of aphasia and degrees of language impairment. Copyright © 2015 Elsevier B.V. All rights reserved.
Jung, Sung-Hoon; Kwon, Oh-Yun; Jeon, In-Cheol; Hwang, Ui-Jae; Weon, Jong-Hyuck
2018-01-01
The purposes of this study were to determine the intra-rater test-retest reliability of a smart phone-based measurement tool (SBMT) and a three-dimensional (3D) motion analysis system for measuring the transverse rotation angle of the pelvis during single-leg lifting (SLL) and the criterion validity of the transverse rotation angle of the pelvis measurement using SBMT compared with a 3D motion analysis system (3DMAS). Seventeen healthy volunteers performed SLL with their dominant leg without bending the knee until they reached a target placed 20 cm above the table. This study used a 3DMAS, considered the gold standard, to measure the transverse rotation angle of the pelvis to assess the criterion validity of the SBMT measurement. Intra-rater test-retest reliability was determined using the SBMT and 3DMAS using intra-class correlation coefficient (ICC) [3,1] values. The criterion validity of the SBMT was assessed with ICC [3,1] values. Both the 3DMAS (ICC = 0.77) and SBMT (ICC = 0.83) showed excellent intra-rater test-retest reliability in the measurement of the transverse rotation angle of the pelvis during SLL in a supine position. Moreover, the SBMT showed an excellent correlation with the 3DMAS (ICC = 0.99). Measurement of the transverse rotation angle of the pelvis using the SBMT showed excellent reliability and criterion validity compared with the 3DMAS.
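ICC [3,1], as named in this abstract, is conventionally the two-way mixed-effects, consistency, single-measurement form. A minimal sketch of that textbook formula from ANOVA mean squares, assuming a complete subjects-by-measurements matrix; the function name and toy angles are hypothetical.

```python
import numpy as np

def icc_3_1(ratings):
    """ICC(3,1): two-way mixed effects, consistency, single measurement.

    ratings: (subjects x measurements) matrix with no missing values.
    """
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape
    grand = x.mean()
    ss_rows = k * ((x.mean(axis=1) - grand) ** 2).sum()   # between subjects
    ss_cols = n * ((x.mean(axis=0) - grand) ** 2).sum()   # between measurements
    ss_err = ((x - grand) ** 2).sum() - ss_rows - ss_cols
    msr = ss_rows / (n - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse)

# Hypothetical angles: second device reads exactly 2 degrees higher than the first.
angles = [[1, 3], [2, 4], [3, 5], [4, 6]]
print(icc_3_1(angles))
```

Because this form measures consistency rather than absolute agreement, a constant offset between two devices does not lower it, so a high value alone does not show that the two devices return the same angles.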
ERIC Educational Resources Information Center
Bödeker, Malte; Bucksch, Jens; Wallmann-Sperlich, Birgit
2018-01-01
The Neighborhood Physical Activity Questionnaire allows the assessment of physical activity within and outside the neighborhood. Study objectives were to examine the criterion-related validity and health/functioning associations of Neighborhood Physical Activity Questionnaire-derived physical activity in German older adults. A total of 107 adults aged…
Mungovan, Sean F; Peralta, Paula J; Gass, Gregory C; Scanlan, Aaron T
2018-04-12
To examine the test-retest reliability and criterion validity of a high-intensity, netball-specific fitness test. Repeated measures, within-subject design. Eighteen female netball players competing in an international competition completed a trial of the Net-Test, which consists of 14 timed netball-specific movements. Players also completed a series of netball-relevant criterion fitness tests. Ten players completed an additional Net-Test trial one week later to assess test-retest reliability using intraclass correlation coefficient (ICC), typical error of measurement (TEM), and coefficient of variation (CV). The typical error of estimate expressed as CV and Pearson correlations were calculated between each criterion test and Net-Test performance to assess criterion validity. Five movements during the Net-Test displayed moderate ICC (0.84-0.90) and two movements displayed high ICC (0.91-0.93). Seven movements and heart rate taken during the Net-Test held low CV (<5%) with values ranging from 1.7 to 9.5% across measures. Total time (41.63±2.05 s) during the Net-Test possessed low CV and significant (p<0.05) correlations with 10 m sprint time (1.98±0.12 s; CV=4.4%, r=0.72), 20 m sprint time (3.38±0.19 s; CV=3.9%, r=0.79), 505 Change-of-Direction time (2.47±0.08 s; CV=2.0%, r=0.80), and maximum oxygen uptake (46.59±2.58 mL·kg⁻¹·min⁻¹; CV=4.5%, r=-0.66). The Net-Test possesses acceptable reliability for the assessment of netball fitness. Further, the high criterion validity for the Net-Test suggests a range of important netball-specific fitness elements are assessed in combination. Copyright © 2018 Sports Medicine Australia. Published by Elsevier Ltd. All rights reserved.
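The typical error of measurement and its CV form mentioned above can be sketched as follows. This assumes the widely used convention that the typical error is the standard deviation of the trial-to-trial differences divided by √2, expressed as a percentage of the grand mean. The abstract does not give the formula, so treat this as one plausible reading; the function names and times are hypothetical.

```python
import math
import statistics


def typical_error(trial1, trial2):
    """Typical error of measurement: SD of difference scores divided by sqrt(2)."""
    diffs = [b - a for a, b in zip(trial1, trial2)]
    return statistics.stdev(diffs) / math.sqrt(2)


def typical_error_cv(trial1, trial2):
    """Typical error expressed as a percentage of the grand mean (CV%)."""
    grand_mean = statistics.mean(list(trial1) + list(trial2))
    return 100 * typical_error(trial1, trial2) / grand_mean


# Hypothetical total times (s) for three players on two trials one week apart.
week1 = [40.0, 42.0, 44.0]
week2 = [41.0, 42.0, 43.0]
print(round(typical_error(week1, week2), 3), round(typical_error_cv(week1, week2), 2))
```

A CV below 5%, the threshold quoted in the abstract, would then mean the typical trial-to-trial noise is under one twentieth of the average score.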
Empirical agreement in model validation.
Jebeile, Julie; Barberousse, Anouk
2016-04-01
Empirical agreement is often used as an important criterion when assessing the validity of scientific models. However, it is by no means a sufficient criterion, as a model can be so adjusted as to fit available data even though it is based on hypotheses whose plausibility is known to be questionable. Our aim in this paper is to investigate the uses of empirical agreement within the process of model validation. Copyright © 2015 Elsevier Ltd. All rights reserved.
Turkish Version of Kolcaba's Immobilization Comfort Questionnaire: A Validity and Reliability Study.
Tosun, Betül; Aslan, Özlem; Tunay, Servet; Akyüz, Aygül; Özkan, Hüseyin; Bek, Doğan; Açıksöz, Semra
2015-12-01
The purpose of this study was to determine the validity and reliability of the Turkish version of the Immobilization Comfort Questionnaire (ICQ). The sample used in this methodological study consisted of 121 patients undergoing lower extremity arthroscopy in a training and research hospital. The validity study of the questionnaire assessed language validity, structural validity and criterion validity. Structural validity was evaluated via exploratory factor analysis. Criterion validity was evaluated by assessing the correlation between the visual analog scale (VAS) scores (i.e., the comfort and pain VAS scores) and the ICQ scores using Spearman's correlation test. The Kaiser-Meyer-Olkin coefficient and Bartlett's test of sphericity were used to determine the suitability of the data for factor analysis. Internal consistency was evaluated to determine reliability. The data were analyzed with SPSS version 15.00 for Windows. Descriptive statistics were presented as frequencies, percentages, means and standard deviations. A p value ≤ .05 was considered statistically significant. A moderate positive correlation was found between the ICQ scores and the VAS comfort scores; a moderate negative correlation was found between the ICQ and the VAS pain measures in the criterion validity analysis. Cronbach α values of .75 and .82 were found for the first and second measurements, respectively. The findings of this study reveal that the ICQ is a valid and reliable tool for assessing the comfort of patients in Turkey who are immobilized because of lower extremity orthopedic problems. Copyright © 2015. Published by Elsevier B.V.
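The internal consistency figures above are Cronbach α values. A minimal sketch of the standard formula, α = k/(k−1) · (1 − Σ item variances / variance of total score), is given below; the function name and the response data are hypothetical, and the study itself used SPSS rather than this code.

```python
import statistics


def cronbach_alpha(responses):
    """Cronbach's alpha for a respondents-by-items table.

    responses: list of rows, one row per respondent, one column per item.
    Uses sample variances (n - 1 denominator) throughout.
    """
    k = len(responses[0])  # number of items
    item_variances = [
        statistics.variance([row[j] for row in responses]) for j in range(k)
    ]
    total_variance = statistics.variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical answers from four respondents to a two-item scale.
answers = [[1, 2], [2, 1], [3, 3], [4, 5]]
print(round(cronbach_alpha(answers), 3))
```

Values of .75 and .82, as reported in the abstract, are conventionally read as acceptable to good internal consistency.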
Rönspies, Jelena; Schmidt, Alexander F; Melnikova, Anna; Krumova, Rosina; Zolfagari, Asadeh; Banse, Rainer
2015-07-01
The present study was conducted to validate an adaptation of the Implicit Relational Assessment Procedure (IRAP) as an indirect latency-based measure of sexual orientation. Furthermore, reliability and criterion validity of the IRAP were compared to two established indirect measures of sexual orientation: a Choice Reaction Time task (CRT) and a Viewing Time (VT) task. A sample of 87 heterosexual and 35 gay men completed all three indirect measures in an online study. The IRAP and the VT predicted sexual orientation nearly perfectly. Both measures also showed a considerable amount of convergent validity. Reliabilities (internal consistencies) reached satisfactory levels. In contrast, the CRT did not tap into sexual orientation in the present study. In sum, the VT measure performed best, with the IRAP showing only slightly lower reliability and criterion validity, whereas the CRT did not yield any evidence of reliability or criterion validity in the present research. The results were discussed in the light of specific task properties of the indirect latency-based measures (task-relevance vs. task-irrelevance).
ERIC Educational Resources Information Center
Brown, James M.; Chang, Gerald
1982-01-01
The predictive validity of the Minnesota Reading Assessment (MRA) when used to project potential performance of postsecondary vocational-technical education students was examined. Findings confirmed the MRA to be a valid predictor, although the error in prediction varied between the criterion variables. (Author/GK)
An evidence-based decision assistance model for predicting training outcome in juvenile guide dogs
Craigon, Peter J.; Blythe, Simon A.; England, Gary C. W.; Asher, Lucy
2017-01-01
Working dog organisations, such as Guide Dogs, need to regularly assess the behaviour of the dogs they train. In this study we developed a questionnaire-style behaviour assessment completed by training supervisors of juvenile guide dogs aged 5, 8 and 12 months old (n = 1,401), and evaluated aspects of its reliability and validity. Specifically, internal reliability, temporal consistency, construct validity, predictive criterion validity (comparing against later training outcome) and concurrent criterion validity (comparing against a standardised behaviour test) were evaluated. Thirty-nine questions were sourced either from previously published literature or created to meet requirements identified via Guide Dogs staff surveys and staff feedback. Internal reliability analyses revealed seven reliable and interpretable trait scales named according to the questions within them as: Adaptability; Body Sensitivity; Distractibility; Excitability; General Anxiety; Trainability and Stair Anxiety. Intra-individual temporal consistency of the scale scores between 5–8, 8–12 and 5–12 months was high. All scales except Body Sensitivity showed some degree of concurrent criterion validity. Predictive criterion validity was supported for all seven scales, since associations were found with training outcome at at least one age. Thresholds of z-scores on the scales were identified that were able to distinguish later training outcome by identifying 8.4% of all dogs withdrawn for behaviour and 8.5% of all qualified dogs, with 84% and 85% specificity. The questionnaire assessment was reliable and could detect traits that are consistent within individuals over time, despite juvenile dogs undergoing development during the study period. By applying thresholds to scores produced from the questionnaire, this assessment could prove to be a highly valuable decision-making tool for Guide Dogs.
This is the first questionnaire-style assessment of juvenile dogs that has shown value in predicting the training outcome of individual working dogs. PMID:28614347
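Applying a z-score threshold and reading off sensitivity and specificity, as described in the abstract above, can be sketched like this. The direction of the rule (flag a dog when its score is at or above the threshold), the function name, and the data are assumptions made for illustration; the abstract does not report the actual thresholds.

```python
def sensitivity_specificity(scores, outcomes, threshold):
    """Return (sensitivity, specificity) for the rule 'score >= threshold is flagged'.

    scores:   z-scores on one trait scale (hypothetical values here).
    outcomes: True if the outcome of interest occurred (e.g. withdrawn).
    """
    pairs = list(zip(scores, outcomes))
    tp = sum(1 for s, o in pairs if o and s >= threshold)      # flagged, outcome occurred
    fn = sum(1 for s, o in pairs if o and s < threshold)       # missed
    tn = sum(1 for s, o in pairs if not o and s < threshold)   # correctly not flagged
    fp = sum(1 for s, o in pairs if not o and s >= threshold)  # false alarm
    return tp / (tp + fn), tn / (tn + fp)


z_scores = [2.0, 0.5, -1.0, 1.5, -0.5, 0.0]
withdrawn = [True, True, False, False, False, False]
print(sensitivity_specificity(z_scores, withdrawn, threshold=1.0))
```

A strict threshold trades sensitivity for specificity, which is consistent with the pattern in the abstract: a small share of dogs identified, but with few false alarms.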
The Arthroscopic Surgical Skill Evaluation Tool (ASSET).
Koehler, Ryan J; Amsdell, Simon; Arendt, Elizabeth A; Bisson, Leslie J; Braman, Jonathan P; Bramen, Jonathan P; Butler, Aaron; Cosgarea, Andrew J; Harner, Christopher D; Garrett, William E; Olson, Tyson; Warme, Winston J; Nicandri, Gregg T
2013-06-01
Surgeries employing arthroscopic techniques are among the most commonly performed in orthopaedic clinical practice; however, valid and reliable methods of assessing the arthroscopic skill of orthopaedic surgeons are lacking. The Arthroscopic Surgery Skill Evaluation Tool (ASSET) will demonstrate content validity, concurrent criterion-oriented validity, and reliability when used to assess the technical ability of surgeons performing diagnostic knee arthroscopic surgery on cadaveric specimens. Cross-sectional study; Level of evidence, 3. Content validity was determined by a group of 7 experts using the Delphi method. Intra-articular performance of a right and left diagnostic knee arthroscopic procedure was recorded for 28 residents and 2 sports medicine fellowship-trained attending surgeons. Surgeon performance was assessed by 2 blinded raters using the ASSET. Concurrent criterion-oriented validity, interrater reliability, and test-retest reliability were evaluated. Content validity: The content development group identified 8 arthroscopic skill domains to evaluate using the ASSET. Concurrent criterion-oriented validity: Significant differences in the total ASSET score (P < .05) between novice, intermediate, and advanced experience groups were identified. Interrater reliability: The ASSET scores assigned by each rater were strongly correlated (r = 0.91, P < .01), and the intraclass correlation coefficient between raters for the total ASSET score was 0.90. Test-retest reliability: There was a significant correlation between ASSET scores for both procedures attempted by each surgeon (r = 0.79, P < .01). The ASSET appears to be a useful, valid, and reliable method for assessing surgeon performance of diagnostic knee arthroscopic surgery in cadaveric specimens. Studies are ongoing to determine its generalizability to other procedures as well as to the live operating room and other simulated environments.
ERIC Educational Resources Information Center
Tibbetts, Katherine A.; And Others
This paper describes the development of a criterion-referenced, performance-based measure of third grade reading comprehension. The primary purpose of the assessment is to contribute unique and valid information for use in the formative evaluation of a whole literacy program. A secondary purpose is to supplement other program efforts to…
ERIC Educational Resources Information Center
Deng, Weiling; Monfils, Lora
2017-01-01
Using simulated data, this study examined the impact of different levels of stringency of the valid case inclusion criterion on item response theory (IRT)-based true score equating over 5 years in the context of K-12 assessment when growth in student achievement is expected. Findings indicate that the use of the most stringent inclusion criterion…
ERIC Educational Resources Information Center
Wray, Kraig; Lai, Cheng-Fei; Sáez, Leilani; Alonzo, Julie; Tindal, Gerald
2013-01-01
We report the results of an alternate form reliability and criterion validity study of kindergarten and grade 1 (N = 84-199) reading measures from the easyCBM© assessment system and Stanford Early School Achievement Test/Stanford Achievement Test, 10th edition (SESAT/SAT-10) across 5 time points. The alternate form reliabilities ranged from…
Assessment scale of risk for surgical positioning injuries
Lopes, Camila Mendonça de Moraes; Haas, Vanderlei José; Dantas, Rosana Aparecida Spadoti; de Oliveira, Cheila Gonçalves; Galvão, Cristina Maria
2016-01-01
ABSTRACT Objective: to build and validate a scale to assess the risk of surgical positioning injuries in adult patients. Method: methodological research, conducted in two phases: construction and face and content validation of the scale and field research, involving 115 patients. Results: the Risk Assessment Scale for the Development of Injuries due to Surgical Positioning contains seven items, each of which presents five subitems. The scale score ranges between seven and 35 points in which, the higher the score, the higher the patient's risk. The Content Validity Index of the scale corresponded to 0.88. The application of Student's t-test for equality of means revealed the concurrent criterion validity between the scores on the Braden scale and the constructed scale. To assess the predictive criterion validity, the association was tested between the presence of pain deriving from surgical positioning and the development of pressure ulcer, using the score on the Risk Assessment Scale for the Development of Injuries due to Surgical Positioning (p<0.001). The interrater reliability was verified using the intraclass correlation coefficient, equal to 0.99 (p<0.001). Conclusion: the scale is a valid and reliable tool, but further research is needed to assess its use in clinical practice. PMID:27579925
ERIC Educational Resources Information Center
Bornstein, Robert F.
2011-01-01
Although definitions of validity have evolved considerably since L. J. Cronbach and P. E. Meehl's classic (1955) review, contemporary validity research continues to emphasize correlational analyses assessing predictor-criterion relationships, with most outcome criteria being self-reports. The present article describes an alternative way of…
ERIC Educational Resources Information Center
Fairclough, Stuart J.; Hilland, Toni A.; Vinson, Don; Stratton, Gareth
2012-01-01
The study purpose was to assess preliminary validity and reliability of the Physical Education and School Sport Environment Inventory (PESSEI), which was designed to audit physical education (PE) and school sport spaces and resources. PE teachers from eight English secondary schools completed the PESSEI. Criterion validity was assessed by…
The Measurement of Negative Creativity: Metrics and Relationships
ERIC Educational Resources Information Center
Kapoor, Hansika; Khan, Azizuddin
2016-01-01
Although the dark side of creativity and negative creativity are shaping into legitimate subconstructs, measures to assess the same remain to be validated. To meet this goal, two studies assessed the convergent, predictive, and criterion-related validities of two valence-inclusive creativity measures. One measure assessed the self-report…
Dueñas, María; Mendonça, Liliane; Sampaio, Rute; Gouvinhas, Cláudia; Oliveira, Daniela; Castro-Lopes, José Manuel; Azevedo, Luís Filipe
2017-03-01
The Bowel Function Index (BFI) is a simple and sound bowel function and opioid-induced constipation (OIC) screening tool. We aimed to develop the translation and cultural adaptation of this measure (BFI-P) and to assess its reliability and validity for the Portuguese language and a chronic pain population. The BFI-P was created after a process including translation, back translation and cultural adaptation. Participants (n = 226) were recruited in a chronic pain clinic and were assessed at baseline and after one week. Internal consistency, test-retest reliability, responsiveness, construct (convergent and known groups) and factorial validity were assessed. Test-retest reliability had an intra-class correlation of 0.605 for BFI mean score. Internal consistency of BFI had Cronbach's alpha of 0.865. The construct validity of BFI-P was shown to be excellent and the exploratory factor analysis confirmed its unidimensional structure. The responsiveness of BFI-P was excellent, with a suggested 17-19 point and 8-12 point change in score constituting a clinically relevant change in constipation for patients with and without previous constipation, respectively. This study had some limitations, namely, the criterion validity of BFI-P was not directly assessed; and the absence of a direct criterion for OIC precluded the assessment of the criterion based responsiveness of BFI-P. Nevertheless, BFI may importantly contribute to better OIC screening and its Portuguese version (BFI-P) has been shown to have excellent reliability, internal consistency, validity and responsiveness. Further suggestions regarding statistically and clinically important change cut-offs for this instrument are presented.
Eckner, James T.; Richardson, James K.; Kim, Hogene; Joshi, Monica S.; Oh, Youkeun K.; Ashton-Miller, James A.
2015-01-01
Summary Slowed reaction time (RT) represents both a risk factor for and a consequence of sport concussion. The purpose of this study was to determine the reliability and criterion validity of a novel clinical test of simple and complex RT, called RTclin, in contact sport athletes. Both tasks were adapted from the well-known ruler drop test of RT and involve manually grasping a falling vertical shaft upon its release, with the complex task employing a go/no-go paradigm based on a slight cue. In 46 healthy contact sport athletes (24 men: M age = 16.3 yr., SD = 5.0; 22 women: M age = 15.0 yr., SD = 4.0) whose sports included soccer, ice hockey, American football, martial arts, wrestling, and lacrosse, the latency and accuracy of simple and complex RTclin had acceptable test-retest and inter-rater reliabilities and correlated with a computerized criterion standard, the Axon Computerized Cognitive Assessment Tool. Medium to large effect sizes were found. The novel RTclin tests have acceptable reliability and criterion validity for clinical use and hold promise as concussion assessment tools. PMID:26106803
Steele, Catriona M.; Namasivayam-MacDonald, Ashwini M.; Guida, Brittany T.; Cichero, Julie A.; Duivestein, Janice; Hanson, Ben; Lam, Peter; Riquelme, Luis F.
2018-01-01
Objective To assess consensual validity, interrater reliability, and criterion validity of the International Dysphagia Diet Standardisation Initiative Functional Diet Scale, a new functional outcome scale intended to capture the severity of oropharyngeal dysphagia, as represented by the degree of diet texture restriction recommended for the patient. Design Participants assigned International Dysphagia Diet Standardisation Initiative Functional Diet Scale scores to 16 clinical cases. Consensual validity was measured against reference scores determined by an author reference panel. Interrater reliability was measured overall and across quartile subsets of the dataset. Criterion validity was evaluated versus Functional Oral Intake Scale (FOIS) scores assigned by survey respondents to the same case scenarios. Feedback was requested regarding ease and likelihood of use. Setting Web-based survey. Participants Respondents (N=170) from 29 countries. Interventions Not applicable. Main Outcome Measures Consensual validity (percent agreement and Kendall τ), criterion validity (Spearman rank correlation), and interrater reliability (Kendall concordance and intraclass coefficients). Results The International Dysphagia Diet Standardisation Initiative Functional Diet Scale showed strong consensual validity, criterion validity, and interrater reliability. Scenarios involving liquid-only diets, transition from nonoral feeding, or trial diet advances in therapy showed the poorest consensus, indicating a need for clear instructions on how to score these situations. The International Dysphagia Diet Standardisation Initiative Functional Diet Scale showed greater sensitivity than the FOIS to specific changes in diet. Most (>70%) respondents indicated enthusiasm for implementing the International Dysphagia Diet Standardisation Initiative Functional Diet Scale.
Conclusions This initial validation study suggests that the International Dysphagia Diet Standardisation Initiative Functional Diet Scale has strong consensual and criterion validity and can be used reliably by clinicians to capture diet texture restriction and progression in people with dysphagia. PMID:29428348
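Criterion validity in the study above was evaluated with the Spearman rank correlation. A minimal sketch of that statistic follows, computed as the Pearson correlation of average ranks so that tied ordinal scores are handled; the helper names and the paired scores are hypothetical and are not taken from the study.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for p in range(i, j + 1):
            ranks[order[p]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical paired ordinal scores for six cases on two scales.
new_scale = [0, 2, 4, 4, 6, 8]
reference = [1, 3, 4, 5, 6, 7]
print(round(spearman(new_scale, reference), 3))
```

Rank-based correlation suits ordinal scales such as these because it depends only on the ordering of scores, not on the spacing between scale points.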
Steele, Catriona M; Namasivayam-MacDonald, Ashwini M; Guida, Brittany T; Cichero, Julie A; Duivestein, Janice; Hanson, Ben; Lam, Peter; Riquelme, Luis F
2018-05-01
To assess consensual validity, interrater reliability, and criterion validity of the International Dysphagia Diet Standardisation Initiative Functional Diet Scale, a new functional outcome scale intended to capture the severity of oropharyngeal dysphagia, as represented by the degree of diet texture restriction recommended for the patient. Participants assigned International Dysphagia Diet Standardisation Initiative Functional Diet Scale scores to 16 clinical cases. Consensual validity was measured against reference scores determined by an author reference panel. Interrater reliability was measured overall and across quartile subsets of the dataset. Criterion validity was evaluated versus Functional Oral Intake Scale (FOIS) scores assigned by survey respondents to the same case scenarios. Feedback was requested regarding ease and likelihood of use. Web-based survey. Respondents (N=170) from 29 countries. Not applicable. Consensual validity (percent agreement and Kendall τ), criterion validity (Spearman rank correlation), and interrater reliability (Kendall concordance and intraclass coefficients). The International Dysphagia Diet Standardisation Initiative Functional Diet Scale showed strong consensual validity, criterion validity, and interrater reliability. Scenarios involving liquid-only diets, transition from nonoral feeding, or trial diet advances in therapy showed the poorest consensus, indicating a need for clear instructions on how to score these situations. The International Dysphagia Diet Standardisation Initiative Functional Diet Scale showed greater sensitivity than the FOIS to specific changes in diet. Most (>70%) respondents indicated enthusiasm for implementing the International Dysphagia Diet Standardisation Initiative Functional Diet Scale. 
This initial validation study suggests that the International Dysphagia Diet Standardisation Initiative Functional Diet Scale has strong consensual and criterion validity and can be used reliably by clinicians to capture diet texture restriction and progression in people with dysphagia. Copyright © 2018 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.
The Arthroscopic Surgical Skill Evaluation Tool (ASSET)
Koehler, Ryan J.; Amsdell, Simon; Arendt, Elizabeth A; Bisson, Leslie J; Braman, Jonathan P; Butler, Aaron; Cosgarea, Andrew J; Harner, Christopher D; Garrett, William E; Olson, Tyson; Warme, Winston J.; Nicandri, Gregg T.
2014-01-01
Background Surgeries employing arthroscopic techniques are among the most commonly performed in orthopaedic clinical practice; however, valid and reliable methods of assessing the arthroscopic skill of orthopaedic surgeons are lacking. Hypothesis The Arthroscopic Surgery Skill Evaluation Tool (ASSET) will demonstrate content validity, concurrent criterion-oriented validity, and reliability when used to assess the technical ability of surgeons performing diagnostic knee arthroscopy on cadaveric specimens. Study Design Cross-sectional study; Level of evidence, 3. Methods Content validity was determined by a group of seven experts using a Delphi process. Intra-articular performance of a right and left diagnostic knee arthroscopy was recorded for twenty-eight residents and two sports medicine fellowship-trained attending surgeons. Subject performance was assessed by two blinded raters using the ASSET. Concurrent criterion-oriented validity, inter-rater reliability, and test-retest reliability were evaluated. Results Content validity: The content development group identified 8 arthroscopic skill domains to evaluate using the ASSET. Concurrent criterion-oriented validity: Significant differences in total ASSET score (p<0.05) between novice, intermediate, and advanced experience groups were identified. Inter-rater reliability: The ASSET scores assigned by each rater were strongly correlated (r=0.91, p<0.01) and the intra-class correlation coefficient between raters for the total ASSET score was 0.90. Test-retest reliability: There was a significant correlation between ASSET scores for both procedures attempted by each individual (r = 0.79, p<0.01). Conclusion The ASSET appears to be a useful, valid, and reliable method for assessing surgeon performance of diagnostic knee arthroscopy in cadaveric specimens. Studies are ongoing to determine its generalizability to other procedures as well as to the live OR and other simulated environments. PMID:23548808
Beehler, Sarah; Ahern, Jennifer; Balmer, Brandi; Kuhlman, Jennifer
2017-01-01
This pilot study evaluated the validity and reliability of an Experience of Neighborhood (EON) measure developed to assess neighborhood characteristics that shape reintegration opportunities for returning service members and their families. A total of 91 post-9/11 veterans and spouses completed a survey administered at the Minnesota State Fair. Participants self-reported on their reintegration status (veterans), social functioning (spouses), social support, and mental health. EON factor structure, internal consistency reliability, and validity (discriminant, content, criterion) were analyzed. The EON measure showed adequate reliability, discriminant validity, and content validity. More work is needed to assess criterion validity because EON scores were not correlated with scores on a Census-based index used to measure quality of military neighborhoods. The EON may be useful in assessing broad local factors influencing health among returning veterans and spouses. More research is needed to understand geographic variation in neighborhood conditions and how those affect reintegration and mental health for military families. PMID:28936370
Five-level emergency triage systems: variation in assessment of validity.
Kuriyama, Akira; Urushidani, Seigo; Nakayama, Takeo
2017-11-01
Triage systems are scales developed to rate the degree of urgency among patients who arrive at EDs. A number of different scales are in use; however, the way in which they have been validated is inconsistent. Also, it is difficult to define a surrogate that accurately predicts urgency. This systematic review described reference standards and measures used in previous validation studies of five-level triage systems. We searched PubMed, EMBASE and CINAHL to identify studies that had assessed the validity of five-level triage systems and described the reference standards and measures applied in these studies. Studies were divided into those using criterion validity (reference standards developed by expert panels or triage systems already in use) and those using construct validity (prognosis, costs and resource use). A total of 57 studies examined criterion and construct validity of 14 five-level triage systems. Criterion validity was examined by evaluating (1) agreement between the assigned degree of urgency with objective standard criteria (12 studies), (2) overtriage and undertriage (9 studies) and (3) sensitivity and specificity of triage systems (7 studies). Construct validity was examined by looking at (4) the associations between the assigned degree of urgency and measures gauged in EDs (48 studies) and (5) the associations between the assigned degree of urgency and measures gauged after hospitalisation (13 studies). Particularly, among 46 validation studies of the most commonly used triages (Canadian Triage and Acuity Scale, Emergency Severity Index and Manchester Triage System), 13 and 39 studies examined criterion and construct validity, respectively. Previous studies applied various reference standards and measures to validate five-level triage systems. They either created their own reference standard or used a combination of severity/resource measures. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2017. 
All rights reserved. No commercial use is permitted unless otherwise expressly granted.
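The agreement, overtriage, and undertriage measures listed in the review above can be sketched as simple proportions against a reference standard. The sketch assumes level 1 is the most urgent category, so a lower assigned number than the reference counts as overtriage; individual studies may define these rates differently, and the function name and data are hypothetical.

```python
def triage_rates(assigned, reference):
    """Return (agreement, overtriage, undertriage) as proportions of all cases.

    Assumes a five-level scale where 1 is the most urgent level, so
    assigned < reference means the patient was rated more urgent than warranted.
    """
    n = len(assigned)
    agree = sum(1 for a, r in zip(assigned, reference) if a == r)
    over = sum(1 for a, r in zip(assigned, reference) if a < r)
    under = sum(1 for a, r in zip(assigned, reference) if a > r)
    return agree / n, over / n, under / n


# Hypothetical triage levels from a nurse versus an expert-panel reference.
nurse = [1, 2, 3, 4, 5, 2]
panel = [1, 3, 3, 3, 5, 2]
print(triage_rates(nurse, panel))
```

Reporting the two error directions separately matters clinically, since undertriage delays care while overtriage consumes resources.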
Validity Arguments for Diagnostic Assessment Using Automated Writing Evaluation
ERIC Educational Resources Information Center
Chapelle, Carol A.; Cotos, Elena; Lee, Jooyoung
2015-01-01
Two examples demonstrate an argument-based approach to validation of diagnostic assessment using automated writing evaluation (AWE). "Criterion"®, was developed by Educational Testing Service to analyze students' papers grammatically, providing sentence-level error feedback. An interpretive argument was developed for its use as part of…
Guise, Brian J; Thompson, Matthew D; Greve, Kevin W; Bianchini, Kevin J; West, Laura
2014-03-01
The current study assessed performance validity on the Stroop Color and Word Test (Stroop) in mild traumatic brain injury (TBI) using criterion-groups validation. The sample consisted of 77 patients with a reported history of mild TBI. Data from 42 moderate-severe TBI and 75 non-head-injured patients with other clinical diagnoses were also examined. TBI patients were categorized on the basis of Slick, Sherman, and Iverson (1999) criteria for malingered neurocognitive dysfunction (MND). Classification accuracy is reported for three indicators (Word, Color, and Color-Word residual raw scores) from the Stroop across a range of injury severities. With false-positive rates set at approximately 5%, sensitivity was as high as 29%. The clinical implications of these findings are discussed. © 2012 The British Psychological Society.
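Reporting sensitivity at a false-positive rate of approximately 5%, as in the abstract above, amounts to picking a cutoff from the comparison group and then applying it to the criterion group. The sketch below assumes that lower scores indicate invalid performance and that the cutoff is chosen among observed comparison-group scores; both are illustrative assumptions, as are the names and numbers.

```python
def cutoff_at_false_positive_rate(negative_scores, max_fpr=0.05):
    """Largest observed cutoff whose false-positive rate stays at or below max_fpr.

    A case is flagged when score <= cutoff (lower = more suspicious).
    Returns None if even the lowest observed score exceeds max_fpr.
    """
    best = None
    for cutoff in sorted(set(negative_scores)):
        fpr = sum(1 for s in negative_scores if s <= cutoff) / len(negative_scores)
        if fpr <= max_fpr:
            best = cutoff
    return best


def sensitivity(positive_scores, cutoff):
    """Share of the criterion (positive) group flagged by the cutoff."""
    return sum(1 for s in positive_scores if s <= cutoff) / len(positive_scores)


# Hypothetical scores: 20 comparison-group patients and 4 criterion-group patients.
comparison_group = list(range(31, 51))
criterion_group = [25, 30, 35, 40]
cutoff = cutoff_at_false_positive_rate(comparison_group)
print(cutoff, sensitivity(criterion_group, cutoff))
```

Fixing the false-positive rate first reflects the priority in this setting: wrongly labelling a genuine patient is treated as the costlier error, so sensitivity is whatever remains at that cutoff.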
Nascimento-Ferreira, Marcus V; Collese, Tatiana S; de Moraes, Augusto César F; Rendo-Urteaga, Tara; Moreno, Luis A; Carvalho, Heráclito B
2016-12-01
Sleep duration has been associated with several health outcomes in children and adolescents. As an extensive number of questionnaires are currently used to investigate sleep schedule or sleep time, we performed a systematic review of criterion validation of sleep time questionnaires for children and adolescents, considering accelerometers as the reference method. We found a strong correlation between questionnaires and accelerometers for weeknights and a moderate correlation for weekend nights. When considering only studies performing a reliability assessment of the used questionnaires, a significant increase in the correlations for both weeknights and weekend nights was observed. In conclusion, moderate to strong criterion validity of sleep time questionnaires was observed; however, the reliability assessment of the questionnaires showed strong validation performance. Copyright © 2015 Elsevier Ltd. All rights reserved.
Validity of Various Methods for Determining Velocity, Force, and Power in the Back Squat.
Banyard, Harry G; Nosaka, Ken; Sato, Kimitake; Haff, G Gregory
2017-10-01
To examine the validity of 2 kinematic systems for assessing mean velocity (MV), peak velocity (PV), mean force (MF), peak force (PF), mean power (MP), and peak power (PP) during the full-depth free-weight back squat performed with maximal concentric effort. Ten strength-trained men (26.1 ± 3.0 y, 1.81 ± 0.07 m, 82.0 ± 10.6 kg) performed three 1-repetition-maximum (1RM) trials on 3 separate days, encompassing lifts performed at 6 relative intensities including 20%, 40%, 60%, 80%, 90%, and 100% of 1RM. Each repetition was simultaneously recorded by a PUSH band and commercial linear position transducer (LPT) (GymAware [GYM]) and compared with measurements collected by a laboratory-based testing device consisting of 4 LPTs and a force plate. Trials 2 and 3 were used for validity analyses. Combining all 120 repetitions indicated that the GYM was highly valid for assessing all criterion variables while the PUSH was only highly valid for estimations of PF (r = .94, CV = 5.4%, ES = 0.28, SEE = 135.5 N). At each relative intensity, the GYM was highly valid for assessing all criterion variables except for PP at 20% (ES = 0.81) and 40% (ES = 0.67) of 1RM. Moreover, the PUSH was only able to accurately estimate PF across all relative intensities (r = .92-.98, CV = 4.0-8.3%, ES = 0.04-0.26, SEE = 79.8-213.1 N). PUSH accuracy for determining MV, PV, MF, MP, and PP across all 6 relative intensities was questionable for the back squat, yet the GYM was highly valid at assessing all criterion variables, with some caution given to estimations of MP and PP performed at lighter loads.
Mayorga-Vega, Daniel; Bocanegra-Parrilla, Raúl; Ornelas, Martha; Viciana, Jesús
2016-01-01
The main purpose of the present meta-analysis was to examine the criterion-related validity of the distance- and time-based walk/run tests for estimating cardiorespiratory fitness among apparently healthy children and adults. Relevant studies were searched from seven electronic bibliographic databases up to August 2015 and through other sources. The Hunter-Schmidt's psychometric meta-analysis approach was conducted to estimate the population criterion-related validity of the following walk/run tests: 5,000 m, 3 miles, 2 miles, 3,000 m, 1.5 miles, 1 mile, 1,000 m, ½ mile, 600 m, 600 yd, ¼ mile, 15 min, 12 min, 9 min, and 6 min. From the 123 included studies, a total of 200 correlation values were analyzed. The overall results showed that the criterion-related validity of the walk/run tests for estimating maximum oxygen uptake ranged from low to moderate (rp = 0.42-0.79), with the 1.5 mile (rp = 0.79, 0.73-0.85) and 12 min walk/run tests (rp = 0.78, 0.72-0.83) having the higher criterion-related validity for distance- and time-based field tests, respectively. The present meta-analysis also showed that sex, age and maximum oxygen uptake level do not seem to affect the criterion-related validity of the walk/run tests. When the evaluation of an individual's maximum oxygen uptake attained during a laboratory test is not feasible, the 1.5 mile and 12 min walk/run tests represent useful alternatives for estimating cardiorespiratory fitness. As in the assessment with any physical fitness field test, evaluators must be aware that the performance score of the walk/run field tests is simply an estimation and not a direct measure of cardiorespiratory fitness.
Measuring Sexual Motives: A Test of the Psychometric Properties of the Sexual Motivations Scale.
Jardin, Charles; Garey, Lorra; Zvolensky, Michael J
2017-01-01
Sexual motives refer to functions served by sexual behavior. The Sex Motivations Scale (SMS) has frequently been used to assess sexual motives. At its development, the SMS demonstrated good internal consistency; convergent, divergent, and criterion validity; and configural invariance across sex, age, and Caucasians and African Americans. Yet the metric and scalar invariance of the SMS has not been examined, nor has the measurement invariance of the SMS across Hispanic and Asian Americans, sexual minority status, and relationship status been tested. The criterion validity of the SMS also has yet to be examined for nonintercourse sexual behaviors, such as sexting. The present study aimed to address these gaps in a diverse sample of 2,201 college students (77.60% female; M age = 22.06; 27.84% Caucasian). Results further affirmed the configural, metric, and scalar invariance of the SMS. The convergent and divergent validity of the SMS was supported in relation to positive and negative affect and attachment patterns; and specific SMS subscales demonstrated associations with sexual intercourse behaviors and sexting, supporting the criterion validity of the SMS. These findings suggest the relevance of the SMS in assessing sexual motives across diverse populations and behaviors.
Goodman, L A; Corcoran, C; Turner, K; Yuan, N; Green, B L
1998-07-01
This article reviews the psychometric properties of the Stressful Life Events Screening Questionnaire (SLESQ), a recently developed trauma history screening measure, and discusses the complexities involved in assessing trauma exposure. There are relatively few general measures of exposure to a variety of types of traumatic events, and most of those that exist have not been subjected to rigorous psychometric evaluation. The SLESQ showed good test-retest reliability, with a median kappa of .73, adequate convergent validity (with a lengthier interview) with a median kappa of .64, and good discrimination between Criterion A and non-Criterion A events. The discussion addresses some of the challenges of assessing traumatic event exposure along the dimensions of defining traumatic events, assessment methodologies, reporting consistency, and incident validation.
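The kappa values quoted above are chance-corrected agreement statistics. As a rough illustration only, Cohen's kappa can be computed from a square agreement table as sketched below; the function name and the example counts are hypothetical and are not taken from the SLESQ study.

```python
def cohens_kappa(table):
    """Cohen's kappa for a square agreement table.

    table[i][j] = number of cases placed in category i by the first
    assessment and in category j by the second assessment.
    """
    n = sum(sum(row) for row in table)
    k = len(table)
    # Observed agreement: proportion of cases on the diagonal.
    p_obs = sum(table[i][i] for i in range(k)) / n
    # Marginal proportions for each assessment.
    row_marg = [sum(table[i]) / n for i in range(k)]
    col_marg = [sum(table[i][j] for i in range(k)) / n for j in range(k)]
    # Agreement expected by chance from the marginals.
    p_exp = sum(row_marg[i] * col_marg[i] for i in range(k))
    return (p_obs - p_exp) / (1 - p_exp)


# Hypothetical 2x2 table (event endorsed / not endorsed on two occasions).
example = [[20, 5], [10, 15]]
print(round(cohens_kappa(example), 2))
```

In this made-up table the observed agreement is 0.70 and the chance agreement is 0.50, so kappa is 0.40.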
Design and validation of a comprehensive fecal incontinence questionnaire.
Macmillan, Alexandra K; Merrie, Arend E H; Marshall, Roger J; Parry, Bryan R
2008-10-01
Fecal incontinence can have a profound effect on quality of life. Its prevalence remains uncertain because of stigma, lack of consistent definition, and dearth of validated measures. This study was designed to develop a valid clinical and epidemiologic questionnaire, building on current literature and expertise. Patients and experts undertook face validity testing. Construct validity, criterion validity, and test-retest reliability were assessed. Construct validity comprised factor analysis and internal consistency of the quality of life scale. Known-groups validity was tested against 77 control subjects by using regression models. Questionnaire results were compared with a stool diary for criterion validity. Test-retest reliability was calculated from repeated questionnaire completion. The questionnaire achieved good face validity. It was completed by 104 patients. The quality of life scale had four underlying traits (factor analysis) and high internal consistency (overall Cronbach alpha = 0.97). Patients and control subjects answered the questionnaire significantly differently (P < 0.01) in known-groups validity testing. Criterion validity assessment found mean differences close to zero. Median reliability for the whole questionnaire was 0.79 (range, 0.35-1). This questionnaire compares favorably with other available instruments, although the interpretation of stool consistency requires further research. Its sensitivity to treatment still needs to be investigated.
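Internal consistency in the abstract above is summarized with Cronbach's alpha. A minimal sketch of the usual formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), is given below; the function name and the small data set are hypothetical and only illustrate the arithmetic.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents.

    scores: list of rows, one row per respondent, one column per item.
    Sample variances (n - 1 denominator) are used throughout.
    """
    k = len(scores[0])
    # Variance of each item across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Variance of the summed scale score.
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical responses: three respondents, two items.
data = [[1, 1], [2, 1], [3, 4]]
print(round(cronbach_alpha(data), 3))
```

For these made-up responses the item variances are 1 and 3 and the total-score variance is 7, giving alpha = 2 * (1 - 4/7), about 0.857.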
Huang, X N; Zhang, Y; Feng, W W; Wang, H S; Cao, B; Zhang, B; Yang, Y F; Wang, H M; Zheng, Y; Jin, X M; Jia, M X; Zou, X B; Zhao, C X; Robert, J; Jing, Jin
2017-06-02
Objective: To evaluate the reliability and validity of the warning signs checklist developed by the National Health and Family Planning Commission of the People's Republic of China (NHFPC), so as to determine the screening effectiveness of warning signs for developmental problems of early childhood. Method: Stratified random sampling was used to assess the reliability and validity of the warning signs checklist, and 2,110 children 0 to 6 years of age (1,513 low-risk subjects and 597 high-risk subjects) were recruited from 11 provinces of China. The reliability evaluation for the warning signs included the test-retest reliability and interrater reliability. With the use of Age and Stage Questionnaire (ASQ) and Gesell Development Diagnosis Scale (GESELL) as the criterion scales, criterion validity was assessed by determining the correlation and consistency between the screening results of warning signs and the criterion scales. Result: In terms of the warning signs, the screening positive rates at different ages ranged from 10.8% (21/141) to 26.2% (51/137). The median (interquartile range) testing time for each subject was 1 (0.6) minute. Both the test-retest reliability and interrater reliability of warning signs reached 0.7 or above, indicating that the stability was good. In terms of validity assessment, there was remarkable consistency between ASQ and warning signs, with a Kappa value of 0.63. With the use of GESELL as criterion, it was determined that the sensitivity of warning signs in children with suspected developmental delay was 82.2%, and the specificity was 77.7%. The overall Youden index was 0.6. Conclusion: The reliability and validity of the warning signs checklist for screening early childhood developmental problems have met the basic requirements of psychological screening scales, with the characteristics of short testing time and easy operation. Thus, this warning signs checklist can be used for screening psychological and behavioral problems of early childhood, especially in community settings.
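The Youden index reported above follows directly from the stated sensitivity and specificity (J = sensitivity + specificity - 1). The short sketch below reproduces that arithmetic; the function name is illustrative only.

```python
def youden_index(sensitivity, specificity):
    """Youden's J statistic, with both inputs given as proportions."""
    return sensitivity + specificity - 1.0


# Sensitivity and specificity reported with GESELL as the criterion.
j = youden_index(0.822, 0.777)
print(round(j, 3))
```

With the reported values, J is about 0.599, which matches the rounded figure of 0.6 given in the abstract.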
Development and psychometric testing of the Cancer Knowledge Scale for Elders.
Su, Ching-Ching; Chen, Yuh-Min; Kuo, Bo-Jein
2009-03-01
To develop the Cancer Knowledge Scale for Elders and test its validity and reliability. The number of elders suffering from cancer is increasing. To facilitate cancer prevention behaviours among elders, they should be educated about cancer-related knowledge. Prior to designing a programme that would respond to the special needs of elders, understanding the cancer-related knowledge within this population was necessary. However, extensive review of the literature revealed a lack of appropriate instruments for measuring cancer-related knowledge. A valid and reliable cancer knowledge scale for elders is necessary. A non-experimental methodological design was used to test the psychometric properties of the Cancer Knowledge Scale for Elders. Item analysis was first performed to screen out items that had low corrected item-total correlation coefficients. Construct validity was examined with a principal component method of exploratory factor analysis. Cancer-related health behaviour was used as the criterion variable to evaluate criterion-related validity. Internal consistency reliability was assessed by the KR-20. Stability was determined by two-week test-retest reliability. The factor analysis yielded a four-factor solution accounting for 49.5% of the variance. For criterion-related validity, cancer knowledge was positively correlated with cancer-related health behaviour (r = 0.78, p < 0.001). The KR-20 coefficients were 0.85, 0.76, 0.79 and 0.67 for the four factors and 0.87 for the total scale. Test-retest reliability over a two-week period was 0.83 (p < 0.001). This study provides evidence for content validity, construct validity, criterion-related validity, internal consistency and stability of the Cancer Knowledge Scale for Elders. The results show that this scale is an easy-to-use instrument for elders and has adequate validity and reliability. The scale can be used as an assessment instrument when implementing cancer education programmes for elders. It can also be used to evaluate the effects of education programmes.
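The KR-20 coefficient used above is the internal-consistency statistic for dichotomous (right/wrong) items. A minimal sketch of the standard formula, KR-20 = k/(k-1) * (1 - sum(p*q) / variance of total scores), follows; the function name and the toy response matrix are hypothetical, and the population variance (n denominator) is assumed here so that it is consistent with the p*q item variances.

```python
def kr20(responses):
    """Kuder-Richardson 20 for 0/1 item responses.

    responses: list of rows, one per respondent, each a list of 0/1 scores.
    """
    n = len(responses)
    k = len(responses[0])
    # Proportion answering each item correctly.
    p = [sum(row[i] for row in responses) / n for i in range(k)]
    pq_sum = sum(pi * (1 - pi) for pi in p)
    # Population variance of the total scores.
    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n
    var_total = sum((t - mean_total) ** 2 for t in totals) / n
    return (k / (k - 1)) * (1 - pq_sum / var_total)


# Hypothetical responses: four respondents, three knowledge items.
toy = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
print(kr20(toy))
```

For this toy matrix the sum of p*q is 0.625 and the total-score variance is 1.25, so KR-20 = 1.5 * (1 - 0.5) = 0.75.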
Multi-Informant Assessment of Temperament in Children with Externalizing Behavior Problems
ERIC Educational Resources Information Center
Copeland, William; Landry, Kerry; Stanger, Catherine; Hudziak, James J.
2004-01-01
We examined the criterion validity of parent and self-report versions of the Junior Temperament and Character Inventory (JTCI) in children with high levels of externalizing problems. The sample included 412 children (206 participants and 206 siblings) participating in a family study of attention and aggressive behavior problems. Criterion validity…
The brief multidimensional students' life satisfaction scale-college version.
Zullig, Keith J; Huebner, E Scott; Patton, Jon M; Murray, Karen A
2009-01-01
To investigate the psychometric properties of the BMSLSS-College among 723 college students. Internal consistency estimates explored scale reliability, factor analysis explored construct validity, and known-groups validity was assessed using the National College Youth Risk Behavior Survey and Harvard School of Public Health College Alcohol Study. Criterion-related validity was explored through analyses with the CDC's health-related quality of life scale and a social isolation scale. Acceptable internal consistency reliability, construct, known-groups, and criterion-related validity were established. Findings offer preliminary support for the BMSLSS-C; it could be useful in large-scale research studies, applied screening contexts, and for program evaluation purposes toward achieving Healthy People 2010 objectives.
Validation of the Organizational Culture Assessment Instrument
Heritage, Brody; Pollock, Clare; Roberts, Lynne
2014-01-01
Organizational culture is a commonly studied area in industrial/organizational psychology due to its important role in workplace behaviour, cognitions, and outcomes. Jung et al.'s [1] review of the psychometric properties of organizational culture measurement instruments noted many instruments have limited validation data despite frequent use in both theoretical and applied situations. The Organizational Culture Assessment Instrument (OCAI) has had conflicting data regarding its psychometric properties, particularly regarding its factor structure. Our study examined the factor structure and criterion validity of the OCAI using robust analysis methods on data gathered from 328 (females = 226, males = 102) Australian employees. Confirmatory factor analysis supported a four factor structure of the OCAI for both ideal and current organizational culture perspectives. Current organizational culture data demonstrated expected reciprocally-opposed relationships between three of the four OCAI factors and the outcome variable of job satisfaction but ideal culture data did not, thus indicating possible weak criterion validity when the OCAI is used to assess ideal culture. Based on the mixed evidence regarding the measure's properties, further examination of the factor structure and broad validity of the measure is encouraged. PMID:24667839
Development and Validation of the Spanish-English Language Proficiency Scale (SELPS)
ERIC Educational Resources Information Center
Smyk, Ekaterina; Restrepo, M. Adelaida; Gorin, Joanna S.; Gray, Shelley
2013-01-01
Purpose: This study examined the development and validation of a criterion-referenced Spanish-English Language Proficiency Scale (SELPS) that was designed to assess the oral language skills of sequential bilingual children ages 4-8. This article reports results for the English proficiency portion of the scale. Method: The SELPS assesses syntactic…
Lifesource XL-18 pedometer for measuring steps under controlled and free-living conditions.
Liu, Sam; Brooks, Dina; Thomas, Scott; Eysenbach, Gunther; Nolan, Robert Peter
2015-01-01
The primary aim was to examine the criterion and construct validity and test-retest reliability of the Lifesource XL-18 pedometer (A&D Medical, Toronto, ON, Canada) for measuring steps under controlled and free-living activities. The influence of body mass index, waist size and walking speed on the criterion validity of XL-18 was also explored. Forty adults (35-74 years) performed a 6-min walk test in the controlled condition, and the criterion validity of XL-18 was assessed by comparing it to steps counted manually. Thirty-five adults participated in the free-living condition and the construct validity of XL-18 was assessed by comparing it to Yamax SW-200 (YAMAX Health & Sports, Inc., San Antonio, TX, USA). During the controlled condition, XL-18 did not significantly differ from criterion (P > 0.05) and no systematic error was found using Bland-Altman analysis. The accuracy of XL-18 decreased with slower walking speed (P = 0.001). During the free-living condition, Bland-Altman analysis revealed that XL-18 overestimated daily steps by 327 ± 118 relative to the Yamax (P = 0.004). However, the absolute percent error (APE) (6.5 ± 0.58%) was still within an acceptable range. XL-18 did not differ statistically between pant pockets. XL-18 is suitable for measuring steps in controlled and free-living conditions. However, caution may be required when interpreting the steps recorded under slower speeds and free-living conditions.
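Bland-Altman analysis, as used in the abstract above, summarizes agreement between two devices by the mean of the paired differences (the bias) and limits of agreement at roughly 1.96 standard deviations either side of it. The sketch below shows that calculation under the usual normal-difference assumption; the function name and the step counts are hypothetical and are not data from the study.

```python
from statistics import mean, stdev


def bland_altman(device, reference):
    """Return (bias, lower limit, upper limit) for paired measurements.

    bias is the mean of device - reference; the limits of agreement are
    bias +/- 1.96 * sample standard deviation of the differences.
    """
    diffs = [d - r for d, r in zip(device, reference)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical step counts from a test pedometer and a reference count.
pedometer = [10, 12, 14]
manual = [9, 12, 12]
print(bland_altman(pedometer, manual))
```

Here the differences are 1, 0 and 2, so the bias is 1 step and the limits of agreement are about -0.96 and 2.96 steps.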
Tousignant, Michel; Smeesters, Cécil; Breton, Anne-Marie; Breton, Emilie; Corriveau, Hélène
2006-04-01
This study compared range of motion (ROM) measurements using a cervical range of motion device (CROM) and an optoelectronic system (OPTOTRAK). To examine the criterion validity of the CROM for the measurement of cervical ROM on healthy adults. Whereas measurements of cervical ROM are recognized as part of the assessment of patients with neck pain, few devices are available in clinical settings. Two papers published previously showed excellent criterion validity for measurements of cervical flexion/extension and lateral flexion using the CROM. Subjects performed neck rotation, flexion/extension, and lateral flexion while sitting on a wooden chair. The ROM values were measured by the CROM as well as the OPTOTRAK. The cervical rotational ROM values using the CROM demonstrated a good to excellent linear relationship with those using the OPTOTRAK: right rotation, r = 0.89 (95% confidence interval, 0.81-0.94), and left rotation, r = 0.94 (95% confidence interval, 0.90-0.97). Similar results were also obtained for flexion/extension and lateral flexion ROM values. The CROM showed excellent criterion validity for measurements of cervical rotation. We propose using ROM values measured by the CROM as outcome measures for patients with neck pain.
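The correlations above are reported with 95% confidence intervals. One common way to obtain such an interval is the Fisher z transformation, sketched below; whether the authors used this method is not stated, the function name is illustrative, and the sample size in the example is a hypothetical placeholder because the abstract does not give it.

```python
import math


def pearson_r_ci(r, n, z_crit=1.96):
    """Approximate confidence interval for a Pearson r via Fisher's z.

    r: observed correlation; n: number of paired observations (n > 3);
    z_crit: normal critical value (1.96 for a 95% interval).
    """
    z = math.atanh(r)                 # Fisher z transform of r
    se = 1.0 / math.sqrt(n - 3)       # standard error on the z scale
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


# r = 0.89 is the reported right-rotation value; n = 30 is assumed.
low, high = pearson_r_ci(0.89, 30)
print(round(low, 2), round(high, 2))
```

Because the interval is built on the z scale and transformed back, it is asymmetric around r, which is consistent with the shape of the intervals quoted in the abstract.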
Development of Internet-Based Tasks for the Executive Function Performance Test.
Rand, Debbie; Lee Ben-Haim, Keren; Malka, Rachel; Portnoy, Sigal
The Executive Function Performance Test (EFPT) is a reliable and valid performance-based tool to assess executive functions (EFs). This study's objective was to develop and verify two Internet-based tasks for the EFPT. A cross-sectional study assessed the alternate-form reliability of the Internet-based bill-paying and telephone-use tasks in healthy adults and people with subacute stroke (Study 1). It also sought to establish the tasks' criterion validity for assessing EF deficits by correlating performance with that on the Trail Making Test in five groups: healthy young adults, healthy older adults, people with subacute stroke, people with chronic stroke, and young adults with attention deficit hyperactivity disorder (Study 2). The alternate-form reliability and initial construct validity for the Internet-based bill-paying task were verified. Criterion validity was established for both tasks. The Internet-based tasks are comparable to the original EFPT tasks and can be used for assessment of EF deficits. Copyright © 2018 by the American Occupational Therapy Association, Inc.
Lemon, Stephenie C; Rosal, Milagros C; Welch, Garry
2011-11-01
This study assessed the psychometric properties of the Audit of Diabetes-Dependent Quality of Life (ADDQoL) modified for low-income, low-education, Spanish-speaking Puerto Ricans with type 2 diabetes residing in the northeastern United States. Cross-sectional data from 226 patients were analyzed. Scale modifications included simplification of instructions, question wording and response format, and oral administration. Reliability was assessed with Cronbach's alpha coefficient and internal structure by exploratory factor analysis. Criterion validity was assessed using correlation analysis and linear and logistic regression models assessing the association of the ADDQoL with standardized physical health status, mental health status, depression, and comorbidity indices. Two ADDQoL items were dropped. The modified scale had excellent internal consistency and supported the original scale factor structure. Criterion validity results supported the validity of this measure. The modified ADDQoL showed psychometric properties that support its use in low-income, Spanish-speaking Puerto Ricans with type 2 diabetes who reside in mainland U.S.
Standards Performance Continuum: Development and Validation of a Measure of Effective Pedagogy.
ERIC Educational Resources Information Center
Doherty, R. William; Hilberg, R. Soleste; Epaloose, Georgia; Tharp, Roland G.
2002-01-01
Describes the development and validation of the Standards Performance Continuum (SPC) for assessing teacher performance of the Standards for Effective Pedagogy. Three studies involving Florida, California, and New Mexico public school teachers provided evidence of inter-rater reliability, concurrent validity, and criterion-related validity…
Concurrent Validity of the TONI-3
ERIC Educational Resources Information Center
Banks, Sandra H.; Franzen, Michael D.
2010-01-01
The literature pertaining to intelligence assessment reveals an ongoing discussion about the areas of intelligence captured by nonverbal tests. To date, few studies have investigated the criterion validity of the Test of Nonverbal Intelligence, Third Edition (TONI-3). The present study investigates the concurrent validity of the TONI-3 in a sample…
Yılmaz, Emel; Eser, Erhan; Şekuri, Cevad; Kültürsay, Hakan
2011-08-01
The purpose of this study was to describe the psychometric properties of the Myocardial Infarction Dimensional Assessment Scale (MIDAS). This is a methodological cultural adaptation study. The MIDAS consists of 35 items covering seven domains: physical activity, insecurity, emotional reaction, dependency, diet, concerns over medication, and side effects, which are rated on a five-point Likert scale from 1 (never) to 5 (always). The highest score of the MIDAS is 100. Quality of life (QOL) decreases as the scale score increases. Overall, 185 myocardial infarction (MI) patients were enrolled in this study. Cronbach alpha was used for the reliability analysis. Criterion validity, structural validity, and sensitivity analyses were used for the validity analysis. The New York Heart Association (NYHA) and Canadian Cardiovascular Society Functional Classifications (CCSFC) were used for testing criterion validity, and the SF-36 was used for construct validity testing of the Turkish version of the MIDAS. The range of Cronbach alpha values is 0.79-0.90 for the seven domains of the scale. No problematic items were observed for the entire scale. Medication-related domains of the MIDAS showed considerable floor effects (35.7%-22.7%). Confirmatory factor analysis indicators [Comparative Fit Index (CFI) = 0.95 and Root Mean Square Error of Approximation (RMSEA) = 0.075] supported the construct validity of the MIDAS. Convergent validity of the MIDAS was confirmed by correlation with the SF-36 scale where appropriate. Criterion validity results were also satisfactory when comparing different stages of the NYHA and the CCSFC (p < 0.05). Overall results revealed that the Turkish version of the MIDAS is a reliable and valid instrument.
Matsuzaki, Mika; Sullivan, Ruth; Ekelund, Ulf; Krishna, K V Radha; Kulkarni, Bharati; Collier, Tim; Ben-Shlomo, Yoav; Kinra, Sanjay; Kuper, Hannah
2016-01-19
There is limited availability of context-specific physical activity questionnaires in low and middle income countries. The aim of this study was to develop and examine the validity of a new Indian physical activity questionnaire, the Andhra Pradesh Children and Parent Study Physical Activity Questionnaire (APCAPS-PAQ). The current study was conducted with the cohort from the Hyderabad DXA Study (n = 2321), recruited in 2009-2010. Criterion validity (n = 245) was examined by comparing the APCAPS-PAQ to a combined heart rate and motion sensor worn for 8 days. Construct validity (n = 2321) was assessed with linear regression, comparing APCAPS-PAQ against BMI, percent body fat, and pulse rate. The APCAPS-PAQ criterion validity was variable depending on the PA intensity groups (ρ = 0.26, 0.07, 0.39; κ = 0.14, 0.04, 0.16 for sedentary, light, moderate/vigorous physical activity (MVPA) respectively). Sedentary and light intensity activities from the questionnaire were underestimated when compared to the criterion data while MVPA in APCAPS-PAQ was overestimated. Higher time spent in sedentary activity in APCAPS-PAQ was associated with higher BMI and percent body fat, suggesting construct validity. The APCAPS-PAQ validity is comparable to other physical activity questionnaires. This tool is able to assess sedentary behavior, moderate/vigorous activity and physical activity energy expenditure on a group level with reasonable validity. This new questionnaire may be used for ranking individuals according to their sedentary time and physical activity in southern India.
Comparison of two methods of measuring physical activity in South African older adults.
Kolbe-Alexander, Tracy L; Lambert, Estelle V; Harkins, Judith Biletnikoff; Ekelund, Ulf
2006-01-01
The aim of this study was to assess the validity and reliability of the Yale Physical Activity Survey (YPAS) and the short version of the International Physical Activity Questionnaire (IPAQ) in older South African adults. The YPAS includes measures of weekly energy expenditure (EE) for housework, yard work, caregiving, exercise, and recreation. The IPAQ measures total time and EE during vigorous and moderate activity, walking, and sitting. The instruments were administered twice for test-retest reliability (men, n = 52, 68 ± 5.4 years, and women, n = 70, 66 ± 5.8 years). Data for criterion validity were obtained from accelerometers. YPAS reliability ranged from r = .44 to .80 for men and r = .59 to .99 for women (p < .0001). IPAQ reliability was lower for men (r = .29 to .76) than for women (r = .46 to .77). Criterion validity of the YPAS was .31 to .54 for men and .26 to .29 for women. The YPAS and short IPAQ had comparable results for reliability and criterion validity.
Ghisi, Gabriela Lima de Melo; Dos Santos, Rafaella Zulianello; Bonin, Christiani Batista Decker; Roussenq, Suellen; Grace, Sherry L; Oh, Paul; Benetti, Magnus
2014-01-01
To translate, culturally adapt and psychometrically validate the Information Needs in Cardiac Rehabilitation (INCR) tool to Portuguese. The identification of information needs is considered the first step to improve knowledge that ultimately could improve health outcomes. The Portuguese version generated was tested in 300 cardiac rehabilitation (CR) patients (34% women; mean age = 61.3 ± 2.1 years old). Test-retest reliability was assessed using intraclass correlation coefficient (ICC), the internal consistency using Cronbach's alpha, and the criterion validity was assessed with regard to patients' education and duration in CR. All 9 subscales were considered internally consistent (α > 0.7). Significant differences between mean total needs and educational level (p < 0.05) and duration in CR (p = 0.03) supported criterion validity. The overall mean (4.6 ± 0.4), as well as the means of the 9 subscales were high (emergency/safety was the greatest need). The Portuguese INCR was demonstrated to have sufficient reliability, consistency and validity. Copyright © 2014 Elsevier Inc. All rights reserved.
Development and Validation of Triarchic Construct Scales from the Psychopathic Personality Inventory
Hall, Jason R.; Drislane, Laura E.; Patrick, Christopher J.; Morano, Mario; Lilienfeld, Scott O.; Poythress, Norman G.
2014-01-01
The Triarchic model of psychopathy describes this complex condition in terms of distinct phenotypic components of boldness, meanness, and disinhibition. Brief self-report scales designed specifically to index these psychopathy facets have thus far demonstrated promising construct validity. The present study sought to develop and validate scales for assessing facets of the Triarchic model using items from a well-validated existing measure of psychopathy—the Psychopathic Personality Inventory (PPI). A consensus rating approach was used to identify PPI items relevant to each Triarchic facet, and the convergent and discriminant validity of the resulting PPI-based Triarchic scales were evaluated in relation to multiple criterion variables (i.e., other psychopathy inventories, antisocial personality disorder features, personality traits, psychosocial functioning) in offender and non-offender samples. The PPI-based Triarchic scales showed good internal consistency and related to criterion variables in ways consistent with predictions based on the Triarchic model. Findings are discussed in terms of implications for conceptualization and assessment of psychopathy. PMID:24447280
Machado-Vieira, Rodrigo; Luckenbaugh, David A; Ballard, Elizabeth D; Henter, Ioline D; Tohen, Mauricio; Suppes, Trisha; Zarate, Carlos A
2017-01-01
DSM-5 describes "a distinct period of abnormally and persistently elevated, expansive, or irritable mood and abnormally and persistently increased activity or energy" as a primary criterion for mania. Thus, increased energy or activity is now considered a core symptom of manic and hypomanic episodes. Using data from the Systematic Treatment Enhancement Program for Bipolar Disorder study, the authors analyzed point prevalence data obtained at the initial visit to assess the diagnostic validity of this new DSM-5 criterion. The study hypothesis was that the DSM-5 criterion would alter the prevalence of mania and/or hypomania. The authors compared prevalence, clinical characteristics, validators, and outcome in patients meeting the DSM-5 criteria (i.e., DSM-IV criteria plus the DSM-5 criterion of increased activity or energy) and those who did not meet the new DSM-5 criterion (i.e., who only met DSM-IV criteria). All 4,360 participants met DSM-IV criteria for bipolar disorder, and 310 met DSM-IV criteria for a manic or hypomanic episode. When the new DSM-5 criterion of increased activity or energy was added as a coprimary symptom, the prevalence of mania and hypomania was reduced. Although minor differences were noted in clinical and concurrent validators, no changes were observed in longitudinal outcomes. The findings confirm that including increased activity or energy as part of DSM-5 criterion A decreases the prevalence of manic and hypomanic episodes but does not affect longitudinal clinical outcomes.
Correlates of the MMPI-2-RF in a college setting.
Forbey, Johnathan D; Lee, Tayla T C; Handel, Richard W
2010-12-01
The current study examined empirical correlates of scores on Minnesota Multiphasic Personality Inventory-2-Restructured Form (MMPI-2-RF; A. Tellegen & Y. S. Ben-Porath, 2008; Y. S. Ben-Porath & A. Tellegen, 2008) scales in a college setting. The MMPI-2-RF and six criterion measures (assessing anger, assertiveness, sex roles, cognitive failures, social avoidance, and social fear) were administered to 846 college students (264 men, 582 women) to examine the convergent and discriminant validity of scores on the MMPI-2-RF Specific Problems and Interest scales. Results demonstrated evidence of generally good convergent score validity for the selected MMPI-2-RF scales, reflected in large effect size correlations with criterion measure scores. Further, MMPI-2-RF scale scores demonstrated adequate discriminant validity, reflected in relatively low comparative median correlations between scores on MMPI-2-RF substantive scale sets and criterion measures. Limitations and future directions are discussed.
Klußmann, André; Gebhardt, Hansjürgen; Rieger, Monika; Liebers, Falk; Steinberg, Ulf
2012-01-01
Upper extremity musculoskeletal symptoms and disorders are common in the working population. The economic and social impact of such disorders is considerable. Long-term, dynamic, repetitive exposure of the hand-arm system during manual handling operations (MHO), alone or in combination with static and postural effort, is recognised as a cause of musculoskeletal symptoms and disorders. The assessment of these manual work tasks is crucial to estimate the health risks of exposed employees. For these work tasks, a new method for the assessment of the working conditions was developed and a validation study was performed. The results suggest satisfactory criterion validity and moderate objectivity of the KIM-MHO draft 2007. The method was modified and evaluated again. It is planned to release a new version of KIM-MHO in spring 2012.
Renteria, Laura; Li, Susan Tinsley; Pliskin, Neil H
2008-05-01
The utility of the Spanish WAIS-III was investigated by examining its reliability and validity among 100 Spanish-speaking participants. Results indicated that the internal consistency of the subtests was satisfactory, but inadequate for Letter Number Sequencing. Criterion validity was adequate. Convergent and discriminant validity results were generally similar to the North American normative sample. Paired-sample t-tests suggested that the WAIS-III may underestimate ability when compared to the criterion measures that were utilized to assess validity. This study provides support for the use of the Spanish WAIS-III in urban Hispanic populations, but also suggests that caution be used when administering specific subtests, due to the nature of the Latin American alphabet and potential test bias.
Evaluation of Criterion Validity for Scales with Congeneric Measures
ERIC Educational Resources Information Center
Raykov, Tenko
2007-01-01
A method for estimating criterion validity of scales with homogeneous components is outlined. It accomplishes point and interval estimation of interrelationship indices between composite scores and criterion variables and is useful for testing hypotheses about criterion validity of measurement instruments. The method can also be used with missing…
A new self-report inventory of dyslexia for students: criterion and construct validity.
Tamboer, Peter; Vorst, Harrie C M
2015-02-01
The validity of a Dutch self-report inventory of dyslexia was ascertained in two samples of students. Six biographical questions, 20 general language statements and 56 specific language statements were based on dyslexia as a multi-dimensional deficit. Dyslexia and non-dyslexia were assessed with two criteria: identification with test results (Sample 1) and classification using biographical information (both samples). Using discriminant analyses, these criteria were predicted with various groups of statements. Altogether, 11 discriminant functions were used to estimate classification accuracy of the inventory. In Sample 1, 15 statements predicted the test criterion with classification accuracy of 98%, and 18 statements predicted the biographical criterion with classification accuracy of 97%. In Sample 2, 16 statements predicted the biographical criterion with classification accuracy of 94%. Estimations of positive and negative predictive value were 89% and 99%. Items of various discriminant functions were factor analysed to find characteristic difficulties of students with dyslexia, resulting in a five-factor structure in Sample 1 and a four-factor structure in Sample 2. Answer bias was investigated with measures of internal consistency reliability. Fewer than 20 self-report items are sufficient to accurately classify students with and without dyslexia. This supports the usefulness of self-assessment of dyslexia as a valid alternative to diagnostic test batteries. Copyright © 2015 John Wiley & Sons, Ltd.
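The classification accuracy and predictive values reported in this abstract follow from a two-by-two table of predicted versus criterion classification. A minimal Python sketch of that arithmetic is given here; the function name and the counts are hypothetical and are not taken from the study.

```python
def predictive_values(tp, fp, tn, fn):
    """Return (accuracy, PPV, NPV) from confusion-matrix counts.

    tp/fn: criterion-positive cases classified as positive/negative.
    tn/fp: criterion-negative cases classified as negative/positive.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total   # share of all cases classified correctly
    ppv = tp / (tp + fp)           # positive predictive value
    npv = tn / (tn + fn)           # negative predictive value
    return accuracy, ppv, npv


# Hypothetical counts chosen only for illustration.
accuracy, ppv, npv = predictive_values(tp=40, fp=5, tn=150, fn=5)
print(f"accuracy={accuracy:.3f}, PPV={ppv:.3f}, NPV={npv:.3f}")
```

Because predictive values depend on how common the condition is in the sample, the same inventory can yield different values in samples with a different proportion of students with dyslexia.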
Ghisi, Gabriela Lima de Melo; Grace, Sherry L; Thomas, Scott; Evans, Michael F; Oh, Paul
2013-06-01
To develop and psychometrically validate a tool to assess information needs in cardiac rehabilitation (CR) patients. After a literature search, 60 information items divided into 11 areas of needs were identified. To establish content validity, they were reviewed by an expert panel (N=10). Refined items were pilot-tested in 34 patients on a 5-point Likert-scale from 1 "really not helpful" to 5 "very important". A final version was generated and psychometrically tested in 203 CR patients. Test-retest reliability was assessed via the intraclass correlation coefficient (ICC), the internal consistency using Cronbach's alpha, and criterion validity was assessed with regard to patient's education and duration in CR. Five items were excluded after ICC analysis as well as one area of needs. All 10 areas were considered internally consistent (Cronbach's alpha>0.7). Criterion validity was supported by significant differences in mean scores by educational level (p<0.05) and duration in CR (p<0.001). The mean total score was 4.08 ± 0.53. Patients rated safety as their greatest information need. The INCR Tool was demonstrated to have good reliability and validity. This is an appropriate tool for application in clinical and research settings, assessing patients' needs during CR and as part of education programming. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
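Internal consistency in this abstract is summarised with Cronbach's alpha, judged against a threshold of 0.7. The sketch below shows the standard alpha formula in Python using sample variances; the function name and the Likert responses are hypothetical and serve only as an illustration.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of items.

    `items` holds one list of scores per item; position i in every list
    belongs to the same respondent.
    """
    k = len(items)
    item_variance_sum = sum(variance(scores) for scores in items)
    # Total score per respondent across all items.
    totals = [sum(responses) for responses in zip(*items)]
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical 5-point Likert responses from five respondents on three items.
items = [
    [4, 5, 3, 4, 2],
    [5, 5, 3, 4, 1],
    [4, 4, 2, 5, 2],
]
alpha = cronbach_alpha(items)
print(f"alpha = {alpha:.3f}")
```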
Scholes, Shaun; Coombs, Ngaire; Pedisic, Zeljko; Mindell, Jennifer S; Bauman, Adrian; Rowlands, Alex V; Stamatakis, Emmanuel
2014-06-15
The criterion validity of the 2008 Physical Activity and Sedentary Behavior Assessment Questionnaire (PASBAQ) was examined in a nationally representative sample of 2,175 persons aged ≥16 years in England using accelerometry. Using accelerometer minutes/day greater than or equal to 200 counts as a criterion, Spearman's correlation coefficient (ρ) for PASBAQ-assessed total activity was 0.30 (95% confidence interval (CI): 0.25, 0.35) in women and 0.20 (95% CI: 0.15, 0.26) in men. Correlations between accelerometer counts/minute of wear time and questionnaire-assessed relative energy expenditure (metabolic equivalent-minutes/day) were higher in women (ρ = 0.41, 95% CI: 0.36, 0.46) than in men (ρ = 0.32, 95% CI: 0.26, 0.38). Similar correlations were observed for minutes/day spent in vigorous activity (women: ρ = 0.39, 95% CI: 0.33, 0.46; men: ρ = 0.31, 95% CI: 0.26, 0.36) and moderate-to-vigorous activity (women: ρ = 0.42, 95% CI: 0.36, 0.48; men: ρ = 0.38, 95% CI: 0.32, 0.45). Correlations for time spent being sedentary (<100 counts/minute) were 0.30 (95% CI: 0.24, 0.35) and 0.25 (95% CI: 0.19, 0.30) in women and men, respectively. Sedentary behavior correlations showed no sex difference. The validity of sedentary behavior and total physical activity was higher in older age groups, but validity was higher in younger persons for vigorous-intensity activity. The PASBAQ is a useful and valid instrument for ranking individuals according to levels of physical activity and sedentary behavior. © The Author 2014. Published by Oxford University Press on behalf of the Johns Hopkins Bloomberg School of Public Health. PMID:24863551
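Criterion validity in this abstract is expressed as Spearman's rank correlation between questionnaire and accelerometer measures. As a minimal sketch, Spearman's ρ can be computed as the Pearson correlation of average ranks; the function names and the paired values below are hypothetical and do not come from the study.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    covariance = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    spread_x = sum((a - mx) ** 2 for a in rx) ** 0.5
    spread_y = sum((b - my) ** 2 for b in ry) ** 0.5
    return covariance / (spread_x * spread_y)


# Hypothetical minutes/day: questionnaire-reported vs accelerometer-measured.
reported = [30, 45, 60, 90, 120]
measured = [40, 35, 75, 70, 110]
print(f"rho = {spearman_rho(reported, measured):.2f}")
```

Because ρ depends only on rank order, it suits the stated use of the questionnaire, which is ranking individuals rather than estimating absolute activity levels.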
Validation of the Intrinsic Spirituality Scale (ISS) with Muslims.
Hodge, David R; Zidan, Tarek; Husain, Altaf
2015-12-01
This study validates an existing spirituality measure, the intrinsic spirituality scale (ISS), for use with Muslims in the United States. A confirmatory factor analysis was conducted with a diverse sample of self-identified Muslims (N = 281). Validity and reliability were assessed along with criterion and concurrent validity. The measurement model fit the data well, normed χ² = 2.50, CFI = 0.99, RMSEA = 0.07, and SRMR = 0.02. All 6 items that comprise the ISS demonstrated satisfactory levels of validity (λ > .70) and reliability (R² > .50). The Cronbach's alpha obtained with the present sample was .93. Appropriate correlations with theoretically linked constructs demonstrated criterion and concurrent validity. The results suggest the ISS is a valid measure of spirituality in clinical settings with the rapidly growing Muslim population. The ISS may, for instance, provide an efficient screening tool to identify Muslims who are particularly likely to benefit from spiritually accommodative treatments. (c) 2015 APA, all rights reserved.
Kim, Dong Hee; Im, Yeo Jin
2013-02-01
To develop and test the validity and reliability of the Korean version of the Family Management Measure (Korean FaMM) to assess applicability for families with children having chronic illnesses. The Korean FaMM was articulated through forward-backward translation methods. Internal consistency reliability, construct and criterion validity were calculated using PASW WIN (19.0) and AMOS (20.0). Survey data were collected from 341 mothers of children suffering from chronic disease enrolled in a university hospital in Seoul, South Korea. The Korean version of FaMM showed reliable internal consistency with Cronbach's alpha for the total scale of .69-.91. Factor loadings of the 53 items on the six sub-scales ranged from 0.28-0.84. The model of six subscales for the Korean FaMM was validated by exploratory and confirmatory factor analysis (χ²<.001, RMR<.05, GFI, AGFI, NFI, NNFI>.08). Criterion validity compared to the Parental Stress Index (PSI) showed significant correlation. The findings of this study demonstrate that the Korean FaMM showed satisfactory construct and criterion validity and reliability. It is useful for measuring the management style of Korean families with children who have a chronic illness.
Amaya-Arias, Ana Carolina; Alzate, Juan Pablo; Eslava-Schmalbach, Javier H
2017-01-01
This study aimed at determining the validity of the Pediatric Quality of Life Inventory 4.0 (PedsQL™ 4.0) for the measurement of health-related quality of life (HRQOL) in Colombian children. Validation study of measurement instruments. The PedsQL™ 4.0 was applied by convenience sampling to 375 pairs of children and adolescents between the ages of 5 and 17 and to their parents-caregivers, as well as to 125 parents-caregivers of children between the ages of 2 and 4 in five cities of Colombia (Bogota, Medellin, Cali, Barranquilla and Bucaramanga). Construct validity was assessed through the use of exploratory and confirmatory factor analysis, and criterion validity was assessed by correlations between the PedsQL™ 4.0 and the KIDSCREEN-27. The instrument was applied to 375 children (ages 5-18) and 125 parents of children between the ages of 2 and 4. Factor analysis revealed four factors considered suitable for the sample in both the child and parent reports, whereas Bartlett's test of sphericity showed inter-correlation between variables. Scale and subscales showed proper indicators of internal consistency. It is recommended that some items in the Colombian version of the scale be excluded or revised. The Spanish version for Colombia of the PedsQL™ 4.0 displays suitable indicators of criterion and construct validity, therefore becoming a valuable tool for measuring HRQOL in children in our country. Some modifications are recommended for the Colombian version of the scale.
Mayorga-Vega, Daniel; Bocanegra-Parrilla, Raúl; Ornelas, Martha; Viciana, Jesús
2016-01-01
Objectives: The main purpose of the present meta-analysis was to examine the criterion-related validity of the distance- and time-based walk/run tests for estimating cardiorespiratory fitness among apparently healthy children and adults. Materials and Methods: Relevant studies were searched from seven electronic bibliographic databases up to August 2015 and through other sources. The Hunter-Schmidt psychometric meta-analysis approach was used to estimate the population criterion-related validity of the following walk/run tests: 5,000 m, 3 miles, 2 miles, 3,000 m, 1.5 miles, 1 mile, 1,000 m, ½ mile, 600 m, 600 yd, ¼ mile, 15 min, 12 min, 9 min, and 6 min. Results: From the 123 included studies, a total of 200 correlation values were analyzed. The overall results showed that the criterion-related validity of the walk/run tests for estimating maximum oxygen uptake ranged from low to moderate (rp = 0.42–0.79), with the 1.5 mile (rp = 0.79, 0.73–0.85) and 12 min walk/run tests (rp = 0.78, 0.72–0.83) having the highest criterion-related validity for distance- and time-based field tests, respectively. The present meta-analysis also showed that sex, age and maximum oxygen uptake level do not seem to affect the criterion-related validity of the walk/run tests. Conclusions: When the evaluation of an individual’s maximum oxygen uptake attained during a laboratory test is not feasible, the 1.5 mile and 12 min walk/run tests represent useful alternatives for estimating cardiorespiratory fitness. As in the assessment with any physical fitness field test, evaluators must be aware that the performance score of the walk/run field tests is simply an estimation and not a direct measure of cardiorespiratory fitness. PMID:26987118
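The pooling step in a Hunter-Schmidt meta-analysis can be illustrated with its "bare-bones" form: a sample-size-weighted mean correlation, the observed variance of correlations around that mean, and the variance expected from sampling error alone. The Python sketch below covers only this step and omits the corrections for measurement artifacts that a full psychometric meta-analysis applies, so it is not the authors' complete procedure; the function name and input values are hypothetical.

```python
def bare_bones_meta(correlations, sample_sizes):
    """Bare-bones Hunter-Schmidt pooling of correlations.

    Returns (weighted mean r, observed variance, sampling-error variance,
    residual variance). No artifact corrections are applied.
    """
    total_n = sum(sample_sizes)
    pairs = list(zip(correlations, sample_sizes))
    r_bar = sum(n * r for r, n in pairs) / total_n
    var_observed = sum(n * (r - r_bar) ** 2 for r, n in pairs) / total_n
    mean_n = total_n / len(sample_sizes)
    var_sampling = (1 - r_bar ** 2) ** 2 / (mean_n - 1)
    # Variance left after removing sampling error, floored at zero.
    var_residual = max(var_observed - var_sampling, 0.0)
    return r_bar, var_observed, var_sampling, var_residual


# Hypothetical study-level correlations and sample sizes.
r_bar, var_obs, var_err, var_res = bare_bones_meta([0.5, 0.7], [100, 300])
print(f"mean r = {r_bar:.2f}, residual variance = {var_res:.4f}")
```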
Psychometric Validation of the Academic Motivation Scale in a Dental Student Sample.
Orsini, Cesar; Binnie, Vivian; Evans, Phillip; Ledezma, Priscilla; Fuentes, Fernando; Villegas, Maria J
2015-08-01
The Academic Motivation Scale is one of the most frequently used instruments to assess academic motivation. It relies on the self-determination theory of human motivation. However, motivation has been understudied in dental education. Therefore, to address the lack of valid instruments to assess academic motivation in dental education and contribute to future research in the field, the aim of this study was to analyze the psychometric properties of this instrument in a sample of dental students. Participants were 989 Chilean undergraduate dental students (86% response rate) who completed a survey containing a Chilean face-valid version of the Spanish Academic Motivation Scale and three other motivation-related instruments to assess the survey's construct and criterion validity. Later, 76 of the students (out of 100 invited) took the survey again to assess its test-retest stability. The instrument's construct validity was supported by the superior goodness of fit of the seven-subscale Academic Motivation Scale over competing models through confirmatory factor analysis and by the expected correlations among its subscales. The concurrent criterion validity was supported by the confirmation of correlations between its subscales and external criteria. Adequate internal consistency and test-retest correlations were also found. The evidence from this study suggests that the Academic Motivation Scale is a preliminarily valid and reliable instrument to assess motivation in the predoctoral dental context. Future research in this area is needed to confirm or refute these results.
Dobbin, Nick; Hunwicks, Richard; Jones, Ben; Till, Kevin; Highton, Jamie; Twist, Craig
2018-02-01
To examine the criterion and construct validity of an isometric midthigh-pull dynamometer to assess whole-body strength in professional rugby league players. Fifty-six male rugby league players (33 senior and 23 youth players) performed 4 isometric midthigh-pull efforts (i.e., 2 on the dynamometer and 2 on the force platform) in a randomized and counterbalanced order. Isometric peak force was underestimated (P < .05) using the dynamometer compared with the force platform (95% LoA: -213.5 ± 342.6 N). Linear regression showed that peak force derived from the dynamometer explained 85% (adjusted R² = .85, SEE = 173 N) of the variance in the dependent variable, with the following prediction equation derived: predicted peak force = [1.046 × dynamometer peak force] + 117.594. Cross-validation revealed a nonsignificant bias (P > .05) between the predicted and peak force from the force platform and an adjusted R² (79.6%) that represented shrinkage of 0.4% relative to the cross-validation model (80%). Peak force was greater for the senior than the youth professionals using the dynamometer (2261.2 ± 222 vs 1725.1 ± 298.0 N, respectively; P < .05). The isometric midthigh pull assessed using a dynamometer underestimates criterion peak force but is capable of distinguishing muscle-function characteristics between professional rugby league players of different standards.
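The prediction equation reported in this abstract can be applied directly to a dynamometer reading. The short Python sketch below uses the published slope and intercept; the function name and the example reading are hypothetical, and the standard error of estimate is shown only as a rough indication of uncertainty.

```python
# Coefficients as reported in the abstract (forces in newtons).
SLOPE = 1.046
INTERCEPT_N = 117.594
SEE_N = 173.0  # standard error of the estimate reported in the abstract


def predicted_peak_force(dynamometer_peak_force_n):
    """Estimate force-platform peak force (N) from a dynamometer reading (N)."""
    return SLOPE * dynamometer_peak_force_n + INTERCEPT_N


# Hypothetical dynamometer reading.
reading_n = 2000.0
estimate_n = predicted_peak_force(reading_n)
print(f"{reading_n:.0f} N on the dynamometer -> about {estimate_n:.0f} N (SEE {SEE_N:.0f} N)")
```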
ERIC Educational Resources Information Center
Baker, Doris Luft; Biancarosa, Gina; Park, Bitnara Jasmine; Bousselot, Tracy; Smith, Jean-Louise; Baker, Scott K.; Kame'enui, Edward J.; Alonzo, Julie; Tindal, Gerald
2015-01-01
We examined the criterion validity and diagnostic efficiency of oral reading fluency (ORF), word reading accuracy, and reading comprehension (RC) for students in Grades 7 and 8 taking into account form effects of ORF, time of assessment, and individual differences, including student designations of limited English proficiency and special education…
ERIC Educational Resources Information Center
Seo, Hyojeong; Wehmeyer, Michael L.; Shogren, Karrie A.; Hughes, Carolyn; Thompson, James R.; Little, Todd D.; Palmer, Susan B.
2017-01-01
Given the growing importance of support needs assessment in the field of intellectual disability, it is imperative to develop assessments of support needs whose scores and inferences demonstrate reliability and validity. The purpose of this study was to examine the criterion validity of scores on the "Supports Intensity Scale-Children's…
Breakdown parameter for kinetic modeling of multiscale gas flows.
Meng, Jianping; Dongari, Nishanth; Reese, Jason M; Zhang, Yonghao
2014-06-01
Multiscale methods built purely on the kinetic theory of gases provide information about the molecular velocity distribution function. It is therefore both important and feasible to establish new breakdown parameters for assessing the appropriateness of a fluid description at the continuum level by utilizing kinetic information rather than macroscopic flow quantities alone. We propose a new kinetic criterion to indirectly assess the errors introduced by a continuum-level description of the gas flow. The analysis, which includes numerical demonstrations, focuses on the validity of the Navier-Stokes-Fourier equations and corresponding kinetic models and reveals that the new criterion can consistently indicate the validity of continuum-level modeling in both low-speed and high-speed flows at different Knudsen numbers.
[Development and validity of workplace bullying in nursing-type inventory (WPBN-TI)].
Lee, Younju; Lee, Mihyoung
2014-04-01
The purpose of this study was to develop an instrument to assess bullying of nurses, and to test the validity and reliability of the instrument. The initial thirty items of the WPBN-TI were identified through a review of the literature on types of bullying related to nursing and in-depth interviews with 14 nurses who had experienced bullying at work. Sixteen items were developed through 2 content validity tests by 9 experts and 10 nurses. The final WPBN-TI instrument was evaluated by 458 nurses from five general hospitals in the Incheon metropolitan area. The SPSS 18.0 program was used to assess the instrument based on internal consistency reliability, construct validity, and criterion validity. The WPBN-TI consisted of 16 items with three distinct factors (verbal and nonverbal bullying, work-related bullying, and external threats), which explained 60.3% of the total variance. The convergent validity and discriminant validity for the WPBN-TI were 100.0% and 89.7%, respectively. Known-groups validity of the WPBN-TI was supported by the mean difference according to subjective perception of bullying. Criterion validity for the WPBN-TI was satisfactory, at more than .70. The reliability of the WPBN-TI was a Cronbach's α of .91. The WPBN-TI, with high validity and reliability, is suitable for determining types of bullying in the nursing workplace.
Vanderploeg, Rodney D; Cooper, Douglas B; Belanger, Heather G; Donnell, Alison J; Kennedy, Jan E; Hopewell, Clifford A; Scott, Steven G
2014-01-01
To develop and cross-validate internal validity scales for the Neurobehavioral Symptom Inventory (NSI). Four existing data sets were used: (1) outpatient clinical traumatic brain injury (TBI)/neurorehabilitation database from a military site (n = 403), (2) National Department of Veterans Affairs TBI evaluation database (n = 48 175), (3) Florida National Guard nonclinical TBI survey database (n = 3098), and (4) a cross-validation outpatient clinical TBI/neurorehabilitation database combined across 2 military medical centers (n = 206). Secondary analysis of existing cohort data to develop (study 1) and cross-validate (study 2) internal validity scales for the NSI. The NSI, Mild Brain Injury Atypical Symptoms, and Personality Assessment Inventory scores. Study 1: Three NSI validity scales were developed, composed of 5 unusual items (Negative Impression Management [NIM5]), 6 low-frequency items (LOW6), and the combination of 10 nonoverlapping items (Validity-10). Cut scores maximizing sensitivity and specificity on these measures were determined, using a Mild Brain Injury Atypical Symptoms score of 8 or more as the criterion for invalidity. Study 2: The same validity scale cut scores again resulted in the highest classification accuracy and optimal balance between sensitivity and specificity in the cross-validation sample, using a Personality Assessment Inventory Negative Impression Management scale with a T score of 75 or higher as the criterion for invalidity. The NSI is widely used in the Department of Defense and Veterans Affairs as a symptom-severity assessment following TBI, but is subject to symptom overreporting or exaggeration. This study developed embedded NSI validity scales to facilitate the detection of invalid response styles. The NSI Validity-10 scale appears to hold considerable promise for validity assessment when the NSI is used as a population-screening tool.
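Choosing a cut score that balances sensitivity and specificity against an external invalidity criterion can be sketched as a search over candidate cut points. The Python example below maximises Youden's J (sensitivity + specificity − 1), which is an assumption made for illustration since the abstract does not name the index used; the function name, scores and criterion flags are hypothetical.

```python
def best_cut_score(scores, is_invalid):
    """Return (cut, sensitivity, specificity) maximising Youden's J.

    A protocol is flagged when its score is >= cut. `is_invalid` holds the
    external criterion (True = invalid responding).
    """
    pairs = list(zip(scores, is_invalid))
    n_invalid = sum(1 for _, invalid in pairs if invalid)
    n_valid = len(pairs) - n_invalid
    best = None
    for cut in sorted(set(scores)):
        true_pos = sum(1 for s, invalid in pairs if invalid and s >= cut)
        true_neg = sum(1 for s, invalid in pairs if not invalid and s < cut)
        sensitivity = true_pos / n_invalid
        specificity = true_neg / n_valid
        youden_j = sensitivity + specificity - 1
        # Keep the lowest cut among ties by only replacing on strict improvement.
        if best is None or youden_j > best[0]:
            best = (youden_j, cut, sensitivity, specificity)
    return best[1], best[2], best[3]


# Hypothetical validity-scale scores and criterion classifications.
scores = [2, 5, 8, 12, 15, 20, 22, 25]
is_invalid = [False, False, False, False, True, False, True, True]
cut, sensitivity, specificity = best_cut_score(scores, is_invalid)
print(f"cut >= {cut}: sensitivity {sensitivity:.2f}, specificity {specificity:.2f}")
```

In practice a screening context may call for weighting specificity more heavily than sensitivity, in which case a different objective than Youden's J would be used.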
An evaluation of the Psychache Scale on an offender population.
Mills, Jeremy F; Green, Kate; Reddon, John R
2005-10-01
This study examined the generalizability of a self-report measure of psychache to an offender population. The factor structure, construct validity, and criterion validity of the Psychache Scale were assessed in 136 male prison inmates. The results showed the Psychache Scale to have a single underlying factor structure and to be strongly associated with measures of depression and hopelessness and moderately associated with psychiatric symptoms and the criterion variable of a history of prior suicide attempts. The variables of depression, hopelessness, and psychiatric symptoms all contributed unique variance to psychache. Discussion centers on psychache's theoretical application to the prediction of suicide.
Assessing Sleep Disturbance in Low Back Pain: The Validity of Portable Instruments
Alsaadi, Saad M.; McAuley, James H.; Hush, Julia M.; Bartlett, Delwyn J.; McKeough, Zoe M.; Grunstein, Ronald R.; Dungan, George C.; Maher, Chris G.
2014-01-01
Although portable instruments have been used in the assessment of sleep disturbance for patients with low back pain (LBP), the accuracy of the instruments in detecting sleep/wake episodes for this population is unknown. This study investigated the criterion validity of two portable instruments (Armband and Actiwatch) for assessing sleep disturbance in patients with LBP. 50 patients with LBP performed simultaneous overnight sleep recordings in a university sleep laboratory. All 50 participants were assessed by polysomnography (PSG) and the Armband, and a subgroup of 33 participants wore an Actiwatch. Criterion validity was determined by calculating epoch-by-epoch agreement, sensitivity, specificity and prevalence- and bias-adjusted kappa (PABAK) for sleep versus wake between each instrument and PSG. The relationship between PSG and the two instruments was assessed using intraclass correlation coefficients (ICC(2,1)). The study participants showed symptoms of sub-threshold insomnia (mean ISI = 13.2, 95% CI = 6.36) and poor sleep quality (mean PSQI = 9.20, 95% CI = 4.27). Observed agreement with PSG was 85% and 88% for the Armband and Actiwatch. Sensitivity was 0.90 for both instruments and specificity was 0.54 and 0.67 and PABAK of 0.69 and 0.77 for the Armband and Actiwatch respectively. The ICC (95% CI) was 0.76 (0.61 to 0.86) and 0.80 (0.46 to 0.92) for total sleep time, 0.52 (0.29 to 0.70) and 0.55 (0.14 to 0.77) for sleep efficiency, 0.64 (0.45 to 0.78) and 0.52 (0.23 to 0.73) for wake after sleep onset and 0.13 (−0.15 to 0.39) and 0.33 (−0.05 to 0.63) for sleep onset latency, for the Armband and Actiwatch, respectively. The findings showed that both instruments have varied criterion validity across the sleep parameters from excellent validity for measures of total sleep time, good validity for measures of sleep efficiency and wake after onset to poor validity for sleep onset latency. PMID:24763506
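The epoch-by-epoch statistics named in this abstract can be computed from paired sleep/wake classifications. The Python sketch below assumes PSG is the reference standard and sleep is the positive class, and uses the two-category form PABAK = 2 × observed agreement − 1; the function name and the epoch sequences are hypothetical.

```python
def epoch_agreement(device, psg):
    """Compare two equal-length epoch sequences (True = sleep, False = wake).

    `psg` is treated as the reference standard; sleep is the positive class.
    """
    pairs = list(zip(device, psg))
    true_pos = sum(1 for d, p in pairs if d and p)
    true_neg = sum(1 for d, p in pairs if not d and not p)
    false_pos = sum(1 for d, p in pairs if d and not p)
    false_neg = sum(1 for d, p in pairs if not d and p)
    observed = (true_pos + true_neg) / len(pairs)
    return {
        "agreement": observed,
        "sensitivity": true_pos / (true_pos + false_neg),  # sleep detected as sleep
        "specificity": true_neg / (true_neg + false_pos),  # wake detected as wake
        "pabak": 2 * observed - 1,
    }


# Hypothetical ten-epoch recording: eight sleep epochs and two wake epochs on PSG.
psg = [True] * 8 + [False] * 2
device = [True] * 7 + [False] + [True, False]
result = epoch_agreement(device, psg)
print(result)
```

Applying the same formula to the reported agreement figures of 85% and 88% gives 0.70 and 0.76, close to the reported PABAK values of 0.69 and 0.77; the small differences presumably reflect rounding of the published percentages.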
A Model for Estimating the Reliability and Validity of Criterion-Referenced Measures.
ERIC Educational Resources Information Center
Edmonston, Leon P.; Randall, Robert S.
A decision model designed to determine the reliability and validity of criterion referenced measures (CRMs) is presented. General procedures which pertain to the model are discussed as to: Measures of relationship, Reliability, Validity (content, criterion-oriented, and construct validation), and Item Analysis. The decision model is presented in…
Buchowski, Maciej S; Matthews, Charles E; Cohen, Sarah S; Signorello, Lisa B; Fowke, Jay H; Hargreaves, Margaret K; Schlundt, David G; Blot, William J
2012-08-01
Low physical activity (PA) is linked to cancer and other diseases prevalent in racial/ethnic minorities and low-income populations. This study evaluated the PA questionnaire (PAQ) used in the Southern Community Cohort Study (SCCS), a prospective investigation of health disparities between African-American and white adults. The PAQ was administered upon entry into the cohort (PAQ1) and after 12-15 months (PAQ2) in 118 participants (40-60 years old, 48% male, 74% African-American). Test-retest reliability (PAQ1 versus PAQ2) was assessed using Spearman correlations and the Wilcoxon signed rank test. Criterion validity of the PAQ was assessed via comparison with a PA monitor and a last-month PA survey (LMPAS), administered up to 4 times in the study period. The PAQ test-retest reliability ranged from 0.25-0.54 for sedentary behaviors and 0.22-0.47 for active behaviors. The criterion validity for the PAQ compared with the PA monitor ranged from 0.21-0.24 for sedentary behaviors and from 0.17-0.31 for active behaviors. The magnitude of correlations between the PAQ and the PA monitor was generally consistent between African-Americans and whites. The SCCS-PAQ has fair to moderate test-retest reliability and demonstrated some evidence of criterion validity for ranking participants by their level of sedentary and active behaviors.
2012-12-01
Development and validation. ABA, BQ, and criterion data were extracted from AT-SAT concurrent, criterion-related validation database. Overall, 1,232...dependent on responses to the other instrument. 3 A subset of 260 controllers in the AT-SAT dataset had full and complete ABA, BQ, and criterion data (i.e...SAT cases with ABA, BQ, and criterion data (n=260) was very small, making fairness analyses with the validation sample impractical. However, the
Reliability and criterion-related validity of a new repeated agility test
Makni, E; Jemni, M; Elloumi, M; Chamari, K; Nabli, MA; Padulo, J; Moalla, W
2016-01-01
The study aimed to assess the reliability and the criterion-related validity of a new repeated sprint T-test (RSTT) that includes intense multidirectional intermittent efforts. The RSTT consisted of 7 maximal repeated executions of the agility T-test with 25 s of passive recovery rest in between. Forty-five team sports players performed two RSTTs separated by 3 days to assess the reliability of best time (BT) and total time (TT) of the RSTT. The intra-class correlation coefficient analysis revealed a high relative reliability between test and retest for BT and TT (>0.90). The standard error of measurement (<0.50) showed that the RSTT has a good absolute reliability. The minimal detectable change values for BT and TT related to the RSTT were 0.09 s and 0.58 s, respectively. To check the criterion-related validity of the RSTT, players performed a repeated linear sprint (RLS) and a repeated sprint with changes of direction (RSCD). Significant correlations between the BT and TT of the RLS, RSCD and RSTT were observed (p<0.001). The RSTT is, therefore, a reliable and valid measure of the intermittent repeated sprint agility performance. As this ability is required in all team sports, it is suggested that team sports coaches, fitness coaches and sports scientists consider this test in their training follow-up. PMID:27274109
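The abstract reports a standard error of measurement (SEM) and a minimal detectable change (MDC) without stating the formulas used. A minimal sketch under commonly used conventions, which are an assumption here rather than the authors' confirmed method, takes SEM = SD × sqrt(1 − ICC) and MDC at 95% confidence = 1.96 × sqrt(2) × SEM. The input values below are hypothetical.

```python
import math


def standard_error_of_measurement(sd, icc):
    """SEM from the between-subject SD and a test-retest ICC (assumed convention)."""
    return sd * math.sqrt(1.0 - icc)


def minimal_detectable_change_95(sem):
    """Smallest change exceeding measurement error with 95% confidence (assumed convention)."""
    return 1.96 * math.sqrt(2.0) * sem


# Hypothetical values, not taken from the study.
sem_value = standard_error_of_measurement(sd=0.5, icc=0.91)
mdc_value = minimal_detectable_change_95(sem_value)
print(round(sem_value, 3), round(mdc_value, 3))
```

The sqrt(2) term reflects that a change score combines the measurement error of two test occasions.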
Amaya-Arias, Ana Carolina; Alzate, Juan Pablo; Eslava-Schmalbach, Javier H
2017-01-01
Background: This study aimed at determining the validity of the Pediatric Quality of Life Inventory 4.0 (PedsQL™ 4.0) for the measurement of health-related quality of life (HRQOL) in Colombian children. Methods: Validation study of measurement instruments. The PedsQL™ 4.0 was applied by convenience sampling to 375 pairs of children and adolescents between the ages of 5 and 17 and to their parents-caregivers, as well as to 125 parents-caregivers of children between the ages of 2 and 4 in five cities of Colombia (Bogota, Medellin, Cali, Barranquilla and Bucaramanga). Construct validity was assessed through the use of exploratory and confirmatory factor analysis, and criterion validity was assessed by correlations between the PedsQL™ 4.0 and the KIDSCREEN-27. Results: The instrument was applied to 375 children (ages 5–18) and 125 parents of children between the ages of 2 and 4. Factor analysis revealed four factors considered suitable for the sample in both the child and parent reports, whereas Bartlett's test of sphericity showed inter-correlation between variables. Scale and subscales showed proper indicators of internal consistency. It is recommended that some of the items in the Colombian version of the scale be excluded or reviewed. Conclusions: The Spanish version for Colombia of the PedsQL™ 4.0 displays suitable indicators of criterion and construct validity, therefore becoming a valuable tool for measuring HRQOL in children in our country. Some modifications are recommended for the Colombian version of the scale. PMID:28900536
Community validation of the IDEA study cognitive screen in rural Tanzania.
Gray, William K; Paddick, Stella Maria; Collingwood, Cecilia; Kisoli, Aloyce; Mbowe, Godfrey; Mkenda, Sarah; Lissu, Carolyn; Rogathi, Jane; Kissima, John; Walker, Richard W; Mushi, Declare; Chaote, Paul; Ogunniyi, Adesola; Dotchin, Catherine L
2016-11-01
The dementia diagnosis gap in sub-Saharan Africa (SSA) is large, partly because of difficulties in screening for cognitive impairment in the community. As part of the Identification and Intervention for Dementia in Elderly Africans (IDEA) study, we aimed to validate the IDEA cognitive screen in a community-based sample in rural Tanzania. METHODS: Study participants were recruited from people who attended screening days held in villages within the rural Hai district of Tanzania. Criterion validity was assessed against the gold standard clinical dementia diagnosis using DSM-IV criteria. Construct validity was assessed against age, education, sex, grip strength and instrumental activities of daily living (IADLs). Internal consistency and floor and ceiling effects were also examined. During community screening, the IDEA cognitive screen had high criterion validity, with an area under the receiver operating characteristic curve of 0.855 (95% CI 0.794 to 0.915). Higher scores on the screen were significantly correlated with lower age, male sex, having attended school, better grip strength and improved performance in activities of daily living. Factor analysis revealed a single factor with an eigenvalue greater than one, although internal consistency was only moderate (Cronbach's alpha = 0.534). The IDEA cognitive screen had high criterion and construct validity and is suitable for use as a cognitive screening instrument in a community setting in SSA. Only moderate internal consistency may partly reflect the multi-domain nature of dementia as diagnosed clinically. Copyright © 2016 John Wiley & Sons, Ltd.
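The area under the ROC curve used as the criterion-validity statistic here can be read as the probability that a randomly chosen case is ranked above a randomly chosen non-case, with ties counted as one half. The sketch below computes it by direct pairwise comparison. It assumes that a higher value means "more likely a case"; for a screen where lower scores indicate impairment, the scores would be negated first. The data are hypothetical.

```python
def auc_pairwise(case_scores, noncase_scores):
    """AUC as the share of case/non-case pairs ranked correctly (ties count 0.5)."""
    wins = 0.0
    for case in case_scores:
        for noncase in noncase_scores:
            if case > noncase:
                wins += 1.0
            elif case == noncase:
                wins += 0.5
    return wins / (len(case_scores) * len(noncase_scores))


# Hypothetical risk values; higher means more likely to be a case.
cases = [3, 4, 5]
noncases = [1, 2, 4]
auc_value = auc_pairwise(cases, noncases)
print(round(auc_value, 3))
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of cases from non-cases.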
The Marital Disaffection Scale: An Inventory for Assessing Emotional Estrangement in Marriage.
ERIC Educational Resources Information Center
Kayser, Karen
1996-01-01
Describes a self-report scale measuring levels of disaffection toward one's spouse. A questionnaire containing the Marital Disaffection Scale (MDS) and other disaffection measures of marital happiness was administered to 76 spouses. Results indicated good criterion-related validity, discriminant validity, and interitem reliability. Findings…
Validation of the Proficiency Examination for Diagnostic Radiologic Technology. Final Report.
ERIC Educational Resources Information Center
Educational Testing Service, Princeton, NJ.
The validity of the Proficiency Examination for Diagnostic Radiologic Technology was investigated, using 140 radiologic technologists who took both the written Proficiency Examination and a performance test. As an additional criterion measure of job proficiency, supervisors' assessments were obtained for 128 of the technologists. The resulting…
Hernández-Padilla, José M; Granero-Molina, José; Márquez-Hernández, Verónica V; Suthers, Fiona; López-Entrambasaguas, Olga M; Fernández-Sola, Cayetano
2017-06-01
Rapid and accurate interpretation of cardiac arrhythmias by nurses has been linked with safe practice and positive patient outcomes. Although training in electrocardiogram rhythm recognition is part of most undergraduate nursing programmes, research continues to suggest that nurses and nursing students lack competence in recognising cardiac rhythms. In order to promote patient safety, nursing educators must develop valid and reliable assessment tools that allow the rigorous assessment of this competence before nursing students are allowed to practise without supervision. The aim of this study was to develop and psychometrically evaluate a toolkit to holistically assess competence in electrocardiogram rhythm recognition. Following a convenience sampling technique, 293 nursing students from a nursing faculty in a Spanish university were recruited for the study. The following three instruments were developed and psychometrically tested: an electrocardiogram knowledge assessment tool (ECG-KAT), an electrocardiogram skills assessment tool (ECG-SAT) and an electrocardiogram self-efficacy assessment tool (ECG-SES). Reliability and validity (content, criterion and construct) of these tools were meticulously examined. A high Cronbach's alpha coefficient demonstrated the excellent reliability of the instruments (ECG-KAT=0.89; ECG-SAT=0.93; ECG-SES=0.98). An excellent content validity index (scales' average content validity index>0.94) and very good criterion validity were evidenced for all the tools. Regarding construct validity, principal component analysis revealed that all items comprising the instruments contributed to measure knowledge, skills or self-efficacy in electrocardiogram rhythm recognition. Moreover, known-groups analysis showed the tools' ability to detect expected differences in competence between groups with different training experiences.
The three-instrument toolkit developed showed excellent psychometric properties for measuring competence in electrocardiogram rhythm recognition.
Saffari, Mohsen; Naderi, Maryam K; Piper, Crystal N; Koenig, Harold G
There is no valid and well-established tool to measure fatigue in people with chronic hepatitis B. The aim of this study was to translate the Multidimensional Fatigue Inventory (MFI) into Persian and examine its reliability and validity in Iranian people with chronic hepatitis B. The demographic questionnaire and MFI, as well as Chronic Liver Disease Questionnaire and EuroQol-5D (to assess criterion validity), were administered in face-to-face interviews with 297 participants. A forward-backward translation method was used to develop a culturally adapted Persian version of the questionnaire. Cronbach's α was used to assess the internal reliability of the scale. Pearson correlation was used to assess criterion validity, and known-group method was used along with factor analysis to establish construct validity. Cronbach's α for the total scale was 0.89. Convergent and discriminant validities were also established. Correlations between the MFI and the health-related quality of life scales were significant (p < .01). The scale differentiated between subgroups of persons with the hepatitis B infection in terms of age, gender, employment, education, disease duration, and stage of disease. Factor analysis indicated a four-factor solution for the scale that explained 60% of the variance. The MFI is a valid and reliable instrument to identify fatigue in Iranians with hepatitis B.
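Cronbach's α, used in this and several neighbouring abstracts as the index of internal consistency, can be computed from item-level responses as α = k/(k − 1) × (1 − Σ item variances / variance of total scores). The following is a minimal sketch using sample variances; the response matrix is hypothetical and unrelated to the MFI data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(responses[0])
    # Sample variance of each item across respondents.
    item_variances = [variance([row[j] for row in responses]) for j in range(k)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1.0 - sum(item_variances) / total_variance)


# Hypothetical answers from four respondents to a three-item scale.
answers = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [5, 4, 4],
]
alpha = cronbach_alpha(answers)
print(round(alpha, 2))
```

The function assumes at least two items and at least two respondents, and it does not guard against a total-score variance of zero.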
de Geus, Eveline; Aalfs, Cora M; Menko, Fred H; Sijmons, Rolf H; Verdam, Mathilde G E; de Haes, Hanneke C J M; Smets, Ellen M A
2015-08-01
Despite the use of genetic services, counselees do not always share hereditary cancer information with at-risk relatives. Reasons for not informing relatives may be categorized as a lack of: knowledge, motivation, and/or self-efficacy. This study aims to develop and test the psychometric properties of the Informing Relatives Inventory, a battery of instruments that intend to measure counselees' knowledge, motivation, and self-efficacy regarding the disclosure of hereditary cancer risk information to at-risk relatives. Guided by the proposed conceptual framework, existing instruments were selected and new instruments were developed. We tested the instruments' acceptability, dimensionality, reliability, and criterion-related validity in consecutive index patients visiting the Clinical Genetics department with questions regarding hereditary breast and/or ovarian cancer or colon cancer. Data of 211 index patients were included (response rate = 62%). The Informing Relatives Inventory (IRI) assesses three barriers in disclosure representing seven domains. Instruments assessing index patients' (positive) motivation and self-efficacy were acceptable and reliable and suggested good criterion-related validity. Psychometric properties of instruments assessing index patients' knowledge were disputable. These items were moderately accepted by index patients and the criterion-related validity was weaker. This study presents a first conceptual framework and associated inventory (IRI) that improves insight into index patients' barriers regarding the disclosure of genetic cancer information to at-risk relatives. Instruments assessing (positive) motivation and self-efficacy proved to be reliable measurements. Measuring index patients' knowledge appeared to be more challenging. Further research is necessary to ensure IRI's dimensionality and sensitivity to change.
Translation and validation of the Canadian diabetes risk assessment questionnaire in China.
Guo, Jia; Shi, Zhengkun; Chen, Jyu-Lin; Dixon, Jane K; Wiley, James; Parry, Monica
2018-01-01
To adapt the Canadian Diabetes Risk Assessment Questionnaire for the Chinese population and to evaluate its psychometric properties. A cross-sectional study was conducted with a convenience sample of 194 individuals aged 35-74 years from October 2014 to April 2015. The Canadian Diabetes Risk Assessment Questionnaire was adapted and translated for the Chinese population. Test-retest reliability was conducted to measure stability. Criterion and convergent validity of the adapted questionnaire were assessed using 2-hr 75 g oral glucose tolerance tests and the Finnish Diabetes Risk Scores, respectively. Sensitivity and specificity were evaluated to establish its predictive validity. The test-retest reliability was 0.988. Adequate validity of the adapted questionnaire was demonstrated by positive correlations found between the scores and 2-hr 75 g oral glucose tolerance tests (r = .343, p < .001) and with the Finnish Diabetes Risk Scores (r = .738, p < .001). The area under receiver operating characteristic curve was 0.705 (95% CI .632, .778), demonstrating moderate diagnostic value at a cutoff score of 30. The sensitivity was 73%, with a positive predictive value of 57% and negative predictive value of 78%. Our results provided evidence supporting the translation consistency, content validity, convergent validity, criterion validity, sensitivity, and specificity of the translated Canadian Diabetes Risk Assessment Questionnaire with minor modifications. This paper provides clinical, practical, and methodological information on how to adapt a diabetes risk calculator between cultures for public health nurses. © 2017 Wiley Periodicals, Inc.
2013-01-01
Summary of background data: Recent smartphones, such as the iPhone, are often equipped with an accelerometer and magnetometer, which, through software applications, can perform various inclinometric functions. Although these applications are intended for recreational use, they have the potential to measure and quantify range of motion. The purpose of this study was to estimate the intra- and inter-rater reliability as well as the criterion validity of the clinometer and compass applications of the iPhone in the assessment of cervical range of motion in healthy participants. Methods: The sample consisted of 28 healthy participants. Two examiners measured cervical range of motion of each participant twice using the iPhone (for the estimation of intra- and inter-rater reliability) and once with the CROM (for the estimation of criterion validity). Estimates of reliability and validity were then established using the intraclass correlation coefficient (ICC). Results: We observed a moderate intra-rater reliability for each movement (ICC = 0.65-0.85) but a poor inter-rater reliability (ICC < 0.60). For the criterion validity, the ICCs are moderate (>0.50) to good (>0.65) for movements of flexion, extension, lateral flexions and right rotation, but poor (<0.50) for the movement left rotation. Conclusion: We found good intra-rater reliability and lower inter-rater reliability. When compared to the gold standard, these applications showed moderate to good validity. However, before using the iPhone as an outcome measure in clinical settings, studies should be done on patients presenting with cervical problems. PMID:23829201
Development of a new instrument for determining the level of chewing function in children.
Serel Arslan, S; Demir, N; Barak Dolgun, A; Karaduman, A A
2016-07-01
This study aimed to develop a chewing performance scale that classifies chewing from normal to severely impaired and to investigate its validity and reliability. The study included the developmental phase and reported the content, structural, criterion validity, interobserver and intra-observer reliability of the chewing performance scale, which was called the Karaduman Chewing Performance Scale (KCPS). A dysphagia literature review, other questionnaires and clinical experiences were used in the developmental phase. Seven experts assessed the steps for content validity over two Delphi rounds. To test structural, criterion validity, interobserver and intra-observer reliability, two swallowing therapists evaluated chewing videos of 144 children (Group I: 61 healthy children without chewing disorders, mean age of 42·38 ± 9·36 months; Group II: 83 children with cerebral palsy who have chewing disorders, mean age of 39·09 ± 22·95 months) using KCPS. The Behavioral Pediatrics Feeding Assessment Scale (BPFAS) was used for criterion validity. The KCPS steps arranged between 0-4 were found to be necessary. The content validity index was 0·885. The KCPS levels were found to be different between groups I and II (χ² = 123·286, P < 0·001). A moderately strong positive correlation was found between the KCPS and the subscales of the BPFAS (r = 0·444-0·773, P < 0·001). An excellent positive correlation was detected between two swallowing therapists and between two examinations of one swallowing therapist (r = 0·962, P < 0·001; r = 0·990, P < 0·001, respectively). The KCPS is a valid, reliable, quick and clinically easy-to-use functional instrument for determining the level of chewing function in children. © 2016 John Wiley & Sons Ltd.
Rossi, Gina; Debast, Inge; van Alphen, S P J
2017-07-01
The dimensional personality disorders model in the Diagnostic and Statistical Manual (DSM)-5 section III conceptually differentiates impaired personality functioning (criterion A) from the presence of pathological traits (criterion B). This study is the first to specifically address the measurement of criterion A in older adults. Moreover, the convergent/divergent validity of criterion A and criterion B will be compared in younger and older age groups. The Severity Indices of Personality Functioning - Short Form (SIPP-SF) was administered in older (N = 171) and younger adults (N = 210). The factorial structure was analyzed with exploratory structural equation modeling. Differences in convergent/divergent validity between personality functioning (SIPP-SF) and pathological traits (Personality Inventory for DSM-5; Dimensional Assessment of Personality Pathology-Basic Questionnaire) were examined across age groups. Identity Integration, Relational Capacities, Responsibility, Self-Control, and Social Concordance were corroborated as higher order domains. Although the SIPP-SF domains measured unique variation, some high correlations with pathological traits referred to overlapping constructs. Moreover, in older adults, personality functioning was more strongly related to Psychoticism, Disinhibition, Antagonism and Dissocial Behavior compared to younger adults. The SIPP-SF construct validity was demonstrated in terms of a structure of five higher order domains of personality functioning. The instrument is promising as a possible measure of impaired personality functioning in older adults. As such, it is a useful clinical tool to follow up effects of therapy on levels of personality functioning. Moreover, traits were associated with different degrees of personality functioning across age groups.
Assessor Training: Its Effects on Criterion-Based Assessment in a Medical Context
ERIC Educational Resources Information Center
Pell, Godfrey; Homer, Matthew S.; Roberts, Trudie E.
2008-01-01
Increasingly, academic institutions are being required to improve the validity of the assessment process; unfortunately, often this is at the expense of reliability. In medical schools (such as Leeds), standardized tests of clinical skills, such as "Objective Structured Clinical Examinations" (OSCEs) are widely used to assess clinical…
Chen, Poyu; Lin, Keh-Chung; Liing, Rong-Jiuan; Wu, Ching-Yi; Chen, Chia-Ling; Chang, Ku-Chou
2016-06-01
To examine the criterion validity, responsiveness, and minimal clinically important difference (MCID) of the EuroQoL 5-Dimensions Questionnaire (EQ-5D-5L) and visual analog scale (EQ-VAS) in people receiving rehabilitation after stroke. The EQ-5D-5L, along with four criterion measures-the Medical Research Council scales for muscle strength, the Fugl-Meyer assessment, the functional independence measure, and the Stroke Impact Scale-was administered to 65 patients with stroke before and after 3- to 4-week therapy. Criterion validity was estimated using the Spearman correlation coefficient. Responsiveness was analyzed by the effect size, standardized response mean (SRM), and criterion responsiveness. The MCID was determined by anchor-based and distribution-based approaches. The percentage of patients exceeding the MCID was also reported. Concurrent validity of the EQ-Index was better compared with the EQ-VAS. The EQ-Index has better power for predicting the rehabilitation outcome in the activities of daily living than other motor-related outcome measures. The EQ-Index was moderately responsive to change (SRM = 0.63), whereas the EQ-VAS was only mildly responsive to change. The MCID estimation of the EQ-Index (the percentage of patients exceeding the MCID) was 0.10 (33.8 %) and 0.10 (33.8 %) based on the anchor-based and distribution-based approaches, respectively, and the estimation of EQ-VAS was 8.61 (41.5 %) and 10.82 (32.3 %). The EQ-Index has shown reasonable concurrent validity, limited predictive validity, and acceptable responsiveness for detecting the health-related quality of life in stroke patients undergoing rehabilitation, but not for EQ-VAS. Future research considering different recovery stages after stroke is warranted to validate these estimations.
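The two responsiveness indices named in this abstract differ only in their denominator: the effect size divides the mean change by the standard deviation of baseline scores, whereas the standardized response mean (SRM) divides it by the standard deviation of the change scores. A minimal sketch with hypothetical pre- and post-therapy index values (not data from the study):

```python
from statistics import mean, stdev


def effect_size(pre, post):
    """Mean change divided by the SD of the baseline scores."""
    return (mean(post) - mean(pre)) / stdev(pre)


def standardized_response_mean(pre, post):
    """Mean change divided by the SD of the individual change scores."""
    changes = [after - before for before, after in zip(pre, post)]
    return mean(changes) / stdev(changes)


# Hypothetical utility-index values for four patients.
pre_scores = [0.50, 0.60, 0.40, 0.70]
post_scores = [0.60, 0.65, 0.55, 0.70]
es = effect_size(pre_scores, post_scores)
srm = standardized_response_mean(pre_scores, post_scores)
print(round(es, 2), round(srm, 2))
```

Because the SRM depends on how uniformly patients change, it can differ considerably from the effect size computed on the same data.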
Validation of a home food inventory among low-income Spanish- and Somali-speaking families.
Hearst, Mary O; Fulkerson, Jayne A; Parke, Michelle; Martin, Lauren
2013-07-01
To refine and validate an existing home food inventory (HFI) for low-income Somali- and Spanish-speaking families. Formative assessment was conducted using two focus groups, followed by revisions of the HFI, translation of written materials and instrument validation in participants’ homes. Twin Cities Metropolitan Area, Minnesota, USA. Thirty low-income families with children of pre-school age (fifteen Spanish-speaking; fifteen Somali-speaking) completed the HFI simultaneously with, but independently of, a trained staff member. Analysis consisted of calculation of both item-specific and average food group kappa coefficients, specificity, sensitivity and Spearman’s correlation between participants’ and staff scores as a means of assessing criterion validity of individual items, food categories and the obesogenic score. The formative assessment revealed the need for few changes/additions for food items typically found in Spanish-speaking households. Somali-speaking participants requested few additions, but many deletions, including frozen processed food items, non-perishable produce and many sweets as they were not typical food items kept in the home. Generally, all validity indices were within an acceptable range, with the exception of values associated with items such as ‘whole wheat bread’ (k = 0.16). The obesogenic score (presence of high-fat, high-energy foods) had high criterion validity with k = 0.57, sensitivity = 91.8%, specificity = 70.6% and Spearman correlation = 0.78. The revised HFI is a valid assessment tool for use among Spanish and Somali households. This instrument refinement and validation process can be replicated with other population groups.
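The item-level kappa coefficients mentioned above compare two raters' presence/absence records for each food item after correcting for chance agreement. The sketch below implements the unweighted Cohen's kappa, κ = (p_o − p_e) / (1 − p_e); the abstract does not specify the exact variant, so this choice is an assumption, and the records are hypothetical.

```python
def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equally long lists of categorical ratings."""
    if len(rater_a) != len(rater_b):
        raise ValueError("both raters must rate the same items")
    n = len(rater_a)
    # Proportion of items on which the two raters agree.
    observed = sum(1 for a, b in zip(rater_a, rater_b) if a == b) / n
    # Agreement expected by chance from each rater's marginal proportions.
    categories = set(rater_a) | set(rater_b)
    expected = sum((rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories)
    # Undefined when expected agreement equals 1 (both raters use a single category).
    return (observed - expected) / (1.0 - expected)


# Hypothetical records: 1 = item present in the home, 0 = absent.
participant = [1, 1, 1, 0, 0, 0, 1, 0]
staff = [1, 1, 0, 0, 0, 1, 1, 0]
kappa = cohens_kappa(participant, staff)
print(kappa)
```

A kappa near 0 indicates agreement no better than chance, which helps explain why an item can show low kappa despite a moderately high raw agreement rate.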
Development and initial validation of the appropriate antibiotic use self-efficacy scale.
Hill, Erin M; Watkins, Kaitlin
2018-06-04
While there are various medication self-efficacy scales that exist, none assess self-efficacy for appropriate antibiotic use. The Appropriate Antibiotic Use Self-Efficacy Scale (AAUSES) was developed, pilot tested, and its psychometric properties were examined. Following pilot testing of the scale, a 28-item questionnaire was examined using a sample (n = 289) recruited through the Amazon Mechanical Turk platform. Participants also completed other scales and items, which were used in assessing discriminant, convergent, and criterion-related validity. Test-retest reliability was also examined. After examining the scale and removing items that did not assess appropriate antibiotic use, an exploratory factor analysis was conducted on 13 items from the original scale. Three factors were retained that explained 65.51% of the variance. The scale and its subscales had adequate internal consistency. The scale had excellent test-retest reliability, as well as demonstrated convergent, discriminant, and criterion-related validity. The AAUSES is a valid and reliable scale that assesses three domains of appropriate antibiotic use self-efficacy. The AAUSES may have utility in clinical and research settings in understanding individuals' beliefs about appropriate antibiotic use and related behavioral correlates. Future research is needed to examine the scale's utility in these settings. Copyright © 2018 Elsevier B.V. All rights reserved.
Validation of the Chinese Version of the Quality of Nursing Work Life Scale
Fu, Xia; Xu, Jiajia; Song, Li; Li, Hua; Wang, Jing; Wu, Xiaohua; Hu, Yani; Wei, Lijun; Gao, Lingling; Wang, Qiyi; Lin, Zhanyi; Huang, Huigen
2015-01-01
Quality of Nursing Work Life (QNWL) serves as a predictor of a nurse’s intent to leave and hospital nurse turnover. However, QNWL measurement tools that have been validated for use in China are lacking. The present study evaluated the construct validity of the QNWL scale in China. A cross-sectional study using convenience sampling was conducted from June 2012 to January 2013 at five hospitals in Guangzhou, which employ 1938 nurses. The participants were asked to complete the QNWL scale and the World Health Organization Quality of Life abbreviated version (WHOQOL-BREF). A total of 1922 nurses provided the final data used for analyses. Sixty-five nurses from the first investigated division were re-measured two weeks later to assess the test-retest reliability of the scale. The internal consistency reliability of the QNWL scale was assessed using Cronbach’s α. Test-retest reliability was assessed using the intra-class correlation coefficient (ICC). Criterion-related validity was assessed using the correlation of the total scores of the QNWL and the WHOQOL-BREF. Construct validity was assessed with the following indices: χ2 statistics and degrees of freedom; root mean square error of approximation (RMSEA); the Akaike information criterion (AIC); the consistent Akaike information criterion (CAIC); the goodness-of-fit index (GFI); the adjusted goodness of fit index; and the comparative fit index (CFI). The findings demonstrated high internal consistency (Cronbach’s α = 0.912) and test-retest reliability (intra-class correlation coefficient = 0.74) for the QNWL scale. The chi-square test (χ2 = 13879.60, df [degrees of freedom] = 813, P = 0.0001) was significant. The RMSEA value was 0.091, and AIC = 1806.00, CAIC = 7730.69, CFI = 0.93, and GFI = 0.74. The correlation coefficient between the QNWL total scores and the WHOQOL-BREF total scores was 0.605 (p<0.01).
The QNWL scale was reliable and valid in Chinese-speaking nurses and could be used as a clinical and research instrument for measuring work-related factors among nurses in China. PMID:25950838
Moschella, Melissa
2016-01-01
This article explains the problems with Alan Shewmon’s critique of brain death as a valid sign of human death, beginning with a critical examination of his analogy between brain death and severe spinal cord injury. The article then goes on to assess his broader argument against the necessity of the brain for adult human organismal integration, arguing that he fails to translate correctly from biological to metaphysical claims. Finally, on the basis of a deeper metaphysical analysis, I offer a revised rationale for the validity of the neurological criterion of human death. PMID:27095749
Wagenlehner, Florian Martin Erich; Fröhlich, Oliver; Bschleipfer, Thomas; Weidner, Wolfgang; Perletti, Gianpaolo
2014-06-01
Anatomical damage to pelvic floor structures may cause multiple symptoms. The Integral Theory System Questionnaire (ITSQ) is a holistic questionnaire that uses symptoms to help locate damage in specific connective tissue structures as a guide to reconstructive surgery. It is based on the integral theory, which states that pelvic floor symptoms and prolapse are both caused by lax suspensory ligaments. The aim of the present study was to psychometrically validate the ITSQ. Established psychometric properties including validity, reliability, and responsiveness were considered for evaluation. Criterion validity was assessed in a cohort of 110 women with pelvic floor dysfunctions by analyzing the correlation of questionnaire responses with objective clinical data. Test-retest was performed with questionnaires from 47 patients. Cronbach's alpha and "split-half" reliability coefficients were calculated for internal consistency analysis. Psychometric properties of the ITSQ were comparable to those of previously validated Pelvic Floor Questionnaires. Face validity and content validity were approved by an expert group of the International Collaboration of Pelvic Floor surgeons. Convergent validity assessed using a Bayesian method was at least as accurate as the expert assessment of anatomical defects. Objective data measurement in patients demonstrated significant correlations with ITSQ domains, fulfilling criterion validity. Internal consistency values ranged from 0.85 to 0.89 in different scenarios. The ITSQ proved accurate and is able to serve as a holistic Pelvic Floor Questionnaire directing symptoms to site-specific pelvic floor reconstructive surgery.
Wang, Yi-Wen; Tsai, Yun-Fang; Lee, Shwu-Hua; Chen, Ying-Jen; Chen, Hsiu-Fang
2016-07-01
To develop and psychometrically test the Protective Reasons against Suicide Inventory among older Chinese-speaking outpatients. Tools currently exist to test reasons for living among individuals of all ages in western countries, but few are available to assess older adults' protective reasons against suicide in Asia. A cross-sectional survey to investigate protective reasons against suicide among older Chinese-speaking outpatients. The Protective Reasons against Suicide Inventory was developed based on individual interviews with 83 older outpatients in Taiwan, the literature and the authors' clinical experiences. The resulting Inventory was examined in 2013 for content validity, face validity, construct validity, criterion-related validity, internal consistency reliability and test-retest reliability. The Inventory had excellent content validity and face validity. Factor analysis yielded a seven-factor solution, accounting for 87·7% of the variance. Scores on the global Inventory and its subscales tended to be higher in outpatients diagnosed without suicidal ideation than in outpatients diagnosed with suicidal ideation, indicating good criterion validity. Inventory reliability and the intraclass correlation coefficient were satisfactory. The Protective Reasons against Suicide Inventory can be completed in 5 minutes and is perceived as easy to complete. Moreover, the Inventory yielded highly acceptable parameters for validity and reliability. The Protective Reasons against Suicide Inventory can be used to assess older Chinese-speaking outpatients for factors that protect them from attempting suicide. © 2016 John Wiley & Sons Ltd.
ERIC Educational Resources Information Center
Pike, Gary R.
1989-01-01
A study investigated the appropriateness of the American College Testing Program's College Outcome Measures Program, conducted at the University of Tennessee, Knoxville, by applying the criterion of construct validity. Results indicated that while the test primarily measures individual differences, it is also sensitive to the effects of higher…
An Evaluation of the Psychache Scale on an Offender Population
ERIC Educational Resources Information Center
Mills, Jeremy F.; Green, Kate; Reddon, John R.
2005-01-01
This study examined the generalizability of a self-report measure of psychache to an offender population. The factor structure, construct validity, and criterion validity of the Psychache Scale were assessed on 136 male prison inmates. The results showed the Psychache Scale to have a single underlying factor structure and to be strongly associated with…
Investigation of the Lollipop Test as a Pre-Kindergarten Screening Instrument.
ERIC Educational Resources Information Center
Chew, Alex L.; Morris, John D.
1987-01-01
The validity of the Lollipop Test: A Diagnostic Screening Test of School Readiness was examined for 129 pre-kindergarten subjects using the Developmental Indicator for the Assessment of Learning as the criterion. Concurrent validity was demonstrated across the test batteries. The Lollipop Test appears to be an attractive alternative…
Shin, Marlena H; Sullivan, Jennifer L; Rosen, Amy K; Solomon, Jeffrey L; Dunn, Edward J; Shimada, Stephanie L; Hayes, Jennifer; Rivard, Peter E
2014-12-01
Increasing use of Agency for Healthcare Research and Quality's Patient Safety Indicators (PSIs) for hospital performance measurement intensifies the need to critically assess their validity. Our study examined the extent to which variation in PSI composite score is related to differences in hospital organizational structures or processes (i.e., criterion validity). In site visits to three Veterans Health Administration hospitals with high and three with low PSI composite scores ("low performers" and "high performers," respectively), we interviewed a cross-section of hospital staff. We then coded interview transcripts for evidence in 13 safety-related domains and assessed variation across high and low performers. Evidence of leadership and coordination of work/communication (organizational process domains) was predominantly favorable for high performers only. Evidence in the other domains was either mixed, or there were insufficient data to rate the domains. While we found some evidence of criterion validity, the extent to which variation in PSI rates is related to differences in hospitals' organizational structures/processes needs further study. © The Author(s) 2014.
ERIC Educational Resources Information Center
LaBelle, Sara; Johnson, Zac D.
2018-01-01
Three studies were conducted to generate a valid and reliable instrument to measure student-to-student confirmation. Study One (N = 396) sought to establish a factor structure based on previous research. Study Two (N = 396) sought to confirm this factor structure and assess criterion-related validity. Study Three (N = 283) sought to assess…
ERIC Educational Resources Information Center
Thomas, Michael L.; Lanyon, Richard I.; Millsap, Roger E.
2009-01-01
The use of criterion group validation is hindered by the difficulty of classifying individuals on latent constructs. Latent class analysis (LCA) is a method that can be used for determining the validity of scales meant to assess latent constructs without such a priori classifications. The authors used this method to examine the ability of the L…
ERIC Educational Resources Information Center
Armstrong, William B.
As part of an effort to statistically validate the placement tests used in California's San Diego Community College District (SDCCD) a study was undertaken to review the criteria- and content-related validity of the Assessment and Placement Services (APS) reading and writing tests. Evidence of criteria and content validity was gathered from…
Romero-García, Marta; de la Cueva-Ariza, Laura; Benito-Aracil, Llucia; Lluch-Canut, Teresa; Trujols-Albet, Joan; Martínez-Momblan, Maria Antonia; Juvé-Udina, Maria-Eulàlia; Delgado-Hito, Pilar
2018-06-01
The aim of this study was to develop and validate the Nursing Intensive-Care Satisfaction Scale to measure satisfaction with nursing care from the critical care patient's perspective. Instruments that measure satisfaction with nursing care have been designed and validated without taking the patient's perspective into consideration. Despite the benefits and advances in measuring satisfaction with nursing care, no instrument is specifically designed to assess satisfaction in intensive care units. Instrument development. The population comprised all patients discharged (January 2013 - January 2015) from three Intensive Care Units of a third level hospital (N = 200). All assessment instruments were given to discharged patients; to analyse temporal stability, only the questionnaire was given again 48 hours later. The validation process of the scale included the analysis of internal consistency and temporal stability; construct validity through a confirmatory factor analysis; and criterion validity. Reliability was 0.95. The intraclass correlation coefficient for the total scale was 0.83, indicating good temporal stability. Construct validity showed an acceptable fit and a factorial structure with four factors, in accordance with the theoretical model, with the Consequences factor being the best correlated with the other factors. Criterion validity showed correlations ranging from low to high (range: 0.42-0.68). The scale has been designed and validated incorporating the perspective of critical care patients. Thanks to its reliability and validity, this questionnaire can be used both in research and in clinical practice. The scale offers a possibility to assess and develop interventions to improve patient satisfaction with nursing care. © 2018 John Wiley & Sons Ltd.
Classen, Sherrilene; Wang, Yanning; Winter, Sandra M; Velozo, Craig A; Lanford, Desiree N; Bédard, Michel
2013-01-01
We determined the concurrent criterion validity of the Safe Driving Behavior Measure (SDBM) for on-road outcomes (passing or failing the on-road test as determined by a certified driving rehabilitation specialist) among older drivers and their family members-caregivers. On the basis of ratings from 168 older drivers and 168 family members-caregivers, we calculated receiver operating characteristic curves. The drivers' area under the curve (AUC) was .620 (95% confidence interval [CI] = .514-.725, p = .043). The family members-caregivers' AUC was .726 (95% CI = .622-.829, p ≤ .01). Older drivers' ratings showed statistically significant yet poor concurrent criterion validity, but family members-caregivers' ratings showed good concurrent criterion validity for the criterion on-road driving test. Continuing research with a more representative sample is being pursued to confirm the SDBM's concurrent criterion validity. This screening tool may be useful for generalist practitioners to use in making decisions regarding driving. Copyright © 2013 by the American Occupational Therapy Association, Inc. PMID:23245789
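The AUC figures reported for the SDBM can be read as the probability that a randomly chosen driver who passed the on-road test has a higher rating than a randomly chosen driver who failed. A minimal sketch of that pairwise definition follows; the function name and the assumption that higher ratings indicate safer driving are illustrative and not taken from the abstract.

```python
def roc_auc(scores_pass, scores_fail):
    """Area under the ROC curve via pairwise comparison.

    Counts the share of (pass, fail) pairs in which the passing driver has
    the higher rating; ties count as half. Assumes higher rating = safer.
    """
    wins = 0.0
    for p in scores_pass:
        for f in scores_fail:
            if p > f:
                wins += 1.0
            elif p == f:
                wins += 0.5
    return wins / (len(scores_pass) * len(scores_fail))
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation, which is why the drivers' own value of .620 is described as poor despite being statistically significant.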
The Work-Health-Check (WHC): a brief new tool for assessing psychosocial stress in the workplace.
Gadinger, M C; Schilling, O; Litaker, D; Fischer, J E
2012-01-01
Brief, psychometrically robust questionnaires assessing work-related psychosocial stressors are lacking. The purpose of the study is to evaluate the psychometric properties of a brief new questionnaire for assessing sources of work-related psychosocial stress. Managers, blue- and white-collar workers (n= 628 at measurement point one, n=459 at measurement point two), sampled from an online panel of a German marketing research institute. We either developed or identified appropriate items from existing questionnaires for ten scales, which are conceptually based in work stress models and reflected either work-related demands or resources. Factorial structure was evaluated by confirmatory factor analyses (CFA). Scale reliability was assessed by Cronbach's Alpha, and test-retest; correlations with work-related efforts demonstrated convergent and discriminant validity for the demand and resource scales, respectively. Scale correlations with health indicators tested criterion validity. All scales had satisfactory reliability (Cronbach's Alpha: 0.74-0.93, retest reliabilities: 0.66-0.81). CFA supported the anticipated factorial structure. Significant correlations between job-related efforts and demand scales (mean r=0.44) and non-significant correlations with the resource scales (mean r=0.07) suggested good convergent and discriminant validity, respectively. Scale correlations with health indicators demonstrated good criterion validity. The WHC appears to be a brief, psychometrically robust instrument for assessing work-related psychosocial stressors.
Validity of field expedient devices to assess core temperature during exercise in the cold.
Bagley, James R; Judelson, Daniel A; Spiering, Barry A; Beam, William C; Bartolini, J Albert; Washburn, Brian V; Carney, Keven R; Muñoz, Colleen X; Yeargin, Susan W; Casa, Douglas J
2011-12-01
Exposure to cold environments affects human performance and physiological function. Major medical organizations recommend rectal temperature (T(REC)) to evaluate core body temperature (T(CORE)) during exercise in the cold; however, other field expedient devices claim to measure T(CORE). The purpose of this study was to determine if field expedient devices provide valid measures of T(CORE) during rest and exercise in the cold. Participants included 13 men and 12 women (age = 24 ± 3 yr, height = 170.7 ± 10.6 cm, mass = 73.4 ± 16.7 kg, body fat = 18 ± 7%) who reported being healthy and at least recreationally active. During 150 min of cold exposure, subjects sequentially rested for 30 min, cycled for 90 min (heart rate = 120-140 bpm), and rested for an additional 30 min. Investigators compared aural (T(AUR)), expensive axillary (T(AXLe)), inexpensive axillary (T(AXLi)), forehead (T(FOR)), gastrointestinal (T(GI)), expensive oral (T(ORLe)), inexpensive oral (T(ORLi)), and temporal (T(TEM)) temperatures to T(REC) every 15 min. Researchers used mean difference between each device and T(REC) (i.e., mean bias) as the primary criterion for validity. T(AUR), T(AXLe), T(AXLi), T(FOR), T(ORLe), T(ORLi), and T(TEM) provided significantly lower measures compared to T(REC) and fell below our validity criterion. T(GI) significantly exceeded T(REC) at three of eleven time points, but no significant difference existed between mean T(REC) and T(GI) across time. Only T(GI) achieved our validity criterion and compared favorably to T(REC). T(GI) offers a valid measurement with which to assess T(CORE) during rest and exercise in the cold; athletic trainers, mountain rescuers, and military medical personnel should avoid other field expedient devices in similar conditions.
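The mean bias criterion described above is the average of the paired differences between a device and rectal temperature. The sketch below shows that calculation under the assumption of complete paired readings; the function names are hypothetical, and the threshold is left as a caller-supplied parameter because the abstract does not state the value the investigators used.

```python
from statistics import mean


def mean_bias(device_temps, rectal_temps):
    """Mean of (device - rectal) over paired readings, in the readings' units."""
    if len(device_temps) != len(rectal_temps):
        raise ValueError("readings must be paired")
    return mean(d - r for d, r in zip(device_temps, rectal_temps))


def meets_criterion(device_temps, rectal_temps, max_abs_bias):
    """True when the absolute mean bias is within max_abs_bias.

    max_abs_bias is a hypothetical parameter; the abstract does not report
    the threshold that defined the validity criterion.
    """
    return abs(mean_bias(device_temps, rectal_temps)) <= max_abs_bias
```

A negative mean bias means the device reads lower than rectal temperature on average, which is the pattern the abstract reports for every device except the gastrointestinal sensor.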
Nebraska Wisconsin Cognitive Assessment Battery (NEWCAB).
ERIC Educational Resources Information Center
Kalyan-Masih, V.; Marshall, W.
This report discusses the construct and criterion-related validity of the Nebraska Wisconsin Cognitive Assessment Battery (NEWCAB), on the basis of pooled regional data collected in Iowa, Kansas, Nebraska, and Wisconsin on a 3-year longitudinal sample of 107 6-year-old, 141 7-year-old, and 160 8-year-old children. Designed to assess the cognitive…
Buchowski, Maciej S.; Matthews, Charles E.; Cohen, Sarah S.; Signorello, Lisa B.; Fowke, Jay H.; Hargreaves, Margaret K.; Schlundt, David G.; Blot, William J.
2012-01-01
Background: Low physical activity (PA) is linked to cancer and other diseases prevalent in racial/ethnic minorities and low-income populations. This study evaluated the PA questionnaire (PAQ) used in the Southern Community Cohort Study (SCCS), a prospective investigation of health disparities between African-American and white adults. Methods: The PAQ was administered upon entry into the cohort (PAQ1) and after 12–15 months (PAQ2) in 118 participants (40–60 years old, 48% male, 74% African-American). Test-retest reliability (PAQ1 versus PAQ2) was assessed using Spearman correlations and the Wilcoxon signed rank test. Criterion validity of the PAQ was assessed via comparison with a PA monitor and a last-month PA survey (LMPAS), administered up to 4 times in the study period. Results: The PAQ test-retest reliability ranged from 0.25–0.54 for sedentary behaviors and 0.22–0.47 for active behaviors. The criterion validity for the PAQ compared with the PA monitor ranged from 0.21–0.24 for sedentary behaviors and from 0.17–0.31 for active behaviors. There was general consistency in the magnitude of correlations between the PAQ and PA monitor between African-Americans and whites. Conclusions: The SCCS-PAQ has fair to moderate test-retest reliability and demonstrated some evidence of criterion validity for ranking participants by their level of sedentary and active behaviors. PMID:21952413
Baek, Sora; Park, Hee-Won; Lee, Yookyung; Grace, Sherry L; Kim, Won-Seok
2017-10-01
To perform a translation and cross-cultural adaptation of the Cardiac Rehabilitation Barriers Scale (CRBS) for use in Korea, followed by psychometric validation. The CRBS was developed to assess patients' perception of the degree to which patient, provider and health system-level barriers affect their cardiac rehabilitation (CR) participation. The CRBS consists of 21 items (barriers to adherence) rated on a 5-point Likert scale. The first phase was to translate and cross-culturally adapt the CRBS to the Korean language. After back-translation, both versions were reviewed by a committee. The face validity was assessed through semi-structured interviews in a sample of Korean patients (n=53) with a history of acute myocardial infarction who did not participate in CR. The second phase was to assess the construct and criterion validity of the Korean translation as well as internal reliability, through administration of the translated version in 104 patients, principal component analysis with varimax rotation and cross-referencing against CR use, respectively. The length, readability, and clarity of the questionnaire were rated well, demonstrating face validity. Analysis revealed a six-factor solution, demonstrating construct validity. Cronbach's alpha was greater than 0.65. Barriers rated highest included not knowing about CR and not being contacted by a program. The mean CRBS score was significantly higher among non-attendees (2.71±0.26) than CR attendees (2.51±0.18) (p<0.01). The Korean version of CRBS has demonstrated face, content and criterion validity, suggesting it may be useful for assessing barriers to CR utilization in Korea.
Serel Arslan, S; Demir, N; Karaduman, A A
2017-02-01
This study aimed to develop a scale called Tongue Thrust Rating Scale (TTRS), which categorised tongue thrust in children in terms of its severity during swallowing, and to investigate its validity and reliability. The study describes the developmental phase of the TTRS and presented its content and criterion-based validity and interobserver and intra-observer reliability. For content validation, seven experts assessed the steps in the scale over two Delphi rounds. Two physical therapists evaluated videos of 50 children with cerebral palsy (mean age, 57·9 ± 16·8 months), using the TTRS to test criterion-based validity, interobserver and intra-observer reliability. The Karaduman Chewing Performance Scale (KCPS) and Drooling Severity and Frequency Scale (DSFS) were used for criterion-based validity. All the TTRS steps were deemed necessary. The content validity index was 0·857. A very strong positive correlation was found between two examinations by one physical therapist, which indicated intra-observer reliability (r = 0·938, P < 0·001). A very strong positive correlation was also found between the TTRS scores of two physical therapists, indicating interobserver reliability (r = 0·892, P < 0·001). There was also a strong positive correlation between the TTRS and KCPS (r = 0·724, P < 0·001) and a very strong positive correlation between the TTRS scores and DSFS (r = 0·822 and r = 0·755; P < 0·001). These results demonstrated the criterion-based validity of the TTRS. The TTRS is a valid, reliable and clinically easy-to-use functional instrument to document the severity of tongue thrust in children. © 2016 John Wiley & Sons Ltd.
Rodríguez, Iván; Zambrano, Lysien; Manterola, Carlos
2016-04-01
Physiological parameters used to measure exercise intensity are oxygen uptake and heart rate. However, perceived exertion (PE) is a scale that has also been frequently applied. The objective of this study is to establish the criterion-related validity of PE scales in children during an incremental exercise test. Seven electronic databases were used. Studies aimed at assessing criterion-related validity of PE scales in healthy children during an incremental exercise test were included. Correlation coefficients were transformed into z-values and assessed in a meta-analysis by means of a fixed effects model if I² was below 50% or a random effects model if it was above 50%. Twenty-five articles that studied 1418 children (boys: 49.2%) met the inclusion criteria. Children's average age was 10.5 years old. Exercise modalities included bike, running and stepping exercises. The weighted correlation coefficient was 0.835 (95% confidence interval: 0.762-0.887) and 0.874 (95% confidence interval: 0.794-0.924) for heart rate and oxygen uptake as reference criteria. The production paradigm and scales that had not been adapted to children showed the lowest measurement performance (p < 0.05). Measuring PE could be valid in healthy children during an incremental exercise test. Child-specific rating scales showed a better performance than those that had not been adapted to this population. Further studies with better methodological quality should be conducted in order to confirm these results. Sociedad Argentina de Pediatría.
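The pooling procedure in the abstract above (Fisher z-transformation, then a fixed or random effects model chosen by I²) can be sketched as follows. This is an illustration under stated assumptions rather than the authors' analysis: the function name is hypothetical, weights are the usual n - 3 for Fisher z, and the random effects branch uses the DerSimonian-Laird estimator, which the abstract does not name.

```python
import math


def pool_correlations(rs, ns, z_crit=1.96):
    """Pool correlations rs from studies of sizes ns.

    Returns (pooled r, (ci_low, ci_high), I-squared in percent).
    """
    zs = [math.atanh(r) for r in rs]      # Fisher z-transform
    ws = [n - 3 for n in ns]              # inverse variance of z is n - 3
    z_fixed = sum(w * z for w, z in zip(ws, zs)) / sum(ws)
    # Cochran's Q and the I-squared heterogeneity statistic.
    q = sum(w * (z - z_fixed) ** 2 for w, z in zip(ws, zs))
    df = len(rs) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    if i2 > 50:
        # Random effects: DerSimonian-Laird between-study variance (assumed).
        c = sum(ws) - sum(w * w for w in ws) / sum(ws)
        tau2 = max(0.0, (q - df) / c)
        ws = [1 / (1 / w + tau2) for w in ws]
    z_pooled = sum(w * z for w, z in zip(ws, zs)) / sum(ws)
    se = math.sqrt(1 / sum(ws))
    # Back-transform the estimate and interval to the correlation scale.
    ci = (math.tanh(z_pooled - z_crit * se), math.tanh(z_pooled + z_crit * se))
    return math.tanh(z_pooled), ci, i2
```

Working on the z scale and back-transforming at the end is what makes the reported confidence intervals asymmetric around the pooled correlation, as in 0.835 (0.762-0.887).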
Miller, Joshua D; Lynam, Donald R
2012-07-01
Since its publication, the Psychopathic Personality Inventory and its revision (Lilienfeld & Andrews, 1996; Lilienfeld & Widows, 2005) have become increasingly popular such that it is now among the most frequently used self-report inventories for the assessment of psychopathy. The current meta-analysis examined the relations between the two PPI factors (factor 1: Fearless Dominance; factor 2: Self-Centered Impulsivity), as well as their relations with other validated measures of psychopathy, internalizing and externalizing forms of psychopathology, general personality traits, and antisocial personality disorder symptoms. Across 61 samples reported in 49 publications, we found support for the convergent and criterion validity of both PPI factor 2 and the PPI total score. Much weaker validation was found for PPI factor 1, which manifested limited convergent validity and a pattern of correlations with central criterion variables that was inconsistent with many conceptualizations of psychopathy. PsycINFO Database Record (c) 2012 APA, all rights reserved.
Zubeidat, Ihab; Salinas, José María; Sierra, Juan Carlos; Fernández-Parra, Antonio
2007-01-01
In this study, we analyzed the reliability and validity of the Social Interaction Anxiety Scale (SIAS) and propose a separation criterion between youths with specific and generalized social anxiety and youths without social anxiety. A sample of 1012 Spanish youths attending school completed the SIAS, the Liebowitz Social Anxiety Scale, the Social Avoidance and Distress Scale, the Fear of Negative Evaluation Scale, the Youth Self-Report for Ages 11-18 and the Minnesota Multiphasic Personality Inventory-Adolescent. The factor analysis suggests the existence of three factors in the SIAS, the first two of which explain most of the variance of the construct assessed. Internal consistency is adequate in the first two factors. The SIAS features an adequate theoretical validity with the scores of different variables related to social interaction. Analysis of the criterion scores yields three groups pertaining to three clearly differentiated clusters. In the third cluster, two of social anxiety groups - specific and generalized - have been identified by means of a quantitative separation criterion.
Statistical methodology: II. Reliability and validity assessment in study design, Part B.
Karras, D J
1997-02-01
Validity measures the correspondence between a test and other purported measures of the same or similar qualities. When a reference standard exists, a criterion-based validity coefficient can be calculated. If no such standard is available, the concepts of content and construct validity may be used, but quantitative analysis may not be possible. The Pearson and Spearman tests of correlation are often used to assess the correspondence between tests, but do not account for measurement biases and may yield misleading results. Techniques that measure intertest differences may be more meaningful in validity assessment, and the kappa statistic is useful for analyzing categorical variables. Questionnaires often can be designed to allow quantitative assessment of reliability and validity, although this may be difficult. Inclusion of homogeneous questions is necessary to assess reliability. Analysis is enhanced by using Likert scales or similar techniques that yield ordinal data. Validity assessment of questionnaires requires careful definition of the scope of the test and comparison with previously validated tools.
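The kappa statistic mentioned above corrects the observed agreement between two categorical ratings for the agreement expected by chance. A minimal sketch of the unweighted (Cohen's) form, with a hypothetical function name and two raters assumed:

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equal-length lists of category labels."""
    n = len(rater_a)
    # Proportion of cases on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / n ** 2
    return (observed - expected) / (1 - expected)
```

For example, two raters who agree on three of four yes/no cases, with chance agreement of 0.5, obtain kappa = (0.75 - 0.5) / (1 - 0.5) = 0.5, noticeably lower than the raw 75% agreement.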
Psychometric evaluation of the Swedish version of Rosenberg's self-esteem scale.
Eklund, Mona; Bäckström, Martin; Hansson, Lars
2018-04-01
The widely used Rosenberg's self-esteem scale (RSES) has not been evaluated for psychometric properties in Sweden. This study aimed at analyzing its factor structure, internal consistency, criterion, convergent and discriminant validity, sensitivity to change, and whether a four-graded Likert-type response scale increased its reliability and validity compared to a yes/no response scale. People with mental illness participating in intervention studies to (1) promote everyday life balance (N = 223) or (2) remedy self-stigma (N = 103) were included. Both samples completed the RSES and questionnaires addressing quality of life and sociodemographic data. Sample 1 also completed instruments chosen to assess convergent and discriminant validity: self-mastery (convergent validity), level of functioning and occupational engagement (discriminant validity). Confirmatory factor analysis (CFA), structural equation modeling, and conventional inferential statistics were used. Based on both samples, the Swedish RSES formed one factor and exhibited high internal consistency (>0.90). The two response scales were equivalent. Criterion validity in relation to quality of life was demonstrated. RSES could distinguish between women and men (women scoring lower) and between diagnostic groups (people with depression scoring lower). Correlations >0.5 with variables chosen to reflect convergent validity and around 0.2 with variables used to address discriminant validity further highlighted the construct validity of RSES. The instrument also showed sensitivity to change. The Swedish RSES exhibited a one-component factor structure and showed good psychometric properties in terms of good internal consistency, criterion, convergent and discriminant validity, and sensitivity to change. The yes/no and the four-graded Likert-type response scales worked equivalently.
Cross-cultural validity of a dietary questionnaire for studies of dental caries risk in Japanese.
Shinga-Ishihara, Chikako; Nakai, Yukie; Milgrom, Peter; Murakami, Kaori; Matsumoto-Nakano, Michiyo
2014-01-02
Diet is a major modifiable contributing factor in the etiology of dental caries. The purpose of this paper is to examine the reliability and cross-cultural validity of the Japanese version of the Food Frequency Questionnaire to assess dietary intake in relation to dental caries risk in Japanese. The 38-item Food Frequency Questionnaire, in which Japanese food items were added to increase content validity, was translated into Japanese, and administered to two samples. The first sample comprised 355 pregnant women with mean age of 29.2 ± 4.2 years for the internal consistency and criterion validity analyses. Factor analysis (principal components with Varimax rotation) was used to determine dimensionality. The dietary cariogenicity score was calculated from the Food Frequency Questionnaire and used for the analyses. Salivary mutans streptococci level was used as a semi-quantitative assessment of dental caries risk and measured by Dentocult SM. Dentocult SM scores were compared with the dietary cariogenicity score computed from the Food Frequency Questionnaire to examine criterion validity, and assessed by Spearman's correlation coefficient (rs) and Kruskal-Wallis test. Test-retest reliability of the Food Frequency Questionnaire was assessed with a second sample of 25 adults with mean age of 34.0 ± 3.0 years by using the intraclass correlation coefficient analysis. The Japanese language version of the Food Frequency Questionnaire showed high test-retest reliability (ICC = 0.70) and good criterion validity assessed by relationship with salivary mutans streptococci levels (rs = 0.22; p < 0.001). Factor analysis revealed four subscales that construct the questionnaire (solid sugars, solid and starchy sugars, liquid and semisolid sugars, sticky and slowly dissolving sugars). Internal consistency was low to acceptable (Cronbach's alpha = 0.67 for the total scale, 0.46-0.61 for each subscale).
Mean dietary cariogenicity scores were 50.8 ± 19.5 in the first sample, and 47.4 ± 14.1 and 40.6 ± 11.3 for the first and second administrations, respectively, in the second sample. The distribution of Dentocult SM score was 6.8% (score = 0), 34.4% (score = 1), 39.4% (score = 2), and 19.4% (score = 3). Participants with higher scores were more likely to have higher dietary cariogenicity scores (p < 0.001; Kruskal-Wallis test). These results provide preliminary evidence for the reliability and validity of the Japanese language Food Frequency Questionnaire.
ERIC Educational Resources Information Center
Woodburn, Jim; Sutcliffe, Nick
1996-01-01
The Objective Structured Clinical Examination (OSCE), initially developed for undergraduate medical education, has been adapted for assessment of clinical skills in podiatry students. A 12-month pilot study found the test had relatively low levels of reliability, high construct and criterion validity, and good stability of performance over time.…
Validity and Bias of Academic Achievement Measures in the First Year of Elementary School
ERIC Educational Resources Information Center
Hammes, Patricia Simone; Bigras, Marc; Crepaldi, Maria Aparecida
2016-01-01
We tested the criterion-related validity and potential bias of two measures of pupils' academic achievement: the Teacher Rating Scale (TRS) and the Mathematics and Literacy Achievement Tests (MLTs). These measures are representative of assessment methods largely used in the elementary school. The aims were: (1) to verify the extent to which TRS…
Lee, Rebekka M; Emmons, Karen M; Okechukwu, Cassandra A; Barrett, Jessica L; Kenney, Erica L; Cradock, Angie L; Giles, Catherine M; deBlois, Madeleine E; Gortmaker, Steven L
2014-11-28
Nutrition and physical activity interventions have been effective in creating environmental changes in afterschool programs. However, accurate assessment can be time-consuming and expensive as initiatives are scaled up for optimal population impact. This study aims to determine the criterion validity of a simple, low-cost, practitioner-administered observational measure of afterschool physical activity, nutrition, and screen time practices and child behaviors. Directors from 35 programs in three cities completed the Out-of-School Nutrition and Physical Activity Observational Practice Assessment Tool (OSNAP-OPAT) on five days. Trained observers recorded snacks served and obtained accelerometer data each day during the same week. Observations of physical activity participation and snack consumption were conducted on two days. Correlations were calculated to validate weekly average estimates from OSNAP-OPAT compared to criterion measures. Weekly criterion averages are based on 175 meals served, snack consumption of 528 children, and physical activity levels of 356 children. OSNAP-OPAT validly assessed serving water (r = 0.73), fruits and vegetables (r = 0.84), juice >4oz (r = 0.56), and grains (r = 0.60) at snack; sugary drinks (r = 0.70) and foods (r = 0.68) from outside the program; and children's water consumption (r = 0.56) (all p <0.05). Reports of physical activity time offered were correlated with accelerometer estimates (minutes of moderate and vigorous physical activity r = 0.59, p = 0.02; vigorous physical activity r = 0.63, p = 0.01). The reported proportion of children participating in moderate and vigorous physical activity was correlated with observations (r = 0.48, p = 0.03), as were reports of computer (r = 0.85) and TV/movie (r = 0.68) time compared to direct observations (both p < 0.01). OSNAP-OPAT can assist researchers and practitioners in validly assessing nutrition and physical activity environments and behaviors in afterschool settings. 
Phase 1 of this measure validation was conducted during a study registered at clinicaltrials.gov NCT01396473.
Fernández-Domínguez, Juan Carlos; de Pedro-Gómez, Joan Ernest; Morales-Asencio, José Miguel; Sastre-Fullana, Pedro; Sesé-Abad, Albert
2017-01-01
Introduction: Most of the EBP measuring instruments available to date present limitations both in the operationalisation of the construct and also in the rigour of their psychometric development, as revealed in the literature review performed. The aim of this paper is to provide rigorous and adequate reliability and validity evidence of the scores of a new transdisciplinary psychometric tool, the Health Sciences Evidence-Based Practice (HS-EBP), for measuring the construct EBP in Health Sciences professionals. Methods: A pilot study and a subsequent two-stage validation test sample were conducted to progressively refine the instrument until a reduced 60-item version with a five-factor latent structure was reached. Reliability was analysed through both Cronbach’s alpha coefficient and intraclass correlations (ICC). Latent structure was contrasted using confirmatory factor analysis (CFA) following a model comparison approach. Evidence of criterion validity of the scores obtained was achieved by considering attitudinal resistance to change, burnout, and quality of professional life as criterion variables; while convergent validity was assessed using the Spanish version of the Evidence-Based Practice Questionnaire (EBPQ-19). Results: Adequate evidence of both reliability and ICC was obtained for the five dimensions of the questionnaire. According to the CFA model comparison, the best fit corresponded to the five-factor model (RMSEA = 0.049; CI 90% RMSEA = [0.047; 0.050]; CFI = 0.99). Adequate criterion and convergent validity evidence was also provided. Finally, the HS-EBP showed the capability to find differences between EBP training levels as an important evidence of decision validity. Conclusions: Reliability and validity evidence obtained regarding the HS-EBP confirm the adequate operationalisation of the EBP construct as a process put into practice to respond to every clinical situation arising in the daily practice of professionals in health sciences (transprofessional). 
The tool could be useful for EBP individual assessment and for evaluating the impact of specific interventions to improve EBP. PMID:28486533
Fernández-Domínguez, Juan Carlos; de Pedro-Gómez, Joan Ernest; Morales-Asencio, José Miguel; Bennasar-Veny, Miquel; Sastre-Fullana, Pedro; Sesé-Abad, Albert
2017-01-01
Most of the EBP measuring instruments available to date present limitations both in the operationalisation of the construct and also in the rigour of their psychometric development, as revealed in the literature review performed. The aim of this paper is to provide rigorous and adequate reliability and validity evidence of the scores of a new transdisciplinary psychometric tool, the Health Sciences Evidence-Based Practice (HS-EBP), for measuring the construct EBP in Health Sciences professionals. A pilot study and a subsequent two-stage validation test sample were conducted to progressively refine the instrument until a reduced 60-item version with a five-factor latent structure was obtained. Reliability was analysed through both Cronbach's alpha coefficient and intraclass correlations (ICC). Latent structure was contrasted using confirmatory factor analysis (CFA) following a model comparison approach. Evidence of criterion validity of the scores obtained was achieved by considering attitudinal resistance to change, burnout, and quality of professional life as criterion variables; while convergent validity was assessed using the Spanish version of the Evidence-Based Practice Questionnaire (EBPQ-19). Adequate evidence of both reliability and ICC was obtained for the five dimensions of the questionnaire. According to the CFA model comparison, the best fit corresponded to the five-factor model (RMSEA = 0.049; CI 90% RMSEA = [0.047; 0.050]; CFI = 0.99). Adequate criterion and convergent validity evidence was also provided. Finally, the HS-EBP showed the capability to find differences between EBP training levels as important evidence of decision validity. Reliability and validity evidence obtained regarding the HS-EBP confirm the adequate operationalisation of the EBP construct as a process put into practice to respond to every clinical situation arising in the daily practice of professionals in health sciences (transprofessional). The tool could be useful for EBP individual assessment and for evaluating the impact of specific interventions to improve EBP.
Gutiérrez Sánchez, Daniel; Cuesta-Vargas, Antonio I
2018-04-01
Many measurements have been developed to assess the quality of death (QoD). Among these, the Quality of Dying and Death Questionnaire (QODD) is the most widely studied and best validated. Informal carers and health professionals who care for the patient during their last days of life can complete this assessment tool. The aim of the study is to carry out a cross-cultural adaptation and a psychometric analysis of the QODD for the Spanish population. The translation was performed using a double forward and backward method. An expert panel evaluated the content validity. The questionnaire was tested in a sample of 72 Spanish-speaking adult carers of deceased cancer patients. A psychometric analysis was performed to evaluate internal consistency, divergent criterion-related validity with the Mini-Suffering State Examination (MSSE) and concurrent criterion-related validity with the Palliative Outcome Scale (POS). Some items were deleted and modified to create the Spanish version of the QODD (QODD-ESP-26). The instrument was readable and acceptable. The content validity index was 0.96, suggesting that all items are relevant for the measure of the QoD. This questionnaire showed high internal consistency (Cronbach's α coefficient = 0.88). Divergent validity with MSSE (r = -0.64) and convergent validity with POS (r = -0.61) were also demonstrated. The QODD-ESP-26 is a valid and reliable instrument for the assessment of the QoD of deceased cancer patients that can be used in a clinical and research setting. Copyright © 2018 Elsevier Ltd. All rights reserved.
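Several abstracts in this listing report internal consistency as Cronbach's alpha. As a rough illustration of how that coefficient is usually computed, the sketch below applies the standard formula, alpha = k/(k-1) × (1 − sum of item variances / variance of total scores). The function name `cronbach_alpha` and the five-respondent, three-item score matrix are hypothetical and are not taken from any of the studies listed here.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a respondents-by-items score matrix.

    scores: list of rows, one row per respondent, one column per item.
    Uses sample variances (n - 1 denominator) throughout.
    """
    k = len(scores[0])                      # number of items
    items = list(zip(*scores))              # transpose: one tuple per item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - item_var_sum / total_var)


# Hypothetical illustrative data: 5 respondents, 3 Likert-type items.
example_scores = [
    [2, 3, 3],
    [4, 4, 5],
    [3, 3, 4],
    [5, 5, 5],
    [1, 2, 2],
]
alpha = cronbach_alpha(example_scores)
print(round(alpha, 3))
```

Because the toy items move almost in lockstep, alpha comes out very high here; real questionnaires with more heterogeneous items typically yield lower values, such as the 0.88 reported above.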
2013-01-01
Background Transplant recipients are expected to adhere to a lifelong immunosuppressant therapeutic regimen. However, nonadherence to treatment is an underestimated problem for which no properly validated measurement tool is available for Portuguese-speaking patients. We aimed to initially validate the Basel Assessment of Adherence to Immunosuppressive Medications Scale (BAASIS®) to accurately estimate immunosuppressant nonadherence in Brazilian transplant patients. Methods The BAASIS® (English version) was transculturally adapted and its psychometric properties were assessed. The transcultural adaptation was performed using the Guillemin protocol. Psychometric testing included reliability (intraobserver and interobserver reproducibility, agreement, Kappa coefficient, and the Cronbach’s alpha) and validity (content, criterion, and construct validities). Results The final version of the transculturally adapted BAASIS® was pretested, and no difficulties in understanding its content were found. The intraobserver and interobserver reproducibility variances (0.007 and 0.003, respectively), the Cronbach’s alpha (0.7), Kappa coefficient (0.88) and the agreement (95.2%) suggest accuracy, precision and reliability. For construct validity, exploratory factor analysis demonstrated unidimensionality of the first three questions (r = 0.76, r = 0.80, and r = 0.68). For criterion validity, the adapted BAASIS® was correlated with another self-report instrument, the Measure of Adherence to Treatment, and showed good congruence (r = 0.65). Conclusions The BAASIS® has adequate psychometric properties and may be employed in advance to measure adherence to posttransplant immunosuppressant treatments. This instrument will be the first one validated for use in this specific transplant population and in the Portuguese language. PMID:23692889
Buekenhout, Imke; Leitão, José; Gomes, Ana A
2018-05-24
Month ordering tasks have been used in experimental settings to obtain measures of working memory (WM) capacity in older/clinical groups based solely on their face validity. We sought to assess the appropriateness of using a month ordering task in other contexts, including clinical settings, as a psychometrically sound WM assessment. To this end, we constructed a month ordering task (ucMOT), studied its reliability (internal consistency and temporal stability), and gathered construct-related and criterion-related validity evidence for its use as a WM assessment. The ucMOT proved to be internally consistent and temporally stable, and analyses of the criterion-related validity evidence revealed that its scores predicted the efficiency of language comprehension processes known to depend crucially on WM resources, namely, processes involved in pronoun interpretation. Furthermore, all ucMOT items discriminated between younger and older age groups; the global scores were significantly correlated with scores on well-established WM tasks and presented lower correlations with instruments that evaluate different (although related) processes, namely, inhibition and processing speed. We conclude that the ucMOT possesses solid psychometric properties. Accordingly, we acquired normative data for the Portuguese population, which we present as a regression-based algorithm that yields z scores adjusted for age, gender, and years of formal education. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
Sajjad, Madiha; Khan, Rehan Ahmed; Yasmeen, Rahila
2018-01-01
To develop a tool to evaluate faculty perceptions of assessment quality in an undergraduate medical program. The Assessment Implementation Measure (AIM) tool was developed using a mixed-methods approach. A preliminary questionnaire developed through literature review was submitted to a panel of 10 medical education experts for a three-round 'Modified Delphi technique'. Panel agreement of > 75% was considered the criterion for inclusion of items in the questionnaire. Cognitive pre-testing of five faculty members was conducted. A pilot study was done with 30 randomly selected faculty members. Content validity index (CVI) was calculated for individual items (I-CVI) and composite scale (S-CVI). Cronbach's alpha was calculated to determine the internal consistency reliability of the tool. The final AIM tool had 30 items after the Delphi process. S-CVI was 0.98 with the S-CVI/Avg method and 0.86 by the S-CVI/UA method, suggesting good content validity. An I-CVI cut-off value of < 0.9 was taken as the criterion for item deletion. Cognitive pre-testing revealed good item interpretation. Cronbach's alpha calculated for the AIM was 0.9, whereas Cronbach's alpha for the four domains ranged from 0.67 to 0.80. 'AIM' is a relevant and useful instrument with good content validity and reliability of results, and may be used to evaluate the teachers' perceptions about assessment quality.
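The content validity indices named in this abstract can be sketched as follows, under the commonly used definitions: I-CVI is the share of experts rating an item as relevant, S-CVI/Avg is the mean of the I-CVIs, and S-CVI/UA is the share of items on which all experts agree (I-CVI = 1). The abstract does not state the rating scale, so the 4-point scale with ratings of 3 or 4 counted as "relevant" is an assumption, as are the function name `content_validity` and the toy ratings.

```python
from statistics import mean


def content_validity(ratings, relevant_min=3):
    """Return (I-CVI per item, S-CVI/Avg, S-CVI/UA).

    ratings: one list per item, holding one relevance rating per expert.
    relevant_min: lowest rating counted as "relevant" (assumed 3 on a 1-4 scale).
    """
    i_cvi = [sum(r >= relevant_min for r in item) / len(item) for item in ratings]
    s_cvi_avg = mean(i_cvi)                                # average of item-level indices
    s_cvi_ua = sum(v == 1.0 for v in i_cvi) / len(i_cvi)   # universal-agreement share
    return i_cvi, s_cvi_avg, s_cvi_ua


# Hypothetical ratings: 4 items, each rated by 4 experts.
example_ratings = [
    [4, 4, 3, 4],
    [4, 3, 3, 4],
    [4, 2, 3, 4],   # one expert rates this item as not relevant
    [3, 4, 4, 4],
]
i_cvi, s_cvi_avg, s_cvi_ua = content_validity(example_ratings)
print(i_cvi, s_cvi_avg, s_cvi_ua)
```

S-CVI/UA is always at most S-CVI/Avg, which is consistent with the lower UA figure (0.86 versus 0.98) reported in the abstract.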
Debast, Inge; Rossi, Gina; van Alphen, S P J
2018-04-01
The alternative model for personality disorders in the fifth edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5) is considered an important step toward a possibly better conceptualization of personality pathology in older adulthood, by the introduction of levels of personality functioning (Criterion A) and trait dimensions (Criterion B). Our main aim was to examine age-neutrality of the Short Form of the Severity Indices of Personality Problems (SIPP-SF; Criterion A) and Personality Inventory for DSM-5-Brief Form (PID-5-BF; Criterion B). Differential item functioning (DIF) analyses and more specifically the impact on scale level through differential test functioning (DTF) analyses made clear that the SIPP-SF was more age-neutral (6% DIF, only one of four domains showed DTF) than the PID-5-BF (25% DIF, all four tested domains had DTF) in a community sample of older and younger adults. Age differences in convergent validity also point in the direction of differences in underlying constructs. Concurrent and criterion validity in geriatric psychiatry inpatients suggest that both the SIPP-SF scales measuring levels of personality functioning (especially self-functioning) and the PID-5-BF might be useful screening measures in older adults despite age-neutrality not being confirmed.
Downer, Jason T.; Booren, Leslie M.; Lima, Olivia K.; Luckner, Amy E.; Pianta, Robert C.
2012-01-01
This paper introduces the Individualized Classroom Assessment Scoring System (inCLASS), an observation tool that targets children’s interactions in preschool classrooms with teachers, peers, and tasks. In particular, initial evidence is reported of the extent to which the inCLASS meets the following psychometric criteria: inter-rater reliability, normal distributions and adequate range, construct validity, and criterion-related validity. These initial findings suggest that the inCLASS has the potential to provide an authentic, contextualized assessment of young children’s classroom behaviors. Future directions for research with the inCLASS are discussed. PMID:23175598
ERIC Educational Resources Information Center
Lin, Keh-chung; Chen, Hui-fang; Chen, Chia-ling; Wang, Tien-ni; Wu, Ching-yi; Hsieh, Yu-wei; Wu, Li-ling
2012-01-01
This study examined criterion-related validity and clinimetric properties of the Pediatric Motor Activity Log (PMAL) in children with cerebral palsy. Study participants were 41 children (age range: 28-113 months) and their parents. Criterion-related validity was evaluated by the associations between the PMAL and criterion measures at baseline and…
ERIC Educational Resources Information Center
Swanson, Jennifer R.; Bradley-Johnson, Sharon; Johnson, C. Merle; O'Dell, Anna Rubenaker
2009-01-01
Three studies examine the validity of the Preschool Form of the Cognitive Abilities Scale--Second Edition (CAS-2). Significant high concurrent criterion-related validity correlations, corrected for restricted range, are found between the CAS-2 and the Detroit Test of Learning Ability--Primary: Third Edition for 26 three-year-olds (r[subscript c] =…
Simulated Driving Assessment (SDA) for Teen Drivers: Results from a Validation Study
McDonald, Catherine C.; Kandadai, Venk; Loeb, Helen; Seacrist, Thomas S.; Lee, Yi-Ching; Winston, Zachary; Winston, Flaura K.
2015-01-01
Background Driver error and inadequate skill are common critical reasons for novice teen driver crashes, yet few validated, standardized assessments of teen driving skills exist. The purpose of this study was to evaluate the construct and criterion validity of a newly developed Simulated Driving Assessment (SDA) for novice teen drivers. Methods The SDA's 35-minute simulated drive incorporates 22 variations of the most common teen driver crash configurations. Driving performance was compared for 21 inexperienced teens (age 16–17 years, provisional license ≤90 days) and 17 experienced adults (age 25–50 years, license ≥5 years, drove ≥100 miles per week, no collisions or moving violations ≤3 years). SDA driving performance (Error Score) was based on driving safety measures derived from simulator and eye-tracking data. Negative driving outcomes included simulated collisions or run-off-the-road incidents. A professional driving evaluator/instructor reviewed videos of SDA performance (DEI Score). Results The SDA demonstrated construct validity: 1.) Teens had a higher Error Score than adults (30 vs. 13, p=0.02); 2.) For each additional error committed, the relative risk of a participant's propensity for a simulated negative driving outcome increased by 8% (95% CI: 1.05–1.10, p<0.01). The SDA demonstrated criterion validity: Error Score was correlated with DEI Score (r=−0.66, p<0.001). Conclusions This study supports the concept of validated simulated driving tests like the SDA to assess novice driver skill in complex and hazardous driving scenarios. The SDA, as a standard protocol to evaluate teen driver performance, has the potential to facilitate screening and assessment of teen driving readiness and could be used to guide targeted skill training. PMID:25740939
van Bokhorst-de van der Schueren, Marian A E; Guaitoli, Patrícia Realino; Jansma, Elise P; de Vet, Henrica C W
2014-02-01
Numerous nutrition screening tools for the hospital setting have been developed. The aim of this systematic review is to study construct or criterion validity and predictive validity of nutrition screening tools for the general hospital setting. A systematic review of English, French, German, Spanish, Portuguese and Dutch articles identified via MEDLINE, Cinahl and EMBASE (from inception to the 2nd of February 2012). Additional studies were identified by checking reference lists of identified manuscripts. Search terms included key words for malnutrition, screening or assessment instruments, and terms for hospital setting and adults. Data were extracted independently by 2 authors. Only studies expressing the (construct, criterion or predictive) validity of a tool were included. 83 studies (32 screening tools) were identified: 42 studies on construct or criterion validity versus a reference method and 51 studies on predictive validity on outcome (i.e. length of stay, mortality or complications). None of the tools performed consistently well to establish the patients' nutritional status. For the elderly, MNA performed fair to good, for the adults MUST performed fair to good. SGA, NRS-2002 and MUST performed well in predicting outcome in approximately half of the studies reviewed in adults, but not in older patients. Not one single screening or assessment tool is capable of adequate nutrition screening as well as predicting poor nutrition related outcome. Development of new tools seems redundant and will most probably not lead to new insights. New studies comparing different tools within one patient population are required. Copyright © 2013 Elsevier Ltd and European Society for Clinical Nutrition and Metabolism. All rights reserved.
Gromisch, Elizabeth S; Zemon, Vance; Holtzer, Roee; Chiaravalloti, Nancy D; DeLuca, John; Beier, Meghan; Farrell, Eileen; Snyder, Stacey; Schairer, Laura C; Glukhovsky, Lisa; Botvinick, Jason; Sloan, Jessica; Picone, Mary Ann; Kim, Sonya; Foley, Frederick W
2016-10-01
Cognitive dysfunction is prevalent in multiple sclerosis. As self-reported cognitive functioning is unreliable, brief objective screening measures are needed. Utilizing widely used full-length neuropsychological tests, this study aimed to establish the criterion validity of highly abbreviated versions of the Brief Visuospatial Memory Test - Revised (BVMT-R), Symbol Digit Modalities Test (SDMT), Delis-Kaplan Executive Function System (D-KEFS) Sorting Test, and Controlled Oral Word Association Test (COWAT) in order to begin developing an MS-specific screening battery. Participants from Holy Name Medical Center and the Kessler Foundation were administered one or more of these four measures. Using test-specific criteria to identify impairment at both -1.5 and -2.0 SD, receiver-operating-characteristic (ROC) analyses of BVMT-R Trial 1, Trial 2, and Trial 1 + 2 raw data (N = 286) were run to calculate the classification accuracy of the abbreviated version, as well as the sensitivity and specificity. The same methods were used for SDMT 30-s and 60-s (N = 321), D-KEFS Sorting Free Card Sort 1 (N = 120), and COWAT letters F and A (N = 298). Using these definitions of impairment, each analysis yielded high classification accuracy (89.3 to 94.3%). BVMT-R Trial 1, SDMT 30-s, D-KEFS Free Card Sort 1, and COWAT F possess good criterion validity in detecting impairment on their respective overall measure, capturing much of the same information as the full version. Along with the first two trials of the California Verbal Learning Test - Second Edition (CVLT-II), these five highly abbreviated measures may be used to develop a brief screening battery.
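The classification accuracy, sensitivity and specificity mentioned in this abstract can be illustrated at a single cut-off as below. This is only a sketch of the underlying arithmetic, not the study's ROC procedure: the function name `screening_metrics` and the eight-participant example are hypothetical, with the full-length test's impairment label treated as the criterion and the abbreviated test's flag as the prediction.

```python
def screening_metrics(impaired_full, flagged_short):
    """Compare an abbreviated screen against the full-length criterion.

    impaired_full: booleans, True if the full test classifies the person as impaired.
    flagged_short: booleans, True if the abbreviated test flags the person.
    """
    pairs = list(zip(impaired_full, flagged_short))
    tp = sum(f and s for f, s in pairs)            # impaired and flagged
    tn = sum(not f and not s for f, s in pairs)    # unimpaired and not flagged
    fp = sum(not f and s for f, s in pairs)        # unimpaired but flagged
    fn = sum(f and not s for f, s in pairs)        # impaired but missed
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(pairs),
    }


# Hypothetical data for 8 participants.
full = [True, True, True, False, False, False, False, False]
short = [True, True, False, False, False, False, True, False]
metrics = screening_metrics(full, short)
print(metrics)
```

A ROC analysis repeats this calculation across every candidate cut-off on the abbreviated score and plots sensitivity against 1 − specificity.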
[Evaluation of Suicide Risk Levels in Hospitals: Validity and Reliability Tests].
Macagnino, Sandro; Steinert, Tilman; Uhlmann, Carmen
2018-05-01
Examination of in-hospital suicide risk levels concerning their validity and their reliability. The internal suicide risk levels were evaluated in a cross-sectional study of 163 inpatients. A reliability check was performed by determining the interrater reliability of the senior physician, therapist and the responsible nurse. Within the scope of the validity check, we conducted analyses of criterion validity and construct validity. For the total sample, an "acceptable" to "good" interrater reliability (Kendall's W = .77) of suicide risk levels was obtained. Schizophrenic disorders showed the lowest values; for personality disorders we found the highest level of interrater reliability. When examining criterion validity, Item-9 of the BDI-II was substantially correlated with our suicide risk levels (ρ m = .54, p < .01). Within the scope of the construct validity check, affective disorders showed the highest correlation (ρ = .77), compatible also with "convergent validity". They differed from schizophrenic disorders, which showed the least concordance (ρ = .43). In-hospital suicide risk levels may represent an important contribution to the assessment of suicidal behavior of inpatients undergoing psychiatric treatment, owing to their overall good validity and reliability. © Georg Thieme Verlag KG Stuttgart · New York.
Clinical validity of prototype personality disorder ratings in adolescents.
Defife, Jared A; Haggerty, Greg; Smith, Scott W; Betancourt, Luis; Ahmed, Zain; Ditkowsky, Keith
2015-01-01
A growing body of research shows that personality pathology in adolescents is clinically distinctive and frequently stable into adulthood. A reliable and useful method for rating personality pathology in adolescent patients has the potential to enhance conceptualization, dissemination, and treatment effectiveness. The aim of this study is to examine the clinical validity of a prototype matching approach (derived from the Shedler Westen Assessment Procedure-Adolescent Version) for quantifying personality pathology in an adolescent inpatient sample. Sixty-six adolescent inpatients and their parents or legal guardians completed forms of the Child Behavior Checklist (CBCL) assessing emotional and behavioral problems. Clinical criterion variables including suicide history, substance use, and fights with peers were also assessed. Patients' individual and group therapists on the inpatient unit completed personality prototype ratings. Prototype diagnoses demonstrated substantial reliability (median intraclass correlation coefficient =.75) across independent ratings from individual and group therapists. Personality prototype ratings correlated with the CBCL scales and clinical criterion variables in anticipated and meaningful ways. As seen in prior research with adult samples, prototype personality ratings show clinical validity across independent clinician raters previously unfamiliar with the approach, and they are meaningfully related to clinical symptoms, behavioral problems, and adaptive functioning.
Clinical Validity of Prototype Personality Disorder Ratings in Adolescents
DeFife, Jared A.; Haggerty, Greg; Smith, Scott W.; Betancourt, Luis; Ahmed, Zain; Ditkowsky, Keith
2015-01-01
A growing body of research shows that personality pathology in adolescents is clinically distinctive and frequently stable into adulthood. A reliable and useful method for rating personality pathology in adolescent patients has the potential to enhance conceptualization, dissemination, and treatment effectiveness. The aim of this study is to examine the clinical validity of a prototype matching approach (derived from the Shedler Westen Assessment Procedure – Adolescent Version) for quantifying personality pathology in an adolescent inpatient sample. Sixty-six adolescent inpatients and their parents or legal guardians completed forms of the Child Behavior Checklist (CBCL) assessing emotional and behavioral problems. Clinical criterion variables including suicide history, substance use, and fights with peers were also assessed. Patients’ individual and group therapists on the inpatient unit completed personality prototype ratings. Prototype diagnoses demonstrated substantial reliability (median ICC = .75) across independent ratings from individual and group therapists. Personality prototype ratings correlated with the CBCL scales and clinical criterion variables in anticipated and meaningful ways. As seen in prior research with adult samples, prototype personality ratings show clinical validity across independent clinician raters previously unfamiliar with the approach, and they are meaningfully related to clinical symptoms, behavioral problems, and adaptive functioning. PMID:25457971
2017-01-01
Objective To perform a translation and cross-cultural adaptation of the Cardiac Rehabilitation Barriers Scale (CRBS) for use in Korea, followed by psychometric validation. The CRBS was developed to assess patients' perception of the degree to which patient, provider and health system-level barriers affect their cardiac rehabilitation (CR) participation. Methods The CRBS consists of 21 items (barriers to adherence) rated on a 5-point Likert scale. The first phase was to translate and cross-culturally adapt the CRBS to the Korean language. After back-translation, both versions were reviewed by a committee. The face validity was assessed through semi-structured interviews in a sample of Korean patients (n=53) with a history of acute myocardial infarction who did not participate in CR. The second phase was to assess the construct and criterion validity of the Korean translation as well as internal reliability, through administration of the translated version in 104 patients, principal component analysis with varimax rotation and cross-referencing against CR use, respectively. Results The length, readability, and clarity of the questionnaire were rated well, demonstrating face validity. Analysis revealed a six-factor solution, demonstrating construct validity. Cronbach's alpha was greater than 0.65. Barriers rated highest included not knowing about CR and not being contacted by a program. The mean CRBS score was significantly higher among non-attendees (2.71±0.26) than CR attendees (2.51±0.18) (p<0.01). Conclusion The Korean version of CRBS has demonstrated face, content and criterion validity, suggesting it may be useful for assessing barriers to CR utilization in Korea. PMID:29201826
Examining the validity of self-reports on scales measuring students' strategic processing.
Samuelstuen, Marit S; Bråten, Ivar
2007-06-01
Self-report inventories trying to measure strategic processing at a global level have been much used in both basic and applied research. However, the validity of global strategy scores is open to question because such inventories assess strategy perceptions outside the context of specific task performance. The primary aim was to examine the criterion-related and construct validity of the global strategy data obtained with the Cross-Curricular Competencies (CCC) scale. Additionally, we wanted to compare the validity of these data with the validity of data obtained with a task-specific self-report inventory focusing on the same types of strategies. The sample included 269 10th-grade students from 12 different junior high schools. Global strategy use as assessed with the CCC was compared with task-specific strategy use reported in three different reading situations. Moreover, relationships between scores on the CCC and scores on measures of text comprehension were examined and compared with relationships between scores on the task-specific strategy measure and the same comprehension measures. The comparison between the CCC strategy scores and the task-specific strategy scores suggested only modest criterion-related validity for the data obtained with the global strategy inventory. The CCC strategy scores were also not related to the text comprehension measures, indicating poor construct validity. In contrast, the task-specific strategy scores were positively related to the comprehension measures, indicating good construct validity. Attempts to measure strategic processing at a global level seem to have limited validity and utility.
Helmerhorst, Hendrik J F; Brage, Søren; Warren, Janet; Besson, Herve; Ekelund, Ulf
2012-08-31
Physical inactivity is one of the four leading risk factors for global mortality. Accurate measurement of physical activity (PA) and in particular by physical activity questionnaires (PAQs) remains a challenge. The aim of this paper is to provide an updated systematic review of the reliability and validity characteristics of existing and more recently developed PAQs and to quantitatively compare the performance between existing and newly developed PAQs. A literature search of electronic databases was performed for studies assessing reliability and validity data of PAQs using an objective criterion measurement of PA between January 1997 and December 2011. Articles meeting the inclusion criteria were screened and data were extracted to provide a systematic overview of measurement properties. Due to differences in reported outcomes and criterion methods a quantitative meta-analysis was not possible. In total, 31 studies testing 34 newly developed PAQs, and 65 studies examining 96 existing PAQs were included. Very few PAQs showed good results on both reliability and validity. Median reliability correlation coefficients were 0.62-0.71 for existing, and 0.74-0.76 for new PAQs. Median validity coefficients ranged from 0.30-0.39 for existing, and from 0.25-0.41 for new PAQs. Although the majority of PAQs appear to have acceptable reliability, the validity is moderate at best. Newly developed PAQs do not appear to perform substantially better than existing PAQs in terms of reliability and validity. Future PAQ studies should include measures of absolute validity and the error structure of the instrument.
Validity and Reliability of the Upper Extremity Work Demands Scale.
Jacobs, Nora W; Berduszek, Redmar J; Dijkstra, Pieter U; van der Sluis, Corry K
2017-12-01
Purpose To evaluate validity and reliability of the upper extremity work demands (UEWD) scale. Methods Participants from different levels of physical work demands, based on the Dictionary of Occupational Titles categories, were included. A historical database of 74 workers was added for factor analysis. Criterion validity was evaluated by comparing observed and self-reported UEWD scores. To assess structural validity, a factor analysis was executed. For reliability, the difference between two self-reported UEWD scores, the smallest detectable change (SDC), test-retest reliability and internal consistency were determined. Results Fifty-four participants were observed at work and 51 of them filled in the UEWD twice with a mean interval of 16.6 days (SD 3.3, range = 10-25 days). Criterion validity of the UEWD scale was moderate (r = .44, p = .001). Factor analysis revealed that 'force and posture' and 'repetition' subscales could be distinguished with Cronbach's alpha of .79 and .84, respectively. Reliability was good; there was no significant difference between repeated measurements. An SDC of 5.0 was found. Test-retest reliability was good (intraclass correlation coefficient for agreement = .84) and all item-total correlations were >.30. There were two pairs of highly related items. Conclusion Reliability of the UEWD scale was good, but criterion validity was moderate. Based on current results, a modified UEWD scale (2 items removed, 1 item reworded, divided into 2 subscales) was proposed. Since observation appeared to be an inappropriate gold standard, we advise to investigate other types of validity, such as construct validity, in further research.
2012-01-01
Physical inactivity is one of the four leading risk factors for global mortality. Accurate measurement of physical activity (PA) and in particular by physical activity questionnaires (PAQs) remains a challenge. The aim of this paper is to provide an updated systematic review of the reliability and validity characteristics of existing and more recently developed PAQs and to quantitatively compare the performance between existing and newly developed PAQs. A literature search of electronic databases was performed for studies assessing reliability and validity data of PAQs using an objective criterion measurement of PA between January 1997 and December 2011. Articles meeting the inclusion criteria were screened and data were extracted to provide a systematic overview of measurement properties. Due to differences in reported outcomes and criterion methods a quantitative meta-analysis was not possible. In total, 31 studies testing 34 newly developed PAQs, and 65 studies examining 96 existing PAQs were included. Very few PAQs showed good results on both reliability and validity. Median reliability correlation coefficients were 0.62–0.71 for existing, and 0.74–0.76 for new PAQs. Median validity coefficients ranged from 0.30–0.39 for existing, and from 0.25–0.41 for new PAQs. Although the majority of PAQs appear to have acceptable reliability, the validity is moderate at best. Newly developed PAQs do not appear to perform substantially better than existing PAQs in terms of reliability and validity. Future PAQ studies should include measures of absolute validity and the error structure of the instrument. PMID:22938557
Persoskie, Alexander; Nguyen, Anh B.; Kaufman, Annette R.; Tworek, Cindy
2017-01-01
Beliefs about the relative harmfulness of one product compared to another (perceived relative harm) are central to research and regulation concerning tobacco and nicotine-containing products, but techniques for measuring such beliefs vary widely. We compared the validity of direct and indirect measures of perceived harm of e-cigarettes and smokeless tobacco (SLT) compared to cigarettes. On direct measures, participants explicitly compare the harmfulness of each product. On indirect measures, participants rate the harmfulness of each product separately, and ratings are compared. The U.S. Health Information National Trends Survey (HINTS-FDA-2015; N=3738) included direct measures of perceived harm of e-cigarettes and SLT compared to cigarettes. Indirect measures were created by comparing ratings of harm from e-cigarettes, SLT, and cigarettes on 3-point scales. Logistic regressions tested validity by assessing whether direct and indirect measures were associated with criterion variables including: ever-trying e-cigarettes, ever-trying snus, and SLT use status. Compared to the indirect measures, the direct measures of harm were more consistently associated with criterion variables. On direct measures, 26% of adults rated e-cigarettes as less harmful than cigarettes, and 11% rated SLT as less harmful than cigarettes. Direct measures appear to provide valid information about individuals’ harm beliefs, which may be used to inform research and tobacco control policy. Further validation research is encouraged. PMID:28073035
Stinchfield, Randy; McCready, John; Turner, Nigel E; Jimenez-Murcia, Susana; Petry, Nancy M; Grant, Jon; Welte, John; Chapman, Heather; Winters, Ken C
2016-09-01
The DSM-5 was published in 2013 and it included two substantive revisions for gambling disorder (GD). These changes are the reduction in the threshold from five to four criteria and elimination of the illegal activities criterion. The purpose of this study was twofold: first, to assess the reliability, validity and classification accuracy of the DSM-5 diagnostic criteria for GD; second, to compare the DSM-5 with the DSM-IV on reliability, validity, and classification accuracy, including an examination of the effect of the elimination of the illegal acts criterion on diagnostic accuracy. To compare DSM-5 and DSM-IV, eight datasets from three different countries (Canada, USA, and Spain; total N = 3247) were used. All datasets were based on similar research methods. Participants were recruited from outpatient gambling treatment services to represent the group with a GD and from the community to represent the group without a GD. All participants were administered a standardized measure of diagnostic criteria. The DSM-5 yielded satisfactory reliability, validity and classification accuracy. In comparing the DSM-5 to the DSM-IV, most comparisons of reliability, validity and classification accuracy showed more similarities than differences. There was evidence of modest improvements in classification accuracy for DSM-5 over DSM-IV, particularly in reduction of false negative errors. This reduction in false negative errors was largely a function of lowering the cut score from five to four and this revision is an improvement over DSM-IV. From a statistical standpoint, eliminating the illegal acts criterion did not make a significant impact on diagnostic accuracy. From a clinical standpoint, illegal acts can still be addressed in the context of the DSM-5 criterion of lying to others.
Davies, Kylie; Bulsara, Max K; Ramelet, Anne-Sylvie; Monterosso, Leanne
2018-05-01
To establish criterion-related construct validity and test-retest reliability for the Endotracheal Suction Assessment Tool© (ESAT©). Endotracheal tube suction performed in children can significantly affect clinical stability. Previously identified clinical indicators for endotracheal tube suction were used as criteria when designing the ESAT©. Content validity was reported previously. The final stages of psychometric testing are presented. Observational testing was used to measure construct validity and determine whether the ESAT© could guide "inexperienced" paediatric intensive care nurses' decision-making regarding endotracheal tube suction. Test-retest reliability of the ESAT© was performed at two time points. The researchers and paediatric intensive care nurse "experts" developed 10 hypothetical clinical scenarios with predetermined endotracheal tube suction outcomes. "Experienced" (n = 12) and "inexperienced" (n = 14) paediatric intensive care nurses were presented with the scenarios and the ESAT© guiding decision-making about whether to perform endotracheal tube suction for each scenario. Outcomes were compared with those predetermined by the "experts" (n = 9). Test-retest reliability of the ESAT© was measured at two consecutive time points (4 weeks apart) with "experienced" and "inexperienced" paediatric intensive care nurses using the same scenarios and tool to guide decision-making. No differences were observed between endotracheal tube suction decisions made by "experts" (n = 9), "inexperienced" (n = 14) and "experienced" (n = 12) nurses confirming the tool's construct validity. No differences were observed between groups for endotracheal tube suction decisions at T1 and T2. Criterion-related construct validity and test-retest reliability of the ESAT© were demonstrated. Further testing is recommended to confirm reliability in the clinical setting with the "inexperienced" nurse to guide decision-making related to endotracheal tube suction. 
The ESAT© is the first validated tool to systematically guide endotracheal nursing practice for the "inexperienced" nurse. © 2018 John Wiley & Sons Ltd.
Validity, sensitivity and specificity of the mentation, behavior and mood subscale of the UPDRS.
Holroyd, Suzanne; Currie, Lillian J; Wooten, G Frederick
2008-06-01
The unified Parkinson's disease rating scale (UPDRS) is the most widely used tool to rate the severity and the stage of Parkinson's disease (PD). However, the mentation, behavior and mood (MBM) subscale of the UPDRS has received little investigation regarding its validity and sensitivity. Three items of this subscale were compared to criterion tests to examine validity, sensitivity and specificity. Ninety-seven patients with idiopathic PD were assessed on the UPDRS. Scores on three items of the MBM subscale, intellectual impairment, thought disorder and depression, were compared to criterion tests, the telephone interview for cognition status (TICS), psychiatric assessment for psychosis and the geriatric depression scale (GDS). Non-parametric tests of association were performed to examine concurrent validity of the MBM items. The sensitivities, specificities and optimal cutoff scores for each MBM item were estimated by receiver operating characteristic (ROC) curve analysis. The MBM items demonstrated low to moderate correlation with the criterion tests, and the sensitivity and specificity were not strong. Even using a score of 7.0 on the items of the MBM demonstrated a sensitivity/specificity of only 0.19/0.48 for intellectual impairment, 0.60/0.72 for thought disorder and 0.61/0.87 for depression. Using a more appropriate cutoff of 2.0 revealed sensitivities of 0.01, 0.38 and 0.13 respectively. The MBM subscale items of intellectual impairment, thought disorder and depression are not appropriate for screening or diagnostic purposes. Tools such as the TICS and the GDS should be considered instead.
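The sensitivity and specificity figures reported at a given cutoff can be illustrated with a minimal sketch. The item scores and criterion diagnoses below are hypothetical, and the rule "score at or above the cutoff counts as a positive screen" is an assumption made for illustration, not necessarily the scoring rule used in the study.

```python
def sensitivity_specificity(scores, has_condition, cutoff):
    """Return (sensitivity, specificity), treating score >= cutoff as a positive screen."""
    tp = fn = tn = fp = 0
    for score, positive in zip(scores, has_condition):
        flagged = score >= cutoff
        if positive and flagged:
            tp += 1  # true positive: has the condition and is flagged
        elif positive:
            fn += 1  # false negative: has the condition but is missed
        elif flagged:
            fp += 1  # false positive: flagged without the condition
        else:
            tn += 1  # true negative
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical item scores (0-4) and criterion-test diagnoses, for illustration only.
item_scores = [0, 1, 2, 3, 4, 0, 1, 2, 0, 1]
criterion = [False, False, True, True, True, False, True, False, False, False]

sens, spec = sensitivity_specificity(item_scores, criterion, cutoff=2)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Raising the cutoff generally trades sensitivity for specificity, which is why a ROC analysis examines every candidate cutoff rather than a single one.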
Lundin, Andreas; Hallgren, Mats; Balliu, Natalja; Forsell, Yvonne
2015-01-01
The alcohol use disorders identification test (AUDIT) and AUDIT-Consumption (AUDIT-C) are commonly used in population surveys but there are few validation studies in the general population. Validity should be estimated in samples close to the targeted population and setting. This study aims to validate AUDIT and AUDIT-C in a general population sample (PART) in Stockholm, Sweden. We used a general population subsample age 20 to 64 that answered a postal questionnaire including AUDIT and later participated in a psychiatric interview (n = 1,093). Interviews using Schedules for Clinical Assessment in Neuropsychiatry were used as the criterion standard. Diagnoses were set according to the fourth version of the Diagnostic and Statistical Manual of Mental Disorders (DSM-IV). Agreement between the diagnostic test and criterion standard was measured with area under the receiver operating characteristic curve (AUC). A total of 1,086 (450 men and 636 women) of the interview participants completed AUDIT. There were 96 individuals with DSM-IV alcohol dependence, 36 with DSM-IV alcohol abuse, and 153 risk drinkers. AUCs were for DSM-IV-alcohol use disorder 0.90 (AUDIT-C 0.85); DSM-IV-dependence 0.94 (AUDIT-C 0.89); risk drinking 0.80 (AUDIT-C 0.80); and any criterion 0.87 (AUDIT-C 0.84). In this general population sample, AUDIT and AUDIT-C showed excellent to outstanding performance in identifying dependence, alcohol use disorder, risk drinking, and any criterion. Copyright © 2015 by the Research Society on Alcoholism.
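The AUC used as the agreement measure can be read as the probability that a randomly chosen case scores higher on the screening test than a randomly chosen non-case. A minimal sketch of that pairwise definition follows; the scores are hypothetical and the function name `auc` is introduced only for illustration.

```python
def auc(scores_cases, scores_noncases):
    """Probability that a random case outscores a random non-case (ties count half)."""
    wins = 0.0
    for c in scores_cases:
        for n in scores_noncases:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(scores_cases) * len(scores_noncases))


# Hypothetical screening scores, for illustration only.
cases = [12, 15, 9, 20]       # respondents meeting the criterion standard
noncases = [3, 7, 9, 5, 11]   # respondents not meeting it

result = auc(cases, noncases)
print(f"AUC = {result:.3f}")
```

This pairwise form is equivalent to the area under the empirical ROC curve; it is quadratic in sample size, so a rank-based routine from a statistics library would be the usual choice for large samples.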
Kong, Feng; You, Xuqun; Zhao, Jingjing
2017-01-01
The Gratitude Questionnaire (GQ; McCullough et al., 2002) is one of the most widely used instruments to assess dispositional gratitude. The purpose of this study was to validate a Chinese version of the GQ by examining internal consistency, factor structure, convergent validity, and measurement invariance across sex. A total of 1151 Chinese adults were recruited to complete the GQ, Positive Affect and Negative Affect Scales, and Satisfaction with Life Scale. Confirmatory factor analysis indicated that the original unidimensional model fitted well, which is in accordance with the findings in Western populations. Furthermore, the GQ had satisfactory composite reliability and criterion-related validity with measures of life satisfaction and affective well-being. Evidence of configural, metric and scalar invariance across sex was obtained. Tests of the latent mean differences found females had higher latent mean scores than males. These findings suggest that the Chinese version of GQ is a reliable and valid tool for measuring dispositional gratitude and can generally be utilized across sex in the Chinese context. PMID:28919873
Evidence for Response Bias as a Source of Error Variance in Applied Assessment
ERIC Educational Resources Information Center
McGrath, Robert E.; Mitchell, Matthew; Kim, Brian H.; Hough, Leaetta
2010-01-01
After 100 years of discussion, response bias remains a controversial topic in psychological measurement. The use of bias indicators in applied assessment is predicated on the assumptions that (a) response bias suppresses or moderates the criterion-related validity of substantive psychological indicators and (b) bias indicators are capable of…
Criterion-Related Validity of Measuring Sight-Word Acquisition with Curriculum-Based Assessment
ERIC Educational Resources Information Center
Burns, Matthew K.; Mosack, Jill L.
2005-01-01
Curriculum-Based Assessment for Instructional Design (CBA-ID) provides data used to ensure an appropriately challenging learning task. One aspect of appropriate challenge measured by CBA-ID, called the acquisition rate (AR), involves the amount of new information a student could acquire and retain during initial learning. Previous research…
Oral Reading Fluency Assessment: Issues of Construct, Criterion, and Consequential Validity
ERIC Educational Resources Information Center
Valencia, Sheila W.; Smith, Antony T.; Reece, Anne M.; Li, Min; Wixson, Karen K.; Newman, Heather
2010-01-01
This study investigated multiple models for assessing oral reading fluency, including 1-minute oral reading measures that produce scores reported as words correct per minute (wcpm). We compared a measure of wcpm with measures of the individual and combined indicators of oral reading fluency (rate, accuracy, prosody, and comprehension) to examine…
Onwujekwe, Obinna
2004-02-01
Contingent valuation question formats that will be used to elicit willingness to pay for goods and services need to be relevant to the area they will be used in order for responses to be valid. A novel contingent valuation question format called the "structured haggling technique" (SH) that resembles the bargaining system in Nigerian markets was designed and its criterion and content validity compared with those of the bidding game (BG) and binary-with-follow-up (BWFU) technique. This was achieved by determining the willingness to pay (WTP) for insecticide-treated nets (ITNs) in Southeast Nigeria. Content validity was determined through observation of actual trading of untreated nets together with interviews with sellers and consumers. Criterion validity was determined by comparing stated and actual WTP. Stated WTP was determined using a questionnaire administered to 810 household heads and actual WTP was determined by offering the nets for sale to all respondents one month later. The phi (correlation) coefficient was used to compare criterion validity across question formats. The phi coefficients were SH (0.60: 95% C.I. 0.50-0.71), BG (0.42: 95% C.I. 0.29-0.54) and the BWFU (0.32: 95% C.I. 0.20-0.44), implying that the BG and SH had similar levels of criterion-validity while the BWFU was the least criterion-valid. However, the SH was the most content-valid. It is necessary to validate the findings in other areas where haggling is common. Future studies should establish the content validity of question formats in the contexts in which they will be used before administering questionnaires.
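The phi coefficient used to compare stated and actual willingness to pay is the correlation between two binary variables, computed from a 2x2 table. A minimal sketch, with hypothetical counts and an assumed cell layout (rows: stated WTP yes/no; columns: actually bought yes/no):

```python
import math


def phi_coefficient(a, b, c, d):
    """Phi for the 2x2 table [[a, b], [c, d]]."""
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return (a * d - b * c) / denominator


# Hypothetical counts, for illustration only:
# a = stated yes and bought, b = stated yes and did not buy,
# c = stated no and bought,  d = stated no and did not buy.
phi = phi_coefficient(30, 20, 10, 40)
print(f"phi = {phi:.3f}")
```

A value near 1 means stated and actual purchase decisions almost always coincide; a value near 0 means the stated response carries little information about actual behavior.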
ERIC Educational Resources Information Center
Rausch, Erica; Racz, Sarah J.; Augenstein, Tara M.; Keeley, Lauren; Lipton, Melanie F.; Szollos, Sebastian; Riffle, James; Moriarity, Daniel; Kromash, Rachelle; De Los Reyes, Andres
2017-01-01
Background: Among adolescents, depressive symptoms commonly co-occur with social anxiety, with social anxiety often developmentally preceding depressive symptoms. Thus, evidence-based assessments of adolescent social anxiety should be augmented with assessments of depressive symptoms using measures that can be administered across developmental…
2011-01-01
Background Since stress is hypothesized to play a role in the etiology of obesity during adolescence, research on associations between adolescent stress and obesity-related parameters and behaviours is essential. Due to lack of a well-established recent stress checklist for use in European adolescents, the study investigated the reliability and validity of the Adolescent Stress Questionnaire (ASQ) for assessing perceived stress in European adolescents. Methods The ASQ was translated into the languages of the participating cities (Ghent, Stockholm, Vienna, Zaragoza, Pecs and Athens) and was implemented within the HELENA cross-sectional study. A total of 1140 European adolescents provided a valid ASQ, comprising 10 component scales, used for internal reliability (Cronbach α) and construct validity (confirmatory factor analysis or CFA). Contributions of socio-demographic (gender, age, pubertal stage, socio-economic status) characteristics to the ASQ score variances were investigated. Two-hundred adolescents also provided valid saliva samples for cortisol analysis to compare with the ASQ scores (criterion validity). Test-retest reliability was investigated using two ASQ assessments from 37 adolescents. Results Cronbach α-values of the ASQ scales (0.57 to 0.88) demonstrated a moderate internal reliability of the ASQ, and intraclass correlation coefficients (0.45 to 0.84) established an insufficient test-retest reliability of the ASQ. The adolescents' gender (girls had higher stress scores than boys) and pubertal stage (those in a post-pubertal development had higher stress scores than others) significantly contributed to the variance in ASQ scores, while their age and socio-economic status did not. CFA results showed that the original scale construct fitted moderately with the data in our European adolescent population. 
Only in boys, four out of 10 ASQ scale scores were a significant positive predictor for baseline wake-up salivary cortisol, suggesting a rather poor criterion validity of the ASQ, especially in girls. Conclusions In our European adolescent sample, the ASQ had an acceptable internal reliability and construct validity and the adolescents' gender and pubertal stage systematically contributed to the ASQ variance, but its test-retest reliability and criterion validity were rather poor. Overall, the utility of the ASQ for assessing perceived stress in adolescents across Europe is uncertain and some aspects require further examination. PMID:21943341
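Cronbach's α, the internal reliability statistic reported for each ASQ scale, compares the sum of the item variances with the variance of the total score. A minimal sketch with hypothetical responses (four respondents, three items) follows; it is not the study's data or software.

```python
from statistics import pvariance


def cronbach_alpha(item_scores):
    """item_scores: one list per respondent, each holding that respondent's k item scores."""
    k = len(item_scores[0])
    # Variance of each item across respondents.
    item_variances = [pvariance([row[i] for row in item_scores]) for i in range(k)]
    # Variance of each respondent's total score.
    total_variance = pvariance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses, for illustration only.
responses = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [5, 4, 4],
]

alpha = cronbach_alpha(responses)
print(f"Cronbach's alpha = {alpha:.3f}")
```

Population variance is used for both the items and the total; using sample variance throughout gives the same α because the correction factor cancels in the ratio.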
Benjamin, Sara E; Neelon, Brian; Ball, Sarah C; Bangdiwala, Shrikant I; Ammerman, Alice S; Ward, Dianne S
2007-01-01
Background Few assessment instruments have examined the nutrition and physical activity environments in child care, and none are self-administered. Given the emerging focus on child care settings as a target for intervention, a valid and reliable measure of the nutrition and physical activity environment is needed. Methods To measure inter-rater reliability, 59 child care center directors and 109 staff completed the self-assessment concurrently, but independently. Three weeks later, a repeat self-assessment was completed by a sub-sample of 38 directors to assess test-retest reliability. To assess criterion validity, a researcher-administered environmental assessment was conducted at 69 centers and was compared to a self-assessment completed by the director. A weighted kappa test statistic and percent agreement were calculated to assess agreement for each question on the self-assessment. Results For inter-rater reliability, kappa statistics ranged from 0.20 to 1.00 across all questions. Test-retest reliability of the self-assessment yielded kappa statistics that ranged from 0.07 to 1.00. The inter-quartile kappa statistic ranges for inter-rater and test-retest reliability were 0.45 to 0.63 and 0.27 to 0.45, respectively. When percent agreement was calculated, questions ranged from 52.6% to 100% for inter-rater reliability and 34.3% to 100% for test-retest reliability. Kappa statistics for validity ranged from -0.01 to 0.79, with an inter-quartile range of 0.08 to 0.34. Percent agreement for validity ranged from 12.9% to 93.7%. Conclusion This study provides estimates of criterion validity, inter-rater reliability and test-retest reliability for an environmental nutrition and physical activity self-assessment instrument for child care. Results indicate that the self-assessment is a stable and reasonably accurate instrument for use with child care interventions. 
We therefore recommend the Nutrition and Physical Activity Self-Assessment for Child Care (NAP SACC) instrument to researchers and practitioners interested in conducting healthy weight intervention in child care. However, a more robust, less subjective measure would be more appropriate for researchers seeking an outcome measure to assess intervention impact. PMID:17615078
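Percent agreement and weighted kappa for a single question can be sketched as follows. The abstract does not state which weighting scheme was used, so linear weights are an assumption here, and the ratings are hypothetical.

```python
def percent_agreement(rater1, rater2):
    """Share of items on which both raters gave the same answer, in percent."""
    return 100.0 * sum(a == b for a, b in zip(rater1, rater2)) / len(rater1)


def linear_weighted_kappa(rater1, rater2, categories):
    """Cohen's kappa with linear disagreement weights |i - j| / (k - 1)."""
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(rater1)
    # Joint proportions of (rater1 category, rater2 category).
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater1, rater2):
        observed[index[a]][index[b]] += 1 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    observed_disagreement = expected_disagreement = 0.0
    for i in range(k):
        for j in range(k):
            weight = abs(i - j) / (k - 1)
            observed_disagreement += weight * observed[i][j]
            expected_disagreement += weight * row[i] * col[j]
    return 1.0 - observed_disagreement / expected_disagreement


# Hypothetical answers to one 4-option question at eight centers, for illustration only.
director = [1, 2, 3, 4, 1, 2, 3, 4]
researcher = [1, 2, 3, 4, 2, 2, 3, 3]

agreement = percent_agreement(director, researcher)
kappa = linear_weighted_kappa(director, researcher, categories=[1, 2, 3, 4])
print(f"agreement = {agreement:.1f}%, weighted kappa = {kappa:.3f}")
```

Kappa corrects agreement for chance, which is why a question can show high percent agreement but a low kappa when nearly all respondents choose the same option.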
The psychometric properties of the Portuguese version of the Personality Inventory for DSM-5.
Pires, Rute; Sousa Ferreira, Ana; Guedes, David
2017-10-01
The DSM-5 Section III proposes a hybrid dimensional-categorical model of conceptualizing personality and its disorders that includes assessment of impairments in personality functioning (criterion A) and maladaptive personality traits (criterion B). The Personality Inventory for the DSM-5 is a new dimensional tool, composed of 220 items organized into 25 facets that delineate five higher order domains of clinically relevant personality differences, and was developed to operationalize the DSM-5 model of pathological personality traits. The current studies address the internal consistency (study 1), the test-retest reliability (study 2) and the criterion validity (studies 3 and 4) of the Portuguese version of the PID-5 in samples of native speaking psychology students. Results indicated good internal consistency reliabilities and good temporal stability reliabilities for the majority of the PID-5 traits. The correlational pattern of the PID-5 traits with two measures of personality was in accordance with theoretical expectations and showed its concurrent validity. © 2017 Scandinavian Psychological Associations and John Wiley & Sons Ltd.
The French-Canadian validation of a disease-specific, patient-reported outcome measure for lupus.
Bourré-Tessier, J; Clarke, A E; Kosinski, M; Mikolaitis-Preuss, R A; Bernatsky, S; Block, J A; Jolly, M
2014-12-01
The objective of this paper is to perform the cross-cultural validation of the French version of the LupusPRO, a disease-targeted patient-reported outcome measure, among systemic lupus erythematosus (SLE) patients in Canada. The French version of the LupusPRO and the MOS SF-36 were administered; demographic, clinical and serological characteristics were obtained. Disease activity (SELENA-SLEDAI and the Lupus Foundation of America definition of flare) and damage (SLICC/ACR SDI) were assessed. Physician disease activity and damage assessments were ascertained using visual analog scales. Internal consistency reliability (ICR), test-retest reliability (TRT), convergent and discriminant validity (against corresponding domains of the SF-36), criterion validity (against disease activity, damage or health status) and known group validity were tested. A total of 99 French-Canadian SLE patients participated (97% women, mean (SD) age 45.2 (14.5) years). The median (IQR) SELENA-SLEDAI and SDI were 3.5 (6.0) and 1.0 (2.0), respectively. The ICR of the LupusPRO domains ranged from 0.81 to 0.93 (except for lupus symptoms, procreation and coping), while TRT ranged from 0.72 to 0.95. Convergent and discriminant validity, criterion validity and known group validity against disease activity, damage and health status measures were observed. Confirmatory factor analysis showed a good fit. The LupusPRO has fair psychometric properties among French-Canadian patients with SLE. © The Author(s) 2014 Reprints and permissions: sagepub.co.uk/journalsPermissions.nav.
[Criterion Validity of the German Version of the CES-D in the General Population].
Jahn, Rebecca; Baumgartner, Josef S; van den Nest, Miriam; Friedrich, Fabian; Alexandrowicz, Rainer W; Wancata, Johannes
2018-04-17
The "Center of Epidemiologic Studies - Depression scale" (CES-D) is a well-known screening tool for depression. Until now, the criterion validity of the German version of the CES-D had not been investigated in a sample of the adult general population. A total of 508 study participants from the Austrian general population completed the CES-D. ICD-10 diagnoses were established by using the Schedules for Clinical Assessment in Neuropsychiatry (SCAN). Receiver Operating Characteristics (ROC) analysis was conducted. Possible gender differences were explored. Overall discriminating performance of the CES-D was sufficient (ROC-AUC 0.836). Using the traditional cut-off values of 15/16 and 21/22, the sensitivity was 43.2% and 32.4%, respectively. The cut-off value developed on the basis of our sample was 9/10, with a sensitivity of 81.1% and a specificity of 74.3%. There were no significant gender differences. This is the first study investigating the criterion validity of the German version of the CES-D in the general population. The optimal cut-off values yielded sufficient sensitivity and specificity, comparable to the values of other screening tools. © Georg Thieme Verlag KG Stuttgart · New York.
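The abstract does not say how the sample-based cut-off was derived. One common approach, shown here purely as an assumption, is to pick the cut-off that maximizes Youden's J (sensitivity + specificity - 1). The scores and diagnoses below are hypothetical.

```python
def youden_cutoff(scores, is_case):
    """Return (cutoff, J) maximizing Youden's J, treating score >= cutoff as screen-positive."""
    n_cases = sum(is_case)
    n_noncases = len(is_case) - n_cases
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(scores)):
        true_pos = sum(1 for s, c in zip(scores, is_case) if c and s >= cutoff)
        true_neg = sum(1 for s, c in zip(scores, is_case) if not c and s < cutoff)
        j = true_pos / n_cases + true_neg / n_noncases - 1.0
        if j > best_j:  # on ties, the lowest cutoff is kept
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Hypothetical screening scores and criterion diagnoses, for illustration only.
scores = [3, 5, 7, 9, 11, 13, 15, 17, 19, 21]
is_case = [False, False, False, False, True, True, False, True, True, True]

best_cutoff, best_j = youden_cutoff(scores, is_case)
print(f"cutoff = {best_cutoff}, Youden's J = {best_j:.2f}")
```

A cut-off tuned on the same sample it is evaluated on tends to look better than it will in new data, so such values are usually confirmed in an independent sample.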
Ghisi, Gabriela Lima de Melo; Sandison, Nicole; Oh, Paul
2016-03-01
To develop, pilot test and psychometrically validate a shorter version of the coronary artery disease education questionnaire (CADE-Q), called CADE-Q SV. Based on previous versions of the CADE-Q, cardiac rehabilitation (CR) experts developed 20 items divided into 5 knowledge domains to comprise the first version of the CADE-Q SV. To establish content validity, they were reviewed by an expert panel (N=12). Refined items were pilot-tested in 20 patients, in which clarity was provided. A final version was generated and psychometrically tested in 132 CR patients. Test-retest reliability was assessed via the intraclass correlation coefficient (ICC), the internal consistency using Cronbach's alpha, and criterion validity with regard to patients' education and duration in CR. All ICC coefficients met the minimum recommended standard. All domains were considered internally consistent (α>0.7). Criterion validity was supported by significant differences in mean scores by educational level (p<0.01) and duration in CR (p<0.05). Knowledge about exercise and nutrition was higher than knowledge about medical condition. The CADE-Q SV was demonstrated to have good reliability and validity. This is a short, quick and appropriate tool for application in clinical and research settings, assessing patients' knowledge during CR and as part of education programming. Copyright © 2015. Published by Elsevier Ireland Ltd.
Estimating activity energy expenditure: how valid are physical activity questionnaires?
Neilson, Heather K; Robson, Paula J; Friedenreich, Christine M; Csizmadi, Ilona
2008-02-01
Activity energy expenditure (AEE) is the modifiable component of total energy expenditure (TEE) derived from all activities, both volitional and nonvolitional. Because AEE may affect health, there is interest in its estimation in free-living people. Physical activity questionnaires (PAQs) could be a feasible approach to AEE estimation in large populations, but it is unclear whether or not any PAQ is valid for this purpose. Our aim was to explore the validity of existing PAQs for estimating usual AEE in adults, using doubly labeled water (DLW) as a criterion measure. We reviewed 20 publications that described PAQ-to-DLW comparisons, summarized study design factors, and appraised criterion validity using mean differences (AEE(PAQ) - AEE(DLW), or TEE(PAQ) - TEE(DLW)), 95% limits of agreement, and correlation coefficients (AEE(PAQ) versus AEE(DLW) or TEE(PAQ) versus TEE(DLW)). Only 2 of 23 PAQs assessed most types of activity over the past year and indicated acceptable criterion validity, with mean differences (TEE(PAQ) - TEE(DLW)) of 10% and 2% and correlation coefficients of 0.62 and 0.63, respectively. At the group level, neither overreporting nor underreporting was more prevalent across studies. We speculate that, aside from reporting error, discrepancies between PAQ and DLW estimates may be partly attributable to 1) PAQs not including key activities related to AEE, 2) PAQs and DLW ascertaining different time periods, or 3) inaccurate assignment of metabolic equivalents to self-reported activities. Small sample sizes, use of correlation coefficients, and limited information on individual validity were problematic. Future research should address these issues to clarify the true validity of PAQs for estimating AEE.
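The mean difference and 95% limits of agreement used to appraise criterion validity can be sketched as below. The paired energy expenditure values are hypothetical, and the limits are computed in the usual Bland-Altman form (mean difference ± 1.96 standard deviations of the differences), which is assumed rather than confirmed by the abstract.

```python
from statistics import mean, stdev


def limits_of_agreement(method_a, method_b):
    """Return (mean difference, lower limit, upper limit) for paired measurements."""
    differences = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(differences)
    half_width = 1.96 * stdev(differences)  # sample SD of the paired differences
    return bias, bias - half_width, bias + half_width


# Hypothetical total energy expenditure in kcal/day, for illustration only.
tee_paq = [2500, 2700, 2300, 2900]  # questionnaire estimate
tee_dlw = [2400, 2800, 2350, 2750]  # doubly labeled water (criterion)

bias, lower, upper = limits_of_agreement(tee_paq, tee_dlw)
print(f"mean difference = {bias:.0f}, limits of agreement = [{lower:.0f}, {upper:.0f}]")
```

A small mean difference with wide limits indicates acceptable group-level validity but poor individual-level validity, a distinction that a correlation coefficient alone does not reveal.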
Dong, Lijuan; Liu, Na; Tian, Xiaoyu; Qiao, Xiaoxia; Gobbens, Robbert J J; Kane, Robert L; Wang, Cuili
2017-11-01
To translate the Tilburg Frailty Indicator (TFI) into Chinese and assess its reliability and validity. A sample of 917 community-dwelling older people, aged ≥60 years, in a Chinese city was included between August 2015 and March 2016. Construct validity was assessed using alternative measures corresponding to the TFI items, including self-rated health status (SRH), unintentional weight loss, walking speed, timed-up-and-go tests (TUGT), making telephone calls, grip strength, exhaustion, Short Portable Mental Status Questionnaire (SPMSQ), Geriatric Depression scale (GDS-15), emotional role, Adaptability Partnership Growth Affection and Resolve scale (APGAR) and Social Support Rating Scale (SSRS). Fried's phenotype and frailty index were measured to evaluate criterion validity. Adverse health outcomes (ADL and IADL disability, healthcare utilization, GDS-15, SSRS) were used to assess predictive (concurrent) validity. The internal consistency reliability was good (Cronbach's α=0.71). The test-retest reliability was strong (r=0.88). Kappa coefficients showed agreement between the TFI items and corresponding alternative measures. Alternative measures correlated as expected with the three domains of the TFI, with the exception that alternative psychological measures had similar correlations with the psychological and physical domains of the TFI. The Chinese TFI had excellent criterion validity, with AUCs for physical phenotype and frailty index of 0.87 and 0.86, respectively. The predictive (concurrent) validities of the adverse health outcomes and healthcare utilization were acceptable (AUCs: 0.65-0.83). The Chinese TFI has good validity and reliability as an integral instrument to measure frailty of older people living in the community in China. Copyright © 2017 Elsevier B.V. All rights reserved.
Pitchford, Nicola J; Outhwaite, Laura A
2016-01-01
Assessment of cognitive and motor functions is fundamental for developmental and neuropsychological profiling. Assessments are usually conducted on an individual basis, with a trained examiner, using standardized paper and pencil tests, and can take up to an hour or more to complete, depending on the nature of the test. This makes traditional standardized assessments of child development largely unsuitable for use in low-income countries. Touch screen tablets afford the opportunity to assess cognitive functions in groups of participants, with untrained administrators, with precision recording of responses, thus automating the assessment process. In turn, this enables cognitive profiling to be conducted in contexts where access to qualified examiners and standardized assessments are rarely available. As such, touch screen assessments could provide a means of assessing child development in both low- and high-income countries, which would afford cross-cultural comparisons to be made with the same assessment tool. However, before touch screen tablet assessments can be used for cognitive profiling in low-to-high-income countries they need to be shown to provide reliable and valid measures of performance. We report the development of a new touch screen tablet assessment of basic cognitive and motor functions for use with early years primary school children in low- and high-income countries. Measures of spatial intelligence, visual attention, short-term memory, working memory, manual processing speed, and manual coordination are included as well as mathematical knowledge. To investigate if this new touch screen assessment tool can be used for cross-cultural comparisons we administered it to a sample of children (N = 283) spanning standards 1-3 in a low-income country, Malawi, and a smaller sample of children (N = 70) from first year of formal schooling from a high-income country, the UK.
Split-half reliability, test-retest reliability, face validity, convergent construct validity, predictive criterion validity, and concurrent criterion validity were investigated. Results demonstrate "proof of concept" that touch screen tablet technology can provide reliable and valid psychometric measures of performance in the early years, highlighting its potential to be used in cross-cultural comparisons and research.
Reliability and Validity of a New Physical Activity Self-Report Measure for Younger Children
ERIC Educational Resources Information Center
Belton, Sarahjane; Mac Donncha, Ciaran
2010-01-01
The purpose of this study was to assess the test-retest reliability and validity of a new Youth Physical Activity Self-Report measure. Heart rate and direct observation were employed as criterion measures with a sample of 79 children (aged 7-9 years). Spearman's rho correlation between self reported activity intensity and heart rate was 0.87 for…
A Criterion-Related Validation Study of the Army Core Leader Competency Model
2007-04-01
2004). Transformational and transactional leadership: A meta-analytic test of their relative validity. Journal of Applied Psychology, 89, 755-768 ... performance criteria in an attempt to adjust ratings for this influence. Leader survey materials were developed and pilot tested at Ft. Drum and Ft... psychological constructs in the behavioral science realm. Numerous theories, popular literature, websites, assessments, and competency models are
Development of an opioid-related Overdose Risk Behavior Scale (ORBS).
Pouget, Enrique R; Bennett, Alex S; Elliott, Luther; Wolfson-Stofko, Brett; Almeñana, Ramona; Britton, Peter C; Rosenblum, Andrew
2017-01-01
Drug overdose has emerged as the leading cause of injury-related death in the United States, driven by prescription opioid (PO) misuse, polysubstance use, and use of heroin. To better understand opioid-related overdose risks that may change over time and across populations, there is a need for a more comprehensive assessment of related risk behaviors. Drawing on existing research, formative interviews, and discussions with community and scientific advisors, an opioid-related Overdose Risk Behavior Scale (ORBS) was developed. Military veterans reporting any use of heroin or POs in the past month were enrolled using venue-based and chain referral recruitment. The final scale consisted of 25 items grouped into 5 subscales eliciting the number of days in the past 30 during which the participant engaged in each behavior. Internal reliability, test-retest reliability and criterion validity were assessed using Cronbach's alpha, intraclass correlations (ICC) and Pearson's correlations with indicators of having overdosed during the past 30 days, respectively. Data for 220 veterans were analyzed. The 5 subscales were (A) Adherence to Opioid Dosage and Therapeutic Purposes; (B) Alternative Methods of Opioid Administration; (C) Solitary Opioid Use; (D) Use of Nonprescribed Overdose-associated Drugs; and (E) Concurrent Use of POs, Other Psychoactive Drugs and Alcohol. They generally showed good internal reliability (alpha range = 0.61 to 0.88), test-retest reliability (ICC range = 0.81 to 0.90), and criterion validity (r range = 0.22 to 0.66). The subscales were internally consistent with each other (alpha = 0.84). The scale mean had an ICC value of 0.99, and correlations with validators ranged from 0.44 to 0.56. 
These results constitute preliminary evidence for the reliability and validity of the new scale. If further validated, it could help improve overdose prevention and response research and could help improve the precision of overdose education and prevention efforts.
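For readers unfamiliar with the internal-reliability statistic reported above, Cronbach's alpha can be computed from the item variances and the variance of the total score. This is a minimal sketch of the standard formula; the function name and the row-per-respondent layout are assumptions, and it is not the authors' analysis code.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a respondents-by-items score table.

    item_scores: one list of item scores per respondent (hypothetical layout).
    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score))
    """
    k = len(item_scores[0])
    item_vars = [variance([row[i] for row in item_scores]) for i in range(k)]
    total_var = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)
```

Alpha rises as items covary more strongly relative to their individual variances, which is why it is read as a measure of internal consistency.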
Wielenga, J M; De Vos, R; de Leeuw, R; De Haan, R J
2004-01-01
Assessment of clinimetric properties and diagnostic quality of a stress measurement scale (COMFORT scale). Sample of an open population. Neonatology department (Neonatal Intensive Care Unit), Academic Medical Centre/Emma Children's Hospital, Amsterdam, The Netherlands. One clinical expert and 9 observers observed ventilated prematurely born babies simultaneously. Criterion validity was assessed by correlating the COMFORT scale with the clinical judgment regarding the amount of stress. Interobserver reliability was assessed on the clinical judgment as well as on the COMFORT scale. Diagnostic qualities were evaluated with a ROC curve. On 19 ventilated prematurely born babies (mean gestational age 30 weeks, mean birth weight 1385 gm), one clinical expert and 9 observers made 30 paired observations. The criterion validity of the COMFORT scale was good (Pearson's r of 0.84). The interobserver reliability of the clinical judgment was very good (weighted Kappa 0.84). The interobserver reliability of each item varied from good to almost perfect (weighted Kappa of 0.64 for muscle tone to 1.00 for heart rate). The reliability of the total COMFORT scale score was satisfactory (intra-class correlation coefficient of 0.94). The diagnostic quality of the COMFORT scale was excellent: at a cut-off point of 20 the sensitivity was 100 percent, the specificity was 77 percent, and the area under the curve (AUC) was 0.95. In this first evaluation, the COMFORT scale appears to be a valid and reliable measurement tool to assess the stress of ventilated prematurely born babies.
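The sensitivity and specificity quoted above depend on the chosen cut-off. The following sketch shows how the two figures are derived for one cut-off; treating scores at or above the cut-off as positive is an assumption made here for illustration, as are the function and parameter names.

```python
def sensitivity_specificity(scores, has_condition, cutoff):
    """Sensitivity and specificity of a scale at a single cut-off.

    scores: scale scores; has_condition: reference-standard booleans.
    Assumption: a score >= cutoff counts as a positive screen.
    """
    tp = fn = tn = fp = 0
    for score, truth in zip(scores, has_condition):
        positive = score >= cutoff
        if truth and positive:
            tp += 1
        elif truth:
            fn += 1
        elif positive:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)
```

Repeating this calculation across all candidate cut-offs and plotting sensitivity against 1 minus specificity yields the ROC curve from which an AUC is obtained.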
2014-01-01
Background Health impairments can result in disability and changed work productivity imposing considerable costs for the employee, employer and society as a whole. A large number of instruments exist to measure health-related productivity changes; however their methodological quality remains unclear. This systematic review critically appraised the measurement properties in generic self-reported instruments that measure health-related productivity changes to recommend appropriate instruments for use in occupational and economic health practice. Methods PubMed, PsycINFO, Econlit and Embase were systematically searched for studies whereof: (i) instruments measured health-related productivity changes; (ii) the aim was to evaluate instrument measurement properties; (iii) instruments were generic; (iv) ratings were self-reported; (v) full-texts were available. Next, methodological quality appraisal was based on COSMIN elements: (i) internal consistency; (ii) reliability; (iii) measurement error; (iv) content validity; (v) structural validity; (vi) hypotheses testing; (vii) cross-cultural validity; (viii) criterion validity; and (ix) responsiveness. Recommendations are based on evidence syntheses. Results This review included 25 articles assessing the reliability, validity and responsiveness of 15 different generic self-reported instruments measuring health-related productivity changes. Most studies evaluated criterion validity, none evaluated cross-cultural validity and information on measurement error is lacking. The Work Limitation Questionnaire (WLQ) was most frequently evaluated with moderate respectively strong positive evidence for content and structural validity and negative evidence for reliability, hypothesis testing and responsiveness. Less frequently evaluated, the Stanford Presenteeism Scale (SPS) showed strong positive evidence for internal consistency and structural validity, and moderate positive evidence for hypotheses testing and criterion validity. 
The Productivity and Disease Questionnaire (PRODISQ) yielded strong positive evidence for content validity, evidence for other properties is lacking. The other instruments resulted in mostly fair-to-poor quality ratings with limited evidence. Conclusions Decisions based on the content of the instrument, usage purpose, target country and population, and available evidence are recommended. Until high-quality studies are in place to accurately assess the measurement properties of the currently available instruments, the WLQ and, in a Dutch context, the PRODISQ are cautiously preferred based on its strong positive evidence for content validity. Based on its strong positive evidence for internal consistency and structural validity, the SPS is cautiously recommended. PMID:24495301
Akram, A J; Ireland, A J; Postlethwaite, K C; Sandy, J R; Jerreat, A S
2013-11-01
This article describes the process of validity and reliability testing of a condition-specific quality-of-life measure for patients with hypodontia presenting for orthodontic treatment. The development of the instrument is described in a previous article. Royal Devon and Exeter NHS Foundation Trust & Musgrove Park Hospital, Taunton. The child perception questionnaire was used as a standard against which to test criterion validity. The Bland and Altman method was used to check agreement between the two questionnaires. Construct validity was tested using principal component analysis on the four sections of the questionnaire. Test-retest reliability was tested using intraclass correlation coefficient and Bland and Altman method. Cronbach's alpha was used to test internal consistency reliability. Overall the questionnaire showed good reliability, criterion and construct validity. This together with previous evidence of good face and content validity suggests that the instrument may prove useful in clinical practice and further research. This study has demonstrated that the newly developed condition-specific quality-of-life questionnaire is both valid and reliable for use in young patients with hypodontia. © 2013 John Wiley & Sons A/S. Published by Blackwell Publishing Ltd.
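The Bland and Altman method mentioned above summarises agreement between two measurements as a mean difference (bias) with 95% limits of agreement. A minimal sketch follows, assuming paired scores on the same scale; the function name and inputs are hypothetical and the study's actual data are not reproduced.

```python
from statistics import mean, stdev


def bland_altman_limits(first, second):
    """Bias and 95% limits of agreement for paired measurements.

    Returns (bias, lower_limit, upper_limit), where the limits are
    bias -/+ 1.96 * SD of the paired differences.
    """
    diffs = [a - b for a, b in zip(first, second)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias, bias - spread, bias + spread
```

Unlike a correlation coefficient, these limits are expressed in the units of the measurement itself, so they show how far apart two instruments can be expected to fall for an individual.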
The development and psychometric evaluation of the Internet Disorder Scale (IDS-15).
Pontes, Halley M; Griffiths, Mark D
2017-01-01
Previously published research suggests that improvement in the assessment of Internet addiction (IA) is paramount in advancing the field. However, little has been done to address inconsistencies in the assessment of IA using a more updated framework. The aim of the present study was to develop a new instrument to assess IA based on a modification of the nine Internet Gaming Disorder (IGD) criteria as suggested by the American Psychiatric Association in the latest (fifth) edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5), and to provide a taxonomy of the potential risk of IA among participants. A heterogeneous sample of Internet users (n=1105) was recruited online (61.3% males, mean age 33 years). Construct validity of the new instrument - Internet Disorder Scale (IDS-15) - was assessed by means of factorial, convergent, and discriminant validity. Criterion-related validity and reliability were also investigated. Additionally, latent profile analysis (LPA) was carried out to differentiate and characterize Internet users based on their potential IA risk. The construct and criterion-related validity of the IDS-15 were both warranted. The IDS-15 proved to be a valid and reliable tool. Using the LPA, participants were classed as "low addiction risk" (n=183, 18.2%), "medium addiction risk" (n=456, 41.1%), and "high addiction risk" (n=455, 40.77%). Furthermore, key differences emerged among these classes in terms of age, relationship status, cigarette consumption, weekly Internet usage, age of Internet use initiation, and IDS-15 total scores. The present findings support the viability of using adapted IGD criteria as a framework to assess IA. Copyright © 2015 Elsevier Ltd. All rights reserved.
Cross-cultural validity of a dietary questionnaire for studies of dental caries risk in Japanese
2014-01-01
Background Diet is a major modifiable contributing factor in the etiology of dental caries. The purpose of this paper is to examine the reliability and cross-cultural validity of the Japanese version of the Food Frequency Questionnaire to assess dietary intake in relation to dental caries risk in Japanese. Methods The 38-item Food Frequency Questionnaire, in which Japanese food items were added to increase content validity, was translated into Japanese, and administered to two samples. The first sample comprised 355 pregnant women with mean age of 29.2 ± 4.2 years for the internal consistency and criterion validity analyses. Factor analysis (principal components with Varimax rotation) was used to determine dimensionality. The dietary cariogenicity score was calculated from the Food Frequency Questionnaire and used for the analyses. Salivary mutans streptococci level was used as a semi-quantitative assessment of dental caries risk and measured by Dentocult SM. Dentocult SM scores were compared with the dietary cariogenicity score computed from the Food Frequency Questionnaire to examine criterion validity, and assessed by Spearman’s correlation coefficient (rs) and Kruskal-Wallis test. Test-retest reliability of the Food Frequency Questionnaire was assessed with a second sample of 25 adults with mean age of 34.0 ± 3.0 years by using the intraclass correlation coefficient analysis. Results The Japanese language version of the Food Frequency Questionnaire showed high test-retest reliability (ICC = 0.70) and good criterion validity assessed by relationship with salivary mutans streptococci levels (rs = 0.22; p < 0.001). Factor analysis revealed four subscales that construct the questionnaire (solid sugars, solid and starchy sugars, liquid and semisolid sugars, sticky and slowly dissolving sugars). Internal consistency was low to acceptable (Cronbach’s alpha = 0.67 for the total scale, 0.46-0.61 for each subscale). 
Mean dietary cariogenicity scores were 50.8 ± 19.5 in the first sample, and 47.4 ± 14.1 and 40.6 ± 11.3 for the first and second administrations in the second sample, respectively. The distribution of Dentocult SM score was 6.8% (score = 0), 34.4% (score = 1), 39.4% (score = 2), and 19.4% (score = 3). Participants with higher scores were more likely to have higher dietary cariogenicity scores (p < 0.001; Kruskal-Wallis test). Conclusions These results provide the preliminary evidence for the reliability and validity of the Japanese language Food Frequency Questionnaire. PMID:24383547
Validity of the Miller forensic assessment of symptoms test in psychiatric inpatients.
Veazey, Connie H; Wagner, Alisha L; Hays, J Ray; Miller, Holly A
2005-06-01
This study investigated the validity of the Miller Forensic Assessment of Symptoms Test (M-FAST), a brief measure of malingering, in an inpatient psychiatric sample of 70. Among those patients who also completed the Personality Assessment Inventory (N=44), Total M-FAST score was related in the expected directions to the Personality Assessment Inventory validity scales and indexes, providing evidence for concurrent validity of the M-FAST. With the PAI malingering index used as a criterion, we examined the diagnostic efficiency of the M-FAST and found a cut score of 8 represented the best balance of sensitivity, specificity, positive predictive power, and negative predictive power. Based on this cut-score of 8, 16% of the population was classified as malingering. The M-FAST appears to be an excellent rapid screen for symptom exaggeration in this population and setting.
ERIC Educational Resources Information Center
Lodewyk, Ken R.; Mandigo, James L.
2017-01-01
Physical and Health Education Canada has developed and implemented a formative, criterion-referenced, and practitioner-based national (Canadian) online educational assessment and support resource called Passport for Life (PFL). It was developed to support the awareness and advancement of physical literacy among PE students and teachers. PFL…
Cultural and Ethnic Bias in Teacher Ratings of Behavior: A Criterion-Focused Review
ERIC Educational Resources Information Center
Mason, Benjamin A.; Gunersel, Adalet Baris; Ney, Emilie A.
2014-01-01
Behavior rating scales are indirect measures of emotional and social functioning used for assessment purposes. Rater bias is systematic error that may compromise the validity of behavior rating scale scores. Teacher bias in ratings of behavior has been investigated in multiple studies, but not yet assessed in a research synthesis that focuses on…
Developing a tool to measure satisfaction among health professionals in sub-Saharan Africa
2013-01-01
Background In sub-Saharan Africa, lack of motivation and job dissatisfaction have been cited as causes of poor healthcare quality and outcomes. Measurement of health workers’ satisfaction adapted to sub-Saharan African working conditions and cultures is a challenge. The objective of this study was to develop a valid and reliable instrument to measure satisfaction among health professionals in the sub-Saharan African context. Methods A survey was conducted in Senegal and Mali in 2011 among 962 care providers (doctors, midwives, nurses and technicians) practicing in 46 hospitals (capital, regional and district). The participation rate was very high: 97% (937/962). After exploratory factor analysis (EFA), construct validity was assessed through confirmatory factor analysis (CFA). The discriminant validity of our subscales was evaluated by comparing the average variance extracted (AVE) for each of the constructs with the squared interconstruct correlation (SIC), and finally for criterion validity, each subscale was tested with two hypotheses. Two dimensions of reliability were assessed: internal consistency with Cronbach’s alpha subscales and stability over time using a test-retest process. Results Eight dimensions of satisfaction encompassing 24 items were identified and validated using a process that combined psychometric analyses and expert opinions: continuing education, salary and benefits, management style, tasks, work environment, workload, moral satisfaction and job stability. All eight dimensions demonstrated significant discriminant validity. The final model showed good performance, with a root mean square error of approximation (RMSEA) of 0.0508 (90% CI: 0.0448 to 0.0569) and a comparative fit index (CFI) of 0.9415. The concurrent criterion validity of the eight dimensions was good. Reliability was assessed based on internal consistency, which was good for all dimensions but one (moral satisfaction < 0.70). 
Test-retest showed satisfactory temporal stability (intraclass correlation coefficient range: 0.60 to 0.91). Conclusions Job satisfaction is a complex construct; this study provides a multidimensional instrument whose content, construct and criterion validities were verified to ensure its suitability for the sub-Saharan African context. When using these subscales in further studies, the variability of the reliability of the subscales should be taken into account when calculating sample sizes. The instrument will be useful in evaluative studies which will help guide interventions aimed at improving both the quality of care and its effectiveness. PMID:23826720
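The discriminant validity check described in this abstract compares each construct's average variance extracted (AVE) with the squared interconstruct correlation (SIC). The sketch below illustrates that comparison under the usual definition of AVE as the mean squared standardized loading; the construct names and numbers in the usage note are invented for illustration and are not results from the study.

```python
def average_variance_extracted(loadings):
    """AVE as the mean of squared standardized factor loadings."""
    return sum(value * value for value in loadings) / len(loadings)


def discriminant_validity_holds(loadings_by_construct, correlations):
    """Check that both AVEs exceed the squared correlation for every pair.

    loadings_by_construct: {construct_name: [standardized loadings]}
    correlations: {(construct_a, construct_b): correlation}
    Both structures are hypothetical conveniences for this sketch.
    """
    ave = {
        name: average_variance_extracted(values)
        for name, values in loadings_by_construct.items()
    }
    return all(
        min(ave[a], ave[b]) > r * r for (a, b), r in correlations.items()
    )
```

For example, with illustrative loadings of 0.8, 0.7 and 0.9 for one construct and 0.75, 0.8 and 0.7 for another, an interconstruct correlation of 0.5 passes the check while a correlation of 0.8 does not.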
Bajada, Stefan; Mohanty, Khitish
2016-06-01
The Majeed scoring system is a disease-specific outcome measure that was originally designed to assess pelvic injuries. The aim of this study was to determine the psychometric properties of the Majeed scoring system for chronic sacroiliac joint pain. Internal consistency, content validity, criterion validity, construct validity and responsiveness to change were assessed prospectively for the Majeed scoring system in a cohort of 60 patients diagnosed with sacroiliac joint pain. This diagnosis was confirmed with CT-guided sacroiliac joint anaesthetic block. The overall Majeed score showed acceptable internal consistency (Cronbach alpha = 0.63). Similarly, it showed acceptable floor (0%) and ceiling (0%) effects. On the other hand, the domains of pain, work, sitting and sexual intercourse had high (>30%) floor effects. Significant correlation with the physical component of the Short Form-36 (p = 0.005) and Oswestry disability index (p ≤ 0.001) was found, indicating acceptable criterion validity. The overall Majeed score showed acceptable construct validity with all five developed hypotheses showing significance (p ≤ 0.05). The overall Majeed score showed acceptable responsiveness to change with a large (≥0.80) effect size and standardized response mean. Overall the Majeed scoring system demonstrated acceptable psychometric properties for outcome assessment in chronic sacroiliac joint pain. Thus, its use in this condition is adequate. However, some domains demonstrated suboptimal performance, indicating that improvement might be achieved with the development of an outcome measure specific for sacroiliac joint dysfunction and degeneration.
ERIC Educational Resources Information Center
Hutchinson, Nick; Oakes, Peter
2011-01-01
Background: People with Down Syndrome are at significant risk of developing Alzheimer's disease as they get older and early assessment, diagnosis and intervention is essential. Neuro-psychological measures of cognitive functioning play an important part in the assessment process. The aim of the present study was to examine the concurrent criterion…
Performance indicators for public mental healthcare: a systematic international inventory
2012-01-01
Background The development and use of performance indicators (PI) in the field of public mental health care (PMHC) has increased rapidly in the last decade. To gain insight in the current state of PI for PMHC in nations and regions around the world, we conducted a structured review of publications in scientific peer-reviewed journals supplemented by a systematic inventory of PI published in policy documents by (non-) governmental organizations. Methods Publications on PI for PMHC were identified through database- and internet searches. Final selection was based on review of the full content of the publications. Publications were ordered by nation or region and chronologically. Individual PI were classified by development method, assessment level, care domain, performance dimension, diagnostic focus, and data source. Finally, the evidence on feasibility, data reliability, and content-, criterion-, and construct validity of the PI was evaluated. Results A total of 106 publications were included in the sample. The majority of the publications (n = 65) were peer-reviewed journal articles and 66 publications specifically dealt with performance of PMHC in the United States. The objectives of performance measurement vary widely from internal quality improvement to increasing transparency and accountability. The characteristics of 1480 unique PI were assessed. The majority of PI is based on stakeholder opinion, assesses care processes, is not specific to any diagnostic group, and utilizes administrative data sources. The targeted quality dimensions varied widely across and within nations depending on local professional or political definitions and interests. For all PI some evidence for the content validity and feasibility has been established. Data reliability, criterion- and construct validity have rarely been assessed. Only 18 publications on criterion validity were included. 
These show significant associations in the expected direction on the majority of PI, but mixed results on a noteworthy number of others. Conclusions PI have been developed for a broad range of care levels, domains, and quality dimensions of PMHC. To ensure their usefulness for the measurement of PMHC performance and advancement of transparency, accountability and quality improvement in PMHC, future research should focus on assessment of the psychometric properties of PI. PMID:22433251
Spyridou, Andria; Schauer, Maggie; Ruf-Leuschner, Martina
2015-02-21
Prenatal assessment for psychosocial risk factors and prevention and intervention is scarce and, in most cases, nonexistent in obstetrical care. In this study we aimed to evaluate if the KINDEX, a short instrument developed in Germany, is a useful tool in the hands of non-trained medical staff, in order to identify and refer women in psychosocial risk to the adequate mental health and social services. We also examined the criterion-related concurrent validity of the tool through a validation interview carried out by an expert clinical psychologist. Our final objective was to achieve the cultural adaptation of the KINDEX Greek Version and to offer a valid tool for the psychosocial risk assessment to the obstetric care providers. Two obstetricians and five midwives carried out 93 KINDEX interviews (duration 20 minutes) with pregnant women to assess psychosocial risk factors present during pregnancy. Afterwards they referred women who they identified having two or more psychosocial risk factors to the mental health attention unit of the hospital. During the validation procedure an expert clinical psychologist carried out diagnostic interviews with a randomized subsample of 50 pregnant women based on established diagnostic instruments for stress and psychopathology, like the PSS-14, ESI, PDS, HSCL-25. Significant correlations between the results obtained through the assessment using the KINDEX and the risk areas of stress, psychopathology and trauma load assessed in the validation interview demonstrate the criterion-related concurrent validity of the KINDEX. The referral accuracy of the medical staff is confirmed through comparisons between pregnant women who have and have not been referred to the mental health attention unit. Prenatal screenings for psychosocial risks like the KINDEX are feasible in public health settings in Greece. In addition, validity was confirmed in high correlations between the KINDEX results and the results of the validation interviews. 
The KINDEX Greek version can be considered a valid tool, which can be used by non-trained medical staff providing obstetrical care to identify high-risk women and refer them to adequate mental health and social services. These kinds of assessments are indispensable for the promotion of a healthy family environment and child development.
Translating and validating a Training Needs Assessment tool into Greek
Markaki, Adelais; Antonakis, Nikos; Hicks, Carolyn M; Lionis, Christos
2007-01-01
Background The translation and cultural adaptation of widely accepted, psychometrically tested tools is regarded as an essential component of effective human resource management in the primary care arena. The Training Needs Assessment (TNA) is a widely used, valid instrument, designed to measure professional development needs of health care professionals, especially in primary health care. This study aims to describe the translation, adaptation and validation of the TNA questionnaire into Greek language and discuss possibilities of its use in primary care settings. Methods A modified version of the English self-administered questionnaire consisting of 30 items was used. Internationally recommended methodology, mandating forward translation, backward translation, reconciliation and pretesting steps, was followed. Tool validation included assessing item internal consistency, using the alpha coefficient of Cronbach. Reproducibility (test – retest reliability) was measured by the kappa correlation coefficient. Criterion validity was calculated for selected parts of the questionnaire by correlating respondents' research experience with relevant research item scores. An exploratory factor analysis highlighted how the items group together, using a Varimax (oblique) rotation and subsequent Cronbach's alpha assessment. Results The psychometric properties of the Greek version of the TNA questionnaire for nursing staff employed in primary care were good. Internal consistency of the instrument was very good, Cronbach's alpha was found to be 0.985 (p < 0.001) and Kappa coefficient for reproducibility was found to be 0.928 (p < 0.0001). Significant positive correlations were found between respondents' current performance levels on each of the research items and amount of research involvement, indicating good criterion validity in the areas tested. 
Factor analysis revealed seven factors with eigenvalues of > 1.0, KMO (Kaiser-Meyer-Olkin) measure of sampling adequacy = 0.680 and Bartlett's test of sphericity, p < 0.001. Conclusion The translated and adapted Greek version is comparable with the original English instrument in terms of validity and reliability and it is suitable to assess professional development needs of nursing staff in Greek primary care settings. PMID:17474989
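The reproducibility figure above is reported as a kappa coefficient. As a general illustration, unweighted Cohen's kappa corrects the observed agreement between two sets of categorical ratings for the agreement expected by chance. The sketch below assumes two equal-length lists of category labels (for instance, test and retest answers); the abstract does not say which kappa variant was used, so this is not a reconstruction of the authors' calculation.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two equal-length lists of categories."""
    n = len(ratings_a)
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    categories = set(ratings_a) | set(ratings_b)
    # Chance agreement from the marginal category frequencies
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / n ** 2
    return (observed - expected) / (1 - expected)
```

A value of 1 indicates perfect agreement and 0 indicates agreement no better than chance; the function is undefined when chance agreement is itself 1.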
ERIC Educational Resources Information Center
Oakland, Thomas
New strategies for evaluating criterion-referenced measures (CRM) are discussed. These strategies examine the following issues: (1) the use of norm-referenced measures (NRM) as CRM and then estimating the reliability and validity of such measures in terms of variance from an arbitrarily specified criterion score, (2) estimation of the…
Physical employment standards for U.K. fire and rescue service personnel.
Blacker, S D; Rayson, M P; Wilkinson, D M; Carter, J M; Nevill, A M; Richmond, V L
2016-01-01
Evidence-based physical employment standards are vital for recruiting, training and maintaining the operational effectiveness of personnel in physically demanding occupations. The aims were to (i) develop criterion tests for in-service physical assessment, which simulate the role-related physical demands of UK fire and rescue service (UK FRS) personnel; (ii) develop practical physical selection tests for FRS applicants; and (iii) evaluate the validity of the selection tests to predict criterion test performance. Stage 1: we conducted a physical demands analysis involving seven workshops and an expert panel to document the key physical tasks required of UK FRS personnel and to develop 'criterion' and 'selection' tests. Stage 2: we measured the performance of 137 trainee and 50 trained UK FRS personnel on selection, criterion and 'field' measures of aerobic power, strength and body size. Statistical models were developed to predict criterion test performance. Stage 3: subject matter experts derived minimum performance standards. We developed single person simulations of the key physical tasks required of UK FRS personnel as criterion and selection tests (rural fire, domestic fire, ladder lift, ladder extension, ladder climb, pump assembly, enclosed space search). Selection tests were marginally stronger predictors of criterion test performance (r = 0.88-0.94, 95% Limits of Agreement [LoA] 7.6-14.0%) than field test scores (r = 0.84-0.94, 95% LoA 8.0-19.8%) and offered greater face and content validity and more practical implementation. This study outlines the development of role-related, gender-free physical employment tests for the UK FRS, which conform to equal opportunities law. © The Author 2015. Published by Oxford University Press on behalf of the Society of Occupational Medicine. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
2013-01-01
Background Two of the current methodological barriers to implementation science efforts are the lack of agreement regarding constructs hypothesized to affect implementation success and identifiable measures of these constructs. In order to address these gaps, the main goals of this paper were to identify a multi-level framework that captures the predominant factors that impact implementation outcomes, conduct a systematic review of available measures assessing constructs subsumed within these primary factors, and determine the criterion validity of these measures in the search articles. Method We conducted a systematic literature review to identify articles reporting the use or development of measures designed to assess constructs that predict the implementation of evidence-based health innovations. Articles published through 12 August 2012 were identified through MEDLINE, CINAHL, PsycINFO and the journal Implementation Science. We then utilized a modified five-factor framework in order to code whether each measure contained items that assess constructs representing structural, organizational, provider, patient, and innovation level factors. Further, we coded the criterion validity of each measure within the search articles obtained. Results Our review identified 62 measures. Results indicate that organization, provider, and innovation-level constructs have the greatest number of measures available for use, whereas structural and patient-level constructs have the least. Additionally, relatively few measures demonstrated criterion validity, or reliable association with an implementation outcome (e.g., fidelity). Discussion In light of these findings, our discussion centers on strategies that researchers can utilize in order to identify, adapt, and improve extant measures for use in their own implementation research. 
In total, our literature review and resulting measures compendium increase the capacity of researchers to conceptualize and measure implementation-related constructs in their ongoing and future research. PMID:23414420
Montero-Marín, Jesús; García-Campayo, Javier
2010-06-02
Burnout syndrome has been clinically characterised by a series of three subtypes: frenetic, underchallenged, and worn-out, with reference to coping strategies for stress and frustration at work with different degrees of dedication. The aims of the study are to present an operating definition of these subtypes in order to assess their reliability and convergent validity with respect to a standard burnout criterion and to examine differences with regard to sex and the temporary nature of work contracts. An exploratory factor analysis was performed by the main component method on a range of items devised by experts. The sample was composed of 409 employees of the University of Zaragoza, Spain. The reliability of the scales was assessed with Cronbach's alpha, convergent validity in relation to the Maslach Burnout Inventory with Pearson's r, and differences with Student's t-test and the Mann-Whitney U test. The factorial validity and reliability of the scales were good. The subtypes presented relations of differing degrees with the criterion dimensions, which were greater when dedication to work was lower. The frenetic profile presented fewer relations with the criterion dimensions while the worn-out profile presented relations of the greatest magnitude. Sex was not influential in establishing differences. However, the temporary nature of work contracts was found to have an effect: temporary employees exhibited higher scores in the frenetic profile (p < 0.001), while permanent employees did so in the underchallenged (p = 0.018) and worn-out (p < 0.001) profiles. The classical Maslach description of burnout does not include the frenetic profile; therefore, these patients are not recognised. The developed questionnaire may be a useful tool for the design and appraisal of specific preventive and treatment approaches based on the type of burnout experienced.
Brown, Heidi Wendell; Wise, Meg E.; Westenberg, Danielle; Schmuhl, Nicholas B.; Brezoczky, Kelly Lewis; Rogers, Rebecca G.; Constantine, Melissa L.
2017-01-01
Introduction and hypothesis Fewer than 30% of women with accidental bowel leakage (ABL) seek care, despite the existence of effective, minimally invasive therapies. We developed and validated a condition-specific instrument to assess barriers to care-seeking for ABL in women. Methods Adult women with ABL completed an electronic survey about condition severity, patient activation, previous care-seeking, and demographics. The Barriers to Care-seeking for Accidental Bowel Leakage (BCABL) instrument contained 42 potential items completed at baseline and again 2 weeks later. Paired t tests evaluated test–retest reliability. Factor analysis evaluated factor structure and guided item retention. Cronbach’s alpha evaluated internal consistency. Within and across factor item means generated a summary BCABL score used to evaluate scale validity with six external criterion measures. Results Among 1,677 click-throughs, 736 (44%) entered the survey; 95% of eligible female respondents (427 out of 458) provided complete data. Fifty-three percent of respondents had previously sought care for their ABL; median age was 62 years (range 27–89); mean Vaizey score was 12.8 (SD = 5.0), indicating moderate to severe ABL. Test–retest reliability was excellent for all items. Factor extraction via oblique rotation resulted in the final structure of 16 items in six domains, within which internal consistency was high. All six external criterion measures correlated significantly with BCABL score. Conclusions The BCABL questionnaire, with 16 items mapping to six domains, has excellent criterion validity and test–retest reliability when administered electronically in women with ABL. The BCABL can be used to identify care-seeking barriers for ABL in different populations, inform targeted interventions, and measure their effectiveness. PMID:28236039
Gagné, Myriam; Boulet, Louis-Philippe; Pérez, Norma; Moisan, Jocelyne
2018-04-30
To systematically identify the measurement properties of patient-reported outcome instruments (PROs) that evaluate adherence to inhaled maintenance medication in adults with asthma. We conducted a systematic review of six databases. Two reviewers independently included studies on the measurement properties of PROs that evaluated adherence in asthmatic participants aged ≥18 years. Based on the COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN), the reviewers (1) extracted data on internal consistency, reliability, measurement error, content validity, structural validity, hypotheses testing, cross-cultural validity, criterion validity, and responsiveness; (2) assessed the methodological quality of the included studies; (3) assessed the quality of the measurement properties (positive or negative); and (4) summarised the level of evidence (limited, moderate, or strong). We screened 6,068 records and included 15 studies (14 PROs). No studies evaluated measurement error or responsiveness. Based on methodological and measurement property quality assessments, we found limited positive evidence of: (a) internal consistency of the Adherence Questionnaire, Refined Medication Adherence Reason Scale (MAR-Scale), Medication Adherence Report Scale for Asthma (MARS-A), and Test of the Adherence to Inhalers (TAI); (b) reliability of the TAI; and (c) structural validity of the Adherence Questionnaire, MAR-Scale, MARS-A, and TAI. We also found limited negative evidence of: (d) hypotheses testing of the Adherence Questionnaire; (e) reliability of the MARS-A; and (f) criterion validity of the MARS-A and TAI. Our results highlighted the need to conduct further high-quality studies that will positively evaluate the reliability, validity, and responsiveness of the available PROs.
Meunier, Jean-Christophe; Roskam, Isabelle
2009-01-01
This study presents a validation of a scale that assesses parents' childrearing behavior toward young children. The scale was validated on 565 parents of 2- to 7-year-old children. The current results replicated the factor solution of the original scale designed for parents of school-aged children. The scale demonstrated good psychometric properties: moderate to high internal consistency, the expected relations with criterion variables (parental self-efficacy beliefs, child's behavior and personality), and discriminative properties according to the parents' gender and educational level, the child's age and gender, and the difference between referred and nonreferred children.
Assessing Upper Extremity Motor Function in Practice of Virtual Activities of Daily Living
Adams, Richard J.; Lichter, Matthew D.; Krepkovich, Eileen T.; Ellington, Allison; White, Marga; Diamond, Paul T.
2015-01-01
A study was conducted to investigate the criterion validity of measures of upper extremity (UE) motor function derived during practice of virtual activities of daily living (ADLs). Fourteen hemiparetic stroke patients employed a Virtual Occupational Therapy Assistant (VOTA), consisting of a high-fidelity virtual world and a Kinect™ sensor, in four sessions of approximately one hour in duration. An Unscented Kalman Filter-based human motion tracking algorithm estimated UE joint kinematics in real-time during performance of virtual ADL activities, enabling both animation of the user’s avatar and automated generation of metrics related to speed and smoothness of motion. These metrics, aggregated over discrete sub-task elements during performance of virtual ADLs, were compared to scores from an established assessment of UE motor performance, the Wolf Motor Function Test (WMFT). Spearman’s rank correlation analysis indicates a moderate correlation between VOTA-derived metrics and the time-based WMFT assessments, supporting the criterion validity of VOTA measures as a means of tracking patient progress during an UE rehabilitation program that includes practice of virtual ADLs. PMID:25265612
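The Spearman rank correlation used in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function names are hypothetical, and the abstract does not report the raw VOTA metrics or WMFT times, so any input data would have to be supplied by the reader. The sketch ranks both variables (averaging ranks for ties) and then applies the Pearson formula to the ranks.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value is tied with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of ranks pos+1 .. end+1
        for i in order[pos:end + 1]:
            ranks[i] = shared
        pos = end + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

Because only ranks enter the calculation, the coefficient is unaffected by any monotonic rescaling of either measure, which suits a comparison between sensor-derived metrics and timed test scores that sit on different scales.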
Rikli, Roberta E; Jones, C Jessie
2013-04-01
To develop and validate criterion-referenced fitness standards for older adults that predict the level of capacity needed for maintaining physical independence into later life. The proposed standards were developed for use with a previously validated test battery for older adults, the Senior Fitness Test (Rikli, R. E., & Jones, C. J. (2001). Development and validation of a functional fitness test for community-residing older adults. Journal of Aging and Physical Activity, 6, 127-159; Rikli, R. E., & Jones, C. J. (1999a). Senior fitness test manual. Champaign, IL: Human Kinetics.). A criterion measure to assess physical independence was identified. Next, scores from a subset of 2,140 "moderate-functioning" older adults from a larger cross-sectional database, together with findings from longitudinal research on physical capacity and aging, were used as the basis for proposing fitness standards (performance cut points) associated with having the ability to function independently. Validity and reliability analyses were conducted to test the standards for their accuracy and consistency as predictors of physical independence. Performance standards are presented for men and women ages 60-94 indicating the level of fitness associated with remaining physically independent until late in life. Reliability and validity indicators for the standards ranged between .79 and .97. The proposed standards provide easy-to-use, previously unavailable methods for evaluating physical capacity in older adults relative to that associated with physical independence. Most importantly, the standards can be used in planning interventions that target specific areas of weakness, thus reducing risk for premature loss of mobility and independence.
Revision, Criterion Validity, and Multi-group Assessment of the Reactions to Homosexuality Scale
Smolenski, Derek J.; Diamond, Pamela M.; Ross, Michael W.; Simon Rosser, B. R.
2010-01-01
Internalized homonegativity encompasses negative attitudes toward one’s own sexual orientation, and is associated with negative mental and physical health outcomes. The Reactions to Homosexuality scale (Ross & Rosser, 1996), an instrument used to measure internalized homonegativity, has been criticized for including content irrelevant to the construct of internalized homonegativity. We revised the scale using exploratory and confirmatory factor analyses, and identified a seven-item, three-factor reduced version that demonstrated measurement invariance across racial/ethnic categorizations and between English and Spanish versions. We also investigated criterion validity by estimating correlations with hypothesized outcomes associated with outness, relationship status, sexual orientation, and gay community affiliation. The evidence of measurement invariance suggests that this scale is appropriate for pluralistic treatment or study groups. PMID:20954058
Moschella, Melissa
2016-06-01
This article explains the problems with Alan Shewmon's critique of brain death as a valid sign of human death, beginning with a critical examination of his analogy between brain death and severe spinal cord injury. The article then goes on to assess his broader argument against the necessity of the brain for adult human organismal integration, arguing that he fails to translate correctly from biological to metaphysical claims. Finally, on the basis of a deeper metaphysical analysis, I offer a revised rationale for the validity of the neurological criterion of human death. © The Author 2016. Published by Oxford University Press, on behalf of the Journal of Medicine and Philosophy Inc. All rights reserved.
Development of the beliefs about yoga scale.
Sohl, Stephanie J; Schnur, Julie B; Daly, Leslie; Suslov, Kathryn; Montgomery, Guy H
2011-01-01
Beliefs about yoga may influence participation in yoga and outcomes of yoga interventions. There is currently no scale appropriate for assessing these beliefs in the general U.S. population. This study took the first steps in developing and validating a Beliefs About Yoga Scale (BAYS) to assess beliefs about yoga that may influence people's engagement in yoga interventions. Items were generated based on previously published research about perceptions of yoga and reviewed by experts within the psychology and yoga communities. 426 adult participants were recruited from an urban medical center to respond to these items. The mean age was 40.7 (SD=13.5) years. Participants completed the BAYS and seven additional indicators of criterion-related validity. The BAYS demonstrated internal consistency (11 items; α=0.76) and three factors emerged: expected health benefits, expected discomfort, and expected social norms. The factor structure was confirmed: χ²(41, n=213)=72.06, p<.001; RMSEA=.06, p=.23. Criterion-related validity was supported by positive associations of the BAYS with past experiences and future intentions related to yoga. This initial analysis of the BAYS demonstrated that it is an adequately reliable and valid measure of beliefs about yoga with a three-factor structure. However, the scale may need to be modified based on the population to which it is applied.
Messner, Steven F.; Raffalovich, Lawrence E.; Sutton, Gretchen M.
2011-01-01
This paper assesses the extent to which the infant mortality rate might be treated as a “proxy” for poverty in research on cross-national variation in homicide rates. We have assembled a pooled, cross-sectional time-series dataset for 16 advanced nations over the 1993–2000 period that includes standard measures of infant mortality and homicide and also contains information on two commonly used “income-based” poverty measures: a measure intended to reflect “absolute” deprivation and a measure intended to reflect “relative” deprivation. With these data, we are able to assess the criterion validity of the infant mortality rate with reference to the two income-based poverty measures. We are also able to estimate the effects of the various indicators of disadvantage on homicide rates in regression models, thereby assessing construct validity. The results reveal that the infant mortality rate is more strongly correlated with “relative poverty” than with “absolute poverty,” although much unexplained variance remains. In the regression models, the measure of infant mortality and the relative poverty measure yield significant positive effects on homicide rates, while the absolute poverty measure does not exhibit any significant effects. Our analyses suggest that it would be premature to dismiss relative deprivation in cross-national research on homicide, and that disadvantage is best conceptualized and measured as a multidimensional construct. PMID:21643432
Erel, Suat; Şimşek, İbrahim Engin; Özkan, Hüseyin
2015-01-01
The aim of this study was to analyze the validity and reliability of the Turkish version (ICOAP-TR) of the intermittent and constant osteoarthritis pain (ICOAP) questionnaire in patients with knee osteoarthritis (OA). Thirty-eight volunteer patients diagnosed with knee OA answered the questionnaire twice with an interval of 2-4 days. The reliability of the measurement was assessed using Cronbach's alpha coefficient and intraclass correlation (ICC) for test-retest reliability. Criterion validity was tested against the Western Ontario and McMaster Universities Arthritis Index (WOMAC) pain score and visual analog scale (VAS) designed to assess the perceived discomfort rated by the patient. Test-retest reliability was found to be ICC=0.942 for total score, 0.902 for constant pain subscale, and 0.945 for intermittent pain subscale. Internal consistency was tested using Cronbach's alpha and was found to be 0.970 for total score, 0.948 for constant pain subscale, and 0.972 for intermittent pain subscale. For criterion validity, the correlation between the total score of ICOAP-TR and WOMAC pain subscale was r=0.779 (p<0.05), and correlation between total score of ICOAP-TR and VAS was r=0.570 (p<0.05). The ICOAP-TR is a reliable and valid instrument to be used with patients with knee OA.
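Several abstracts in this listing, including the one above, report Cronbach's alpha as the measure of internal consistency. A minimal sketch of the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), is given below. The function name and the data layout (one row per respondent, one column per item) are assumptions made for illustration; none of the studies' item-level data appear in the abstracts.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a respondents-by-items table of scores.

    rows: list of equal-length lists, one per respondent (hypothetical layout).
    Sample variances (n - 1 denominator) are used throughout.
    """
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    # Variance of each item across respondents.
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    # Variance of each respondent's total score.
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)
```

When every item ranks respondents identically, the item variances account for only 1/k of the total-score variance and alpha reaches 1; as items become less correlated, the ratio rises and alpha falls.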
Psychometric Characteristics of a Vocational Preference Inventory Short Form.
ERIC Educational Resources Information Center
Lowman, Rodney L.; Schurman, Susan J.
1982-01-01
The psychometric properties of a revised version of Holland's Vocational Preference Inventory were assessed using federal government employees. Factor analyses, interscale correlations, measures of internal consistency, and criterion group profiles are presented. Evidence was supportive of the validity of the revised form. (Author/BW)
Study to validate the Non-Interference Performance Assessment (NIPA) technique
NASA Technical Reports Server (NTRS)
Seeman, J. S.; Murphy, G. L.
1973-01-01
The NIPA (Non-Interference Performance Assessment) technique involves direct observation of group verbal activities by trained observers who rate the emotional content (affect) of each verbal interaction as either positive, negative, or neutral. During the test, in which four men were confined for 90 consecutive days, feasibility of the NIPA technique was demonstrated and observer reliability was verified. However, the validity of the test was not proved because an independent criterion measure of morale for the confined crew was lacking. There were indications, however, that NIPA measures were tracking changes in crew morale. At approximately the two-thirds point (Days 60 to 70), morale apparently fell dramatically for a period of about ten days, and simultaneously NIPA measure of positive verbalization decreased in number. A need was indicated for a separate study to apply the NIPA technique under experimental conditions and using a clearly defined criterion measure against which the ability of NIPA observations to truly measure morale changes could be determined.
Criterion-Related Validity of the TOEFL iBT Listening Section. TOEFL iBT Research Report. RR-09-02
ERIC Educational Resources Information Center
Sawaki, Yasuyo; Nissan, Susan
2009-01-01
The study investigated the criterion-related validity of the "Test of English as a Foreign Language"[TM] Internet-based test (TOEFL[R] iBT) Listening section by examining its relationship to a criterion measure designed to reflect language-use tasks that university students encounter in everyday academic life: listening to academic…
Validation and cross cultural adaptation of the Italian version of the Harris Hip Score.
Dettoni, Federico; Pellegrino, Pietro; La Russa, Massimo R; Bonasia, Davide E; Blonna, Davide; Bruzzone, Matteo; Castoldi, Filippo; Rossi, Roberto
2015-01-01
The Harris Hip Score (HHS) is one of the most widely used health-related quality of life (HRQOL) measures for the assessment of hip pathology; in spite of this, neither a validation study nor an official Italian version has yet been provided. The aim of this study was to create a valid and reliable Italian version of the HHS. The score was translated into Italian and adapted; then 103 patients with different hip pathologies were evaluated using this HHS version as well as the WOMAC and SF-12 questionnaires. Content, construct and criterion validity were tested, as were interobserver reliability, test-retest reliability and internal consistency. Cross-cultural adaptation was easy, and only minor adaptation was required in the translation process. Construct and criterion validity of the Italian version of the HHS were confirmed by satisfactory values of Spearman's rho for the correlations between specific domains of the HHS and the WOMAC and SF-12 scores. Interobserver and test-retest reliabilities were 0.996 and 0.975, respectively; Cronbach's alpha for internal consistency was 0.816. Statistical and clinical analysis showed that the HHS is highly valid and reliable in this new Italian version.
Soble, Jason R; Bain, Kathleen M; Bailey, K Chase; Kirton, Joshua W; Marceaux, Janice C; Critchfield, Edan A; McCoy, Karin J M; O'Rourke, Justin J F
2018-01-08
Embedded performance validity tests (PVTs) allow for continuous assessment of invalid performance throughout neuropsychological test batteries. This study evaluated the utility of the Wechsler Memory Scale-Fourth Edition (WMS-IV) Logical Memory (LM) Recognition score as an embedded PVT using the Advanced Clinical Solutions (ACS) for WAIS-IV/WMS-IV Effort System. This mixed clinical sample comprised 97 total participants, 71 of whom were classified as valid and 26 as invalid based on three well-validated, freestanding criterion PVTs. Overall, the LM embedded PVT demonstrated poor concordance with the criterion PVTs and unacceptable psychometric properties using ACS validity base rates (42% sensitivity/79% specificity). Moreover, 15-39% of participants obtained an invalid ACS base rate despite having a normatively-intact age-corrected LM Recognition total score. Receiver operating characteristic curve analysis revealed a Recognition total score cutoff of < 61% correct improved specificity (92%) while sensitivity remained weak (31%). Thus, results indicated the LM Recognition embedded PVT is not appropriate for use from an evidence-based perspective, and that clinicians may be faced with reconciling how a normatively intact cognitive performance on the Recognition subtest could simultaneously reflect invalid performance validity.
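The sensitivity and specificity figures in the abstract above follow from a two-by-two classification table. The sketch below shows the arithmetic. The group sizes (26 invalid, 71 valid) come from the abstract, but the cell counts used in the example (11 true positives, 56 true negatives) are hypothetical values chosen only because they round to the reported 42% and 79%; the abstract does not report them, and other counts are possible.

```python
def sensitivity(true_pos, false_neg):
    """Proportion of criterion-invalid cases that the embedded test flags."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Proportion of criterion-valid cases that the embedded test passes."""
    return true_neg / (true_neg + false_pos)


# Hypothetical cell counts, NOT reported in the abstract; they are only
# consistent with the group sizes (26 invalid, 71 valid) and rounded rates.
invalid_flagged, invalid_missed = 11, 15   # sums to 26 invalid participants
valid_passed, valid_flagged = 56, 15       # sums to 71 valid participants

sens = sensitivity(invalid_flagged, invalid_missed)
spec = specificity(valid_passed, valid_flagged)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
```

Raising the cutoff stringency trades one rate against the other, which is the pattern the abstract describes when specificity improves to 92% while sensitivity stays low.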
Validation of the Spanish Addiction Severity Index Multimedia Version (S-ASI-MV).
Butler, Stephen F; Redondo, José Pedro; Fernandez, Kathrine C; Villapiano, Albert
2009-01-01
This study aimed to develop and test the reliability and validity of a Spanish adaptation of the ASI-MV, a computer administered version of the Addiction Severity Index, called the S-ASI-MV. Participants were 185 native Spanish-speaking adult clients from substance abuse treatment facilities serving Spanish-speaking clients in Florida, New Mexico, California, and Puerto Rico. Participants were administered the S-ASI-MV as well as Spanish versions of the general health subscale of the SF-36, the work and family unit subscales of the Social Adjustment Scale Self-Report, the Michigan Alcohol Screening Test, the alcohol and drug subscales of the Personality Assessment Inventory, and the Hopkins Symptom Checklist-90. Three-to-five-day test-retest reliability was examined along with criterion validity, convergent/discriminant validity, and factorial validity. Measurement invariance between the English and Spanish versions of the ASI-MV was also examined. The S-ASI-MV demonstrated good test-retest reliability (ICCs for composite scores between .59 and .93), criterion validity (rs for composite scores between .66 and .87), and convergent/discriminant validity. Factorial validity and measurement invariance were demonstrated. These results compared favorably with those reported for the original interviewer version of the ASI and the English version of the ASI-MV.
Guo, Yi; Bian, Jiang; Leavitt, Trevor; Vincent, Heather K; Vander Zalm, Lindsey; Teurlings, Tyler L; Smith, Megan D; Modave, François
2017-03-07
Regular physical activity can not only help with weight management, but also lower cardiovascular risks, cancer rates, and chronic disease burden. Yet, only approximately 20% of Americans currently meet the physical activity guidelines recommended by the US Department of Health and Human Services. With the rapid development of mobile technologies, mobile apps have the potential to improve participation rates in exercise programs, particularly if they are evidence-based and are of sufficient content quality. The goal of this study was to develop and test an instrument, which was designed to score the content quality of exercise program apps with respect to the exercise guidelines set forth by the American College of Sports Medicine (ACSM). We conducted two focus groups (N=14) to elicit input for developing a preliminary 27-item scoring instrument based on the ACSM exercise prescription guidelines. Three reviewers who were not sports medicine experts independently scored 28 exercise program apps using the instrument. Inter- and intra-rater reliability was assessed among the 3 reviewers. An expert reviewer, a Fellow of the ACSM, also scored the 28 apps to create criterion scores. Criterion validity was assessed by comparing nonexpert reviewers' scores to the criterion scores. Overall, inter- and intra-rater reliability was high with most coefficients being greater than .7. Inter-rater reliability coefficients ranged from .59 to .99, and intra-rater reliability coefficients ranged from .47 to 1.00. All reliability coefficients were statistically significant. Criterion validity was found to be excellent, with the weighted kappa statistics ranging from .67 to .99, indicating a substantial agreement between the scores of expert and nonexpert reviewers. Finally, all apps scored poorly against the ACSM exercise prescription guidelines. None of the apps received a score greater than 35, out of a possible maximal score of 70.
We have developed and presented valid and reliable scoring instruments for exercise program apps. Our instrument may be useful for consumers and health care providers who are looking for apps that provide safe, progressive general exercise programs for health and fitness. ©Yi Guo, Jiang Bian, Trevor Leavitt, Heather K Vincent, Lindsey Vander Zalm, Tyler L Teurlings, Megan D Smith, François Modave. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 07.03.2017.
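The weighted kappa statistic used above to compare nonexpert and expert scores can be sketched as follows. This is an illustration under stated assumptions, not the study's procedure: the abstract does not say whether linear or quadratic weights were used, so linear disagreement weights are assumed here, ratings are assumed to be coded as integers 0 to n_categories - 1, and the function name is hypothetical.

```python
def linear_weighted_kappa(rater_a, rater_b, n_categories):
    """Cohen's kappa with linear disagreement weights |i - j| / (k - 1).

    rater_a, rater_b: equal-length lists of integer category codes in
    range(n_categories). Assumes at least two categories are in use, so
    that the chance-expected disagreement is non-zero.
    """
    n = len(rater_a)
    if n == 0 or n != len(rater_b):
        raise ValueError("ratings must be non-empty and of equal length")
    k = n_categories
    # Observed joint proportions of (rating by A, rating by B).
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1 / n
    # Marginal proportions for each rater.
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    # Weighted disagreement actually observed versus expected by chance.
    obs_dis = sum(abs(i - j) / (k - 1) * observed[i][j]
                  for i in range(k) for j in range(k))
    exp_dis = sum(abs(i - j) / (k - 1) * row[i] * col[j]
                  for i in range(k) for j in range(k))
    return 1 - obs_dis / exp_dis
```

Weighting matters for ordinal item scores because a one-point gap between reviewers is penalised less than a gap spanning the whole scale, whereas unweighted kappa treats every disagreement alike.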
Evaluation of Measurement Instrument Criterion Validity in Finite Mixture Settings
ERIC Educational Resources Information Center
Raykov, Tenko; Marcoulides, George A.; Li, Tenglong
2016-01-01
A method for evaluating the validity of multicomponent measurement instruments in heterogeneous populations is discussed. The procedure can be used for point and interval estimation of criterion validity of linear composites in populations representing mixtures of an unknown number of latent classes. The approach permits also the evaluation of…
Evaluation of Validity and Reliability for Hierarchical Scales Using Latent Variable Modeling
ERIC Educational Resources Information Center
Raykov, Tenko; Marcoulides, George A.
2012-01-01
A latent variable modeling method is outlined, which accomplishes estimation of criterion validity and reliability for a multicomponent measuring instrument with hierarchical structure. The approach provides point and interval estimates for the scale criterion validity and reliability coefficients, and can also be used for testing composite or…
ERIC Educational Resources Information Center
Anderson, Daniel; Alonzo, Julie; Tindal, Gerald
2010-01-01
In this technical report, we present the results of a study examining the relation between the math measures available on the easyCBM[R] online benchmark and progress monitoring assessment system and the Oregon statewide assessment of mathematics. Designed for use within a response to intervention (RTI) framework, easyCBM[R] is intended to help…
ERIC Educational Resources Information Center
Dedrick, Robert F.; Shaunessy-Dedrick, Elizabeth; Suldo, Shannon M.; Ferron, John M.
2015-01-01
In two studies (ns = 312 and 1,149) with 9th- to 12th-grade students in pre-International Baccalaureate (IB) and IB Diploma programs, we evaluated the reliability, factor structure, measurement invariance, and criterion-related validity of the scores from the School Attitude Assessment Survey-Revised (SAAS-R). Reliabilities of the five SAAS-R subscale…
ERIC Educational Resources Information Center
Scheeringa, Michael S.; Haslett, Nancy
2010-01-01
The need to assess Diagnostic and Statistical Manual, Fourth Edition (DSM-IV) disorders in children younger than 7 years of age has intensified as clinical efforts to diagnose and treat this population have increased, and clinical research on psychopathology has advanced. A new diagnostic instrument for young children was created, the Diagnostic…
Monacis, Lucia; de Palo, Valeria; Griffiths, Mark D.; Sinatra, Maria
2016-01-01
Background and aims The inclusion of Internet Gaming Disorder (IGD) in Section III of the fifth edition of the Diagnostic and Statistical Manual of Mental Disorders has increased the interest of researchers in the development of new standardized psychometric tools for the assessment of such a disorder. To date, the nine-item Internet Gaming Disorder Scale – Short-Form (IGDS9-SF) has only been validated in English, Portuguese, and Slovenian languages. Therefore, the aim of this investigation was to examine the psychometric properties of the IGDS9-SF in an Italian-speaking sample. Methods A total of 757 participants were recruited to the present study. Confirmatory factor analysis and multi-group analyses were applied to assess the construct validity. Reliability analyses comprised the average variance extracted, the standard error of measurement, and the factor determinacy coefficient. Convergent and criterion validities were established through the associations with other related constructs. The receiver operating characteristic curve analysis was used to determine an empirical cut-off point. Results Findings confirmed the single-factor structure of the instrument, its measurement invariance at the configural level, and the convergent and criterion validities. Satisfactory levels of reliability and a cut-off point of 21 were obtained. Discussion and conclusions The present study provides validity evidence for the use of the Italian version of the IGDS9-SF and may foster research into gaming addiction in the Italian context. PMID:27876422
Wood, David L; Sawicki, Gregory S; Miller, M David; Smotherman, Carmen; Lukens-Bull, Katryne; Livingood, William C; Ferris, Maria; Kraemer, Dale F
2014-01-01
National consensus statements recommend that providers regularly assess the transition readiness skills of adolescents and young adults (AYA). In 2010 we developed a 29-item version of the Transition Readiness Assessment Questionnaire (TRAQ). We reevaluated item performance and factor structure, and reassessed the TRAQ's reliability and validity. We surveyed youth from 3 academic clinics in Jacksonville, Florida; Chapel Hill, North Carolina; and Boston, Massachusetts. Participants were AYA with special health care needs aged 14 to 21 years. From a convenience sample of 306 patients, we conducted item reduction strategies and exploratory factor analysis (EFA). On a second convenience sample of 221 patients, we conducted confirmatory factor analysis (CFA). Internal reliability was assessed by Cronbach's alpha, and criterion validity was analyzed with the Wilcoxon rank sum test and mixed linear models. The item reduction and EFA resulted in a 20-item scale with 5 identified subscales. The CFA conducted on a second sample provided a good fit to the data. The overall scale has high reliability (Cronbach's alpha = .94) and good reliability for 4 of the 5 subscales (Cronbach's alpha ranging from .90 to .77 in the pooled sample). Each of the 5 subscale scores was significantly higher for adolescents aged 18 years and older versus those younger than 18 (P < .0001) in both univariate and multivariate analyses. The 20-item, 5-factor structure for the TRAQ is supported by EFA and CFA on independent samples and has good internal reliability and criterion validity. Additional work is needed to expand or revise the TRAQ subscales and test their predictive validity. Copyright © 2014 Academic Pediatric Association. Published by Elsevier Inc. All rights reserved.
Validity of an adapted Household Food Insecurity Access Scale in urban households in Iran.
Mohammadi, Fatemeh; Omidvar, Nasrin; Houshiar-Rad, Anahita; Khoshfetrat, Mohammad-Reza; Abdollahi, Morteza; Mehrabi, Yadollah
2012-01-01
To assess the validity of a locally adapted Household Food Insecurity Access Scale (HFIAS) in the measurement of household food insecurity (FI) in the city of Tehran. A cross-sectional study. Urban households were selected through a systematic cluster sampling method from six different districts of Tehran. The socio-economic status of households was evaluated using a questionnaire by means of interviews. An adapted HFIAS was used to measure FI. Content validity was assessed by an expert panel, and the questionnaire was then tested among ten households for clarity. Criterion validity was assessed by comparing the measure with a number of determinants and consequences of FI. Internal consistency was evaluated by Cronbach's α and exploratory factor analysis. For repeatability, the questionnaire was administered twice to twenty-five households at an interval of 20 d and Pearson's correlation coefficient was calculated. A total of 416 households. In all, 11·8 %, 14·4 % and 17·5 % of the households were severely, moderately and mildly food insecure, respectively. Cronbach's α was 0·855. A significant correlation was observed between the two administrations of the questionnaire (r = 0·895, P < 0·001). Factor analysis of HFIAS items revealed two factors: the first five items as factor 1 (mild-to-moderate FI) and the last four as factor 2 (severe FI). Heads of food-secure households had higher education and higher job position compared with heads of food-insecure households (P < 0·001). Income and expenditure were lower in food-insecure households compared with food-secure households. Adapted HFIAS showed acceptable levels of internal consistency, criterion validity and reliability in assessing household FI among Tehranians.
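For readers unfamiliar with the internal-consistency statistic reported above, the following is a minimal sketch of Cronbach's α computed from a respondents-by-items score matrix, using the standard formula α = k/(k-1) · (1 - Σ item variances / variance of total scores). The function name and the toy responses are hypothetical and are not the study's data.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    if len(rows) < 2:
        raise ValueError("need at least two respondents")
    k = len(rows[0])
    if k < 2 or any(len(r) != k for r in rows):
        raise ValueError("every respondent needs the same k >= 2 items")
    # Sample variance of each item across respondents.
    item_vars = [variance([r[i] for r in rows]) for i in range(k)]
    # Sample variance of the summed scale score.
    total_var = variance([sum(r) for r in rows])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical responses: 4 respondents, 3 items (illustration only).
toy_rows = [[2, 3, 3], [4, 4, 5], [1, 2, 1], [5, 5, 4]]
toy_alpha = cronbach_alpha(toy_rows)
```

The sample variance (n - 1 denominator) is used for both the items and the total; because α depends only on their ratio, using the population variance throughout would give the same value.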
Quantitative model validation of manipulative robot systems
NASA Astrophysics Data System (ADS)
Kartowisastro, Iman Herwidiana
This thesis is concerned with applying the distortion quantitative validation technique to a robot manipulative system with revolute joints. When the distortion technique is used to validate a model quantitatively, the model parameter uncertainties are taken into account in assessing the faithfulness of the model, and this approach is relatively more objective than the commonly used visual comparison method. The industrial robot is represented by the TQ MA2000 robot arm. Details of the mathematical derivation of the distortion technique are given, which explain the required distortion of the constant parameters within the model and the assessment of model adequacy. Due to the complexity of a robot model, only the first three degrees of freedom are considered, and all links are assumed rigid. The modelling involves the Newton-Euler approach to obtain the dynamics model, and the Denavit-Hartenberg convention is used throughout the work. The conventional feedback control system is used in developing the model. The response of the system to parameter changes is investigated, as some parameters are redundant. This work is important so that the most important parameters to be distorted can be selected, and this leads to a new term called the fundamental parameters. The transfer function approach has been chosen to validate an industrial robot quantitatively against the measured data due to its practicality. Initially, the assessment of the model fidelity criterion indicated that the model was not capable of explaining the transient record in terms of the model parameter uncertainties. Further investigations led to significant improvements of the model and better understanding of the model properties. After several improvements in the model, the fidelity criterion obtained was almost satisfied. Although the fidelity criterion is slightly less than unity, it has been shown that the distortion technique can be applied in a robot manipulative system.
Using the validated model, the importance of friction terms in the model was highlighted with the aid of the partition control technique. It was also shown that the conventional feedback control scheme was insufficient for a robot manipulative system due to high nonlinearity which was inherent in the robot manipulator.
Simulated Driving Assessment (SDA) for teen drivers: results from a validation study.
McDonald, Catherine C; Kandadai, Venk; Loeb, Helen; Seacrist, Thomas S; Lee, Yi-Ching; Winston, Zachary; Winston, Flaura K
2015-06-01
Driver error and inadequate skill are common critical reasons for novice teen driver crashes, yet few validated, standardised assessments of teen driving skills exist. The purpose of this study is to evaluate the construct and criterion validity of a newly developed Simulated Driving Assessment (SDA) for novice teen drivers. The SDA's 35 min simulated drive incorporates 22 variations of the most common teen driver crash configurations. Driving performance was compared for 21 inexperienced teens (age 16-17 years, provisional license ≤90 days) and 17 experienced adults (age 25-50 years, license ≥5 years, drove ≥100 miles per week, no collisions or moving violations ≤3 years). SDA driving performance (Error Score) was based on driving safety measures derived from simulator and eye-tracking data. Negative driving outcomes included simulated collisions or run-off-the-road incidents. A professional driving evaluator/instructor (DEI Score) reviewed videos of SDA performance. The SDA demonstrated construct validity: (1) teens had a higher Error Score than adults (30 vs. 13, p=0.02); (2) for each additional error committed, the RR of a participant's propensity for a simulated negative driving outcome increased by 8% (95% CI 1.05 to 1.10, p<0.01). The SDA demonstrated criterion validity: Error Score was correlated with DEI Score (r=-0.66, p<0.001). This study supports the concept of validated simulated driving tests like the SDA to assess novice driver skill in complex and hazardous driving scenarios. The SDA, as a standard protocol to evaluate teen driver performance, has the potential to facilitate screening and assessment of teen driving readiness and could be used to guide targeted skill training.
Automated Ecological Assessment of Physical Activity: Advancing Direct Observation.
Carlson, Jordan A; Liu, Bo; Sallis, James F; Kerr, Jacqueline; Hipp, J Aaron; Staggs, Vincent S; Papa, Amy; Dean, Kelsey; Vasconcelos, Nuno M
2017-12-01
Technological advances provide opportunities for automating direct observations of physical activity, which allow for continuous monitoring and feedback. This pilot study evaluated the initial validity of computer vision algorithms for ecological assessment of physical activity. The sample comprised 6630 seconds per camera (three cameras in total) of video capturing up to nine participants engaged in sitting, standing, walking, and jogging in an open outdoor space while wearing accelerometers. Computer vision algorithms were developed to assess the number and proportion of people in sedentary, light, moderate, and vigorous activity, and group-based metabolic equivalents of tasks (MET)-minutes. Means and standard deviations (SD) of bias/difference values, and intraclass correlation coefficients (ICC) assessed the criterion validity compared to accelerometry separately for each camera. The number and proportion of participants sedentary and in moderate-to-vigorous physical activity (MVPA) had small biases (within 20% of the criterion mean) and the ICCs were excellent (0.82-0.98). Total MET-minutes were slightly underestimated by 9.3-17.1% and the ICCs were good (0.68-0.79). The standard deviations of the bias estimates were moderate-to-large relative to the means. The computer vision algorithms appeared to have acceptable sample-level validity (i.e., across a sample of time intervals) and are promising for automated ecological assessment of activity in open outdoor settings, but further development and testing is needed before such tools can be used in a diverse range of settings. PMID:29194358
Examining the Criterion-Related Validity of the Pervasive Developmental Disorder Behavior Inventory
ERIC Educational Resources Information Center
McMorris, Carly A.; Perry, Adrienne
2015-01-01
The Pervasive Developmental Disorder Behavior Inventory is a questionnaire designed to aid in the diagnosis of pervasive developmental disorders or autism spectrum disorders. The Pervasive Developmental Disorder Behavior Inventory assesses adaptive and maladaptive behaviors associated with pervasive developmental disorders and provides an…
Teacher Competency: A Public Farce!
ERIC Educational Resources Information Center
Weitman, Catheryn J.
The current popularity of teacher testing allows for content, criterion, and construct validity to be assessed, as pertaining to achievement levels on basic knowledge examinations. Teacher competency is a complex issue that is inaccurately confused with or identified as measures derived from academic testing. The problems in addressing the…
Uehara, Kosuke; Ogura, Koichi; Akiyama, Toru; Shinoda, Yusuke; Iwata, Shintaro; Kobayashi, Eisuke; Tanzawa, Yoshikazu; Yonemoto, Tsukasa; Kawano, Hirotaka; Kawai, Akira
2017-09-01
The Musculoskeletal Tumor Society (MSTS) scoring system developed in 1993 is a widely used disease-specific evaluation tool for assessment of physical function in patients with musculoskeletal tumors; however, only a few studies have confirmed its reliability and validity. The aim of this study was to validate the MSTS scoring system for the upper extremity (MSTS-UE) in Japanese patients with musculoskeletal tumors for use by others in research. Does the MSTS-UE have: (1) sufficient reliability and internal consistency; (2) adequate construct validity; and (3) reasonable criterion validity in comparison to the Toronto Extremity Salvage Score (TESS) or SF-36? Reliability was assessed using test-retest analysis, and internal consistency was evaluated with Cronbach's alpha coefficient. Construct validity was evaluated using a scree plot to confirm the construct number and the Akaike information criterion network. Criterion validity was evaluated by comparing the MSTS-UE with the TESS and SF-36. The test-retest reliability, measured by the intraclass correlation coefficient (0.95; 95% CI, 0.91-0.97), was excellent, and internal consistency, measured by Cronbach's α (0.7; 95% CI, 0.53-0.81), was acceptable. There were no ceiling or floor effects. The Akaike information criterion network showed that lifting ability, pain, and dexterity played central roles among the components. The MSTS-UE showed substantial correlation with the TESS scoring scale (r = 0.75; p < 0.001) and fair correlation with the SF-36 physical component summary (r = 0.37; p = 0.007). Although the MSTS-UE showed slight correlation with the SF-36 mental component summary, the emotional acceptance component of the MSTS-UE showed fair correlation (r = 0.29; p = 0.039). We can conclude that the MSTS is not an adequate measure of general health-related quality of life; however, this system was designed mainly to be a simple measure of function in a single extremity.
To evaluate the mental state of patients with musculoskeletal tumors in the upper extremity, further study is needed.
Hoenig, Helen M; Amis, Kristopher; Edmonds, Carol; Morgan, Michelle S; Landerman, Lawrence; Caves, Kevin
2017-01-01
Background There is limited research about the effects of video quality on the accuracy of assessments of physical function. Methods A repeated measures study design was used to assess reliability and validity of the finger-nose test (FNT) and the finger-tapping test (FTT) carried out with 50 veterans who had impairment in gross and/or fine motor coordination. Videos were scored by expert raters under eight differing conditions, including in-person, high definition video with slow motion review and standard speed videos with varying bit rates and frame rates. Results FTT inter-rater reliability was excellent with slow motion video (ICC 0.98-0.99) and good (ICC 0.59) under the normal speed conditions. Inter-rater reliability for FNT 'attempts' was excellent (ICC 0.97-0.99) for all viewing conditions; for FNT 'misses' it was good to excellent (ICC 0.89) with slow motion review but substantially worse (ICC 0.44) on the normal speed videos. FTT criterion validity (i.e. compared to slow motion review) was excellent (β = 0.94) for the in-person rater and good (β = 0.77) on normal speed videos. Criterion validity for FNT 'attempts' was excellent under all conditions (r ≥ 0.97) and for FNT 'misses' it was good to excellent under all conditions (β = 0.61-0.81). Conclusions In general, the inter-rater reliability and validity of the FNT and FTT assessed via video technology is similar to standard clinical practices, but is enhanced with slow motion review and/or higher bit rate.
Al Ansari, Ahmed; Donnon, Tyrone; Al Khalifa, Khalid; Darwish, Abdulla; Violato, Claudio
2014-01-01
Background The purpose of this study was to conduct a meta-analysis on the construct and criterion validity of multi-source feedback (MSF) to assess physicians and surgeons in practice. Methods In this study, we followed the guidelines for the reporting of observational studies included in a meta-analysis. In addition to PubMed and MEDLINE databases, the CINAHL, EMBASE, and PsycINFO databases were searched from January 1975 to November 2012. All articles listed in the references of the MSF studies were reviewed to ensure that all relevant publications were identified. All 35 articles were independently coded by two authors (AA, TD), and any discrepancies (eg, effect size calculations) were reviewed by the other authors (KA, AD, CV). Results Physician/surgeon performance measures from 35 studies were identified. A random-effects model of weighted mean effect size differences (d) resulted in: construct validity coefficients for the MSF system on physician/surgeon performance across different levels in practice ranged from d=0.14 (95% confidence interval [CI] 0.40–0.69) to d=1.78 (95% CI 1.20–2.30); construct validity coefficients for the MSF on physician/surgeon performance on two different occasions ranged from d=0.23 (95% CI 0.13–0.33) to d=0.90 (95% CI 0.74–1.10); concurrent validity coefficients for the MSF based on differences in assessor group ratings ranged from d=0.50 (95% CI 0.47–0.52) to d=0.57 (95% CI 0.55–0.60); and predictive validity coefficients for the MSF on physician/surgeon performance across different standardized measures ranged from d=1.28 (95% CI 1.16–1.41) to d=1.43 (95% CI 0.87–2.00). Conclusion The construct and criterion validity of the MSF system is supported by small to large effect size differences based on the MSF process and physician/surgeon performance across different clinical and nonclinical domain measures. PMID:24600300
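The abstract reports a random-effects model of weighted mean effect sizes but does not name the estimator. As a minimal sketch, assuming the widely used DerSimonian-Laird method, pooling study-level effect sizes d with known sampling variances could look like this. The function name and the toy inputs are hypothetical and are not values from the meta-analysis.

```python
import math


def random_effects_pool(effects, variances):
    """DerSimonian-Laird random-effects pooling.

    Returns (pooled_effect, ci_low, ci_high, tau2), where tau2 is the
    estimated between-study variance and the CI is a 95% normal interval.
    """
    if len(effects) != len(variances) or not effects:
        raise ValueError("effects and variances must be non-empty, equal length")
    k = len(effects)
    w = [1.0 / v for v in variances]  # fixed-effect (inverse-variance) weights
    sw = sum(w)
    fixed = sum(wi * d for wi, d in zip(w, effects)) / sw
    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(wi * (d - fixed) ** 2 for wi, d in zip(w, effects))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random-effects weights add tau2 to each study's sampling variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    sw_star = sum(w_star)
    pooled = sum(wi * d for wi, d in zip(w_star, effects)) / sw_star
    se = math.sqrt(1.0 / sw_star)
    return pooled, pooled - 1.96 * se, pooled + 1.96 * se, tau2


# Hypothetical inputs: two studies with equal precision (illustration only).
toy_pooled, toy_low, toy_high, toy_tau2 = random_effects_pool([0.2, 0.8], [0.01, 0.01])
```

When the studies disagree more than sampling error alone would explain, tau2 becomes positive and the confidence interval widens relative to a fixed-effect analysis.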
Teel, Elizabeth F; Slobounov, Semyon M
2015-03-01
To determine the criterion and content validity of a virtual reality (VR) balance module for use in clinical practice. Retrospective, VR balance module completed by participants during concussion baseline or assessment testing session. A Pennsylvania State University research laboratory. A total of 60 control and 28 concussed students and athletes from the Pennsylvania State University. None. This study examined: (1) the relationship between VR composite balance scores (final, stationary, yaw, pitch, and roll) and area of the center-of-pressure (eyes open and closed) scores and (2) group differences (normal volunteers and concussed student-athletes) on VR composite balance scores. With the exception of the stationary composite score, all other VR balance composite scores were significantly correlated with the center of pressure data obtained from a force platform. Significant correlations ranged from r = -0.273 to -0.704 for the eyes open conditions and from r = -0.353 to -0.876 for the eyes closed condition. When examining group differences on the VR balance composite modules, the concussed group did significantly (P < 0.01) worse on all measures compared with the control group. The VR balance module met or exceeded the criterion and content validity standard set by the current balance tools and may be appropriate for use in a clinical concussion setting. Virtual reality balance module is a valid tool for concussion assessment in clinical settings. This novel type of balance assessment may be more sensitive to concussion diagnoses, especially later (7-10 days) in the recovery phase than current clinical balance tools.
Using the Rasch Measurement Model in Psychometric Analysis of the Family Effectiveness Measure
McCreary, Linda L.; Conrad, Karen M.; Conrad, Kendon J.; Scott, Christy K.; Funk, Rodney R.; Dennis, Michael L.
2013-01-01
Background Valid assessment of family functioning can play a vital role in optimizing client outcomes. Because family functioning is influenced by family structure, socioeconomic context, and culture, existing measures of family functioning--primarily developed with nuclear, middle class European American families--may not be valid assessments of families in diverse populations. The Family Effectiveness Measure was developed to address this limitation. Objectives To test the Family Effectiveness Measure with data from a primarily low-income African American convenience sample, using the Rasch measurement model. Method A sample of 607 adult women completed the measure. Rasch analysis was used to assess unidimensionality, response category functioning, item fit, person reliability, differential item functioning by race and parental status, and item hierarchy. Criterion-related validity was tested using correlations with five other variables related to family functioning. Results The Family Effectiveness Measure measures two separate constructs: The effective family functioning construct was a psychometrically sound measure of the target construct that was more efficient due to the deletion of 22 items. The ineffective family functioning construct consisted of 16 of those deleted items but was not as strong psychometrically. Items in both constructs evidenced no differential item functioning by race. Criterion-related validity was supported for both. Discussion In contrast to the prevailing conceptualization that family functioning is a single construct, assessed by positively and negatively worded items, use of the Rasch analysis suggested the existence of two constructs. While the effective family functioning is a strong and efficient measure of family functioning, the ineffective family functioning will require additional item development and psychometric testing. PMID:23636342
The Multimedia Activity Recall for Children and Adolescents (MARCA): development and evaluation.
Ridley, Kate; Olds, Tim S; Hill, Alison
2006-05-26
Self-report recall questionnaires are commonly used to measure physical activity, energy expenditure and time use in children and adolescents. However, self-report questionnaires show low to moderate validity, mainly due to inaccuracies in recalling activity in terms of duration and intensity. Aside from recall errors, inaccuracies in estimating energy expenditure from self-report questionnaires are compounded by a lack of data on the energy cost of everyday activities in children and adolescents. This article describes the development of the Multimedia Activity Recall for Children and Adolescents (MARCA), a computer-delivered use-of-time instrument designed to address both the limitations of self-report recall questionnaires in children, and the lack of energy cost data in children. The test-retest reliability of the MARCA was assessed using a sample of 32 children (aged 11.8 +/- 0.7 y) who undertook the MARCA twice within 24-h. Criterion validity was assessed by comparing self-reports with accelerometer counts collected on a sample of 66 children (aged 11.6 +/- 0.8 y). Content and construct validity were assessed by establishing whether data collected using the MARCA on 1429 children (aged 11.9 +/- 0.8 y) exhibited relationships and trends in children's physical activity consistent with established findings from a number of previous research studies. Test-retest reliability was high with intra-class coefficients ranging from 0.88 to 0.94. The MARCA demonstrated criterion validity comparable to other self-report instruments with Spearman coefficients ranging from rho = 0.36 to 0.45, and provided evidence of good content and construct validity. The MARCA is a valid and reliable self-report questionnaire, capable of a wide variety of flexible use-of-time analyses related to both physical activity and sedentary behaviour, and offers advantages over existing pen-and-paper questionnaires.
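The criterion validity figures above are Spearman coefficients, that is, Pearson correlations computed on ranks. A minimal sketch, with ties given their average rank, is shown below; the function names and toy values are hypothetical and are not MARCA or accelerometer data.

```python
def average_ranks(values):
    """1-based ranks of values, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with n >= 2")
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical paired measurements (illustration only).
toy_rho = spearman_rho([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
```

Because only ranks enter the calculation, the coefficient is unaffected by monotone rescaling of either measure, which suits comparisons between self-reported minutes and accelerometer counts.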
Is the Simple Shoulder Test a valid outcome instrument for shoulder arthroplasty?
Hsu, Jason E; Russ, Stacy M; Somerson, Jeremy S; Tang, Anna; Warme, Winston J; Matsen, Frederick A
2017-10-01
The Simple Shoulder Test (SST) is a brief, inexpensive, and widely used patient-reported outcome tool, but it has not been rigorously evaluated for patients having shoulder arthroplasty. The goal of this study was to rigorously evaluate the validity of the SST for outcome assessment in shoulder arthroplasty using a systematic review of the literature and an analysis of its properties in a series of 408 surgical cases. SST scores, 36-Item Short Form Health Survey scores, and satisfaction scores were collected preoperatively and 2 years postoperatively. Responsiveness was assessed by comparing preoperative and 2-year postoperative scores. Criterion validity was determined by correlating the SST with the 36-Item Short Form Health Survey. Construct validity was tested through 5 clinical hypotheses regarding satisfaction, comorbidities, insurance status, previous failed surgery, and narcotic use. Scores after arthroplasty improved from 3.9 ± 2.8 to 10.2 ± 2.3 (P < .001). The change in SST correlated strongly with patient satisfaction (P < .001). The SST had large Cohen's d effect sizes and standardized response means. Criterion validity was supported by significant differences between satisfied and unsatisfied patients, those with more severe and less severe comorbidities, those with workers' compensation or Medicaid and other types of insurance, those with and without previous failed shoulder surgery, and those taking and those not taking narcotic pain medication before surgery (P < .005). These data combined with a systematic review of the literature demonstrate that the SST is a valid and responsive patient-reported outcome measure for assessing the outcomes of shoulder arthroplasty.
Discriminative and Criterion Validity of the Autism Spectrum Identity Scale (ASIS)
ERIC Educational Resources Information Center
McDonald, T. A. M.
2017-01-01
Individuals on the autism spectrum face stigma that can influence identity development. Previous research on the 22-item Autism Spectrum Identity Scale (ASIS) reported a four-factor structure with strong split-sample cross-validation and good internal consistency. This study reports the discriminative and criterion validity of the ASIS with other…
Tork, Hanan; Dassen, Theo; Lohrmann, Christa
2009-02-01
This paper is a report of a study to examine the psychometric properties of the Care Dependency Scale for Paediatrics in Germany and Egypt and to compare the care dependency of school-age children in both countries. Cross-cultural differences in care dependency of older adults have been documented in the literature, but little is known about the differences and similarities with regard to children's care dependency in different cultures. A convenience sample of 258 school-aged children from Germany and Egypt participated in the study in 2005. The reliability of the Care Dependency Scale for Paediatrics was assessed in terms of internal consistency and interrater reliability. Factor analysis (principal component analysis) was employed to verify the construct validity. A Visual Analogue Scale was used to investigate the criterion-related validity. Good internal consistency was detected both for the Arabic and German versions. Factor analysis revealed one factor for both versions. A Pearson's correlation between the Care Dependency Scale for Paediatrics and Visual Analogue Scale was statistically significant for both versions indicating criterion-related validity. Statistically significant differences between the participants were detected regarding the mean sum score on the Care Dependency Scale for Paediatrics. The Care Dependency Scale for Paediatrics is a reliable and valid tool for assessing the care dependency of children and is recommended for assessing the care dependency of children from different ethnic origins. Differences in care dependency between German and Egyptian children were detected, which might be due to cultural differences.
ERIC Educational Resources Information Center
Geiser, Saul; Santelices, Maria Veronica
2007-01-01
High-school grades are often viewed as an unreliable criterion for college admissions, owing to differences in grading standards across high schools, while standardized tests are seen as methodologically rigorous, providing a more uniform and valid yardstick for assessing student ability and achievement. The present study challenges that…
Sierpińska, Lidia
2013-09-01
The Authentic Leadership Questionnaire (ALQ) is a standardized research instrument for evaluating the individual elements of a leader's conduct that contribute to authentic leadership. Applying this questionnaire in Polish conditions required a validation process. The aim of the study was to evaluate the validity and reliability of the Polish version of the American research instrument for evaluating the authenticity of leadership of nursing management in Polish hospitals. The study covered 286 nurses (143 head nurses and 143 of their subordinates) employed in 45 hospitals in Poland. Theoretical validity of the instrument was evaluated using Fisher's transformation (Pearson's r correlation coefficient), while the criterion validity of the ALQ was evaluated using Spearman's rho correlation coefficient and the BOHIPSZO questionnaire. The reliability of the ALQ was assessed by means of Cronbach's alpha coefficient. The ALQ, applied to the evaluation of authenticity of leadership of nursing management in Polish hospital wards, shows acceptable theoretical and criterion validity and reliability (Cronbach's alpha coefficient 0.80). The Polish version of the ALQ is valid and reliable, and may be applied in studies concerning the evaluation of authenticity of leadership of nursing management in Polish hospital wards.
Nakano, Hideki; Kodama, Takayuki; Ukai, Kazumasa; Kawahara, Satoru; Horikawa, Shiori; Murata, Shin
2018-01-01
In this study, we aimed to (1) translate the English version of the Kinesthetic and Visual Imagery Questionnaire (KVIQ), which assesses motor imagery ability, into Japanese, and (2) investigate the reliability and validity of the Japanese KVIQ. We enrolled 28 healthy adults in this study. We used Cronbach’s alpha coefficients to assess reliability reflected by the internal consistency. Additionally, we assessed validity reflected by the criterion-related validity between the Japanese KVIQ and the Japanese version of the Movement Imagery Questionnaire-Revised (MIQ-R) with Spearman’s rank correlation coefficients. The Cronbach’s alpha coefficients for the KVIQ-20 were 0.88 (Visual) and 0.91 (Kinesthetic), which indicates high reliability. There was a significant positive correlation between the Japanese KVIQ-20 (Total) and the Japanese MIQ-R (Total) (r = 0.86, p < 0.01). Our results suggest that the Japanese KVIQ is an assessment that is a reliable and valid index of motor imagery ability. PMID:29724042
The validation of a home food inventory.
Fulkerson, Jayne A; Nelson, Melissa C; Lytle, Leslie; Moe, Stacey; Heitzler, Carrie; Pasch, Keryn E
2008-11-04
Home food inventories provide an efficient method for assessing home food availability; however, few are validated. The present study's aim was to develop and validate a home food inventory that is easily completed by research participants in their homes and includes a comprehensive range of both healthful and less healthful foods that are associated with obesity. A home food inventory (HFI) was developed and tested with two samples. Sample 1 included 51 adult participants and six trained research staff who independently completed the HFI in participants' homes. Sample 2 included 342 families in which parents completed the HFI and the Diet History Questionnaire (DHQ) and students completed three 24-hour dietary recall interviews. HFI items assessed 13 major food categories as well as two categories assessing ready-access to foods in the kitchen and the refrigerator. An obesogenic household food availability score was also created. To assess criterion validity, participants' and research staff's assessments of home food availability were compared (staff = gold standard). Criterion validity was evaluated with kappa, sensitivity, and specificity. Construct validity was assessed with correlations of five HFI major food category scores with servings of the same foods and associated nutrients from the DHQ and dietary recalls. Kappa statistics for all 13 major food categories and the two ready-access categories ranged from 0.61 to 0.83, indicating substantial agreement. Sensitivity ranged from 0.69 to 0.89, and specificity ranged from 0.86 to 0.95. Spearman correlations between staff and participant major food category scores ranged from 0.71 to 0.97. Correlations between the HFI scores and food group servings and nutrients on the DHQ (parents) were all significant (p < .05) while about half of associations between the HFI and dietary recall interviews (adolescents) were significant (p < .05).
The obesogenic home food availability score was significantly associated (p < .05) with energy intake of both parents and adolescents. This new home food inventory is valid, participant-friendly, and may be useful for community-based behavioral nutrition and obesity prevention research. The inventory builds on previous measures by including a wide range of healthful and less healthful foods rather than foods targeted for a specific intervention.
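The criterion-validity statistics named in this abstract (kappa, sensitivity, specificity) can all be derived from a 2x2 table that cross-classifies the participant's report against the staff gold standard. The following is a minimal sketch of that calculation, not the authors' analysis code; the function name and the cell counts in the usage note are hypothetical.

```python
def agreement_stats(tp, fp, fn, tn):
    """Cohen's kappa, sensitivity and specificity from a 2x2 table.

    Hypothetical cell labels (staff rating is treated as the gold standard):
    tp: food marked present by both participant and staff
    fp: participant marked present, staff marked absent
    fn: participant marked absent, staff marked present
    tn: food marked absent by both
    """
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Agreement expected by chance, from each rater's marginal totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2
    kappa = (observed - expected) / (1 - expected)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return kappa, sensitivity, specificity
```

With illustrative counts such as `agreement_stats(40, 5, 10, 45)`, observed agreement is 0.85 and chance agreement is 0.50, so kappa corrects the raw agreement downward.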
2014-01-01
Background Foot disease complications, such as foot ulcers and infection, contribute to considerable morbidity and mortality. These complications are typically precipitated by “high-risk factors”, such as peripheral neuropathy and peripheral arterial disease. High-risk factors are more prevalent in specific “at risk” populations such as diabetes, kidney disease and cardiovascular disease. To the best of the authors’ knowledge a tool capturing multiple high-risk factors and foot disease complications in multiple at risk populations has yet to be tested. This study aimed to develop and test the validity and reliability of a Queensland High Risk Foot Form (QHRFF) tool. Methods The study was conducted in two phases. Phase one developed a QHRFF using an existing diabetes foot disease tool, literature searches, stakeholder groups and expert panel. Phase two tested the QHRFF for validity and reliability. Four clinicians, representing different levels of expertise, were recruited to test validity and reliability. Three cohorts of patients were recruited; one tested criterion measure reliability (n = 32), another tested criterion validity and inter-rater reliability (n = 43), and another tested intra-rater reliability (n = 19). Validity was determined using sensitivity, specificity and positive predictive values (PPV). Reliability was determined using Kappa, weighted Kappa and intra-class correlation (ICC) statistics. Results A QHRFF tool containing 46 items across seven domains was developed. Criterion measure reliability of at least moderate categories of agreement (Kappa > 0.4; ICC > 0.75) was seen in 91% (29 of 32) tested items. Criterion validity of at least moderate categories (PPV > 0.7) was seen in 83% (60 of 72) tested items. Inter- and intra-rater reliability of at least moderate categories (Kappa > 0.4; ICC > 0.75) was seen in 88% (84 of 96) and 87% (20 of 23) tested items respectively. 
Conclusions The QHRFF had acceptable validity and reliability across the majority of items; particularly items identifying relevant co-morbidities, high-risk factors and foot disease complications. Recommendations have been made to improve or remove identified weaker items for future QHRFF versions. Overall, the QHRFF possesses suitable practicality, validity and reliability to assess and capture relevant foot disease items across multiple at risk populations. PMID:24468080
Competency Verification in the Health Professions Via Limited Focus Measurement
ERIC Educational Resources Information Center
Popham, W. James
1978-01-01
Norm-referenced tests are inappropriate for evaluating health care practitioners; criterion-referenced tests better describe what is being measured. Reliable assessment of competency should focus on the valid testing of reasonable numbers of important skills. Available from: Sage Publications, Inc., 275 South Beverly Drive, Beverly Hills,…
Development and Implementation of a Food Safety Knowledge Instrument
ERIC Educational Resources Information Center
Byrd-Bredbenner, Carol; Wheatley, Virginia; Schaffner, Donald; Bruhn, Christine; Blalock, Lydia; Maurer, Jaclyn
2007-01-01
Little is known about the food safety knowledge of young adults. In addition, few knowledge questionnaires and no comprehensive, criterion-referenced measure that assesses the full range of food safety knowledge could be identified. Without appropriate, valid, and reliable measures and baseline data, it is difficult to develop and implement…
Qualitative Analysis on Stage: Making the Research Process More Public.
ERIC Educational Resources Information Center
Anfara, Vincent A., Jr.; Brown, Kathleen M.
The increased use of qualitative research methods has spurred interest in developing formal standards for assessing its validity. These standards, however, fall short if they do not include public disclosure of methods as a criterion. The researcher must be accountable in documenting the actions associated with establishing internal validity…
Measuring Speech Comprehensibility in Students with Down Syndrome
ERIC Educational Resources Information Center
Yoder, Paul J.; Woynaroski, Tiffany; Camarata, Stephen
2016-01-01
Purpose: There is an ongoing need to develop assessments of spontaneous speech that focus on whether the child's utterances are comprehensible to listeners. This study sought to identify the attributes of a stable ratings-based measure of speech comprehensibility, which enabled examining the criterion-related validity of an orthography-based…
The Counselor Evaluation Rating Scale: A Valid Criterion of Counselor Effectiveness?
ERIC Educational Resources Information Center
Jones, Lawrence K.
1974-01-01
The validity of recent recommendations regarding the use of certain factors of the 16 Personality Factor Questionnaire (16PF) to select persons for counselor training programs, where the CERS was the criterion measure, is challenged. (Author)
Bartels, Meike; Cath, Danielle C.; Boomsma, Dorret I.
2008-01-01
The factor structure of the Dutch translation of the Autism-Spectrum Quotient (AQ; a continuous, quantitative measure of autistic traits) was evaluated with confirmatory factor analyses in a large general population and student sample. The criterion validity of the AQ was examined in three matched patient groups (autism spectrum conditions (ASC), social anxiety disorder, and obsessive–compulsive disorder). A two factor model, consisting of a “Social interaction” factor and “Attention to detail” factor could be identified. The internal consistency and test–retest reliability of the AQ were satisfactory. High total AQ and factor scores were specific to ASC patients. Men scored higher than women and science students higher than non-science students. The Dutch translation of the AQ is a reliable instrument to assess autism spectrum conditions. PMID:18302013
Ruch, Willibald; Heintz, Sonja
2017-01-01
How strongly does humor (i.e., the construct-relevant content) in the Humor Styles Questionnaire (HSQ; Martin et al., 2003) determine the responses to this measure (i.e., construct validity)? Also, how much does humor influence the relationships of the four HSQ scales, namely affiliative, self-enhancing, aggressive, and self-defeating, with personality traits and subjective well-being (i.e., criterion validity)? The present paper answers these two questions by experimentally manipulating the 32 items of the HSQ to only (or mostly) contain humor (i.e., construct-relevant content) or to substitute the humor content with non-humorous alternatives (i.e., only assessing construct-irrelevant context). Study 1 (N = 187) showed that the HSQ affiliative scale was mainly determined by humor, self-enhancing and aggressive were determined by both humor and non-humorous context, and self-defeating was primarily determined by the context. This suggests that humor is not the primary source of the variance in three of the HSQ scales, thereby limiting their construct validity. Study 2 (N = 261) showed that the relationships of the HSQ scales to the Big Five personality traits and subjective well-being (positive affect, negative affect, and life satisfaction) were consistently reduced (personality) or vanished (subjective well-being) when the non-humorous contexts in the HSQ items were controlled for. For the HSQ self-defeating scale, the pattern of relationships to personality was also altered, supporting a positive rather than a negative view of the humor in this humor style. The present findings thus call for a reevaluation of the role that humor plays in the HSQ (construct validity) and in the relationships to personality and well-being (criterion validity). PMID:28473794
Cha, Young Joo; Lee, Jae Jin; Kim, Do Hyun; You, Joshua Sung H
2017-10-23
Core stabilization plays an important role in the regulation of postural stability. To overcome shortcomings associated with pain and severe core instability during conventional core stabilization tests, we recently developed the dynamic neuromuscular stabilization-based heel sliding (DNS-HS) test. The purpose of this study was to establish the criterion validity and test-retest reliability of the novel DNS-HS test. Twenty young adults with core instability completed both the bilateral straight leg lowering test (BSLLT) and DNS-HS test for the criterion validity study and repeated the DNS-HS test for the test-retest reliability study. Criterion validity was determined by comparing hip joint angle data that were obtained from BSLLT and DNS-HS measures. The test-retest reliability was determined by comparing hip joint angle data. Criterion validity was ICC(2,3) = 0.700 (p < 0.05), suggesting a good relationship between the two core stability measures. Test-retest reliability was ICC(3,3) = 0.953 (p < 0.05), indicating excellent consistency between the repeated DNS-HS measurements. Criterion validity data demonstrated a good relationship between the gold standard BSLLT and DNS-HS core stability measures. Test-retest reliability data suggest that the DNS-HS test is a reliable test of core stability. Clinically, the DNS-HS test is useful to objectively quantify core instability and allow early detection and evaluation.
ERIC Educational Resources Information Center
Livingstone, Holly A.; Day, Arla L.
2005-01-01
Despite the popularity of the concept of emotional intelligence (EI), there is much controversy around its definition, measurement, and validity. Therefore, the authors examined the construct and criterion-related validity of an ability-based EI measure (Mayer Salovey Caruso Emotional Intelligence Test [MSCEIT]) and a mixed-model EI measure…
Home Healthcare Nurses' Job Satisfaction Scale: refinement and psychometric testing.
Ellenbecker, Carol H; Byleckie, James J
2005-10-01
This paper describes a study to further develop and test the psychometric properties of the Home Healthcare Nurses' Job Satisfaction Scale, including reliability and construct and criterion validity. Numerous scales have been developed to measure nurses' job satisfaction. Only one, the Home Healthcare Nurses' Job Satisfaction Scale, has been designed specifically to measure job satisfaction of home healthcare nurses. The Home Healthcare Nurses' Job Satisfaction Scale is based on a theoretical model that integrates the findings of empirical research related to job satisfaction. A convenience sample of 340 home healthcare nurses completed the Home Healthcare Nurses' Job Satisfaction Scale and the Mueller and McCloskey Satisfaction Scale, which was used to test criterion validity. Factor analysis was used for testing and refinement of the theory-based assignment of items to constructs. Reliability was assessed by Cronbach's alpha internal consistency reliability coefficients. The data were collected in 2003. Nine factors contributing to home healthcare nurses' job satisfaction emerged from the factor analysis and were strongly supported by the underlying theory. Factor loadings were all above 0.4. Cronbach's alpha coefficients for each of the nine subscales ranged from 0.64 to 0.83; the alpha for the global scale was 0.89. The correlation between the Home Healthcare Nurses' Job Satisfaction Scale and the Mueller and McCloskey Satisfaction Scale was 0.79, indicating good criterion-related validity. The Home Healthcare Nurses' Job Satisfaction Scale has potential as a reliable and valid scale for measurement of job satisfaction of home healthcare nurses.
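Cronbach's alpha, the internal-consistency coefficient reported in this abstract, compares the sum of the item variances with the variance of the total score. The sketch below shows the standard formula only; the function name and data layout are assumptions for illustration, and it is not the software used in the study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a respondents-by-items score matrix.

    scores: list of respondents, each a list of item scores of equal length
    (a hypothetical layout chosen for this sketch).
    """
    k = len(scores[0])                      # number of items
    items = list(zip(*scores))              # transpose to item columns
    item_var = sum(variance(col) for col in items)
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_var / total_var)
```

Alpha approaches 1 when items rise and fall together across respondents, because the total-score variance then far exceeds the summed item variances.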
The reliability and validity of a sexual functioning questionnaire.
Corty, E W; Althof, S E; Kurit, D M
1996-01-01
The present study assessed the reliability and validity of a measure of sexual functioning, the CMSH-SFQ, for male patients and their partners. The CMSH-SFQ measures erectile and orgasmic functioning, sexual drive, frequency of sexual behavior, and sexual satisfaction. Test-retest reliability was assessed with 19 males and 19 females for the baseline CMSH-SFQ. Criterion validity was measured by comparing the answers of 25 male patients to those of their partners at baseline and follow-up. The majority of items had acceptable levels of reliability and validity. The CMSH-SFQ provides a reliable and valid device that can be used to measure global sexual functioning in men and their partners and may be used to evaluate the efficacy of treatments for sexual dysfunctions. Limitations and suggestions for use of the CMSH-SFQ are addressed.
Neijenhuijs, Koen I; Jansen, Femke; Aaronson, Neil K; Brédart, Anne; Groenvold, Mogens; Holzner, Bernhard; Terwee, Caroline B; Cuijpers, Pim; Verdonck-de Leeuw, Irma M
2018-05-07
The EORTC IN-PATSAT32 is a patient-reported outcome measure (PROM) to assess cancer patients' satisfaction with in-patient health care. The aim of this study was to investigate whether the initial good measurement properties of the IN-PATSAT32 are confirmed in new studies. Within the scope of a larger systematic review study (Prospero ID 42017057237), a systematic search was performed of Embase, Medline, PsycINFO, and Web of Science for studies that investigated measurement properties of the IN-PATSAT32 up to July 2017. Study quality was assessed, and data were extracted and synthesized according to the COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) methodology. Nine studies were included in this review. The evidence on reliability and construct validity was rated as sufficient and the quality of the evidence as moderate. The evidence on structural validity was rated as insufficient and of low quality. The evidence on internal consistency was indeterminate. Measurement error, responsiveness, criterion validity, and cross-cultural validity were not reported in the included studies. Measurement error could be calculated for two studies and was judged indeterminate. In summary, the IN-PATSAT32 performs as expected with respect to reliability and construct validity. No firm conclusions can be made yet whether the IN-PATSAT32 also performs as well with respect to structural validity and internal consistency. Further research on these measurement properties of the PROM is therefore needed as well as on measurement error, responsiveness, criterion validity, and cross-cultural validity. For future studies, it is recommended to take the COSMIN methodology into account.
Stockman, Ida J; Newkirk-Turner, Brandi L; Swartzlander, Elaina; Morris, Lekeitha R
2016-02-01
This study is a response to the need for evidence-based measures of spontaneous oral language to assess African American children under the age of 4 years. We determined if pass/fail status on a minimal competence core for morphosyntax (MCC-MS) was more highly related to scores on the Index of Productive Syntax (IPSyn), the measure of convergent criterion validity, than to scores on 3 measures of divergent validity: number of different words (Watkins, Kelly, Harbers, & Hollis, 1995), Percentage of Consonants Correct-Revised (Shriberg, Austin, Lewis, McSweeney, & Wilson, 1997), and the Leiter International Performance Scale-Revised (Roid & Miller, 1997). Archival language samples for 68 African American 3-year-olds were analyzed to determine MCC-MS pass/fail status and the scores on measures of convergent and divergent validity. Higher IPSyn scores were observed for 60 children who passed the MCC-MS than for 8 children who did not. A significant positive correlation, rpb = .73, between MCC-MS pass/fail status and IPSyn scores was observed. This coefficient was higher than MCC-MS correlations with measures of divergent validity: rpb = .13 (Leiter International Performance Scale-Revised), rpb = .42 (number of different words in 100 utterances), and rpb = .46 (Percentage of Consonants Correct-Revised). The MCC-MS has convergent criterion validity with the IPSyn. Although more research is warranted, both measures can be potentially used in oral language assessments of African American 3-year-olds.
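The rpb values in this abstract are point-biserial correlations, which relate a dichotomous variable (here, pass/fail status) to a continuous score. A minimal sketch of the textbook formula follows; the function and argument names are hypothetical, and the abstract does not say which software the authors used.

```python
from math import sqrt
from statistics import mean, pstdev


def point_biserial(passed, scores):
    """Point-biserial correlation between a pass/fail flag and a score.

    passed: list of booleans (or 0/1), one per child
    scores: list of continuous scores in the same order
    """
    group1 = [s for p, s in zip(passed, scores) if p]
    group0 = [s for p, s in zip(passed, scores) if not p]
    p = len(group1) / len(scores)           # proportion who passed
    # Mean difference scaled by the population SD of all scores.
    return (mean(group1) - mean(group0)) / pstdev(scores) * sqrt(p * (1 - p))
```

The result equals the Pearson correlation computed with the pass/fail flag coded 0/1, so it can be interpreted on the same scale.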
A contextual approach to social skills assessment in the peer group: who is the best judge?
Kwon, Kyongboon; Kim, Elizabeth Moorman; Sheridan, Susan M
2012-09-01
Using a contextual approach to social skills assessment in the peer group, this study examined the criterion-related validity of contextually relevant social skills and the incremental validity of peers and teachers as judges of children's social skills. Study participants included 342 (180 male and 162 female) students and their classroom teachers (N = 22) from rural communities. As expected, contextually relevant social skills were significantly related to a variety of social status indicators (i.e., likability, peer- and teacher-assessed popularity, reciprocated friendships, clique centrality) and positive school functioning (i.e., school liking and academic competence). Peer-assessed social skills, not teacher-assessed social skills, demonstrated consistent incremental validity in predicting various indicators of social status outcomes; peer- and teacher-assessed social skills alike showed incremental validity in predicting positive school functioning. The relation between contextually relevant social skills and study outcomes did not vary by child gender. Findings are discussed in terms of the significance of peers in the assessment of children's social skills in the peer group as well as the usefulness of a contextual approach to social skills assessment.
Surrogate screening models for the low physical activity criterion of frailty.
Eckel, Sandrah P; Bandeen-Roche, Karen; Chaves, Paulo H M; Fried, Linda P; Louis, Thomas A
2011-06-01
Low physical activity, one of five criteria in a validated clinical phenotype of frailty, is assessed by a standardized, semiquantitative questionnaire on up to 20 leisure time activities. Because of the time demanded to collect the interview data, it has been challenging to translate to studies other than the Cardiovascular Health Study (CHS), for which it was developed. Considering subsets of activities, we identified and evaluated streamlined surrogate assessment methods and compared them to one implemented in the Women's Health and Aging Study (WHAS). Using data on men and women ages 65 and older from the CHS, we applied logistic regression models to rank activities by "relative influence" in predicting low physical activity. We considered subsets of the most influential activities as inputs to potential surrogate models (logistic regressions). We evaluated predictive accuracy and predictive validity using the area under receiver operating characteristic curves and assessed criterion validity using proportional hazards models relating frailty status (defined using the surrogate) to mortality. Walking for exercise and moderately strenuous household chores were highly influential for both genders. Women required fewer activities than men for accurate classification. The WHAS model (8 CHS activities) was an effective surrogate, but a surrogate using 6 activities (walking, chores, gardening, general exercise, mowing and golfing) was also highly predictive. We recommend a 6 activity questionnaire to assess physical activity for men and women. If efficiency is essential and the study involves only women, fewer activities can be included.
Validation of the Dutch Eating Behaviour Questionnaire (DEBQ) among Maltese women.
Dutton, Elaine; Dovey, Terence M
2016-12-01
The main aim of this study was to assess the dimensional structure of the Maltese version of the Dutch Eating Behaviour Questionnaire (DEBQ) and evaluate the instrument's validity and reliability among Maltese women (N = 586). Exploratory factor analysis reflected the theoretical structure of three factors; emotional, restrained and external eating which was supported by a Confirmatory Factor analysis. Minor issues with specific items in the Emotional and External eating scale were identified and discussed. Criterion-related validity was ascertained through correlations with the EAT-26. The study also assessed the DEBQ's predictive value in differentiating between BMI groups and between dieters and weight maintainers. The results suggest that the Maltese DEBQ is a psychometrically valid and reliable instrument for assessing eating behaviours with women in the Maltese community. The study also highlights the critical role of Emotional and Restrained eating in dieting and overweight Maltese women.
Validity of the modified back-saver sit-and-reach test: a comparison with other protocols.
Hui, S S; Yuen, P Y
2000-09-01
Studies have shown that the classical sit-and-reach (CSR) test, the modified sit-and-reach (MSR), and the newly developed back-saver sit-and-reach (BS) test have poor criterion-related validity in estimating low-back flexibility but yielded moderate criterion-related validity in hamstring flexibility. The V sit-and-reach (VSR) test was found to be practical but the validity has not been established. The purpose of this study was to propose a modified back-saver sit-and-reach (MBS) test, which incorporated all advantages of the various protocols, and to compare the criterion-related validity and reliability of all these tests. 158 college students (F = 96, and M = 62; age = 20.77 +/- 2.51) performed CSR, VSR, BS (left and right leg), and MBS (left and right leg) tests in a randomized order. Scores from each test were then correlated with the criterion measures. For all sit-and-reach tests, intraclass reliability (single trial) was very high (r = 0.89-0.98). MBS yielded significant and highest r with low-back and hamstring criterion for men (r = 0.47-0.67) and women (r = 0.23-0.54). The low-back and right hamstring validity of MBS for men were significantly (P < 0.01) higher than those from BS and CSR, whereas no differences in criterion-related validity were found between the MBS and other protocols in women. The ratings of perceived comfort among the sit-and-reach protocols were significantly different (P < 0.001) from each other. The MBS was rated the most comfortable test compared with the other protocols. The MBS test is not only a reliable test for hamstring and low-back flexibility, it is also a more practical test with improved validity for hamstring and low-back flexibility in men than previous protocols.
López-Villalobos, José A; Andrés-De Llano, Jesús; López-Sánchez, María V; Rodríguez-Molinero, Luis; Garrido-Redondo, Mercedes; Sacristán-Martín, Ana M; Martínez-Rivera, María T; Alberola-López, Susana
2017-02-01
The aim of this research is to analyze the criterion validity of the Attention Deficit Hyperactivity Disorder Rating Scales IV (ADHD RS-IV) and its clinical usefulness for the assessment of Attention Deficit Hyperactivity Disorder (ADHD) as a function of assessment method and age. A sample was obtained from an epidemiological study (n = 1095, 6-16 years). Clinical cases of ADHD (ADHD-CL) were selected by dimensional ADHD RS-IV and later by clinical interview (DSM-IV). ADHD-CL cases were compared with four categorical results of ADHD RS-IV provided by parents (CATPA), teachers (CATPR), either parents or teachers (CATPAOPR) and both parents and teachers (CATPA&PR). Criterion validity and clinical usefulness of the answer modalities to ADHD RS-IV were studied. ADHD-CL rate was 6.9% in childhood, 6.2% in preadolescence and 6.9% in adolescence. Alternative methods to the clinical interview led to increased numbers of ADHD cases in all age groups analyzed, in the following sequence: CATPAOPR > CATPR > CATPA > CATPA&PR > ADHD-CL. CATPA&PR was the procedure with the greatest validity, specificity and clinical usefulness in all three age groups, particularly in childhood. Isolated use of ADHD RS-IV leads to an increase in ADHD cases compared to clinical interview, and varies depending on the procedure used.
Development and Validation of the Five-by-Five Resilience Scale.
DeSimone, Justin A; Harms, P D; Vanhove, Adam J; Herian, Mitchel N
2017-09-01
This article introduces a new measure of resilience and five related protective factors. The Five-by-Five Resilience Scale (5×5RS) is developed on the basis of theoretical and empirical considerations. Two samples (N = 475 and N = 613) are used to assess the factor structure, reliability, convergent validity, and criterion-related validity of the 5×5RS. Confirmatory factor analysis supports a bifactor model. The 5×5RS demonstrates adequate internal consistency as evidenced by Cronbach's alpha and empirical reliability estimates. The 5×5RS correlates positively with the Connor-Davidson Resilience Scale (CD-RISC), a commonly used measure of resilience. The 5×5RS exhibits similar criterion-related validity to the CD-RISC as evidenced by positive correlations with satisfaction with life, meaning in life, and secure attachment style as well as negative correlations with rumination and anxious or avoidant attachment styles. 5×5RS scores are positively correlated with healthy behaviors such as exercise and negatively correlated with sleep difficulty and symptomology of anxiety and depression. The 5×5RS incrementally explains variance in some criteria above and beyond the CD-RISC. Item responses are modeled using the graded response model. Information estimates demonstrate the ability of the 5×5RS to assess individuals within at least one standard deviation of the mean on relevant latent traits.
[Development and Validation of the Academic Resilience Inventory for Nursing Students in Taiwan].
Li, Cheng-Chieh; Wei, Chi-Fang; Tung, Yuk-Ying
2017-10-01
Failure to cope with learning pressures has been shown to influence the learning achievement and professional performance of nursing students. In order to enable nursing students to adapt successfully to their academic stress, it is essential to explore their academic resilience in the process of learning. This study aimed to develop the Academic Resilience Inventory for Nursing Students (ARINS) and to test its reliability and validity. A total of 611 nursing students in central and southern Taiwan were recruited as participants. We divided the sample into two subsamples randomly using R software. The first sample was used to conduct item analysis and exploratory factor analysis. The other sample was used to conduct confirmatory factor analysis (CFA), cross validation, and criterion-related validity analysis. There are 15 items in the ARINS, with cognitive maturity, emotional regulation, and help-seeking behavior used as the measurement indicators of academic resilience in nursing students. The goodness-of-fit indices from the CFA indicate that the model fits the data well and has good convergent validity and discriminant validity. Criterion-related validity was supported by the correlations of the ARINS with learning performance and attitude, hope and optimism, and depression. The ARINS has good reliability and validity and is a suitable measure of academic resilience in nursing students. It is helpful for nursing students to examine their academic stress and coping efficacy in the learning process.
Criterion-Referenced Testing in Foreign Language Teaching.
ERIC Educational Resources Information Center
Takala, Sauli
A review of literature serves as the basis for a discussion of various aspects of criterion-referenced tests. The aspects discussed are: teaching and evaluation objectives, criterion- and norm-referenced measurement, stages in construction of criterion-referenced tests, construction and selection of items, test validity, and test reliability.…
Sheffield, Alexandra; Waller, Glenn; Emanuelli, Francesca; Murray, James
2006-01-01
Recent studies support the reliability and validity of the Young Parenting Inventory-Revised (YPI-R) and its use in investigating the role of parenting in the aetiology and maintenance of eating pathology. However, criterion validity has yet to be fully established. To investigate one aspect of criterion validity, this study examines the association between parenting and comorbid problems in the eating disorders (including general psychopathology and impulsivity). The participants were 124 women with eating disorders. They completed the YPI-R and the Brief Symptom Inventory (BSI; a measure of general psychopathology). They were also interviewed about their use of a number of impulsive behaviours. YPI-R scales were significant predictors of one of the nine BSI scales, and distinguished those patients who did or did not use specific impulsive behaviours. The criterion validity of the YPI-R is partially supported with regards to general psychopathology and impulsivity. The findings highlight the specificity of the parenting styles measured by the YPI-R, and the need for further research using this tool.
Van Iddekinge, Chad H; Roth, Philip L; Putka, Dan J; Lanivich, Stephen E
2011-11-01
A common belief among researchers is that vocational interests have limited value for personnel selection. However, no comprehensive quantitative summaries of interests validity research have been conducted to substantiate claims for or against the use of interests. To help address this gap, we conducted a meta-analysis of relations between interests and employee performance and turnover using data from 74 studies and 141 independent samples. Overall validity estimates (corrected for measurement error in the criterion but not for range restriction) for single interest scales were .14 for job performance, .26 for training performance, -.19 for turnover intentions, and -.15 for actual turnover. Several factors appeared to moderate interest-criterion relations. For example, validity estimates were larger when interests were theoretically relevant to the work performed in the target job. The type of interest scale also moderated validity, such that corrected validities were larger for scales designed to assess interests relevant to a particular job or vocation (e.g., .23 for job performance) than for scales designed to assess a single, job-relevant realistic, investigative, artistic, social, enterprising, or conventional (i.e., RIASEC) interest (.10) or a basic interest (.11). Finally, validity estimates were largest when studies used multiple interests for prediction, either by using a single job or vocation focused scale (which tend to tap multiple interests) or by using a regression-weighted composite of several RIASEC or basic interest scales. Overall, the results suggest that vocational interests may hold more promise for predicting employee performance and turnover than researchers may have thought.
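A validity estimate "corrected for measurement error in the criterion" is conventionally obtained by dividing the observed correlation by the square root of the criterion's reliability. The abstract does not give the exact procedure or reliability values used, so the sketch below only illustrates the classical correction; the function name and the numbers in the usage note are hypothetical.

```python
from math import sqrt


def correct_for_criterion_unreliability(r_observed, criterion_reliability):
    """Classical correction for attenuation due to criterion unreliability.

    r_observed: observed predictor-criterion correlation
    criterion_reliability: reliability of the criterion measure (0 < r_yy <= 1)
    """
    return r_observed / sqrt(criterion_reliability)
```

For example, an observed correlation of 0.12 with an assumed criterion reliability of 0.64 is corrected to 0.12 / 0.8. No correction for range restriction is applied, which matches the description in the abstract.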
ERIC Educational Resources Information Center
Rikli, Roberta E.; Jones, C. Jessie
2013-01-01
Purpose: To develop and validate criterion-referenced fitness standards for older adults that predict the level of capacity needed for maintaining physical independence into later life. The proposed standards were developed for use with a previously validated test battery for older adults--the Senior Fitness Test (Rikli, R. E., & Jones, C. J.…
ERIC Educational Resources Information Center
Daviss, W. Burleson; Birmaher, Boris; Melhem, Nadine A.; Axelson, David A.; Michaels, Shana M.; Brent, David A.
2006-01-01
Background: Previous measures of pediatric depression have shown inconsistent validity in groups with differing demographics, comorbid diagnoses, and clinic or non-clinic origins. The current study re-examines the criterion validity of child- and parent-versions of the Mood and Feelings Questionnaire (MFQ-C, MFQ-P) in a heterogeneous sample of…
Quantifying Human Movement Using the Movn Smartphone App: Validation and Field Study
2017-01-01
Background The use of embedded smartphone sensors offers opportunities to measure physical activity (PA) and human movement. Big data, which includes billions of digital traces, offers scientists a new lens to examine PA in fine-grained detail and allows us to track people's geocoded movement patterns to determine their interaction with the environment. Objective The objective of this study was to examine the validity of the Movn smartphone app (Moving Analytics) for collecting PA and human movement data. Methods The criterion and convergent validity of the Movn smartphone app for estimating energy expenditure (EE) were assessed in both laboratory and free-living settings, compared with indirect calorimetry (criterion reference) and a stand-alone accelerometer that is commonly used in PA research (GT1m, ActiGraph Corp, convergent reference). A supporting cross-validation study assessed the consistency of activity data when collected across different smartphone devices. Global positioning system (GPS) and accelerometer data were integrated with geographical information software to demonstrate the feasibility of geospatial analysis of human movement. Results A total of 21 participants contributed to linear regression analysis to estimate EE from Movn activity counts (standard error of estimation [SEE]=1.94 kcal/min). The equation was cross-validated in an independent sample (N=42, SEE=1.10 kcal/min). During laboratory-based treadmill exercise, EE from Movn was comparable to calorimetry (bias=0.36 [−0.07 to 0.78] kcal/min, t(82)=1.66, P=.10) but overestimated as compared with the ActiGraph accelerometer (bias=0.93 [0.58-1.29] kcal/min, t(89)=5.27, P<.001). The absolute magnitude of criterion biases increased as a function of locomotive speed (F(1,4)=7.54, P<.001) but was relatively consistent for the convergent comparison (F(1,4)=1.26, P<.29).
Furthermore, 95% limits of agreement were consistent for criterion and convergent biases, and EE from Movn was strongly correlated with both reference measures (criterion r=.91, convergent r=.92, both P<.001). Movn overestimated EE during free-living activities (bias=1.00 [0.98-1.02] kcal/min, t(6123)=101.49, P<.001), and biases were larger during high-intensity activities (F(3,6120)=1550.51, P<.001). In addition, 95% limits of agreement for convergent biases were heterogeneous across free-living activity intensity levels, but Movn and ActiGraph measures were strongly correlated (r=.87, P<.001). Integration of GPS and accelerometer data within a geographic information system (GIS) enabled creation of individual temporospatial maps. Conclusions The Movn smartphone app can provide valid passive measurement of EE and can enrich these data with contextualizing temporospatial information. Although enhanced understanding of geographic and temporal variation in human movement patterns could inform intervention development, it also presents challenges for data processing and analytics. PMID:28818819
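The bias and 95% limits of agreement reported in the abstract above are the usual quantities of a Bland-Altman style comparison between a test measure and a reference. The following is a minimal sketch of that calculation, assuming paired measurements and the conventional mean difference ± 1.96 SD definition; the function name `bland_altman` and the sample values are hypothetical and are not taken from the study.

```python
import statistics


def bland_altman(test, reference):
    """Return (bias, lower_loa, upper_loa) for paired measurements.

    bias is the mean of (test - reference); the 95% limits of agreement
    are bias +/- 1.96 * SD of the paired differences.
    """
    if len(test) != len(reference) or len(test) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [t - r for t, r in zip(test, reference)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample SD of the differences
    half_width = 1.96 * sd
    return bias, bias - half_width, bias + half_width


# Hypothetical energy expenditure values in kcal/min (illustration only).
app_ee = [5.0, 6.0, 7.0, 8.0]
criterion_ee = [4.0, 6.0, 6.0, 8.0]
bias, lower, upper = bland_altman(app_ee, criterion_ee)
print(f"bias={bias:.2f} kcal/min, 95% LoA=({lower:.2f}, {upper:.2f})")
```

A positive bias means the test measure reads higher than the reference on average, which is the direction of the overestimation the abstract describes.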
Merk, Josef; Schlotz, Wolff; Falter, Thomas
2017-01-01
This study presents a new measure of value systems, the Motivational Value Systems Questionnaire (MVSQ), which is based on a theory of value systems by psychologist Clare W. Graves. The purpose of the instrument is to help people identify their personal hierarchies of value systems and thus become more aware of what motivates and demotivates them in work-related contexts. The MVSQ is a forced-choice (FC) measure, making it quicker to complete and more difficult to intentionally distort, but also more difficult to assess its psychometric properties due to ipsativity of FC data compared to rating scales. To overcome limitations of ipsative data, a Thurstonian IRT (TIRT) model was fitted to the questionnaire data, based on a broad sample of N = 1,217 professionals and students. Comparison of normative (IRT) scale scores and ipsative scores suggested that MVSQ IRT scores are largely freed from restrictions due to ipsativity and thus allow interindividual comparison of scale scores. Empirical reliability was estimated using a sample-based simulation approach which showed acceptable and good estimates and, on average, slightly higher test-retest reliabilities. Further, validation studies provided evidence on both construct validity and criterion-related validity. Scale score correlations and associations of scores with both age and gender were largely in line with theoretically- and empirically-based expectations, and results of a multitrait-multimethod analysis support convergent and discriminant construct validity. Criterion validity was assessed by examining the relation of value system preferences to departmental affiliation which revealed significant relations in line with prior hypothesizing. These findings demonstrate the good psychometric properties of the MVSQ and support its application in the assessment of value systems in work-related contexts. PMID:28979228
Burnout and hopelessness among farmers: The Farmers Stressors Inventory.
Truchot, Didier; Andela, Marie
2018-05-03
Farming is a stressful occupation with a high rate of suicide. However, there have been relatively few studies that have examined the antecedents of stress and suicide in farmers. We also lack methodologically sound scales aimed at assessing the stressors faced by farmers. Therefore, the purposes of this study were to develop an instrument assessing the stressors met by farmers, the Farmers Stressors Inventory, and to test its factorial structure, internal consistency and criterion validity. First, based on the existing literature and interviews with farmers, we designed a scale containing 37 items. Then a sample of 2142 French farmers completed a questionnaire containing the 37 items along with two measures: the MBI-GS, which assesses burnout, and the BHS, which assesses hopelessness. The statistical analyses (EFA and CFA) revealed eight factors corresponding to different aspects of farmers' job stressors: workload and lack of time, uncertainty about the future and the financial market, agricultural legislation pressure, social and geographical isolation, financial worry, conflicts with associates or family members, family succession of the farm, and unpredictable interference with farm work. The internal consistency of the eight subscales was satisfactory. Correlations between these eight dimensions and burnout on the one hand and hopelessness on the other support the criterion-related validity of the scale.
Neuropathic pain screening questionnaires have limited measurement properties. A systematic review.
Mathieson, Stephanie; Maher, Christopher G; Terwee, Caroline B; Folly de Campos, Tarcisio; Lin, Chung-Wei Christine
2015-08-01
The Douleur Neuropathique 4 (DN4), ID Pain, Leeds Assessment of Neuropathic Symptoms and Signs (LANSS), PainDETECT, and Neuropathic Pain Questionnaire have been recommended as screening questionnaires for neuropathic pain. This systematic review aimed to evaluate the measurement properties (eg, criterion validity and reliability) of these questionnaires. Online database searches were conducted and two independent reviewers screened studies and extracted data. Methodological quality of included studies and the measurement properties were assessed against established criteria. A modified Grading of Recommendations Assessment, Development and Evaluation approach was used to summarize the level of evidence. Thirty-seven studies were included. Most studies recruited participants from pain clinics. The original version of the DN4 (French) and Neuropathic Pain Questionnaire (English) had the greatest number of satisfactory measurement properties. The ID Pain (English) demonstrated satisfactory hypothesis testing and reliability, but all other properties tested were unsatisfactory. The LANSS (English) was unsatisfactory for all properties, except specificity. The PainDETECT (English) demonstrated satisfactory hypothesis testing and criterion validity. In general, the cross-cultural adaptations had less evidence than the original versions. Overall, the DN4 and Neuropathic Pain Questionnaire were most suitable for clinical use. These screening questionnaires should not replace a thorough clinical assessment. Crown Copyright © 2015. Published by Elsevier Inc. All rights reserved.
ERIC Educational Resources Information Center
Nunn, Gary D.; And Others
1986-01-01
Investigated the relationships between student locus of control and academic achievement in grades five through eight. The Nowicki-Strickland Locus of Control Scale (NSLOCS) was used to measure motivation, and the Iowa Tests of Basic Skills (ITBS) to assess academic achievement. Results indicated moderate inverse relationships between level of…
Use of the Acceptance Scale To Measure Attitudes of Kindergarten-Age Children.
ERIC Educational Resources Information Center
Favazza, Paddy C.; Odom, Samuel L.
1996-01-01
A study of 188 kindergarten children evaluated the effectiveness of the Acceptance Scale for Kindergartners, which was designed to evaluate the attitudes of kindergarten children toward children with disabilities. Results found the scale provided evidence of criterion-related validity and that children who'd had contact with children with…
Measuring Implicit European and Mediterranean Landscape Identity: A Tool Proposal
Fornara, Ferdinando; Dentale, Francesco; Troffa, Renato; Piras, Simona
2016-01-01
This study presents a tool – the Landscape Identity Implicit Association Test (LI-IAT) – devoted to measure the implicit identification with European and Mediterranean landscapes. To this aim, a series of prototypical landscapes was selected as stimulus, following an accurate multi-step procedure. Participants (N = 174), recruited in two Italian cities, performed two LI-IATs devoted to assess their identification with European vs. Not-European and Mediterranean vs. Not-Mediterranean prototypical landscapes. Psychometric properties and criterion validity of these measures were investigated. Two self-report measures, assessing, respectively, European and Mediterranean place identity and pleasantness of the target landscapes, were also administered. Results showed: (1) an adequate level of internal consistency for both LI-IATs; (2) a higher identification with European and Mediterranean landscapes than, respectively, with Not-European and Not-Mediterranean ones; and (3) a significant positive relationship between the European and Mediterranean LI-IATs and the corresponding place identity scores, also when pleasantness of landscapes was controlled for. Overall, these findings provide a first evidence supporting the reliability and criterion validity of the European and Mediterranean LI-IATs. PMID:27642284
Johnson, Marquell; Turek, Jillian; Dornfeld, Chelsea; Drews, Jennifer; Hansen, Nicole
2016-01-01
Background The emergence of mHealth and the utilization of smartphones in physical activity interventions warrant a closer examination of validity evidence for such technology. This study examined the validity of the Samsung S Health application in measuring steps and energy expenditure. Methods Twenty-nine participants (mean age 21.69 ± 1.63) participated in the study. Participants carried a Samsung smartphone in their non-dominant hand and right pocket while walking around a 200-meter track and running on a treadmill at 2.24 m∙s−1. Steps and energy expenditure from the S Health app were compared with StepWatch 3 Step Activity Monitor steps and indirect calorimetry. Results There were no significant differences between S Health estimated steps and energy expenditure during walking and their respective criterion measures, regardless of placement. There was also no significant difference between S Health estimated steps and the criterion measure during treadmill running, regardless of placement. There were significant differences between S Health estimated energy expenditure and the criterion during treadmill running for both placements (both p < 0.001). Conclusions The S Health application measures steps and energy expenditure accurately during self-selected pace walking regardless of placement. Placement of the phone impacts the S Health application accuracy in measuring physical activity variables during treadmill running. PMID:29942556
Rosa-Rizzotto, M; Visonà Dalla Pozza, L; Corlatti, A; Luparia, A; Marchi, A; Molteni, F; Facchin, P; Pagliano, E; Fedrizzi, E
2014-10-01
In hemiplegic children, the recognition of the activity limitation pattern and the possibility of grading its severity are relevant for clinicians when planning interventions, monitoring results, and predicting outcomes. The aim of the study was to examine the reliability and validity of the Besta Scale, an instrument used to measure, in hemiplegic children from 18 months to 12 years of age, both grasp on request (capacity) and spontaneous use of the upper limb (performance) in bimanual play activities and in ADL. Psychometric analysis of the reliability and validity of the Besta scale was performed. Outpatient study sample. Reliability study: A sample of 39 patients was enrolled. The administration of the Besta scale was video-recorded in a standardized manner. All videos were scored by 20 independent raters on subsequent viewing. Three raters randomly selected from the 20-rater group rescored the same video two years later for intra-rater reliability. Intra- and inter-rater reliability were calculated using the Intraclass Correlation Coefficient (ICC) and Kendall's coefficient (K), respectively. Internal consistency reliability was assessed using Cronbach's alpha coefficient. Validity study: a sample of 105 children was assessed 5 times (at t0 and 2, 3, 6 and 12 months later) by 20 independent raters. Each patient underwent QUEST and Besta scale administration and assessment at the same time. Criterion validity was calculated using Pearson's correlation coefficient. Reliability study: The inter-rater reliability calculated with Kendall's coefficient was moderate (K=0.47). The intra-rater (or test-retest) reliability for 3 raters was excellent (ICC=0.927). Cronbach's alpha for internal consistency was 0.972. Validity study: The Besta scale showed good criterion validity compared to QUEST, increasing with age and severity of impairment. Pearson's correlation coefficient r was 0.81 (P<0.0001). Limitations: In infants, the Besta scale has difficulty distinguishing between mild and moderately impaired hand function. The Besta scale scoring system is a valid and reliable tool, usable in a clinical setting to monitor the evolution of unimanual and bimanual manipulation and to distinguish the hand's capacity from performance.
Westlake, Bryce; Bouchard, Martin; Frank, Richard
2017-10-01
The distribution of child sexual exploitation (CE) material has been aided by the growth of the Internet. The graphic nature and prevalence of the material have made researching and combating it difficult. Although used to study online CE distribution, automated data collection tools (e.g., webcrawlers) have yet to be shown effective at targeting only relevant data. Using CE-related image and keyword criteria, we compare networks starting from CE websites to those from similar non-CE sexuality websites and dissimilar sports websites. Our results provide evidence that (a) webcrawlers have the potential to provide valid CE data, if the appropriate criterion is selected; (b) CE distribution is still heavily image-based suggesting images as an effective criterion; (c) CE-seeded networks are more hub-based and differ from non-CE-seeded networks on several website characteristics. Recommendations for improvements to reliable criteria selection are discussed.
Automated Assessment of Child Vocalization Development Using LENA.
Richards, Jeffrey A; Xu, Dongxin; Gilkerson, Jill; Yapanel, Umit; Gray, Sharmistha; Paul, Terrance
2017-07-12
To produce a novel, efficient measure of children's expressive vocal development on the basis of automatic vocalization assessment (AVA), child vocalizations were automatically identified and extracted from audio recordings using Language Environment Analysis (LENA) System technology. Assessment was based on full-day audio recordings collected in a child's unrestricted, natural language environment. AVA estimates were derived using automatic speech recognition modeling techniques to categorize and quantify the sounds in child vocalizations (e.g., protophones and phonemes). These were expressed as phone and biphone frequencies, reduced to principal components, and inputted to age-based multiple linear regression models to predict independently collected criterion-expressive language scores. From these models, we generated vocal development AVA estimates as age-standardized scores and development age estimates. AVA estimates demonstrated strong statistical reliability and validity when compared with standard criterion expressive language assessments. Automated analysis of child vocalizations extracted from full-day recordings in natural settings offers a novel and efficient means to assess children's expressive vocal development. More research remains to identify specific mechanisms of operation.
Harris, Joshua D; Erickson, Brandon J; Cvetanovich, Gregory L; Abrams, Geoffrey D; McCormick, Frank M; Gupta, Anil K; Verma, Nikhil N; Bach, Bernard R; Cole, Brian J
2014-02-01
Condition-specific questionnaires are important components in evaluation of outcomes of surgical interventions. No condition-specific study methodological quality questionnaire exists for evaluation of outcomes of articular cartilage surgery in the knee. To develop a reliable and valid knee articular cartilage-specific study methodological quality questionnaire. Cross-sectional study. A stepwise, a priori-designed framework was created for development of a novel questionnaire. Relevant items to the topic were identified and extracted from a recent systematic review of 194 investigations of knee articular cartilage surgery. In addition, relevant items from existing generic study methodological quality questionnaires were identified. Items for a preliminary questionnaire were generated. Redundant and irrelevant items were eliminated, and acceptable items modified. The instrument was pretested and items weighed. The instrument, the MARK score (Methodological quality of ARticular cartilage studies of the Knee), was tested for validity (criterion validity) and reliability (inter- and intraobserver). A 19-item, 3-domain MARK score was developed. The 100-point scale score demonstrated face validity (focus group of 8 orthopaedic surgeons) and criterion validity (strong correlation to Cochrane Quality Assessment score and Modified Coleman Methodology Score). Interobserver reliability for the overall score was good (intraclass correlation coefficient [ICC], 0.842), and for all individual items of the MARK score, acceptable to perfect (ICC, 0.70-1.000). Intraobserver reliability ICC assessed over a 3-week interval was strong for 2 reviewers (≥0.90). The MARK score is a valid and reliable knee articular cartilage condition-specific study methodological quality instrument. This condition-specific questionnaire may be used to evaluate the quality of studies reporting outcomes of articular cartilage surgery in the knee.
Psychometric validation of a condom self-efficacy scale in Korean.
Cha, EunSeok; Kim, Kevin H; Burke, Lora E
2008-01-01
When an instrument is translated for use in cross-cultural research, it needs to account for cultural factors without distorting the psychometric properties of the instrument. To validate the psychometric properties of the condom self-efficacy scale (CSE) originally developed for American adolescents and young adults after translating the scale to Korean (CSE-K) to determine its suitability for cross-cultural research among Korean college students. A cross-sectional, correlational design was used with an exploratory survey methodology through self-report questionnaires. A convenience sample of 351 students, aged 18 to 25 years, was recruited at a university in Seoul, Korea. The participants completed the CSE-K and the intention of condom use scales after they were translated from English to Korean using a combined translation technique. A demographic and sex history questionnaire, which included an item to assess actual condom usage, was also administered. Mean, variance, reliability, criterion validity, and factorial validity using confirmatory factor analysis were assessed in the CSE-K. Norms for the CSE-K were similar, but not identical, to norms for the English version. The means of all three subscales were lower for the CSE-K than for the original CSE; however, the obtained variance in CSE-K was roughly similar to that of the original CSE. The Cronbach's alpha coefficient for the total scale was higher for the CSE-K (.91) than that for either the CSE (.85) or CSE in Thai (.85). Criterion validity and construct validity of the CSE-K were confirmed. The CSE-K was a reliable and valid scale in measuring condom self-efficacy among Korean college students. The findings suggest that the CSE was an appropriate instrument to conduct cross-cultural research on sexual behavior in adolescents and young adults.
Sun, Fan-Ko; Chiang, Chun-Ying; Lu, Chu-Yun; Yu, Pei-Jane; Liao, Tzu-Chiao; Lan, Chu-Mei
2018-03-01
To develop the Health of Body, Mind and Spirit Scale (HBMSS), which was designed to assess drug abusers' health condition. Helping drug abusers to become healthy is important to healthcare professionals. However, no instrument exists to assess drug abusers' state of health. A cross-sectional questionnaire survey was implemented to examine the validity of the HBMSS. Data were collected from 2015-2016 at one drug abuse prevention centre in Taiwan. Participants (N = 320) who had abused drugs were invited to complete a preliminary 64-item version of the HBMSS. An item analysis, criterion-related validity analysis (using the Relapse Prediction Scale [RPS] score), split-half reliability testing and confirmatory factor analysis (CFA) were conducted to examine the psychometric properties of the HBMSS. The final version of the HBMSS contained 15 items that were divided into three subscales: the health of the body, mind and spirit. Cronbach's α and split-half reliability coefficients were all above .85. The factor loading of each item was between .74-.95. The HBMSS had satisfactory criterion-related validity with the RPS score (r = -.50, p < .001). A second-order CFA was conducted on the HBMSS. The fit indexes were good, χ² = 184.060, df = 94, χ²/df = 1.958 (p = .000). The entire HBMSS and the subscales had satisfactory reliability and validity. Healthcare professionals could use the HBMSS to evaluate the condition of the health of individuals with a drug abuse history. © 2017 John Wiley & Sons Ltd.
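Several abstracts in this listing report Cronbach's α as the internal consistency estimate. As a minimal sketch of how that coefficient is computed from item-level responses, assuming the standard formula α = k/(k−1) · (1 − Σ item variances / variance of total scores), the snippet below uses a hypothetical `cronbach_alpha` helper and made-up responses. It also recomputes the χ²/df ratio from the two fit values quoted in the abstract above.

```python
import statistics


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    # Sample variance of each item across respondents.
    item_vars = [statistics.variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_var = statistics.variance([sum(row) for row in rows])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical responses: 4 respondents x 3 items (illustration only).
responses = [[4, 5, 4], [2, 3, 3], [5, 5, 4], [1, 2, 2]]
print(f"alpha = {cronbach_alpha(responses):.3f}")

# Ratio recomputed from the chi-square and df values quoted in the abstract.
chi2_per_df = 184.060 / 94
print(f"chi2/df = {chi2_per_df:.3f}")
```

Sample variances are used throughout; because α depends only on a ratio of variances, using population variances instead gives the same value.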
The Missing Middle in Validation Research
ERIC Educational Resources Information Center
Taylor, Erwin K.; Griess, Thomas
1976-01-01
In most selection validation research, only the upper and lower tails of the criterion distribution are used, often yielding misleading or incorrect results. Provides formulas and tables which enable the researcher to account more accurately for the distribution of criterion within the middle range of population. (Author/RW)
Mayorga-Vega, Daniel; Merino-Marban, Rafael; Viciana, Jesús
2014-01-01
The main purpose of the present meta-analysis was to examine the scientific literature on the criterion-related validity of sit-and-reach tests for estimating hamstring and lumbar extensibility. For this purpose relevant studies were searched from seven electronic databases dated up through December 2012. Primary outcomes of criterion-related validity were Pearson's zero-order correlation coefficients (r) between sit-and-reach tests and hamstrings and/or lumbar extensibility criterion measures. Then, from the included studies, the Hunter-Schmidt's psychometric meta-analysis approach was conducted to estimate population criterion-related validity of sit-and-reach tests. Firstly, the corrected correlation mean (rp), unaffected by statistical artefacts (i.e., sampling error and measurement error), was calculated separately for each sit-and-reach test. Subsequently, the three potential moderator variables (sex of participants, age of participants, and level of hamstring extensibility) were examined by a partially hierarchical analysis. Of the 34 studies included in the present meta-analysis, 99 correlation values across eight sit-and-reach tests and 51 across seven sit-and-reach tests were retrieved for hamstring and lumbar extensibility, respectively. The overall results showed that all sit-and-reach tests had a moderate mean criterion-related validity for estimating hamstring extensibility (rp = 0.46-0.67), but they had a low mean for estimating lumbar extensibility (rp = 0.16-0.35). Generally, females, adults and participants with high levels of hamstring extensibility tended to have greater mean values of criterion-related validity for estimating hamstring extensibility. When the use of angular tests is limited such as in a school setting or in large scale studies, scientists and practitioners could use the sit-and-reach tests as a useful alternative for hamstring extensibility estimation, but not for estimating lumbar extensibility.
Key Points: Overall, sit-and-reach tests have a moderate mean criterion-related validity for estimating hamstring extensibility, but they have a low mean validity for estimating lumbar extensibility. Among all the sit-and-reach test protocols, the Classic sit-and-reach test seems to be the best option to estimate hamstring extensibility. End scores (e.g., the Classic sit-and-reach test) are a better indicator of hamstring extensibility than the modifications that incorporate fingers-to-box distance (e.g., the Modified sit-and-reach test). When angular tests such as straight leg raise or knee extension tests cannot be used, sit-and-reach tests seem to be a useful field test alternative to estimate hamstring extensibility, but not to estimate lumbar extensibility. PMID:24570599
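The Hunter-Schmidt approach named in the abstract above pools correlations by sample size and then removes the effects of sampling error and measurement error. The following is a rough sketch of the simplest ("bare-bones" plus mean-reliability) form of that idea, not the authors' exact procedure, which the abstract does not spell out. The function name, the reliability arguments `rxx` and `ryy`, and all input values are hypothetical.

```python
import math


def bare_bones_meta(rs, ns, rxx=1.0, ryy=1.0):
    """Sample-size-weighted meta-analysis of correlations.

    rs, ns : per-study correlations and sample sizes
    rxx, ryy : assumed mean reliabilities of predictor and criterion,
               used to correct the mean correlation for attenuation
    """
    if len(rs) != len(ns) or not rs:
        raise ValueError("rs and ns must be non-empty and of equal length")
    total_n = sum(ns)
    mean_r = sum(n * r for r, n in zip(rs, ns)) / total_n
    # Observed (weighted) variance of correlations across studies.
    var_obs = sum(n * (r - mean_r) ** 2 for r, n in zip(rs, ns)) / total_n
    # Variance expected from sampling error alone.
    mean_n = total_n / len(ns)
    var_err = (1 - mean_r ** 2) ** 2 / (mean_n - 1)
    # Residual variance after removing sampling error (floored at zero).
    var_res = max(0.0, var_obs - var_err)
    # Mean correlation corrected for measurement error.
    rho = mean_r / math.sqrt(rxx * ryy)
    return {"mean_r": mean_r, "var_res": var_res, "rho": rho}


# Hypothetical inputs: two studies, predictor reliability assumed to be .81.
result = bare_bones_meta(rs=[0.5, 0.7], ns=[100, 300], rxx=0.81)
print(result)
```

Residual variance well above zero is what motivates a moderator analysis such as the one by sex, age, and extensibility level described in the abstract.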
Guo, Xinying; Wu, Xinjuan; Guo, Aimin; Zhao, Yanwei
2018-01-01
Condyloma acuminata (CA) is a sexually transmitted disease that affects quality of life (QOL). CECA10 is an English-language questionnaire for assessing QOL in patients with CA, but there is no equivalent in China. This study aimed to develop a validated and reliable Chinese version of CECA10. The Chinese CECA10 was developed from the English version by forward translation, back translation, comparison with the original, cultural adjustments, and a pre-test (5 patients). The Chinese CECA10 and EuroQol Five Dimensions Three Level Questionnaire (EQ-5D-3L) were administered to patients with CA. Content validity (item/scale content validity indexes, I-CVI/S-CVI), test–retest reliability (intraclass coefficient, ICC), internal consistency (Cronbach α), criterion validity (comparison with the Dermatology Life Quality Index, DLQI, using Spearman correlation analysis), construct validity (exploratory factor analysis), and discriminant validity (between subgroups based on number of warts, number of recurrences, or number of sites involved) were assessed. The Chinese CECA10 had good test–retest reliability (ICC = 0.98, P < .001), internal consistency (Cronbach α values of 0.88, 0.84, and 0.83 for the total questionnaire, psychological dimension, and sexual dimension, respectively), content validity (I-CVI = 1 for all items), and criterion validity (r = -0.50, P < .001). Exploratory factor analysis extracted 2 factors with a cumulative contribution of 61.75%; the factor loading of each item was >0.4. Discriminant validity was not high. The mean CECA10 and EQ-VAS scores of 211 patients with CA (28.19 ± 7.16 years; 139 males) were 34.56 ± 19.01 and 64.64 ± 19.28, respectively. The Chinese CECA10 has good reliability and validity for evaluating the QOL of Chinese patients with CA. PMID:29489693
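The item and scale content validity indexes (I-CVI, S-CVI) mentioned in the abstract above are simple proportions of expert agreement. The sketch below assumes one common convention, which the abstract does not confirm: experts rate each item on a 4-point relevance scale, ratings of 3 or 4 count as "relevant", and the scale-level index is the average of the item-level indexes (often written S-CVI/Ave). The function names and ratings are hypothetical.

```python
def item_cvi(ratings, relevant_from=3):
    """I-CVI: share of experts rating the item at or above `relevant_from`."""
    if not ratings:
        raise ValueError("at least one expert rating is required")
    return sum(1 for r in ratings if r >= relevant_from) / len(ratings)


def scale_cvi_ave(items):
    """S-CVI/Ave: mean of the item-level CVIs across all items."""
    cvis = [item_cvi(ratings) for ratings in items]
    return sum(cvis) / len(cvis)


# Hypothetical ratings: 2 items, each rated by 3 experts on a 1-4 scale.
expert_ratings = [[4, 3, 4], [2, 4, 4]]
print([round(item_cvi(r), 2) for r in expert_ratings])
print(round(scale_cvi_ave(expert_ratings), 2))
```

Under this convention, an I-CVI of 1 for every item, as reported in the abstract, means every expert rated every item as relevant.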
Abbas, Ismail; Rovira, Joan; Casanovas, Josep
2006-12-01
To develop and validate a model of a clinical trial that evaluates the changes in cholesterol level as a surrogate marker for lipodystrophy in HIV subjects under alternative antiretroviral regimes, i.e., treatment with Protease Inhibitors vs. a combination of nevirapine and other antiretroviral drugs. Five simulation models were developed based on different assumptions, on treatment variability and pattern of cholesterol reduction over time. The last recorded cholesterol level, the difference from the baseline, the average difference from the baseline and level evolution, are the considered endpoints. Specific validation criteria based on a 10% minus or plus standardized distance in means and variances were used to compare the real and the simulated data. The validity criterion was met by all models for considered endpoints. However, only two models met the validity criterion when all endpoints were considered. The model based on the assumption that within-subjects variability of cholesterol levels changes over time is the one that minimizes the validity criterion, standardized distance equal to or less than 1% minus or plus. Simulation is a useful technique for calibration, estimation, and evaluation of models, which allows us to relax the often overly restrictive assumptions regarding parameters required by analytical approaches. The validity criterion can also be used to select the preferred model for design optimization, until additional data are obtained allowing an external validation of the model.
Sepehry, Amir A; Lee, Philip E; Hsiung, Ging-Yuek R; Beattie, B Lynn; Feldman, Howard H; Jacova, Claudia
2017-01-01
Presented herein is evidence for criterion, content, and convergent/discriminant validity of the NIMH-Provisional Diagnostic Criteria for depression of Alzheimer's Disease (PDC-dAD) that were formulated to address depression in Alzheimer's disease (AD). Using meta-analytic and systematic review methods, we examined criterion validity evidence in epidemiological and clinical studies comparing the PDC-dAD to Diagnostic and Statistical Manual of Mental Disorders fourth edition (DSM-IV), and International Classification of Disease (ICD 9) depression diagnostic criteria. We estimated prevalence of depression by PDC, DSM, and ICD with an omnibus event rate effect-size. We also examined diagnostic agreement between PDC and DSM. To gauge content validity, we reviewed rates of symptom endorsement for each diagnostic approach. Finally, we examined the PDC's relationship with assessment scales (global cognition, neuropsychiatric, and depression definition) for convergent validity evidence. The aggregate evidence supports the validity of the PDC-dAD. Our findings suggest that depression in AD differs from other depressive disorders including Major Depressive Disorder (MDD) in that dAD is more prevalent, with generally a milder presentation and with unique features not captured by the DSM. Although the PDC are the current standard for diagnosis of depression in AD, we identified the need for their further optimization based on predictive validity evidence.
Brazilian validation of the Alberta Infant Motor Scale.
Valentini, Nadia Cristina; Saccani, Raquel
2012-03-01
The Alberta Infant Motor Scale (AIMS) is a well-known motor assessment tool used to identify potential delays in infants' motor development. Although Brazilian researchers and practitioners have used the AIMS in laboratories and clinical settings, its translation to Portuguese and validation for the Brazilian population is yet to be investigated. This study aimed to translate and validate all AIMS items with respect to internal consistency and content, criterion, and construct validity. A cross-sectional and longitudinal design was used. A cross-cultural translation was used to generate a Brazilian-Portuguese version of the AIMS. In addition, a validation process was conducted involving 22 professionals and 766 Brazilian infants (aged 0-18 months). The results demonstrated language clarity and internal consistency for the motor criteria (motor development score, α=.90; prone, α=.85; supine, α=.92; sitting, α=.84; and standing, α=.86). The analysis also revealed high discriminative power to identify typical and atypical development (motor development score, P<.001; percentile, P=.04; classification criterion, χ(2)=6.03; P=.05). Temporal stability (P=.07) (rho=.85, P<.001) was observed, and predictive power (P<.001) was limited to the group of infants aged from 3 months to 9 months. Limited predictive validity was observed, which may have been due to the restricted time that the groups were followed longitudinally. In sum, the translated version of AIMS presented adequate validity and reliability.
Swanson, Brian T.; Riley, Sean P.; Cote, Mark P.; Leger, Robin R.; Moss, Isaac L.; Carlos, John
2016-01-01
Background To date, no research has examined the reliability or predictive validity of manual unloading tests of the lumbar spine to identify potential responders to lumbar mechanical traction. Purpose To determine: (1) the intra and inter-rater reliability of a manual unloading test of the lumbar spine and (2) the criterion referenced predictive validity for the manual unloading test. Methods Ten volunteers with low back pain (LBP) underwent a manual unloading test to establish reliability. In a separate procedure, 30 consecutive patients with LBP (age 50·86±11·51) were assessed for pain in their most provocative standing position (visual analog scale (VAS) 49·53±25·52 mm). Patients were assessed with a manual unloading test in their most provocative position followed by a single application of intermittent mechanical traction. Post traction, pain in the provocative position was reassessed and utilized as the outcome criterion. Results The test of unloading demonstrated substantial intra and inter-rater reliability K = 1·00, P = 0·002, K = 0·737, P = 0·001, respectively. There were statistically significant within group differences for pain response following traction for patients with a positive manual unloading test (P<0·001), while patients with a negative manual unloading test did not demonstrate a statistically significant change (P>0·05). There were significant between group differences for proportion of responders to traction based on manual unloading response (P = 0·031), and manual unloading response demonstrated a moderate to strong relationship with traction response Phi = 0·443, P = 0·015. Discussion and conclusion The manual unloading test appears to be a reliable test and has a moderate to strong correlation with pain relief that exceeds minimal clinically important difference (MCID) following traction supporting the validity of this test. PMID:27559274
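The inter-rater reliability statistic reported above can be illustrated with a minimal sketch of Cohen's kappa for two raters, assuming the standard definition (observed agreement corrected for chance agreement). The function name and the ratings below are hypothetical and are not data from the study.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters' categorical ratings of the same subjects."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Proportion of subjects on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


# Hypothetical positive (1) / negative (0) test results from two raters.
first_rater = [1, 1, 0, 0]
second_rater = [1, 0, 0, 0]
print(cohen_kappa(first_rater, second_rater))  # prints 0.5
```

In this toy case the raters agree on 3 of 4 subjects (0.75) while chance agreement is 0.5, so kappa is 0.5.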
Müller, Alessandra Bombarda; Valentini, Nadia Cristina; Bandeira, Paulo Felipe Ribeiro
2017-05-01
The range of stimuli provided by physical space, toys and care practices contributes to the motor, cognitive and social development of children. However, assessing the quality of child education environments is a challenge, and can be considered a health promotion initiative. This study investigated the criterion, content and construct validity and the reliability of the Affordances in the Home Environment for Motor Development - Infant Scale (AHEMD-IS), version 3-18 months, for use in daycare settings. Content validation was conducted with the participation of seven motor development and health care experts, and face validity was assessed by 20 specialists in health and education. The results indicate the suitability of the adapted AHEMD-IS, evidencing its validity for the daycare setting as a potential tool to assess the opportunities that the collective context offers to child development. Copyright © 2017 Elsevier Inc. All rights reserved.
Numerical and Experimental Validation of a New Damage Initiation Criterion
NASA Astrophysics Data System (ADS)
Sadhinoch, M.; Atzema, E. H.; Perdahcioglu, E. S.; van den Boogaard, A. H.
2017-09-01
Most commercial finite element software packages, like Abaqus, have a built-in coupled damage model where a damage evolution needs to be defined in terms of a single fracture energy value for all stress states. The Johnson-Cook criterion has been modified to be Lode parameter dependent and this Modified Johnson-Cook (MJC) criterion is used as a Damage Initiation Surface (DIS) in combination with the built-in Abaqus ductile damage model. An exponential damage evolution law has been used with a single fracture energy value. Ultimately, the simulated force-displacement curves are compared with experiments to validate the MJC criterion. 7 out of 9 fracture experiments were predicted accurately. The limitations and accuracy of the failure predictions of the newly developed damage initiation criterion will be discussed shortly.
Criterion Related Validity of Karate Specific Aerobic Test (KSAT).
Chaabene, Helmi; Hachana, Younes; Franchini, Emerson; Tabben, Montassar; Mkaouer, Bessem; Negra, Yassine; Hammami, Mehrez; Chamari, Karim
2015-09-01
Karate is one of the most popular combat sports in the world. Regular physical fitness assessment is important for monitoring the effectiveness of the training program and the readiness of karatekas to compete. The aim of this research was to examine the criterion-related validity of the karate specific aerobic test (KSAT) as an indicator of the aerobic level of karate practitioners. Cardiorespiratory responses, aerobic performance level through both treadmill laboratory test and YoYo intermittent recovery test level 1 (YoYoIRTL1) as well as time to exhaustion in the KSAT test (TE'KSAT) were determined in a total of fifteen healthy international karatekas (i.e. karate practitioners) (means ± SD: age: 22.2 ± 4.3 years; height: 176.4 ± 7.5 cm; body mass: 70.3 ± 9.7 kg and body fat: 13.2 ± 6%). Peak heart rate obtained from KSAT represented ~99% of maximal heart rate registered during the treadmill test showing that KSAT imposes high physiological demands. There was no significant correlation between KSAT's TE and relative (mL/min kg) treadmill maximal oxygen uptake (r = 0.14; P = 0.69; [small]). On the other hand, there was a significant relationship between KSAT's TE and the velocity associated with VO2max (vVO2max) (r = 0.67; P = 0.03; [large]) as well as the velocity at VO2 corresponding to the second ventilatory threshold (vVO2 VAT) (r = 0.64; P = 0.04; [large]). Moreover, a significant relationship was found between TE's KSAT and both the total distance covered and parameters of intermittent endurance measured through YoYoIRTL1. The KSAT has not proved to have indirect criterion-related validity as no significant correlations have been found between TE's KSAT and treadmill VO2max. Nevertheless, as it correlated with other aerobic fitness variables, KSAT can be considered as an indicator of karate specific endurance. The establishment of the criterion-related validity of the KSAT requires further investigation.
PTSD’s risky behavior criterion: Relation with DSM-5 PTSD symptom clusters and psychopathology
Contractor, Ateka A.; Weiss, Nicole H.; Dranger, Paula; Ruggero, Camilo; Armour, Cherie
2017-01-01
A new symptom criterion of reckless and self-destructive behaviors (E2) was recently added to posttraumatic stress disorder’s (PTSD) diagnostic criteria in DSM-5, which is unsurprising given the well-established relation between PTSD and risky behaviors. Researchers have questioned the significance and incremental validity of this symptom criterion within PTSD’s symptomatology. To our knowledge, this is the first study to compare trauma-exposed groups differing on their endorsement status of the risky behavior symptom on several psychopathology constructs (PTSD, depression, distress tolerance, rumination, anger). The sample included 123 trauma-exposed participants seeking mental health treatment (M age=35.70; 68.30% female) who completed self-report questionnaires assessing PTSD symptoms, depression, rumination, distress tolerance, and anger. Results of independent samples t-tests indicated that participants who endorsed the E2 criterion at a clinically significant level reported significantly greater PTSD subscale severity; depression severity; rumination facets of repetitive thoughts, counterfactual thinking, and problem-focused thinking; and anger reactions; and significantly less absorption and regulation (distress tolerance facets) compared to participants who did not endorse the E2 criterion at a clinically significant level. Results indicate the utility of the E2 criterion in identifying trauma-exposed individuals with greater posttraumatic distress, and emphasize the importance of targeting such behaviors in treatment. PMID:28285248
Becker, Anne E; Thomas, Jennifer J; Bainivualiku, Asenaca; Richards, Lauren; Navara, Kesaia; Roberts, Andrea L; Gilman, Stephen E; Striegel-Moore, Ruth H
2010-01-01
Objective: Measurement of disease-related impairment and distress is central to diagnostic, therapeutic, and health policy considerations for eating disorders across diverse populations. This study evaluates psychometric properties of a translated and adapted version of the Clinical Impairment Assessment (CIA) in an ethnic Fijian population. Method: The adapted CIA was administered to ethnic Fijian adolescent schoolgirls (N = 215). We calculated Cronbach's α to assess the internal consistency, examined the association between indicators of eating disorder symptom severity and the CIA to assess construct and criterion validity, and compared the strength of relation between the CIA and measures of disordered eating versus with measures of generalized distress. Results: The Fijian version of the CIA is feasible to administer as an investigator-based interview. It has excellent internal consistency (α = 0.93). Both construct and criterion validity were supported by the data, and regression models indicated that the CIA predicts eating disorder severity, even when controlling for generalized distress and psychopathology. Discussion: The adapted CIA has excellent psychometric properties in this Fijian study population. Findings suggest that the CIA can be successfully adapted for use in a non-Western study population and that at least some associated distress and impairment transcends cultural differences. © 2009 by Wiley Periodicals, Inc. Int J Eat Disord, 2010 PMID:19308992
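The internal-consistency statistic used above can be sketched with the standard formula for Cronbach's α, α = k/(k − 1) · (1 − Σ item variances / variance of total scores). This is only an illustration under that textbook definition. The function name and the scores are hypothetical, not the study's data.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha.

    items: list of item columns, each a list with one score per respondent.
    """
    k = len(items)
    if k < 2:
        raise ValueError("need at least two items")
    n = len(items[0])
    if n < 2 or any(len(col) != n for col in items):
        raise ValueError("every item needs the same number (>= 2) of respondents")
    # Total score of each respondent across all items.
    totals = [sum(scores) for scores in zip(*items)]
    total_var = variance(totals)
    if total_var == 0:
        raise ValueError("alpha is undefined when total scores do not vary")
    item_var_sum = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Hypothetical scores of four respondents on two items.
example_items = [
    [1, 2, 3, 4],
    [2, 1, 4, 3],
]
print(round(cronbach_alpha(example_items), 2))  # prints 0.75
```

Sample variances (n − 1 denominator) are used for both the items and the totals. Using population variances throughout gives the same α because the factor cancels in the ratio.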
Lee, Lay Wah
2008-06-01
Malay is an alphabetic language with transparent orthography. A Malay reading-related assessment battery which was conceptualised based on the International Dyslexia Association definition of dyslexia was developed and validated for the purpose of dyslexia assessment. The battery consisted of ten tests: Letter Naming, Word Reading, Non-word Reading, Spelling, Passage Reading, Reading Comprehension, Listening Comprehension, Elision, Rapid Letter Naming and Digit Span. Content validity was established by expert judgment. Concurrent validity was obtained using the schools' language tests as criterion. Evidence of predictive and construct validity was obtained through regression analyses and factor analyses. Phonological awareness was the most significant predictor of word-level literacy skills in Malay, with rapid naming making independent secondary contributions. Decoding and listening comprehension made separate contributions to reading comprehension, with decoding as the more prominent predictor. Factor analysis revealed four factors: phonological decoding, phonological naming, comprehension and verbal short-term memory. In conclusion, despite differences in orthography, there are striking similarities in the theoretical constructs of reading-related tasks in Malay and in English.
[Validation of a scale to assess the labour quality of life in public hospitals from Tlaxcala].
Hernández-Vicente, Irma Alejandra; Lumbreras-Guzmán, Marivel; Méndez-Hernández, Pablo; Rojas-Lima, Elodia; Cervantes-Rodríguez, Margarita; Juárez-Flores, Clara Arlina
2017-01-01
To validate a scale for assessing the labour quality of life in public hospitals (LQL-PH) from Tlaxcala, Mexico. The instrument was validated among 669 health workers from six hospitals from the Ministry of Health of Tlaxcala, Mexico. Content validity was assessed by expert consultation, construct validity by factor analysis, criterion validity by comparison with other scales, and reliability with Cronbach's alpha. The factor analysis uncovered four dimensions: "individual welfare", "conditions and labour environment", "organization", and "well-being accomplished by the work"; reliability was 0.921. Workers who perceived better LQL-PH were those under 50 years old, with a temporary contract, with less seniority in the job, with a daytime weekend work schedule, and with an academic degree. The LQL-PH proved to be a psychometrically valid and reliable instrument. It is advisable to test this scale in other public and private health institutions, as well as its relationship with key health care indicators of labour performance and management.
NASA Astrophysics Data System (ADS)
Putra, Z. A. Z.; Sumarmin, R.; Violita, V.
2018-04-01
The practicum guides used for animal physiology need to be revised and adapted to the lecture material. The existing guide for the Animal Physiology course is still conventional, with recipe-style instructions, and is so simple that a practical guide that can foster scientific work needs to be developed. One option is a guided-inquiry practicum guide. This study aims to describe the development process of the practicum guide and to reveal the validity, practicality, and effectiveness of the guided-inquiry Animal Physiology practicum guide for students of the Biology Department, State University of Padang. This is development research using the Plomp model. The stages performed are the problem identification and analysis stage, the prototype development and prototyping stage, and the assessment phase. Data were analyzed descriptively. The data collection instruments were validation and practicality questionnaires, observation of affective and psychomotor competence, and a cognitive-domain competence test. The results show that the guided-inquiry practicum guide scored 3.23 for validity (valid category), 3.30 for practicality as rated by lecturers (practical category), and 3.37 for practicality as rated by students (practical category). Effectiveness was 93.00% for the affective domain (very effective), 89.50% for the psychomotor domain (very effective), and 67 for the cognitive domain (pass criterion). The study concludes that the guided-inquiry practicum guide for students is valid, practical and effective.
Ethical leadership: meta-analytic evidence of criterion-related and incremental validity.
Ng, Thomas W H; Feldman, Daniel C
2015-05-01
This study examines the criterion-related and incremental validity of ethical leadership (EL) with meta-analytic data. Across 101 samples published over the last 15 years (N = 29,620), we observed that EL demonstrated acceptable criterion-related validity with variables that tap followers' job attitudes, job performance, and evaluations of their leaders. Further, followers' trust in the leader mediated the relationships of EL with job attitudes and performance. In terms of incremental validity, we found that EL significantly, albeit weakly in some cases, predicted task performance, citizenship behavior, and counterproductive work behavior-even after controlling for the effects of such variables as transformational leadership, use of contingent rewards, management by exception, interactional fairness, and destructive leadership. The article concludes with a discussion of ways to strengthen the incremental validity of EL. (PsycINFO Database Record (c) 2015 APA, all rights reserved).
2015-07-01
prior to, during, and following deployment: Dyadic Adjustment Scale – measures marital functioning Conflict-Tactics Scale Family Adaptability and...Applied Psychosocial Measurement,1, 385-401. Rocissano, L., Slade, A., & Lynch, V. (1987). Dyadic synchrony and toddler compliance. Developmental...new criterion Q-sort scale. Developmental Psychology, 33, 906-916. Spanier, G.B. (1976). Measuring dyadic adjustment: new scales for assessing the
Saraf, Sanatan; Mathew, Thomas; Roy, Anindya
2015-01-01
For the statistical validation of surrogate endpoints, an alternative formulation is proposed for testing Prentice's fourth criterion, under a bivariate normal model. In such a setup, the criterion involves inference concerning an appropriate regression parameter, and the criterion holds if the regression parameter is zero. Testing such a null hypothesis has been criticized in the literature since it can only be used to reject a poor surrogate, and not to validate a good surrogate. In order to circumvent this, an equivalence hypothesis is formulated for the regression parameter, namely the hypothesis that the parameter is equivalent to zero. Such an equivalence hypothesis is formulated as an alternative hypothesis, so that the surrogate endpoint is statistically validated when the null hypothesis is rejected. Confidence intervals for the regression parameter and tests for the equivalence hypothesis are proposed using bootstrap methods and small sample asymptotics, and their performances are numerically evaluated and recommendations are made. The choice of the equivalence margin is a regulatory issue that needs to be addressed. The proposed equivalence testing formulation is also adopted for other parameters that have been proposed in the literature on surrogate endpoint validation, namely, the relative effect and proportion explained.
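The equivalence formulation described above can be sketched with the usual confidence-interval-inclusion rule: the parameter is declared equivalent to zero when its confidence interval lies entirely inside (−margin, +margin). The sketch below is a generic stand-in and not the authors' procedure. It assumes a simple least-squares slope as the regression parameter and a plain percentile bootstrap, and every name, default, and margin in it is hypothetical.

```python
import random
from statistics import mean


def slope(x, y):
    """Least-squares slope of y on x, or None if x has no variation."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:
        return None
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


def bootstrap_ci(x, y, n_boot=2000, alpha=0.10, seed=0):
    """Percentile bootstrap confidence interval for the slope (pairs resampling)."""
    rng = random.Random(seed)
    n = len(x)
    slopes = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        b = slope([x[i] for i in idx], [y[i] for i in idx])
        if b is not None:  # skip degenerate resamples with constant x
            slopes.append(b)
    if not slopes:
        raise ValueError("no usable bootstrap resamples")
    slopes.sort()
    lower = slopes[int((alpha / 2) * len(slopes))]
    upper = slopes[int((1 - alpha / 2) * len(slopes)) - 1]
    return lower, upper


def equivalent_to_zero(ci, margin):
    """True when the whole interval lies strictly inside (-margin, margin)."""
    lower, upper = ci
    return -margin < lower and upper < margin


# Hypothetical data: the response does not depend on x at all.
x_values = list(range(20))
flat_response = [5.0] * 20
print(equivalent_to_zero(bootstrap_ci(x_values, flat_response), margin=0.1))  # prints True
```

A 90% interval is used here because interval inclusion at level 1 − 2α corresponds to two one-sided tests at level α. As the abstract notes, the choice of margin is a regulatory question, so the value above is purely illustrative.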
Pirkle, Catherine M; Dumont, Alexandre; Traore, Mamadou; Zunzunegui, Maria-Victoria
2012-10-29
In Mali and Senegal, over 1% of women die giving birth in hospital. At some hospitals, over a third of infants are stillborn. Many deaths are due to substandard medical practices. Criterion-based clinical audits (CBCA) are increasingly used to measure and improve obstetrical care in resource-limited settings, but their measurement properties have not been formally evaluated. In 2011, we published a systematic review of obstetrical CBCA highlighting insufficient considerations of validity and reliability. The objective of this study is to develop an obstetrical CBCA adapted to the West African context and assess its reliability and validity. This work was conducted as a sub-study within a cluster randomized trial known as QUARITE. Criteria were selected based on extensive literature review and expert opinion. Early 2010, two auditors applied the CBCA to identical samples at 8 sites in Mali and Senegal (n = 185) to evaluate inter-rater reliability. In 2010-11, we conducted CBCA at 32 hospitals to assess construct validity (n = 633 patients). We correlated hospital characteristics (resource availability, facility perinatal and maternal mortality) with mean hospital CBCA scores. We used generalized estimating equations to assess whether patient CBCA scores were associated with perinatal mortality. Results demonstrate substantial (ICC = 0.67, 95% CI 0.54; 0.76) to elevated inter-rater reliability (ICC = 0.84, 95% CI 0.77; 0.89) in Senegal and Mali, respectively. Resource availability positively correlated with mean hospital CBCA scores and maternal and perinatal mortality were inversely correlated with hospital CBCA scores. Poor CBCA scores, adjusted for hospital and patient characteristics, were significantly associated with perinatal mortality (OR 1.84, 95% CI 1.01-3.34). Our CBCA has substantial inter-rater reliability and there is compelling evidence of its validity as the tool performs according to theory. Current Controlled Trials ISRCTN46950658.
Roets-Merken, Lieve M; Zuidema, Sytse U; Vernooij-Dassen, Myrra J F J; Kempen, Gertrudis I J M
2014-11-01
This study investigated the psychometric properties of the Severe Dual Sensory Loss screening tool, a tool designed to help nurses and care assistants to identify hearing, visual and dual sensory impairment in older adults. Construct validity of the Severe Dual Sensory Loss screening tool was evaluated using Cronbach's alpha and factor analysis. Interrater reliability was calculated using Kappa statistics. To evaluate the predictive validity, sensitivity and specificity were calculated by comparison with the criterion standard assessment for hearing and vision. The criterion used for hearing impairment was a hearing loss of ≥40 decibel measured by pure-tone audiometry, and the criterion for visual impairment was a visual acuity of ≤0.3 diopter or a visual field of ≤0.3°. Feasibility was evaluated by the time needed to fill in the screening tool and the clarity of the instruction and items. Prevalence of dual sensory impairment was calculated. A total of 56 older adults receiving aged care and 12 of their nurses and care assistants participated in the study. Cronbach's alpha was 0.81 for the hearing subscale and 0.84 for the visual subscale. Factor analysis showed two constructs for hearing and two for vision. Kappa was 0.71 for the hearing subscale and 0.74 for the visual subscale. The predictive validity showed a sensitivity of 0.71 and a specificity of 0.72 for the hearing subscale; and a sensitivity of 0.69 and a specificity of 0.78 for the visual subscale. The optimum cut-off point for each subscale was score 1. The nurses and care assistants reported that the Severe Dual Sensory Loss screening tool was easy to use. The prevalence of hearing and vision impairment was 55% and 29%, respectively, and that of dual sensory impairment was 20%.
The Severe Dual Sensory Loss screening tool was compared with the criterion standards for hearing and visual impairment and was found to be a valid and reliable tool, enabling nurses and care assistants to identify hearing, visual and dual sensory impairment among older adults. Copyright © 2014 Elsevier Ltd. All rights reserved.
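The sensitivity and specificity figures above come from comparing screening results with a criterion standard. A minimal sketch of that calculation follows, assuming the standard definitions (sensitivity = true positives / all truly impaired, specificity = true negatives / all truly unimpaired). The function name and the data are hypothetical and do not reproduce the study's sample.

```python
def sensitivity_specificity(screen_positive, truly_impaired):
    """Return (sensitivity, specificity) of a screen against a criterion standard.

    Both arguments are equal-length lists of booleans, one entry per person.
    """
    if len(screen_positive) != len(truly_impaired):
        raise ValueError("inputs must have the same length")
    pairs = list(zip(screen_positive, truly_impaired))
    tp = sum(1 for s, t in pairs if s and t)          # screen +, criterion +
    fn = sum(1 for s, t in pairs if not s and t)      # screen -, criterion +
    tn = sum(1 for s, t in pairs if not s and not t)  # screen -, criterion -
    fp = sum(1 for s, t in pairs if s and not t)      # screen +, criterion -
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one impaired and one unimpaired person")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical screening results and criterion-standard outcomes for six people.
screen = [True, True, False, False, True, False]
criterion = [True, True, True, False, False, False]
sens, spec = sensitivity_specificity(screen, criterion)
print(round(sens, 2), round(spec, 2))  # prints 0.67 0.67
```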
ERIC Educational Resources Information Center
MacQuarrie, David; Applegate, Brooks; Lacefield, Warren
2008-01-01
Career and Technical Education (CTE) is a nationwide program that emphasizes training for primary, secondary, and post secondary educational stages for the career and workforce needs of today and tomorrow's society. Mandated indicators of success have been set in place and secondary schools are expected to improve student's skill levels in…
ERIC Educational Resources Information Center
Schimmel, Tammy; Johnston, Pattie C.; Stasio, Mike
2013-01-01
The professoriate has been debating the value of adding collegiality as a fourth criterion in faculty evaluations. Collegiality is considered to be any extra-role behavior that represents individuals' behavior that is discretionary, not recognized by the formal reward system and that, in the aggregate, promotes the effective functioning of the…
González-Sánchez, Manuel; Ruiz-Muñoz, Maria; Li, Guang Zhi; Cuesta-Vargas, Antonio I
2018-08-01
To perform a cross-cultural adaptation and validation of the Foot Function Index (FFI) questionnaire to develop the Chinese version. Three hundred and six patients with foot and ankle neuromusculoskeletal diseases participated in this observational study. Construct validity, internal consistency and criterion validity were calculated for the FFI Chinese version after the translation and transcultural adaptation process. Internal consistency ranged from 0.996 to 0.998. Test-retest analysis ranged from 0.985 to 0.994; minimal detectable change 90: 2.270; standard error of measurement: 0.973. Load distribution of the three factors had an eigenvalue greater than 1. Chi-square value was 9738.14 (p < 0.001). Correlations with the three factors were significant between Factor 1 and the other two: r = -0.634 (Factor 2) and r = -0.191 (Factor 1). Foot Function Index (Taiwan Version), Short-Form 12 (Version 2) and EuroQol-5D were used for criterion validity. Factors 1 and 2 showed significant correlation with 15/16 and 14/16 scales and subscales, respectively. The psychometric characteristics of the Foot Function Index Chinese version were good to excellent. Chinese researchers and clinicians may use this tool for foot and ankle assessment and monitoring. Implications for rehabilitation: A cross-cultural adaptation of the FFI from the original version to Chinese has been carried out. Consistent results and satisfactory psychometric properties of the Foot Function Index Chinese version have been reported. Chinese-speaking researchers and clinicians could use the FFI-Ch as a tool to assess patients with foot disease.
Pellegrino, Federica; Groff, Elena; Bastiani, Luca; Fattori, Bruno; Sotti, Guido
2015-04-01
Xerostomia is the most common acute and late side effect of radiation treatment for head and neck cancer. Affecting taste perception, chewing, swallowing and speech, xerostomia is also the major cause of decreased quality of life. The aims of this study were to validate the Italian translation of the self-reported eight-item xerostomia questionnaire (XQ) and determine its psychometric properties in patients treated with radiotherapy for head and neck cancer. An observational cross-sectional study was conducted in the Radiotherapy Unit of the Veneto Institute of Oncology - IOV in Padua. The XQ was translated according to international guidelines and filled out by 102 patients. Construct validity was assessed using principal component analysis, internal consistency using Cronbach's α coefficient and test-retest reliability at 1-month interval using the intraclass correlation coefficient (ICC). Criterion-related validity was evaluated to compare the Italian version of XQ with the European Organization for Research and Treatment of Cancer (EORTC) Core Quality-of-Life Questionnaire (QLQ-C30) and its Head and Neck Cancer Module (QLQ-H&N35). Cronbach's α for the Italian version of XQ was strong at α = 0.93, test-retest reliability was also strong (0.79) and factor analysis confirmed that the questionnaire was one-dimensional. Criterion-related validity was excellent with high association with the EORTC QLQ-H&N35 xerostomia and sticky saliva scales. The Italian version of XQ has excellent psychometric properties and can be used to evaluate the impact of emerging radiation delivery techniques aiming at preventing xerostomia.
Anota, Amélie; Mariet, Anne-Sophie; Maingon, Philippe; Joly, Florence; Bosset, Jean-François; Guizard, Anne-Valérie; Bittard, Hugues; Velten, Michel; Mercier, Mariette
2016-12-06
Health-related quality of life (HRQoL) has been positioned as one of the major endpoints in oncology. Thus, there is a need to validate cancer-site specific survey instruments. This study aimed to perform a transcultural adaptation of the 50-item Expanded Prostate cancer Index Composite (EPIC) questionnaire for HRQoL in prostate cancer patients and to validate the psychometric properties of the French-language version. The EPIC questionnaire measures urinary, bowel, sexual and hormonal domains. The first step, corresponding to transcultural adaptation of the original English version of the EPIC, was performed according to the back translation technique. The second step, comprising the validation of the psychometric properties of the EPIC questionnaire, was performed in patients under treatment for localized prostate cancer (Treatment group) and in patients cured of prostate cancer (Cured group). The EORTC QLQ-C30 and QLQ-PR25 prostate cancer module were also completed by patients to assess criterion validity. Two assessments were performed, i.e., before and at the end of treatment for the Treatment group, to assess sensitivity to change; and at 2 weeks' interval in the Cured group to assess test-retest reliability. Psychometric properties were explored according to classical test theory. The first step showed overall good acceptability and understanding of the questionnaire. In the second step, 215 patients were included from January 2012 to June 2014: 125 in the Treatment group, and 90 in the Cured group. All domains exhibited good internal consistency, except the bowel domain (Cronbach's α = 0.61). No floor effect was observed. Test-retest reliability assessed in the Cured group was acceptable, except for bowel function (intraclass coefficient = 0.68). Criterion validity was good for each domain and subscale. Construct validity was not demonstrated for the hormonal and bowel domains.
Sensitivity to change was exhibited for 5/8 subscales and 2/4 summary scores for patients who experienced toxicities during treatment. The French EPIC questionnaire seems to have adequate psychometric properties, comparable to those exhibited by the original English-language version, except for the construct validity, which was not available in the original version.
Evaluating the spoken English proficiency of graduates of foreign medical schools.
Boulet, J R; van Zanten, M; McKinley, D W; Gary, N E
2001-08-01
The purpose of this study was to gather additional evidence for the validity and reliability of spoken English proficiency ratings provided by trained standardized patients (SPs) in a high-stakes clinical skills examination. Over 2500 candidates who took the Educational Commission for Foreign Medical Graduates' (ECFMG) Clinical Skills Assessment (CSA) were studied. The CSA consists of 10 or 11 timed clinical encounters. Standardized patients evaluate spoken English proficiency and interpersonal skills in every encounter. Generalizability theory was used to estimate the consistency of spoken English ratings. Validity coefficients were calculated by correlating summary English ratings with CSA scores and other external criterion measures. Mean spoken English ratings were also compared by various candidate background variables. The reliability of the spoken English ratings, based on 10 independent evaluations, was high. The magnitudes of the associated variance components indicated that the evaluation of a candidate's spoken English proficiency is unlikely to be affected by the choice of cases or SPs used in a given assessment. Proficiency in spoken English was related to native language (English versus other) and scores from the Test of English as a Foreign Language (TOEFL). The pattern of the relationships, both within assessment components and with external criterion measures, suggests that valid measures of spoken English proficiency are obtained. This result, combined with the high reproducibility of the ratings over encounters and SPs, supports the use of trained SPs to measure spoken English skills in a simulated medical environment.
Guetterman, Timothy C; Creswell, John W; Wittink, Marsha; Barg, Fran K; Castro, Felipe G; Dahlberg, Britt; Watkins, Daphne C; Deutsch, Charles; Gallo, Joseph J
2017-01-01
Demand for training in mixed methods is high, with little research on faculty development or assessment in mixed methods. We describe the development of a self-rated mixed methods skills assessment and provide validity evidence. The instrument taps six research domains: "Research question," "Design/approach," "Sampling," "Data collection," "Analysis," and "Dissemination." Respondents are asked to rate their ability to define or explain concepts of mixed methods under each domain, their ability to apply the concepts to problems, and the extent to which they need to improve. We administered the questionnaire to 145 faculty and students using an internet survey. We analyzed descriptive statistics and performance characteristics of the questionnaire using the Cronbach alpha to assess reliability and an analysis of variance that compared a mixed methods experience index with assessment scores to assess criterion relatedness. Internal consistency reliability was high for the total set of items (0.95) and adequate (≥0.71) for all but one subscale. Consistent with establishing criterion validity, respondents who had more professional experiences with mixed methods (eg, published a mixed methods article) rated themselves as more skilled, which was statistically significant across the research domains. This self-rated mixed methods assessment instrument may be a useful tool to assess skills in mixed methods for training programs. It can be applied widely at the graduate and faculty level. For the learner, assessment may lead to enhanced motivation to learn and training focused on self-identified needs. For faculty, the assessment may improve curriculum and course content planning.
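The internal consistency figures reported above are Cronbach's alpha coefficients. As a minimal sketch of how such a coefficient is computed from item-level responses (the function name and the tiny data set are illustrative assumptions, not taken from the study):

```python
from statistics import variance

def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(scores[0])
    # Sum of the sample variances of each item (column).
    item_var_sum = sum(variance(col) for col in zip(*scores))
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_var_sum / total_var)

# Hypothetical responses: 3 respondents, 2 items.
demo = [[2, 3], [4, 4], [3, 5]]
alpha = cronbach_alpha(demo)
```

Alpha rises as items covary more strongly relative to their individual variances, which is why it is read as an index of internal consistency rather than of validity.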
Bahammam, Maha A.
2016-01-01
Objectives: To test the psychometric properties of an adapted Arabic version of the state trait anxiety-form Y (STAI-Y) in Saudi adult dental patients. Methods: In this cross-sectional study, the published Arabic version of the STAI-Y was evaluated by 2 experienced bilingual professionals for its compatibility with Saudi culture and revised prior to testing. Three hundred and eighty-seven patients attending dental clinics for treatment at the Faculty of Dentistry Hospital, King Abdullah University, Jeddah, Kingdom of Saudi Arabia, participated in the study. The Arabic version of the modified dental anxiety scale (MDAS) and visual analogue scale (VAS) ratings of anxiety were used to assess the concurrent criterion validity. Results: The Arabic version of the STAI-Y had high internal consistency reliability (Cronbach’s alpha: 0.989) for state and trait subscales. Factor analysis indicated unidimensionality of the scale. Correlations between STAI-Y scores and both MDAS and VAS scores indicated strong concurrent criterion validity. Discriminant validity was supported by the findings that higher anxiety levels were present among females as opposed to males, younger individuals as compared to older individuals, and patients who do not visit the dentist unless they have a need as opposed to more frequent visitors to the dental office. Conclusion: The Arabic version of the STAI-Y has an adequate internal consistency reliability, generally similar to that reported in the international literature, suggesting it is appropriate for assessing dental anxiety in Arabic speaking populations. PMID:27279514
Tsuno, Kanami; Yoshimasu, Kouichi; Hayashi, Takashi; Tatsuta, Nozomi; Ito, Yuki; Kamijima, Michihiro; Nakai, Kunihiko
2018-01-01
Attention deficit hyperactivity (ADH) problems are commonly observed among school-age children. However, very few questionnaires are specific to ADH behaviors among preschool children. The aim of this study was to investigate the reliability and validity of the 25-item Behavioral Check List (BCL), which was developed from interviews of parents with children who were diagnosed as having Attention-deficit/hyperactivity disorder (ADHD) and measures ADH behaviors in preschool age. We recruited 22 teachers from 10 nurseries/kindergartens in Miyagi Prefecture, Japan. A total of 138 preschool children were assessed using the BCL. To investigate inter-rater reliability, two teachers from each facility assessed seven to twenty children in their class, and intraclass correlation coefficients (ICCs) were calculated. The teachers additionally answered questions in the 1½-5 Caregiver-Teacher Report Form (C-TRF) to investigate the criterion validity of the BCL. To investigate structural validity, exploratory factor analysis with promax rotation and confirmatory factor analysis were performed. The internal consistency reliability of the BCL was good (α = 0.92) and correlation analyses also confirmed its excellent criterion validity. Although exploratory factor analysis for the BCL yielded a five-factor model that consisted of a factor structure different from that of the original one, the results were similar to the original six factors. The ICCs of the BCL were 0.38-0.99, which was not high enough for inter-rater reliability in some facilities. However, this might be improved by giving raters adequate explanations when using the BCL. The present study showed acceptable levels of reliability and validity of the BCL among Japanese preschool children.
Validation of the Military Entrance Physical Strength Capacity Test. Technical Report 610.
ERIC Educational Resources Information Center
Myers, David C.; And Others
A battery of physical ability tests was validated using a predictive, criterion-related strategy. The battery was given to 1,003 female soldiers and 980 male soldiers before they had begun Army Basic Training. Criterion measures which represented physical competency in Basic Training (physical proficiency tests, sick call, profiles, and separation…
ERIC Educational Resources Information Center
Mooney, Paul; Lastrapes, Renée E.
2016-01-01
The amount of research evaluating the technical merits of general outcome measures of science and social studies achievement is growing. This study targeted criterion validity for critical content monitoring. Questions addressed the concurrent criterion validity of alternate presentation formats of critical content monitoring and the measure's…
Validation of a Criterion Referenced Test for Young Handicapped Children: PIPER.
ERIC Educational Resources Information Center
Strum, Irene; Shapiro, Madelaine
The purpose of this study was to validate the Prescriptive Instructional Program for Educational Readiness (PIPER) for utilization as a criterion referenced test (CRT) among learning disabled children. The program consisted of behavioral objectives and diagnostic and/or mastery tasks and activities for each objective in the area of gross motor…
Evaluation of Weighted Scale Reliability and Criterion Validity: A Latent Variable Modeling Approach
ERIC Educational Resources Information Center
Raykov, Tenko
2007-01-01
A method is outlined for evaluating the reliability and criterion validity of weighted scales based on sets of unidimensional measures. The approach is developed within the framework of latent variable modeling methodology and is useful for point and interval estimation of these measurement quality coefficients in counseling and education…
Meta-Analysis of Criterion Validity for Curriculum-Based Measurement in Written Language
ERIC Educational Resources Information Center
Romig, John Elwood; Therrien, William J.; Lloyd, John W.
2017-01-01
We used meta-analysis to examine the criterion validity of four scoring procedures used in curriculum-based measurement of written language. A total of 22 articles representing 21 studies (N = 21) met the inclusion criteria. Results indicated that two scoring procedures, correct word sequences and correct minus incorrect sequences, have acceptable…
SymptoMScreen: A Tool for Rapid Assessment of Symptom Severity in MS Across Multiple Domains.
Green, R; Kalina, J; Ford, R; Pandey, K; Kister, I
2017-01-01
The objective of this study was to describe SymptoMScreen, an in-house developed tool for rapid assessment of MS symptom severity in routine clinical practice, and to validate SymptoMScreen against Performance Scales (PS). MS patients typically experience symptoms in many neurologic domains. A tool that would enable MS patients to efficiently relay their symptom severity across multiple domains to the healthcare providers could lead to improved symptom management. We developed "SymptoMScreen," a battery of 7-point Likert scales for 12 distinct domains commonly affected by MS: mobility, dexterity, body pain, sensation, bladder function, fatigue, vision, dizziness, cognition, depression, and anxiety. We administered SymptoMScreen and PS scales to consecutive MS patients at a specialty MS Care Center. We assessed the criterion and construct validity of SymptoMScreen by calculating Spearman rank correlations between the SymptoMScreen composite score and PS composite score, and between SymptoMScreen subscale and the respective PS subscale scores, where applicable. A total of 410 patients with MS (age 46.6 ± 12.9 years; 74% female; mean disease duration 12.2 ± 8.7 years) completed the SymptoMScreen and PSs during their clinic visit. Composite SymptoMScreen score correlated strongly with combined PS score (r = 0.88, p < 0.0001). SymptoMScreen subscores correlated strongly with the criterion measures of the respective PS (r = 0.69-0.87, p < 0.0001). Test-retest reliability of SymptoMScreen and its subscales was excellent (r = 0.71-0.94, p < .0001). SymptoMScreen is a single-page battery of Likert scales that assesses symptom impact in 12 domains commonly affected in MS. It has excellent criterion and construct validity. SymptoMScreen is patient and clinician friendly, takes approximately one minute to complete, and can help better document, understand, and manage patients' symptoms in routine clinical practice. SymptoMScreen is freely available to clinicians and researchers.
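The criterion validity coefficients in the abstract above are Spearman rank correlations between a new scale and a reference scale. A minimal sketch of that statistic, assuming ties receive average ranks (function names and the example scores are hypothetical, not study data):

```python
from math import sqrt

def average_ranks(values):
    """1-based ranks; tied values share the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the block of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical composite scores on a new scale and a criterion scale.
new_scale = [3, 10, 7, 15]
criterion = [1, 6, 4, 9]
rho = spearman_rho(new_scale, criterion)
```

Because only ranks enter the calculation, the coefficient suits ordinal Likert-type scores, where equal spacing between response options cannot be assumed.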
Lou, Yanni; Lu, Linghui; Li, Yuan; Liu, Meng; Bredle, Jason M; Jia, Liqun
2015-10-01
The study objective was to determine the reliability and validity of the Chinese version of the Functional Assessment of Chronic Illness Therapy - Ascites Index (FACIT-AI). A forward-backward translation procedure was adopted to develop the Chinese version of the FACIT-AI, which was tested in 69 patients with malignant ascites. Cronbach's α, split-half reliability, and test-retest reliability were used to assess the reliability of the scale. The content validity index was used to assess the content validity, while factor analysis was used for construct validity and correlation analysis was used for criterion validity. The Cronbach's α was 0.772 for the total scale, and the split-half reliability was 0.693. The test-retest correlation was 0.972. The content validity index for the scale was 0.8-1.0. Four factors were extracted by factor analysis, and these contributed 63.51% of the total variance. Item-total correlations ranged from 0.591 to 0.897, and these were correlated with visual analog scale scores (correlation coefficient, 0.889; P<0.01). The Chinese version of the FACIT-AI has good reliability and validity and can be used as a tool to measure quality of life in Chinese patients with malignant ascites.
Rodríguez-Martínez, Carlos E; Nino, Gustavo; Castro-Rodriguez, Jose A
2014-01-01
There is a critical need for validation studies of questionnaires designed to assess the level of control of asthma in children younger than 5 years old. To validate the Spanish version of the Test for Respiratory and Asthma Control in Kids (TRACK) questionnaire in children younger than age 5 years with symptoms consistent with asthma. In a prospective cohort validation study, parents and/or caregivers of children younger than age 5 years and with symptoms consistent with asthma, during a baseline and a follow-up visit 2 to 6 weeks later, completed the information required to assess the content validity, criterion validity, construct validity, test-retest reliability, sensitivity to change, internal consistency reliability, and usability of the TRACK questionnaire. Median (interquartile range) of the TRACK scores were significantly different between patients with well-controlled asthma, patients with not well-controlled asthma, and patients with very poorly controlled asthma (90.0 [75.0-95.0], 75.0 [55.0-85.0], and 35.0 [25.0-55.0], respectively, P < .001). TRACK scores were significantly different between patients classified as currently symptomatic and symptomatic in the recent past (42.5 [25.0-55.0] vs 85.0 [75.0-90.0]; P < .001). The intraclass correlation coefficient of the measurements was 0.755 (95% CI, 0.503-1.00). All patients whose clinical status changed showed an increase of 10 or more points in TRACK score between baseline and follow-up visits. The Cronbach α was 0.77 for the questionnaire as a whole. The Spanish version of the TRACK questionnaire has excellent sensitivity to change and usability; adequate criterion validity, construct validity, and test-retest reliability; and an acceptable internal consistency, when used in children younger than age 5 years with symptoms consistent with asthma. Copyright © 2014 American Academy of Allergy, Asthma & Immunology. Published by Elsevier Inc. All rights reserved.
Maćkiewicz, Marta; Cieciuch, Jan
2016-01-01
In order to adjust personality measurements to children's developmental level, we constructed the Pictorial Personality Traits Questionnaire for Children (PPTQ-C). To validate the measure, we conducted a study with a total group of 1028 children aged between 7 and 13 years old. Structural validity was established through Exploratory Structural Equation Model (ESEM). Criterion validity was confirmed with a multitrait-multimethod analysis for which we introduced the children's self-assessment scores from the Big Five Questionnaire for Children. Despite some problems with reliability, one can conclude that the PPTQ-C can be a valid instrument for measuring personality traits, particularly in a group of young children (aged ~7-10 years).
Validation of the Weight Concerns Scale Applied to Brazilian University Students.
Dias, Juliana Chioda Ribeiro; da Silva, Wanderson Roberto; Maroco, João; Campos, Juliana Alvares Duarte Bonini
2015-06-01
The aim of this study was to evaluate the validity and reliability of the Portuguese version of the Weight Concerns Scale (WCS) when applied to Brazilian university students. The scale was completed by 1084 university students from Brazilian public education institutions. A confirmatory factor analysis was conducted. The stability of the model in independent samples was assessed through multigroup analysis, and the invariance was estimated. Convergent, concurrent, divergent, and criterion validities as well as internal consistency were estimated. Results indicated that the one-factor model presented an adequate fit to the sample and values of convergent validity. The concurrent validity with the Body Shape Questionnaire and divergent validity with the Maslach Burnout Inventory for Students were adequate. Internal consistency was adequate, and the factorial structure was invariant in independent subsamples. The results present a simple and short instrument capable of precisely and accurately assessing concerns with weight among Brazilian university students. Copyright © 2015 Elsevier Ltd. All rights reserved.
Haggerty, Greg; Bornstein, Robert F.; Khalid, Mohammad; Sharma, Vishal; Riaz, Usman; Blanchard, Mark; Siefert, Caleb J; Sinclair, Samuel J.
2015-01-01
This study assessed the construct validity of the Relationship Profile Test (RPT; Bornstein & Languirand, 2003) with a substance abuse sample. One hundred eight substance abuse patients completed the RPT, Experiences in Close Relationships Scale (ECR-SF; Wei, Russell, Mallinckrodt, & Vogel, 2007), Personality Assessment Inventory (PAI; Morey, 1991), and Symptom Checklist-90-Revised (SCL-90-R; Derogatis, 1983). Results suggest that the RPT has good construct validity when compared against theoretically related broadband measures of personality, psychopathology and adult attachment. Overall, healthy dependency was negatively related to measures of psychopathology and insecure attachment, and overdependence was positively related to measures of psychopathology and attachment anxiety. Many of the predictions regarding RPT detachment and the criterion measures were not supported. Implications of these findings are discussed. PMID:26620463
Rohlf, Helena L.; Krahé, Barbara
2015-01-01
An observational measure of anger regulation in middle childhood was developed that facilitated the in situ assessment of five maladaptive regulation strategies in response to an anger-eliciting task. 599 children aged 6–10 years (M = 8.12, SD = 0.92) participated in the study. Construct validity of the measure was examined through correlations with parent- and self-reports of anger regulation and anger reactivity. Criterion validity was established through links with teacher-rated aggression and social rejection measured by parent-, teacher-, and self-reports. The observational measure correlated significantly with parent- and self-reports of anger reactivity, whereas it was unrelated to parent- and self-reports of anger regulation. It also made a unique contribution to predicting aggression and social rejection. PMID:25964767
Scherrer, Vsevolod; Roberts, Richard; Preckel, Franzis
2016-01-01
Meta-analyses suggest that morning-oriented students obtain better school grades than evening-oriented students. This finding has generally been found for students in high school using self-report data for the assessment of circadian preference. Two studies (N = 2718/192) investigated whether these findings generalize across samples (i.e. elementary school-aged students) and methods (i.e. parent reports). These studies also explored whether the relation between circadian preference and school achievement could be explained within an expectancy-value framework. To this end, the Lark-Owl Chronotype Indicator (LOCI) was modified to obtain parents' evaluations of their children's circadian preference, while students completed a battery of assessments designed to explore the test-criterion evidence. Structural equation modeling and correlational analyses revealed: (1) morning and evening orientation were two separable factors of children's circadian preference; (2) correlations with behavioral (e.g. sleep and eating times) and psychological (e.g. cognitive ability) data supported the test-criterion validity of both factors; (3) morning orientation was positively related to school achievement and (4) consistent with an expectancy-value framework this relation was mediated by children's academic self-concept (ASC). These findings have important research and policy implications for considering circadian preference in the schooling of elementary students.
Gaudin, Valérie
2017-09-01
Screening methods are used as a first-line approach to detect the presence of antibiotic residues in food of animal origin. The validation process guarantees that the method is fit-for-purpose, suited to regulatory requirements, and provides evidence of its performance. This article is focused on intra-laboratory validation. The first step in validation is characterisation of performance, and the second step is the validation itself with regard to pre-established criteria. The validation approaches can be absolute (a single method) or relative (comparison of methods), overall (combination of several characteristics in one) or criterion-by-criterion. Various approaches to validation, in the form of regulations, guidelines or standards, are presented and discussed to draw conclusions on their potential application for different residue screening methods, and to determine whether or not they reach the same conclusions. The approach by comparison of methods is not suitable for screening methods for antibiotic residues. The overall approaches, such as probability of detection (POD) and accuracy profile, are increasingly used in other fields of application. They may be of interest for screening methods for antibiotic residues. Finally, the criterion-by-criterion approach (Decision 2002/657/EC and the European guideline for the validation of screening methods), usually applied to the screening methods for antibiotic residues, introduced a major characteristic and an improvement in the validation, i.e. the detection capability (CCβ). In conclusion, screening methods are constantly evolving, thanks to the development of new biosensors or liquid chromatography coupled to tandem-mass spectrometry (LC-MS/MS) methods. There have been clear changes in validation approaches over the last 20 years. Continued progress is required and perspectives for future development of guidelines, regulations and standards for validation are presented here.
Witt, Edward A.; Donnellan, M. Brent; Blonigen, Daniel M.; Krueger, Robert F.; Conger, Rand D.
2009-01-01
This report provides evidence for the reliability, validity, and developmental course of the psychopathic personality traits of Fearless Dominance (FD) and Impulsive Antisociality (IA) as assessed by items from Multidimensional Personality Questionnaire (MPQ; Patrick, Curtin, & Tellegen, 2002). In Study 1, MPQ-based measures of FD and IA were strongly correlated with their corresponding composite scores from the Psychopathic Personality Inventory-Revised (Lilienfeld & Widows, 2005). In Study 2, FD and IA had relatively distinct associations with measures of normal and maladaptive personality traits. In Study 3, FD and IA had substantial retest coefficients during the transition to adulthood and both traits showed average declines with an especially substantial drop in IA. In Study 4, FD and IA were correlated with measures of internalizing and externalizing problems in ways consistent with previous research and theory. Collectively, these results provide important information about the assessment of FD and IA. PMID:19365767
Goth, Kirstin; Foelsch, Pamela; Schlüter-Müller, Susanne; Birkhölzer, Marc; Jung, Emanuel; Pick, Oliver; Schmeck, Klaus
2012-07-19
In the continuing revision of Diagnostic and Statistical Manual (DSM-V) "identity" is integrated as a central diagnostic criterion for personality disorders (self-related personality functioning). According to Kernberg, identity diffusion is one of the core elements of borderline personality organization. As there is no elaborated self-rating inventory to assess identity development in healthy and disturbed adolescents, we developed the AIDA (Assessment of Identity Development in Adolescence) questionnaire to assess this complex dimension, varying from "Identity Integration" to "Identity Diffusion", in a broad and substructured way and evaluated its psychometric properties in a mixed school and clinical sample. Test construction was deductive, referring to psychodynamic as well as social-cognitive theories, and led to a special item pool, with consideration for clarity and ease of comprehension. Participants were 305 students aged 12-18 attending a public school and 52 adolescent psychiatric inpatients and outpatients with diagnoses of personality disorders (N = 20) or other mental disorders (N = 32). Convergent validity was evaluated by covariations with personality development (JTCI 12-18 R scales), criterion validity by differences in identity development (AIDA scales) between patients and controls. AIDA showed excellent total score (Diffusion: α = .94), scale (Discontinuity: α = .86; Incoherence: α = .92) and subscale (α = .73-.86) reliabilities. High levels of Discontinuity and Incoherence were associated with low levels in Self Directedness, an indicator of maladaptive personality functioning. Both AIDA scales were significantly different between PD-patients and controls with remarkable effect sizes (d) of 2.17 and 1.94 standard deviations. AIDA is a reliable and valid instrument to assess normal and disturbed identity in adolescents. Studies for further validation and for obtaining population norms are in progress and may provide insight in the relevant aspects of identity development in differentiating specific psychopathology and therapeutic focus and outcome. PMID:22812911
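The AIDA abstract expresses known-groups criterion validity as effect sizes (d) in standard deviation units. A minimal sketch of Cohen's d, assuming the pooled-standard-deviation form (the abstract does not state which variant was used, and the scores below are hypothetical):

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled sample standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)

# Hypothetical scale scores for a clinical group and a control group.
patients = [2, 4, 6]
controls = [1, 3, 5]
d = cohens_d(patients, controls)
```

A d of about 2, as reported above, means the two group means lie roughly two pooled standard deviations apart.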
Steagall, Paulo V M; Monteiro, Beatriz P; Lavoie, Anne-Marie; Frank, Diane; Troncy, Eric; Luna, Stelio P L; Brondani, Juliana T
2017-01-01
Validation of the French version of the UNESP-Botucatu multidimensional composite pain scale for assessing postoperative pain in cats. The aim of this study was to validate the French version of the UNESP-Botucatu multidimensional composite pain scale (MCPS-Fr) to assess postoperative pain in cats. Two veterinarians and one DVM student identified three domains of behavior based on video analyses: "psychomotor change", "protection of the painful area" and "physiological variables". Internal consistency was excellent (Cronbach's alpha coefficient of 0.94, 0.90 and 0.61, respectively). Criterion validity was good to very good when evaluations from the three observers were compared with a "gold standard". Inter- and intra-rater reliability for each scale item were good to very good. The optimal cut-off point identified with a ROC curve was > 7 (scale range 0-30 points), with a sensitivity of 97.8% and specificity of 99.1%. The MCPS-Fr is a valid, reliable and responsive instrument for assessing acute pain in cats undergoing ovariohysterectomy. (Translated by Dr. Beatriz Monteiro).
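The cut-off reported above (score > 7) comes with a sensitivity and a specificity. As a rough sketch of how those two figures follow from a chosen threshold, assuming a score strictly above the cut-off counts as a positive test (the function name and the six observations are hypothetical, not study data):

```python
def sensitivity_specificity(scores, has_condition, cutoff):
    """Classify score > cutoff as positive and compare against the reference standard."""
    pairs = list(zip(scores, has_condition))
    tp = sum(1 for s, c in pairs if c and s > cutoff)
    fn = sum(1 for s, c in pairs if c and s <= cutoff)
    tn = sum(1 for s, c in pairs if not c and s <= cutoff)
    fp = sum(1 for s, c in pairs if not c and s > cutoff)
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical pain scores with a reference-standard judgment for each animal.
scores = [2, 5, 8, 12, 7, 9]
in_pain = [False, False, True, True, True, False]
sens, spec = sensitivity_specificity(scores, in_pain, cutoff=7)
```

Repeating this calculation for every candidate threshold yields the points of the ROC curve from which an optimal cut-off is usually selected.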
A criterion for maximum resin flow in composite materials curing process
NASA Astrophysics Data System (ADS)
Lee, Woo I.; Um, Moon-Kwang
1993-06-01
On the basis of Springer's resin flow model, a criterion for maximum resin flow in autoclave curing is proposed. Validity of the criterion was proved for two resin systems (Fiberite 976 and Hercules 3501-6 epoxy resin). The parameter required for the criterion can be easily estimated from the measured resin viscosity data. The proposed criterion can be used in establishing the proper cure cycle to ensure maximum resin flow and, thus, the maximum compaction.
A Rapid Assessment Tool for affirming good practice in midwifery education programming.
Fullerton, Judith T; Johnson, Peter; Lobe, Erika; Myint, Khine Haymar; Aung, Nan Nan; Moe, Thida; Linn, Nay Aung
2016-03-01
To design a criterion-referenced assessment tool that could be used globally in a rapid assessment of good practices and bottlenecks in midwifery education programs. A standard tool development process was followed, to generate standards and reference criteria; followed by external review and field testing to document psychometric properties. Review of standards and scoring criteria were conducted by stakeholders around the globe. Field testing of the tool was conducted in Myanmar. Eleven of Myanmar's 22 midwifery education programs participated in the assessment. The clinimetric tool was demonstrated to have content validity and high inter-rater reliability in use. A globally validated tool, and accompanying user guide and handbook are now available for conducting rapid assessments of compliance with good practice criteria in midwifery education programming. Copyright © 2016 The Authors. Published by Elsevier Ltd. All rights reserved.
Grzybowska, Magdalena Emilia; Piaskowska-Cala, Justyna; Wydra, Dariusz Grzegorz
2017-12-29
The aim of the study was to translate into Polish the Pelvic Organ Prolapse/Incontinence Sexual Questionnaire, IUGA-Revised (PISQ-IR), which evaluates sexual function in sexually active (SA) and not SA (NSA) women with pelvic floor disorders (PFD), and to validate the Polish version. After translation, back-translation and cognitive interviews, the final version of PISQ-IR was established. The study group included 252 women with PFD (124 NSA and 128 SA). All women underwent clinical evaluation and completed the PISQ-IR. For test-retest reliability, the questionnaire was administered to 99 patients twice at an interval of 2 weeks. The analysis of criterion validity required the subjects to complete self-reported measures. Internal consistency and criterion validity were assessed separately for NSA and SA women for the PISQ-IR subscales. The mean age of the women was 60.9 ± 10.6 years and their mean BMI was 27.9 ± 4.9 kg/m². Postmenopausal women constituted 82.5% of the study group. Urinary incontinence (UI) was diagnosed in 60 women (23.8%), pelvic organ prolapse (POP) in 90 (35.7%), and UI and POP in 102 (40.5%). Fecal incontinence was reported by 45 women (17.9%). The PISQ-IR Polish version proved to have good internal consistency in NSA women (α 0.651 to 0.857) and SA women (α 0.605 to 0.887), and strong reliability in all subscales (Pearson's coefficient 0.759-0.899; p < 0.001). Criterion validity confirmed moderate to strong correlations between PISQ-IR scores and self-reported measures in SA subscales, as well as the SA summary score, and weak to moderate correlations in NSA women. The PISQ-IR Polish version is a valid tool for evaluating sexual function in women with PFD.
Validity of the diagnosis of pre-eclampsia in the Medical Birth Registry of Norway.
Thomsen, Liv C V; Klungsøyr, Kari; Roten, Linda T; Tappert, Christian; Araya, Elisabeth; Baerheim, Gunhild; Tollaksen, Kjersti; Fenstad, Mona H; Macsali, Ferenc; Austgulen, Rigmor; Bjørge, Line
2013-08-01
Evaluating the validity of pre-eclampsia registration in the Medical Birth Registry of Norway (MBRN) according to both broader and restricted disease definitions. Retrospective nested cohort study. Multicenter study. In this study, two cohorts of women with pre-eclamptic pregnancies registered in the MBRN were selected. Study group 1 contained 966 pregnancies from 1967 to 2002. Concomitant participation in the Nord-Trøndelag Health Study 2 was required. Study group 2 comprised 1138 pregnancies recorded in 1967-2005, examined as a pre-eclampsia biobank was established. Diagnostic criteria vary. The broader criteria for pre-eclampsia, used by the MBRN, are one measurement of hypertension and proteinuria (Criterion A). Criteria used internationally today require two measurements of hypertension and proteinuria (Criterion B). The diagnostic validities in Study groups 1 and 2 were judged against medical records according to Criterion A and B, respectively. Positive predictive value (PPV) and trend analyses. The diagnosis was confirmed in 88.3% of pregnancies in Study group 1, and in 63.6% in Study group 2. PPV was high for Study group 1 throughout the period. For Study group 2, results improved significantly after 1986. This study ascertains high PPV of pre-eclampsia in the MBRN using broader traditional criteria, although the PPV decreases through assessment using restricted modern criteria. This illustrates how inclusion of direct measurements may improve registration of complex disorders defined by changing diagnostic criteria. © 2013 Nordic Federation of Societies of Obstetrics and Gynecology.
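As a reading aid for the record above, the positive predictive value it reports is the share of registry-recorded diagnoses that the reference standard (here, medical records) confirms. The sketch below is a minimal illustration only; the function name and the counts are hypothetical and are not taken from the study.

```python
def positive_predictive_value(confirmed: int, registered: int) -> float:
    """Fraction of registered (test-positive) cases confirmed by the reference standard."""
    if registered <= 0:
        raise ValueError("registered must be a positive count")
    if not 0 <= confirmed <= registered:
        raise ValueError("confirmed must lie between 0 and registered")
    return confirmed / registered


# Hypothetical counts, chosen only to illustrate the arithmetic.
ppv = positive_predictive_value(confirmed=90, registered=100)
print(f"PPV = {ppv:.1%}")
```

Under a stricter case definition fewer registered cases are confirmed, so the numerator shrinks while the denominator stays fixed, which is the pattern the abstract describes.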
38 CFR 18.442 - Admissions and recruitment.
Code of Federal Regulations, 2011 CFR
2011-07-01
... conduct periodic validity studies against the criterion of overall success in the education program or... use any test or criterion for admission that has a disproportionate, adverse effect on handicapped persons or any class of handicapped persons unless: (i) The test or criterion, as used by the recipient...
ERIC Educational Resources Information Center
Kelly, William E.; Lutz, Daniel
2014-01-01
The concurrent criterion validity of the Ausburg Multidimensional Personality Instrument (AMPI) clinical scales was examined. The AMPI and several scales purportedly measuring the same or similar constructs as those of the AMPI clinical scales were administered to two samples of college students (N = 134 and N = 118). The correlations between the…
The Validity of the Modified Sit-and-Reach Test in College-Age Students.
ERIC Educational Resources Information Center
Minkler, Sharin; Patterson, Patricia
1994-01-01
Reports a study that examined the criterion-related validity of the modified sit-and-reach test against criterion measures of hamstring and low back flexibility in college students. Results indicated the modified sit-and-reach test moderately related to hamstring flexibility, but its relation to low back flexibility was low. (SM)
ERIC Educational Resources Information Center
Roth, Philip L.; Buster, Maury A.; Bobko, Philip
2011-01-01
A number of applied psychologists have suggested that trainability test Black-White ethnic group differences are low or relatively low (e.g., Siegel & Bergman, 1975), though data are scarce. Likewise, there are relatively few estimates of criterion-related validity for trainability tests predicting job performance (cf. Robertson & Downs,…
easyCBM® Reading Criterion Related Validity Evidence: Grades K-1. Technical Report #1309
ERIC Educational Resources Information Center
Lai, Cheng-Fei; Alonzo, Julie; Tindal, Gerald
2013-01-01
In this technical report, we present the results of a study to gather criterion-related evidence for Grade K-1 easyCBM® reading measures. We used correlations to examine the relation between the easyCBM® measures and other published measures with known reliability and validity evidence, including the Dynamic Indicators of Basic Early Literacy…
ERIC Educational Resources Information Center
Hirschi, Andreas
2009-01-01
Interest differentiation and elevation are supposed to provide important information about a person's state of interest development, yet little is known about their development and criterion validity. The present study explored these constructs among a group of Swiss adolescents. Study 1 applied a cross-sectional design with 210 students in 11th…
What Is True Halving in the Payoff Matrix of Game Theory?
Hasegawa, Eisuke; Yoshimura, Jin
2016-01-01
In game theory, there are two social interpretations of rewards (payoffs) for decision-making strategies: (1) the interpretation based on the utility criterion derived from expected utility theory and (2) the interpretation based on the quantitative criterion (amount of gain) derived from validity in the empirical context. A dynamic decision theory has recently been developed in which dynamic utility is a conditional (state) variable that is a function of the current wealth of a decision maker. We applied dynamic utility to the equal division in dove-dove contests in the hawk-dove game. Our results indicate that under the utility criterion, the half-share of utility becomes proportional to a player’s current wealth. Our results are consistent with studies of the sense of fairness in animals, which indicate that the quantitative criterion has greater validity than the utility criterion. We also find that traditional analyses of repeated games must be reevaluated. PMID:27487194
What Is True Halving in the Payoff Matrix of Game Theory?
Ito, Hiromu; Katsumata, Yuki; Hasegawa, Eisuke; Yoshimura, Jin
2016-01-01
In game theory, there are two social interpretations of rewards (payoffs) for decision-making strategies: (1) the interpretation based on the utility criterion derived from expected utility theory and (2) the interpretation based on the quantitative criterion (amount of gain) derived from validity in the empirical context. A dynamic decision theory has recently been developed in which dynamic utility is a conditional (state) variable that is a function of the current wealth of a decision maker. We applied dynamic utility to the equal division in dove-dove contests in the hawk-dove game. Our results indicate that under the utility criterion, the half-share of utility becomes proportional to a player's current wealth. Our results are consistent with studies of the sense of fairness in animals, which indicate that the quantitative criterion has greater validity than the utility criterion. We also find that traditional analyses of repeated games must be reevaluated.
Erdodi, Laszlo A; Sagar, Sanya; Seke, Kristian; Zuccato, Brandon G; Schwartz, Eben S; Roth, Robert M
2018-06-01
This study was designed to develop performance validity indicators embedded within the Delis-Kaplan Executive Function Systems (D-KEFS) version of the Stroop task. Archival data from a mixed clinical sample of 132 patients (50% male; M Age = 43.4; M Education = 14.1) clinically referred for neuropsychological assessment were analyzed. Criterion measures included the Warrington Recognition Memory Test-Words and 2 composites based on several independent validity indicators. An age-corrected scaled score ≤6 on any of the 4 trials reliably differentiated psychometrically defined credible and noncredible response sets with high specificity (.87-.94) and variable sensitivity (.34-.71). An inverted Stroop effect was less sensitive (.14-.29), but comparably specific (.85-.90) to invalid performance. Aggregating the newly developed D-KEFS Stroop validity indicators further improved classification accuracy. Failing the validity cutoffs was unrelated to self-reported depression or anxiety. However, it was associated with elevated somatic symptom report. In addition to processing speed and executive function, the D-KEFS version of the Stroop task can function as a measure of performance validity. A multivariate approach to performance validity assessment is generally superior to univariate models. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
Teel, Elizabeth F.; Slobounov, Semyon M.
2014-01-01
Objective: To determine the criterion and content validity of a virtual reality (VR) balance module for use in clinical practice. Design: Retrospective; VR balance module completed by participants during concussion baseline or assessment testing session. Setting: A Pennsylvania State University research laboratory. Participants: A total of 60 control and 28 concussed students and athletes from the Pennsylvania State University. Interventions: None. Main Outcome Measures: This study examined: (1) the relationship between VR composite balance scores (final, stationary, yaw, pitch, and roll) and area of the center-of-pressure (eyes open and closed) scores and (2) group differences (normal volunteers and concussed student-athletes) on VR composite balance scores. Results: With the exception of the stationary composite score, all other VR balance composite scores were significantly correlated with the center of pressure (COP) data obtained from a force platform. Significant correlations for the eyes open conditions ranged from r = −.273 to −.704 and from r = −.353 to −.876 for the eyes closed condition. When examining group differences on the VR balance composite modules, the concussed group did significantly (p < .01) worse on all measures compared to the control group. Conclusions: The VR balance module met or exceeded the criterion and content validity standard set by current balance tools and may be appropriate for use in a clinical concussion setting. PMID:24905539
De Smedt, Delphine; Clays, Els; Doyle, Frank; Kotseva, Kornelia; Prugger, Christof; Pająk, Andrzej; Jennings, Catriona; Wood, David; De Bacquer, Dirk
2013-09-01
To investigate the validity and reliability of the EuroQol-5D (EQ-5D), the 12-item Short-Form Health Survey (SF-12v2), and the Hospital Anxiety and Depression Scale (HADS) in a stable coronary population. Cross-sectional study EUROASPIRE III. Quality of life data (QoL) were available on 8745 patients hospitalized for coronary artery bypass graft (CABG), percutaneous coronary intervention (PCI), acute myocardial infarction (AMI), or myocardial ischemia. They were interviewed and examined at least 6 months after their hospital admission. Reliability and validity of the 3 instruments were tested. Internal consistency, and discriminative, convergent, criterion and construct validity were assessed. Cronbach's alpha indicated good internal consistency for all measures (0.73 to 0.87). Discriminative validity analyses confirmed significant QoL differences between known groups: age, gender, educational level. In addition, all hypothesized correlations between QoL constructs (convergent validity) and items (criterion validity) were confirmed with significant correlations. Confirmatory factor analyses indicated good construct validity for HADS and SF-12v2. On country-specific level, results were roughly similar. The EQ-5D as well as the SF-12v2 and the HADS are reliable and valid instruments for use in a stable coronary population, both on aggregate European level and on country-specific level. However, our results must be generalized with caution, because EUROASPIRE III patients might not be representative for all patients with stable coronary heart disease. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.
PTSD's risky behavior criterion: Relation with DSM-5 PTSD symptom clusters and psychopathology.
Contractor, Ateka A; Weiss, Nicole H; Dranger, Paula; Ruggero, Camilo; Armour, Cherie
2017-06-01
A new symptom criterion of reckless and self-destructive behaviors (E2) was recently added to posttraumatic stress disorder's (PTSD) diagnostic criteria in DSM-5, which is unsurprising given the well-established relation between PTSD and risky behaviors. Researchers have questioned the significance and incremental validity of this symptom criterion within PTSD's symptomatology. In what is, to our knowledge, the first such comparison, we aim to compare trauma-exposed groups differing on their endorsement status of the risky behavior symptom on several psychopathology constructs (PTSD, depression, distress tolerance, rumination, anger). The sample included 123 trauma-exposed participants seeking mental health treatment (M age = 35.70; 68.30% female) who completed self-report questionnaires assessing PTSD symptoms, depression, rumination, distress tolerance, and anger. Results of independent samples t-tests indicated that participants who endorsed the E2 criterion at a clinically significant level reported significantly greater PTSD subscale severity; depression severity; rumination facets of repetitive thoughts, counterfactual thinking, and problem-focused thinking; and anger reactions; and significantly less absorption and regulation (distress tolerance facets) compared to participants who did not endorse the E2 criterion at a clinically significant level. Results indicate the utility of the E2 criterion in identifying trauma-exposed individuals with greater posttraumatic distress, and emphasize the importance of targeting such behaviors in treatment. Copyright © 2017 Elsevier Ireland Ltd. All rights reserved.
7 CFR 15b.30 - Admissions and recruitment.
Code of Federal Regulations, 2011 CFR
2011-01-01
... first year grades, but shall conduct periodic validity studies against the criterion of overall success... admitted; (2) May not make use of any test or criterion for admission that has a disproportionate, adverse effect on handicapped persons or any class of handicapped persons unless (i) the test or criterion, as...
Amarasinghe, Nirmalie Champika; De AlwisSenevirathne, Rohini
2016-10-17
Musculoskeletal disorders (MSDs) have been identified as a predisposing factor for lower productivity, but no validated tool has been developed to assess them in the Sri Lankan context. The aim was to develop a validated tool to assess neck and upper limb MSDs. Tool development comprised three components: item selection, item reduction using principal component analysis, and validation. A tentative self-administered questionnaire was developed, translated, and pre-tested. Four important domains (neck, shoulder, elbow, and wrist) were identified through principal component analysis. The prevalence of any MSD was 38.1%, and the prevalence of neck, shoulder, elbow, and wrist MSDs was 12.85%, 13.71%, 12%, and 13.71%, respectively. Content and criterion validity of the tool were assessed. Separate ROC curves were produced, and the sensitivity and specificity for neck (83.1%, 71.7%), shoulder (97.6%, 91.9%), elbow (98.2%, 87.2%), and wrist (97.6%, 94.9%) were determined. Cronbach's alpha and the correlation coefficient were above 0.7. The tool has high sensitivity, specificity, internal consistency, and test-retest reliability.
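The sensitivity and specificity pairs quoted in the record above come from cross-classifying tool results against a criterion diagnosis at a chosen cut-off. The following is a minimal sketch of that calculation, assuming the four cells of the 2x2 table are known; the function name and the counts are hypothetical and do not come from the study.

```python
def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int) -> tuple[float, float]:
    """Return (sensitivity, specificity) from the cells of a 2x2 table.

    tp/fn: criterion-positive cases the tool flags / misses.
    tn/fp: criterion-negative cases the tool clears / wrongly flags.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one criterion-positive and one criterion-negative case")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical counts for one body region, for illustration only.
sens, spec = sensitivity_specificity(tp=40, fn=10, tn=45, fp=5)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
```

Each point on an ROC curve is one such pair computed at a different cut-off score.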
NASA Astrophysics Data System (ADS)
Peczalski, K.; Palko, T.; Wojciechowski, D.; Dunajski, Z.; Kowalewski, M.
2013-04-01
Cardiac resynchronization therapy is an effective treatment for patients with systolic failure. Independent electrical stimulation of the left and right ventricles corrects mechanical ventricular dyssynchrony. About 30-40% of treated patients do not respond to therapy. To improve clinical outcomes, the authors propose two-channel impedance cardiography for assessment of ventricular dyssynchrony. The proposed method is intended for validation of patient diagnosis and optimization of pacemaker settings for cardiac resynchronization therapy. The preliminary study has shown that two-channel impedance cardiography is a promising tool for assessment of ventricular dyssynchrony.
Afulani, Patience A; Diamond-Smith, Nadia; Golub, Ginger; Sudhinaraset, May
2017-09-22
Person-centered reproductive health care is recognized as critical to improving reproductive health outcomes. Yet, little research exists on how to operationalize it. We extend the literature in this area by developing and validating a tool to measure person-centered maternity care. We describe the process of developing the tool and present the results of psychometric analyses to assess its validity and reliability in a rural and urban setting in Kenya. We followed standard procedures for scale development. First, we reviewed the literature to define our construct and identify domains, and developed items to measure each domain. Next, we conducted expert reviews to assess content validity, and cognitive interviews with potential respondents to assess clarity, appropriateness, and relevance of the questions. The questions were then refined and administered in surveys, and survey results were used to assess construct and criterion validity and reliability. The exploratory factor analysis yielded one dominant factor in both the rural and urban settings. Three factors with eigenvalues greater than one were identified for the rural sample and four factors for the urban sample. Thirty of the 38 items administered in the survey were retained based on the factor loadings and correlations between the items. Twenty-five items load very well onto a single factor in both the rural and urban samples, with five items loading well in either the rural or urban sample, but not in both samples. These 30 items also load on three sub-scales that we created to measure dignified and respectful care, communication and autonomy, and supportive care. The Cronbach's alpha for the main scale is greater than 0.8 in both samples, and those for the sub-scales are between 0.6 and 0.8. The main scale and sub-scales are correlated with global measures of satisfaction with maternity services, suggesting criterion validity.
We present a 30-item scale with three sub-scales to measure person-centered maternity care. This scale has high validity and reliability in a rural and urban setting in Kenya. Validation in additional settings is however needed. This scale will facilitate measurement to improve person-centered maternity care, and subsequently improve reproductive outcomes.
Introduction to the special issue on the personality assessment inventory.
Kurtz, John E; Blais, Mark A
2007-02-01
This special issue of the Journal of Personality Assessment brings together 13 new research studies on the Personality Assessment Inventory (PAI; Morey, 1991) that should inform users and stimulate future empirical activity with this measure. In 4 articles, authors evaluate the validity scales and indexes of the PAI using both analog and criterion designs and samples from a variety of clinical and forensic settings. In a 5th article, the authors describe a novel approach to profile interpretation using two PAI negative distortion measures. The authors present applications of the PAI to new populations and problems including a German translation of the PAI and profile information for male batterers and victims of head injury. The authors of 2 studies extend research on the validity of the PAI for the assessment of borderline personality disorder. In the final 3 studies, the authors evaluate the validity of PAI measures of violence and aggression to predict subsequent aggressive behavior and institutional misconduct. Finally, the authors offer several suggestions for future research with the PAI.
NASA Astrophysics Data System (ADS)
Ji, Bing; Tsai, Chin-Chun; Stwalley, William C.
1995-04-01
A modified internuclear distance criterion, R_LR-m, as the lower bound for the region of validity of the inverse-power expansion of the diatomic long-range potential is proposed. This new criterion takes into account the spatial orientation of the atomic orbitals while retaining the simplicity of the traditional Le Roy radius, R_LR, for the interaction of S-state atoms. Recent experimental and theoretical results for various excited states in Na₂ suggest that this proposed R_LR-m is an appropriate generalization of R_LR.
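For orientation, the traditional Le Roy radius referred to in the record above is usually written in terms of the root-mean-square radii of the outermost electrons on the two atoms A and B, and it marks the distance beyond which the inverse-power expansion is taken to apply. The notation below (D for the dissociation limit, C_n for the dispersion coefficients) is the conventional one and is an assumption here; the abstract does not give the explicit form of the modified criterion, so it is not reproduced.

```latex
R_{\mathrm{LR}} = 2\left[\langle r^{2}\rangle_{A}^{1/2} + \langle r^{2}\rangle_{B}^{1/2}\right],
\qquad
V(R) \simeq D - \sum_{n} \frac{C_{n}}{R^{n}} \quad \text{for } R > R_{\mathrm{LR}}
```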
Somma, Antonella; Borroni, Serena; Maffei, Cesare; Giarolli, Laura E; Markon, Kristian E; Krueger, Robert F; Fossati, Andrea
2017-10-01
In order to assess the reliability, factorial validity, and criterion validity of the Personality Inventory for DSM-5 (PID-5) among adolescents, 1,264 Italian high school students were administered the PID-5. Participants were also administered the Questionnaire on Relationships and Substance Use as a criterion measure. In the full sample, McDonald's ω values were adequate for the PID-5 scales (median ω = .85, SD = .06), except for Suspiciousness. However, all PID-5 scales showed average inter-item correlation values in the .20-.55 range. Exploratory structural equation modeling analyses provided moderate support for the a priori model of PID-5 trait scales. Ordinal logistic regression analyses showed that selected PID-5 trait scales predicted a significant, albeit moderate (Cox & Snell R² values ranged from .08 to .15, all ps < .001) amount of variance in Questionnaire on Relationships and Substance Use variables.
ERIC Educational Resources Information Center
Naji Qasem, Mamun Ali; Ahmad Gul, Showkeen Bilal
2014-01-01
The study was conducted to determine the effect of item direction (positive or negative) on the factorial construction and criterion-related validity of a Likert scale. The descriptive survey research method was used for the study, and the sample consisted of 510 undergraduate students selected using a random sampling technique. A scale developed by…
ERIC Educational Resources Information Center
Kettler, Ryan J.; Elliott, Stephen N.; Davies, Michael; Griffin, Patrick
2012-01-01
This study addresses the predictive validity of results from a screening system of academic enablers, with a sample of Australian elementary school students, when the criterion variable is end-of-year achievement. The investigation included (a) comparing the predictive validity of a brief criterion-referenced nomination system with more…
easyCBM® Reading Criterion Related Validity Evidence: Grades 2-5. Technical Report #1310
ERIC Educational Resources Information Center
Lai, Cheng-Fei; Alonzo, Julie; Tindal, Gerald
2013-01-01
In this technical report, we present the results of a study to gather criterion-related evidence for Grade 2-5 easyCBM® reading measures. We used correlations to examine the relation between the easyCBM® measures and other published measures with known reliability and validity evidence, including the Gates-MacGinitie Reading Tests and the Dynamic…
A Case for Transforming the Criterion of a Predictive Validity Study
ERIC Educational Resources Information Center
Patterson, Brian F.; Kobrin, Jennifer L.
2011-01-01
This study presents a case for applying a transformation (Box and Cox, 1964) of the criterion used in predictive validity studies. The goals of the transformation were to better meet the assumptions of the linear regression model and to reduce the residual variance of fitted (i.e., predicted) values. Using data for the 2008 cohort of first-time,…
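The Box and Cox (1964) family named in the record above is a one-parameter power transformation defined for strictly positive values. The sketch below shows the standard form for a fixed, user-supplied parameter; the function name and the sample values are illustrative assumptions, and how the study chose its parameter is not stated in the truncated abstract.

```python
import math


def box_cox(values: list[float], lam: float) -> list[float]:
    """Apply the Box-Cox power transformation with parameter `lam`.

    y -> (y**lam - 1) / lam   if lam != 0
    y -> ln(y)                if lam == 0
    Only defined for strictly positive inputs.
    """
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


# Hypothetical criterion values, for illustration only.
print(box_cox([1.0, 4.0, 9.0], lam=0.5))
```

In practice the parameter is usually estimated from the data (for example by maximum likelihood) rather than fixed in advance.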
Miki, Emi; Yamane, Shingo; Yamaoka, Mai; Fujii, Hiroe; Ueno, Hiroka; Kawahara, Toshie; Tanaka, Keiko; Tamashiro, Hiroaki; Inoue, Eiji; Okamoto, Takatsugu; Kuriyama, Masaru
2016-09-01
The study aim was to investigate the validity and reliability of the Functional Independence Measure and Functional Assessment Measure (FIM + FAM), which is unfamiliar in Japan, by using its Japanese version (FIM + FAM-j) in patients with cerebrovascular accident (CVA). Forty-two CVA patients participated. Criterion validity was examined by correlating the full scale and subscales of FIM + FAM-j with several well-established measurements using Spearman's correlation coefficient. Reliability was evaluated by internal consistency (tested by Cronbach's alpha coefficient) and intra-rater reliability (tested by Kendall's tau correlation coefficient). Good-to-excellent criterion validity was found between the full scale and motor subscales of the FIM + FAM-j and the Barthel Index, National Institutes of Health Stroke Scale, modified Rankin Scale, and lower extremity Brunnstrom Recovery Stage. High internal consistency was observed within the full-scale FIM + FAM-j and the motor and cognitive subscales (Cronbach's alphas were 0.968, 0.954, and 0.948, respectively). Additionally, good intra-rater reliability was observed within the full scale and motor subscales, and excellent reliability for the cognitive subscales (taus were 0.83, 0.80, and 0.98, respectively). This study showed that the FIM + FAM-j demonstrated acceptable levels of validity and reliability when used for CVA as a measure of disability.
Nakagami, Katsuyuki; Yamauchi, Toyoaki; Noguchi, Hiroyuki; Maeda, Tohru; Nakagami, Tomoko
2014-06-01
This study aimed to develop a reliable and valid measure of functional health literacy in a Japanese clinical setting. Test development consisted of three phases: generation of an item pool, consultation with experts to assess content validity, and comparison with external criteria (the Japanese Health Knowledge Test) to assess criterion validity. A trial version of the test was administered to 535 Japanese outpatients. Internal consistency reliability, calculated by Cronbach's alpha, was 0.81, and concurrent validity was moderate. Receiver Operating Characteristics and Item Response Theory were used to classify patients as having adequate, marginal, or inadequate functional health literacy. Both inadequate and marginal functional health literacy were associated with older age, lower income, lower educational attainment, and poor health knowledge. The time required to complete the test was 10-15 min. This test should enable health workers to better identify patients with inadequate health literacy. © 2013 Wiley Publishing Asia Pty Ltd.
Transcultural Adaptation and Validation of the German Version of the Vocal Tract Discomfort Scale.
Lukaschyk, Julia; Brockmann-Bauser, Meike; Beushausen, Ulla
2017-03-01
Currently, there is no standardized German questionnaire to assess vocal tract discomfort in voice patients. The aim of this study was to evaluate the internal consistency, reliability, and validity of the German version of the Vocal Tract Discomfort (VTD) Scale. This is a cross-sectional study. First, a cross-cultural translation and adaptation from English to German was performed. One hundred seven patients between the ages of 18 and 76 with voice disorders were divided into two different diagnosis-related groups (organic and functional voice disorder) and 50 vocally healthy adults were included. All participants completed the VTD Scale and the Voice Handicap Index (VHI). The internal consistency of the VTD Scale was analyzed through Cronbach's α coefficient. Pearson correlation between the VTD Scale and VHI total scores was used to determine criterion validity. The VTD Scale score differences related to diagnosis groups were assessed with analysis of variance. Excellent internal consistency was found (α = 0.919, P < 0.05), and criterion validity was confirmed by a high correlation between the total VTD Scale and VHI (r = 0.674). There was a significant difference between the diagnosis groups' total VTD Scale score (F[4.135] = 15.114, P = 0.000). Furthermore, the vocally healthy adults had significantly lower values than the two diagnosis groups (x̄ = 11.48, s = 8.340). The German version of the VTD Scale has an excellent internal consistency and reliability, and shows high clinical validity. Thus, it is a useful instrument in voice diagnostics. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
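Cronbach's α, the internal-consistency statistic reported in the record above and in several neighbouring records, compares the sum of the item variances with the variance of the total score. A minimal sketch using only the standard library follows; the function name and the toy responses are hypothetical and unrelated to any study's data.

```python
from statistics import variance


def cronbach_alpha(responses: list[list[float]]) -> float:
    """Cronbach's alpha for a respondents-by-items score matrix.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores)),
    using sample (n - 1) variances throughout.
    """
    if len(responses) < 2:
        raise ValueError("need at least two respondents")
    k = len(responses[0])
    if k < 2 or any(len(row) != k for row in responses):
        raise ValueError("need at least two items and equal-length rows")
    item_columns = list(zip(*responses))  # transpose to items-by-respondents
    item_variance_sum = sum(variance(col) for col in item_columns)
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Toy data: four respondents answering three items (illustration only).
scores = [[1.0, 2.0, 1.0], [2.0, 3.0, 2.0], [3.0, 3.0, 4.0], [4.0, 5.0, 4.0]]
print(round(cronbach_alpha(scores), 3))
```

Values near 1 indicate that the items vary together; the statistic rises with the number of items as well as with their intercorrelation.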
Castro-Díaz, D M; Esteban-Fuertes, M; Salinas-Casado, J; Bustamante-Alarma, S; Gago-Ramos, J L; Galacho-Bech, A; García-Matres, M J; Rodríguez-Toves, L A; Zubiaur-Líbano, C; Collado-Serra, A; Batista-Miranda, J E; Ortiz-Gámiz, A
2014-03-01
To evaluate the psychometric properties of the Spanish version of the ICIQ-Male Lower Urinary Tract Symptoms Questionnaire (ICIQ-MLUTS): Feasibility (% of completion and ceiling/floor effects), reliability (Test-retest), convergent validity (vs Bladder Control Self-Assessment Questionnaire [BSAQ] and vs International Prostate Symptom Score [I-PSS]) and criterion validity (according to presence or absence of symptoms). This was an observational, non-interventional, multicenter study. 223 male patients with lower urinary tract symptoms (LUTS), predominantly storage symptoms and aged 18-65, took part in the study. Patients completed the ICIQ-MLUTS (test), I-PSS and BSAQ questionnaires and reported their urinary symptoms in a single visit, with the exception of a subgroup of 49 patients who completed the questionnaire again 15 days after the initial visit to evaluate test-retest reliability. The questionnaire includes 13 items divided into 2 sub-scales: Voiding symptoms (V) from 0-20 and Incontinence symptoms (I) from 0-24. Percentage of patients that completed all items: 98.84%. The floor effect was 0 and the ceiling effect was under 6% in both sub-scales. Test-retest reliability: Intraclass correlation coefficient (ICC) ranged from 0.68 to 0.88, except on Delay. Kappa shows a good agreement, between 0.60 and 0.81, except for Nocturia. Convergent validity: Correlation (Spearman) between the questionnaire sub-scales scores and the rest of measures is statistically significant (P < .01 and P < .05). Criterion validity: Statistically significant differences (P < .05) in ICIQ-MLUTS scores between patients who reported experiencing symptoms and those who did not. The Spanish version of the ICIQ-MLUTS questionnaire shows adequate feasibility, reliability and validity. Copyright © 2013 AEU. Published by Elsevier Espana. All rights reserved.
Validity of the occupational sitting and physical activity questionnaire.
Chau, Josephine Y; Van Der Ploeg, Hidde P; Dunn, Scott; Kurko, John; Bauman, Adrian E
2012-01-01
Sitting at work is an emerging occupational health risk. Few instruments designed for use in population-based research measure occupational sitting and standing as distinct behaviors. This study aimed to develop and validate a brief measure of occupational sitting and physical activity. A convenience sample (n = 99, 61% female) was recruited from two medium-sized workplaces and by word-of-mouth in Sydney, Australia. Participants completed the newly developed Occupational Sitting and Physical Activity Questionnaire (OSPAQ) and a modified version of the MONICA Optional Study on Physical Activity Questionnaire (modified MOSPA-Q) twice, 1 wk apart. Participants also wore an ActiGraph accelerometer for the 7 d in between the test and retest. Analyses determined test-retest reliability with intraclass correlation coefficients and assessed criterion validity against accelerometers using the Spearman ρ. The test-retest intraclass correlation coefficients for occupational sitting, standing, and walking for OSPAQ ranged from 0.73 to 0.90, while that for the modified MOSPA-Q ranged from 0.54 to 0.89. Comparison of sitting measures with accelerometers showed higher Spearman correlations for the OSPAQ (r = 0.65) than for the modified MOSPA-Q (r = 0.52). Criterion validity correlations for occupational standing and walking measures were comparable for both instruments with accelerometers (standing: r = 0.49; walking: r = 0.27-0.29). The OSPAQ has excellent test-retest reliability and moderate validity for estimating time spent sitting and standing at work and is comparable to existing occupational physical activity measures for assessing time spent walking at work. The OSPAQ brief instrument measures sitting and standing at work as distinct behaviors and would be especially suitable in national health surveys, prospective cohort studies, and other studies that are limited by space constraints for questionnaire items.
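The Spearman ρ used for criterion validity in the record above is the Pearson correlation computed on ranks, with tied values sharing their average rank. The sketch below is a plain standard-library illustration; the function names and the paired values are hypothetical and are not the study's data.

```python
from math import sqrt


def average_ranks(values: list[float]) -> list[float]:
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman_rho(x: list[float], y: list[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least two points")
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical self-reported vs device-measured sitting minutes (illustration only).
questionnaire = [300.0, 420.0, 360.0, 240.0, 480.0]
accelerometer = [310.0, 400.0, 390.0, 200.0, 450.0]
print(round(spearman_rho(questionnaire, accelerometer), 2))
```

Because only the ordering of values matters, the statistic is insensitive to the skewed distributions that duration data often show.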
Validity and reliability of the Japanese version of the Newest Vital Sign: a preliminary study.
Kogure, Takamichi; Sumitani, Masahiko; Suka, Machi; Ishikawa, Hirono; Odajima, Takeshi; Igarashi, Ataru; Kusama, Makiko; Okamoto, Masako; Sugimori, Hiroki; Kawahara, Kazuo
2014-01-01
Health literacy (HL) refers to the ability to obtain, process, and understand basic health information and services, and is thus needed to make appropriate health decisions. The Newest Vital Sign (NVS) is comprised of 6 questions about an ice cream nutrition label and assesses HL numeracy skills. We developed a Japanese version of the NVS (NVS-J) and evaluated the validity and reliability of the NVS-J in patients with chronic pain. The translation of the original NVS into Japanese was achieved as per the published guidelines. An observational study was subsequently performed to evaluate the validity and reliability of the NVS-J in 43 Japanese patients suffering from chronic pain. Factor analysis with promax rotation, using the Kaiser criterion (eigenvalues ≥1.0), and a scree plot revealed that the main component of the NVS-J consists of three determinative factors, and each factor consists of two NVS-J items. The criterion-related validity of the total NVS-J score was significantly correlated with the total score of Ishikawa et al.'s self-rated HL Questionnaire, the clinical global assessment of comprehensive HL level, cognitive function, and the Brinkman index. In addition, Cronbach's coefficient for the total score of the NVS-J was adequate (alpha = 0.72). This study demonstrated that the NVS-J has good validity and reliability. Further, the NVS-J consists of three determinative factors: "basic numeracy ability," "complex numeracy ability," and "serious-minded ability." These three HL abilities comprise a 3-step hierarchical structure. Adequate HL should be promoted in chronic pain patients to enable coping, improve functioning, and increase activities of daily living (ADLs) and quality of life (QOL).
Suraweera, Chathurie; Anandakumar, D; Dahanayake, D; Subendran, M; Perera, U T; Hanwella, Raveen; de Silva, Varuni
2016-12-30
Only the Mini-Mental State Examination (MMSE) and the Montreal Cognitive Assessment scale have been validated in a Sri Lankan population for the assessment of cognitive functions. Both tests are deficient in the number of domains assessed. Therefore, validation of the Repeatable Battery for the Assessment of Neuropsychological Status (RBANS) is important, as it assesses most of the cognitive domains. The aim was to culturally adapt the RBANS and investigate the validity and reliability of the culturally adapted RBANS (RBANS-S). Fifty-four participants with major neurocognitive disorder and 60 normal controls aged >50 were administered the RBANS-S at the Cognitive Assessment Unit, Faculty of Medicine, Colombo and National Hospital of Sri Lanka. The participants were selected after a detailed clinical assessment according to Diagnostic and Statistical Manual – 5 criteria. Data were analysed using the SPSS data package. The mean age of the sample was 69.5 years. The RBANS-S total scale correlated highly with the MMSE total score (Pearson correlation coefficient = 0.793, p = 0.01). Criterion validity was assessed using receiver operating characteristic curve analysis, and the area under the curve was 0.937. The RBANS-S showed strong concurrent validity, as indicated by its significant correlations with the MMSE. All of the RBANS-S subtests demonstrated significant correlations with the MMSE subsets. The sensitivity and specificity of the RBANS-S were 89% and 85%, respectively, at a total score of 80.5. The RBANS-S yielded a reliability coefficient of 0.929. The culturally adapted RBANS-S is a valid and reliable instrument which can be used in the assessment of cognitive functions.
Guetterman, Timothy C.; Creswell, John W.; Wittink, Marsha; Barg, Fran K.; Castro, Felipe G.; Dahlberg, Britt; Watkins, Daphne C.; Deutsch, Charles; Gallo, Joseph J.
2017-01-01
Introduction Demand for training in mixed methods is high, with little research on faculty development or assessment in mixed methods. We describe the development of a Self-Rated Mixed Methods Skills Assessment and provide validity evidence. The instrument taps six research domains: “Research question,” “Design/approach,” “Sampling,” “Data collection,” “Analysis,” and “Dissemination.” Respondents are asked to rate their ability to define or explain concepts of mixed methods under each domain, their ability to apply the concepts to problems, and the extent to which they need to improve. Methods We administered the questionnaire to 145 faculty and students using an internet survey. We analyzed descriptive statistics and performance characteristics of the questionnaire using Cronbach’s alpha to assess reliability and an ANOVA that compared a mixed methods experience index with assessment scores to assess criterion-relatedness. Results Internal consistency reliability was high for the total set of items (.95) and adequate (>=.71) for all but one subscale. Consistent with establishing criterion validity, respondents who had more professional experiences with mixed methods (e.g., published a mixed methods paper) rated themselves as more skilled, which was statistically significant across the research domains. Discussion This Self-Rated Mixed Methods Assessment instrument may be a useful tool to assess skills in mixed methods for training programs. It can be applied widely at the graduate and faculty level. For the learner, assessment may lead to enhanced motivation to learn and training focused on self-identified needs. For faculty, the assessment may improve curriculum and course content planning. PMID:28562495
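Several of the records in this listing report internal consistency as Cronbach's alpha. As a rough illustration only, the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores) can be sketched as follows. The function name and the response matrix are hypothetical and are not taken from any of the studies listed here.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    Assumes every respondent answered the same k >= 2 items and that the
    total scores are not all identical (otherwise the variance is zero).
    """
    k = len(responses[0])
    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in responses]) for i in range(k)]
    # Sample variance of the summed scale score.
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical data: 3 respondents, 2 items.
example = [[1, 2], [2, 1], [3, 3]]
print(round(cronbach_alpha(example), 3))
```

Higher values indicate that the items vary together; the thresholds quoted in the abstracts (for example 0.7) are conventions chosen by the respective authors.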
El-Housseiny, Azza A; Alsadat, Farah A; Alamoudi, Najlaa M; El Derwi, Douaa A; Farsi, Najat M; Attar, Moaz H; Andijani, Basil M
2016-04-14
Early recognition of dental fear is essential for the effective delivery of dental care. This study aimed to test the reliability and validity of the Arabic version of the Children's Fear Survey Schedule-Dental Subscale (CFSS-DS). A school-based sample of 1546 children was randomly recruited. The Arabic version of the CFSS-DS was completed by children during class time. The scale was tested for internal consistency and test-retest reliability. To test criterion validity, children's behavior was assessed using the Frankl scale during dental examination, and results were compared with children's CFSS-DS scores. To test the scale's construct validity, scores on "fear of going to the dentist soon" were correlated with CFSS-DS scores. Factor analysis was also used. The Arabic version of the CFSS-DS showed high reliability regarding both test-retest reliability (intraclass correlation = 0.83, p < 0.001) and internal consistency (Cronbach's α = 0.88). It showed good criterion validity: children with negative behavior had significantly higher fear scores (t = 13.67, p < 0.001). It also showed moderate construct validity (Spearman's rho correlation, r = 0.53, p < 0.001). Factor analysis identified the following factors: "fear of invasive dental procedures," "fear of less invasive dental procedures" and "fear of strangers." The Arabic version of the CFSS-DS is a reliable and valid measure of dental fear in Arabic-speaking children. Pediatric dentists and researchers may use this validated version of the CFSS-DS to measure dental fear in Arabic-speaking children.
Substance versus style: a new look at social desirability in motivating contexts.
Smith, D Brent; Ellingson, Jill E
2002-04-01
Although there is an emerging consensus that social desirability does not meaningfully affect criterion-related validity, several researchers have reaffirmed the argument that social desirability degrades the construct validity of personality measures. Yet, most research demonstrating the adverse consequences of faking for construct validity uses a fake-good instruction set. The consequence of such a manipulation is to exacerbate the effects of response distortion beyond what would be expected under realistic circumstances (e.g., an applicant setting). The research reported in this article was designed to assess these issues by using real-world contexts not influenced by artificial instructions. Results suggest that response distortion has little impact on the construct validity of personality measures used in selection contexts.
Maćkiewicz, Marta; Cieciuch, Jan
2016-01-01
In order to adjust personality measurements to children's developmental level, we constructed the Pictorial Personality Traits Questionnaire for Children (PPTQ-C). To validate the measure, we conducted a study with a total group of 1028 children aged between 7 and 13 years old. Structural validity was established through Exploratory Structural Equation Model (ESEM). Criterion validity was confirmed with a multitrait-multimethod analysis for which we introduced the children's self-assessment scores from the Big Five Questionnaire for Children. Despite some problems with reliability, one can conclude that the PPTQ-C can be a valid instrument for measuring personality traits, particularly in a group of young children (aged ~7–10 years). PMID:27252661
2013-01-01
Background Quality of life (QOL) is an important outcome measure in the treatment of heroin addiction. The Taiwan version of the World Health Organization Quality of Life assessment (WHOQOL-BREF [TW]) has been developed and studied in various groups, but not specifically in a population of injection drug users. The aim of this study was to analyze the psychometric properties of the WHOQOL-BREF (TW) in a sample of injection drug users undergoing methadone maintenance treatment. Methods A total of 553 participants were interviewed and completed the instrument. Item-response distributions, internal consistency, corrected item-domain correlation, criterion-related validity, and construct validity through confirmatory factor analysis were evaluated. Results The frequency distribution of the 4 domains of the WHOQOL-BREF (TW) showed no floor or ceiling effects. The instrument demonstrated adequate internal consistency (Cronbach’s alpha coefficients were higher than 0.7 across the 4 domains) and all items had acceptable correlation with the corresponding domain scores (r = 0.32-0.73). Correlations (p < 0.01) of the 4 domains with the 2 benchmark items assessing overall QOL and general health were supportive of criterion-related validity. Confirmatory factor analysis yielded marginal goodness-of-fit between the 4-domain model and the sample data. Conclusions The hypothesized WHOQOL-BREF measurement model was appropriate for the injection drug users after some adjustments. Despite different patterns found in the confirmatory factor analysis, the findings overall suggest that the WHOQOL-BREF (TW) is a reliable and valid measure of QOL among injection drug users and can be utilized in future treatment outcome studies. The factor structure provided by the study also helps to understand the QOL characteristics of the injection drug users in Taiwan. However, more research is needed to examine its test-retest reliability and sensitivity to changes due to treatment. PMID:24325611
A Controlled Evaluation of the Distress Criterion for Binge Eating Disorder
ERIC Educational Resources Information Center
Grilo, Carlos M.; White, Marney A.
2011-01-01
Objective: Research has examined various aspects of the validity of the research criteria for binge eating disorder (BED) but has yet to evaluate the utility of Criterion C, "marked distress about binge eating." This study examined the significance of the marked distress criterion for BED using 2 complementary comparison groups. Method:…
Chell, Kathleen; Waller, Daniel; Masser, Barbara
2016-06-01
Research demonstrates that anxiety elevates the risk of blood donors experiencing adverse events, which in turn deters the performance of repeat blood donations. Identifying donors suffering from heightened state anxiety is important to assess the impact of evidence-based interventions. This study analyzed the appropriateness of a shortened version of the state subscale of the State-Trait Anxiety Inventory (STAI) in a blood donation context. STAI-State questionnaire data were collected from two separate samples of Australian blood donors (n = 919 and n = 824 after cleaning). Responses to demographic, donation history, and adverse reaction questions were also obtained. Identification of items and analysis was performed systematically to assess and compare internal reliability and content, construct, convergent, and criterion validity of three potential short-form state anxiety scales. Of the three short-form scales tested, STAI-State six-item scale demonstrated the best metric properties with the least number of items across both sample groups. Cronbach's alpha was acceptable (α = 0.844 and α = 0.820), correlated positively with the original measure (r = 0.927 and r = 0.931) and criterion-related variables, and maintained the two-dimension factorial structure of the original measure. The six-item short version of the STAI-State subscale presented the most reliable and valid scale for use with blood donors. A validated donor anxiety tool provides a standardized assessment and record of donor anxiety to gauge the effectiveness of ongoing efforts to enhance the donation experience. © 2016 AABB.
Refining a health-related quality of life assessment strategy for solid organ transplant patients.
Feurer, Irene D; Moore, Derek E; Speroff, Theodore; Liu, Hongxia; Payne, Jerita; Harrison, Connie; Pinson, C Wright
2004-01-01
The psychometric properties of generic health-related quality of life (HRQOL) assessment instruments were evaluated to identify a reliable, valid, and non-redundant battery to measure longitudinal outcomes in organ transplant patients. Objective functional performance and subjective HRQOL were assessed in 371 solid organ (liver, heart, kidney, lung) transplant patients using the Karnofsky scale, the SF-36 Health Survey (SF-36), and Psychosocial Adjustment to Illness Scale (PAIS). The surveys' internal-consistency reliability, criterion-related validity, and redundancy were tested. The SF-36 mental (MCS) and physical components (PCS), and PAIS summary scales were internally consistent (all alpha ≥ 0.83). Four out of seven PAIS scales (vocational, domestic, sexual, social) were collectively associated with the PCS (R = 0.65, P < 0.001), as was functional performance (r = 0.52, P < 0.001). Three PAIS scales (family, social, psychological distress) were associated with the MCS (R = 0.72, P < 0.001). Only the PAIS healthcare orientation (satisfaction) scale was not associated with the SF-36. The relationship between functional performance and the PCS is stronger (r = 0.52, P < 0.001) than with the MCS (r = 0.25, P < 0.001) and the PAIS global score (r = 0.37, P < 0.001). The SF-36 and PAIS are internally consistent and exhibit divergent criterion-related validity but, with the exception of the PAIS healthcare orientation scale, are statistically redundant. The advantages of the SF-36 include wider use, more norms, and a lesser response burden. A transplant-specific patient satisfaction inventory was indicated and was developed.
Dawson, Deborah A; Saha, Tulshi D; Grant, Bridget F
2010-02-01
The relative severity of the 11 DSM-IV alcohol use disorder (AUD) criteria is represented by their severity threshold scores, an item response theory (IRT) model parameter inversely proportional to their prevalence. These scores can be used to create a continuous severity measure comprising the total number of criteria endorsed, each weighted by its relative severity. This paper assesses the validity of the severity ranking of the 11 criteria and the overall severity score with respect to known AUD correlates, including alcohol consumption, psychological functioning, family history, antisociality, and early initiation of drinking, in a representative population sample of U.S. past-year drinkers (n=26,946). The unadjusted mean values for all validating measures increased steadily with the severity threshold score, except that legal problems, the criterion with the highest score, was associated with lower values than expected. After adjusting for the total number of criteria endorsed, this direct relationship was no longer evident. The overall severity score was no more highly correlated with the validating measures than a simple count of criteria endorsed, nor did the two measures yield different risk curves. This reflects both within-criterion variation in severity and the fact that the number of criteria endorsed and their severity are so highly correlated that severity is essentially redundant. Attempts to formulate a scalar measure of AUD will do as well by relying on simple counts of criteria or symptom items as by using scales weighted by IRT measures of severity. Published by Elsevier Ireland Ltd.
The adolescent child health and illness profile. A population-based measure of health.
Starfield, B; Riley, A W; Green, B F; Ensminger, M E; Ryan, S A; Kelleher, K; Kim-Harris, S; Johnston, D; Vogel, K
1995-05-01
This study was designed to test the reliability and validity of an instrument to assess adolescent health status. Reliability and validity were examined by administration to adolescents (ages 11-17 years) in eight schools in two urban areas, one area in Appalachia, and one area in the rural South. Integrity of the domains and subdomains and construct validity were tested in all areas. Test/retest stability, criterion validity, and convergent and discriminant validity were tested in the two urban areas. Iterative testing has resulted in the final form of the CHIP-AE (Child Health and Illness Profile-Adolescent Edition) having 6 domains with 20 subdomains. The domains are Discomfort, Disorders, Satisfaction with Health, Achievement (of age-appropriate social roles), Risks, and Resilience. Tested aspects of reliability and validity have achieved acceptable levels for all retained subdomains. The CHIP-AE in its current form is suitable for assessing the health status of populations and subpopulations of adolescents. Evidence from test-retest stability analyses suggests that the CHIP-AE also can be used to assess changes occurring over time or in response to health services interventions targeted at groups of adolescents.
Park, Yu Kyung; Ju, Hyeon Ok; Na, Hunjoo
2016-02-01
The Perinatal Post-Traumatic Stress Disorder Questionnaire (PPQ) was designed to measure post-traumatic symptoms related to childbirth and symptoms during the postnatal period. The purpose of this study was to develop a translated Korean version of the PPQ and to evaluate the reliability and validity of the Korean PPQ. Participants were 196 mothers at one to 18 months after giving birth, and data were collected through e-mails. The PPQ was translated into Korean using the translation guideline from the World Health Organization. For this study, Cronbach's alpha and split-half reliability were used to evaluate the reliability of the PPQ. Exploratory Factor Analysis (EFA), Confirmatory Factor Analysis (CFA), and known-group validity were conducted to examine construct validity. Correlations of the PPQ with the Impact of Event Scale (IES), Beck Depression Inventory II (BDI-II), and Beck Anxiety Inventory (BAI) were used to test the criterion validity of the PPQ. Cronbach's alpha and the Spearman-Brown split-half correlation coefficient were 0.91 and 0.77, respectively. EFA identified a 3-factor solution including arousal, avoidance, and intrusion factors, and CFA revealed the strongest support for the 3-factor model. The correlations of the PPQ with IES, BDI-II, and BAI were .99, .60, and .72, respectively, pointing to a high level of criterion validity. The Korean version of the PPQ is a useful tool for screening and assessing mothers experiencing emotional distress related to childbirth and during the postnatal period. The PPQ also reflects the diagnostic standards for post-traumatic stress disorder well.
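The split-half reliability mentioned in the record above is commonly obtained by correlating two half-test scores and then applying the Spearman-Brown correction, r_full = 2r / (1 + r). The sketch below is a minimal illustration of that general technique; the odd/even item split, the function names, and the data are assumptions and do not reproduce the study's procedure.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_brown(r_half):
    """Step up a half-test correlation to a full-length reliability estimate."""
    return 2 * r_half / (1 + r_half)


def split_half_reliability(responses):
    """Odd/even split-half reliability with Spearman-Brown correction."""
    first_half = [sum(row[0::2]) for row in responses]   # items 1, 3, 5, ...
    second_half = [sum(row[1::2]) for row in responses]  # items 2, 4, 6, ...
    return spearman_brown(pearson(first_half, second_half))


# A half-test correlation of 0.5 corresponds to a corrected value of about 0.667.
print(round(spearman_brown(0.5), 3))
```

Other ways of splitting the items (for example first half versus second half) generally give different estimates, which is one reason Cronbach's alpha is often reported alongside.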
Validity of the posttraumatic stress disorders (PTSD) checklist in pregnant women.
Gelaye, Bizu; Zheng, Yinnan; Medina-Mora, Maria Elena; Rondon, Marta B; Sánchez, Sixto E; Williams, Michelle A
2017-05-12
The PTSD Checklist-civilian (PCL-C) is one of the most commonly used self-report measures of PTSD symptoms; however, little is known about its validity when used in pregnancy. This study aims to evaluate the reliability and validity of the PCL-C as a screen for detecting PTSD symptoms among pregnant women. A total of 3372 pregnant women who attended their first prenatal care visit in Lima, Peru participated in the study. We assessed the reliability of the PCL-C items using Cronbach's alpha. Criterion validity and performance characteristics of the PCL-C were assessed against an independent, blinded Clinician-Administered PTSD Scale (CAPS) interview using measures of sensitivity, specificity and receiver operating characteristics (ROC) curves. We tested construct validity using exploratory and confirmatory factor analytic approaches. The reliability of the PCL-C was excellent (Cronbach's alpha = 0.90). ROC analysis showed that a cut-off score of 26 offered optimal discriminatory power, with a sensitivity of 0.86 (95% CI: 0.78-0.92) and a specificity of 0.63 (95% CI: 0.62-0.65). The area under the ROC curve was 0.75 (95% CI: 0.71-0.78). A three-factor solution was extracted using exploratory factor analysis and was further complemented with three other models using confirmatory factor analysis (CFA). In a CFA, a three-factor model based on DSM-IV symptom structure had reasonable fit statistics, with a comparative fit index of 0.86 and a root mean square error of approximation of 0.09. The Spanish-language version of the PCL-C may be used as a screening tool for pregnant women. The PCL-C has good reliability, criterion validity and factorial validity. The optimal cut-off score obtained by maximizing the sensitivity and specificity should be considered cautiously; women who screened positive may require further investigation to confirm PTSD diagnosis.
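To make the sensitivity and specificity figures in the record above concrete, the following sketch shows how both quantities are typically computed for a screening score at a chosen cut-off. It assumes that a score at or above the cut-off counts as screen-positive; the function name and the toy data are hypothetical and unrelated to the study's data.

```python
def sensitivity_specificity(scores, has_condition, cutoff):
    """Return (sensitivity, specificity) for the rule: positive if score >= cutoff.

    `has_condition` holds the reference-standard result for each person.
    Assumes at least one person with and one without the condition.
    """
    tp = fn = tn = fp = 0
    for score, truth in zip(scores, has_condition):
        screened_positive = score >= cutoff
        if truth and screened_positive:
            tp += 1  # condition present, screen positive
        elif truth:
            fn += 1  # condition present, screen negative
        elif screened_positive:
            fp += 1  # condition absent, screen positive
        else:
            tn += 1  # condition absent, screen negative
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical screening scores and reference diagnoses for 8 people.
scores = [17, 22, 26, 30, 41, 19, 28, 35]
truth = [False, False, True, False, True, False, True, True]
print(sensitivity_specificity(scores, truth, cutoff=26))  # (1.0, 0.75)
```

Repeating the calculation over every candidate cut-off and plotting sensitivity against 1 - specificity gives the ROC curve; raising the cut-off usually trades sensitivity for specificity.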
Biofeedback in Partial Weight Bearing: Validity of 3 Different Devices.
van Lieshout, Remko; Stukstette, Mirelle J; de Bie, Rob A; Vanwanseele, Benedicte; Pisters, Martijn F
2016-11-01
Study Design Controlled laboratory study to assess criterion-related validity, with a cross-sectional within-subject design. Background Patients with orthopaedic conditions have difficulties complying with partial weight-bearing instructions. Technological advances have resulted in biofeedback devices that offer real-time feedback. However, the accuracy of these devices is mostly unknown. Inaccurate feedback can result in incorrect lower-limb loading and may lead to delayed healing. Objectives To investigate validity of peak force measurements obtained using 3 different biofeedback devices under varying levels of partial weight-bearing categories. Methods Validity of 3 biofeedback devices (OpenGo science, SmartStep, and SensiStep) was assessed. Healthy participants were instructed to walk at a self-selected speed with crutches under 3 different weight-bearing conditions, categorized as a percentage range of body weight: 1% to 20%, greater than 20% to 50%, and greater than 50% to 75%. Peak force data from the biofeedback devices were compared with the peak vertical ground reaction force measured with a force plate. Criterion validity was estimated using simple and regression-based Bland-Altman 95% limits of agreement and weighted kappas. Results Fifty-five healthy adults (58% male) participated. Agreement with the gold standard was substantial for the SmartStep, moderate for OpenGo science, and slight for SensiStep (weighted κ = 0.76, 0.58, and 0.19, respectively). For the 1% to 20% and greater than 20% to 50% weight-bearing categories, both the OpenGo science and SmartStep had acceptable limits of agreement. For the weight-bearing category greater than 50% to 75%, none of the devices had acceptable agreement. Conclusion The OpenGo science and SmartStep provided valid feedback in the lower weight-bearing categories, and the SensiStep showed poor validity of feedback in all weight-bearing categories. J Orthop Sports Phys Ther 2016;46(11):-1. Epub 12 Oct 2016. doi:10.2519/jospt.2016.6625.
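The simple Bland-Altman 95% limits of agreement used in the record above are conventionally the mean difference between two methods plus or minus 1.96 standard deviations of the differences. The sketch below illustrates only this simple variant (the regression-based variant is not shown); the function name and the force values are hypothetical.

```python
from statistics import mean, stdev


def limits_of_agreement(device, reference):
    """Return (lower limit, bias, upper limit) for paired measurements.

    bias   = mean of (device - reference)
    limits = bias +/- 1.96 * sample SD of the differences
    """
    differences = [d - r for d, r in zip(device, reference)]
    bias = mean(differences)
    half_width = 1.96 * stdev(differences)
    return bias - half_width, bias, bias + half_width


# Hypothetical peak forces (% body weight): device vs force plate.
device = [18.0, 35.0, 52.0, 61.0]
force_plate = [20.0, 33.0, 55.0, 60.0]
lower, bias, upper = limits_of_agreement(device, force_plate)
print(f"bias={bias:.2f}, limits=({lower:.2f}, {upper:.2f})")
```

Whether a given pair of limits is "acceptable" is a judgment made against a clinically meaningful tolerance, which each study defines for itself.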
Sotardi, Valerie A
2018-05-01
Educational measures of anxiety focus heavily on students' experiences with tests yet overlook other assessment contexts. In this research, two brief multiscale questionnaires were developed and validated to measure trait evaluation anxiety (MTEA-12) and state evaluation anxiety (MSEA-12) for use in various assessment contexts in non-clinical, educational settings. The research included a cross-sectional analysis of self-report data using authentic assessment settings in which evaluation anxiety was measured. Instruments were tested using a validation sample of 241 first-year university students in New Zealand. Scale development included component structures for state and trait scales based on existing theoretical frameworks. Analyses using confirmatory factor analysis and descriptive statistics indicate that the scales are reliable and structurally valid. Multivariate general linear modeling using subscales from the MTEA-12, MSEA-12, and student grades suggests adequate criterion-related validity. Initial predictive validity was also indicated: one relevant MTEA-12 factor explained between 21% and 54% of the variance in three MSEA-12 factors. Results document the MTEA-12 and MSEA-12 as reliable measures of trait and state dimensions of evaluation anxiety for test and writing contexts. Initial estimates suggest that the scales have promising validity, and recommendations for further validation are outlined.
Methodology Series Module 9: Designing Questionnaires and Clinical Record Forms – Part II
Setia, Maninder Singh
2017-01-01
This article is a continuation of the previous module on designing questionnaires and clinical record form in which we have discussed some basic points about designing the questionnaire and clinical record forms. In this section, we will discuss the reliability and validity of questionnaires. The different types of validity are face validity, content validity, criterion validity, and construct validity. The different types of reliability are test-retest reliability, inter-rater reliability, and intra-rater reliability. Some of these parameters are assessed by subject area experts. However, statistical tests should be used for evaluation of other parameters. Once the questionnaire has been designed, the researcher should pilot test the questionnaire. The items in the questionnaire should be changed based on the feedback from the pilot study participants and the researcher's experience. After the basic structure of the questionnaire has been finalized, the researcher should assess the validity and reliability of the questionnaire or the scale. If an existing standard questionnaire is translated in the local language, the researcher should assess the reliability and validity of the translated questionnaire, and these values should be presented in the manuscript. The decision to use a self- or interviewer-administered, paper- or computer-based questionnaire depends on the nature of the questions, literacy levels of the target population, and resources. PMID:28584367
Criterion Related Validity of Karate Specific Aerobic Test (KSAT)
Chaabene, Helmi; Hachana, Younes; Franchini, Emerson; Tabben, Montassar; Mkaouer, Bessem; Negra, Yassine; Hammami, Mehrez; Chamari, Karim
2015-01-01
Background: Karate is one of the most popular combat sports in the world. Regular physical fitness assessment is important for monitoring the effectiveness of the training program and the readiness of karatekas to compete. Objectives: The aim of this research was to examine the criterion-related validity of the karate specific aerobic test (KSAT) as an indicator of the aerobic level of karate practitioners. Patients and Methods: Cardiorespiratory responses, aerobic performance level through both treadmill laboratory test and YoYo intermittent recovery test level 1 (YoYoIRTL1) as well as time to exhaustion in the KSAT test (TE’KSAT) were determined in a total of fifteen healthy international karatekas (i.e. karate practitioners) (means ± SD: age: 22.2 ± 4.3 years; height: 176.4 ± 7.5 cm; body mass: 70.3 ± 9.7 kg and body fat: 13.2 ± 6%). Results: Peak heart rate obtained from KSAT represented ~99% of maximal heart rate registered during the treadmill test, showing that KSAT imposes high physiological demands. There was no significant correlation between KSAT’s TE and relative (mL/min kg) treadmill maximal oxygen uptake (r = 0.14; P = 0.69; [small]). On the other hand, there was a significant relationship between KSAT’s TE and the velocity associated with VO2max (vVO2max) (r = 0.67; P = 0.03; [large]) as well as the velocity at VO2 corresponding to the second ventilatory threshold (vVO2 VAT) (r = 0.64; P = 0.04; [large]). Moreover, a significant relationship was found between TE’s KSAT and both the total distance covered and parameters of intermittent endurance measured through YoYoIRTL1. Conclusions: The KSAT has not proved to have indirect criterion-related validity, as no significant correlations were found between TE’s KSAT and treadmill VO2max. Nevertheless, as it correlated with other aerobic fitness variables, the KSAT can be considered an indicator of karate-specific endurance. The establishment of the criterion-related validity of the KSAT requires further investigation. PMID:26446345
Albores-Gallo, Lilia; Hernández-Guzmán, Laura; Hasfura-Buenaga, Cecilia; Navarro-Luna, Enrique
To investigate the validity and internal consistency of the Mexican version of the CBCL/1.5-5, which assesses the most common psychopathology in pre-school children in clinical and epidemiological settings. A total of 438 parents from two groups, clinical-psychiatric (N = 62) and community (N = 376), completed the CBCL/1.5-5/Mexican version. The internal consistency was high for the total problems (α = 0.95), internalized (α = 0.89), and externalized (α = 0.91) subscales. The test-retest reliability (one week), assessed using the intraclass correlation coefficient (ICC), was ≥ 0.95 for the internalized, externalized, and total problems subscales. The ROC curve for the criterion status of clinically-referred vs. non-referred using the total problems scale ≥ 24 resulted in an AUC (area under curve) of 0.77, a specificity of 0.73, and a sensitivity of 0.70. The CBCL/1.5-5/Mexican version is a reliable and valid tool. Copyright © 2016 Sociedad Chilena de Pediatría. Publicado por Elsevier España, S.L.U. All rights reserved.
Workaholism in Brazil: measurement and individual differences.
Romeo, Marina; Yepes-Baldó, Montserrat; Berger, Rita; Netto Da Costa, Francisco Franco
2014-01-01
The aim of this research is the measurement and assessment of individual differences of workaholism in Brazil, an important issue which affects the competitiveness of companies. The WART15-PBV was applied to a sample of 153 managers from companies located in Brazil, 82 (53.6%) women and 71 (46.4%) men. Ages ranged from 20 to 69 years with an average value of 41 (SD=9.06). We analyzed, on one hand, the factor structure of the questionnaire, its internal consistency and convergent (with the Dutch Work Addiction Scale - DUWAS) and criterion validity (with General Health Questionnaire GHQ). On the other hand, we analyzed individual gender differences on workaholism. WART15-PBV has good psychometric properties, and evidence for convergent and criterion validity. Females and males differed on the Impaired Communication / Self-Absorption dimension. This dimension has a direct effect only on men's health perception, while the Compulsive tendencies dimension has a direct effect for both genders. The findings suggest the WART15-PBV is a valid measure of workaholism that would contribute to workers' health and their professional and personal life, in order to encourage adequate conditions in the workplace taking into account workers' individual differences.
ERIC Educational Resources Information Center
Tavassoli, Teresa; Bellesheim, Katherine; Siper, Paige M.; Wang, A. Ting; Halpern, Danielle; Gorenstein, Michelle; Grodberg, David; Kolevzon, Alexander; Buxbaum, Joseph D.
2016-01-01
Sensory reactivity is a new DSM-5 criterion for autism spectrum disorder (ASD). The current study aims to validate a clinician-administered sensory observation in ASD, the Sensory Processing Scale Assessment (SPS). The SPS and the Short Sensory Profile (SSP) parent-report were used to measure sensory reactivity in children with ASD (n = 35) and…
ERIC Educational Resources Information Center
Yarbrough, Nükhet D.
2016-01-01
As part of a project to translate and administer the Torrance Tests of Creative Thinking (TTCT) to Turkish elementary and secondary students, 35 professionals were trained in a full-day workshop to learn to score the verbal TTCT. All trainees scored the same 4 sets of TTCT verbal criterion tests for fluency, flexibility, and originality by filling…
ERIC Educational Resources Information Center
Longenbecker, Sueann; Wood, Peter H.
1984-01-01
Scores from the National Board Dental Hygiene Examination (NBDHE) served as the criterion variable in a comparison of the predictive validity of the Dental Hygiene Aptitude Tests (DHAT) and the ACT Assessment tests. The DHAT-Science and Verbal tests combined to produce the highest multiple correlation with NBDHE scores. (Author/DWH)
Validity of the Digital Inclinometer and iPhone When Measuring Thoracic Spine Rotation.
Bucke, Jonathan; Spencer, Simon; Fawcett, Louise; Sonvico, Lawrence; Rushton, Alison; Heneghan, Nicola R
2017-09-01
Spinal axial rotation is required for many functional and sporting activities. Eighty percent of axial rotation occurs in the thoracic spine. Existing measures of thoracic spine rotation commonly involve laboratory equipment, use a seated position, and include lumbar motion. A simple performance-based outcome measure would allow clinicians to evaluate isolated thoracic spine rotation. Currently, no valid measure exists. To explore the criterion and concurrent validity of a digital inclinometer (DI) and iPhone Clinometer app (iPhone) for measuring thoracic spine rotation using the heel-sit position. Controlled laboratory study. University laboratory. A total of 23 asymptomatic healthy participants (14 men, 9 women; age = 25.82 ± 4.28 years, height = 170.26 ± 8.01 cm, mass = 67.50 ± 9.46 kg, body mass index = 23.26 ± 2.79) were recruited from a student population. We took DI and iPhone measurements of thoracic spine rotation in the heel-sit position concurrently with dual-motion analysis (laboratory measure) and ultrasound imaging of the underlying bony tissue motion (reference standard). To determine the criterion and concurrent validity, we used the Pearson product moment correlation coefficient (r, 2 tailed) and Bland-Altman plots. The DI (r = 0.88, P < .001) and iPhone (r = 0.88, P < .001) demonstrated strong criterion validity. Both also had strong concurrent validity (r = 0.98, P < .001). Bland-Altman plots illustrated mean differences of 5.82° (95% confidence interval [CI] = 20.37°, -8.73°) and 4.94° (95% CI = 19.23°, -9.35°) between the DI and iPhone, respectively, and the reference standard and 0.87° (95% CI = 6.79°, -5.05°) between the DI and iPhone. The DI and iPhone provided valid measures of thoracic spine rotation in the heel-sit position. Both can be used in clinical practice to assess thoracic spine rotation, which may be valuable when evaluating thoracic dysfunction.
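The Bland-Altman comparison described above can be sketched as follows. This is a minimal illustration with hypothetical rotation angles, not the study's measurements. It computes the conventional bias and 95% limits of agreement (bias ± 1.96 SD of the paired differences); whether the intervals reported in the abstract were derived in exactly this way is an assumption.

```python
import statistics


def bland_altman(method_a, method_b):
    """Return (bias, lower_limit, upper_limit) for paired measurements.

    bias is the mean of the paired differences (a - b); the limits are
    bias +/- 1.96 * sample SD of those differences.
    """
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical thoracic rotation angles in degrees.
device = [40.0, 45.0, 50.0, 55.0, 60.0]     # e.g. an inclinometer reading
reference = [35.0, 41.0, 44.0, 50.0, 53.0]  # e.g. a reference-standard reading

bias, lower, upper = bland_altman(device, reference)
print(f"bias={bias:.2f} deg, limits of agreement=({lower:.2f}, {upper:.2f})")
```

A correlation coefficient alone does not reveal a systematic offset between two methods, which is why the bias and limits of agreement are reported alongside r.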
Baten, Verena; Busch, Hans-Jörg; Busche, Caroline; Schmid, Bonaventura; Heupel-Reuter, Miriam; Perlov, Evgeniy; Brich, Jochen; Klöppel, Stefan
2018-05-08
Delirium is frequent in elderly patients presenting in the emergency department (ED). Despite the severe prognosis, the majority of delirium cases remain undetected by emergency physicians (EPs). At the time of our study there was no valid delirium screening tool available for EDs in German-speaking regions. We aimed to evaluate the brief Confusion Assessment Method (bCAM) for a German ED during the daily work routine. We implemented the bCAM into practice in a German interdisciplinary high-volume ED and evaluated the bCAM's validity in a convenience sample of medical patients aged ≥ 70 years. The bCAM, which assesses four core features of delirium, was performed by EPs during their daily work routine and compared to a criterion standard based on the criteria for delirium as described in the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition. Compared to the criterion standard, delirium was found to be present in 46 (16.0%) of the 288 nonsurgical patients enrolled. The bCAM showed 93.8% specificity (95% confidence interval [CI] = 90.0%-96.5%) and 65.2% sensitivity (95% CI = 49.8%-78.7%). Positive and negative likelihood ratios were 10.5 and 0.37, respectively, while the odds ratio was 28.4. Delirium was missed in 10 of 16 cases, since the bCAM did not indicate altered levels of consciousness and disorganized thinking. The level of agreement with the criterion standard increased for patients with low cognitive performance. This was the first study evaluating the bCAM for a German ED and when performed by EPs during routine work. The bCAM showed good specificity, but only moderate sensitivity. Nevertheless, application of the bCAM most likely improves the delirium detection rate in German EDs. However, it should only be applied by trained physicians to maximize diagnostic accuracy and hence improve the bCAM's sensitivity. Future studies should refine the bCAM. © 2018 by the Society for Academic Emergency Medicine.
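The accuracy figures in this abstract are linked by standard formulas, which the following sketch works through. The 2x2 cell counts are not stated in the abstract; they are reconstructed here, as an assumption, from the reported 46 delirium cases among 288 patients together with the reported sensitivity and specificity.

```python
def diagnostic_accuracy(tp, fn, fp, tn):
    """Compute standard accuracy measures from the cells of a 2x2 table."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    lr_pos = sensitivity / (1 - specificity)   # positive likelihood ratio
    lr_neg = (1 - sensitivity) / specificity   # negative likelihood ratio
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_pos": lr_pos,
        "lr_neg": lr_neg,
        "odds_ratio": lr_pos / lr_neg,         # diagnostic odds ratio
    }


# Assumed counts, back-calculated from the reported percentages:
# 46 patients with delirium (30 detected, 16 missed) and
# 242 without delirium (227 screened negative, 15 screened positive).
result = diagnostic_accuracy(tp=30, fn=16, fp=15, tn=227)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

Under these assumed counts the computed values round to the figures given in the abstract (65.2%, 93.8%, 10.5, 0.37 and 28.4).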
ERIC Educational Resources Information Center
Sánchez-Rosas, Javier; Furlan, Luis Alberto
2017-01-01
Based on the control-value theory of achievement emotions and theory of achievement goals, this research provides evidence of convergent, divergent, and criterion validity of the Spanish Cognitive Test Anxiety Scale (S-CTAS). A sample of Argentinean undergraduates responded to several scales administered at three points. At time 1 and 3, the…
ERIC Educational Resources Information Center
Abdekhodaie, Zahra; Tabatabaei, Seyed Mahmood; Gholizadeh, Mortaza
2012-01-01
In this study, the prevalence of attention-deficit hyperactivity disorder (ADHD) in kindergarten children in northeast Iran was investigated, and the criterion validity of Conners' parent-teacher questionnaire was evaluated through the use of clinical interviews. This study was a cross-sectional descriptive research project with children in…
ERIC Educational Resources Information Center
Maljaars, Jarymke; Noens, Ilse; Scholte, Evert; van Berckelaer-Onnes, Ina
2012-01-01
The Diagnostic Interview for Social and Communication Disorders (DISCO; Wing, 2006) is a standardized, semi-structured and interviewer-based schedule for diagnosis of autism spectrum disorder (ASD). The objective of this study was to evaluate the criterion and convergent validity of the DISCO-11 ICD-10 algorithm in young and low-functioning…
Phillips, Tasha R; Sellbom, Martin; Ben-Porath, Yossef S; Patrick, Christopher J
2014-02-01
Replicating and extending research by Sellbom et al. (M. Sellbom, Y. S. Ben-Porath, C. J. Patrick, D. B. Wygant, D. M. Gartland, & K. P. Stafford, 2012, Development and Construct Validation of the MMPI-2-RF Measures of Global Psychopathy, Fearless-Dominance, and Impulsive-Antisociality, Personality Disorders: Theory, Research, and Treatment, 3, 17-38), the current study examined the criterion-related validity of three self-report indices of psychopathy that were derived from scores on the Minnesota Multiphasic Personality Inventory (MMPI)-2-Restructured Form (MMPI-2-RF; Y. S. Ben-Porath & A. Tellegen, 2008, Minnesota Multiphasic Personality Inventory-2-Restructured Form: Manual for Administration, Scoring, and Interpretation, Minneapolis, MN: University of Minnesota Press). We estimated psychopathy indices by regressing scores from the Psychopathic Personality Inventory (PPI; S. O. Lilienfeld & B. P. Andrews, 1996, Development and Preliminary Validation of a Self-Report Measure of Psychopathic Personality Traits in Noncriminal Populations, Journal of Personality Assessment, 66, 488-524) and its two distinct facets, Fearless-Dominance and Impulsive-Antisociality, onto conceptually selected MMPI-2-RF scales. Data for a newly collected sample of 230 incarcerated women were combined with existing data from Sellbom et al.'s (2012) male correctional and mixed-gender college samples to establish regression equations with optimal generalizability. Correlation and regression analyses were then used to examine associations between the MMPI-2-RF-based estimates of PPI psychopathy and criterion measures (i.e., other well-established measures of psychopathy and conceptually related personality traits), and to evaluate whether gender moderated these associations. The MMPI-2-RF-based psychopathy indices correlated as expected with criterion measures and showed only one significant moderating effect for gender, namely, in the association between psychopathy and narcissism. 
These results provide further support for the validity of the MMPI-2-RF-based estimates of PPI psychopathy, and encourage their use in research and clinical contexts.
Goode, N; Salmon, P M; Taylor, N Z; Lenné, M G; Finch, C F
2017-10-01
One factor potentially limiting the uptake of Rasmussen's (1997) Accimap method by practitioners is the lack of a contributing factor classification scheme to guide accident analyses. This article evaluates the intra- and inter-rater reliability and criterion-referenced validity of a classification scheme developed to support the use of Accimap by led outdoor activity (LOA) practitioners. The classification scheme has two levels: the system level describes the actors, artefacts and activity context in terms of 14 codes; the descriptor level breaks the system level codes down into 107 specific contributing factors. The study involved 11 LOA practitioners using the scheme on two separate occasions to code a pre-determined list of contributing factors identified from four incident reports. Criterion-referenced validity was assessed by comparing the codes selected by LOA practitioners to those selected by the method creators. Mean intra-rater reliability scores at the system (M = 83.6%) and descriptor (M = 74%) levels were acceptable. Mean inter-rater reliability scores were not consistently acceptable for both coding attempts at the system level (M T1 = 68.8%; M T2 = 73.9%), and were poor at the descriptor level (M T1 = 58.5%; M T2 = 64.1%). Mean criterion referenced validity scores at the system level were acceptable (M T1 = 73.9%; M T2 = 75.3%). However, they were not consistently acceptable at the descriptor level (M T1 = 67.6%; M T2 = 70.8%). Overall, the results indicate that the classification scheme does not currently satisfy reliability and validity requirements, and that further work is required. The implications for the design and development of contributing factors classification schemes are discussed. Copyright © 2017 Elsevier Ltd. All rights reserved.
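The abstract reports reliability and validity as percentages but does not name the index used. Assuming simple percent agreement (an assumption, not confirmed by the source), the comparison of a practitioner's codes with the method creators' codes could be sketched like this, with hypothetical code labels:

```python
def percent_agreement(codes_a, codes_b):
    """Percentage of items to which two coders assigned the same code."""
    if len(codes_a) != len(codes_b):
        raise ValueError("both coders must code the same list of items")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return 100.0 * matches / len(codes_a)


# Hypothetical system-level codes for four contributing factors.
practitioner = ["equipment", "supervision", "weather", "planning"]
creators = ["equipment", "supervision", "terrain", "planning"]

agreement = percent_agreement(practitioner, creators)
print(f"agreement = {agreement:.1f}%")
```

The same function applied to one coder's two coding attempts would give an intra-rater figure, and applied to pairs of different coders an inter-rater figure.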
Zhang, Dengke; Pang, Yanxia; Cai, Weixiong; Fazio, Rachel L; Ge, Jianrong; Su, Qiaorong; Xu, Shuiqin; Pan, Yinan; Chen, Sanmei; Zhang, Hongwei
2016-08-01
Impairment of theory of mind (ToM) is a common phenomenon following traumatic brain injury (TBI) that has clear effects on patients' social functioning. A growing body of research has focused on this area, and several methods have been developed to assess ToM deficiency. Although an informant assessment scale would be useful for examining individuals with TBI, very few studies have adopted this approach. The purpose of the present study was to develop an informant assessment scale of ToM for adults with traumatic brain injury (IASToM-aTBI) and to test its reliability and validity with 196 adults with TBI and 80 normal adults. A 44-item scale was developed following a literature review, interviews with patient informants, consultations with experts, item analysis, and exploratory factor analysis (EFA). The following three common factors were extracted: social interaction, understanding of beliefs, and understanding of emotions. The psychometric analyses indicate that the scale has good internal consistency reliability, split-half reliability, test-retest reliability, inter-rater reliability, structural validity, discriminant validity and criterion validity. These results provide preliminary evidence that supports the reliability and validity of the IASToM-aTBI as a ToM assessment tool for adults with TBI.
Prince, Martin J; de Rodriguez, Juan Llibre; Noriega, L; Lopez, A; Acosta, Daisy; Albanese, Emiliano; Arizaga, Raul; Copeland, John RM; Dewey, Michael; Ferri, Cleusa P; Guerra, Mariella; Huang, Yueqin; Jacob, KS; Krishnamoorthy, ES; McKeigue, Paul; Sousa, Renata; Stewart, Robert J; Salas, Aquiles; Sosa, Ana Luisa; Uwakwa, Richard
2008-01-01
Background: The criterion for dementia implicit in DSM-IV is widely used in research but not fully operationalised. The 10/66 Dementia Research Group sought to do this using assessments from their one phase dementia diagnostic research interview, and to validate the resulting algorithm in a population-based study in Cuba. Methods: The criterion was operationalised as a computerised algorithm, applying clinical principles, based upon the 10/66 cognitive tests, clinical interview and informant reports; the Community Screening Instrument for Dementia, the CERAD 10 word list learning and animal naming tests, the Geriatric Mental State, and the History and Aetiology Schedule – Dementia Diagnosis and Subtype. This was validated in Cuba against a local clinician DSM-IV diagnosis and the 10/66 dementia diagnosis (originally calibrated probabilistically against clinician DSM-IV diagnoses in the 10/66 pilot study). Results: The DSM-IV sub-criteria were plausibly distributed among clinically diagnosed dementia cases and controls. The clinician diagnoses agreed better with 10/66 dementia diagnosis than with the more conservative computerized DSM-IV algorithm. The DSM-IV algorithm was particularly likely to miss less severe dementia cases. Those with a 10/66 dementia diagnosis who did not meet the DSM-IV criterion were less cognitively and functionally impaired compared with the DSM-IV confirmed cases, but still grossly impaired compared with those free of dementia. Conclusion: The DSM-IV criterion, strictly applied, defines a narrow category of unambiguous dementia characterized by marked impairment. It may be specific but incompletely sensitive to clinically relevant cases. The 10/66 dementia diagnosis defines a broader category that may be more sensitive, identifying genuine cases beyond those defined by our DSM-IV algorithm, with relevance to the estimation of the population burden of this disorder. PMID:18577205
Validity of two alternative systems for measuring vertical jump height.
Leard, John S; Cirillo, Melissa A; Katsnelson, Eugene; Kimiatek, Deena A; Miller, Tim W; Trebincevic, Kenan; Garbalosa, Juan C
2007-11-01
Vertical jump height is frequently used by coaches, health care professionals, and strength and conditioning professionals to objectively measure function. The purpose of this study is to determine the concurrent validity of the jump and reach method (Vertec) and the contact mat method (Just Jump) in assessing vertical jump height when compared with the criterion reference 3-camera motion analysis system. Thirty-nine college students, 25 females and 14 males between the ages of 18 and 25 (mean age 20.65 years), were instructed to perform the countermovement jump. Reflective markers were placed at the base of the individual's sacrum for the 3-camera motion analysis system to measure vertical jump height. The subject was then instructed to stand on the Just Jump mat beneath the Vertec and perform the jump. Measurements were recorded from each of the 3 systems simultaneously for each jump. The Pearson r statistic between the video and the jump and reach (Vertec) was 0.906. The Pearson r between the video and contact mat (Just Jump) was 0.967. Both correlations were significant at the 0.01 level. Analysis of variance showed a significant difference among the 3 means F(2,235) = 5.51, p < 0.05. The post hoc analysis showed a significant difference between the criterion reference (M = 0.4369 m) and the Vertec (M = 0.3937 m, p = 0.005) but not between the criterion reference and the Just Jump system (M = 0.4420 m, p = 0.972). The Just Jump method of measuring vertical jump height is a valid measure when compared with the 3-camera system. The Vertec was found to have a high correlation with the criterion reference, but the mean differed significantly. This study indicates that a higher degree of confidence is warranted when comparing Just Jump results with a 3-camera system study.
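The Vertec result above, a high correlation with the criterion together with a significantly different mean, reflects a general property: Pearson r is unaffected by a constant offset between two methods. A minimal sketch with hypothetical jump heights (not the study's data) shows this; the 0.04 m offset is an illustrative assumption.

```python
import math
import statistics


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


# Hypothetical jump heights in metres from a criterion system.
criterion = [0.40, 0.45, 0.50, 0.38, 0.47]
# A hypothetical device that reads a constant 0.04 m lower on every jump.
device = [h - 0.04 for h in criterion]

r = pearson_r(criterion, device)
mean_diff = statistics.mean(criterion) - statistics.mean(device)
print(f"r = {r:.3f}, mean difference = {mean_diff:.3f} m")
```

Here the correlation is essentially perfect even though every device reading is biased, so agreement between means has to be checked separately from correlation.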
Leckman, James F.; Denys, Damiaan; Simpson, H. Blair; Mataix-Cols, David; Hollander, Eric; Saxena, Sanjaya; Miguel, Euripedes C.; Rauch, Scott L.; Goodman, Wayne K.; Phillips, Katharine A.; Stein, Dan J.
2014-01-01
Background: Since the publication of the DSM-IV in 1994, research on obsessive–compulsive disorder (OCD) has continued to expand. It is timely to reconsider the nosology of this disorder, assessing whether changes to diagnostic criteria as well as subtypes and specifiers may improve diagnostic validity and clinical utility. Methods: The existing criteria were evaluated. Key issues were identified. Electronic databases of PubMed, ScienceDirect, and PsycINFO were searched for relevant studies. Results: This review presents a number of options and preliminary recommendations to be considered for DSM-V. These include: (1) clarifying and simplifying the definition of obsessions and compulsions (criterion A); (2) possibly deleting the requirement that people recognize that their obsessions or compulsions are excessive or unreasonable (criterion B); (3) rethinking the clinical significance criterion (criterion C) and, in the interim, possibly adjusting what is considered “time-consuming” for OCD; (4) listing additional disorders to help with the differential diagnosis (criterion D); (5) rethinking the medical exclusion criterion (criterion E) and clarifying what is meant by a “general medical condition”; (6) revising the specifiers (i.e., clarifying that OCD can involve a range of insight, in addition to “poor insight,” and adding “tic-related OCD”); and (7) highlighting in the DSM-V text important clinical features of OCD that are not currently mentioned in the criteria (e.g., the major symptom dimensions). Conclusions: A number of changes to the existing diagnostic criteria for OCD are proposed. These proposed criteria may change as the DSM-V process progresses. PMID:20217853
Leckman, James F; Denys, Damiaan; Simpson, H Blair; Mataix-Cols, David; Hollander, Eric; Saxena, Sanjaya; Miguel, Euripedes C; Rauch, Scott L; Goodman, Wayne K; Phillips, Katharine A; Stein, Dan J
2010-06-01
Since the publication of the DSM-IV in 1994, research on obsessive-compulsive disorder (OCD) has continued to expand. It is timely to reconsider the nosology of this disorder, assessing whether changes to diagnostic criteria as well as subtypes and specifiers may improve diagnostic validity and clinical utility. The existing criteria were evaluated. Key issues were identified. Electronic databases of PubMed, ScienceDirect, and PsycINFO were searched for relevant studies. This review presents a number of options and preliminary recommendations to be considered for DSM-V. These include: (1) clarifying and simplifying the definition of obsessions and compulsions (criterion A); (2) possibly deleting the requirement that people recognize that their obsessions or compulsions are excessive or unreasonable (criterion B); (3) rethinking the clinical significance criterion (criterion C) and, in the interim, possibly adjusting what is considered "time-consuming" for OCD; (4) listing additional disorders to help with the differential diagnosis (criterion D); (5) rethinking the medical exclusion criterion (criterion E) and clarifying what is meant by a "general medical condition"; (6) revising the specifiers (i.e., clarifying that OCD can involve a range of insight, in addition to "poor insight," and adding "tic-related OCD"); and (7) highlighting in the DSM-V text important clinical features of OCD that are not currently mentioned in the criteria (e.g., the major symptom dimensions). A number of changes to the existing diagnostic criteria for OCD are proposed. These proposed criteria may change as the DSM-V process progresses. (c) 2010 Wiley-Liss, Inc.
Fosco, Whitney D; Hawk, Larry W
2017-02-01
A child's ability to sustain attention over time (AOT) is critical in attention-deficit/hyperactivity disorder (ADHD), yet no prior work has examined the extent to which a child's decrement in AOT on laboratory tasks relates to clinically-relevant behavior. The goal of this study is to provide initial evidence for the criterion validity of laboratory assessments of AOT. A total of 20 children with ADHD (7-12 years of age) who were enrolled in a summer treatment program completed two lab attention tasks (a continuous performance task and a self-paced choice discrimination task) and math seatwork. Analyses focused on relations between attention task parameters and math productivity. Individual differences in overall attention (OA) measures (averaged across time) accounted for 23% of the variance in math productivity, supporting the criterion validity of lab measures of attention. The criterion validity was enhanced by consideration of changes in AOT. Performance on all laboratory attention measures deteriorated as time-on-task increased, and individual differences in the decrement in AOT accounted for 40% of the variance in math productivity. The only variable to uniquely predict math productivity was from the self-paced choice discrimination task. This study suggests that attention tasks in the lab do predict a clinically-relevant target behavior in children with ADHD, supporting their use as a means to study attention processes in a controlled environment. Furthermore, this prediction is improved when attention is examined as a function of time-on-task and when the attentional demands are consistent between lab and life contexts.
De Croon, Einar M; Blonk, Roland W B; Sluiter, Judith K; Frings-Dresen, Monique H W
2005-02-01
Monitoring psychological job strain may help occupational physicians to take preventive action at the appropriate time. For this purpose, the 10-item trucker strain monitor (TSM) assessing work-related fatigue and sleeping problems in truck drivers was developed. This study examined (1) test-retest reliability, (2) criterion validity of the TSM with respect to future sickness absence due to psychological health complaints and (3) usefulness of the TSM two-scales structure. The TSM and self-administered questionnaires, providing information about stressful working conditions (job control and job demands) and sickness absence, were sent to a random sample of 2000 drivers in 1998. Of the 1123 responders, 820 returned a completed questionnaire 2 years later (response: 72%). The TSM work-related fatigue scale, the TSM sleeping problems scale and the TSM composite scale showed satisfactory 2-year test-retest reliability (coefficient r=0.62, 0.66 and 0.67, respectively). The work-related fatigue scale, sleeping problems scale and composite scale had sensitivities of 61, 65 and 61%, respectively, in identifying drivers with future sickness absence due to psychological health complaints. The specificity and positive predictive value of the TSM composite scale were 77 and 11%, respectively. The work-related fatigue scale and the sleeping problems scale were moderately strongly correlated (r=0.62). However, stressful working conditions were differentially associated with the two scales. The results support the test-retest reliability, criterion validity and two-factor structure of the TSM. In general, the results suggest that the use of occupation-specific psychological job strain questionnaires is fruitful.
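The low positive predictive value reported for the TSM composite scale follows from Bayes' rule when the outcome is rare. The sketch below shows the relationship. The abstract does not state the base rate of future sickness absence; the value of about 4.5% used here is back-calculated from the reported sensitivity, specificity and positive predictive value, and should be read as an assumption.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """PPV from Bayes' rule: P(condition | positive screen)."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


# Sensitivity and specificity as reported for the composite scale;
# the prevalence is an assumed, back-calculated value.
ppv = positive_predictive_value(sensitivity=0.61, specificity=0.77, prevalence=0.0445)
print(f"PPV = {ppv:.2f}")

# With a hypothetical 30% base rate the same scale would look far better.
ppv_common = positive_predictive_value(0.61, 0.77, 0.30)
print(f"PPV at 30% prevalence = {ppv_common:.2f}")
```

The comparison illustrates why a screening scale with moderate sensitivity and specificity can still produce mostly false positives when few people experience the outcome.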
McMahon, Robert J; Witkiewitz, Katie; Kotler, Julie S
2010-11-01
This study investigated the predictive validity of youth callous-unemotional (CU) traits, as measured in early adolescence (Grade 7) by the Antisocial Process Screening Device (APSD; Frick & Hare, 2001), in a longitudinal sample (N = 754). Antisocial outcomes, assessed in adolescence and early adulthood, included self-reported general delinquency from 7th grade through 2 years post-high school, self-reported serious crimes through 2 years post-high school, juvenile and adult arrest records through 1 year post-high school, and antisocial personality disorder symptoms and diagnosis at 2 years post-high school. CU traits measured in 7th grade were highly predictive of 5 of the 6 antisocial outcomes (general delinquency, juvenile and adult arrests, and early adult antisocial personality disorder criterion count and diagnosis) over and above prior and concurrent conduct problem behavior (i.e., criterion counts of oppositional defiant disorder and conduct disorder) and attention-deficit/hyperactivity disorder (criterion count). Incorporating a CU traits specifier for those with a diagnosis of conduct disorder improved the positive prediction of antisocial outcomes, with a very low false-positive rate. There was minimal evidence of moderation by sex, race, or urban/rural status. Urban/rural status moderated one finding, with being from an urban area associated with stronger relations between CU traits and adult arrests. Findings clearly support the inclusion of CU traits as a specifier for the diagnosis of conduct disorder, at least with respect to predictive validity. PsycINFO Database Record (c) 2010 APA, all rights reserved
Lichtenberg, Peter A; Ficker, Lisa J; Rahman-Filipiak, Annalise
2016-01-01
This study examines preliminary evidence for the Lichtenberg Financial Decision Rating Scale (LFDRS), a new person-centered approach to assessing capacity to make financial decisions, and its relationship to self-reported cases of financial exploitation in 69 older African Americans. More than one third of individuals reporting financial exploitation also had questionable decisional abilities. Overall, decisional ability score and current decision total were significantly associated with cognitive screening test and financial ability scores, demonstrating good criterion validity. Study findings suggest that impaired decisional abilities may render older adults more vulnerable to financial exploitation, and that the LFDRS is a valid tool.
Measurement properties of depression questionnaires in patients with diabetes: a systematic review.
van Dijk, Susan E M; Adriaanse, Marcel C; van der Zwaan, Lennart; Bosmans, Judith E; van Marwijk, Harm W J; van Tulder, Maurits W; Terwee, Caroline B
2018-06-01
To conduct a systematic review on measurement properties of questionnaires measuring depressive symptoms in adult patients with type 1 or type 2 diabetes. A systematic review of the literature in MEDLINE, EMbase and PsycINFO was performed. Full text, original articles, published in any language up to October 2016 were included. Eligibility for inclusion was independently assessed by three reviewers who worked in pairs. Methodological quality of the studies was evaluated by two independent reviewers using the COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) checklist. Quality of the questionnaires was rated per measurement property, based on the number and quality of the included studies and the reported results. Of 6286 unique hits, 21 studies met our criteria evaluating nine different questionnaires in multiple settings and languages. The methodological quality of the included studies was variable for the different measurement properties: 9/15 studies scored 'good' or 'excellent' on internal consistency, 2/5 on reliability, 0/1 on content validity, 10/10 on structural validity, 8/11 on hypothesis testing, 1/5 on cross-cultural validity, and 4/9 on criterion validity. For the CES-D, there was strong evidence for good internal consistency, structural validity, and construct validity; moderate evidence for good criterion validity; and limited evidence for good cross-cultural validity. The PHQ-9 and WHO-5 also performed well on several measurement properties. However, the evidence for structural validity of the PHQ-9 was inconclusive. The WHO-5 was less extensively researched and originally not developed to measure depression. Currently, the CES-D is best supported for measuring depressive symptoms in diabetes patients.
Diehl, K; Görig, T; Breitbart, E W; Greinert, R; Hillhouse, J J; Stapleton, J L; Schneider, S
2018-01-01
Evidence suggests that indoor tanning may have addictive properties. However, many instruments for measuring indoor tanning addiction show poor validity and reliability. Recently, a new instrument, the Behavioral Addiction Indoor Tanning Screener (BAITS), has been developed. To test the validity and reliability of the BAITS by using a multimethod approach. We used data from the first wave of the National Cancer Aid Monitoring on Sunbed Use, which included a cognitive pretest (August 2015) and a Germany-wide representative survey (October to December 2015). In the cognitive pretest 10 users of tanning beds were interviewed and 3000 individuals aged 14-45 years were included in the representative survey. Potential symptoms of indoor tanning addiction were measured using the BAITS, a brief screening survey with seven items (answer categories: yes vs. no). Criterion validity was assessed by comparing the results of BAITS with usage parameters. Additionally, we tested internal consistency and construct validity. A total of 19·7% of current and 1·8% of former indoor tanning users were screened positive for symptoms of a potential indoor tanning addiction. We found significant associations between usage parameters and the BAITS (criterion validity). Internal consistency (reliability) was good (Kuder-Richardson-20, 0·854). The BAITS was shown to be a homogeneous construct (construct validity). Compared with other short instruments measuring symptoms of a potential indoor tanning addiction, the BAITS seems to be a valid and reliable tool. With its short length and the binary items the BAITS is easy to use in large surveys. © 2017 British Association of Dermatologists.
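The Kuder-Richardson-20 coefficient cited above is the internal-consistency statistic for binary (yes/no) items. A minimal sketch of the formula follows, using hypothetical responses to four items rather than the seven BAITS items. It uses the population variance of total scores, which is one common convention and an assumption about how the reported value was computed.

```python
import statistics


def kr20(responses):
    """Kuder-Richardson-20 for a list of respondents' 0/1 item answers.

    KR-20 = k / (k - 1) * (1 - sum(p_i * q_i) / variance of total scores),
    where p_i is the proportion answering 1 on item i and q_i = 1 - p_i.
    """
    n_items = len(responses[0])
    n_respondents = len(responses)
    totals = [sum(row) for row in responses]
    total_variance = statistics.pvariance(totals)
    sum_pq = 0.0
    for item in range(n_items):
        p = sum(row[item] for row in responses) / n_respondents
        sum_pq += p * (1 - p)
    return (n_items / (n_items - 1)) * (1 - sum_pq / total_variance)


# Hypothetical yes (1) / no (0) answers: five respondents, four items.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

coefficient = kr20(responses)
print(f"KR-20 = {coefficient:.3f}")
```

KR-20 is the special case of Cronbach's alpha for dichotomous items, so values are interpreted on the same scale.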
Durand, Guillaume
2018-05-03
Although highly debated, the notion of the existence of an adaptive side to psychopathy is supported by some researchers. Currently, 2 instruments assessing psychopathic traits include an adaptive component, which might not cover the full spectrum of adaptive psychopathic traits. The Durand Adaptive Psychopathic Traits Questionnaire (DAPTQ; Durand, 2017) is a 41-item self-reported instrument assessing adaptive traits known to correlate with the psychopathic personality. In this study, I investigated in 2 samples (N = 263 and N = 262) the incremental validity of the DAPTQ over the Psychopathic Personality Inventory-Short Form (PPI-SF) and the Triarchic Psychopathy Measure (TriPM) using multiple criterion measures. Results showed that the DAPTQ significantly increased the predictive validity over the PPI-SF on 5 factors of the HEXACO. Additionally, the DAPTQ provided incremental validity over both the PPI-SF and the TriPM on measures of communication adaptability, perceived stress, and trait anxiety. Overall, these results support the validity of the DAPTQ in community samples. Directions for future studies to further validate the DAPTQ are discussed.
Place and direction learning in a spatial T-maze task by neonatal piglets
Elmore, Monica R. P.; Dilger, Ryan N.; Johnson, Rodney W.
2013-01-01
Pigs are a valuable animal model for studying neurodevelopment in humans due to similarities in brain structure and growth. The development and validation of behavioral tests to assess learning and memory in neonatal piglets are needed. The present study evaluated the capability of 2-wk old piglets to acquire a novel place and direction learning spatial T-maze task. Validity of the task was assessed by the administration of scopolamine, an anti-cholinergic drug that acts on the hippocampus and other related structures, to impair spatial memory. During acquisition, piglets were trained to locate a milk reward in a constant place in space, as well as direction (east or west), in a plus-shaped maze using extra-maze visual cues. Following acquisition, reward location was reversed and piglets were re-tested to assess learning and working memory. The performance of control piglets in the maze improved over time (P < 0.0001), reaching performance criterion (80% correct) on day 5 of acquisition. Correct choices decreased in the reversal phase (P < 0.0001), but improved over time. In a separate study, piglets were injected daily with either phosphate buffered saline (PBS; control) or scopolamine prior to testing. Piglets administered scopolamine showed impaired performance in the maze compared to controls (P = 0.03), failing to reach performance criterion after 6 days of acquisition testing. Collectively, these data demonstrate that neonatal piglets can be tested in a spatial T-maze task to assess hippocampal-dependent learning and memory. PMID:22526690
Wilson, G. Terence; Sysko, Robyn
2013-01-01
Objective In DSM-IV, to be diagnosed with Bulimia Nervosa (BN) or the provisional diagnosis of Binge Eating Disorder (BED), an individual must experience episodes of binge eating “at least twice a week” on average, for three or six months respectively. The purpose of this review was to examine the validity and utility of the frequency criterion for BN and BED. Method Published studies evaluating the frequency criterion were reviewed. Results Our review found little evidence to support the validity or utility of the DSM-IV frequency criterion of twice a week binge eating; however, the number of studies available for our review was limited. Conclusion A number of options are available for the frequency criterion in DSM-V, and the optimal diagnostic threshold for binge eating remains to be determined. PMID:19610014
van der Ploeg, Hidde P; Streppel, Kitty R M; van der Beek, Allard J; van der Woude, Luc H V; Vollenbroek-Hutten, Miriam; van Mechelen, Willem
2007-01-01
The objective was to determine the test-retest reliability and criterion validity of the Physical Activity Scale for Individuals with Physical Disabilities (PASIPD). Forty-five non-wheelchair dependent subjects were recruited from three Dutch rehabilitation centers. Subjects' diagnoses were: stroke, spinal cord injury, whiplash, and neurological-, orthopedic- or back disorders. The PASIPD is a 7-d recall physical activity questionnaire that was completed twice, 1 wk apart. During this week, physical activity was also measured with an Actigraph accelerometer. The test-retest reliability Spearman correlation of the PASIPD was 0.77. The criterion validity Spearman correlation was 0.30 when compared to the accelerometer. The PASIPD had test-retest reliability and criterion validity that is comparable to well established self-report physical activity questionnaires from the general population.
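Both coefficients reported above are Spearman rank correlations. A minimal Python sketch of that statistic follows, assuming tied values receive average ranks; the helper names and the questionnaire and accelerometer numbers are invented for illustration and are not PASIPD data.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical questionnaire scores and accelerometer counts for six subjects.
questionnaire = [12.0, 30.5, 8.2, 22.1, 15.0, 27.3]
accelerometer = [210, 305, 180, 240, 330, 260]
print(round(spearman(questionnaire, accelerometer), 2))
```

The same function covers both uses in the abstract: first versus second administration for test-retest reliability, and questionnaire versus accelerometer for criterion validity.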
The revised Generalized Expectancy for Success Scale: a validity and reliability study.
Hale, W D; Fiedler, L R; Cochran, C D
1992-07-01
The Generalized Expectancy for Success Scale (GESS; Fibel & Hale, 1978) was revised and assessed for reliability and validity. The revised version was administered to 199 college students along with other conceptually related measures, including the Rosenberg Self-Esteem Scale, the Life Orientation Test, and Rotter's Internal-External Locus of Control Scale. One subsample of students also completed the Eysenck Personality Inventory, while another subsample performed a criterion-related task that involved risk taking. Item analysis yielded 25 items with correlations of .45 or higher with the total score. Results indicated high internal consistency and test-retest reliability.
The cross-validated AUC for MCP-logistic regression with high-dimensional data.
Jiang, Dingfeng; Huang, Jian; Zhang, Ying
2013-10-01
We propose a cross-validated area under the receiver operating characteristic (ROC) curve (CV-AUC) criterion for tuning parameter selection for penalized methods in sparse, high-dimensional logistic regression models. We use this criterion in combination with the minimax concave penalty (MCP) method for variable selection. The CV-AUC criterion is specifically designed for optimizing the classification performance for binary outcome data. To implement the proposed approach, we derive an efficient coordinate descent algorithm to compute the MCP-logistic regression solution surface. Simulation studies are conducted to evaluate the finite sample performance of the proposed method and its comparison with the existing methods including the Akaike information criterion (AIC), Bayesian information criterion (BIC) or Extended BIC (EBIC). The model selected based on the CV-AUC criterion tends to have a larger predictive AUC and smaller classification error than those with tuning parameters selected using the AIC, BIC or EBIC. We illustrate the application of the MCP-logistic regression with the CV-AUC criterion on three microarray datasets from the studies that attempt to identify genes related to cancers. Our simulation studies and data examples demonstrate that the CV-AUC is an attractive method for tuning parameter selection for penalized methods in high-dimensional logistic regression models.
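The idea of choosing a tuning parameter by cross-validated AUC can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' algorithm: the coordinate descent fit is omitted, the per-fold predictions are assumed to be supplied by some fitted model, and averaging the AUC over folds is one plausible reading of "CV-AUC". The names `auc`, `mcp_penalty` and `select_lambda` are hypothetical; `mcp_penalty` follows the standard definition of the minimax concave penalty.

```python
def auc(scores, labels):
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as one half. Assumes labels are 0/1 and both classes are present.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def mcp_penalty(t, lam, gamma):
    """Minimax concave penalty for a single coefficient t (lam > 0, gamma > 1)."""
    a = abs(t)
    if a <= gamma * lam:
        return lam * a - a * a / (2 * gamma)
    # Beyond gamma * lam the penalty is constant, so large coefficients
    # are not shrunk further.
    return gamma * lam * lam / 2


def select_lambda(fold_predictions):
    """Pick the tuning parameter with the largest mean held-out AUC.

    fold_predictions: {lam: [(scores, labels), ...]}, one pair per held-out fold.
    """
    mean_auc = {
        lam: sum(auc(s, y) for s, y in folds) / len(folds)
        for lam, folds in fold_predictions.items()
    }
    return max(mean_auc, key=mean_auc.get)


# Hypothetical held-out predictions for two candidate tuning parameters.
candidates = {
    0.1: [([0.9, 0.7, 0.4, 0.2], [1, 1, 0, 0]), ([0.8, 0.3, 0.6, 0.1], [1, 0, 1, 0])],
    0.5: [([0.6, 0.5, 0.7, 0.2], [1, 1, 0, 0]), ([0.4, 0.5, 0.6, 0.3], [1, 0, 1, 0])],
}
print(select_lambda(candidates))
```

Selecting on AUC rather than on a likelihood-based criterion such as AIC or BIC targets classification performance directly, which is the motivation given in the abstract.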
Butler, Leon H; Irons, Jessica G; Bassett, Drew T; Correia, Christopher J
2018-06-01
The multiple choice procedure (MCP) is used to assess the relative reinforcing value of concurrently available stimuli. The MCP was originally developed to assess the reinforcing value of drugs; the current within-subjects study employed the MCP to assess the reinforcing value of gambling behavior. Participants (N = 323) completed six versions of the MCP that presented hypothetical choices between money to be used while gambling ($10 or $25) versus escalating amounts of guaranteed money available immediately or after delays of either 1 week or 1 month. Results suggest that choices on the MCP are correlated with other measures of gambling behavior, thus providing concurrent validity data for using the MCP to quantify the relative reinforcing value of gambling. The MCP for gambling also displayed sensitivity to reinforcer magnitude and delay effects, which provides evidence of criterion validity. The results are consistent with a behavioral economic model of addiction and suggest that the MCP could be a valid tool for future research on gambling behavior.
Smith-Ryan, Abbie E; Blue, Malia N M; Trexler, Eric T; Hirsch, Katie R
2018-03-01
Measurement of body composition to assess health risk and prevention is expanding. Accurate portable techniques are needed to facilitate use in clinical settings. This study evaluated the accuracy and repeatability of a portable ultrasound (US) in comparison with a four-compartment criterion for per cent body fat (%Fat) in overweight/obese adults. Fifty-one participants (mean ± SD; age: 37·2 ± 11·3 years; BMI: 31·6 ± 5·2 kg m⁻²) were measured for %Fat using US (GE Logiq-e) and skinfolds. A subset of 36 participants completed a second day of the same measurements, to determine reliability. US and skinfold %Fat were calculated using the seven-site Jackson-Pollock equation. The Wang 4C model was used as the criterion method for %Fat. Compared to a gold standard criterion, US %Fat (36·4 ± 11·8%; P = 0·001; standard error of estimate [SEE] = 3·5%) was significantly higher than the criterion (33·0 ± 8·0%), but not different than skinfolds (35·3 ± 5·9%; P = 0·836; SEE = 4·5%). US resulted in good reliability, with no significant differences from Day 1 (39·95 ± 15·37%) to Day 2 (40·01 ± 15·42%). Relative consistency was 0·96, and standard error of measure was 0·94%. Although US overpredicted %Fat compared to the criterion, a moderate SEE for US is suggestive of a practical assessment tool in overweight individuals. %Fat differences reported from these field-based techniques are less than reported by other single-measurement laboratory methods and therefore may have utility in a clinical setting. This technique may also accurately track changes. © 2016 Scandinavian Society of Clinical Physiology and Nuclear Medicine. Published by John Wiley & Sons Ltd.
Validation of an Arabic version of an instrument to measure waterpipe smoking behavior.
Abou Arbid, S; Al Mulla, A; Ghandour, B; Ammar, N; Adawi, M; Daher, R; Younes, N; Chami, H A
2017-04-01
Reliable and valid measures of waterpipe smoking are essential to study its health effects. The purpose of this study was to examine the reliability and validity of an Arabic translation of the Maziak questionnaire that assesses various aspects of waterpipe smoking in epidemiological studies. A cross-sectional study. This questionnaire was translated, back translated, and culturally adapted to the local Arabic dialect. Construct and convergent validity were assessed in a sample of 119 daily waterpipe smokers (WPS) and 30 occasional WPS, defined as smoking at least one waterpipe per week but less than daily, from Beirut and Doha (mean age = 52.4 years, males = 61.7%). Construct validity was assessed by comparing the smoking behavior of daily and occasional WPS. Convergent validity was assessed by correlating daily smoking intensity ('number of waterpipe smoked per day') with 'number of waterpipe smoked yesterday' and by correlating lifetime smoking exposure (waterpipe-year) calculated by multiplying number of waterpipe smoked per day × duration of waterpipe smoking with alternate measures obtained graphically (graphical waterpipe-year) or adjusted (adjusted waterpipe-year). Criterion validity was assessed by correlating daily smoking intensity and lifetime smoking exposure with serum cotinine level. Test-retest reliability was analyzed by re-administering the questionnaire to 30 daily and 30 occasional WPS after 2 weeks. Smoking intensity, patterns of use, and willingness to quit differed significantly between daily and occasional WPS. Daily smoking intensity correlated strongly with the number of waterpipe smoked yesterday (r_s = 0.68, P < 0.001), but not in the occasional WPS (r_s = 0.13, P = 0.70). Waterpipe-year correlated very strongly with adjusted waterpipe-year and graphical waterpipe-year (r_s = 0.98, P < 0.001 and r_s = 0.92, P < 0.001, respectively). Waterpipe-year, daily smoking intensity, and number of waterpipe smoked yesterday correlated weakly but significantly with serum cotinine levels (r_s = 0.243, P = 0.01; r_s = 0.359, P < 0.01 and r_s = 0.387, P < 0.01, respectively). The type and pattern of waterpipe use items showed high test-retest reliability with near perfect agreement (k > 0.9), the sharing and intention to quit waterpipe items had substantial agreement (k > 0.6), and the intent to quit item showed moderate agreement (k > 0.4). The questionnaire showed strong reliability, face validity, construct and convergent validity, and a weak but statistically significant criterion validity. The Maziak questionnaire is valid and reliable for assessing waterpipe smoking patterns, intensity, and willingness to quit. Copyright © 2016 The Royal Society for Public Health. Published by Elsevier Ltd. All rights reserved.
Venables, Noah C.; Patrick, Christopher J.
2013-01-01
The Externalizing Spectrum Inventory (ESI; Krueger, Markon, Patrick, Benning, & Kramer, 2007) provides a self-report based method for indexing a range of correlated problem behaviors and traits in the domain of deficient impulse control. The ESI organizes lower-order behaviors and traits of this kind around higher-order factors encompassing general disinhibitory proneness, callous-aggression, and substance abuse. The current study used data from a male prisoner sample (N = 235) to evaluate the validity of ESI total and factor scores in relation to external criterion measures consisting of externalizing disorder symptoms (including child and adult antisocial deviance and substance-related problems) assessed via diagnostic interview, personality traits assessed by self-report, and psychopathic features as assessed by both interview and self-report. Results provide evidence for the validity of the ESI measurement model and point to its potential utility as a referent for research on the neurobiological correlates and etiological bases of externalizing proneness. PMID:21787091
Venables, Noah C; Patrick, Christopher J
2012-03-01
The Externalizing Spectrum Inventory (ESI; Krueger, Markon, Patrick, Benning, & Kramer, 2007) provides a self-report based method for indexing a range of correlated problem behaviors and traits in the domain of deficient impulse control. The ESI organizes lower order behaviors and traits of this kind around higher order factors encompassing general disinhibitory proneness, callous-aggression, and substance abuse. In the current study, we used data from a male prisoner sample (N = 235) to evaluate the validity of ESI total and factor scores in relation to external criterion measures consisting of externalizing disorder symptoms (including child and adult antisocial deviance and substance-related problems) assessed via diagnostic interviews, personality traits assessed with self-reports, and psychopathic features as assessed with both interviews and self-reports. Results provide evidence for the validity of the ESI measurement model and point to its potential usefulness as a referent for research on the neurobiological correlates and etiological bases of externalizing proneness.
Qi, Bing-Bing; Resnick, Barbara
2014-01-01
To assess the psychometric properties of the Chinese versions of the self-efficacy and outcome expectations on osteoporosis medication adherence (SEOMA-C and OEOMA-C) scales. Back-translated tools were assessed by internal consistency and R² by structural equation modeling, confirmatory factor analyses, hypothesis testing, and criterion-related validity among 110 (81 females, 29 males) Mandarin-speaking immigrants (mean age = 63.44, SD = 9.63). The Cronbach's alphas for the SEOMA-C and OEOMA-C were .904 and .937, respectively. There was fair and good fit of the measurement model to the data. Previous bone mineral density (BMD) testing, calcaneus BMD, self-efficacy for exercise, and osteoporosis medication adherence were positively related to SEOMA-C scores. These scales show preliminary evidence of validity and reliability. Further refined and culturally sensitive items could be explored and added.
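Cronbach's alpha, the internal consistency statistic quoted above, can be computed directly from an item-by-respondent score matrix. The Python sketch below is a minimal illustration that assumes sample variances and complete data; the function name `cronbach_alpha` and the ratings are hypothetical and are not SEOMA-C or OEOMA-C data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    Assumes at least two items, at least two respondents, and no missing values.
    """
    n_items = len(responses[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([r[i] for r in responses]) for i in range(n_items)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(r) for r in responses])
    return (n_items / (n_items - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical 5-point ratings on four items from six respondents.
ratings = [
    [4, 5, 4, 4],
    [2, 2, 3, 2],
    [5, 5, 5, 4],
    [3, 3, 2, 3],
    [1, 2, 1, 2],
    [4, 4, 5, 5],
]
print(round(cronbach_alpha(ratings), 3))
```

Alpha rises as the items covary more strongly relative to their individual variances, which is why values above .90 are usually read as high internal consistency.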
Adaptation to Portuguese of the Depression, Anxiety and Stress Scales (DASS).
Apóstolo, João Luís Alves; Mendes, Aida Cruz; Azeredo, Zaida Aguiar
2006-01-01
To adapt the Depression, Anxiety and Stress Scales, a 21-item short scale (DASS 21) designed to measure depression, anxiety and stress, to the Portuguese of Portugal. After translation and back-translation with the help of experts, the DASS 21 was administered to patients in outpatient psychiatry consultations (N=101), and its internal consistency, construct validity and concurrent validity were measured. The DASS 21 properties certify its quality to measure emotional states. The instrument reveals good internal consistency. Factor analysis shows that the two-factor structure is more adequate. The first factor groups most of the items that theoretically assess anxiety and stress, and the second groups most of the items that assess depression, explaining, on the whole, 58.54% of total variance. The strong positive correlation between the DASS 21 and the Hospital Anxiety and Depression scale (HAD) confirms the hypothesis regarding criterion validity, but also reveals weaknesses in the divergence between theoretically different constructs.
Zhou, Ting; Yang, Kaixiang; Thapa, Sudip; Fu, Qiang; Jiang, Yongsheng; Yu, Shiying
2017-04-01
The assessment of quality of life (QOL) is an important part of cachexia management for cancer patients. Functional assessment of anorexia-cachexia therapy (FAACT), a specific QOL instrument for cachexia patients, has not been validated in the Chinese population. The aim of this study was to validate the FAACT scale in Chinese cancer patients for its future use. Eligible cancer patients were included in our study. Patients' demographic and clinical characteristics were collected from the electronic medical records. Patients were asked to complete the Chinese version of the FAACT scale and the MD Anderson symptom inventory (MDASI), and then the reliability and validity were analyzed. A total of 285 patients were enrolled in our study, and data from 241 patients were evaluated. Coefficients of Cronbach's alpha, test-retest and split-half analyses were all greater than 0.8, which indicated excellent reliability for the FAACT scale. In item-subscale correlation analysis and factor analysis, good construct validity for the FAACT scale was found. The correlation between the FAACT and the MDASI interference subscale showed reasonable criterion-related validity, and for further clinical validation, the FAACT scale showed excellent discriminative validity for distinguishing patients in different cachexia status and in different performance status. The Chinese version of the FAACT scale has good reliability and validity and is suitable for measuring QOL of cachexia patients in the Chinese population.
Reliability and validity of a combat exposure index for Vietnam era veterans.
Janes, G R; Goldberg, J; Eisen, S A; True, W R
1991-01-01
The reliability and validity of a self-report measure of combat exposure are examined in a cohort of male-male twin pairs who served in the military during the Vietnam era. Test-retest reliability for a five-level ordinal index of combat exposure is assessed by use of 192 duplicate sets of responses. The chance-corrected proportion in agreement (as measured by the kappa coefficient) is .84. As a measure of criterion-related validity, the combat index is correlated with the award of combat-related military medals ascertained from the military records. The probability of receiving a Purple Heart, Bronze Star, Commendation Medal and Combat Infantry Badge is associated strongly with the combat exposure index. These results show that this simple index is a reliable and valid measure of combat exposure.
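The chance-corrected agreement mentioned above is Cohen's kappa. A minimal Python sketch of the unweighted form follows; whether the original analysis of the five-level ordinal index used a weighted variant is not stated, so the unweighted version here is an assumption, and the function name and duplicate responses are invented for illustration.

```python
from collections import Counter


def cohen_kappa(first, second):
    """Unweighted Cohen's kappa for two equal-length lists of category labels.

    Assumes expected chance agreement is below 1 (more than one category in use).
    """
    n = len(first)
    # Observed proportion of exact agreement.
    observed = sum(a == b for a, b in zip(first, second)) / n
    # Agreement expected by chance from the two marginal distributions.
    c1, c2 = Counter(first), Counter(second)
    expected = sum(c1[k] * c2[k] for k in c1) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical duplicate responses on a five-level (0-4) exposure index.
first_response = [0, 1, 2, 3, 4, 2, 1, 0, 3, 4]
second_response = [0, 1, 2, 3, 4, 2, 2, 0, 3, 3]
print(round(cohen_kappa(first_response, second_response), 2))
```

Kappa equals 1 for perfect agreement and 0 when agreement is no better than chance, so the reported .84 indicates agreement well beyond chance.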
Santos, Rafaella Zulianello Dos; Bonin, Christiani Decker Batista; Martins, Eliara Ten Caten; Pereira Junior, Moacir; Ghisi, Gabriela Lima de Melo; Macedo, Kassia Rosangela Paz de; Benetti, Magnus
2018-01-01
The absence of instruments capable of measuring the level of knowledge of hypertensive patients in cardiac rehabilitation programs about their disease reflects the lack of specific recommendations for these patients. To develop and validate a questionnaire to evaluate the knowledge of hypertensive patients in cardiac rehabilitation programs about their disease. A total of 184 hypertensive patients (mean age 60.5 ± 10 years, 66.8% men) were evaluated. Reproducibility was assessed by calculation of the intraclass correlation coefficient using the test-retest method. Internal consistency was assessed by Cronbach's alpha and construct validity by exploratory factor analysis. The final version of the instrument had 17 questions organized in areas considered important for patient education. The instrument proposed showed a clarity index of 8.7 (0.25). The intraclass correlation coefficient was 0.804 and the Cronbach's alpha coefficient was 0.648. Factor analysis revealed five factors associated with knowledge areas. Regarding the criterion validity, patients with higher education level and higher family income showed greater knowledge about hypertension. The instrument has a satisfactory clarity index and adequate validity, and can be used to evaluate the knowledge of hypertensive participants in cardiac rehabilitation programs.
Survey Development to Assess College Students' Perceptions of the Campus Environment.
Sowers, Morgan F; Colby, Sarah; Greene, Geoffrey W; Pickett, Mackenzie; Franzen-Castle, Lisa; Olfert, Melissa D; Shelnutt, Karla; Brown, Onikia; Horacek, Tanya M; Kidd, Tandalayo; Kattelmann, Kendra K; White, Adrienne A; Zhou, Wenjun; Riggsbee, Kristin; Yan, Wangcheng; Byrd-Bredbenner, Carol
2017-11-01
We developed and tested a College Environmental Perceptions Survey (CEPS) to assess college students' perceptions of the healthfulness of their campus. CEPS was developed in 3 stages: questionnaire development, validity testing, and reliability testing. Questionnaire development was based on an extensive literature review and input from an expert panel to establish content validity. Face validity was established with the target population using cognitive interviews with 100 college students. Concurrent-criterion validity was established with in-depth interviews (N = 30) of college students compared to surveys completed by the same 30 students. Surveys completed by college students from 8 universities (N = 1147) were used to test internal structure (factor analysis) and internal consistency (Cronbach's alpha). After development and testing, 15 items remained from the original 48 items. A 5-factor solution emerged: physical activity (4 items, α = .635), water (3 items, α = .773), vending (2 items, α = .680), healthy food (2 items, α = .631), and policy (2 items, α = .573). The mean total score for all universities was 62.71 (±11.16) on a 100-point scale. CEPS appears to be a valid and reliable tool for assessing college students' perceptions of their health-related campus environment.
Brief International Cognitive Assessment for MS (BICAMS): international standards for validation.
Benedict, Ralph H B; Amato, Maria Pia; Boringa, Jan; Brochet, Bruno; Foley, Fred; Fredrikson, Stan; Hamalainen, Paivi; Hartung, Hans; Krupp, Lauren; Penner, Iris; Reder, Anthony T; Langdon, Dawn
2012-07-16
An international expert consensus committee recently recommended a brief battery of tests for cognitive evaluation in multiple sclerosis. The Brief International Cognitive Assessment for MS (BICAMS) battery includes tests of mental processing speed and memory. Recognizing that resources for validation will vary internationally, the committee identified validation priorities, to facilitate international acceptance of BICAMS. Practical matters pertaining to implementation across different languages and countries were discussed. Five steps to achieve optimal psychometric validation were proposed. In Step 1, test stimuli should be standardized for the target culture or language under consideration. In Step 2, examiner instructions must be standardized and translated, including all information from manuals necessary for administration and interpretation. In Step 3, samples of at least 65 healthy persons should be studied for normalization, matched to patients on demographics such as age, gender and education. The objective of Step 4 is test-retest reliability, which can be investigated in a small sample of MS and/or healthy volunteers over 1-3 weeks. Finally, in Step 5, criterion validity should be established by comparing MS and healthy controls. At this time, preliminary studies are underway in a number of countries as we move forward with this international assessment tool for cognition in MS.
Damschroder, Laura J; Goodrich, David E; Kim, Hyungjin Myra; Holleman, Robert; Gillon, Leah; Kirsh, Susan; Richardson, Caroline R; Lutes, Lesley D
2016-09-01
Practical and valid instruments are needed to assess fidelity of coaching for weight loss. The purpose of this study was to develop and validate the ASPIRE Coaching Fidelity Checklist (ACFC). Classical test theory guided ACFC development. Principal component analyses were used to determine item groupings. Psychometric properties, internal consistency, and inter-rater reliability were evaluated for each subscale. Criterion validity was tested by predicting weight loss as a function of coaching fidelity. The final 19-item ACFC consists of two domains (session process and session structure) and five subscales (sets goals and monitors progress, assesses and personalizes self-regulatory content, manages the session, creates a supportive and empathetic climate, and stays on track). Four of five subscales showed high internal consistency (Cronbach alphas > 0.70) for group-based coaching; only two of five subscales had high internal reliability for phone-based coaching. All five subscales were positively and significantly associated with weight loss for group- but not for phone-based coaching. The ACFC is a reliable and valid instrument that can be used to assess fidelity and guide skill-building for weight management interventionists.
A New Criterion for Prediction of Hot Tearing Susceptibility of Cast Alloys
NASA Astrophysics Data System (ADS)
Nasresfahani, Mohamad Reza; Niroumand, Behzad
2014-08-01
A new criterion for prediction of hot tearing susceptibility of cast alloys is suggested which takes into account the effects of both important mechanical and metallurgical factors and is believed to be less sensitive to the presence of volume defects such as bifilms and inclusions. The criterion was validated by studying the hot tearing tendency of Al-Cu alloy. In conformity with the experimental results, the new criterion predicted reduction of hot tearing tendency with increasing the copper content.
Measuring the emotional climate of an organization.
Yurtsever, Gülçimen; De Rivera, Joseph
2010-04-01
Emotional climate has gained interest in the organizational climate literature. However, few studies have concentrated on adequately measuring the emotional climate of organizations. In this study, a reliable and valid scale was developed to measure the most important aspects of emotional climate in different organizations. This study presents evidence of reliability and validity for 28 items constructed to measure emotional climate in an organization in four separate studies. The data were obtained from working people from four different organizations by self-administered questionnaires. The findings indicate that three factors (Trust, Hope, and Security) underlie the 28-item scale. Validation data also included correlations with duration of employment. The other method of assessing criterion validity was by comparing mean scores in organizations with differing productivity; results indicated that the organization with more productive members had a significantly higher mean score on emotional climate and its subscales. The generalizability of the results to private businesses also was assessed.
Urpí-Fernández, Ana-María; Zabaleta-Del-Olmo, Edurne; Montes-Hidalgo, Javier; Tomás-Sábado, Joaquín; Roldán-Merino, Juan-Francisco; Lluch-Canut, María-Teresa
2017-12-01
To identify, critically appraise and summarize the measurement properties of instruments to assess self-care in healthy children. Assessing self-care is a proper consideration for nursing practice and nursing research. No systematic review summarizes measurement instruments validated in healthy children. Psychometric review in accordance with the COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) panel. MEDLINE, CINAHL, PsycINFO, Web of Science and Open Grey were searched from their inception to December 2016. Validation studies with a healthy child population were included. The search was not restricted by language. Two reviewers independently assessed the methodological quality of included studies using the COSMIN checklist. Eleven studies were included in the review assessing the measurement properties of ten instruments. There was a maximum of two studies per instrument. None of the studies evaluated the properties of test-retest reliability, measurement error, criterion validity and responsiveness. Internal consistency and structural validity were rated as "excellent" or "good" in four studies. Four studies were rated as "excellent" in content validity. Cross-cultural validity was rated as "poor" in the two studies (three instruments) in which cultural adaptation was carried out. The evidence available does not allow firm conclusions about the instruments identified in terms of reliability and validity. Future research should focus on generating evidence about a wider range of measurement properties of these instruments using a rigorous methodology, as well as on testing the instruments in different countries and child populations. © 2017 John Wiley & Sons Ltd.
Imura, Tomoya; Takamura, Masahiro; Okazaki, Yoshihiro; Tokunaga, Satoko
2016-10-01
We developed a scale to measure time management and assessed its reliability and validity. We then used this scale to examine the impact of time management on psychological stress response. In Study 1-1, we developed the scale and assessed its internal consistency and criterion-related validity. Findings from a factor analysis revealed three elements of time management, “time estimation,” “time utilization,” and “taking each moment as it comes.” In Study 1-2, we assessed the scale’s test-retest reliability. In Study 1-3, we assessed the validity of the constructed scale. The results indicate that the time management scale has good reliability and validity. In Study 2, we performed a covariance structural analysis to verify our model that hypothesized that time management influences perceived control of time and psychological stress response, and perceived control of time influences psychological stress response. The results showed that time estimation increases the perceived control of time, which in turn decreases stress response. However, we also found that taking each moment as it comes reduces perceived control of time, which in turn increases stress response.
Development and validation of the Spanish-English Language Proficiency Scale (SELPS).
Smyk, Ekaterina; Restrepo, M Adelaida; Gorin, Joanna S; Gray, Shelley
2013-07-01
This study examined the development and validation of a criterion-referenced Spanish-English Language Proficiency Scale (SELPS) that was designed to assess the oral language skills of sequential bilingual children ages 4-8. This article reports results for the English proficiency portion of the scale. The SELPS assesses syntactic complexity, grammatical accuracy, verbal fluency, and lexical diversity based on 2 story retell tasks. In Study 1, 40 children were given 2 story retell tasks to evaluate the reliability of parallel forms. In Study 2, 76 children participated in the validation of the scale against language sample measures and teacher ratings of language proficiency. Study 1 indicated no significant differences between the SELPS scores on the 2 stories. Study 2 indicated that the SELPS scores correlated significantly with their counterpart language sample measures. Correlations between the SELPS and teacher ratings were moderate. The 2 story retells elicited comparable SELPS scores, providing a valuable tool for test-retest conditions in the assessment of language proficiency. Correlations between the SELPS scores and external variables indicated that these measures assessed the same language skills. Results provided empirical evidence regarding the validity of inferences about language proficiency based on the SELPS score.
Hobbelen, Johannes S M; Koopmans, Raymond T C M; Verhey, Frans R J; Habraken, Kitty M; de Bie, Rob A
2008-08-01
Paratonia is one of the associated movement disorders characteristic of dementia. The aim of this study was to develop an assessment tool (the Paratonia Assessment Instrument, PAI), based on the new consensus definition of paratonia. An additional aim was to investigate the reliability and validity of the PAI. A three-phase cross-sectional survey was conducted. In the first two phases, the PAI was developed and validated. In the third phase, the inter-observer reliability and feasibility of the instrument were tested. The original PAI consisted of five criteria that all needed to be met in order to make the diagnosis. On the basis of a qualitative analysis, one criterion was reformulated and another was removed. Following this, inter-observer reliability between the two assessors improved from a Cohen's kappa of 0.532 in the initial phase to 0.677 in the second phase. This improvement was substantiated in the third phase by two independent assessors, with Cohen's kappa ranging from 0.625 to 1. The PAI is a reliable and valid assessment tool for diagnosing paratonia in elderly people with dementia that can be applied easily in daily practice.
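The abstract above reports inter-observer reliability as Cohen's kappa. As a minimal sketch of how that statistic is computed for two raters, assuming unweighted kappa on nominal ratings (the function name and the example ratings are hypothetical and are not taken from the PAI study):

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters scoring the same subjects."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same, non-empty set of subjects")
    n = len(rater_a)
    # Observed agreement: share of subjects given the same category by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions, summed over categories.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical ratings (1 = paratonia present, 0 = absent), for illustration only.
assessor_1 = [1, 1, 0, 0]
assessor_2 = [1, 0, 0, 0]
kappa = cohens_kappa(assessor_1, assessor_2)
```

Kappa corrects raw agreement for the agreement expected by chance, which is why it can be well below the simple percentage of matching ratings.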
Tsugawa, Yusuke; Ohbu, Sadayoshi; Cruess, Richard; Cruess, Sylvia; Okubo, Tomoya; Takahashi, Osamu; Tokuda, Yasuharu; Heist, Brian S; Bito, Seiji; Itoh, Toshiyuki; Aoki, Akiko; Chiba, Tsutomu; Fukui, Tsuguya
2011-08-01
Despite the growing importance of and interest in medical professionalism, there is no standardized tool for its measurement. The authors sought to verify the validity, reliability, and generalizability of the Professionalism Mini-Evaluation Exercise (P-MEX), a previously developed and tested tool, in the context of Japanese hospitals. A multicenter, cross-sectional evaluation study was performed to investigate the validity, reliability, and generalizability of the P-MEX in seven Japanese hospitals. In 2009-2010, 378 evaluators (attending physicians, nurses, peers, and junior residents) completed 360-degree assessments of 165 residents and fellows using the P-MEX. The content validity and criterion-related validity were examined, and the construct validity of the P-MEX was investigated by performing confirmatory factor analysis through a structural equation model. The reliability was tested using generalizability analysis. The contents of the P-MEX achieved good acceptance in a preliminary working group, and the poststudy survey revealed that 302 (79.9%) evaluators rated the P-MEX items as appropriate, indicating good content validity. The correlation coefficient between P-MEX scores and external criteria was 0.78 (P < .001), demonstrating good criterion-related validity. Confirmatory factor analysis verified high path coefficient (0.60-0.99) and adequate goodness of fit of the model. The generalizability analysis yielded a high dependability coefficient, suggesting good reliability, except when evaluators were peers or junior residents. Findings show evidence of adequate validity, reliability, and generalizability of the P-MEX in Japanese hospital settings. The P-MEX is the only evaluation tool for medical professionalism verified in both a Western and East Asian cultural context.
Charalambous, Andreas; Kaite, Charis; Constantinou, Marianna; Kouta, Christiana
2016-12-02
The objective was to translate and validate the Cancer-Related Fatigue (CRF) Scale in the Greek language. A cross-sectional descriptive design was used in order to translate and validate the CRF Scale in Greek. Factor analyses were performed to understand the psychometric properties of the scale and to establish construct, criterion and convergent validity. The setting was the outpatient oncology clinics of two public hospitals in Cyprus, and the participants were 148 patients with advanced prostate cancer undergoing chemotherapy. The Cancer Fatigue Scale (CFS) had good stability (test-retest reliability r=0.79, p<0.001) and good internal consistency (Cronbach's α coefficient for all 15 items α=0.916). Furthermore, the Kaiser-Meyer-Olkin Measure of Sampling Adequacy (KMO value) was found to be 0.743 and considered to be satisfactory (>0.5). The correlations between the CFS physical scale (CFS-FS scale) and the European Organization for Research and Treatment of Cancer (EORTC) QLQ-C30 physical subscales were found to be significant (r=-0.715). The same occurred between the CFS cognitive and EORTC cognitive subscales (r=-0.579). Overall, the criterion validity was verified. The same holds for the convergent validity of the CFS, since all correlations with the Global Health Status (q29-q30) were found to be significant. This is the first validation study of the CRF Scale in Greek, and it warrants the scale's use in the assessment of fatigue in patients with prostate cancer. However, further testing and validation is needed in the early stages of the disease and in patients in later chemotherapy cycles. Published by the BMJ Publishing Group Limited.
Talip, Whadi-ah; Steyn, Nelia P; Visser, Marianne; Charlton, Karen E; Temple, Norman
2003-09-01
We wanted to develop and validate a test that assesses the knowledge and practices of health professionals (HPs) with regard to the role of nutrition, physical activity, and smoking cessation (lifestyle modification) in chronic diseases of lifestyle. A descriptive cross-sectional validation study was carried out. The validation design consisted of two phases, namely 1) test planning and development and 2) test evaluation. The study sample consisted of five groups of HPs: dietitians, dietetic interns, general practitioners, medical students, and nurses. The overall response rate was 58%, resulting in a sample size of 186 participants. A test was designed to evaluate the knowledge and practices of HPs. The test was first evaluated by an expert group to ensure content, construct, and face validity. Thereafter, the questionnaire was tested on five groups of HPs to test for criterion validity. Internal consistency was evaluated by Cronbach's alpha. An expert panel ensured content, construct, and face validity of the test. Groups with the most training and exposure to nutrition (dietitians and dietetic interns) had the highest group mean score, ranging from 61% to 88%, whereas those with limited nutrition training (general practitioners, medical students, and nurses) had significantly lower scores, ranging from 26% to 80%. This result demonstrated criterion validity. Internal consistency of the overall test demonstrated a Cronbach's alpha of 0.99. Most HPs identified the mass media as their main source of information on lifestyle modification. These HPs also identified lack of time, lack of patient compliance, and lack of knowledge as barriers that prevent them from providing counseling on lifestyle modification. The results of this study showed that this test instrument identifies groups of health professionals with adequate training (knowledge) in lifestyle modification and those who require further training (knowledge).
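Internal consistency in the abstract above is summarised by Cronbach's alpha. A minimal sketch of the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), assuming complete data with one row per respondent (the function name and the toy scores are hypothetical, not data from the study):

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    if len(scores) < 2:
        raise ValueError("at least two respondents are required")
    k = len(scores[0])
    if k < 2 or any(len(row) != k for row in scores):
        raise ValueError("every respondent needs the same number (>= 2) of item scores")
    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of the total (summed) score across respondents.
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("alpha is undefined when total scores do not vary")
    return (k / (k - 1)) * (1.0 - sum(item_variances) / total_variance)


# Hypothetical two-item test answered by three respondents, for illustration only.
toy_scores = [[1, 2], [2, 1], [3, 3]]
alpha = cronbach_alpha(toy_scores)
```

Values very close to 1, such as the 0.99 reported above, are sometimes read as a sign of redundant items rather than purely as good reliability; that interpretation is a general caution and not a claim made by the abstract.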
Kaite, Charis; Constantinou, Marianna; Kouta, Christiana
2016-01-01
Objective To translate and validate the Cancer-Related Fatigue (CRF) Scale in the Greek language. Design A cross-sectional descriptive design was used in order to translate and validate the CRF Scale in Greek. Factor analyses were performed to understand the psychometric properties of the scale and to establish construct, criterion and convergent validity. Setting Outpatients' oncology clinics of two public hospitals in Cyprus. Participants 148 patients with advanced prostate cancer undergoing chemotherapy. Results The Cancer Fatigue Scale (CFS) had good stability (test–retest reliability r=0.79, p<0.001) and good internal consistency (Cronbach's α coefficient for all 15 items α=0.916). Furthermore, the Kaiser-Meyer-Olkin Measure of Sampling Adequacy (KMO value) was found to be 0.743 and considered to be satisfactory (>0.5). The correlations between the CFS physical scale (CFS-FS scale) and the European Organization for Research and Treatment of Cancer (EORTC) QLQ-C30 physical subscales were found to be significant (r=−0.715). The same occurred between the CFS cognitive and EORTC cognitive subscales (r=−0.579). Overall, the criterion validity was verified. The same holds for the convergent validity of the CFS, since all correlations with the Global Health Status (q29–q30) were found to be significant. Conclusions This is the first validation study of the CRF Scale in Greek, and it warrants the scale's use in the assessment of fatigue in patients with prostate cancer. However, further testing and validation is needed in the early stages of the disease and in patients in later chemotherapy cycles. PMID:27913557
Sitnikova, Kate; Dijkstra-Kersten, Sandra M A; Mokkink, Lidwine B; Terluin, Berend; van Marwijk, Harm W J; Leone, Stephanie S; van der Horst, Henriëtte E; van der Wouden, Johannes C
2017-12-01
The aim of this review is to critically appraise the evidence on measurement properties of self-report questionnaires measuring somatization in adult primary care patients and to provide recommendations about which questionnaires are most useful for this purpose. We assessed the methodological quality of included studies using the COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) checklist. To draw overall conclusions about the quality of the questionnaires, we conducted an evidence synthesis using predefined criteria for judging the measurement properties. We found 24 articles on 9 questionnaires. Studies on the Patient Health Questionnaire-15 (PHQ-15) and the Four-Dimensional Symptom Questionnaire (4DSQ) somatization subscale prevailed and covered the broadest range of measurement properties. These questionnaires had the best internal consistency, test-retest reliability, structural validity, and construct validity. The PHQ-15 also had good criterion validity, whereas the 4DSQ somatization subscale was validated in several languages. The Bodily Distress Syndrome (BDS) checklist had good internal consistency and structural validity. Some evidence was found for good construct validity and criterion validity of the Physical Symptom Checklist (PSC-51) and good construct validity of the Symptom Check-List (SCL-90-R) somatization subscale. However, these three questionnaires were only studied in a small number of primary care studies. Based on our findings, we recommend the use of either the PHQ-15 or 4DSQ somatization subscale for somatization in primary care. Other questionnaires, such as the BDS checklist, PSC-51 and the SCL-90-R somatization subscale show promising results but have not been studied extensively in primary care. Copyright © 2017 Elsevier Inc. All rights reserved.
Seves, Mauro; Haidl, Theresa; Eggers, Susanne; Rostamzadeh, Ayda; Genske, Anna; Jünger, Saskia; Woopen, Christiane; Jessen, Frank; Ruhrmann, Stephan
2018-01-01
Background Numerous studies suggest that health literacy (HL) plays a crucial role in maintaining and improving individual health. Furthermore, empirical findings highlight the relation between levels of a person’s HL and clinical outcomes. So far, there are no reviews that investigate HL in individuals at risk for psychosis. The aim of the current review is to assess how individuals at risk of developing a first episode of psychosis gain access to, understand, evaluate and apply risk-related health information. Methods A mixed-methods approach was used to analyze and synthesize a variety of study types including qualitative and quantitative studies. Search strategy, screening and data selection were carried out according to the PRISMA criteria. The systematic search was applied to peer-reviewed literature in PUBMED, Cochrane Library, PsycINFO and Web of Science. Studies were included if participants met clinical high risk criteria (CHR), including the basic symptom criterion (BS) and the ultra-high risk (UHR) criteria. The UHR criteria comprise the attenuated psychotic symptom criterion (APS), the brief limited psychotic symptom criterion (BLIPS) and the genetic risk and functional decline criterion (GRDP). Furthermore, studies must have used validated HL measures or any operationalization of the HL’s subdimensions (access, understanding, appraisal, decision-making or action) as a primary outcome. A third inclusion criterion was that the concept of HL or one of the four dimensions was mentioned in the title or abstract. Data extraction and synthesis were implemented according to existing recommendations for appraising evidence from different study types. The quality of the included studies was evaluated and related to the study results. Results The search string returned 10587 papers. After data extraction 15 quantitative as well as 4 qualitative studies and 3 reviews were included.
The quality assessment rated 12 publications as “good”, 9 as “fair” and one paper as “poor”. Only one of the studies assessed HL as a primary outcome. In the other studies, the five different subdimensions of HL were investigated as a secondary outcome or merely mentioned in the paper. “Gaining Access” was examined in 18 of the 22 studies. “Understanding” has been assessed in 7 publications. “Appraise” was examined in 9 studies. “Apply decision making” and “Apply health behavior” were investigated in 1 of 8 studies. Since none of the included publications operationalized either HL or the subdimensions of HL with a validated measure, no explicit influencing factors could be found. Discussion Quantitative and qualitative evidence indicates that subjects at risk for psychosis describe a lack of understanding about their state and fear stigmatization that might lead to dysfunctional coping strategies, such as ignoring and hiding symptoms. Affected subjects are eager to be informed about their condition and describe favoured channels for obtaining information. The internet, family members, school personnel and GPs play a crucial role in gaining access to, understanding, evaluating and applying risk-related health information. The results clearly highlight that more research should be dedicated to HL in individuals at risk of developing psychosis. Further studies should explore the relation between HL and clinical outcomes in this target population by assessing the underlying constructs with validated tools.
Vuillerot, Carole; Meilleur, Katherine G.; Jain, Minal; Waite, Melissa; Wu, Tianxia; Linton, Melody; Datsgir, Jahannaz; Donkervoort, Sandra; Leach, Meganne E.; Rutkowski, Anne; Rippert, Pascal; Payan, Christine; Iwaz, Jean; Hamroun, Dalil; Bérard, Carole; Poirot, Isabelle; Bönnemann, Carsten G.
2016-01-01
Objective To develop and validate an English version of the Neuromuscular (NM)-Score, a classification for patients with NM diseases in each of the 3 motor function domains: D1, standing and transfers; D2, axial and proximal motor function; and D3, distal motor function. Design Validation survey. Setting Patients seen at a medical research center between June and September 2013. Participants Consecutive patients (N = 42) aged 5 to 19 years with a confirmed or suspected diagnosis of congenital muscular dystrophy. Interventions Not applicable. Main Outcome Measures An English version of the NM-Score was developed by a 9-person expert panel that assessed its content validity and semantic equivalence. Its concurrent validity was tested against criterion standards (Brooke Scale, Motor Function Measure [MFM], activity limitations for patients with upper and/or lower limb impairments [ACTIVLIM], Jebsen Test, and myometry measurements). Informant agreement between patient/caregiver (P/C)-reported and medical doctor (MD)-reported NM scores was measured by weighted kappa. Results Significant correlation coefficients were found between NM scores and criterion standards. The highest correlations were found between NM-score D1 and MFM score D1 (ρ = −.944, P<.0001), ACTIVLIM (ρ = −.895, P<.0001), and hip abduction strength by myometry (ρ = −.811, P<.0001). Informant agreement between P/C-reported and MD-reported NM scores was high for D1 (κ = .801; 95% confidence interval [CI], .701–.914) but moderate for D2 (κ = .592; 95% CI, .412–.773) and D3 (κ = .485; 95% CI, .290–.680). Correlation coefficients between the NM scores and the criterion standards did not significantly differ between P/C-reported and MD-reported NM scores. Conclusions Patients and physicians completed the English NM-Score easily and accurately. 
The English version is a reliable and valid instrument that can be used in clinical practice and research to describe the functional abilities of patients with NM diseases. PMID:24862765
How much is enough? Examining frequency criteria for NSSI disorder in adolescent inpatients.
Muehlenkamp, Jennifer J; Brausch, Amy M; Washburn, Jason J
2017-06-01
To empirically evaluate the diagnostic relevance of the proposed Diagnostic and Statistical Manual of Mental Disorders (5th ed.; DSM-5; APA, 2013) Criterion-A frequency threshold for nonsuicidal self-injury (NSSI) disorder. Archival, de-identified, self-reported clinical assessment data from 746 adolescent psychiatric patients (Mage = 14.97; 88% female; 76% White) were used. The sample was randomly split into 2 unique samples for data analyses. Measures included assessments of NSSI, proposed DSM-5 NSSI-disorder criteria, psychopathology, dysfunction, distress, functional impairment, and suicidality. Discriminant-function analyses run with Sample A identified a significant differentiation of groups based on a frequency of NSSI at 25 or more days in the past year, Λ = .814, χ2(54) = 72.59, p < .05, canonical R2 = .36. This cutoff was replicated in the second sample. All patients were coded into 1 of 3 empirically derived NSSI-frequency cutoff groups: high (>25 days), moderate (5-24 days), and low (1-4 days) and compared. The high-NSSI group scored higher on most NSSI features, including DSM-5-proposed Criterion-B and -C symptoms, depression, psychotic symptoms, substance abuse, borderline personality-disorder features, suicidal ideation, and suicide plans, than the moderate- and low-NSSI groups, who did not differ from each other on many of the variables. The currently proposed DSM-5 Criterion-A frequency threshold for NSSI disorder lacks validity and clinical utility. The field needs to consider raising the frequency threshold to ensure that a meaningful and valid set of diagnostic criteria are established, and to avoid overpathologizing individuals who infrequently engage in NSSI. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Spathis, Jemima Grace; Connick, Mark James; Beckman, Emma Maree; Newcombe, Peter Anthony; Tweedy, Sean Michael
2015-01-01
Paralympic throwing events for athletes with physical impairments comprise seated and standing javelin, shot put, discus and seated club throwing. Identification of talented throwers would enable prediction of future success and promote participation; however, a valid and reliable talent identification battery for Paralympic throwing has not been reported. This study evaluates the reliability and validity of a talent identification battery for Paralympic throws. Participants were non-disabled so that impairment would not confound analyses, and results would provide an indication of normative performance. Twenty-eight non-disabled participants (13 M; 15 F) aged 23.6 years (±5.44) performed five kinematically distinct criterion throws (three seated, two standing) and nine talent identification tests (three anthropometric, six motor); 23 were tested a second time to evaluate test-retest reliability. Talent identification test-retest reliability was evaluated using Intra-class Correlation Coefficient (ICC) and Bland-Altman plots (Limits of Agreement). Spearman's correlation assessed strength of association between criterion throws and talent identification tests. Reliability was generally acceptable (mean ICC = 0.89), but two seated talent identification tests require more extensive familiarisation. Correlation strength (mean rs = 0.76) indicated that the talent identification tests can be used to validly identify individuals with competitively advantageous attributes for each of the five kinematically distinct throwing activities. Results facilitate further research in this understudied area.
Criterion Validity of the Child's Challenging Behavior Scale, Version 2 (CCBS-2).
Bourke-Taylor, Helen M; Cordier, Reinie; Pallant, Julie F
The Child's Challenging Behavior Scale, Version 2 (CCBS-2), measures maternal rating of a child's challenging behaviors that compromise maternal mental health. The CCBS-2, the Child Behavior Checklist (CBCL), and the Strengths and Difficulties Questionnaire (SDQ) were compared in a sample of typically developing young Australian children. Criterion validity was investigated by correlating the CCBS-2 with "gold standard" measures (CBCL and SDQ subscales). Data were collected in a cross-sectional survey of mothers (N = 336) of children ages 3-9 yr. Correlations with the CBCL externalizing subscales demonstrated moderate (ρ = .46) to strong (ρ = .66) correlations. Correlations with the SDQ externalizing behaviors subscales were moderate (ρ = .35) to strong (ρ = .60). The criterion validity established in this study strengthens the psychometric properties that support ongoing development of the CCBS-2 as an efficient tool that may identify children in need of further evaluation. Copyright © 2018 by the American Occupational Therapy Association, Inc.
Nelson, Melissa C; Lytle, Leslie A
2009-04-01
Sweetened beverage and fast-food intake have been identified as important targets for obesity prevention. However, there are few brief dietary assessment tools available to evaluate these behaviors among adolescents. The objective of this research was to examine reliability and validity of a 22-item dietary screener assessing adolescent consumption of specific energy-containing and non-energy-containing beverages (nine items) and fast food (13 items). The screener was administered to adolescents (ages 11 to 18 years) recruited from the Minneapolis/St Paul, MN, metro region. One sample of adolescents completed test-retest reliability of the screener (n=33, primarily white adolescents). Another adolescent sample completed the screener along with three 24-hour dietary recalls to assess criterion validity (n=59 white adolescents). Test-retest assessments were completed approximately 7 to 14 days apart, and agreement between the two administrations of the screener was substantial, with most items yielding Spearman correlations and kappa statistics that were >0.60. When compared to the gold standard dietary recall data, findings indicate that the validity of the screener items assessing adolescents' intake of regular soda, sports drinks, milk, and water was fair. However, the differential assessment periods captured by the two methods (ie, 1 month for the screener vs 3 days for the recalls) posed challenges in analysis and made it impossible to assess the validity of some screener items. Overall while these screener items largely represent reliable measures with fair validity, our findings highlight the challenges inherent in the validation of brief dietary assessment tools.
Tokudome, Yuko; Okumura, Keiko; Kumagai, Yoshiko; Hirano, Hirohiko; Kim, Hunkyung; Morishita, Shiho; Watanabe, Yutaka
2017-11-01
Because few Japanese questionnaires assess the elderly's appetite, there is an urgent need to develop an appetite questionnaire with verified reliability, validity, and reproducibility. We translated and back-translated the Council on Nutrition Appetite Questionnaire (CNAQ), which has eight items, into Japanese (CNAQ-J), as well as the Simplified Nutritional Appetite Questionnaire (SNAQ-J), which includes four CNAQ-J-derived items. Using structural equation modeling, we examined the CNAQ-J structure based on data of 649 Japanese elderly people in 2013, including individuals having a certain degree of cognitive impairment, and we developed the SNAQ for the Japanese elderly (SNAQ-JE) according to an exploratory factor analysis. Confirmatory factor analyses on the appetite questionnaires were conducted to probe fitting to the model. We computed Cronbach's α coefficients and criterion-referenced/-related validity figures examining associations of the three appetite battery scores with body mass index (BMI) values and with nutrition-related questionnaire values. Test-retest reproducibility of appetite tools was scrutinized over an approximately 2-week interval. An exploratory factor analysis demonstrated that the CNAQ-J was constructed of one factor (appetite), yielding the SNAQ-JE, which includes four questions derived from the CNAQ-J. The three appetite instruments showed almost equivalent fitting to the model and reproducibility. The CNAQ-J and SNAQ-JE demonstrated satisfactory reliability and significant criterion-referenced/-related validity values, including BMIs, but the SNAQ-J included a low factor-loading item, exhibited less satisfactory reliability and had a non-significant relationship to BMI. The CNAQ-J and SNAQ-JE may be applied to assess the appetite of Japanese elderly, including persons with some cognitive impairment. Copyright © 2017 The Authors. Production and hosting by Elsevier B.V. All rights reserved.
Development and testing of the cancer multidisciplinary team meeting observational tool (MDT-MOT)
Harris, Jenny; Taylor, Cath; Sevdalis, Nick; Jalil, Rozh; Green, James S.A.
2016-01-01
Abstract Objective To develop a tool for independent observational assessment of cancer multidisciplinary team meetings (MDMs), and test criterion validity, inter-rater reliability/agreement and describe performance. Design Clinicians and experts in teamwork used a mixed-methods approach to develop and refine the tool. Study 1 observers rated pre-determined optimal/sub-optimal MDM film excerpts and Study 2 observers independently rated video-recordings of 10 MDMs. Setting Study 2 included 10 cancer MDMs in England. Participants Testing was undertaken by 13 health service staff and a clinical and non-clinical observer. Intervention None. Main Outcome Measures Tool development, validity, reliability/agreement and variability in MDT performance. Results Study 1: Observers were able to discriminate between optimal and sub-optimal MDM performance (P ≤ 0.05). Study 2: Inter-rater reliability was good for 3/10 domains. Percentage of absolute agreement was high (≥80%) for 4/10 domains and percentage agreement within 1 point was high for 9/10 domains. Four MDTs performed well (scored 3+ in at least 8/10 domains), 5 MDTs performed well in 6–7 domains and 1 MDT performed well in only 4 domains. Leadership and chairing of the meeting, the organization and administration of the meeting, and clinical decision-making processes all varied significantly between MDMs (P ≤ 0.01). Conclusions MDT-MOT demonstrated good criterion validity. Agreement between clinical and non-clinical observers (within one point on the scale) was high but this was inconsistent with reliability coefficients and warrants further investigation. If further validated MDT-MOT might provide a useful mechanism for the routine assessment of MDMs by the local workforce to drive improvements in MDT performance. PMID:27084499
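The Study 2 results above distinguish absolute agreement from agreement within 1 point on the rating scale. A minimal sketch of both measures, assuming two observers score the same meetings on an integer scale (the function name, the tolerance parameter and the example scores are hypothetical, not data or code from the MDT-MOT study):

```python
def percent_agreement(observer_a, observer_b, tolerance=0):
    """Percentage of paired ratings that differ by at most `tolerance` scale points."""
    if len(observer_a) != len(observer_b) or not observer_a:
        raise ValueError("both observers must rate the same, non-empty set of meetings")
    # Count pairs whose absolute difference is within the tolerance.
    hits = sum(abs(a - b) <= tolerance for a, b in zip(observer_a, observer_b))
    return 100.0 * hits / len(observer_a)


# Hypothetical domain scores from a clinical and a non-clinical observer.
clinical = [1, 2, 3, 4]
non_clinical = [1, 3, 3, 2]
exact = percent_agreement(clinical, non_clinical)                 # absolute agreement
within_one = percent_agreement(clinical, non_clinical, tolerance=1)  # agreement within 1 point
```

Because neither percentage corrects for chance or for how much the ratings vary, high agreement within one point can coexist with low reliability coefficients, which is consistent with the inconsistency the abstract flags for further investigation.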
Development and testing of the cancer multidisciplinary team meeting observational tool (MDT-MOT).
Harris, Jenny; Taylor, Cath; Sevdalis, Nick; Jalil, Rozh; Green, James S A
2016-06-01
To develop a tool for independent observational assessment of cancer multidisciplinary team meetings (MDMs), and test criterion validity, inter-rater reliability/agreement and describe performance. Clinicians and experts in teamwork used a mixed-methods approach to develop and refine the tool. Study 1 observers rated pre-determined optimal/sub-optimal MDM film excerpts and Study 2 observers independently rated video-recordings of 10 MDMs. Study 2 included 10 cancer MDMs in England. Testing was undertaken by 13 health service staff and a clinical and non-clinical observer. None. Tool development, validity, reliability/agreement and variability in MDT performance. Study 1: Observers were able to discriminate between optimal and sub-optimal MDM performance (P ≤ 0.05). Study 2: Inter-rater reliability was good for 3/10 domains. Percentage of absolute agreement was high (≥80%) for 4/10 domains and percentage agreement within 1 point was high for 9/10 domains. Four MDTs performed well (scored 3+ in at least 8/10 domains), 5 MDTs performed well in 6-7 domains and 1 MDT performed well in only 4 domains. Leadership and chairing of the meeting, the organization and administration of the meeting, and clinical decision-making processes all varied significantly between MDMs (P ≤ 0.01). MDT-MOT demonstrated good criterion validity. Agreement between clinical and non-clinical observers (within one point on the scale) was high but this was inconsistent with reliability coefficients and warrants further investigation. If further validated MDT-MOT might provide a useful mechanism for the routine assessment of MDMs by the local workforce to drive improvements in MDT performance. © The Author 2016. Published by Oxford University Press in association with the International Society for Quality in Health Care; all rights reserved.
Zin, Faridah Mohd; Hillaluddin, Azlin Hilma; Mustaffa, Jamaludin
2017-01-01
Objective: This study aims to develop, validate and determine the reliability of an interactive multimedia strategy to prevent tobacco use among the young (TUPY-S) from an adolescents’ perspective. Methods: A descriptive study design was utilized. A modular instruction guideline by Russel (1974) was followed in the entire process, comprising a feasibility study, a review of existing modules, specification of the objectives, identification of the construct criterion items, learner analysis and entry behavior specification, establishment of the sequence instruction and media selection, a tryout with students and a field test. Result: Feasibility was agreed among the researchers and the school authorities. Culturally suitable, rigorously developed tobacco use preventive strategies delivered using information technology (IT) are lacking in the literature. The objective of TUPY-S is to prevent tobacco use among adolescents living in Malaysia. Identified construct criterion items include knowledge, attitude, intention to use, self-efficacy, and refusal skill. The target population was early adolescents belonging to generation-Z. Content was developed from the adolescents’ perspective and delivered using IT in the Malay language. Content validity, assessed by six experts in the field and in module development, was good at 86%. The students’ tryout showed satisfactory face validity subjectively and objectively (85.5%) and high Cronbach's alpha reliability (0.91). Conclusion: TUPY-S was confirmed to suit early adolescents of the current generation living in Malaysia. It demonstrated good content validity among the experts, and satisfactory face validity and reliability among the target population. TUPY-S is ready to be evaluated for its effectiveness among early adolescents. PMID:28612599
Toward a Measure of Accountability in Nursing: A Three-Stage Validation Study.
Drach-Zahavy, Anat; Leonenko, Marina; Srulovici, Einav
2018-06-04
To develop and psychometrically evaluate a three-dimensional questionnaire suitable for evaluating personal and organizational accountability in nurses. Accountability is defined as a three-dimensional value, directing professionals to take responsibility for their decisions and actions, to be willing to explain them (transparency) and to be judged according to society's accepted values (answerability). Despite the relatively clear definition, measurement of accountability lags well behind. Existing self-report questionnaires do not fully capture the complexity of the concept; nor do they capture the different sources of accountability (e.g., personal accountability, organizational accountability). A three-stage measure development. Data were collected during 2015-2016. In Phase 1, an initial database of items (N = 74) was developed, based on literature review and qualitative study, establishing face and content validity. In Phase 2, the face, content, construct and criterion-related validity of the initial questionnaires (19 items for personal and organizational accountability questionnaire) was established with a sample of 229 nurses. In Phase 3, the final questionnaires (19 items each) were validated with a new sample of 329 nurses and established construct validity. The final version of the instruments comprised 19 items, suitable for assessing personal and organizational accountability. The questionnaire referred to the dimensions of responsibility, transparency and answerability. The findings established the instrument's content, construct and criterion-related validity, as well as good internal reliability. The questionnaire portrays accountability in nursing, by capturing nurses' subjective perceptions of accountability dimensions (responsibility, transparency, answerability), as demonstrated by personal and organizational values. This article is protected by copyright. All rights reserved.
[Measurement properties of self-report questionnaires published in Korean nursing journals].
Lee, Eun-Hyun; Kim, Chun-Ja; Kim, Eun Jung; Chae, Hyun-Ju; Cho, Soo-Yeon
2013-02-01
The purpose of this study was to evaluate measurement properties of self-report questionnaires for studies published in Korean nursing journals. Of 424 Korean nursing articles initially identified, 168 articles met the inclusion criteria. The methodological quality of the measurements used in the studies and interpretability were assessed using the COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) checklist. It consists of items on internal consistency, reliability, measurement error, content validity, construct validity including structural validity, hypothesis testing, cross-cultural validity, and criterion validity, and responsiveness. For each item of the COSMIN checklist, measurement properties are rated on a four-point scale: excellent, good, fair, and poor. Each measurement property is scored with worst score counts. All articles used the classical test theory for measurement properties. Internal consistency (72.6%), construct validity (56.5%), and content validity (38.2%) were most frequently reported properties being rated as 'excellent' by COSMIN checklist, whereas other measurement properties were rarely reported. A systematic review of measurement properties including interpretability of most instruments warrants further research and nursing-focused checklists assessing measurement properties should be developed to facilitate intervention outcomes across Korean studies.
Marcatto, Francesco; D'Errico, Giuseppe; Di Blas, Lisa; Ferrante, Donatella
2011-01-01
The aim of this paper is to present a preliminary validation of an Italian adaptation of the HSE Management Standards Work-Related Stress Indicator Tool (IT), an instrument for assessing work-related stress at the organizational level, originally developed in Britain by the Health and Safety Executive. A scale that assesses the physical work environment has been added to the original version of the IT. 190 employees of the University of Trieste have been enrolled in the study. A confirmatory analysis showed a satisfactory fit of the eight-factors structure of the instrument. Further psychometric analysis showed adequate internal consistency of the IT scales and good criterion validity, as evidenced by the correlations with self-perception of stress, work satisfaction and motivation. In conclusion, the Indicator Tool proved to be a valid and reliable instrument for the assessment of work-related stress at the organizational level, and it is also compatible with the instructions provided by the Ministry of Labour and Social Policy (Circular letter 18/11/2010).
[Validity and Reliability of Korean Version of the Spiritual Care Competence Scale].
Chung, Mi Ja; Park, Youngrye; Eun, Young
2016-12-01
The aim of this study was to examine the validity and reliability of the Korean Version of the Spiritual Care Competence Scale (K-SCCS). A cross-sectional study design was used. The K-SCCS consisted of 26 questions to measure spiritual care competence of nurses. Participants, 228 nurses who had more than 3 years' experience as a nurse, completed the survey. Confirmatory factor analysis was used to examine the construct validity and correlations of K-SCCS and spiritual well-being (SWB) were used to examine the criterion validity of K-SCCS. Cronbach's alpha was used to test internal consistency. The construct and the criterion-related validity of K-SCCS were supported as measures of spiritual care competence. Cronbach's alpha was .95. Factor loadings of the 26 questions ranged from .60 to .96. Construct validity of K-SCCS was verified by confirmatory factor analysis (RMSEA=.08, CFI=.90, NFI=.85). Criterion validity compared to the SWB showed significant correlation (r=.44, p<.001). The findings suggest that K-SCCS serves as an appropriate measure of spiritual care competence with validity and reliability. However, further study is needed to retest the verification of the factor analysis related to factor 2 (professionalisation and improving the quality of spiritual care) and factor 3 (personal support and patient counseling). Therefore, we recommend using the total score without distinguishing subscales.
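Several abstracts in this listing report Cronbach's alpha as their internal-consistency statistic. As a minimal sketch of how that coefficient is usually computed (the function name `cronbach_alpha` and the toy responses are illustrative assumptions, not data from any of the studies), the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores) can be written as:

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a respondents-by-items table.

    scores: list of rows, one row per respondent, one numeric
    entry per questionnaire item (all rows the same length).
    """
    k = len(scores[0])
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item (column) across respondents.
    item_variance_sum = sum(variance(column) for column in zip(*scores))
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical 5 respondents x 3 items, for illustration only.
toy_responses = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [1, 2, 1],
    [3, 3, 3],
]
alpha = cronbach_alpha(toy_responses)
print(f"alpha = {alpha:.3f}")
```

Items that move together across respondents push the total-score variance well above the sum of the item variances, which is what drives alpha toward 1.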
2013-01-01
Background A scale validated in one language is not automatically valid in another language or culture. The purpose of this study was to validate the English version of the UNESP-Botucatu multidimensional composite pain scale (MCPS) to assess postoperative pain in cats. The English version was developed using translation, back-translation, and review by individuals with expertise in feline pain management. In sequence, validity and reliability tests were performed. Results Of the three domains identified by factor analysis, the internal consistency was excellent for ‘pain expression’ and ‘psychomotor change’ (0.86 and 0.87) but not for ‘physiological variables’ (0.28). Relevant changes in pain scores at clinically distinct time points (e.g., post-surgery, post-analgesic therapy), confirmed the construct validity and responsiveness (Wilcoxon test, p < 0.001). Favorable correlation with the IVAS scores (p < 0.001) and moderate to very good agreement between blinded observers and ‘gold standard’ evaluations, supported criterion validity. The cut-off point for rescue analgesia was > 7 (range 0–30 points) with 96.5% sensitivity and 99.5% specificity. Conclusions The English version of the UNESP-Botucatu-MCPS is a valid, reliable and responsive instrument for assessing acute pain in cats undergoing ovariohysterectomy, when used by anesthesiologists or anesthesia technicians. The cut-off point for rescue analgesia provides an additional tool for guiding analgesic therapy. PMID:23867090
Brondani, Juliana T; Mama, Khursheed R; Luna, Stelio P L; Wright, Bonnie D; Niyom, Sirirat; Ambrosio, Jennifer; Vogel, Pamela R; Padovani, Carlos R
2013-07-17
A scale validated in one language is not automatically valid in another language or culture. The purpose of this study was to validate the English version of the UNESP-Botucatu multidimensional composite pain scale (MCPS) to assess postoperative pain in cats. The English version was developed using translation, back-translation, and review by individuals with expertise in feline pain management. In sequence, validity and reliability tests were performed. Of the three domains identified by factor analysis, the internal consistency was excellent for 'pain expression' and 'psychomotor change' (0.86 and 0.87) but not for 'physiological variables' (0.28). Relevant changes in pain scores at clinically distinct time points (e.g., post-surgery, post-analgesic therapy), confirmed the construct validity and responsiveness (Wilcoxon test, p < 0.001). Favorable correlation with the IVAS scores (p < 0.001) and moderate to very good agreement between blinded observers and 'gold standard' evaluations, supported criterion validity. The cut-off point for rescue analgesia was > 7 (range 0-30 points) with 96.5% sensitivity and 99.5% specificity. The English version of the UNESP-Botucatu-MCPS is a valid, reliable and responsive instrument for assessing acute pain in cats undergoing ovariohysterectomy, when used by anesthesiologists or anesthesia technicians. The cut-off point for rescue analgesia provides an additional tool for guiding analgesic therapy.
Robertson, Samuel J; Burnett, Angus F; Cochrane, Jodie
2014-04-01
A high level of participant skill is influential in determining the outcome of many sports. Thus, tests assessing skill outcomes in sport are commonly used by coaches and researchers to estimate an athlete's ability level, to evaluate the effectiveness of interventions or for the purpose of talent identification. The objective of this systematic review was to examine the methodological quality, measurement properties and feasibility characteristics of sporting skill outcome tests reported in the peer-reviewed literature. A search of both SPORTDiscus and MEDLINE databases was undertaken. Studies that examined tests of sporting skill outcomes were reviewed. Only studies that investigated measurement properties of the test (reliability or validity) were included. A total of 22 studies met the inclusion/exclusion criteria. A customised checklist of assessment criteria, based on previous research, was utilised for the purpose of this review. A range of sports were the subject of the 22 studies included in this review, with considerations relating to methodological quality being generally well addressed by authors. A range of methods and statistical procedures were used by researchers to determine the measurement properties of their skill outcome tests. The majority (95%) of the reviewed studies investigated test-retest reliability, and where relevant, inter and intra-rater reliability was also determined. Content validity was examined in 68% of the studies, with most tests investigating multiple skill domains relevant to the sport. Only 18% of studies assessed all three reviewed forms of validity (content, construct and criterion), with just 14% investigating the predictive validity of the test. Test responsiveness was reported in only 9% of studies, whilst feasibility received varying levels of attention. In organised sport, further tests may exist which have not been investigated in this review. 
This could be due to such tests firstly not being published in the peer-review literature and secondly, not having their measurement properties (i.e., reliability or validity) examined formally. Of the 22 studies included in this review, items relating to test methodological quality were, on the whole, well addressed. Test-retest reliability was determined in all but one of the reviewed studies, whilst most studies investigated at least two aspects of validity (i.e., content, construct or criterion-related validity). Few studies examined predictive validity or responsiveness. While feasibility was addressed in over half of the studies, practicality and test limitations were rarely addressed. Consideration of study quality, measurement properties and feasibility components assessed in this review can assist future researchers when developing or modifying tests of sporting skill outcomes.
ERIC Educational Resources Information Center
Shriver, Edgar L.; Foley, John P., Jr.
A battery of criterion referenced Job Task Performance Tests (JTPT) was developed because paper and pencil tests of job knowledge and electronic theory had very poor criterion-related or empirical validity with respect to the ability of electronic maintenance men to perform their job. Although the original JTPT required the use of actual…
Ten Issues in Criterion-Referenced Testing: A Response to Commonly Heard Criticisms.
ERIC Educational Resources Information Center
Curlette, William L.; Stallings, William M.
1979-01-01
The 10 criticisms of criterion-referenced tests addressed in this paper are: the domains tested; pedagogical influence; difficulty of items; cumbersome reports; reliability; arbitrary criteria; local objectives; labeling; predictive validity; and repeated testing. (SJL)
Procedures for Constructing and Using Criterion-Referenced Performance Tests.
ERIC Educational Resources Information Center
Campbell, Clifton P.; Allender, Bill R.
1988-01-01
Criterion-referenced performance tests (CRPT) provide a realistic method for objectively measuring task proficiency against predetermined attainment standards. This article explains the procedures of constructing, validating, and scoring CRPTs and includes a checklist for a welding test. (JOW)
Stefanatou, Pentagiotissa; Giannouli, Eleni; Konstantakopoulos, George; Vitoratou, Silia; Mavreas, Venetsanos
2014-11-01
Evaluation of mental health services based on patients' needs assessments has never taken place in Greece, although it is a crucial factor for the efficient use of their limited resources. To examine the inter-rater and test-retest reliability and the concurrent/convergent validity of the Greek research version of the Camberwell Assessment of Need-Research (CAN-R). A total of 53 schizophrenic patient-staff pairs were interviewed twice to test the inter-rater and test-retest reliability of the Greek version of the CAN-R. The World Health Organization Quality of Life-Brief Form (WHOQOL-BREF) and World Health Organization Disability Assessment Schedule-2.0 (WHODAS-2.0) were administered to the patients to examine concurrent validity. The inter-rater and test-retest reliability of patient and staff interviews for the 22 individual items and the eight summary scores of the instrument's four sections were good to excellent. Significant correlations emerged between CAN scores and the WHOQOL-BREF and WHODAS-2.0 domains for both patient and staff ratings, indicating good concurrent validity. Our results suggest that the Greek version of the CAN-R is a reliable instrument for assessing mental health patients' needs. Moreover, it is the first CAN-R validity study with satisfactory results using WHOQOL-BREF and WHODAS-2.0 as criterion variables. © The Author(s) 2013.
Nikjooy, Afsaneh; Jafari, Hassan; Saba, Maryam A; Ebrahimi, Naghmeh; Mirzaei, Rezvan
2018-05-01
The Patient Assessment of Constipation Quality of Life (PAC-QOL) questionnaire is the most validated and the most specific tool for measuring the quality of life of patients with constipation. Over 120 million people live in countries whose official language is Persian. There is no reported Persian version of the PAC-QOL questionnaire yet. The aim of this study was to translate and culturally adapt the PAC-QOL questionnaire and to assess its reliability and validity among Persian patients with chronic constipation. Following the translation and cultural adaptation of the PAC-QOL questionnaire to Persian, 100 patients (mean±SD age=40.51±13.67) with constipation were recruited for validity measurement and 20 patients were re-examined for reliability. Content validity was assessed based on the opinions of an expert committee and the floor/ceiling effect. Construct validity was evaluated according to the hypothesis test. The SF-36 questionnaire was used for concurrent criterion validity, intra-class correlation coefficient for reliability, and Cronbach's alpha for internal consistency. The content validity of the PAC-QOL questionnaire was proven, and there was no floor/ceiling effect. Construct validity also was confirmed based on the hypothesis test. The overall Cronbach's alpha of the PAC-QOL questionnaire was 0.92 (range=0.72-0.92), and the overall intra-class correlation coefficient of the questionnaire was 0.88 (range=0.69-0.87). The correlation between the SF-36 and PAC-QOL questionnaires was moderate. The Persian version of the PAC-QOL questionnaire demonstrated good validity and reliability properties in chronic constipation. Accordingly, Persian researchers and clinicians can benefit from this questionnaire in further research and assessment of treatment outcomes.
[Examination of the criterion validity of the MMPI-2 Depression, Anxiety, and Anger Content scales].
Uluç, Sait
2008-01-01
Examination of the psychometric properties and content areas of the revised MMPI's (MMPI-2 [Minnesota Multiphasic Personality Inventory-2]) content scales is required. In this study the criterion-related validity of the MMPI-2 Depression, Anxiety, and Anger Content scales was examined using the following conceptually relevant scales: The Beck Depression Inventory (BDI), Beck Anxiety Inventory (BAI), and State-Trait Anger Scale (STAS). MMPI-2 Depression, Anxiety, and Anger Content scales, and BDI, BAI, and STAS were administered to a sample of 196 students at Middle East Technical University (n = 196; 122 female, 74 male). Regression analyses were performed to determine if these conceptually relevant scales contributed significantly beyond the content scales. The MMPI-2 Depression Content Scale was compared to BDI, the MMPI-2 Anxiety Scale was compared to BAI, and the MMPI-2 Anger Content Scale was compared to STAS. The internal consistency of the MMPI-2 Depression Content Scale (alpha = 0.82), the MMPI-2 Anxiety Content Scale (alpha = 0.73), and the MMPI-2 Anger Content Scale (alpha = 0.72) was obtained. Criterion validity of the 3 analyzed content scales was demonstrated for both males and females. The findings indicated that (1) the MMPI-2 Depression Content Scale provides information about the general level of depression, (2) the MMPI-2 Anxiety Content Scale assesses subjective anxiety rather than somatic anxiety, and (3) the MMPI-2 Anger Content Scale may provide information about the potential to act out. The findings also provide further evidence that the 3 conceptually relevant scales aid in the interpretation of MMPI-2 scores by contributing additional information beyond the clinical scales.
Miller, Joshua D; McCain, Jessica; Lynam, Donald R; Few, Lauren R; Gentile, Brittany; MacKillop, James; Campbell, W Keith
2014-09-01
The growing interest in the study of narcissism has resulted in the development of a number of assessment instruments that manifest only modest to moderate convergence. The present studies adjudicate among these measures with regard to criterion validity. In the 1st study, we compared multiple narcissism measures to expert consensus ratings of the personality traits associated with narcissistic personality disorder (NPD; Study 1; N = 98 community participants receiving psychological/psychiatric treatment) according to the Diagnostic and Statistical Manual of Mental Disorders (4th ed., text rev.; DSM-IV-TR; American Psychiatric Association, 2000) using 5-factor model traits as well as the traits associated with the pathological trait model according to the Diagnostic and Statistical Manual of Mental Disorders (5th ed.; American Psychiatric Association, 2013). In Study 2 (N = 274 undergraduates), we tested the criterion validity of an even larger set of narcissism instruments by examining their relations with measures of general and pathological personality, as well as psychopathology, and compared the resultant correlations to the correlations expected by experts for measures of grandiose and vulnerable narcissism. Across studies, the grandiose dimensions from the Five-Factor Narcissism Inventory (FFNI; Glover, Miller, Lynam, Crego, & Widiger, 2012) and the Narcissistic Personality Inventory (Raskin & Terry, 1988) provided the strongest match to expert ratings of DSM-IV-TR NPD and grandiose narcissism, whereas the vulnerable dimensions of the FFNI and the Pathological Narcissism Inventory (Pincus et al., 2009), as well as the Hypersensitive Narcissism Scale (Hendin & Cheek, 1997), provided the best match to expert ratings of vulnerable narcissism. These results should help guide researchers toward the selection of narcissism instruments that are most well suited to capturing different aspects of narcissism. PsycINFO Database Record (c) 2014 APA, all rights reserved.
Crosby, Richard A.; Graham, Cynthia A.; Yarber, William L.; Sanders, Stephanie A.; Milhausen, Robin R.; Mena, Leandro
2015-01-01
Objective To construct and test measures of psychosocial mediators that could be used in intervention studies seeking to promote safer sex behavior among young Black men who have sex with men (YBMSM). Methods YBMSM (N=400), ages 18–29 years, were recruited from an STI clinic in the Southern U.S. All men had engaged in penile-anal sex with a male as a “top” in the past 6 months. Men completed an audio-computer assisted self-interview and provided specimens used for NAAT testing to detect Chlamydia and gonorrhea. Four measures were constructed and tested for criterion validity (Safer Sex Communication, Condom Turn-Offs, Condom Pleasure Scale, and a single item assessing frequency of condom use discussions before sexual arousal). Results With the exception of Safer Sex Communication, all of the measures showed criterion validity for both unprotected anal insertive and unprotected anal receptive sex. With the exception of the Condom Turn-Offs, the three other measures were supported by criterion validity for oral sex. Both the Condom Turn-Offs and Condom Pleasure Scale were significantly related to whether or not men reported multiple partners as a “top” but only the Condom Pleasure Scale was associated with reports of multiple partners as a “bottom.” Only the Condom Turn-Offs Scale was positively associated with having been diagnosed with either Chlamydia or gonorrhea. Conclusion Findings provide three brief scales and a single item that can be used in intervention studies targeting YBMSM. Perceptions of condoms as a turn-off and of condoms as enhancing pleasure showed strong associations with sexual risk behaviors. PMID:26766525
NASA Astrophysics Data System (ADS)
Wang, Cong; Shang, De-Guang; Wang, Xiao-Wei
2015-02-01
An improved high-cycle multiaxial fatigue criterion based on the critical plane was proposed in this paper. The critical plane was defined as the plane of maximum shear stress (MSS) in the proposed multiaxial fatigue criterion, which is different from the traditional critical plane based on the MSS amplitude. The proposed criterion was extended as a fatigue life prediction model that can be applicable for ductile and brittle materials. The fatigue life prediction model based on the proposed high-cycle multiaxial fatigue criterion was validated with experimental results obtained from the test of 7075-T651 aluminum alloy and some references.
Huber, J; Hüsler, J; Dieppe, P; Günther, K P; Dreinhöfer, K; Judge, A
2016-03-01
To validate a new method to identify responders (relative effect per patient (REPP) >0.2) using the OMERACT-OARSI criteria as gold standard in a large multicentre sample. The REPP ([score before - after treatment]/score before treatment) was calculated for 845 patients of a large multicenter European cohort study for THR. The patients with a REPP >0.2 were defined as responders. The responder rate was compared to the gold standard (OMERACT-OARSI criteria) using receiver operator characteristic (ROC) curve analysis for sensitivity, specificity and percentage of appropriately classified patients. With the criterion REPP >0.2, 85.4% of the patients were classified as responders; applying the OARSI-OMERACT criteria, 85.7%. The new method had 98.8% sensitivity, 94.2% specificity and 98.1% of the patients were correctly classified compared to the gold standard. The external validation showed a high sensitivity and also specificity of a new criterion to identify a responder compared to the gold standard method. It is simple and has no uncertainties due to a single classification criterion. Copyright © 2015 The Authors. Published by Elsevier Ltd. All rights reserved.
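The REPP definition in the abstract above is simple enough to sketch directly. The following is an illustrative example only: the function names, the toy scores, and the assumption that a higher score means worse symptoms (so that improvement gives a positive REPP) are not taken from the paper. It computes REPP = (score before - score after) / score before, applies the 0.2 threshold, and compares the resulting labels with a reference classification.

```python
def repp(before, after):
    """Relative effect per patient: (before - after) / before.

    Assumes higher scores mean worse symptoms, so improvement
    yields a positive value.
    """
    if before == 0:
        raise ValueError("REPP is undefined for a baseline score of 0")
    return (before - after) / before


def is_responder(before, after, threshold=0.2):
    """Responder if REPP strictly exceeds the threshold (0.2 in the abstract)."""
    return repp(before, after) > threshold


def sensitivity_specificity(predicted, reference):
    """Compare predicted labels against a reference (gold standard)."""
    pairs = list(zip(predicted, reference))
    tp = sum(1 for p, r in pairs if p and r)
    fn = sum(1 for p, r in pairs if not p and r)
    tn = sum(1 for p, r in pairs if not p and not r)
    fp = sum(1 for p, r in pairs if p and not r)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical (before, after) scores and gold-standard labels.
toy_scores = [(60, 30), (40, 38), (50, 35), (80, 70), (30, 20)]
toy_reference = [True, False, True, True, False]

toy_predicted = [is_responder(b, a) for b, a in toy_scores]
sens, spec = sensitivity_specificity(toy_predicted, toy_reference)
print(toy_predicted, sens, spec)
```

Because the rule uses a single ratio per patient, the only tunable choice is the threshold, which is the simplicity the abstract points to.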
Link, William; Sauer, John R.
2016-01-01
The analysis of ecological data has changed in two important ways over the last 15 years. The development and easy availability of Bayesian computational methods has allowed and encouraged the fitting of complex hierarchical models. At the same time, there has been increasing emphasis on acknowledging and accounting for model uncertainty. Unfortunately, the ability to fit complex models has outstripped the development of tools for model selection and model evaluation: familiar model selection tools such as Akaike's information criterion and the deviance information criterion are widely known to be inadequate for hierarchical models. In addition, little attention has been paid to the evaluation of model adequacy in context of hierarchical modeling, i.e., to the evaluation of fit for a single model. In this paper, we describe Bayesian cross-validation, which provides tools for model selection and evaluation. We describe the Bayesian predictive information criterion and a Bayesian approximation to the BPIC known as the Watanabe-Akaike information criterion. We illustrate the use of these tools for model selection, and the use of Bayesian cross-validation as a tool for model evaluation, using three large data sets from the North American Breeding Bird Survey.
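For readers unfamiliar with the Watanabe-Akaike information criterion mentioned above, here is a minimal sketch of the commonly used formula, WAIC = -2 * (lppd - p_waic), where lppd is the log pointwise predictive density and p_waic is the summed posterior variance of the pointwise log-likelihoods. This is an assumption-laden illustration, not the authors' implementation: the function names and the tiny log-likelihood matrix are hypothetical, and in practice the matrix would come from posterior draws of a fitted model.

```python
import math
from statistics import variance


def log_mean_exp(values):
    """Numerically stable log of the mean of exp(values)."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values) / len(values))


def waic(log_lik):
    """WAIC on the deviance scale.

    log_lik[s][i] is the log-likelihood of observation i under
    posterior draw s (at least two draws are required).
    Returns (waic, lppd, p_waic).
    """
    n_obs = len(log_lik[0])
    columns = [[draw[i] for draw in log_lik] for i in range(n_obs)]
    lppd = sum(log_mean_exp(col) for col in columns)
    p_waic = sum(variance(col) for col in columns)
    return -2.0 * (lppd - p_waic), lppd, p_waic


# Hypothetical matrix: 3 posterior draws x 2 observations.
toy_log_lik = [
    [-1.0, -2.1],
    [-1.2, -1.9],
    [-0.9, -2.0],
]
toy_waic, toy_lppd, toy_p = waic(toy_log_lik)
print(toy_waic, toy_lppd, toy_p)
```

Lower values indicate better expected predictive performance; the variance term acts as the effective-parameter penalty.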
Castillo-Tandazo, Wilson; Flores-Fortty, Adolfo; Feraud, Lourdes; Tettamanti, Daniel
2013-01-01
Purpose To translate, cross-culturally adapt, and validate the Questionnaire for Diabetes-Related Foot Disease (Q-DFD), originally created and validated in Australia, for its use in Spanish-speaking patients with diabetes mellitus. Patients and methods The translation and cross-cultural adaptation were based on international guidelines. The Spanish version of the survey was applied to a community-based (sample A) and a hospital clinic-based sample (samples B and C). Samples A and B were used to determine criterion and construct validity comparing the survey findings with clinical evaluation and medical records, respectively; while sample C was used to determine intra- and inter-rater reliability. Results After completing the rigorous translation process, only four items were considered problematic and required a new translation. In total, 127 patients were included in the validation study: 76 to determine criterion and construct validity and 41 to establish intra- and inter-rater reliability. For an overall diagnosis of diabetes-related foot disease, a substantial level of agreement was obtained when we compared the Q-DFD with the clinical assessment (kappa 0.77, sensitivity 80.4%, specificity 91.5%, positive likelihood ratio [LR+] 9.46, negative likelihood ratio [LR−] 0.21); while an almost perfect level of agreement was obtained when it was compared with medical records (kappa 0.88, sensitivity 87%, specificity 97%, LR+ 29.0, LR− 0.13). Survey reliability showed substantial levels of agreement, with kappa scores of 0.63 and 0.73 for intra- and inter-rater reliability, respectively. Conclusion The translated and cross-culturally adapted Q-DFD showed good psychometric properties (validity, reproducibility, and reliability) that allow its use in Spanish-speaking diabetic populations. PMID:24039434
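The likelihood ratios quoted in the abstract above follow directly from the reported sensitivity and specificity: LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity. A short sketch (the function name is an illustrative assumption) shows the arithmetic using the figures reported for the two comparisons:

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) from sensitivity and specificity as proportions."""
    lr_positive = sensitivity / (1 - specificity)
    lr_negative = (1 - sensitivity) / specificity
    return lr_positive, lr_negative


# Q-DFD versus clinical assessment: sensitivity 80.4%, specificity 91.5%.
lr_pos_clinical, lr_neg_clinical = likelihood_ratios(0.804, 0.915)

# Q-DFD versus medical records: sensitivity 87%, specificity 97%.
lr_pos_records, lr_neg_records = likelihood_ratios(0.87, 0.97)

print(round(lr_pos_clinical, 2), round(lr_neg_clinical, 2))
print(round(lr_pos_records, 1), round(lr_neg_records, 2))
```

After rounding, these agree with the values in the abstract (9.46 and 0.21; 29.0 and 0.13).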
Saadatpour, Leila; Hemati, Simin; Habibi, Farzaneh; Behzadi, Erfan; Hashemi-Jazi, Marsa Sadat; Kheirabadi, Gholamreza; Mirbagher, Leila; Gholamrezaei, Ali
2015-09-01
Various symptoms frequently affect cancer patients' quality of life. Appropriate assessment of these symptoms provides valuable data for cancer management. This study aimed to validate the Persian version of the M. D. Anderson Symptom Inventory (MDASI-P). This cross-sectional study was conducted at four cancer treatment centers in two cities in Iran. Breast cancer and colorectal cancer patients aged 18 years and older were consecutively included in the study. The standard forward-backward translation method was applied. Patients completed the MDASI-P along with the previously validated Persian version of the European Organization for Research and Treatment of Cancer Quality of Life Questionnaire-Core 30 (EORTC QLQ-C30). Construct validity (factor analysis), criterion validity (against the EORTC QLQ-C30), and reliability (Cronbach's alpha) were analyzed. A total of 146 breast cancer and 94 colorectal cancer patients were studied. Factor analysis for the symptom severity items resulted in a three-factor solution, further reduced to a two-factor solution: general symptoms and gastrointestinal symptoms. Correlation of the MDASI-P symptom severity items with corresponding EORTC QLQ-C30 symptom items (r = 0.48-0.75) and MDASI-P interference items with corresponding EORTC QLQ-C30 functioning domains (r = -0.46 to -0.23) supported the criterion validity. Cronbach's alpha was 0.90, 0.88, and 0.77 for the total questionnaire, symptom severity items, and the interference subscale, respectively. The MDASI-P is a feasible, valid, and reliable instrument for evaluation of symptoms in Persian-speaking cancer patients and can be used to improve symptom management in these patients. Copyright © 2015 American Academy of Hospice and Palliative Medicine. Published by Elsevier Inc. All rights reserved.
Measuring student learning using initial and final concept test in an STEM course
NASA Astrophysics Data System (ADS)
Kaw, Autar; Yalcin, Ali
2012-06-01
Effective assessment is a cornerstone in measuring student learning in higher education. For a course in Numerical Methods, a concept test was used as an assessment tool to measure student learning and its improvement during the course. The concept test comprised 16 multiple choice questions and was given at the beginning and end of the class for three semesters. Values of Hake's gain index, a measure of learning gains from pre- to post-tests, of 0.36 to 0.41 were recorded. The validity and reliability of the concept test were checked via standard measures such as Cronbach's alpha, content and criterion-related validity, item characteristic curves and difficulty and discrimination indices. The performance of various subgroups, defined by pre-requisite grades, transfer status, gender and age, was also studied.
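Hake's gain index is conventionally defined as the fraction of the possible improvement that was actually achieved: g = (post - pre) / (max - pre). The sketch below assumes class-average scores expressed as percentages; the function name and the example averages are hypothetical and are not the study's data.

```python
def hake_gain(pre_percent, post_percent, max_percent=100.0):
    """Normalized gain: achieved improvement over possible improvement."""
    if pre_percent >= max_percent:
        raise ValueError("gain is undefined when the pre-test is at the maximum")
    return (post_percent - pre_percent) / (max_percent - pre_percent)


# Hypothetical class averages on a pre- and post-course concept test.
example_gain = hake_gain(pre_percent=40.0, post_percent=62.0)
print(f"g = {example_gain:.2f}")
```

Normalizing by the room left for improvement makes gains comparable across cohorts that start from different pre-test levels.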
Criterion-Referenced and Norm-Referenced Assessments: Compatibility and Complementarity
ERIC Educational Resources Information Center
Lok, Beatrice; McNaught, Carmel; Young, Kenneth
2016-01-01
The tension between criterion-referenced and norm-referenced assessment is examined in the context of curriculum planning and assessment in outcomes-based approaches to higher education. This paper argues the importance of a criterion-referenced assessment approach once an outcomes-based approach has been adopted. It further discusses the…
Developing and testing the patient-centred innovation questionnaire for hospital nurses.
Huang, Ching-Yuan; Weng, Rhay-Hung; Wu, Tsung-Chin; Lin, Tzu-En; Hsu, Ching-Tai; Hung, Chiu-Hsia; Tsai, Yu-Chen
2018-03-01
Develop the patient-centred innovation questionnaire for hospital nurses and establish its validity and reliability. Patient-centred care has been adopted by health care managers in their efforts to improve health care quality. It is regarded as a core concept for developing innovation. A cross-sectional study was employed to collect data from hospital nurses in Taiwan. This study was divided into two stages: pilot study and main study. In the main study, 596 valid responses were collected. This study adopted reliability analysis, exploratory factor analysis, confirmatory factor analysis and selected nurse innovation scale as a criterion to test criterion-related validity. A five-dimension patient-centred innovation questionnaire was proposed: access and practicability, co-ordination and communication, sharing power and responsibility, care continuity, family and person focus. Each dimension demonstrated a reliability of 0.89-0.98. All dimensions had acceptable convergent and discriminant validity. The patient-centred innovation questionnaire and nurse innovation scale exhibited a significantly positive correlation. The patient-centred innovation questionnaire not only had a good theoretical basis but also had sufficient reliability, construct validity and criterion-related validity. The patient-centred innovation questionnaire could give a measure for evaluating the implementation of patient-centred care and could be used as a management tool during the process of nurse innovation. © 2017 John Wiley & Sons Ltd.
29 CFR 1607.5 - General standards for validity studies.
Code of Federal Regulations, 2010 CFR
2010-07-01
... A. Acceptable types of validity studies. For the purposes of satisfying these guidelines, users may rely upon criterion-related validity studies, content validity studies or construct validity...
ERIC Educational Resources Information Center
Zwick, Rebecca
2012-01-01
Differential item functioning (DIF) analysis is a key component in the evaluation of the fairness and validity of educational tests. The goal of this project was to review the status of ETS DIF analysis procedures, focusing on three aspects: (a) the nature and stringency of the statistical rules used to flag items, (b) the minimum sample size…
Validating SPICES as a Screening Tool for Frailty Risks among Hospitalized Older Adults
Aronow, Harriet Udin; Borenstein, Jeff; Haus, Flora; Braunstein, Glenn D.; Bolton, Linda Burnes
2014-01-01
Older patients are vulnerable to adverse hospital events related to frailty. SPICES, a common screening protocol to identify risk factors in older patients, alerts nurses to initiate care plans to reduce the probability of patient harm. However, there is little published validating the association between SPICES and measures of frailty and adverse outcomes. This paper used data from a prospective cohort study on frailty among 174 older adult inpatients to validate SPICES. Almost all patients met one or more SPICES criteria. The sum of SPICES was significantly correlated with age and other well-validated assessments for vulnerability, comorbid conditions, and depression. Individuals meeting two or more SPICES criteria had a risk of adverse hospital events three times greater than individuals with either no or one criterion. Results suggest that as a screening tool used within 24 hours of admission, SPICES is both valid and predictive of adverse events. PMID:24876954
Construction and Initial Validation of the Multiracial Experiences Measure (MEM)
Yoo, Hyung Chol; Jackson, Kelly; Guevarra, Rudy P.; Miller, Matthew J.; Harrington, Blair
2015-01-01
This article describes the development and validation of the Multiracial Experiences Measure (MEM): a new measure that assesses uniquely racialized risks and resiliencies experienced by individuals of mixed racial heritage. Across two studies, there was evidence for the validation of the 25-item MEM with 5 subscales including Shifting Expressions, Perceived Racial Ambiguity, Creating Third Space, Multicultural Engagement, and Multiracial Discrimination. The 5-subscale structure of the MEM was supported by a combination of exploratory and confirmatory factor analyses. Evidence of criterion-related validity was partially supported with MEM subscales correlating with measures of racial diversity in one’s social network, color-blind racial attitude, psychological distress, and identity conflict. Evidence of discriminant validity was supported with MEM subscales not correlating with impression management. Implications for future research and suggestions for utilization of the MEM in clinical practice with multiracial adults are discussed. PMID:26460977
Construction and initial validation of the Multiracial Experiences Measure (MEM).
Yoo, Hyung Chol; Jackson, Kelly F; Guevarra, Rudy P; Miller, Matthew J; Harrington, Blair
2016-03-01
This article describes the development and validation of the Multiracial Experiences Measure (MEM): a new measure that assesses uniquely racialized risks and resiliencies experienced by individuals of mixed racial heritage. Across 2 studies, there was evidence for the validation of the 25-item MEM with 5 subscales including Shifting Expressions, Perceived Racial Ambiguity, Creating Third Space, Multicultural Engagement, and Multiracial Discrimination. The 5-subscale structure of the MEM was supported by a combination of exploratory and confirmatory factor analyses. Evidence of criterion-related validity was partially supported with MEM subscales correlating with measures of racial diversity in one's social network, color-blind racial attitude, psychological distress, and identity conflict. Evidence of discriminant validity was supported with MEM subscales not correlating with impression management. Implications for future research and suggestions for utilization of the MEM in clinical practice with multiracial adults are discussed. (c) 2016 APA, all rights reserved.
2011-01-01
Background The lack of culturally adapted and validated instruments for child mental health and psychosocial support in low and middle-income countries is a barrier to assessing prevalence of mental health problems, evaluating interventions, and determining program cost-effectiveness. Alternative procedures are needed to validate instruments in these settings. Methods Six criteria are proposed to evaluate cross-cultural validity of child mental health instruments: (i) purpose of instrument, (ii) construct measured, (iii) contents of construct, (iv) local idioms employed, (v) structure of response sets, and (vi) comparison with other measurable phenomena. These criteria are applied to transcultural translation and alternative validation for the Depression Self-Rating Scale (DSRS) and Child PTSD Symptom Scale (CPSS) in Nepal, which recently suffered a decade of war including conscription of child soldiers and widespread displacement of youth. Transcultural translation was conducted with Nepali mental health professionals and six focus groups with children (n = 64) aged 11-15 years old. Because of the lack of child mental health professionals in Nepal, a psychosocial counselor performed an alternative validation procedure using psychosocial functioning as a criterion for intervention. The validation sample was 162 children (11-14 years old). The Kiddie-Schedule for Affective Disorders and Schizophrenia (K-SADS) and Global Assessment of Psychosocial Disability (GAPD) were used to derive indication for treatment as the external criterion. Results The instruments displayed moderate to good psychometric properties: DSRS (area under the curve (AUC) = 0.82, sensitivity = 0.71, specificity = 0.81, cutoff score ≥ 14); CPSS (AUC = 0.77, sensitivity = 0.68, specificity = 0.73, cutoff score ≥ 20). 
The DSRS items with significant discriminant validity were "having energy to complete daily activities" (DSRS.7), "feeling that life is not worth living" (DSRS.10), and "feeling lonely" (DSRS.15). The CPSS items with significant discriminant validity were nightmares (CPSS.2), flashbacks (CPSS.3), traumatic amnesia (CPSS.8), feelings of a foreshortened future (CPSS.12), and easily irritated at small matters (CPSS.14). Conclusions Transcultural translation and alternative validation feasibly can be performed in low clinical resource settings through task-shifting the validation process to trained mental health paraprofessionals using structured interviews. This process is helpful to evaluate cost-effectiveness of psychosocial interventions. PMID:21816045
[Validity of four questionnaires to assess physical activity in Spanish adolescents].
Martínez-Gómez, David; Martínez-De-Haro, Vicente; Del-Campo, Juan; Zapatera, Belén; Welk, Gregory J; Villagra, Ariel; Marcos, Ascensión; Veiga, Oscar L
2009-01-01
The physical activity (PA) levels of Spanish adolescents must be determined to assess how the lack of PA may affect the increasing prevalence of obesity. Thus, to assess PA in this age range valid measurement instruments are essential. The aim of this study was to evaluate the validity of four easily applied questionnaires (the enKid and FITNESSGRAM questions, the Patient-Centered Assessment and Counselling [PACE] questionnaire, and an activity rating) to assess PA in Spanish adolescents by using an accelerometer as the criterion instrument. A total of 232 adolescents (113 girls) completed the questionnaires and wore an ActiGraph accelerometer for 7 consecutive days. Spearman's correlation coefficient (rho) was used to compare the questionnaires and total PA, moderate PA, vigorous PA and moderate-to-vigorous PA (MVPA) assessed by the accelerometer. All the questionnaires showed moderate correlations when compared against total PA (rho=0.36-0.43) and MVPA (rho=0.34-0.46) obtained by the accelerometer in the total sample. Higher correlations were found when comparing the questionnaires against vigorous PA (rho=0.42-0.51) than against moderate PA (rho=0.15-0.17). The FITNESSGRAM question and the PACE questionnaire obtained weak correlations in girls and the enKid question and activity rating were moderately correlated for boys and girls. The four questionnaires evaluated showed acceptable validity in the assessment of PA in the Spanish adolescent population.
Validation of Cost-Effectiveness Criterion for Evaluating Noise Abatement Measures
DOT National Transportation Integrated Search
1999-04-01
This project will provide the Texas Department of Transportation (TxDOT) with information about the effects of the current cost-effectiveness criterion. The project has reviewed (1) the cost-effectiveness criteria used by other states, (2) the noise b...
De Cocker, K; Cardon, G; De Bourdeaudhuij, I
2006-01-01
Objectives To evaluate if inexpensive Stepping Meters are valid in counting steps in adults in free living conditions. Methods For six days, 35 healthy volunteers wore a criterion Yamax Digiwalker and five Stepping Meters every day until all 973 pedometers had been tested. Steps were recorded daily, and the differences between counts from the Digiwalker and the Stepping Meter were expressed as a percentage of the valid value of the Digiwalker step counts. The criterion used to determine if a Stepping Meter was valid was a maximum deviation of 10% from the Digiwalker step counts. Results A total of 252 (25.9%) Stepping Meters met the criterion, whereas 74.1% made an overestimation or underestimation of more than 10%. In more than one third (36.6%) of the invalid Stepping Meters, the deviation was greater than 50%. Most (64.8%) of the invalid pedometers overestimated the actual steps taken. Conclusions Inexpensive Stepping Meters cannot be used in community interventions as they will give participants the wrong message. PMID:16790485
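The validity rule described in this abstract (deviation from the criterion pedometer expressed as a percentage of the criterion count, with a 10% maximum) can be sketched as follows. This is a minimal illustration only: the function names and the daily step counts are hypothetical, and only the 10% threshold and the 252-of-973 tally come from the abstract.

```python
def percent_deviation(test_steps, criterion_steps):
    """Signed deviation of the test pedometer, as a percentage of the criterion count."""
    if criterion_steps <= 0:
        raise ValueError("criterion step count must be positive")
    return 100.0 * (test_steps - criterion_steps) / criterion_steps


def meets_criterion(test_steps, criterion_steps, max_deviation_pct=10.0):
    """True when the absolute deviation does not exceed the allowed percentage."""
    return abs(percent_deviation(test_steps, criterion_steps)) <= max_deviation_pct


# Hypothetical daily counts (not taken from the study).
print(meets_criterion(9300, 10000))   # 7% underestimation
print(meets_criterion(11500, 10000))  # 15% overestimation

# Share of devices meeting the criterion, from the counts given in the abstract.
share_valid_pct = round(100.0 * 252 / 973, 1)
print(share_valid_pct)
```

The signed deviation is kept separate from the pass/fail check so that overestimation and underestimation can be tallied separately, as the abstract does.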
Sampling bias in blending validation and a different approach to homogeneity assessment.
Kraemer, J; Svensson, J R; Melgaard, H
1999-02-01
Sampling of batches studied for validation is reported. A thief particularly suited for granules, rather than cohesive powders, was used in the study. It is shown, as has been demonstrated in the past, that traditional 1x to 3x thief sampling of a blend is biased, and that the bias decreases as the sample size increases. It is shown that taking 50 samples of tablets after blending and testing this subpopulation for normality is a discriminating manner of testing for homogeneity. As a criterion, it is better than sampling at mixer or drum stage would be even if an unbiased sampling device were available.
[Publication practices of academics in medical psychology and in psychosomatics and psychotherapy].
Decker, O; Brähler, E
2001-07-01
When qualifying for higher academic positions junior academics face increasing demands for submitting papers for publication. The criteria for assessing these publications are presently under discussion. Contributions to international English language journals are more highly regarded, and it has become indispensable to have papers published in journals listed in SCI and SSCI. The question remains whether these criteria are valid to judge academic qualifications. Whereas one criterion for validity may be the publication practice of the present academic representatives, it appears that to some extent the chairs themselves would not fulfill the requirements for academic qualification today. Results regarding this are presented and discussed.
Accuracy of clinical observations of push-off during gait after stroke.
McGinley, Jennifer L; Morris, Meg E; Greenwood, Ken M; Goldie, Patricia A; Olney, Sandra J
2006-06-01
To determine the accuracy (criterion-related validity) of real-time clinical observations of push-off in gait after stroke. Criterion-related validity study of gait observations. Rehabilitation hospital in Australia. Eleven participants with stroke and 8 treating physical therapists. Not applicable. Pearson product-moment correlation between physical therapists' observations of push-off during gait and criterion measures of peak ankle power generation from a 3-dimensional motion analysis system. A high correlation was obtained between the observational ratings and the measurements of peak ankle power generation (Pearson r = .98). The standard error of estimation of ankle power generation was .32 W/kg. Physical therapists can make accurate real-time clinical observations of push-off during gait following stroke.
Validity and extension of the SCS-CN method for computing infiltration and rainfall-excess rates
NASA Astrophysics Data System (ADS)
Mishra, Surendra Kumar; Singh, Vijay P.
2004-12-01
A criterion is developed for determining the validity of the Soil Conservation Service curve number (SCS-CN) method. According to this criterion, the existing SCS-CN method is found to be applicable when the potential maximum retention, S, is less than or equal to twice the total rainfall amount. The criterion is tested using published data of two watersheds. Separating the steady infiltration from capillary infiltration, the method is extended for predicting infiltration and rainfall-excess rates. The extended SCS-CN method is tested using 55 sets of laboratory infiltration data on soils varying from Plainfield sand to Yolo light clay, and the computed and observed infiltration and rainfall-excess rates are found to be in good agreement.
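The applicability check stated in this abstract (S no greater than twice the total rainfall) can be sketched alongside the conventional SCS-CN runoff relation. The abstract does not give the runoff equation, so the sketch assumes the commonly cited form Q = (P - Ia)^2 / (P - Ia + S) with initial abstraction Ia = 0.2 S and S = 25400/CN - 254 in millimetres. The function names and example inputs are hypothetical, and the extended method for infiltration rates is not reproduced here.

```python
def retention_from_cn(cn):
    """Potential maximum retention S in mm from a curve number (assumed S = 25400/CN - 254)."""
    if not 0 < cn <= 100:
        raise ValueError("curve number must be in (0, 100]")
    return 25400.0 / cn - 254.0


def scs_cn_runoff(p_mm, s_mm, lam=0.2):
    """Direct runoff Q in mm from rainfall P, assuming initial abstraction Ia = lam * S."""
    ia = lam * s_mm
    if p_mm <= ia:
        return 0.0  # all rainfall is abstracted before runoff begins
    return (p_mm - ia) ** 2 / (p_mm - ia + s_mm)


def within_validity_criterion(p_mm, s_mm):
    """Applicability check as stated in the abstract: S <= 2 * P."""
    return s_mm <= 2.0 * p_mm


# Hypothetical example: curve number 80 and a 50 mm storm.
s = retention_from_cn(80)
print(s, scs_cn_runoff(50.0, s), within_validity_criterion(50.0, s))
```

With these assumed inputs the same watershed would fail the check for a 20 mm storm, since S would then exceed twice the rainfall.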
Visual judgements of steadiness in one-legged stance: reliability and validity.
Haupstein, T; Goldie, P
2000-01-01
There is a paucity of information about the validity and reliability of clinicians' visual judgements of steadiness in one-legged stance. Such judgements are used frequently in clinical practice to support decisions about treatment in the fields of neurology, sports medicine, paediatrics and orthopaedics. The aim of the present study was to address the validity and reliability of visual judgements of steadiness in one-legged stance in a group of physiotherapists. A videotape of 20 five-second performances was shown to 14 physiotherapists with median clinical experience of 6.75 years. Validity of visual judgement was established by correlating scores obtained from an 11-point rating scale with criterion scores obtained from a force platform. In addition, partial correlations were used to control for the potential influence of body weight on the relationship between the visual judgements and criterion scores. Inter-observer reliability was quantified between the physiotherapists; intra-observer reliability was quantified between two tests four weeks apart. Mean criterion-related validity was high, regardless of whether body weight was controlled for statistically (Pearson's r = 0.84, 0.83, respectively). The standard error of estimating the criterion score was 3.3 newtons. Inter-observer reliability was high (ICC (2,1) = 0.81 at Test 1 and 0.82 at Test 2). Intra-observer reliability was high (on average ICC (2,1) = 0.88; Pearson's r = 0.90). The standard error of measurement for the 11-point scale was one unit. The finding of higher accuracy of making visual judgements than previously reported may be due to several aspects of design: use of a criterion score derived from the variability of the force signal which is more discriminating than variability of centre of pressure; use of a discriminating visual rating scale; specificity and clear definition of the phenomenon to be rated.
Chang, Jasper O; Levy, Susan S; Seay, Seth W; Goble, Daniel J
2014-05-01
Recent guidelines advocate sports medicine professionals to use balance tests to assess sensorimotor status in the management of concussions. The present study sought to determine whether a low-cost balance board could provide a valid, reliable, and objective means of performing this balance testing. Criterion validity testing relative to a gold standard and 7 day test-retest reliability. University biomechanics laboratory. Thirty healthy young adults. Balance ability was assessed on 2 days separated by 1 week using (1) a gold standard measure (ie, scientific grade force plate), (2) a low-cost Nintendo Wii Balance Board (WBB), and (3) the Balance Error Scoring System (BESS). Validity of the WBB center of pressure path length and BESS scores were determined relative to the force plate data. Test-retest reliability was established based on intraclass correlation coefficients. Composite scores for the WBB had excellent validity (r = 0.99) and test-retest reliability (R = 0.88). Both the validity (r = 0.10-0.52) and test-retest reliability (r = 0.61-0.78) were lower for the BESS. These findings demonstrate that a low-cost balance board can provide improved balance testing accuracy/reliability compared with the BESS. This approach provides a potentially more valid/reliable, yet affordable, means of assessing sports-related concussion compared with current methods.
López-de-Uralde-Villanueva, I; Gil-Martínez, A; Candelas-Fernández, P; de Andrés-Ares, J; Beltrán-Alacreu, H; La Touche, R
2016-12-08
The self-administered Leeds Assessment of Neuropathic Symptoms and Signs (S-LANSS) scale is a tool designed to identify patients with pain with neuropathic features. To assess the validity and reliability of the Spanish-language version of the S-LANSS scale. Our study included a total of 182 patients with chronic pain to assess the convergent and discriminant validity of the S-LANSS; the sample was increased to 321 patients to evaluate construct validity and reliability. The validated Spanish-language version of the ID-Pain questionnaire was used as the criterion variable. All participants completed the ID-Pain, the S-LANSS, and the Numerical Rating Scale for pain. Discriminant validity was evaluated by analysing sensitivity, specificity, and the area under the receiver operating characteristic curve (AUC). Construct validity was assessed with factor analysis and by comparing the odds ratio of each S-LANSS item to the total score. Convergent validity and reliability were evaluated with Pearson's r and Cronbach's alpha, respectively. The optimal cut-off point for S-LANSS was ≥12 points (AUC=.89; sensitivity=88.7; specificity=76.6). Factor analysis yielded one factor; furthermore, all items contributed significantly to the positive total score on the S-LANSS (P<.05). The S-LANSS showed a significant correlation with ID-Pain (r=.734, α=.71). The Spanish-language version of the S-LANSS is valid and reliable for identifying patients with chronic pain with neuropathic features. Copyright © 2016 Sociedad Española de Neurología. Publicado por Elsevier España, S.L.U. All rights reserved.
Ulloa, R E; Narváez, M R; Arroyo, E; del Bosque, J; de la Peña, F
2009-01-01
Teacher rating scales for the evaluation of attention deficit hyperactivity disorder (ADHD) and conduct disorders have been shown to be useful and valid tools. The Child Psychiatric Hospital Teacher Questionnaire (CPHTQ) of the Hospital Psiquiátrico Infantil Dr. Juan N. Navarro was designed for the assessment of ADHD symptoms, externalizing symptoms and school functioning difficulties of children and adolescents. Internal consistency, criterion validity, construct validity and sensitivity of the scale to changes in symptom severity were evaluated in this study. The scale was administered to 282 teachers of children and adolescents aged 5 to 17 years who attended a specialized child psychiatry unit. The validity analysis of the instrument showed that the internal consistency measured by Cronbach's alpha was 0.94. Factor analysis yielded 5 factors accounting for 59.1% of the variance: hyperactivity and conduct symptoms, predatory, conduct disorder, inattentive, poor functioning and motor disturbances. CPHTQ scores correlated positively with the Clinical Global Impression (CGI) scale in the patients' response to drug treatment. The CPHTQ shows adequate validity characteristics that demonstrate its utility in the evaluation of patients with ADHD and its comorbidity with other behavior disorders.
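Internal consistency of the kind reported here is usually summarised with Cronbach's alpha, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The sketch below is a minimal pure-Python version of that standard formula. The function name and the tiny response matrix are hypothetical and unrelated to the CPHTQ data.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    if len(rows) < 2 or len(rows[0]) < 2:
        raise ValueError("need at least two respondents and two items")
    k = len(rows[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of the summed scale score.
    total_var = variance([sum(row) for row in rows])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical 2-item scale answered by 4 respondents.
responses = [[1, 2], [2, 1], [3, 3], [4, 4]]
alpha = cronbach_alpha(responses)
print(round(alpha, 3))
```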
Sanchez-Armass, Omar; Raffaelli, Marcela; Andrade, Flavia Cristina Drumond; Wiley, Angela R; Noyola, Aida Nacielli Morales; Arguelles, Alejandra Cepeda; Aradillas-Garcia, Celia
2017-03-01
To evaluate the criterion validity and diagnostic utility of the SCOFF, a brief eating disorder (ED) screening instrument, in a Mexican sample. The study was conducted in two phases in 2012. Phase I involved the administration of self-report measures [the SCOFF and the Eating Disorder Inventory-2, (EDI-2)] to 1057 students aged 17-56 years (M age = 21.0, SD = 3.4; 67 % female) from three colleges at the Universidad Autónoma de San Luis Potosí, Mexico. In Phase II, a random subsample of these students (n = 104) participated in the eating disorder examination, a structured interview that yields ED diagnoses. Analyses were conducted to evaluate the SCOFF's criterion validity by examining (a) correlations between scores on the SCOFF and the EDI-2 and (b) the SCOFF's ability to differentiate diagnosed ED cases and non-cases. EDI-2 subscales showed high correlations with the SCOFF scores, providing initial evidence of criterion validity. A score of two points on the SCOFF optimized the sensitivity (78 %) and specificity (84 %). With this cutoff, the SCOFF correctly classified over half the cases (PPV = 58 %) and screened out the majority of non-cases (NPV = 93 %), providing further evidence of criterion validity. Analyses were repeated separately for men and women, yielding gender-specific information on the SCOFF's performance. Taken as a whole, results indicated that the SCOFF can be a useful tool for identifying Mexican university students who are at risk of eating disorders.
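The four screening statistics quoted in this abstract all derive from a 2x2 table of screen result against reference diagnosis. The sketch below shows the standard definitions. The abstract does not report the underlying table, so the counts are hypothetical, chosen only so that the rounded outputs resemble the reported percentages, and the function name is illustrative.

```python
def screening_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV and NPV from the cells of a 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),  # diagnosed cases that screen positive
        "specificity": tn / (tn + fp),  # non-cases that screen negative
        "ppv": tp / (tp + fp),          # positive screens that are true cases
        "npv": tn / (tn + fn),          # negative screens that are true non-cases
    }


# Hypothetical counts for a subsample of 104 (not reported in the abstract).
metrics = screening_metrics(tp=18, fp=13, fn=5, tn=68)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.0f}%")
```

Unlike sensitivity and specificity, PPV and NPV depend on how common the condition is in the sample, so they would shift if the same cutoff were applied to a population with a different prevalence.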