Evaluation of Criterion Validity for Scales with Congeneric Measures
ERIC Educational Resources Information Center
Raykov, Tenko
2007-01-01
A method for estimating criterion validity of scales with homogeneous components is outlined. It accomplishes point and interval estimation of interrelationship indices between composite scores and criterion variables and is useful for testing hypotheses about criterion validity of measurement instruments. The method can also be used with missing…
A Model for Estimating the Reliability and Validity of Criterion-Referenced Measures.
ERIC Educational Resources Information Center
Edmonston, Leon P.; Randall, Robert S.
A decision model designed to determine the reliability and validity of criterion referenced measures (CRMs) is presented. General procedures which pertain to the model are discussed as to: Measures of relationship, Reliability, Validity (content, criterion-oriented, and construct validation), and Item Analysis. The decision model is presented in…
Discriminant Validity Assessment: Use of Fornell & Larcker criterion versus HTMT Criterion
NASA Astrophysics Data System (ADS)
Hamid, M. R. Ab; Sami, W.; Mohmad Sidek, M. H.
2017-09-01
Assessment of discriminant validity is a must in any research that involves latent variables for the prevention of multicollinearity issues. The Fornell and Larcker criterion is the most widely used method for this purpose. However, a new method has emerged for assessing discriminant validity: the heterotrait-monotrait (HTMT) ratio of correlations. This article therefore presents the results of discriminant validity assessment using both methods. Data from a previous study involving 429 respondents were used for empirical validation of a value-based excellence model in higher education institutions (HEI) in Malaysia. From the analysis, convergent, divergent and discriminant validity were established and admissible using the Fornell and Larcker criterion. However, discriminant validity was an issue when employing the HTMT criterion. This shows that the latent variables under study faced the issue of multicollinearity and should be examined in further detail. It also implies that the HTMT criterion is a stringent measure that can detect a possible lack of discrimination among the latent variables. In conclusion, the instrument, which consisted of six latent variables, was still lacking in terms of discriminant validity and should be explored further.
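The two criteria compared in this abstract can be illustrated with a small sketch. The Python below is not taken from the article. It assumes the usual textbook definitions: HTMT as the mean between-construct item correlation divided by the geometric mean of the average within-construct item correlations, and the Fornell and Larcker criterion as requiring the square root of each construct's average variance extracted (AVE) to exceed the inter-construct correlation. The function names, the toy correlation matrix, and the loadings are hypothetical.

```python
from itertools import combinations
from math import sqrt


def _mean(values):
    values = list(values)
    return sum(values) / len(values)


def htmt(corr, items_a, items_b):
    """HTMT ratio for two constructs from an item correlation matrix.

    corr is a square list of lists; items_a and items_b hold the row/column
    indices of the items belonging to each construct (at least two each).
    """
    hetero = _mean(corr[i][j] for i in items_a for j in items_b)
    mono_a = _mean(corr[i][j] for i, j in combinations(items_a, 2))
    mono_b = _mean(corr[i][j] for i, j in combinations(items_b, 2))
    return hetero / sqrt(mono_a * mono_b)


def fornell_larcker_holds(loadings_a, loadings_b, construct_corr):
    """True if sqrt(AVE) of both constructs exceeds their correlation."""
    ave_a = _mean(x ** 2 for x in loadings_a)
    ave_b = _mean(x ** 2 for x in loadings_b)
    return min(sqrt(ave_a), sqrt(ave_b)) > abs(construct_corr)


# Hypothetical data: items 0-1 measure construct A, items 2-3 construct B.
toy_corr = [
    [1.0, 0.8, 0.6, 0.6],
    [0.8, 1.0, 0.6, 0.6],
    [0.6, 0.6, 1.0, 0.8],
    [0.6, 0.6, 0.8, 1.0],
]
ratio = htmt(toy_corr, [0, 1], [2, 3])
fl_ok = fornell_larcker_holds([0.8, 0.8], [0.8, 0.8], construct_corr=0.7)
print(f"HTMT = {ratio:.2f}, Fornell-Larcker satisfied: {fl_ok}")
```

Commonly cited HTMT cutoffs are 0.85 or 0.90; the abstract does not say which threshold the authors applied, so no cutoff is hard-coded here.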
2012-12-01
Development and validation. ABA, BQ, and criterion data were extracted from AT-SAT concurrent, criterion-related validation database. Overall, 1,232...dependent on responses to the other instrument. 3 A subset of 260 controllers in the AT-SAT dataset had full and complete ABA, BQ, and criterion data (i.e... SAT cases with ABA, BQ, and criterion data (n=260) was very small, making fairness analyses with the validation sample impractical. However, the
Dahlke, Jeffrey A; Kostal, Jack W; Sackett, Paul R; Kuncel, Nathan R
2018-05-03
We explore potential explanations for validity degradation using a unique predictive validation data set containing up to four consecutive years of high school students' cognitive test scores and four complete years of those students' college grades. This data set permits analyses that disentangle the effects of predictor-score age and timing of criterion measurements on validity degradation. We investigate the extent to which validity degradation is explained by criterion dynamism versus the limited shelf-life of ability scores. We also explore whether validity degradation is attributable to fluctuations in criterion variability over time and/or GPA contamination from individual differences in course-taking patterns. Analyses of multiyear predictor data suggest that changes to the determinants of performance over time have much stronger effects on validity degradation than does the shelf-life of cognitive test scores. The age of predictor scores had only a modest relationship with criterion-related validity when the criterion measurement occasion was held constant. Practical implications and recommendations for future research are discussed. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
Evidence for the Criterion Validity and Clinical Utility of the Pathological Narcissism Inventory
ERIC Educational Resources Information Center
Thomas, Katherine M.; Wright, Aidan G. C.; Lukowitsky, Mark R.; Donnellan, M. Brent; Hopwood, Christopher J.
2012-01-01
In this study, the authors evaluated aspects of criterion validity and clinical utility of the grandiosity and vulnerability components of the Pathological Narcissism Inventory (PNI) using two undergraduate samples (N = 299 and 500). Criterion validity was assessed by evaluating the correlations of narcissistic grandiosity and narcissistic…
Onwujekwe, Obinna
2004-02-01
Contingent valuation question formats that will be used to elicit willingness to pay for goods and services need to be relevant to the area they will be used in order for responses to be valid. A novel contingent valuation question format called the "structured haggling technique" (SH) that resembles the bargaining system in Nigerian markets was designed and its criterion and content validity compared with those of the bidding game (BG) and binary-with-follow-up (BWFU) technique. This was achieved by determining the willingness to pay (WTP) for insecticide-treated nets (ITNs) in Southeast Nigeria. Content validity was determined through observation of actual trading of untreated nets together with interviews with sellers and consumers. Criterion validity was determined by comparing stated and actual WTP. Stated WTP was determined using a questionnaire administered to 810 household heads and actual WTP was determined by offering the nets for sale to all respondents one month later. The phi (correlation) coefficient was used to compare criterion validity across question formats. The phi coefficients were SH (0.60: 95% C.I. 0.50-0.71), BG (0.42: 95% C.I. 0.29-0.54) and the BWFU (0.32: 95% C.I. 0.20-0.44), implying that the BG and SH had similar levels of criterion-validity while the BWFU was the least criterion-valid. However, the SH was the most content-valid. It is necessary to validate the findings in other areas where haggling is common. Future studies should establish the content validity of question formats in the contexts in which they will be used before administering questionnaires.
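The phi coefficient used above to compare stated and actual WTP can be computed from a 2x2 table of counts. The sketch below is only an illustration of the standard formula; the function name, the table layout, and the counts are assumptions and are not data from the study.

```python
from math import sqrt


def phi_coefficient(a, b, c, d):
    """Phi for the 2x2 table [[a, b], [c, d]].

    Assumed layout: rows are stated willing / stated unwilling,
    columns are bought / did not buy.
    """
    denom = sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denom == 0:
        raise ValueError("phi is undefined when a row or column total is zero")
    return (a * d - b * c) / denom


# Hypothetical counts, not taken from the abstract.
print(round(phi_coefficient(40, 10, 10, 40), 2))
```

The confidence intervals reported in the abstract are not reproduced here, because the abstract does not state how they were obtained.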
Classen, Sherrilene; Wang, Yanning; Winter, Sandra M; Velozo, Craig A; Lanford, Desiree N; Bédard, Michel
2013-01-01
We determined the concurrent criterion validity of the Safe Driving Behavior Measure (SDBM) for on-road outcomes (passing or failing the on-road test as determined by a certified driving rehabilitation specialist) among older drivers and their family members-caregivers. On the basis of ratings from 168 older drivers and 168 family members-caregivers, we calculated receiver operating characteristic curves. The drivers' area under the curve (AUC) was .620 (95% confidence interval [CI] = .514-.725, p = .043). The family members-caregivers' AUC was .726 (95% CI = .622-.829, p ≤ .01). Older drivers' ratings showed statistically significant yet poor concurrent criterion validity, but family members-caregivers' ratings showed good concurrent criterion validity for the criterion on-road driving test. Continuing research with a more representative sample is being pursued to confirm the SDBM's concurrent criterion validity. This screening tool may be useful for generalist practitioners to use in making decisions regarding driving. Copyright © 2013 by the American Occupational Therapy Association, Inc.
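The area under the ROC curve reported above equals the probability that a randomly chosen case from one outcome group receives a higher rating than a randomly chosen case from the other group. A minimal sketch of that pairwise definition follows; the function name and the ratings are hypothetical, and the study's own software and settings are not described in the abstract.

```python
def auc(scores_positive, scores_negative):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    if not scores_positive or not scores_negative:
        raise ValueError("both groups need at least one score")
    wins = 0.0
    for p in scores_positive:
        for n in scores_negative:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_positive) * len(scores_negative))


# Hypothetical ratings for drivers who passed vs. failed the on-road test.
passed = [0.9, 0.8, 0.4]
failed = [0.7, 0.3, 0.2]
print(round(auc(passed, failed), 3))
```

An AUC of 0.5 corresponds to chance-level discrimination, which is why the drivers' value of .620 is described as poor even though it is statistically significant.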
Wang, Yanning; Winter, Sandra M.; Velozo, Craig A.; Lanford, Desiree N.; Bédard, Michel
2013-01-01
We determined the concurrent criterion validity of the Safe Driving Behavior Measure (SDBM) for on-road outcomes (passing or failing the on-road test as determined by a certified driving rehabilitation specialist) among older drivers and their family members–caregivers. On the basis of ratings from 168 older drivers and 168 family members–caregivers, we calculated receiver operating characteristic curves. The drivers’ area under the curve (AUC) was .620 (95% confidence interval [CI] = .514–.725, p = .043). The family members–caregivers’ AUC was .726 (95% CI = .622–.829, p ≤ .01). Older drivers’ ratings showed statistically significant yet poor concurrent criterion validity, but family members–caregivers’ ratings showed good concurrent criterion validity for the criterion on-road driving test. Continuing research with a more representative sample is being pursued to confirm the SDBM’s concurrent criterion validity. This screening tool may be useful for generalist practitioners to use in making decisions regarding driving. PMID:23245789
ERIC Educational Resources Information Center
Fidler, James R.
1993-01-01
Criterion-related validities of 2 laboratory practitioner certification examinations for medical technologists (MTs) and medical laboratory technicians (MLTs) were assessed for 81 MT and 70 MLT examinees. Validity coefficients are presented for both measures. Overall, summative ratings yielded stronger validity coefficients than ratings based on…
Ando, Yukako; Kataoka, Tsuyoshi; Okamura, Hitoshi; Tanaka, Katsutoshi; Kobayashi, Toshio
2013-12-01
The purpose of this research is to verify the reliability and validity of a job stressor scale for nurses caring for patients with intractable neurological diseases. A mail survey was conducted using a self-report questionnaire. The subjects were 263 nurses and assistant nurses working in wards specializing in intractable neurological diseases. The response rate was 71.9% (valid response rate, 66.2%). With regard to reliability, internal consistency and stability were assessed. Internal consistency was examined via Cronbach's alpha. For stability, the test-retest method was performed and stability was examined via intraclass correlation coefficients. With regard to validity, factor validity, criterion-related validity, and content validity were assessed. Exploratory factor analysis was used for factor validity. For criterion-related validity, an existing scale was used as an external criterion; concurrent validity was examined via Spearman's rank correlation coefficients. The resulting scale had 26 items with an eight-factor structure. Cronbach's alpha for the 26 items was 0.90; with the exception of two factors, alpha for all of the individual sub-factors was high at 0.7 or higher. The intraclass correlation coefficient for the 26 items was 0.89 (p < 0.001). With regard to criterion-related validity, concurrent validity was confirmed and the correlation coefficient with an external criterion was 0.73 (p < 0.001). For content validity, subjects who responded that "The questionnaire represents a stressor well or to a degree" accounted for 81% of the total responses. Reliability and validity were confirmed, so the scale created in the current research is a usable scale.
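Cronbach's alpha, the internal consistency index reported above, can be computed from item-level scores with the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The sketch below is illustrative only; the function name and the scores are hypothetical.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha.

    item_scores is a list of items; each item is a list of scores for the
    same respondents in the same order.
    """
    k = len(item_scores)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    totals = [sum(respondent) for respondent in zip(*item_scores)]
    item_variance_sum = sum(variance(item) for item in item_scores)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical scores for two items answered by four respondents.
print(round(cronbach_alpha([[1, 2, 3, 4], [2, 1, 4, 3]]), 2))
```

Sample variances are used for both the items and the totals; using population variances throughout gives the same result because the scaling factor cancels.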
Validity of the Eating Attitudes Test and the Eating Disorders Inventory in Bulimia Nervosa.
ERIC Educational Resources Information Center
Gross, Janet; And Others
1986-01-01
Assessed criterion and concurrent validity of the Eating Attitudes Test and the Eating Disorder Inventory in 82 women with bulimia nervosa. Both tests demonstrated criterion validity by discriminating bulimia nervosa subjects from normals. Only weak support was found for concurrent validity within bulimia subjects. Recommends combination of…
Schiffman, Eric L.; Truelove, Edmond L.; Ohrbach, Richard; Anderson, Gary C.; John, Mike T.; List, Thomas; Look, John O.
2011-01-01
Aims. The purpose of the Research Diagnostic Criteria for Temporomandibular Disorders (RDC/TMD) Validation Project was to assess the diagnostic validity of this examination protocol. An overview is presented, including Axis I and II methodology and descriptive statistics for the study participant sample. This paper details the development of reliable methods to establish the reference standards for assessing criterion validity of the Axis I RDC/TMD diagnoses. Validity testing for the Axis II biobehavioral instruments was based on previously validated reference standards. Methods. The Axis I reference standards were based on the consensus of 2 criterion examiners independently performing a comprehensive history, clinical examination, and evaluation of imaging. Intersite reliability was assessed annually for criterion examiners and radiologists. Criterion exam reliability was also assessed within study sites. Results. Study participant demographics were comparable to those of participants in previous studies using the RDC/TMD. Diagnostic agreement of the criterion examiners with each other and with the consensus-based reference standards was excellent with all kappas ≥ 0.81, except for osteoarthrosis (moderate agreement, κ = 0.53). Intrasite criterion exam agreement with reference standards was excellent (κ ≥ 0.95). Intersite reliability of the radiologists for detecting computed tomography-disclosed osteoarthrosis and magnetic resonance imaging-disclosed disc displacement was good to excellent (κ = 0.71 and 0.84, respectively). Conclusion. The Validation Project study population was appropriate for assessing the reliability and validity of the RDC/TMD Axis I and II. The reference standards used to assess the validity of Axis I TMD were based on reliable and clinically credible methods. PMID:20213028
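The kappa values above express agreement between examiners after correcting for chance. The abstract does not say which kappa variant was used; the sketch below assumes unweighted Cohen's kappa for two raters, and the function name and diagnostic labels are hypothetical.

```python
from collections import Counter


def cohen_kappa(rater1, rater2):
    """Unweighted Cohen's kappa for two raters labelling the same cases."""
    if len(rater1) != len(rater2) or not rater1:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater1)
    observed = sum(x == y for x, y in zip(rater1, rater2)) / n
    counts1, counts2 = Counter(rater1), Counter(rater2)
    # Chance agreement: product of the raters' marginal proportions per label.
    expected = sum(counts1[label] * counts2[label] for label in counts1) / n ** 2
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical diagnoses from two examiners for four participants.
examiner_a = ["OA", "OA", "none", "none"]
examiner_b = ["OA", "none", "none", "none"]
print(cohen_kappa(examiner_a, examiner_b))
```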
ERIC Educational Resources Information Center
Lin, Keh-chung; Chen, Hui-fang; Chen, Chia-ling; Wang, Tien-ni; Wu, Ching-yi; Hsieh, Yu-wei; Wu, Li-ling
2012-01-01
This study examined criterion-related validity and clinimetric properties of the Pediatric Motor Activity Log (PMAL) in children with cerebral palsy. Study participants were 41 children (age range: 28-113 months) and their parents. Criterion-related validity was evaluated by the associations between the PMAL and criterion measures at baseline and…
ERIC Educational Resources Information Center
Swanson, Jennifer R.; Bradley-Johnson, Sharon; Johnson, C. Merle; O'Dell, Anna Rubenaker
2009-01-01
Three studies examine the validity of the Preschool Form of the Cognitive Abilities Scale--Second Edition (CAS-2). Significant high concurrent criterion-related validity correlations, corrected for restricted range, are found between the CAS-2 and the Detroit Test of Learning Ability--Primary: Third Edition for 26 three-year-olds (r[subscript c] =…
The Validation of a Case-Based, Cumulative Assessment and Progressions Examination
Coker, Adeola O.; Copeland, Jeffrey T.; Gottlieb, Helmut B.; Horlen, Cheryl; Smith, Helen E.; Urteaga, Elizabeth M.; Ramsinghani, Sushma; Zertuche, Alejandra; Maize, David
2016-01-01
Objective. To assess content and criterion validity, as well as reliability of an internally developed, case-based, cumulative, high-stakes third-year Annual Student Assessment and Progression Examination (P3 ASAP Exam). Methods. Content validity was assessed through the writing-reviewing process. Criterion validity was assessed by comparing student scores on the P3 ASAP Exam with the nationally validated Pharmacy Curriculum Outcomes Assessment (PCOA). Reliability was assessed with psychometric analysis comparing student performance over four years. Results. The P3 ASAP Exam showed content validity through representation of didactic courses and professional outcomes. Similar scores on the P3 ASAP Exam and PCOA with Pearson correlation coefficient established criterion validity. Consistent student performance using Kuder-Richardson coefficient (KR-20) since 2012 reflected reliability of the examination. Conclusion. Pharmacy schools can implement internally developed, high-stakes, cumulative progression examinations that are valid and reliable using a robust writing-reviewing process and psychometric analyses. PMID:26941435
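The Kuder-Richardson coefficient (KR-20) mentioned above is the analogue of Cronbach's alpha for items scored right or wrong. A minimal sketch under stated assumptions follows: the function name and the response matrix are hypothetical, and the population variance of total scores is used, whereas some texts use the sample variance.

```python
from statistics import pvariance


def kr20(responses):
    """KR-20 for dichotomous items.

    responses has one row per examinee; each row is a list of 0/1 item scores.
    """
    n = len(responses)
    k = len(responses[0])
    if k < 2:
        raise ValueError("KR-20 needs at least two items")
    total_variance = pvariance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("KR-20 is undefined when all total scores are equal")
    pq_sum = 0.0
    for item in zip(*responses):
        p = sum(item) / n  # proportion answering the item correctly
        pq_sum += p * (1 - p)
    return k / (k - 1) * (1 - pq_sum / total_variance)


# Hypothetical answers from four examinees on three items.
answers = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
print(kr20(answers))
```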
Steele, Catriona M.; Namasivayam-MacDonald, Ashwini M.; Guida, Brittany T.; Cichero, Julie A.; Duivestein, Janice; Hanson, Ben; Lam, Peter; Riquelme, Luis F.
2018-01-01
Objective. To assess consensual validity, interrater reliability, and criterion validity of the International Dysphagia Diet Standardisation Initiative Functional Diet Scale, a new functional outcome scale intended to capture the severity of oropharyngeal dysphagia, as represented by the degree of diet texture restriction recommended for the patient. Design. Participants assigned International Dysphagia Diet Standardisation Initiative Functional Diet Scale scores to 16 clinical cases. Consensual validity was measured against reference scores determined by an author reference panel. Interrater reliability was measured overall and across quartile subsets of the dataset. Criterion validity was evaluated versus Functional Oral Intake Scale (FOIS) scores assigned by survey respondents to the same case scenarios. Feedback was requested regarding ease and likelihood of use. Setting. Web-based survey. Participants. Respondents (N=170) from 29 countries. Interventions. Not applicable. Main Outcome Measures. Consensual validity (percent agreement and Kendall τ), criterion validity (Spearman rank correlation), and interrater reliability (Kendall concordance and intraclass coefficients). Results. The International Dysphagia Diet Standardisation Initiative Functional Diet Scale showed strong consensual validity, criterion validity, and interrater reliability. Scenarios involving liquid-only diets, transition from nonoral feeding, or trial diet advances in therapy showed the poorest consensus, indicating a need for clear instructions on how to score these situations. The International Dysphagia Diet Standardisation Initiative Functional Diet Scale showed greater sensitivity than the FOIS to specific changes in diet. Most (>70%) respondents indicated enthusiasm for implementing the International Dysphagia Diet Standardisation Initiative Functional Diet Scale. Conclusions. This initial validation study suggests that the International Dysphagia Diet Standardisation Initiative Functional Diet Scale has strong consensual and criterion validity and can be used reliably by clinicians to capture diet texture restriction and progression in people with dysphagia. PMID:29428348
Steele, Catriona M; Namasivayam-MacDonald, Ashwini M; Guida, Brittany T; Cichero, Julie A; Duivestein, Janice; Hanson, Ben; Lam, Peter; Riquelme, Luis F
2018-05-01
To assess consensual validity, interrater reliability, and criterion validity of the International Dysphagia Diet Standardisation Initiative Functional Diet Scale, a new functional outcome scale intended to capture the severity of oropharyngeal dysphagia, as represented by the degree of diet texture restriction recommended for the patient. Participants assigned International Dysphagia Diet Standardisation Initiative Functional Diet Scale scores to 16 clinical cases. Consensual validity was measured against reference scores determined by an author reference panel. Interrater reliability was measured overall and across quartile subsets of the dataset. Criterion validity was evaluated versus Functional Oral Intake Scale (FOIS) scores assigned by survey respondents to the same case scenarios. Feedback was requested regarding ease and likelihood of use. Web-based survey. Respondents (N=170) from 29 countries. Not applicable. Consensual validity (percent agreement and Kendall τ), criterion validity (Spearman rank correlation), and interrater reliability (Kendall concordance and intraclass coefficients). The International Dysphagia Diet Standardisation Initiative Functional Diet Scale showed strong consensual validity, criterion validity, and interrater reliability. Scenarios involving liquid-only diets, transition from nonoral feeding, or trial diet advances in therapy showed the poorest consensus, indicating a need for clear instructions on how to score these situations. The International Dysphagia Diet Standardisation Initiative Functional Diet Scale showed greater sensitivity than the FOIS to specific changes in diet. Most (>70%) respondents indicated enthusiasm for implementing the International Dysphagia Diet Standardisation Initiative Functional Diet Scale. 
This initial validation study suggests that the International Dysphagia Diet Standardisation Initiative Functional Diet Scale has strong consensual and criterion validity and can be used reliably by clinicians to capture diet texture restriction and progression in people with dysphagia. Copyright © 2018 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.
Schiffman, Eric L; Truelove, Edmond L; Ohrbach, Richard; Anderson, Gary C; John, Mike T; List, Thomas; Look, John O
2010-01-01
The purpose of the Research Diagnostic Criteria for Temporomandibular Disorders (RDC/TMD) Validation Project was to assess the diagnostic validity of this examination protocol. The aim of this article is to provide an overview of the project's methodology, descriptive statistics, and data for the study participant sample. This article also details the development of reliable methods to establish the reference standards for assessing criterion validity of the Axis I RDC/TMD diagnoses. The Axis I reference standards were based on the consensus of two criterion examiners independently performing a comprehensive history, clinical examination, and evaluation of imaging. Intersite reliability was assessed annually for criterion examiners and radiologists. Criterion examination reliability was also assessed within study sites. Study participant demographics were comparable to those of participants in previous studies using the RDC/TMD. Diagnostic agreement of the criterion examiners with each other and with the consensus-based reference standards was excellent with all kappas ≥ 0.81, except for osteoarthrosis (moderate agreement, κ = 0.53). Intrasite criterion examiner agreement with reference standards was excellent (κ ≥ 0.95). Intersite reliability of the radiologists for detecting computed tomography-disclosed osteoarthrosis and magnetic resonance imaging-disclosed disc displacement was good to excellent (κ = 0.71 and 0.84, respectively). The Validation Project study population was appropriate for assessing the reliability and validity of the RDC/TMD Axis I and II. The reference standards used to assess the validity of Axis I TMD were based on reliable and clinically credible methods.
Evaluation of Measurement Instrument Criterion Validity in Finite Mixture Settings
ERIC Educational Resources Information Center
Raykov, Tenko; Marcoulides, George A.; Li, Tenglong
2016-01-01
A method for evaluating the validity of multicomponent measurement instruments in heterogeneous populations is discussed. The procedure can be used for point and interval estimation of criterion validity of linear composites in populations representing mixtures of an unknown number of latent classes. The approach permits also the evaluation of…
Evaluation of Validity and Reliability for Hierarchical Scales Using Latent Variable Modeling
ERIC Educational Resources Information Center
Raykov, Tenko; Marcoulides, George A.
2012-01-01
A latent variable modeling method is outlined, which accomplishes estimation of criterion validity and reliability for a multicomponent measuring instrument with hierarchical structure. The approach provides point and interval estimates for the scale criterion validity and reliability coefficients, and can also be used for testing composite or…
Discriminative and Criterion Validity of the Autism Spectrum Identity Scale (ASIS)
ERIC Educational Resources Information Center
McDonald, T. A. M.
2017-01-01
Individuals on the autism spectrum face stigma that can influence identity development. Previous research on the 22-item Autism Spectrum Identity Scale (ASIS) reported a four-factor structure with strong split-sample cross-validation and good internal consistency. This study reports the discriminative and criterion validity of the ASIS with other…
Design and validation of a comprehensive fecal incontinence questionnaire.
Macmillan, Alexandra K; Merrie, Arend E H; Marshall, Roger J; Parry, Bryan R
2008-10-01
Fecal incontinence can have a profound effect on quality of life. Its prevalence remains uncertain because of stigma, lack of consistent definition, and dearth of validated measures. This study was designed to develop a valid clinical and epidemiologic questionnaire, building on current literature and expertise. Patients and experts undertook face validity testing. Construct validity, criterion validity, and test-retest reliability were assessed. Construct validity comprised factor analysis and internal consistency of the quality of life scale. Known-groups validity was tested against 77 control subjects by using regression models. Questionnaire results were compared with a stool diary for criterion validity. Test-retest reliability was calculated from repeated questionnaire completion. The questionnaire achieved good face validity. It was completed by 104 patients. The quality of life scale had four underlying traits (factor analysis) and high internal consistency (overall Cronbach alpha = 0.97). Patients and control subjects answered the questionnaire significantly differently (P < 0.01) in known-groups validity testing. Criterion validity assessment found mean differences close to zero. Median reliability for the whole questionnaire was 0.79 (range, 0.35-1). This questionnaire compares favorably with other available instruments, although the interpretation of stool consistency requires further research. Its sensitivity to treatment still needs to be investigated.
Shmulewitz, D.; Wall, M.M.; Aharonovich, E.; Spivak, B.; Weizman, A.; Frisch, A.; Grant, B. F.; Hasin, D.
2013-01-01
Background The fifth edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5) proposes aligning nicotine use disorder (NUD) criteria with those for other substances, by including the current DSM fourth edition (DSM-IV) nicotine dependence (ND) criteria, three abuse criteria (neglect roles, hazardous use, interpersonal problems) and craving. Although NUD criteria indicate one latent trait, evidence is lacking on: (1) validity of each criterion; (2) validity of the criteria as a set; (3) comparative validity between DSM-5 NUD and DSM-IV ND criterion sets; and (4) NUD prevalence. Method Nicotine criteria (DSM-IV ND, abuse and craving) and external validators (e.g. smoking soon after awakening, number of cigarettes per day) were assessed with a structured interview in 734 lifetime smokers from an Israeli household sample. Regression analysis evaluated the association between validators and each criterion. Receiver operating characteristic analysis assessed the association of the validators with the DSM-5 NUD set (number of criteria endorsed) and tested whether DSM-5 or DSM-IV provided the most discriminating criterion set. Changes in prevalence were examined. Results Each DSM-5 NUD criterion was significantly associated with the validators, with strength of associations similar across the criteria. As a set, DSM-5 criteria were significantly associated with the validators, were significantly more discriminating than DSM-IV ND criteria, and led to increased prevalence of binary NUD (two or more criteria) over ND. Conclusions All findings address previous concerns about the DSM-IV nicotine diagnosis and its criteria and support the proposed changes for DSM-5 NUD, which should result in improved diagnosis of nicotine disorders. PMID:23312475
ERIC Educational Resources Information Center
Livingstone, Holly A.; Day, Arla L.
2005-01-01
Despite the popularity of the concept of emotional intelligence (EI), there is much controversy around its definition, measurement, and validity. Therefore, the authors examined the construct and criterion-related validity of an ability-based EI measure (Mayer Salovey Caruso Emotional Intelligence Test [MSCEIT]) and a mixed-model EI measure…
Criterion-Related Validity: Assessing the Value of Subscores
ERIC Educational Resources Information Center
Davison, Mark L.; Davenport, Ernest C., Jr.; Chang, Yu-Feng; Vue, Kory; Su, Shiyang
2015-01-01
Criterion-related profile analysis (CPA) can be used to assess whether subscores of a test or test battery account for more criterion variance than does a single total score. Application of CPA to subscore evaluation is described, compared to alternative procedures, and illustrated using SAT data. Considerations other than validity and reliability…
Abbas, Ismail; Rovira, Joan; Casanovas, Josep
2006-12-01
To develop and validate a model of a clinical trial that evaluates the changes in cholesterol level as a surrogate marker for lipodystrophy in HIV subjects under alternative antiretroviral regimes, i.e., treatment with Protease Inhibitors vs. a combination of nevirapine and other antiretroviral drugs. Five simulation models were developed based on different assumptions, on treatment variability and pattern of cholesterol reduction over time. The last recorded cholesterol level, the difference from the baseline, the average difference from the baseline and level evolution, are the considered endpoints. Specific validation criteria based on a 10% minus or plus standardized distance in means and variances were used to compare the real and the simulated data. The validity criterion was met by all models for considered endpoints. However, only two models met the validity criterion when all endpoints were considered. The model based on the assumption that within-subjects variability of cholesterol levels changes over time is the one that minimizes the validity criterion, standardized distance equal to or less than 1% minus or plus. Simulation is a useful technique for calibration, estimation, and evaluation of models, which allows us to relax the often overly restrictive assumptions regarding parameters required by analytical approaches. The validity criterion can also be used to select the preferred model for design optimization, until additional data are obtained allowing an external validation of the model.
MacKillop, James; Acker, John D; Bollinger, Jared; Clifton, Allan; Miller, Joshua D; Campbell, W Keith; Goodie, Adam S
2013-09-01
Alcohol misuse is substantially influenced by social factors, but systematic assessments of social network drinking are typically lengthy. The goal of the present study was to provide further validation of a brief measure of social network alcohol use, the Brief Alcohol Social Density Assessment (BASDA), in a sample of emerging adults. Specifically, the study sought to examine the BASDA's convergent, criterion, and incremental validity in relation to well-established measures of drinking motives and problematic drinking. Participants were 354 undergraduates who were assessed using the BASDA, the Alcohol Use Disorders Identification Test (AUDIT), and the Drinking Motives Questionnaire. Significant associations were observed between the BASDA index of alcohol-related social density and alcohol misuse, social motives, and conformity motives, supporting convergent validity. Criterion-related validity was supported by evidence that significantly greater alcohol involvement was present in the social networks of individuals scoring at or above an AUDIT score of 8, a validated criterion for hazardous drinking. Finally, the BASDA index was significantly associated with alcohol misuse above and beyond drinking motives in relation to AUDIT scores, supporting incremental validity. Taken together, these findings provide further support for the BASDA as an efficient measure of drinking in an individual's social network. Methodological considerations as well as recommendations for future investigations in this area are discussed.
ERIC Educational Resources Information Center
Rikli, Roberta E.; Jones, C. Jessie
2013-01-01
Purpose: To develop and validate criterion-referenced fitness standards for older adults that predict the level of capacity needed for maintaining physical independence into later life. The proposed standards were developed for use with a previously validated test battery for older adults--the Senior Fitness Test (Rikli, R. E., & Jones, C. J.…
ERIC Educational Resources Information Center
Daviss, W. Burleson; Birmaher, Boris; Melhem, Nadine A.; Axelson, David A.; Michaels, Shana M.; Brent, David A.
2006-01-01
Background: Previous measures of pediatric depression have shown inconsistent validity in groups with differing demographics, comorbid diagnoses, and clinic or non-clinic origins. The current study re-examines the criterion validity of child- and parent-versions of the Mood and Feelings Questionnaire (MFQ-C, MFQ-P) in a heterogeneous sample of…
Gaudin, Valérie
2017-09-01
Screening methods are used as a first-line approach to detect the presence of antibiotic residues in food of animal origin. The validation process guarantees that the method is fit-for-purpose, suited to regulatory requirements, and provides evidence of its performance. This article is focused on intra-laboratory validation. The first step in validation is characterisation of performance, and the second step is the validation itself with regard to pre-established criteria. The validation approaches can be absolute (a single method) or relative (comparison of methods), overall (combination of several characteristics in one) or criterion-by-criterion. Various approaches to validation, in the form of regulations, guidelines or standards, are presented and discussed to draw conclusions on their potential application for different residue screening methods, and to determine whether or not they reach the same conclusions. The approach by comparison of methods is not suitable for screening methods for antibiotic residues. The overall approaches, such as probability of detection (POD) and accuracy profile, are increasingly used in other fields of application. They may be of interest for screening methods for antibiotic residues. Finally, the criterion-by-criterion approach (Decision 2002/657/EC and the European guideline for the validation of screening methods), usually applied to the screening methods for antibiotic residues, introduced a major characteristic and an improvement in the validation, i.e. the detection capability (CCβ). In conclusion, screening methods are constantly evolving, thanks to the development of new biosensors or liquid chromatography coupled to tandem-mass spectrometry (LC-MS/MS) methods. There have been clear changes in validation approaches over the last 20 years. Continued progress is required and perspectives for future development of guidelines, regulations and standards for validation are presented here.
Cha, Young Joo; Lee, Jae Jin; Kim, Do Hyun; You, Joshua Sung H
2017-10-23
Core stabilization plays an important role in the regulation of postural stability. To overcome shortcomings associated with pain and severe core instability during conventional core stabilization tests, we recently developed the dynamic neuromuscular stabilization-based heel sliding (DNS-HS) test. The purpose of this study was to establish the criterion validity and test-retest reliability of the novel DNS-HS test. Twenty young adults with core instability completed both the bilateral straight leg lowering test (BSLLT) and DNS-HS test for the criterion validity study and repeated the DNS-HS test for the test-retest reliability study. Criterion validity was determined by comparing hip joint angle data that were obtained from BSLLT and DNS-HS measures. The test-retest reliability was determined by comparing hip joint angle data. Criterion validity was ICC(2,3) = 0.700 (p < 0.05), suggesting a good relationship between the two core stability measures. Test-retest reliability was ICC(3,3) = 0.953 (p < 0.05), indicating excellent consistency between the repeated DNS-HS measurements. Criterion validity data demonstrated a good relationship between the gold standard BSLLT and DNS-HS core stability measures. Test-retest reliability data suggest that the DNS-HS test is a reliable test for core stability. Clinically, the DNS-HS test is useful to objectively quantify core instability and allow early detection and evaluation.
ERIC Educational Resources Information Center
Oakland, Thomas
New strategies for evaluating criterion-referenced measures (CRM) are discussed. These strategies examine the following issues: (1) the use of norm-referenced measures (NRM) as CRM and then estimating the reliability and validity of such measures in terms of variance from an arbitrarily specified criterion score, (2) estimation of the…
Validity of the modified back-saver sit-and-reach test: a comparison with other protocols.
Hui, S S; Yuen, P Y
2000-09-01
Studies have shown that the classical sit-and-reach (CSR) test, the modified sit-and-reach (MSR), and the newly developed back-saver sit-and-reach (BS) test have poor criterion-related validity in estimating low-back flexibility but yielded moderate criterion-related validity in hamstring flexibility. The V sit-and-reach (VSR) test was found to be practical but the validity has not been established. The purpose of this study was to propose a modified back-saver sit-and-reach (MBS) test, which incorporated all advantages of the various protocols, and to compare the criterion-related validity and reliability of all these tests. 158 college students (F = 96, and M = 62; age = 20.77 +/- 2.51) performed CSR, VSR, BS (left and right leg), and MBS (left and right leg) tests in a randomized order. Scores from each test were then correlated with the criterion measures. For all sit-and-reach tests, intraclass reliability (single trial) was very high (r = 0.89-0.98). MBS yielded significant and highest r with low-back and hamstring criterion for men (r = 0.47-0.67) and women (r = 0.23-0.54). The low-back and right hamstring validity of MBS for men were significantly (P < 0.01) higher than those from BS and CSR, whereas no differences in criterion-related validity were found between the MBS and other protocols in women. The ratings of perceived comfort among the sit-and-reach protocols were significantly different (P < 0.001) from each other. The MBS was rated the most comfortable test compared with the other protocols. The MBS test is not only a reliable test for hamstring and low-back flexibility, it is also a more practical test with improved validity for hamstring and low-back flexibility in men than previous protocols.
The Counselor Evaluation Rating Scale: A Valid Criterion of Counselor Effectiveness?
ERIC Educational Resources Information Center
Jones, Lawrence K.
1974-01-01
The validity of recent recommendations regarding the use of certain factors of the 16 Personality Factor Questionnaire (16PF) to select persons for counselor training programs, where the CERS was the criterion measure, is challenged. (Author)
Sheffield, Alexandra; Waller, Glenn; Emanuelli, Francesca; Murray, James
2006-01-01
Recent studies support the reliability and validity of the Young Parenting Inventory-Revised (YPI-R) and its use in investigating the role of parenting in the aetiology and maintenance of eating pathology. However, criterion validity has yet to be fully established. To investigate one aspect of criterion validity, this study examines the association between parenting and comorbid problems in the eating disorders (including general psychopathology and impulsivity). The participants were 124 women with eating disorders. They completed the YPI-R and the Brief Symptom Inventory (BSI; a measure of general psychopathology). They were also interviewed about their use of a number of impulsive behaviours. YPI-R scales were significant predictors of one of the nine BSI scales, and distinguished those patients who did or did not use specific impulsive behaviours. The criterion validity of the YPI-R is partially supported with regards to general psychopathology and impulsivity. The findings highlight the specificity of the parenting styles measured by the YPI-R, and the need for further research using this tool.
Criterion-Related Validity of the TOEFL iBT Listening Section. TOEFL iBT Research Report. RR-09-02
ERIC Educational Resources Information Center
Sawaki, Yasuyo; Nissan, Susan
2009-01-01
The study investigated the criterion-related validity of the "Test of English as a Foreign Language"[TM] Internet-based test (TOEFL[R] iBT) Listening section by examining its relationship to a criterion measure designed to reflect language-use tasks that university students encounter in everyday academic life: listening to academic…
Ethical leadership: meta-analytic evidence of criterion-related and incremental validity.
Ng, Thomas W H; Feldman, Daniel C
2015-05-01
This study examines the criterion-related and incremental validity of ethical leadership (EL) with meta-analytic data. Across 101 samples published over the last 15 years (N = 29,620), we observed that EL demonstrated acceptable criterion-related validity with variables that tap followers' job attitudes, job performance, and evaluations of their leaders. Further, followers' trust in the leader mediated the relationships of EL with job attitudes and performance. In terms of incremental validity, we found that EL significantly, albeit weakly in some cases, predicted task performance, citizenship behavior, and counterproductive work behavior, even after controlling for the effects of such variables as transformational leadership, use of contingent rewards, management by exception, interactional fairness, and destructive leadership. The article concludes with a discussion of ways to strengthen the incremental validity of EL. (PsycINFO Database Record (c) 2015 APA, all rights reserved).
Palm, Peter; Josephson, Malin; Mathiassen, Svend Erik; Kjellberg, Katarina
2016-06-01
We evaluated the intra- and inter-observer reliability and criterion validity of an observation protocol, developed in an iterative process involving practicing ergonomists, for assessment of working technique during cash register work for the purpose of preventing upper extremity symptoms. Two ergonomists independently assessed 17 15-min videos of cash register work on two occasions each, as a basis for examining reliability. Criterion validity was assessed by comparing these assessments with meticulous video-based analyses by researchers. Intra-observer reliability was acceptable (i.e. proportional agreement >0.7 and kappa >0.4) for 10/10 questions. Inter-observer reliability was acceptable for only 3/10 questions. An acceptable inter-observer reliability combined with an acceptable criterion validity was obtained only for one working technique aspect, 'Quality of movements'. Thus, major elements of the cashiers' working technique could not be assessed with an acceptable accuracy from short periods of observations by one observer, such as often desired by practitioners. Practitioner Summary: We examined an observation protocol for assessing working technique in cash register work. It was feasible in use, but inter-observer reliability and criterion validity were generally not acceptable when working technique aspects were assessed from short periods of work. We recommend the protocol to be used for educational purposes only.
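The acceptability thresholds quoted in the abstract above (proportional agreement > 0.7 and kappa > 0.4) can be illustrated with a minimal sketch. It assumes unweighted Cohen's kappa on categorical ratings, which the abstract does not specify; the function names and the example ratings are hypothetical and are not data from the study.

```python
from collections import Counter


def proportional_agreement(ratings_a, ratings_b):
    """Share of items on which the two sets of ratings are identical."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(ratings_a, ratings_b))
    return matches / len(ratings_a)


def cohens_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa: agreement corrected for chance agreement."""
    n = len(ratings_a)
    observed = proportional_agreement(ratings_a, ratings_b)
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    categories = set(counts_a) | set(counts_b)
    # Chance agreement from the marginal frequencies of each rater.
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


def is_acceptable(ratings_a, ratings_b, min_agreement=0.7, min_kappa=0.4):
    """Apply both thresholds quoted in the abstract (assumed strict)."""
    return (proportional_agreement(ratings_a, ratings_b) > min_agreement
            and cohens_kappa(ratings_a, ratings_b) > min_kappa)


# Hypothetical ratings of one question across ten videos (1 = yes, 0 = no).
observer_1 = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
observer_2 = [1, 1, 1, 1, 0, 0, 0, 0, 0, 1]
print(proportional_agreement(observer_1, observer_2))
print(cohens_kappa(observer_1, observer_2))
```

The same two functions serve for intra-observer reliability (one observer, two occasions) and inter-observer reliability (two observers, one occasion); only the pairing of the rating lists changes.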
Five-level emergency triage systems: variation in assessment of validity.
Kuriyama, Akira; Urushidani, Seigo; Nakayama, Takeo
2017-11-01
Triage systems are scales developed to rate the degree of urgency among patients who arrive at EDs. A number of different scales are in use; however, the way in which they have been validated is inconsistent. Also, it is difficult to define a surrogate that accurately predicts urgency. This systematic review described reference standards and measures used in previous validation studies of five-level triage systems. We searched PubMed, EMBASE and CINAHL to identify studies that had assessed the validity of five-level triage systems and described the reference standards and measures applied in these studies. Studies were divided into those using criterion validity (reference standards developed by expert panels or triage systems already in use) and those using construct validity (prognosis, costs and resource use). A total of 57 studies examined criterion and construct validity of 14 five-level triage systems. Criterion validity was examined by evaluating (1) agreement between the assigned degree of urgency with objective standard criteria (12 studies), (2) overtriage and undertriage (9 studies) and (3) sensitivity and specificity of triage systems (7 studies). Construct validity was examined by looking at (4) the associations between the assigned degree of urgency and measures gauged in EDs (48 studies) and (5) the associations between the assigned degree of urgency and measures gauged after hospitalisation (13 studies). Particularly, among 46 validation studies of the most commonly used triages (Canadian Triage and Acuity Scale, Emergency Severity Index and Manchester Triage System), 13 and 39 studies examined criterion and construct validity, respectively. Previous studies applied various reference standards and measures to validate five-level triage systems. They either created their own reference standard or used a combination of severity/resource measures. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2017. 
All rights reserved. No commercial use is permitted unless otherwise expressly granted.
29 CFR 1607.5 - General standards for validity studies.
Code of Federal Regulations, 2010 CFR
2010-07-01
A. Acceptable types of validity studies. For the purposes of satisfying these guidelines, users may rely upon criterion-related validity studies, content validity studies or construct validity...
Mayorga-Vega, Daniel; Bocanegra-Parrilla, Raúl; Ornelas, Martha; Viciana, Jesús
2016-01-01
The main purpose of the present meta-analysis was to examine the criterion-related validity of the distance- and time-based walk/run tests for estimating cardiorespiratory fitness among apparently healthy children and adults. Relevant studies were searched from seven electronic bibliographic databases up to August 2015 and through other sources. The Hunter-Schmidt's psychometric meta-analysis approach was conducted to estimate the population criterion-related validity of the following walk/run tests: 5,000 m, 3 miles, 2 miles, 3,000 m, 1.5 miles, 1 mile, 1,000 m, ½ mile, 600 m, 600 yd, ¼ mile, 15 min, 12 min, 9 min, and 6 min. From the 123 included studies, a total of 200 correlation values were analyzed. The overall results showed that the criterion-related validity of the walk/run tests for estimating maximum oxygen uptake ranged from low to moderate (rp = 0.42-0.79), with the 1.5 mile (rp = 0.79, 0.73-0.85) and 12 min walk/run tests (rp = 0.78, 0.72-0.83) having the higher criterion-related validity for distance- and time-based field tests, respectively. The present meta-analysis also showed that sex, age and maximum oxygen uptake level do not seem to affect the criterion-related validity of the walk/run tests. When the evaluation of an individual's maximum oxygen uptake attained during a laboratory test is not feasible, the 1.5 mile and 12 min walk/run tests represent useful alternatives for estimating cardiorespiratory fitness. As in the assessment with any physical fitness field test, evaluators must be aware that the performance score of the walk/run field tests is simply an estimation and not a direct measure of cardiorespiratory fitness.
Saraf, Sanatan; Mathew, Thomas; Roy, Anindya
2015-01-01
For the statistical validation of surrogate endpoints, an alternative formulation is proposed for testing Prentice's fourth criterion, under a bivariate normal model. In such a setup, the criterion involves inference concerning an appropriate regression parameter, and the criterion holds if the regression parameter is zero. Testing such a null hypothesis has been criticized in the literature since it can only be used to reject a poor surrogate, and not to validate a good surrogate. In order to circumvent this, an equivalence hypothesis is formulated for the regression parameter, namely the hypothesis that the parameter is equivalent to zero. Such an equivalence hypothesis is formulated as an alternative hypothesis, so that the surrogate endpoint is statistically validated when the null hypothesis is rejected. Confidence intervals for the regression parameter and tests for the equivalence hypothesis are proposed using bootstrap methods and small sample asymptotics, and their performances are numerically evaluated and recommendations are made. The choice of the equivalence margin is a regulatory issue that needs to be addressed. The proposed equivalence testing formulation is also adopted for other parameters that have been proposed in the literature on surrogate endpoint validation, namely, the relative effect and proportion explained.
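The equivalence formulation described in the abstract above can be sketched as an interval check: the regression parameter is declared equivalent to zero when a (1 - 2α) confidence interval lies entirely inside (-margin, +margin). The sketch below assumes a normal approximation for the estimator, whereas the abstract proposes bootstrap methods and small sample asymptotics; the function names, the estimate, the standard error, and the margin are all hypothetical.

```python
from statistics import NormalDist


def two_one_sided_interval(estimate, std_error, alpha=0.05):
    """(1 - 2*alpha) confidence interval under a normal approximation."""
    if std_error <= 0:
        raise ValueError("std_error must be positive")
    z = NormalDist().inv_cdf(1.0 - alpha)
    return estimate - z * std_error, estimate + z * std_error


def is_equivalent_to_zero(estimate, std_error, margin, alpha=0.05):
    """Reject the non-equivalence null when the interval sits inside the margin."""
    lower, upper = two_one_sided_interval(estimate, std_error, alpha)
    return -margin < lower and upper < margin


# Hypothetical estimate of the regression parameter and its standard error.
print(is_equivalent_to_zero(estimate=0.02, std_error=0.03, margin=0.10))
print(is_equivalent_to_zero(estimate=0.08, std_error=0.03, margin=0.10))
```

Here rejecting the null supports the surrogate, which is the reversal the abstract describes. The margin itself is an input, consistent with the abstract's remark that its choice is a regulatory issue.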
Toro, Brigitte; Nester, Christopher J; Farren, Pauline C
2007-03-01
To develop the construct, content, and criterion validity of the Salford Gait Tool (SF-GT) and to evaluate agreement between gait observations using the SF-GT and kinematic gait data. Tool development and comparative evaluation. University in the United Kingdom. For designing construct and content validity, convenience samples of 10 children with hemiplegic, diplegic, and quadriplegic cerebral palsy (CP) and 152 physical therapy students and 4 physical therapists were recruited. For developing criterion validity, kinematic gait data of 13 gait clusters containing 56 children with hemiplegic, diplegic, and quadriplegic CP and 11 neurologically intact children was used. For clinical evaluation, a convenience sample of 23 pediatric physical therapists participated. We developed a sagittal plane observational gait assessment tool through a series of design, test, and redesign iterations. The tool's grading system was calibrated using kinematic gait data of 13 gait clusters and was evaluated by comparing the agreement of gait observations using the SF-GT with kinematic gait data. Criterion standard kinematic gait data. There was 58% mean agreement based on grading categories and 80% mean agreement based on degree estimations evaluated with the least significant difference method. The new SF-GT has good concurrent criterion validity.
The Missing Middle in Validation Research
ERIC Educational Resources Information Center
Taylor, Erwin K.; Griess, Thomas
1976-01-01
In most selection validation research, only the upper and lower tails of the criterion distribution are used, often yielding misleading or incorrect results. Provides formulas and tables which enable the researcher to account more accurately for the distribution of criterion within the middle range of population. (Author/RW)
Mayorga-Vega, Daniel; Merino-Marban, Rafael; Viciana, Jesús
2014-01-01
The main purpose of the present meta-analysis was to examine the scientific literature on the criterion-related validity of sit-and-reach tests for estimating hamstring and lumbar extensibility. For this purpose relevant studies were searched from seven electronic databases dated up through December 2012. Primary outcomes of criterion-related validity were Pearson's zero-order correlation coefficients (r) between sit-and-reach tests and hamstrings and/or lumbar extensibility criterion measures. Then, from the included studies, the Hunter-Schmidt's psychometric meta-analysis approach was conducted to estimate population criterion-related validity of sit-and-reach tests. Firstly, the corrected correlation mean (rp), unaffected by statistical artefacts (i.e., sampling error and measurement error), was calculated separately for each sit-and-reach test. Subsequently, the three potential moderator variables (sex of participants, age of participants, and level of hamstring extensibility) were examined by a partially hierarchical analysis. Of the 34 studies included in the present meta-analysis, 99 correlation values across eight sit-and-reach tests and 51 across seven sit-and-reach tests were retrieved for hamstring and lumbar extensibility, respectively. The overall results showed that all sit-and-reach tests had a moderate mean criterion-related validity for estimating hamstring extensibility (rp = 0.46-0.67), but they had a low mean for estimating lumbar extensibility (rp = 0.16-0.35). Generally, females, adults and participants with high levels of hamstring extensibility tended to have greater mean values of criterion-related validity for estimating hamstring extensibility. When the use of angular tests is limited such as in a school setting or in large scale studies, scientists and practitioners could use the sit-and-reach tests as a useful alternative for hamstring extensibility estimation, but not for estimating lumbar extensibility.
Key Points Overall sit-and-reach tests have a moderate mean criterion-related validity for estimating hamstring extensibility, but they have a low mean validity for estimating lumbar extensibility. Among all the sit-and-reach test protocols, the Classic sit-and-reach test seems to be the best option to estimate hamstring extensibility. End scores (e.g., the Classic sit-and-reach test) are a better indicator of hamstring extensibility than the modifications that incorporate fingers-to-box distance (e.g., the Modified sit-and-reach test). When angular tests such as straight leg raise or knee extension tests cannot be used, sit-and-reach tests seem to be a useful field test alternative to estimate hamstring extensibility, but not to estimate lumbar extensibility. PMID:24570599
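The corrected mean correlation (rp) mentioned in the abstract above can be illustrated with a simplified sketch of a Hunter-Schmidt style calculation. It assumes each correlation is corrected individually for unreliability in both measures and weighted by sample size times the squared attenuation factor; the study tuples are hypothetical, and the published analysis may have handled artefacts differently.

```python
from math import sqrt


def weighted_mean_r(studies):
    """Sample-size weighted mean of the observed correlations."""
    total_n = sum(n for n, _, _, _ in studies)
    return sum(n * r for n, r, _, _ in studies) / total_n


def corrected_mean_r(studies):
    """Mean correlation after correcting each r for measurement error.

    Each study is (n, r, rxx, ryy): sample size, observed correlation, and
    the reliabilities of the field test and of the criterion measure.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for n, r, rxx, ryy in studies:
        attenuation = sqrt(rxx * ryy)   # factor by which r is attenuated
        corrected = r / attenuation     # disattenuated correlation
        weight = n * attenuation ** 2   # larger, more reliable studies count more
        weighted_sum += weight * corrected
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical studies, not values taken from the meta-analysis.
studies = [(100, 0.50, 0.80, 0.90), (200, 0.40, 0.90, 0.90)]
print(round(weighted_mean_r(studies), 3))
print(round(corrected_mean_r(studies), 3))
```

With perfect reliabilities the corrected mean reduces to the plain sample-size weighted mean, which is a quick sanity check on the weighting.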
Turkish Version of Kolcaba's Immobilization Comfort Questionnaire: A Validity and Reliability Study.
Tosun, Betül; Aslan, Özlem; Tunay, Servet; Akyüz, Aygül; Özkan, Hüseyin; Bek, Doğan; Açıksöz, Semra
2015-12-01
The purpose of this study was to determine the validity and reliability of the Turkish version of the Immobilization Comfort Questionnaire (ICQ). The sample used in this methodological study consisted of 121 patients undergoing lower extremity arthroscopy in a training and research hospital. The validity study of the questionnaire assessed language validity, structural validity and criterion validity. Structural validity was evaluated via exploratory factor analysis. Criterion validity was evaluated by assessing the correlation between the visual analog scale (VAS) scores (i.e., the comfort and pain VAS scores) and the ICQ scores using Spearman's correlation test. The Kaiser-Meyer-Olkin coefficient and Bartlett's test of sphericity were used to determine the suitability of the data for factor analysis. Internal consistency was evaluated to determine reliability. The data were analyzed with SPSS version 15.00 for Windows. Descriptive statistics were presented as frequencies, percentages, means and standard deviations. A p value ≤ .05 was considered statistically significant. A moderate positive correlation was found between the ICQ scores and the VAS comfort scores; a moderate negative correlation was found between the ICQ and the VAS pain measures in the criterion validity analysis. Cronbach α values of .75 and .82 were found for the first and second measurements, respectively. The findings of this study reveal that the ICQ is a valid and reliable tool for assessing the comfort of patients in Turkey who are immobilized because of lower extremity orthopedic problems. Copyright © 2015. Published by Elsevier B.V.
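The internal-consistency coefficient reported in the abstract above (Cronbach's α) can be computed from item-level responses as in the minimal sketch below. The function name and the response matrix are hypothetical; the study itself used SPSS and its item data are not reproduced here.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for rows of respondents and columns of items."""
    n_items = len(responses[0])
    if n_items < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    if any(len(row) != n_items for row in responses):
        raise ValueError("every respondent must answer every item")
    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in responses])
                      for i in range(n_items)]
    # Sample variance of the total (summed) score.
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses of five patients to four items.
responses = [
    [3, 4, 3, 4],
    [2, 2, 3, 2],
    [4, 4, 4, 5],
    [1, 2, 1, 2],
    [3, 3, 4, 3],
]
print(round(cronbach_alpha(responses), 2))
```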
A Note on Economic Content and Test Validity.
ERIC Educational Resources Information Center
Soper, John C.; Brenneke, Judith Staley
1987-01-01
Offers practical tips on how teachers can determine whether classroom tests are actually measuring what they are designed to measure. Discusses criterion-related validity, construct validity, and content validity. Demonstrates how to determine the degree of content validity a particular test may have for a particular course or unit. (Author/DH)
Empirical agreement in model validation.
Jebeile, Julie; Barberousse, Anouk
2016-04-01
Empirical agreement is often used as an important criterion when assessing the validity of scientific models. However, it is by no means a sufficient criterion as a model can be so adjusted as to fit available data even though it is based on hypotheses whose plausibility is known to be questionable. Our aim in this paper is to investigate into the uses of empirical agreement within the process of model validation. Copyright © 2015 Elsevier Ltd. All rights reserved.
Validation of the Intrinsic Spirituality Scale (ISS) with Muslims.
Hodge, David R; Zidan, Tarek; Husain, Altaf
2015-12-01
This study validates an existing spirituality measure--the intrinsic spirituality scale (ISS)--for use with Muslims in the United States. A confirmatory factor analysis was conducted with a diverse sample of self-identified Muslims (N = 281). Validity and reliability were assessed along with criterion and concurrent validity. The measurement model fit the data well, normed χ² = 2.50, CFI = 0.99, RMSEA = 0.07, and SRMR = 0.02. All 6 items that comprise the ISS demonstrated satisfactory levels of validity (λ > .70) and reliability (R² > .50). The Cronbach's alpha obtained with the present sample was .93. Appropriate correlations with theoretically linked constructs demonstrated criterion and concurrent validity. The results suggest the ISS is a valid measure of spirituality in clinical settings with the rapidly growing Muslim population. The ISS may, for instance, provide an efficient screening tool to identify Muslims that are particularly likely to benefit from spiritually accommodative treatments. (c) 2015 APA, all rights reserved.
Milian, Monika; Kreitschmann-Andermahr, Ilonka; Siegel, Sonja; Kleist, Bernadette; Führer-Sakel, Dagmar; Honegger, Juergen; Buchfelder, Michael; Psaras, Tsambika
2015-01-01
To evaluate the construct and criterion validity of the Tuebingen Cushing's disease quality of life inventory (Tuebingen CD-25) for application in patients treated for Cushing's disease (CD). A total of 176 patients with adrenocorticotropin hormone-dependent CD (144 of them female, overall mean age 46.1 ± 13.7 years) treated at 3 large tertiary referral centers in Germany were studied. Construct validity was assessed by hypothesis testing (self-perceived symptom reduction assessment) and contrasted groups (patients with vs. without hypercortisolism). For this purpose, already existing data from 55 CD patients were used, representing the hypercortisolemic group. Criterion validity (concurrent validity) was assessed in relation to the Cushing's quality of life questionnaire (CushingQoL), the Short Form 36 health survey (SF-36), and the body mass index (BMI). Patients with self-perceived remarkable symptom reduction had significantly lower Tuebingen CD-25 scores (i.e. better health-related quality of life) than patients with self-perceived insufficient symptom reduction (p < 0.05). Similarly, the mean scores of the Tuebingen CD-25 scales were lower in patients without hypercortisolism (total score 27.0 ± 17.2) compared to those with hypercortisolism (total score 45.3 ± 22.1; each p < 0.05), providing evidence for construct validity. Criterion validity was confirmed by the correlations between the Tuebingen CD-25 total score and the CushingQoL (Spearman's coefficient -0.733), as well as all scales of the SF-36 (Spearman's coefficient between -0.447 and -0.700). The analyses presented in this large-sample study provide robust evidence for the construct and criterion validity of the Tuebingen CD-25. © 2015 S. Karger AG, Basel.
[Validity and Reliability of Korean Version of the Spiritual Care Competence Scale].
Chung, Mi Ja; Park, Youngrye; Eun, Young
2016-12-01
The aim of this study was to examine the validity and reliability of the Korean Version of the Spiritual Care Competence Scale (K-SCCS). A cross-sectional study design was used. The K-SCCS consisted of 26 questions to measure spiritual care competence of nurses. Participants, 228 nurses who had more than 3 years' experience as a nurse, completed the survey. Confirmatory factor analysis was used to examine the construct validity, and correlations between the K-SCCS and spiritual well-being (SWB) were used to examine the criterion validity of the K-SCCS. Cronbach's alpha was used to test internal consistency. The construct and the criterion-related validity of K-SCCS were supported as measures of spiritual care competence. Cronbach's alpha was .95. Factor loadings of the 26 questions ranged from .60 to .96. Construct validity of K-SCCS was verified by confirmatory factor analysis (RMSEA=.08, CFI=.90, NFI=.85). Criterion validity compared to the SWB showed significant correlation (r=.44, p<.001). The findings suggest that K-SCCS serves as an appropriate measure of spiritual care competence with validity and reliability. However, further study is needed to retest the verification of the factor analysis related to factor 2 (professionalisation and improving the quality of spiritual care) and factor 3 (personal support and patient counseling). Therefore, we recommend using the total score without distinguishing subscales.
ERIC Educational Resources Information Center
Harris, Larry P.; Wolf, Steven R.
1979-01-01
The article focuses on the controversy over norm-referenced vs. criterion-referenced measures (CRM) in assessment of learning disorders. The authors contend that while the reliability of CRMs is generally indisputable, the validity of measures designed from local curricula is still dependent on the intuitive judgments of teachers. (Author/SBH)
Validation of the Military Entrance Physical Strength Capacity Test. Technical Report 610.
ERIC Educational Resources Information Center
Myers, David C.; And Others
A battery of physical ability tests was validated using a predictive, criterion-related strategy. The battery was given to 1,003 female soldiers and 980 male soldiers before they had begun Army Basic Training. Criterion measures which represented physical competency in Basic Training (physical proficiency tests, sick call, profiles, and separation…
ERIC Educational Resources Information Center
Mooney, Paul; Lastrapes, Renée E.
2016-01-01
The amount of research evaluating the technical merits of general outcome measures of science and social studies achievement is growing. This study targeted criterion validity for critical content monitoring. Questions addressed the concurrent criterion validity of alternate presentation formats of critical content monitoring and the measure's…
Validation of a Criterion Referenced Test for Young Handicapped Children: PIPER.
ERIC Educational Resources Information Center
Strum, Irene; Shapiro, Madelaine
The purpose of this study was to validate the Prescriptive Instructional Program for Educational Readiness (PIPER) for utilization as a criterion referenced test (CRT) among learning disabled children. The program consisted of behavioral objectives and diagnostic and/or mastery tasks and activities for each objective in the area of gross motor…
Evaluation of Weighted Scale Reliability and Criterion Validity: A Latent Variable Modeling Approach
ERIC Educational Resources Information Center
Raykov, Tenko
2007-01-01
A method is outlined for evaluating the reliability and criterion validity of weighted scales based on sets of unidimensional measures. The approach is developed within the framework of latent variable modeling methodology and is useful for point and interval estimation of these measurement quality coefficients in counseling and education…
Meta-Analysis of Criterion Validity for Curriculum-Based Measurement in Written Language
ERIC Educational Resources Information Center
Romig, John Elwood; Therrien, William J.; Lloyd, John W.
2017-01-01
We used meta-analysis to examine the criterion validity of four scoring procedures used in curriculum-based measurement of written language. A total of 22 articles representing 21 studies (N = 21) met the inclusion criteria. Results indicated that two scoring procedures, correct word sequences and correct minus incorrect sequences, have acceptable…
Jung, Sung-Hoon; Kwon, Oh-Yun; Jeon, In-Cheol; Hwang, Ui-Jae; Weon, Jong-Hyuck
2018-01-01
The purposes of this study were to determine the intra-rater test-retest reliability of a smart phone-based measurement tool (SBMT) and a three-dimensional (3D) motion analysis system for measuring the transverse rotation angle of the pelvis during single-leg lifting (SLL) and the criterion validity of the transverse rotation angle of the pelvis measurement using SBMT compared with a 3D motion analysis system (3DMAS). Seventeen healthy volunteers performed SLL with their dominant leg without bending the knee until they reached a target placed 20 cm above the table. This study used a 3DMAS, considered the gold standard, to measure the transverse rotation angle of the pelvis to assess the criterion validity of the SBMT measurement. Intra-rater test-retest reliability was determined using the SBMT and 3DMAS using intra-class correlation coefficient (ICC) [3,1] values. The criterion validity of the SBMT was assessed with ICC [3,1] values. Both the 3DMAS (ICC = 0.77) and SBMT (ICC = 0.83) showed excellent intra-rater test-retest reliability in the measurement of the transverse rotation angle of the pelvis during SLL in a supine position. Moreover, the SBMT showed an excellent correlation with the 3DMAS (ICC = 0.99). Measurement of the transverse rotation angle of the pelvis using the SBMT showed excellent reliability and criterion validity compared with the 3DMAS.
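For readers unfamiliar with the ICC [3,1] statistic named in the abstract above, the following is a minimal sketch of one common formulation (two-way mixed effects, consistency, single measurement). The function name `icc_3_1` and the input layout are illustrative assumptions, not taken from the study.

```python
def icc_3_1(scores):
    """ICC(3,1): two-way mixed effects, consistency, single measurement.

    scores: list of rows, one row per subject, one column per
    rater, device, or session (hypothetical layout for illustration).
    """
    n = len(scores)        # number of subjects
    k = len(scores[0])     # number of raters / sessions
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares without replication
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_err = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)               # between-subjects mean square
    ms_err = ss_err / ((n - 1) * (k - 1))     # residual mean square
    return (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err)
```

Because this is the consistency form, a constant offset between two devices does not lower the coefficient; an absolute-agreement form would penalize it.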
Kim, Dong Hee; Im, Yeo Jin
2013-02-01
To develop and test the validity and reliability of the Korean version of the Family Management Measure (Korean FaMM) to assess applicability for families with children having chronic illnesses. The Korean FaMM was articulated through forward-backward translation methods. Internal consistency reliability, construct and criterion validity were calculated using PASW WIN (19.0) and AMOS (20.0). Survey data were collected from 341 mothers of children suffering from chronic disease enrolled in a university hospital in Seoul, South Korea. The Korean version of FaMM showed reliable internal consistency with Cronbach's alpha for the total scale of .69-.91. Factor loadings of the 53 items on the six sub-scales ranged from 0.28 to 0.84. The model of six subscales for the Korean FaMM was validated by exploratory and confirmatory factor analysis (χ²<.001, RMR<.05, GFI, AGFI, NFI, NNFI>.08). Criterion validity compared to the Parental Stress Index (PSI) showed significant correlation. The findings of this study demonstrate that the Korean FaMM showed satisfactory construct and criterion validity and reliability. It is useful for measuring the management style of Korean families with children who have a chronic illness.
Convergent, discriminant, and criterion validity of DSM-5 traits.
Yalch, Matthew M; Hopwood, Christopher J
2016-10-01
Section III of the Diagnostic and Statistical Manual of Mental Disorders (5th ed.; DSM-5; American Psychiatric Association, 2013) contains a system for diagnosing personality disorder based in part on assessing 25 maladaptive traits. Initial research suggests that this aspect of the system improves the validity and clinical utility of the Section II Model. The Computer Adaptive Test of Personality Disorder (CAT-PD; Simms et al., 2011) contains many traits similar to those of the DSM-5, as well as several additional traits seemingly not covered in the DSM-5. In this study we evaluate the convergent and discriminant validity between the DSM-5 traits, as assessed by the Personality Inventory for DSM-5 (PID-5; Krueger et al., 2012), and CAT-PD in an undergraduate sample, and test whether traits included in the CAT-PD but not the DSM-5 provide incremental validity in association with clinically relevant criterion variables. Results supported the convergent and discriminant validity of the PID-5 and CAT-PD scales in their assessment of 23 out of 25 DSM-5 traits. DSM-5 traits were consistently associated with 11 criterion variables, despite our having intentionally selected clinically relevant criterion constructs not directly assessed by DSM-5 traits. However, the additional CAT-PD traits provided incremental information above and beyond the DSM-5 traits for all criterion variables examined. These findings support the validity of pathological trait models in general and the DSM-5 and CAT-PD models in particular, while also suggesting that the CAT-PD may include additional traits for consideration in future iterations of the DSM-5 system. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
Mayorga-Vega, Daniel; Bocanegra-Parrilla, Raúl; Ornelas, Martha; Viciana, Jesús
2016-01-01
Objectives: The main purpose of the present meta-analysis was to examine the criterion-related validity of the distance- and time-based walk/run tests for estimating cardiorespiratory fitness among apparently healthy children and adults. Materials and Methods: Relevant studies were searched from seven electronic bibliographic databases up to August 2015 and through other sources. The Hunter-Schmidt’s psychometric meta-analysis approach was conducted to estimate the population criterion-related validity of the following walk/run tests: 5,000 m, 3 miles, 2 miles, 3,000 m, 1.5 miles, 1 mile, 1,000 m, ½ mile, 600 m, 600 yd, ¼ mile, 15 min, 12 min, 9 min, and 6 min. Results: From the 123 included studies, a total of 200 correlation values were analyzed. The overall results showed that the criterion-related validity of the walk/run tests for estimating maximum oxygen uptake ranged from low to moderate (rp = 0.42–0.79), with the 1.5 mile (rp = 0.79, 0.73–0.85) and 12 min walk/run tests (rp = 0.78, 0.72–0.83) having the higher criterion-related validity for distance- and time-based field tests, respectively. The present meta-analysis also showed that sex, age and maximum oxygen uptake level do not seem to affect the criterion-related validity of the walk/run tests. Conclusions: When the evaluation of an individual’s maximum oxygen uptake attained during a laboratory test is not feasible, the 1.5 mile and 12 min walk/run tests represent useful alternatives for estimating cardiorespiratory fitness. As in the assessment with any physical fitness field test, evaluators must be aware that the performance score of the walk/run field tests is simply an estimation and not a direct measure of cardiorespiratory fitness. PMID:26987118
Psychometric evaluation of the Swedish version of Rosenberg's self-esteem scale.
Eklund, Mona; Bäckström, Martin; Hansson, Lars
2018-04-01
The widely used Rosenberg's self-esteem scale (RSES) has not been evaluated for psychometric properties in Sweden. This study aimed at analyzing its factor structure, internal consistency, criterion, convergent and discriminant validity, sensitivity to change, and whether a four-graded Likert-type response scale increased its reliability and validity compared to a yes/no response scale. People with mental illness participating in intervention studies to (1) promote everyday life balance (N = 223) or (2) remedy self-stigma (N = 103) were included. Both samples completed the RSES and questionnaires addressing quality of life and sociodemographic data. Sample 1 also completed instruments chosen to assess convergent and discriminant validity: self-mastery (convergent validity), level of functioning and occupational engagement (discriminant validity). Confirmatory factor analysis (CFA), structural equation modeling, and conventional inferential statistics were used. Based on both samples, the Swedish RSES formed one factor and exhibited high internal consistency (>0.90). The two response scales were equivalent. Criterion validity in relation to quality of life was demonstrated. RSES could distinguish between women and men (women scoring lower) and between diagnostic groups (people with depression scoring lower). Correlations >0.5 with variables chosen to reflect convergent validity and around 0.2 with variables used to address discriminant validity further highlighted the construct validity of RSES. The instrument also showed sensitivity to change. The Swedish RSES exhibited a one-component factor structure and showed good psychometric properties in terms of good internal consistency, criterion, convergent and discriminant validity, and sensitivity to change. The yes/no and the four-graded Likert-type response scales worked equivalently.
Development and psychometric testing of the Cancer Knowledge Scale for Elders.
Su, Ching-Ching; Chen, Yuh-Min; Kuo, Bo-Jein
2009-03-01
To develop the Cancer Knowledge Scale for Elders and test its validity and reliability. The number of elders suffering from cancer is increasing. To facilitate cancer prevention behaviours among elders, they should be educated about cancer. Prior to designing a programme that would respond to the special needs of elders, understanding the cancer-related knowledge within this population was necessary. However, extensive review of the literature revealed a lack of appropriate instruments for measuring cancer-related knowledge. A valid and reliable cancer knowledge scale for elders is necessary. A non-experimental methodological design was used to test the psychometric properties of the Cancer Knowledge Scale for Elders. Item analysis was first performed to screen out items that had low corrected item-total correlation coefficients. Construct validity was examined with a principal component method of exploratory factor analysis. Cancer-related health behaviour was used as the criterion variable to evaluate criterion-related validity. Internal consistency reliability was assessed by the KR-20. Stability was determined by two-week test-retest reliability. The factor analysis yielded a four-factor solution accounting for 49.5% of the variance. For criterion-related validity, cancer knowledge was positively correlated with cancer-related health behaviour (r = 0.78, p < 0.001). The KR-20 coefficients of each factor were 0.85, 0.76, 0.79 and 0.67 and 0.87 for the total scale. Test-retest reliability over a two-week period was 0.83 (p < 0.001). This study provides evidence for content validity, construct validity, criterion-related validity, internal consistency and stability of the Cancer Knowledge Scale for Elders. The results show that this scale is an easy-to-use instrument for elders and has adequate validity and reliability. The scale can be used as an assessment instrument when implementing cancer education programmes for elders. It can also be used to evaluate the effects of education programmes.
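As an illustration of the KR-20 coefficient mentioned in the abstract above, here is a minimal sketch for dichotomously scored (0/1) items. The function name `kr20` and the data layout are hypothetical, and the sketch assumes the population variance (divide by n) of total scores; some texts use the sample variance instead, which gives slightly different values.

```python
def kr20(responses):
    """Kuder-Richardson formula 20 for 0/1 item scores.

    responses: list of rows, one per respondent, each a list of 0/1
    item scores (hypothetical layout for illustration).
    """
    n = len(responses)       # respondents
    k = len(responses[0])    # items

    # Proportion answering each item correctly, and sum of p * (1 - p)
    p = [sum(row[j] for row in responses) / n for j in range(k)]
    sum_pq = sum(pi * (1 - pi) for pi in p)

    # Population variance of total scores (assumption: divide by n)
    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n
    var_total = sum((t - mean_total) ** 2 for t in totals) / n

    return (k / (k - 1)) * (1 - sum_pq / var_total)
```

With 0/1 items and the same variance convention, this quantity coincides with Cronbach's alpha.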
Serel Arslan, S; Demir, N; Karaduman, A A
2017-02-01
This study aimed to develop a scale called Tongue Thrust Rating Scale (TTRS), which categorised tongue thrust in children in terms of its severity during swallowing, and to investigate its validity and reliability. The study describes the developmental phase of the TTRS and presents its content and criterion-based validity and interobserver and intra-observer reliability. For content validation, seven experts assessed the steps in the scale over two Delphi rounds. Two physical therapists evaluated videos of 50 children with cerebral palsy (mean age, 57·9 ± 16·8 months), using the TTRS to test criterion-based validity, interobserver and intra-observer reliability. The Karaduman Chewing Performance Scale (KCPS) and Drooling Severity and Frequency Scale (DSFS) were used for criterion-based validity. All the TTRS steps were deemed necessary. The content validity index was 0·857. A very strong positive correlation was found between two examinations by one physical therapist, which indicated intra-observer reliability (r = 0·938, P < 0·001). A very strong positive correlation was also found between the TTRS scores of two physical therapists, indicating interobserver reliability (r = 0·892, P < 0·001). There was also a strong positive correlation between the TTRS and KCPS (r = 0·724, P < 0·001) and a very strong positive correlation between the TTRS scores and DSFS (r = 0·822 and r = 0·755; P < 0·001). These results demonstrated the criterion-based validity of the TTRS. The TTRS is a valid, reliable and clinically easy-to-use functional instrument to document the severity of tongue thrust in children. © 2016 John Wiley & Sons Ltd.
Rönspies, Jelena; Schmidt, Alexander F; Melnikova, Anna; Krumova, Rosina; Zolfagari, Asadeh; Banse, Rainer
2015-07-01
The present study was conducted to validate an adaptation of the Implicit Relational Assessment Procedure (IRAP) as an indirect latency-based measure of sexual orientation. Furthermore, reliability and criterion validity of the IRAP were compared to two established indirect measures of sexual orientation: a Choice Reaction Time task (CRT) and a Viewing Time (VT) task. A sample of 87 heterosexual and 35 gay men completed all three indirect measures in an online study. The IRAP and the VT predicted sexual orientation nearly perfectly. Both measures also showed a considerable amount of convergent validity. Reliabilities (internal consistencies) reached satisfactory levels. In contrast, the CRT did not tap into sexual orientation in the present study. In sum, the VT measure performed best, with the IRAP showing only slightly lower reliability and criterion validity, whereas the CRT did not yield any evidence of reliability or criterion validity in the present research. The results were discussed in the light of specific task properties of the indirect latency-based measures (task-relevance vs. task-irrelevance).
ERIC Educational Resources Information Center
Kelly, William E.; Lutz, Daniel
2014-01-01
The concurrent criterion validity of the Ausburg Multidimensional Personality Instrument (AMPI) clinical scales was examined. The AMPI and several scales purportedly measuring the same or similar constructs as those of the AMPI clinical scales were administered to two samples of college students (N = 134 and N = 118). The correlations between the…
The Validity of the Modified Sit-and-Reach Test in College-Age Students.
ERIC Educational Resources Information Center
Minkler, Sharin; Patterson, Patricia
1994-01-01
Reports a study that examined the criterion-related validity of the modified sit-and-reach test against criterion measures of hamstring and low back flexibility in college students. Results indicated the modified sit-and-reach test moderately related to hamstring flexibility, but its relation to low back flexibility was low. (SM)
ERIC Educational Resources Information Center
Roth, Philip L.; Buster, Maury A.; Bobko, Philip
2011-01-01
A number of applied psychologists have suggested that trainability test Black-White ethnic group differences are low or relatively low (e.g., Siegel & Bergman, 1975), though data are scarce. Likewise, there are relatively few estimates of criterion-related validity for trainability tests predicting job performance (cf. Robertson & Downs,…
easyCBM® Reading Criterion Related Validity Evidence: Grades K-1. Technical Report #1309
ERIC Educational Resources Information Center
Lai, Cheng-Fei; Alonzo, Julie; Tindal, Gerald
2013-01-01
In this technical report, we present the results of a study to gather criterion-related evidence for Grade K-1 easyCBM® reading measures. We used correlations to examine the relation between the easyCBM® measures and other published measures with known reliability and validity evidence, including the Dynamic Indicators of Basic Early Literacy…
ERIC Educational Resources Information Center
Hirschi, Andreas
2009-01-01
Interest differentiation and elevation are supposed to provide important information about a person's state of interest development, yet little is known about their development and criterion validity. The present study explored these constructs among a group of Swiss adolescents. Study 1 applied a cross-sectional design with 210 students in 11th…
Adolescent Domain Screening Inventory-Short Form: Development and Initial Validation
ERIC Educational Resources Information Center
Corrigan, Matthew J.
2017-01-01
This study sought to develop a short version of the ADSI, and investigate its psychometric properties. Methods: This is a secondary analysis. Analysis to determine the Cronbach's Alpha, correlations to determine concurrent criterion validity and known instrument validity and a logistic regression to determine predictive validity were conducted.…
Renteria, Laura; Li, Susan Tinsley; Pliskin, Neil H
2008-05-01
The utility of the Spanish WAIS-III was investigated by examining its reliability and validity among 100 Spanish-speaking participants. Results indicated that the internal consistency of the subtests was satisfactory, but inadequate for Letter Number Sequencing. Criterion validity was adequate. Convergent and discriminant validity results were generally similar to the North American normative sample. Paired sample t-tests suggested that the WAIS-III may underestimate ability when compared to the criterion measures that were utilized to assess validity. This study provides support for the use of the Spanish WAIS-III in urban Hispanic populations, but also suggests that caution be used when administering specific subtests, due to the nature of the Latin America alphabet and potential test bias.
The Dula dangerous driving index in China: an investigation of reliability and validity.
Qu, Weina; Ge, Yan; Jiang, Caihong; Du, Feng; Zhang, Kan
2014-03-01
The aim of this study was to translate the Dula Dangerous Driving Index (DDDI) into Chinese and to verify its reliability and validity. A total of 246 drivers completed the Chinese version of the DDDI and the Driver Behavior Questionnaire (DBQ). Specific sociodemographic variables and traffic violations were also measured. A confirmatory factor analysis confirmed the internal structure of the DDDI, and the four-factor model was supported in China. Measures of convergent and criterion validity demonstrated that the Chinese DDDI was valid. Its convergent validity was supported by its positive relationship with the DBQ, and its criterion validity was tested using its relationship with self-reported accident involvement and traffic violations. Finally, score comparisons between different demographic groups revealed significant differences, thereby linking age and driving years to dangerous driving. Copyright © 2013 Elsevier Ltd. All rights reserved.
Chen, Poyu; Lin, Keh-Chung; Liing, Rong-Jiuan; Wu, Ching-Yi; Chen, Chia-Ling; Chang, Ku-Chou
2016-06-01
To examine the criterion validity, responsiveness, and minimal clinically important difference (MCID) of the EuroQoL 5-Dimensions Questionnaire (EQ-5D-5L) and visual analog scale (EQ-VAS) in people receiving rehabilitation after stroke. The EQ-5D-5L, along with four criterion measures (the Medical Research Council scales for muscle strength, the Fugl-Meyer assessment, the functional independence measure, and the Stroke Impact Scale), was administered to 65 patients with stroke before and after 3- to 4-week therapy. Criterion validity was estimated using the Spearman correlation coefficient. Responsiveness was analyzed by the effect size, standardized response mean (SRM), and criterion responsiveness. The MCID was determined by anchor-based and distribution-based approaches. The percentage of patients exceeding the MCID was also reported. Concurrent validity of the EQ-Index was better compared with the EQ-VAS. The EQ-Index has better power for predicting the rehabilitation outcome in the activities of daily living than other motor-related outcome measures. The EQ-Index was moderately responsive to change (SRM = 0.63), whereas the EQ-VAS was only mildly responsive to change. The MCID estimation of the EQ-Index (the percentage of patients exceeding the MCID) was 0.10 (33.8 %) and 0.10 (33.8 %) based on the anchor-based and distribution-based approaches, respectively, and the estimation of EQ-VAS was 8.61 (41.5 %) and 10.82 (32.3 %). The EQ-Index, but not the EQ-VAS, has shown reasonable concurrent validity, limited predictive validity, and acceptable responsiveness for detecting the health-related quality of life in stroke patients undergoing rehabilitation. Future research considering different recovery stages after stroke is warranted to validate these estimations.
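The two responsiveness indices named in the abstract above can be sketched as follows, using their common textbook definitions: the effect size divides the mean change by the baseline standard deviation, and the standardized response mean (SRM) divides it by the standard deviation of the change scores. The function name `responsiveness` is a hypothetical helper, and the study's exact computation may differ.

```python
from statistics import mean, stdev


def responsiveness(pre, post):
    """Return (effect_size, srm) for paired pre/post scores.

    pre, post: equal-length sequences of scores for the same patients
    (hypothetical inputs for illustration).
    """
    changes = [after - before for before, after in zip(pre, post)]
    mean_change = mean(changes)

    # Effect size: mean change relative to baseline variability
    effect_size = mean_change / stdev(pre)
    # SRM: mean change relative to variability of the change itself
    srm = mean_change / stdev(changes)
    return effect_size, srm
```

Both indices use the sample standard deviation here; that choice is an assumption of this sketch.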
The brief multidimensional students' life satisfaction scale-college version.
Zullig, Keith J; Huebner, E Scott; Patton, Jon M; Murray, Karen A
2009-01-01
To investigate the psychometric properties of the BMSLSS-College among 723 college students. Internal consistency estimates explored scale reliability, factor analysis explored construct validity, and known-groups validity was assessed using the National College Youth Risk Behavior Survey and Harvard School of Public Health College Alcohol Study. Criterion-related validity was explored through analyses with the CDC's health-related quality of life scale and a social isolation scale. Acceptable internal consistency reliability, construct, known-groups, and criterion-related validity were established. Findings offer preliminary support for the BMSLSS-C; it could be useful in large-scale research studies, applied screening contexts, and for program evaluation purposes toward achieving Healthy People 2010 objectives.
Yılmaz, Emel; Eser, Erhan; Şekuri, Cevad; Kültürsay, Hakan
2011-08-01
The purpose of this study was to describe the psychometric properties of the Myocardial Infarction Dimensional Assessment Scale (MIDAS). This is a methodological cultural adaptation study. The MIDAS consists of 35 items covering seven domains: physical activity, insecurity, emotional reaction, dependency, diet, concerns over medication, and side effects, which are rated on a five-point Likert scale from 1 (never) to 5 (always). The highest score of MIDAS is 100. Quality of life (QOL) decreases as the scale score increases. Overall, 185 myocardial infarction (MI) patients were enrolled in this study. Cronbach alpha was used for the reliability analysis. Criterion validity, structural validity, and sensitivity analyses were used for validity analysis. The New York Heart Association (NYHA) and the Canadian Cardiovascular Society Functional Classifications (CCSFC) were used to test criterion validity, and the SF-36 was used to test the construct validity of the Turkish version of the MIDAS. The range of Cronbach alpha values is 0.79-0.90 for seven domains of the scale. No problematic items were observed for the entire scale. Medication related domains of the MIDAS showed considerable floor effects (35.7%-22.7%). Confirmatory factor analysis indicators [Comparative Fit Index (CFI) = 0.95 and Root Mean Square Error of Approximation (RMSEA) = 0.075] supported the construct validity of MIDAS. Convergent validity of the MIDAS was confirmed by correlation with the SF-36 scale where appropriate. Criterion validity results were also satisfactory when comparing different stages of the NYHA and the CCSFC (p<0.05). Overall results revealed that the Turkish version of the MIDAS is a reliable and valid instrument.
Angers, Magalie; Svotelis, Amy; Balg, Frederic; Allard, Jean-Pascal
2016-04-01
The Ankle Osteoarthritis Scale (AOS) is a self-administered score specific for ankle osteoarthritis (OA) with excellent reliability and strong construct and criterion validity. Many recent randomized multicentre trials have used the AOS, and the involvement of the French-speaking population is limited by the absence of a French version. Our goal was to develop a French version and validate the psychometric properties to assure equivalence to the original English version. Translation was performed according to American Association of Orthopaedic Surgeons (AAOS) 2000 guidelines for cross-cultural adaptation. Similar to the validation process of the English AOS, we evaluated the psychometric properties of the French version (AOS-Fr): criterion validity (AOS-Fr v. Western Ontario and McMaster Universities Arthritis Index [WOMAC] and SF-36 scores), construct validity (AOS-Fr correlation to single heel-lift test), and reliability (AOS-Fr test-retest). Sixty healthy individuals tested a prefinal version of the AOS-Fr for comprehension, leading to modifications and a final version that was approved by C. Saltzman, author of the AOS. We then recruited patients with ankle OA for evaluation of the AOS-Fr psychometric properties. Twenty-eight patients with ankle OA participated in the evaluation. The AOS-Fr showed strong criterion validity (AOS:WOMAC r = 0.709 and AOS:SF-36 r = -0.654) and construct validity (r = 0.664) and proved to be reliable (test-retest intraclass correlation coefficient = 0.922). The AOS-Fr is a reliable and valid score equivalent to the English version in terms of psychometric properties, thus is available for use in multicentre trials.
ERIC Educational Resources Information Center
Bödeker, Malte; Bucksch, Jens; Wallmann-Sperlich, Birgit
2018-01-01
The Neighborhood Physical Activity Questionnaire allows assessment of physical activity within and outside the neighborhood. Study objectives were to examine the criterion-related validity and health/functioning associations of Neighborhood Physical Activity Questionnaire-derived physical activity in German older adults. A total of 107 adults aged…
ERIC Educational Resources Information Center
Naji Qasem, Mamun Ali; Ahmad Gul, Showkeen Bilal
2014-01-01
The study was conducted to determine the effect of item direction (positive or negative) on the factorial construction and criterion-related validity of a Likert scale. The descriptive survey research method was used, and the sample consisted of 510 undergraduate students selected using a random sampling technique. A scale developed by…
ERIC Educational Resources Information Center
Kettler, Ryan J.; Elliott, Stephen N.; Davies, Michael; Griffin, Patrick
2012-01-01
This study addresses the predictive validity of results from a screening system of academic enablers, with a sample of Australian elementary school students, when the criterion variable is end-of-year achievement. The investigation included (a) comparing the predictive validity of a brief criterion-referenced nomination system with more…
easyCBM® Reading Criterion Related Validity Evidence: Grades 2-5. Technical Report #1310
ERIC Educational Resources Information Center
Lai, Cheng-Fei; Alonzo, Julie; Tindal, Gerald
2013-01-01
In this technical report, we present the results of a study to gather criterion-related evidence for Grade 2-5 easyCBM® reading measures. We used correlations to examine the relation between the easyCBM® measures and other published measures with known reliability and validity evidence, including the Gates-MacGinitie Reading Tests and the Dynamic…
A Case for Transforming the Criterion of a Predictive Validity Study
ERIC Educational Resources Information Center
Patterson, Brian F.; Kobrin, Jennifer L.
2011-01-01
This study presents a case for applying a transformation (Box and Cox, 1964) of the criterion used in predictive validity studies. The goals of the transformation were to better meet the assumptions of the linear regression model and to reduce the residual variance of fitted (i.e., predicted) values. Using data for the 2008 cohort of first-time,…
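The Box and Cox (1964) transformation cited in the abstract above has a simple closed form, sketched here for a fixed power parameter. The function name `box_cox` and the choice to pass `lam` explicitly are illustrative assumptions; the study's procedure for selecting the parameter is not described in the abstract.

```python
import math


def box_cox(values, lam):
    """Apply the one-parameter Box-Cox transform to positive values.

    y(lam) = (y**lam - 1) / lam   if lam != 0
    y(lam) = log(y)               if lam == 0
    """
    if any(y <= 0 for y in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return [math.log(y) for y in values]
    return [(y ** lam - 1) / lam for y in values]
```

In practice the parameter is usually chosen by maximum likelihood rather than fixed in advance.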
Tousignant, Michel; Smeesters, Cécil; Breton, Anne-Marie; Breton, Emilie; Corriveau, Hélène
2006-04-01
This study compared range of motion (ROM) measurements using a cervical range of motion device (CROM) and an optoelectronic system (OPTOTRAK). To examine the criterion validity of the CROM for the measurement of cervical ROM on healthy adults. Whereas measurements of cervical ROM are recognized as part of the assessment of patients with neck pain, few devices are available in clinical settings. Two papers published previously showed excellent criterion validity for measurements of cervical flexion/extension and lateral flexion using the CROM. Subjects performed neck rotation, flexion/extension, and lateral flexion while sitting on a wooden chair. The ROM values were measured by the CROM as well as the OPTOTRAK. The cervical rotational ROM values using the CROM demonstrated a good to excellent linear relationship with those using the OPTOTRAK: right rotation, r = 0.89 (95% confidence interval, 0.81-0.94), and left rotation, r = 0.94 (95% confidence interval, 0.90-0.97). Similar results were also obtained for flexion/extension and lateral flexion ROM values. The CROM showed excellent criterion validity for measurements of cervical rotation. We propose using ROM values measured by the CROM as outcome measures for patients with neck pain.
Developing and testing the patient-centred innovation questionnaire for hospital nurses.
Huang, Ching-Yuan; Weng, Rhay-Hung; Wu, Tsung-Chin; Lin, Tzu-En; Hsu, Ching-Tai; Hung, Chiu-Hsia; Tsai, Yu-Chen
2018-03-01
Develop the patient-centred innovation questionnaire for hospital nurses and establish its validity and reliability. Patient-centred care has been adopted by health care managers in their efforts to improve health care quality. It is regarded as a core concept for developing innovation. A cross-sectional study was employed to collect data from hospital nurses in Taiwan. This study was divided into two stages: pilot study and main study. In the main study, 596 valid responses were collected. This study adopted reliability analysis, exploratory factor analysis, confirmatory factor analysis and selected nurse innovation scale as a criterion to test criterion-related validity. Five-dimension patient-centred innovation questionnaire was proposed: access and practicability, co-ordination and communication, sharing power and responsibility, care continuity, family and person focus. Each dimension demonstrated a reliability of 0.89-0.98. All dimensions had acceptable convergent and discriminate validity. The patient-centred innovation questionnaire and nurse innovation scale exhibited a significantly positive correlation. Patient-centred innovation questionnaire not only had a good theoretical basis but also had sufficient reliability and construct validity, and criterion-related validity. Patient-centred innovation questionnaire could give a measure for evaluating the implementation of patient-centred care and could be used as a management tool during the process of nurse innovation. © 2017 John Wiley & Sons Ltd.
Considerations Underlying the Use of Mixed Group Validation
ERIC Educational Resources Information Center
Jewsbury, Paul A.; Bowden, Stephen C.
2013-01-01
Mixed Group Validation (MGV) is an approach for estimating the diagnostic accuracy of tests. MGV is a promising alternative to the more commonly used Known Groups Validation (KGV) approach for estimating diagnostic accuracy. The advantage of MGV lies in the fact that the approach does not require a perfect external validity criterion or gold…
Comparison of two methods of measuring physical activity in South African older adults.
Kolbe-Alexander, Tracy L; Lambert, Estelle V; Harkins, Judith Biletnikoff; Ekelund, Ulf
2006-01-01
The aim of this study was to assess the validity and reliability of the Yale Physical Activity Survey (YPAS) and the short version of the International Physical Activity Questionnaire (IPAQ) in older South African adults. The YPAS includes measures of weekly energy expenditure (EE) for housework, yard work, caregiving, exercise, and recreation. The IPAQ measures total time and EE during vigorous and moderate activity, walking, and sitting. The instruments were administered twice for test-retest reliability (men, n = 52, 68 +/- 5.4 years, and women, n = 70, 66 +/- 5.8 years). Data for criterion validity were obtained from accelerometers. YPAS reliability ranged from r = .44 to .80 for men and r = .59 to .99 for women (p < .0001). IPAQ reliability was lower for men (r = .29 to .76) than for women (r = .46 to .77). Criterion validity of the YPAS was .31 to .54 for men and .26 to .29 for women. The YPAS and short IPAQ had comparable results for reliability and criterion validity.
2013-01-01
Summary of background data Recent smartphones, such as the iPhone, are often equipped with an accelerometer and magnetometer, which, through software applications, can perform various inclinometric functions. Although these applications are intended for recreational use, they have the potential to measure and quantify range of motion. The purpose of this study was to estimate the intra- and inter-rater reliability as well as the criterion validity of the clinometer and compass applications of the iPhone in the assessment of cervical range of motion in healthy participants. Methods The sample consisted of 28 healthy participants. Two examiners measured cervical range of motion of each participant twice using the iPhone (for the estimation of intra- and inter-rater reliability) and once with the CROM (for the estimation of criterion validity). Estimates of reliability and validity were then established using the intraclass correlation coefficient (ICC). Results We observed a moderate intra-rater reliability for each movement (ICC = 0.65-0.85) but a poor inter-rater reliability (ICC < 0.60). For the criterion validity, the ICCs are moderate (>0.50) to good (>0.65) for movements of flexion, extension, lateral flexions and right rotation, but poor (<0.50) for the movement of left rotation. Conclusion We found good intra-rater reliability and lower inter-rater reliability. When compared to the gold standard, these applications showed moderate to good validity. However, before using the iPhone as an outcome measure in clinical settings, studies should be done on patients presenting with cervical problems. PMID:23829201
The Perceived Leadership Communication Questionnaire (PLCQ): Development and Validation.
Schneider, Frank M; Maier, Michaela; Lovrekovic, Sara; Retzbach, Andrea
2015-01-01
The Perceived Leadership Communication Questionnaire (PLCQ) is a short, reliable, and valid instrument for measuring leadership communication from both perspectives of the leader and the follower. Drawing on a communication-based approach to leadership and following a theoretical framework of interpersonal communication processes in organizations, this article describes the development and validation of a one-dimensional 6-item scale in four studies (total N = 604). Results from Studies 1 and 2 provide evidence for the internal consistency and factorial validity of the PLCQ's self-rating version (PLCQ-SR), a version for measuring how leaders perceive their own communication with their followers. Results from Studies 3 and 4 show internal consistency, construct validity, and criterion validity of the PLCQ's other-rating version (PLCQ-OR), a version for measuring how followers perceive the communication of their leaders. Cronbach's α had an average of .80 over the four studies. All confirmatory factor analyses yielded good to excellent model fit indices. Convergent validity was established by average positive correlations of .69 with subdimensions of transformational leadership and leader-member exchange scales. Furthermore, nonsignificant correlations with socially desirable responding indicated discriminant validity. Last, criterion validity was supported by a moderately positive correlation with job satisfaction (r = .31).
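For readers unfamiliar with the internal-consistency statistic reported above, the following is a minimal sketch of how Cronbach's α is conventionally computed from item-level responses. The function name `cronbach_alpha` and the response data are hypothetical illustrations, not taken from the PLCQ studies, and the authors' actual software and settings are not stated in the abstract.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of item columns.

    `items` is a list of k lists; each inner list holds one item's scores,
    with one entry per respondent (all inner lists have the same length).
    """
    k = len(items)
    # Sum of the sample variances of the individual items.
    item_var_sum = sum(variance(col) for col in items)
    # Total score per respondent, then the variance of those totals.
    totals = [sum(scores) for scores in zip(*items)]
    return k / (k - 1) * (1 - item_var_sum / variance(totals))


# Hypothetical responses: 3 items answered by 5 respondents (not study data).
responses = [
    [4, 3, 5, 2, 4],
    [5, 3, 4, 2, 4],
    [4, 2, 5, 3, 5],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 2))
```

The sketch uses sample variances (n - 1 denominator) throughout; using population variances for both the items and the totals gives the same α, because the ratio is unchanged.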
[Evaluation of Suicide Risk Levels in Hospitals: Validity and Reliability Tests].
Macagnino, Sandro; Steinert, Tilman; Uhlmann, Carmen
2018-05-01
Examination of in-hospital suicide risk levels concerning their validity and their reliability. The internal suicide risk levels were evaluated in a cross-sectional study of 163 inpatients. A reliability check was performed by determining the interrater reliability of senior physician, therapist and the responsible nurse. Within the scope of the validity check, we conducted analyses of criterion validity and construct validity. For the total sample an "acceptable" to "good" interrater reliability (Kendall's W = .77) of suicide risk levels was obtained. Schizophrenic disorders showed the lowest values; for personality disorders we found the highest level of interrater reliability. When examining the criterion validity, Item-9 of the BDI-II was substantially correlated with our suicide risk levels (ρ m = .54, p < .01). Within the scope of the construct validity check, affective disorders showed the highest correlation (ρ = .77), compatible also with "convergent validity". They differed from schizophrenic disorders, which showed the least concordance (ρ = .43). In-hospital suicide risk levels may represent an important contribution to the assessment of suicidal behavior of inpatients experiencing psychiatric treatment due to their overall good validity and reliability. © Georg Thieme Verlag KG Stuttgart · New York.
ERIC Educational Resources Information Center
Sánchez-Rosas, Javier; Furlan, Luis Alberto
2017-01-01
Based on the control-value theory of achievement emotions and theory of achievement goals, this research provides evidence of convergent, divergent, and criterion validity of the Spanish Cognitive Test Anxiety Scale (S-CTAS). A sample of Argentinean undergraduates responded to several scales administered at three points. At time 1 and 3, the…
ERIC Educational Resources Information Center
Willoughby, Michael T.; Blair, Clancy B.; Wirth, R. J.; Greenberg, Mark
2010-01-01
In this study, the authors examined the psychometric properties and criterion validity of a newly developed battery of tasks that were designed to assess executive function (EF) abilities in early childhood. The battery was included in the 36-month assessment of the Family Life Project (FLP), a prospective longitudinal study of 1,292 children…
ERIC Educational Resources Information Center
Abdekhodaie, Zahra; Tabatabaei, Seyed Mahmood; Gholizadeh, Mortaza
2012-01-01
In this study, the prevalence of attention-deficit hyperactivity disorder (ADHD) in kindergarten children in northeast Iran was investigated, and the criterion validity of Conners' parent-teacher questionnaire was evaluated through the use of clinical interviews. This study was a cross-sectional descriptive research project with children in…
ERIC Educational Resources Information Center
Maljaars, Jarymke; Noens, Ilse; Scholte, Evert; van Berckelaer-Onnes, Ina
2012-01-01
The Diagnostic Interview for Social and Communication Disorders (DISCO; Wing, 2006) is a standardized, semi-structured and interviewer-based schedule for diagnosis of autism spectrum disorder (ASD). The objective of this study was to evaluate the criterion and convergent validity of the DISCO-11 ICD-10 algorithm in young and low-functioning…
ERIC Educational Resources Information Center
Tolin, David F.; Steenkamp, Maria M.; Marx, Brian P.; Litz, Brett T.
2010-01-01
Although validity scales of the Minnesota Multiphasic Personality Inventory-2 (MMPI-2; J. N. Butcher, W. G. Dahlstrom, J. R. Graham, A. Tellegen, & B. Kaemmer, 1989) have proven useful in the detection of symptom exaggeration in criterion-group validation (CGV) studies, usually comparing instructed feigners with known patient groups, the…
ERIC Educational Resources Information Center
Watson, David; O'Hara, Michael W.; Chmielewski, Michael; McDade-Montez, Elizabeth A.; Koffel, Erin; Naragon, Kristin; Stuart, Scott
2008-01-01
The authors explicated the validity of the Inventory of Depression and Anxiety Symptoms (IDAS; D. Watson et al., 2007) in 2 samples (306 college students and 605 psychiatric patients). The IDAS scales showed strong convergent validity in relation to parallel interview-based scores on the Clinician Rating version of the IDAS; the mean convergent…
The Arthroscopic Surgical Skill Evaluation Tool (ASSET).
Koehler, Ryan J; Amsdell, Simon; Arendt, Elizabeth A; Bisson, Leslie J; Braman, Jonathan P; Bramen, Jonathan P; Butler, Aaron; Cosgarea, Andrew J; Harner, Christopher D; Garrett, William E; Olson, Tyson; Warme, Winston J; Nicandri, Gregg T
2013-06-01
Surgeries employing arthroscopic techniques are among the most commonly performed in orthopaedic clinical practice; however, valid and reliable methods of assessing the arthroscopic skill of orthopaedic surgeons are lacking. The Arthroscopic Surgery Skill Evaluation Tool (ASSET) will demonstrate content validity, concurrent criterion-oriented validity, and reliability when used to assess the technical ability of surgeons performing diagnostic knee arthroscopic surgery on cadaveric specimens. Cross-sectional study; Level of evidence, 3. Content validity was determined by a group of 7 experts using the Delphi method. Intra-articular performance of a right and left diagnostic knee arthroscopic procedure was recorded for 28 residents and 2 sports medicine fellowship-trained attending surgeons. Surgeon performance was assessed by 2 blinded raters using the ASSET. Concurrent criterion-oriented validity, interrater reliability, and test-retest reliability were evaluated. Content validity: The content development group identified 8 arthroscopic skill domains to evaluate using the ASSET. Concurrent criterion-oriented validity: Significant differences in the total ASSET score (P < .05) between novice, intermediate, and advanced experience groups were identified. Interrater reliability: The ASSET scores assigned by each rater were strongly correlated (r = 0.91, P < .01), and the intraclass correlation coefficient between raters for the total ASSET score was 0.90. Test-retest reliability: There was a significant correlation between ASSET scores for both procedures attempted by each surgeon (r = 0.79, P < .01). The ASSET appears to be a useful, valid, and reliable method for assessing surgeon performance of diagnostic knee arthroscopic surgery in cadaveric specimens. Studies are ongoing to determine its generalizability to other procedures as well as to the live operating room and other simulated environments.
Helmerhorst, Hendrik J F; Brage, Søren; Warren, Janet; Besson, Herve; Ekelund, Ulf
2012-08-31
Physical inactivity is one of the four leading risk factors for global mortality. Accurate measurement of physical activity (PA) and in particular by physical activity questionnaires (PAQs) remains a challenge. The aim of this paper is to provide an updated systematic review of the reliability and validity characteristics of existing and more recently developed PAQs and to quantitatively compare the performance between existing and newly developed PAQs. A literature search of electronic databases was performed for studies assessing reliability and validity data of PAQs using an objective criterion measurement of PA between January 1997 and December 2011. Articles meeting the inclusion criteria were screened and data were extracted to provide a systematic overview of measurement properties. Due to differences in reported outcomes and criterion methods a quantitative meta-analysis was not possible. In total, 31 studies testing 34 newly developed PAQs, and 65 studies examining 96 existing PAQs were included. Very few PAQs showed good results on both reliability and validity. Median reliability correlation coefficients were 0.62-0.71 for existing, and 0.74-0.76 for new PAQs. Median validity coefficients ranged from 0.30-0.39 for existing, and from 0.25-0.41 for new PAQs. Although the majority of PAQs appear to have acceptable reliability, the validity is moderate at best. Newly developed PAQs do not appear to perform substantially better than existing PAQs in terms of reliability and validity. Future PAQ studies should include measures of absolute validity and the error structure of the instrument.
Validity and Reliability of the Upper Extremity Work Demands Scale.
Jacobs, Nora W; Berduszek, Redmar J; Dijkstra, Pieter U; van der Sluis, Corry K
2017-12-01
Purpose To evaluate validity and reliability of the upper extremity work demands (UEWD) scale. Methods Participants from different levels of physical work demands, based on the Dictionary of Occupational Titles categories, were included. A historical database of 74 workers was added for factor analysis. Criterion validity was evaluated by comparing observed and self-reported UEWD scores. To assess structural validity, a factor analysis was executed. For reliability, the difference between two self-reported UEWD scores, the smallest detectable change (SDC), test-retest reliability and internal consistency were determined. Results Fifty-four participants were observed at work and 51 of them filled in the UEWD twice with a mean interval of 16.6 days (SD 3.3, range = 10-25 days). Criterion validity of the UEWD scale was moderate (r = .44, p = .001). Factor analysis revealed that 'force and posture' and 'repetition' subscales could be distinguished with Cronbach's alpha of .79 and .84, respectively. Reliability was good; there was no significant difference between repeated measurements. An SDC of 5.0 was found. Test-retest reliability was good (intraclass correlation coefficient for agreement = .84) and all item-total correlations were >.30. There were two pairs of highly related items. Conclusion Reliability of the UEWD scale was good, but criterion validity was moderate. Based on current results, a modified UEWD scale (2 items removed, 1 item reworded, divided into 2 subscales) was proposed. Since observation appeared to be an inappropriate gold standard, we advise to investigate other types of validity, such as construct validity, in further research.
2012-01-01
Physical inactivity is one of the four leading risk factors for global mortality. Accurate measurement of physical activity (PA) and in particular by physical activity questionnaires (PAQs) remains a challenge. The aim of this paper is to provide an updated systematic review of the reliability and validity characteristics of existing and more recently developed PAQs and to quantitatively compare the performance between existing and newly developed PAQs. A literature search of electronic databases was performed for studies assessing reliability and validity data of PAQs using an objective criterion measurement of PA between January 1997 and December 2011. Articles meeting the inclusion criteria were screened and data were extracted to provide a systematic overview of measurement properties. Due to differences in reported outcomes and criterion methods a quantitative meta-analysis was not possible. In total, 31 studies testing 34 newly developed PAQs, and 65 studies examining 96 existing PAQs were included. Very few PAQs showed good results on both reliability and validity. Median reliability correlation coefficients were 0.62–0.71 for existing, and 0.74–0.76 for new PAQs. Median validity coefficients ranged from 0.30–0.39 for existing, and from 0.25–0.41 for new PAQs. Although the majority of PAQs appear to have acceptable reliability, the validity is moderate at best. Newly developed PAQs do not appear to perform substantially better than existing PAQs in terms of reliability and validity. Future PAQ studies should include measures of absolute validity and the error structure of the instrument. PMID:22938557
Yee, Chee-Seng; Farewell, Vernon; Isenberg, David A; Rahman, Anisur; Teh, Lee-Suan; Griffiths, Bridget; Bruce, Ian N; Ahmad, Yasmeen; Prabu, Athiveeraramapandian; Akil, Mohammed; McHugh, Neil; D'Cruz, David; Khamashta, Munther A; Maddison, Peter; Gordon, Caroline
2007-01-01
Objective To determine the construct and criterion validity of the British Isles Lupus Assessment Group 2004 (BILAG-2004) index for assessing disease activity in systemic lupus erythematosus (SLE). Methods Patients with SLE were recruited into a multicenter cross-sectional study. Data on SLE disease activity (scores on the BILAG-2004 index, Classic BILAG index, and Systemic Lupus Erythematosus Disease Activity Index 2000 [SLEDAI-2K]), investigations, and therapy were collected. Overall BILAG-2004 and overall Classic BILAG scores were determined by the highest score achieved in any of the individual systems in the respective index. Erythrocyte sedimentation rates (ESRs), C3 levels, C4 levels, anti–double-stranded DNA (anti-dsDNA) levels, and SLEDAI-2K scores were used in the analysis of construct validity, and increase in therapy was used as the criterion for active disease in the analysis of criterion validity. Statistical analyses were performed using ordinal logistic regression for construct validity and logistic regression for criterion validity. Sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) were calculated. Results Of the 369 patients with SLE, 92.7% were women, 59.9% were white, 18.4% were Afro-Caribbean and 18.4% were South Asian. Their mean ± SD age was 41.6 ± 13.2 years and mean disease duration was 8.8 ± 7.7 years. More than 1 assessment was obtained on 88.6% of the patients, and a total of 1,510 assessments were obtained. Increasing overall scores on the BILAG-2004 index were associated with increasing ESRs, decreasing C3 levels, decreasing C4 levels, elevated anti-dsDNA levels, and increasing SLEDAI-2K scores (all P < 0.01). Increase in therapy was observed more frequently in patients with overall BILAG-2004 scores reflecting higher disease activity. 
Scores indicating active disease (overall BILAG-2004 scores of A and B) were significantly associated with increase in therapy (odds ratio [OR] 19.3, P < 0.01). The BILAG-2004 and Classic BILAG indices had comparable sensitivity, specificity, PPV, and NPV. Conclusion These findings show that the BILAG-2004 index has construct and criterion validity. PMID:18050213
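The four accuracy indices named in this abstract follow directly from a 2x2 table of test result against criterion. The sketch below shows the standard definitions; the function name `diagnostic_metrics` and the counts are hypothetical and are not the BILAG-2004 study data, which the abstract does not report at cell level.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV and NPV from 2x2 cell counts.

    tp: test positive, criterion positive   fp: test positive, criterion negative
    fn: test negative, criterion positive   tn: test negative, criterion negative
    """
    return {
        "sensitivity": tp / (tp + fn),  # share of criterion-positives detected
        "specificity": tn / (tn + fp),  # share of criterion-negatives cleared
        "ppv": tp / (tp + fp),          # share of test-positives that are true
        "npv": tn / (tn + fn),          # share of test-negatives that are true
    }


# Hypothetical counts for illustration only.
m = diagnostic_metrics(tp=80, fp=30, fn=20, tn=170)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

Note that PPV and NPV depend on how common the criterion-positive state is in the sample, whereas sensitivity and specificity do not, which is one reason studies often report all four.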
Matsuzaki, Mika; Sullivan, Ruth; Ekelund, Ulf; Krishna, K V Radha; Kulkarni, Bharati; Collier, Tim; Ben-Shlomo, Yoav; Kinra, Sanjay; Kuper, Hannah
2016-01-19
There is limited availability of context-specific physical activity questionnaires in low and middle income countries. The aim of this study was to develop and examine the validity of a new Indian physical activity questionnaire, the Andhra Pradesh Children and Parent Study Physical Activity Questionnaire (APCAPS-PAQ). The current study was conducted with the cohort from the Hyderabad DXA Study (n = 2321), recruited in 2009-2010. Criterion validity (n = 245) was examined by comparing the APCAPS-PAQ to a combined heart rate and motion sensor worn for 8 days. Construct validity (n = 2321) was assessed with linear regression, comparing APCAPS-PAQ against BMI, percent body fat, and pulse rate. The APCAPS-PAQ criterion validity was variable depending on the PA intensity groups (ρ = 0.26, 0.07, 0.39; κ = 0.14, 0.04, 0.16 for sedentary, light, moderate/vigorous physical activity (MVPA) respectively). Sedentary and light intensity activities from the questionnaire were underestimated when compared to the criterion data while MVPA in APCAPS-PAQ was overestimated. Higher time spent in sedentary activity in APCAPS-PAQ was associated with higher BMI and percent body fat, suggesting construct validity. The APCAPS-PAQ validity is comparable to other physical activity questionnaires. This tool is able to assess sedentary behavior, moderate/vigorous activity and physical activity energy expenditure on a group level with reasonable validity. This new questionnaire may be used for ranking individuals according to their sedentary time and physical activity in southern India.
[Development and validity of workplace bullying in nursing-type inventory (WPBN-TI)].
Lee, Younju; Lee, Mihyoung
2014-04-01
The purpose of this study was to develop an instrument to assess bullying of nurses, and test the validity and reliability of the instrument. The initial thirty items of WPBN-TI were identified through a review of the literature on types of bullying related to nursing and in-depth interviews with 14 nurses who experienced bullying at work. Sixteen items were developed through 2 content validity tests by 9 experts and 10 nurses. The final WPBN-TI instrument was evaluated by 458 nurses from five general hospitals in the Incheon metropolitan area. The SPSS 18.0 program was used to assess the instrument based on internal consistency reliability, construct validity, and criterion validity. WPBN-TI consisted of 16 items with three distinct factors (verbal and nonverbal bullying, work-related bullying, and external threats), which explained 60.3% of the total variance. The convergent validity and discriminant validity for WPBN-TI were 100.0% and 89.7%, respectively. Known-groups validity of WPBN-TI was supported by the mean difference between groups defined by subjective perception of bullying. Criterion validity for WPBN-TI was satisfactory, at more than .70. The reliability of WPBN-TI was a Cronbach's α of .91. WPBN-TI, with high validity and reliability, is suitable for determining types of bullying in the nursing workplace.
Construction and Validation of the Perceived Opportunity to Craft Scale.
van Wingerden, Jessica; Niks, Irene M W
2017-01-01
We developed and validated a scale to measure employees' perceived opportunity to craft (POC) in two separate studies conducted in the Netherlands (total N = 2329). POC is defined as employees' perception of their opportunity to craft their job. In Study 1, the perceived opportunity to craft scale (POCS) was developed and tested for its factor structure and reliability in an explorative way. Study 2 consisted of confirmatory analyses of the factor structure and reliability of the scale as well as examination of the discriminant and criterion-related validity of the POCS. The results indicated that the scale consists of one dimension and could be reliably measured with five items. Evidence was found for the discriminant validity of the POCS. The scale also showed criterion-related validity when correlated with job crafting (+), job resources (autonomy +; opportunities for professional development +), work engagement (+), and the inactive construct cynicism (-). We discuss the implications of these findings for theory and practice.
Numerical and Experimental Validation of a New Damage Initiation Criterion
NASA Astrophysics Data System (ADS)
Sadhinoch, M.; Atzema, E. H.; Perdahcioglu, E. S.; van den Boogaard, A. H.
2017-09-01
Most commercial finite element software packages, like Abaqus, have a built-in coupled damage model where a damage evolution needs to be defined in terms of a single fracture energy value for all stress states. The Johnson-Cook criterion has been modified to be Lode parameter dependent and this Modified Johnson-Cook (MJC) criterion is used as a Damage Initiation Surface (DIS) in combination with the built-in Abaqus ductile damage model. An exponential damage evolution law has been used with a single fracture energy value. Ultimately, the simulated force-displacement curves are compared with experiments to validate the MJC criterion. 7 out of 9 fracture experiments were predicted accurately. The limitations and accuracy of the failure predictions of the newly developed damage initiation criterion will be discussed shortly.
van der Ploeg, Hidde P; Streppel, Kitty R M; van der Beek, Allard J; van der Woude, Luc H V; Vollenbroek-Hutten, Miriam; van Mechelen, Willem
2007-01-01
The objective was to determine the test-retest reliability and criterion validity of the Physical Activity Scale for Individuals with Physical Disabilities (PASIPD). Forty-five non-wheelchair dependent subjects were recruited from three Dutch rehabilitation centers. Subjects' diagnoses were: stroke, spinal cord injury, whiplash, and neurological-, orthopedic- or back disorders. The PASIPD is a 7-d recall physical activity questionnaire that was completed twice, 1 wk apart. During this week, physical activity was also measured with an Actigraph accelerometer. The test-retest reliability Spearman correlation of the PASIPD was 0.77. The criterion validity Spearman correlation was 0.30 when compared to the accelerometer. The PASIPD had test-retest reliability and criterion validity that is comparable to well established self-report physical activity questionnaires from the general population.
Vanwolleghem, Griet; Van Dyck, Delfien; Ducheyne, Fabian; De Bourdeaudhuij, Ilse; Cardon, Greet
2014-06-10
Google Street View provides a valuable and efficient alternative to observe the physical environment compared to on-site fieldwork. However, studies on the use, reliability and validity of Google Street View in a cycling-to-school context are lacking. We aimed to study the intra-, inter-rater reliability and criterion validity of EGA-Cycling (Environmental Google Street View Based Audit - Cycling to school), a newly developed audit using Google Street View to assess the physical environment along cycling routes to school. Parents (n = 52) of 11-to-12-year old Flemish children, who mostly cycled to school, completed a questionnaire and identified their child's cycling route to school on a street map. Fifty cycling routes of 11-to-12-year olds were identified and physical environmental characteristics along the identified routes were rated with EGA-Cycling (5 subscales; 37 items), based on Google Street View. To assess reliability, two researchers performed the audit. Criterion validity of the audit was examined by comparing the ratings based on Google Street View with ratings through on-site assessments. Intra-rater reliability was high (kappa range 0.47-1.00). Large variations in the inter-rater reliability (kappa range -0.03-1.00) and criterion validity scores (kappa range -0.06-1.00) were reported, with acceptable inter-rater reliability values for 43% of all items and acceptable criterion validity for 54% of all items. EGA-Cycling can be used to assess physical environmental characteristics along cycling routes to school. However, to assess the micro-environment specifically related to cycling, on-site assessments have to be added.
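The kappa ranges reported above, including the slightly negative lower bounds, are easier to interpret with the definition in hand. Below is a minimal sketch of unweighted Cohen's kappa for two sets of categorical ratings; the function name `cohen_kappa` and the ratings are hypothetical, and the abstract does not say whether the authors used a weighted or unweighted variant.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two equal-length lists of category labels.

    Undefined (division by zero) when expected agreement is exactly 1,
    i.e. when both raters use a single identical category throughout.
    """
    n = len(ratings_a)
    # Observed proportion of items on which the two ratings agree.
    p_observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies.
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    categories = count_a.keys() | count_b.keys()
    p_expected = sum(count_a[c] * count_b[c] for c in categories) / n**2
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical item ratings (1 = feature present, 0 = absent), not study data.
street_view = [1, 1, 0, 0]
on_site_same = [1, 1, 0, 0]
on_site_chance = [1, 0, 1, 0]
print(cohen_kappa(street_view, on_site_same))    # full agreement
print(cohen_kappa(street_view, on_site_chance))  # agreement at chance level
```

A kappa below zero, as at the lower end of the ranges in the abstract, means the two ratings agreed less often than their marginal frequencies alone would predict.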
An evidence-based decision assistance model for predicting training outcome in juvenile guide dogs.
Harvey, Naomi D; Craigon, Peter J; Blythe, Simon A; England, Gary C W; Asher, Lucy
2017-01-01
Working dog organisations, such as Guide Dogs, need to regularly assess the behaviour of the dogs they train. In this study we developed a questionnaire-style behaviour assessment completed by training supervisors of juvenile guide dogs aged 5, 8 and 12 months old (n = 1,401), and evaluated aspects of its reliability and validity. Specifically, internal reliability, temporal consistency, construct validity, predictive criterion validity (comparing against later training outcome) and concurrent criterion validity (comparing against a standardised behaviour test) were evaluated. Thirty-nine questions were sourced either from previously published literature or created to meet requirements identified via Guide Dogs staff surveys and staff feedback. Internal reliability analyses revealed seven reliable and interpretable trait scales named according to the questions within them as: Adaptability; Body Sensitivity; Distractibility; Excitability; General Anxiety; Trainability and Stair Anxiety. Intra-individual temporal consistency of the scale scores between 5-8, 8-12 and 5-12 months was high. All scales except Body Sensitivity showed some degree of concurrent criterion validity. Predictive criterion validity was supported for all seven scales, since associations were found with training outcome at at least one age. Thresholds of z-scores on the scales were identified that were able to distinguish later training outcome by identifying 8.4% of all dogs withdrawn for behaviour and 8.5% of all qualified dogs, with 84% and 85% specificity. The questionnaire assessment was reliable and could detect traits that are consistent within individuals over time, despite juvenile dogs undergoing development during the study period. By applying thresholds to scores produced from the questionnaire this assessment could prove to be a highly valuable decision-making tool for Guide Dogs. 
This is the first questionnaire-style assessment of juvenile dogs that has shown value in predicting the training outcome of individual working dogs.
ERIC Educational Resources Information Center
Deng, Weiling; Monfils, Lora
2017-01-01
Using simulated data, this study examined the impact of different levels of stringency of the valid case inclusion criterion on item response theory (IRT)-based true score equating over 5 years in the context of K-12 assessment when growth in student achievement is expected. Findings indicate that the use of the most stringent inclusion criterion…
ERIC Educational Resources Information Center
Wray, Kraig; Lai, Cheng-Fei; Sáez, Leilani; Alonzo, Julie; Tindal, Gerald
2013-01-01
We report the results of an alternate form reliability and criterion validity study of kindergarten and grade 1 (N = 84-199) reading measures from the easyCBM© assessment system and Stanford Early School Achievement Test/Stanford Achievement Test, 10th edition (SESAT/SAT-10) across 5 time points. The alternate form reliabilities ranged from…
Empirical Validation of Reading Proficiency Guidelines
ERIC Educational Resources Information Center
Clifford, Ray; Cox, Troy L.
2013-01-01
The validation of ability scales describing multidimensional skills is always challenging, but not impossible. This study applies a multistage, criterion-referenced approach that uses a framework of aligned texts and reading tasks to explore the validity of the ACTFL and related reading proficiency guidelines. Rasch measurement and statistical…
NASA Astrophysics Data System (ADS)
Ji, Bing; Tsai, Chin-Chun; Stwalley, William C.
1995-04-01
A modified internuclear distance criterion, $R_{LR-m}$, as the lower bound for the region of validity of the inverse-power expansion of the diatomic long-range potential is proposed. This new criterion takes into account the spatial orientation of the atomic orbitals while retaining the simplicity of the traditional Le Roy radius, $R_{LR}$, for the interaction of S state atoms. Recent experimental and theoretical results for various excited states in Na$_2$ suggest that this proposed $R_{LR-m}$ is an appropriate generalization of $R_{LR}$.
Developing a short measure of organizational justice: a multisample health professionals study.
Elovainio, Marko; Heponiemi, Tarja; Kuusio, Hannamaria; Sinervo, Timo; Hintsa, Taina; Aalto, Anna-Mari
2010-11-01
To develop and test the validity of a short version of the original questionnaire measuring organizational justice. The study samples comprised working physicians (N = 2792) and registered nurses (n = 2137) from the Finnish Health Professionals study. Structural equation modelling was applied to test structural validity, using the justice scales. Furthermore, criterion validity was explored with well-being (sleeping problems) and health indicators (psychological distress/self-rated health). The short version of the organizational justice questionnaire (eight items) provides satisfactory psychometric properties (internal consistency, a good model fit of the data). All scales were associated with an increased risk of sleeping problems and psychological distress, indicating satisfactory criterion validity. This short version of the organizational justice questionnaire provides a useful tool for epidemiological studies focused on health-adverse effects of work environment.
De Cocker, K; Cardon, G; De Bourdeaudhuij, I
2006-01-01
Objectives To evaluate if inexpensive Stepping Meters are valid in counting steps in adults in free living conditions. Methods For six days, 35 healthy volunteers wore a criterion Yamax Digiwalker and five Stepping Meters every day until all 973 pedometers had been tested. Steps were recorded daily, and the differences between counts from the Digiwalker and the Stepping Meter were expressed as a percentage of the valid value of the Digiwalker step counts. The criterion used to determine if a Stepping Meter was valid was a maximum deviation of 10% from the Digiwalker step counts. Results A total of 252 (25.9%) Stepping Meters met the criterion, whereas 74.1% made an overestimation or underestimation of more than 10%. In more than one third (36.6%) of the invalid Stepping Meters, the deviation was greater than 50%. Most (64.8%) of the invalid pedometers overestimated the actual steps taken. Conclusions Inexpensive Stepping Meters cannot be used in community interventions as they will give participants the wrong message. PMID:16790485
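The validity rule described in this abstract is simple arithmetic: the difference between the two devices is expressed as a percentage of the criterion count and compared against a 10% limit. A minimal sketch follows; the function names and step counts are hypothetical, and treating a deviation of exactly 10% as valid is an assumption, since the abstract only says "a maximum deviation of 10%".

```python
def percent_deviation(test_steps, criterion_steps):
    """Signed deviation of the test device, as a percentage of the criterion count.

    Positive values mean the test device overestimated; negative, underestimated.
    """
    return 100.0 * (test_steps - criterion_steps) / criterion_steps


def is_valid(test_steps, criterion_steps, max_deviation=10.0):
    """True if the absolute deviation is within the allowed limit (inclusive, by assumption)."""
    return abs(percent_deviation(test_steps, criterion_steps)) <= max_deviation


# Hypothetical daily counts: (test pedometer, criterion pedometer).
days = [(10900, 10000), (8500, 10000), (16000, 10000)]
for test, criterion in days:
    print(test, percent_deviation(test, criterion), is_valid(test, criterion))
```

The signed value also separates overestimation from underestimation, the distinction the abstract draws when it reports that most invalid pedometers overestimated steps.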
ERIC Educational Resources Information Center
Brown, James M.; Chang, Gerald
1982-01-01
The predictive validity of the Minnesota Reading Assessment (MRA) when used to project potential performance of postsecondary vocational-technical education students was examined. Findings confirmed the MRA to be a valid predictor, although the error in prediction varied between the criterion variables. (Author/GK)
Standards Performance Continuum: Development and Validation of a Measure of Effective Pedagogy.
ERIC Educational Resources Information Center
Doherty, R. William; Hilberg, R. Soleste; Epaloose, Georgia; Tharp, Roland G.
2002-01-01
Describes the development and validation of the Standards Performance Continuum (SPC) for assessing teacher performance of the Standards for Effective Pedagogy. Three studies involving Florida, California, and New Mexico public school teachers provided evidence of inter-rater reliability, concurrent validity, and criterion-related validity…
The Reliability and Validity of the Coopersmith Self-Esteem Inventory-Form B.
ERIC Educational Resources Information Center
Chiu, Lian-Hwang
1985-01-01
The purpose of this study was to determine the test-retest reliability and concurrent validity of the short form (Form B) of the Coopersmith Self-Esteem Inventory. Criterion measures for validity included: (1) sociometric measures; (2) teacher's popularity ranking; and, (3) self-esteem rating. (Author/LMO)
Current Concerns in Validity Theory.
ERIC Educational Resources Information Center
Kane, Michael
Validity is concerned with the clarification and justification of the intended interpretations and uses of observed scores. It has not been easy to formulate a general methodology, or set of principles, for validation, but progress has been made, especially as the field has moved from relatively limited criterion-related models to sophisticated…
ERIC Educational Resources Information Center
Michael, William B.; Colson, Kenneth R.
1979-01-01
The construction and validation of the Life Experience Inventory (LEI) for the identification of creative electrical engineers are described. Using the number of patents held or pending as a criterion measure, the LEI was found to have high concurrent validity. (JKS)
Validation of the Lollipop Test: A Diagnostic Screening Test of School Readiness.
ERIC Educational Resources Information Center
Chew, Alex L.; Morris, John D.
1984-01-01
The validity of the Lollipop Test: A Diagnostic Screening Test of School Readiness was examined using the Metropolitan Readiness Test (MRT), Level I, Form Q, as the criterion. Appreciable concurrent validity was found across test batteries. Implications for school readiness screening are discussed. (Author/BS)
Concurrent Validity of the TONI-3
ERIC Educational Resources Information Center
Banks, Sandra H.; Franzen, Michael D.
2010-01-01
The literature pertaining to intelligence assessment reveals an ongoing discussion about the areas of intelligence captured by nonverbal tests. To date, few studies have investigated the criterion validity of the Test of Nonverbal Intelligence, Third Edition (TONI-3). The present study investigates the concurrent validity of the TONI-3 in a sample…
The Arthroscopic Surgical Skill Evaluation Tool (ASSET)
Koehler, Ryan J.; Amsdell, Simon; Arendt, Elizabeth A; Bisson, Leslie J; Braman, Jonathan P; Butler, Aaron; Cosgarea, Andrew J; Harner, Christopher D; Garrett, William E; Olson, Tyson; Warme, Winston J.; Nicandri, Gregg T.
2014-01-01
Background Surgeries employing arthroscopic techniques are among the most commonly performed in orthopaedic clinical practice; however, valid and reliable methods of assessing the arthroscopic skill of orthopaedic surgeons are lacking. Hypothesis The Arthroscopic Surgery Skill Evaluation Tool (ASSET) will demonstrate content validity, concurrent criterion-oriented validity, and reliability, when used to assess the technical ability of surgeons performing diagnostic knee arthroscopy on cadaveric specimens. Study Design Cross-sectional study; Level of evidence, 3. Methods Content validity was determined by a group of seven experts using a Delphi process. Intra-articular performance of a right and left diagnostic knee arthroscopy was recorded for twenty-eight residents and two sports medicine fellowship-trained attending surgeons. Subject performance was assessed by two blinded raters using the ASSET. Concurrent criterion-oriented validity, inter-rater reliability, and test-retest reliability were evaluated. Results Content validity: The content development group identified 8 arthroscopic skill domains to evaluate using the ASSET. Concurrent criterion-oriented validity: Significant differences in total ASSET score (p<0.05) between novice, intermediate, and advanced experience groups were identified. Inter-rater reliability: The ASSET scores assigned by each rater were strongly correlated (r=0.91, p<0.01) and the intra-class correlation coefficient between raters for the total ASSET score was 0.90. Test-retest reliability: there was a significant correlation between ASSET scores for both procedures attempted by each individual (r=0.79, p<0.01). Conclusion The ASSET appears to be a useful, valid, and reliable method for assessing surgeon performance of diagnostic knee arthroscopy in cadaveric specimens. Studies are ongoing to determine its generalizability to other procedures as well as to the live OR and other simulated environments. PMID:23548808
Criterion-Referenced Testing in Foreign Language Teaching.
ERIC Educational Resources Information Center
Takala, Sauli
A review of literature serves as the basis for a discussion of various aspects of criterion-referenced tests. The aspects discussed are: teaching and evaluation objectives, criterion- and norm-referenced measurement, stages in construction of criterion-referenced tests, construction and selection of items, test validity, and test reliability.…
Criterion Validity of the Child's Challenging Behavior Scale, Version 2 (CCBS-2).
Bourke-Taylor, Helen M; Cordier, Reinie; Pallant, Julie F
The Child's Challenging Behavior Scale, Version 2 (CCBS-2), measures maternal rating of a child's challenging behaviors that compromise maternal mental health. The CCBS-2, the Child Behavior Checklist (CBCL), and the Strengths and Difficulties Questionnaire (SDQ) were compared in a sample of typically developing young Australian children. Criterion validity was investigated by correlating the CCBS-2 with "gold standard" measures (CBCL and SDQ subscales). Data were collected in a cross-sectional survey of mothers (N = 336) of children ages 3-9 yr. Correlations with the CBCL externalizing subscales demonstrated moderate (ρ = .46) to strong (ρ = .66) correlations. Correlations with the SDQ externalizing behaviors subscales were moderate (ρ = .35) to strong (ρ = .60). The criterion validity established in this study strengthens the psychometric properties that support ongoing development of the CCBS-2 as an efficient tool that may identify children in need of further evaluation. Copyright © 2018 by the American Occupational Therapy Association, Inc.
Correlates of the MMPI-2-RF in a college setting.
Forbey, Johnathan D; Lee, Tayla T C; Handel, Richard W
2010-12-01
The current study examined empirical correlates of scores on Minnesota Multiphasic Personality Inventory-2-Restructured Form (MMPI-2-RF; A. Tellegen & Y. S. Ben-Porath, 2008; Y. S. Ben-Porath & A. Tellegen, 2008) scales in a college setting. The MMPI-2-RF and six criterion measures (assessing anger, assertiveness, sex roles, cognitive failures, social avoidance, and social fear) were administered to 846 college students (n = 264 men, n = 582 women) to examine the convergent and discriminant validity of scores on the MMPI-2-RF Specific Problems and Interest scales. Results demonstrated evidence of generally good convergent score validity for the selected MMPI-2-RF scales, reflected in large effect size correlations with criterion measure scores. Further, MMPI-2-RF scale scores demonstrated adequate discriminant validity, reflected in relatively low comparative median correlations between scores on MMPI-2-RF substantive scale sets and criterion measures. Limitations and future directions are discussed.
Sainz de Baranda, Pilar; Rodríguez-Iniesta, María; Ayala, Francisco; Santonja, Fernando; Cejudo, Antonio
2014-07-01
To examine the criterion-related validity of the horizontal hip joint angle (H-HJA) test and vertical hip joint angle (V-HJA) test for estimating hamstring flexibility measured through the passive straight-leg raise (PSLR) test using contemporary statistical measures. Validity study. Controlled laboratory environment. One hundred thirty-eight professional trampoline gymnasts (61 women and 77 men). Hamstring flexibility. Each participant performed 2 trials of H-HJA, V-HJA, and PSLR tests in a randomized order. The criterion-related validity of H-HJA and V-HJA tests was measured through the estimation equation, typical error of the estimate (TEEST), validity correlation (β), and their respective confidence limits. The findings from this study suggest that although H-HJA and V-HJA tests showed moderate to high validity scores for estimating hamstring flexibility (standardized TEEST = 0.63; β = 0.80), the TEEST statistic reported for both tests was not narrow enough for clinical purposes (H-HJA = 10.3 degrees; V-HJA = 9.5 degrees). Subsequently, the predicted likely thresholds for the true values that were generated were too wide (H-HJA = predicted value ± 13.2 degrees; V-HJA = predicted value ± 12.2 degrees). The results suggest that although the HJA test showed moderate to high validity scores for estimating hamstring flexibility, the prediction intervals between the HJA and PSLR tests are not strong enough to suggest that clinicians and sport medicine practitioners should use the HJA and PSLR tests interchangeably as gold standard measurement tools to evaluate and detect short hamstring muscle flexibility.
Estimating activity energy expenditure: how valid are physical activity questionnaires?
Neilson, Heather K; Robson, Paula J; Friedenreich, Christine M; Csizmadi, Ilona
2008-02-01
Activity energy expenditure (AEE) is the modifiable component of total energy expenditure (TEE) derived from all activities, both volitional and nonvolitional. Because AEE may affect health, there is interest in its estimation in free-living people. Physical activity questionnaires (PAQs) could be a feasible approach to AEE estimation in large populations, but it is unclear whether or not any PAQ is valid for this purpose. Our aim was to explore the validity of existing PAQs for estimating usual AEE in adults, using doubly labeled water (DLW) as a criterion measure. We reviewed 20 publications that described PAQ-to-DLW comparisons, summarized study design factors, and appraised criterion validity using mean differences (AEE(PAQ) - AEE(DLW), or TEE(PAQ) - TEE(DLW)), 95% limits of agreement, and correlation coefficients (AEE(PAQ) versus AEE(DLW) or TEE(PAQ) versus TEE(DLW)). Only 2 of 23 PAQs assessed most types of activity over the past year and indicated acceptable criterion validity, with mean differences (TEE(PAQ) - TEE(DLW)) of 10% and 2% and correlation coefficients of 0.62 and 0.63, respectively. At the group level, neither overreporting nor underreporting was more prevalent across studies. We speculate that, aside from reporting error, discrepancies between PAQ and DLW estimates may be partly attributable to 1) PAQs not including key activities related to AEE, 2) PAQs and DLW ascertaining different time periods, or 3) inaccurate assignment of metabolic equivalents to self-reported activities. Small sample sizes, use of correlation coefficients, and limited information on individual validity were problematic. Future research should address these issues to clarify the true validity of PAQs for estimating AEE.
Sanchez-Armass, Omar; Raffaelli, Marcela; Andrade, Flavia Cristina Drumond; Wiley, Angela R; Noyola, Aida Nacielli Morales; Arguelles, Alejandra Cepeda; Aradillas-Garcia, Celia
2017-03-01
To evaluate the criterion validity and diagnostic utility of the SCOFF, a brief eating disorder (ED) screening instrument, in a Mexican sample. The study was conducted in two phases in 2012. Phase I involved the administration of self-report measures [the SCOFF and the Eating Disorder Inventory-2 (EDI-2)] to 1057 students aged 17-56 years (M age = 21.0, SD = 3.4; 67% female) from three colleges at the Universidad Autónoma de San Luis Potosí, Mexico. In Phase II, a random subsample of these students (n = 104) participated in the eating disorder examination, a structured interview that yields ED diagnoses. Analyses were conducted to evaluate the SCOFF's criterion validity by examining (a) correlations between scores on the SCOFF and the EDI-2 and (b) the SCOFF's ability to differentiate diagnosed ED cases and non-cases. EDI-2 subscales showed high correlations with the SCOFF scores, providing initial evidence of criterion validity. A score of two points on the SCOFF optimized the sensitivity (78%) and specificity (84%). With this cutoff, the SCOFF correctly classified over half the cases (PPV = 58%) and screened out the majority of non-cases (NPV = 93%), providing further evidence of criterion validity. Analyses were repeated separately for men and women, yielding gender-specific information on the SCOFF's performance. Taken as a whole, results indicated that the SCOFF can be a useful tool for identifying Mexican university students who are at risk of eating disorders.
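The four screening statistics reported in this abstract (sensitivity, specificity, PPV, NPV) all derive from a 2x2 table of screen result against interview diagnosis. A minimal sketch of that arithmetic follows; the function name is hypothetical, and the abstract does not report the underlying cell counts, so any counts passed to it are the reader's own.

```python
def screening_metrics(tp, fp, fn, tn):
    """Standard diagnostic accuracy statistics from 2x2 cell counts.

    tp: screen-positive, diagnosed cases     fp: screen-positive, non-cases
    fn: screen-negative, diagnosed cases     tn: screen-negative, non-cases
    """
    return {
        "sensitivity": tp / (tp + fn),  # share of true cases the screen flags
        "specificity": tn / (tn + fp),  # share of non-cases the screen clears
        "ppv": tp / (tp + fp),          # share of positive screens that are cases
        "npv": tn / (tn + fn),          # share of negative screens that are non-cases
    }
```

Sensitivity and specificity are properties of the instrument at a given cutoff, whereas PPV and NPV also depend on how common the condition is in the sample, which is one reason a screen can combine good sensitivity with a modest PPV.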
Tierney, M; Fraser, A; Kennedy, N
2015-06-01
The International Physical Activity Questionnaire Short Form (IPAQ-SF) is a self-report questionnaire commonly used in patients with rheumatoid arthritis (RA) to measure physical activity. However, despite its frequent use in patients with RA, its validity has not been ascertained in this population. The aim of this study was to examine the criterion validity of energy expenditure from physical activity recorded with the IPAQ-SF in patients with RA compared with the objective criterion measure, the SenseWear Armband (SWA) which has been validated previously in this population. Cross-sectional criterion validation study. Regional hospital outpatient setting. Twenty-two patients with RA attending outpatient rheumatology clinics. Subjects wore an SWA for 7 full consecutive days and completed the IPAQ-SF. Energy expenditure from physical activity recorded by the SWA and the IPAQ-SF. Energy expenditure from physical activity recorded by the IPAQ-SF and the SWA showed a small, non-significant correlation (r=0.407, P=0.60). The IPAQ-SF underestimated energy expenditure from physical activity by 41% compared with the SWA. This was corroborated using Bland and Altman plots, as the IPAQ-SF was found to overestimate energy expenditure from physical activity in nine of the 22 individuals, and underestimate energy expenditure from physical activity in the remaining 13 individuals. The IPAQ-SF has limited use as an accurate and absolute measure for estimating energy expenditure from physical activity in patients with RA. Copyright © 2014 Chartered Society of Physiotherapy. Published by Elsevier Ltd. All rights reserved.
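The Bland and Altman analysis mentioned in this abstract summarizes agreement between two methods through the mean difference (bias) and the 95% limits of agreement. A minimal sketch under the usual assumption of paired measurements on the same subjects is given below; the function name is hypothetical and the study's own data are not reproduced here.

```python
import statistics


def bland_altman(method_a, method_b):
    """Return (bias, lower_limit, upper_limit) for paired measurements.

    bias is the mean of the pairwise differences (a - b); the limits of
    agreement are bias +/- 1.96 sample standard deviations of the differences.
    """
    if len(method_a) != len(method_b):
        raise ValueError("both methods need the same number of observations")
    if len(method_a) < 2:
        raise ValueError("at least two paired observations are required")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample SD (n - 1 denominator)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd
```

A group-level bias can coexist with individual differences in both directions, which is consistent with the abstract reporting a net underestimate alongside overestimation in some participants.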
Teachers' Grade Assignment and the Predictive Validity of Criterion-Referenced Grades
ERIC Educational Resources Information Center
Thorsen, Cecilia; Cliffordson, Christina
2012-01-01
Research has found that grades are the most valid instruments for predicting educational success. Why grades have better predictive validity than, for example, standardized tests is not yet fully understood. One possible explanation is that grades reflect not only subject-specific knowledge and skills but also individual differences in other…
ERIC Educational Resources Information Center
Bornstein, Robert F.
2011-01-01
Although definitions of validity have evolved considerably since L. J. Cronbach and P. E. Meehl's classic (1955) review, contemporary validity research continues to emphasize correlational analyses assessing predictor-criterion relationships, with most outcome criteria being self-reports. The present article describes an alternative way of…
Mobile Phone Use in a Developing Country: A Malaysian Empirical Study
ERIC Educational Resources Information Center
Yeow, Paul H. P.; Yen Yuen, Yee; Connolly, Regina
2008-01-01
This study examined the factors that influence consumer satisfaction with mobile telephone use in Malaysia. The validity of the study's constructs, criterion, and content was confirmed. Construct validity was verified through the factor analysis with a total variance of 73.72 percent explained by all six independent factors. Content validity was…
ERIC Educational Resources Information Center
Andrei, Federica; Smith, Martin M.; Surcinelli, Paola; Baldaro, Bruno; Saklofske, Donald H.
2016-01-01
This study investigated the structure and validity of the Italian translation of the Trait Emotional Intelligence Questionnaire. Data were self-reported from 227 participants. Confirmatory factor analysis supported the four-factor structure of the scale. Hierarchical regressions also demonstrated its incremental validity beyond demographics, the…
ERIC Educational Resources Information Center
Fairclough, Stuart J.; Hilland, Toni A.; Vinson, Don; Stratton, Gareth
2012-01-01
The study purpose was to assess preliminary validity and reliability of the Physical Education and School Sport Environment Inventory (PESSEI), which was designed to audit physical education (PE) and school sport spaces and resources. PE teachers from eight English secondary schools completed the PESSEI. Criterion validity was assessed by…
Eating Disorder Diagnostic Scale: Additional Evidence of Reliability and Validity
ERIC Educational Resources Information Center
Stice, Eric; Fisher, Melissa; Martinez, Erin
2004-01-01
The authors conducted 4 studies investigating the reliability and validity of the Eating Disorder Diagnostic Scale (EDDS; E. Stice, C. F. Telch, & S. L. Rizvi, 2000), a brief self-report measure for diagnosing anorexia nervosa, bulimia nervosa, and binge eating disorder. Study 1 found that the EDDS showed criterion validity with interview-based…
Guise, Brian J; Thompson, Matthew D; Greve, Kevin W; Bianchini, Kevin J; West, Laura
2014-03-01
The current study assessed performance validity on the Stroop Color and Word Test (Stroop) in mild traumatic brain injury (TBI) using criterion-groups validation. The sample consisted of 77 patients with a reported history of mild TBI. Data from 42 moderate-severe TBI and 75 non-head-injured patients with other clinical diagnoses were also examined. TBI patients were categorized on the basis of Slick, Sherman, and Iverson (1999) criteria for malingered neurocognitive dysfunction (MND). Classification accuracy is reported for three indicators (Word, Color, and Color-Word residual raw scores) from the Stroop across a range of injury severities. With false-positive rates set at approximately 5%, sensitivity was as high as 29%. The clinical implications of these findings are discussed. © 2012 The British Psychological Society.
Nascimento-Ferreira, Marcus V; Collese, Tatiana S; de Moraes, Augusto César F; Rendo-Urteaga, Tara; Moreno, Luis A; Carvalho, Heráclito B
2016-12-01
Sleep duration has been associated with several health outcomes in children and adolescents. As an extensive number of questionnaires are currently used to investigate sleep schedule or sleep time, we performed a systematic review of criterion validation of sleep time questionnaires for children and adolescents, considering accelerometers as the reference method. We found a strong correlation between questionnaires and accelerometers for weeknights and a moderate correlation for weekend nights. When considering only studies performing a reliability assessment of the used questionnaires, a significant increase in the correlations for both weeknights and weekend nights was observed. In conclusion, moderate to strong criterion validity of sleep time questionnaires was observed; however, the reliability assessment of the questionnaires showed strong validation performance. Copyright © 2015 Elsevier Ltd. All rights reserved.
Kong, Feng; You, Xuqun; Zhao, Jingjing
2017-01-01
The Gratitude Questionnaire (GQ; McCullough et al., 2002) is one of the most widely used instruments to assess dispositional gratitude. The purpose of this study was to validate a Chinese version of the GQ by examining internal consistency, factor structure, convergent validity, and measurement invariance across sex. A total of 1151 Chinese adults were recruited to complete the GQ, Positive Affect and Negative Affect Scales, and Satisfaction with Life Scale. Confirmatory factor analysis indicated that the original unidimensional model fitted well, which is in accordance with the findings in Western populations. Furthermore, the GQ had satisfactory composite reliability and criterion-related validity with measures of life satisfaction and affective well-being. Evidence of configural, metric and scalar invariance across sex was obtained. Tests of the latent mean differences found females had higher latent mean scores than males. These findings suggest that the Chinese version of GQ is a reliable and valid tool for measuring dispositional gratitude and can generally be utilized across sex in the Chinese context. PMID:28919873
Lin, Keh-chung; Chen, Hui-fang; Chen, Chia-ling; Wang, Tien-ni; Wu, Ching-yi; Hsieh, Yu-wei; Wu, Li-ling
2012-01-01
This study examined criterion-related validity and clinimetric properties of the Pediatric Motor Activity Log (PMAL) in children with cerebral palsy. Study participants were 41 children (age range: 28-113 months) and their parents. Criterion-related validity was evaluated by the associations between the PMAL and criterion measures at baseline and posttreatment, including the self-care, mobility, and cognition subscale, the total performance of the Functional Independence Measure in children (WeeFIM), and the grasping and visual-motor integration of the Peabody Developmental Motor Scales. Pearson correlation coefficients were calculated. Responsiveness was examined using the paired t test and the standardized response mean, the minimal detectable change was captured at the 90% confidence level, and the minimal clinically important change was estimated using anchor-based and distribution-based approaches. The PMAL-QOM showed fair concurrent validity at pretreatment and posttreatment and predictive validity, whereas the PMAL-AOU had fair concurrent validity at posttreatment only. The PMAL-AOU and PMAL-QOM were both markedly responsive to change after treatment. Improvement of at least 0.67 points on the PMAL-AOU and 0.66 points on the PMAL-QOM can be considered as a true change, not measurement error. A mean change has to exceed the range of 0.39-0.94 on the PMAL-AOU and the range of 0.38-0.74 on the PMAL-QOM to be regarded as clinically important change. Copyright © 2011 Elsevier Ltd. All rights reserved.
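Two of the responsiveness statistics named in this abstract can be illustrated with their commonly used textbook formulas. This is a sketch under stated assumptions: the abstract does not say which formulas the authors used, the function names are hypothetical, and the minimal detectable change here assumes the standard error of measurement is derived from a baseline standard deviation and a reliability coefficient.

```python
import math
import statistics


def standardized_response_mean(change_scores):
    """Mean change divided by the sample SD of the change scores."""
    return statistics.mean(change_scores) / statistics.stdev(change_scores)


def minimal_detectable_change(baseline_sd, reliability, z=1.645):
    """MDC = z * sqrt(2) * SEM, with SEM = SD * sqrt(1 - reliability).

    z = 1.645 corresponds to the 90% confidence level (MDC90);
    reliability is a test-retest coefficient such as an ICC, in [0, 1].
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    sem = baseline_sd * math.sqrt(1.0 - reliability)
    return z * math.sqrt(2.0) * sem
```

Under these formulas, a change smaller than the MDC cannot be distinguished from measurement error at the chosen confidence level, which is the sense in which the abstract describes its thresholds as marking "true change".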
Amaya-Arias, Ana Carolina; Alzate, Juan Pablo; Eslava-Schmalbach, Javier H
2017-01-01
This study aimed at determining the validity of the Pediatric Quality of Life Inventory 4.0 (PedsQL™ 4.0) for the measurement of health-related quality of life (HRQOL) in Colombian children. Validation study of measurement instruments. The PedsQL™ 4.0 was applied by convenience sampling to 375 pairs of children and adolescents between the ages of 5 and 17 and to their parents-caregivers, as well as to 125 parents-caregivers of children between the ages of 2 and 4 in five cities of Colombia (Bogota, Medellin, Cali, Barranquilla and Bucaramanga). Construct validity was assessed through the use of exploratory and confirmatory factor analysis, and criterion validity was assessed by correlations between the PedsQL™ 4.0 and the KIDSCREEN-27. The instrument was applied to 375 children (ages 5-18) and 125 parents of children between the ages of 2 and 4. Factor analysis revealed four factors considered suitable for the sample in both the child and parent reports, whereas Bartlett's test of sphericity showed inter-correlation between variables. Scale and subscales showed proper indicators of internal consistency. It is recommended not to include or review some of the items in the Colombian version of the scale. The Spanish version for Colombia of the PedsQL™ 4.0 displays suitable indicators of criterion and construct validity, therefore becoming a valuable tool for measuring HRQOL in children in our country. Some modifications are recommended for the Colombian version of the scale.
Akram, A J; Ireland, A J; Postlethwaite, K C; Sandy, J R; Jerreat, A S
2013-11-01
This article describes the process of validity and reliability testing of a condition-specific quality-of-life measure for patients with hypodontia presenting for orthodontic treatment. The development of the instrument is described in a previous article. Royal Devon and Exeter NHS Foundation Trust & Musgrove Park Hospital, Taunton. The child perception questionnaire was used as a standard against which to test criterion validity. The Bland and Altman method was used to check agreement between the two questionnaires. Construct validity was tested using principal component analysis on the four sections of the questionnaire. Test-retest reliability was tested using intraclass correlation coefficient and Bland and Altman method. Cronbach's alpha was used to test internal consistency reliability. Overall the questionnaire showed good reliability, criterion and construct validity. This together with previous evidence of good face and content validity suggests that the instrument may prove useful in clinical practice and further research. This study has demonstrated that the newly developed condition-specific quality-of-life questionnaire is both valid and reliable for use in young patients with hypodontia. © 2013 John Wiley & Sons A/S. Published by Blackwell Publishing Ltd.
2014-01-01
Background Foot disease complications, such as foot ulcers and infection, contribute to considerable morbidity and mortality. These complications are typically precipitated by “high-risk factors”, such as peripheral neuropathy and peripheral arterial disease. High-risk factors are more prevalent in specific “at risk” populations such as diabetes, kidney disease and cardiovascular disease. To the best of the authors’ knowledge a tool capturing multiple high-risk factors and foot disease complications in multiple at risk populations has yet to be tested. This study aimed to develop and test the validity and reliability of a Queensland High Risk Foot Form (QHRFF) tool. Methods The study was conducted in two phases. Phase one developed a QHRFF using an existing diabetes foot disease tool, literature searches, stakeholder groups and expert panel. Phase two tested the QHRFF for validity and reliability. Four clinicians, representing different levels of expertise, were recruited to test validity and reliability. Three cohorts of patients were recruited; one tested criterion measure reliability (n = 32), another tested criterion validity and inter-rater reliability (n = 43), and another tested intra-rater reliability (n = 19). Validity was determined using sensitivity, specificity and positive predictive values (PPV). Reliability was determined using Kappa, weighted Kappa and intra-class correlation (ICC) statistics. Results A QHRFF tool containing 46 items across seven domains was developed. Criterion measure reliability of at least moderate categories of agreement (Kappa > 0.4; ICC > 0.75) was seen in 91% (29 of 32) tested items. Criterion validity of at least moderate categories (PPV > 0.7) was seen in 83% (60 of 72) tested items. Inter- and intra-rater reliability of at least moderate categories (Kappa > 0.4; ICC > 0.75) was seen in 88% (84 of 96) and 87% (20 of 23) tested items respectively. 
Conclusions The QHRFF had acceptable validity and reliability across the majority of items; particularly items identifying relevant co-morbidities, high-risk factors and foot disease complications. Recommendations have been made to improve or remove identified weaker items for future QHRFF versions. Overall, the QHRFF possesses suitable practicality, validity and reliability to assess and capture relevant foot disease items across multiple at risk populations. PMID:24468080
Machado-Vieira, Rodrigo; Luckenbaugh, David A; Ballard, Elizabeth D; Henter, Ioline D; Tohen, Mauricio; Suppes, Trisha; Zarate, Carlos A
2017-01-01
DSM-5 describes "a distinct period of abnormally and persistently elevated, expansive, or irritable mood and abnormally and persistently increased activity or energy" as a primary criterion for mania. Thus, increased energy or activity is now considered a core symptom of manic and hypomanic episodes. Using data from the Systematic Treatment Enhancement Program for Bipolar Disorder study, the authors analyzed point prevalence data obtained at the initial visit to assess the diagnostic validity of this new DSM-5 criterion. The study hypothesis was that the DSM-5 criterion would alter the prevalence of mania and/or hypomania. The authors compared prevalence, clinical characteristics, validators, and outcome in patients meeting the DSM-5 criteria (i.e., DSM-IV criteria plus the DSM-5 criterion of increased activity or energy) and those who did not meet the new DSM-5 criterion (i.e., who only met DSM-IV criteria). All 4,360 participants met DSM-IV criteria for bipolar disorder, and 310 met DSM-IV criteria for a manic or hypomanic episode. When the new DSM-5 criterion of increased activity or energy was added as a coprimary symptom, the prevalence of mania and hypomania was reduced. Although minor differences were noted in clinical and concurrent validators, no changes were observed in longitudinal outcomes. The findings confirm that including increased activity or energy as part of DSM-5 criterion A decreases the prevalence of manic and hypomanic episodes but does not affect longitudinal clinical outcomes.
Validation of the Spanish Addiction Severity Index Multimedia Version (S-ASI-MV).
Butler, Stephen F; Redondo, José Pedro; Fernandez, Kathrine C; Villapiano, Albert
2009-01-01
This study aimed to develop and test the reliability and validity of a Spanish adaptation of the ASI-MV, a computer administered version of the Addiction Severity Index, called the S-ASI-MV. Participants were 185 native Spanish-speaking adult clients from substance abuse treatment facilities serving Spanish-speaking clients in Florida, New Mexico, California, and Puerto Rico. Participants were administered the S-ASI-MV as well as Spanish versions of the general health subscale of the SF-36, the work and family unit subscales of the Social Adjustment Scale Self-Report, the Michigan Alcohol Screening Test, the alcohol and drug subscales of the Personality Assessment Inventory, and the Hopkins Symptom Checklist-90. Three-to-five-day test-retest reliability was examined along with criterion validity, convergent/discriminant validity, and factorial validity. Measurement invariance between the English and Spanish versions of the ASI-MV was also examined. The S-ASI-MV demonstrated good test-retest reliability (ICCs for composite scores between .59 and .93), criterion validity (rs for composite scores between .66 and .87), and convergent/discriminant validity. Factorial validity and measurement invariance were demonstrated. These results compared favorably with those reported for the original interviewer version of the ASI and the English version of the ASI-MV.
Ghisi, Gabriela Lima de Melo; Dos Santos, Rafaella Zulianello; Bonin, Christiani Batista Decker; Roussenq, Suellen; Grace, Sherry L; Oh, Paul; Benetti, Magnus
2014-01-01
To translate, culturally adapt and psychometrically validate the Information Needs in Cardiac Rehabilitation (INCR) tool to Portuguese. The identification of information needs is considered the first step to improve knowledge that ultimately could improve health outcomes. The Portuguese version generated was tested in 300 cardiac rehabilitation (CR) patients (34% women; mean age = 61.3 ± 2.1 years old). Test-retest reliability was assessed using the intraclass correlation coefficient (ICC), internal consistency using Cronbach's alpha, and criterion validity with regard to patients' education and duration in CR. All 9 subscales were considered internally consistent (α > 0.7). Significant differences in mean total needs by educational level (p < 0.05) and duration in CR (p = 0.03) supported criterion validity. The overall mean (4.6 ± 0.4), as well as the means of the 9 subscales, were high (emergency/safety was the greatest need). The Portuguese INCR was demonstrated to have sufficient reliability, consistency and validity. Copyright © 2014 Elsevier Inc. All rights reserved.
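As an illustration of the internal-consistency statistic reported above, here is a minimal sketch of Cronbach's alpha in Python. It assumes complete data arranged as one list of scores per item, with respondents in the same order; the function name `cronbach_alpha` and the toy scores are hypothetical and are not taken from the study.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for k items.

    items: list of k lists, each holding one item's scores for the
    same respondents in the same order (no missing values assumed).
    """
    k = len(items)
    # Total scale score per respondent (sum across items).
    totals = [sum(responses) for responses in zip(*items)]
    # Sum of the sample variances of the individual items.
    item_variance_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Toy data: three items answered by five respondents (made up).
example_items = [
    [4, 5, 3, 4, 2],
    [4, 4, 3, 5, 2],
    [5, 5, 2, 4, 3],
]
print(round(cronbach_alpha(example_items), 3))
```

A value above 0.7, the threshold cited in the abstract, is conventionally read as acceptable internal consistency.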
Development and Validation of Triarchic Construct Scales from the Psychopathic Personality Inventory
Hall, Jason R.; Drislane, Laura E.; Patrick, Christopher J.; Morano, Mario; Lilienfeld, Scott O.; Poythress, Norman G.
2014-01-01
The Triarchic model of psychopathy describes this complex condition in terms of distinct phenotypic components of boldness, meanness, and disinhibition. Brief self-report scales designed specifically to index these psychopathy facets have thus far demonstrated promising construct validity. The present study sought to develop and validate scales for assessing facets of the Triarchic model using items from a well-validated existing measure of psychopathy—the Psychopathic Personality Inventory (PPI). A consensus rating approach was used to identify PPI items relevant to each Triarchic facet, and the convergent and discriminant validity of the resulting PPI-based Triarchic scales were evaluated in relation to multiple criterion variables (i.e., other psychopathy inventories, antisocial personality disorder features, personality traits, psychosocial functioning) in offender and non-offender samples. The PPI-based Triarchic scales showed good internal consistency and related to criterion variables in ways consistent with predictions based on the Triarchic model. Findings are discussed in terms of implications for conceptualization and assessment of psychopathy. PMID:24447280
Beehler, Sarah; Ahern, Jennifer; Balmer, Brandi; Kuhlman, Jennifer
2017-01-01
This pilot study evaluated the validity and reliability of an Experience of Neighborhood (EON) measure developed to assess neighborhood characteristics that shape reintegration opportunities for returning service members and their families. A total of 91 post-9/11 veterans and spouses completed a survey administered at the Minnesota State Fair. Participants self-reported on their reintegration status (veterans), social functioning (spouses), social support, and mental health. EON factor structure, internal consistency reliability, and validity (discriminant, content, criterion) were analyzed. The EON measure showed adequate reliability, discriminant validity, and content validity. More work is needed to assess criterion validity because EON scores were not correlated with scores on a Census-based index used to measure quality of military neighborhoods. The EON may be useful in assessing broad local factors influencing health among returning veterans and spouses. More research is needed to understand geographic variation in neighborhood conditions and how those affect reintegration and mental health for military families. PMID:28936370
Development and Validation of a Measure of Quality of Life for the Young Elderly in Sri Lanka
de Silva, Sudirikku Hennadige Padmal; Jayasuriya, Anura Rohan; Rajapaksa, Lalini Chandika; de Silva, Ambepitiyawaduge Pubudu; Barraclough, Simon
2016-01-01
Sri Lanka has one of the fastest aging populations in the world. Measurement of quality of life (QoL) in the elderly needs instruments developed that encompass the sociocultural settings. An instrument was developed to measure QoL in the young elderly in Sri Lanka (QLI-YES), using accepted methods to generate and reduce items. The measure was validated using a community sample. Construct, criterion and predictive validity and reliability were tested. A first-order model of 24 items with 6 domains was found to have good fit indices (CMIN/df = 1.567, RMR = 0.05, CFI = 0.95, and RMSEA = 0.053). Both criterion and predictive validity were demonstrated. Good internal consistency reliability (Cronbach’s α = 0.93) was shown. The development of the QLI-YES using a societal perspective relevant to the social and cultural beliefs has resulted in a robust and valid instrument to measure QoL for the young elderly in Sri Lanka. PMID:26712893
Rossi, Gina; Debast, Inge; van Alphen, S P J
2017-07-01
The dimensional personality disorders model in the Diagnostic and Statistical Manual (DSM)-5 section III conceptually differentiates impaired personality functioning (criterion A) from the presence of pathological traits (criterion B). This study is the first to specifically address the measurement of criterion A in older adults. Moreover, the convergent/divergent validity of criterion A and criterion B will be compared in younger and older age groups. The Severity Indices of Personality Functioning - Short Form (SIPP-SF) was administered in older (N = 171) and younger adults (N = 210). The factorial structure was analyzed with exploratory structural equation modeling. Differences in convergent/divergent validity between personality functioning (SIPP-SF) and pathological traits (Personality Inventory for DSM-5; Dimensional Assessment of Personality Pathology-Basic Questionnaire) were examined across age groups. Identity Integration, Relational Capacities, Responsibility, Self-Control, and Social Concordance were corroborated as higher order domains. Although the SIPP-SF domains measured unique variation, some high correlations with pathological traits referred to overlapping constructs. Moreover, in older adults, personality functioning was more strongly related to Psychoticism, Disinhibition, Antagonism and Dissocial Behavior compared to younger adults. The SIPP-SF construct validity was demonstrated in terms of a structure of five higher order domains of personality functioning. The instrument is promising as a possible measure of impaired personality functioning in older adults. As such, it is a useful clinical tool to follow up effects of therapy on levels of personality functioning. Moreover, traits were associated with different degrees of personality functioning across age groups.
What Is True Halving in the Payoff Matrix of Game Theory?
Ito, Hiromu; Katsumata, Yuki; Hasegawa, Eisuke; Yoshimura, Jin
2016-01-01
In game theory, there are two social interpretations of rewards (payoffs) for decision-making strategies: (1) the interpretation based on the utility criterion derived from expected utility theory and (2) the interpretation based on the quantitative criterion (amount of gain) derived from validity in the empirical context. A dynamic decision theory has recently been developed in which dynamic utility is a conditional (state) variable that is a function of the current wealth of a decision maker. We applied dynamic utility to the equal division in dove-dove contests in the hawk-dove game. Our results indicate that under the utility criterion, the half-share of utility becomes proportional to a player’s current wealth. Our results are consistent with studies of the sense of fairness in animals, which indicate that the quantitative criterion has greater validity than the utility criterion. We also find that traditional analyses of repeated games must be reevaluated. PMID:27487194
Vanderploeg, Rodney D; Cooper, Douglas B; Belanger, Heather G; Donnell, Alison J; Kennedy, Jan E; Hopewell, Clifford A; Scott, Steven G
2014-01-01
To develop and cross-validate internal validity scales for the Neurobehavioral Symptom Inventory (NSI). Four existing data sets were used: (1) outpatient clinical traumatic brain injury (TBI)/neurorehabilitation database from a military site (n = 403), (2) National Department of Veterans Affairs TBI evaluation database (n = 48 175), (3) Florida National Guard nonclinical TBI survey database (n = 3098), and (4) a cross-validation outpatient clinical TBI/neurorehabilitation database combined across 2 military medical centers (n = 206). Secondary analysis of existing cohort data to develop (study 1) and cross-validate (study 2) internal validity scales for the NSI. The NSI, Mild Brain Injury Atypical Symptoms, and Personality Assessment Inventory scores. Study 1: Three NSI validity scales were developed, composed of 5 unusual items (Negative Impression Management [NIM5]), 6 low-frequency items (LOW6), and the combination of 10 nonoverlapping items (Validity-10). Cut scores maximizing sensitivity and specificity on these measures were determined, using a Mild Brain Injury Atypical Symptoms score of 8 or more as the criterion for invalidity. Study 2: The same validity scale cut scores again resulted in the highest classification accuracy and optimal balance between sensitivity and specificity in the cross-validation sample, using a Personality Assessment Inventory Negative Impression Management scale with a T score of 75 or higher as the criterion for invalidity. The NSI is widely used in the Department of Defense and Veterans Affairs as a symptom-severity assessment following TBI, but is subject to symptom overreporting or exaggeration. This study developed embedded NSI validity scales to facilitate the detection of invalid response styles. The NSI Validity-10 scale appears to hold considerable promise for validity assessment when the NSI is used as a population-screening tool.
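The abstract does not say how the cut scores "maximizing sensitivity and specificity" were selected. One common approach, assumed here purely for illustration, is to choose the cut that maximizes Youden's J (sensitivity + specificity - 1). In the sketch below, the function names, the convention that scores at or above the cut are flagged, and the toy data are all hypothetical.

```python
def sens_spec(scores, invalid, cut):
    """Return (sensitivity, specificity) when scores >= cut are flagged.

    scores: validity-scale scores; invalid: parallel booleans from the
    external criterion (True = criterion says the protocol is invalid).
    Both classes must be present, otherwise a division by zero occurs.
    """
    pairs = list(zip(scores, invalid))
    tp = sum(1 for s, v in pairs if v and s >= cut)
    fn = sum(1 for s, v in pairs if v and s < cut)
    tn = sum(1 for s, v in pairs if not v and s < cut)
    fp = sum(1 for s, v in pairs if not v and s >= cut)
    return tp / (tp + fn), tn / (tn + fp)


def best_cut(scores, invalid):
    """Observed score that maximizes Youden's J (lowest cut wins ties)."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda c: sum(sens_spec(scores, invalid, c)) - 1)


# Toy data: scale scores with criterion classifications (made up).
toy_scores = [1, 2, 3, 10, 12, 15]
toy_invalid = [False, False, False, True, True, True]
cut = best_cut(toy_scores, toy_invalid)
print(cut, sens_spec(toy_scores, toy_invalid, cut))
```

Cross-validation, as in study 2 of the abstract, would then apply the chosen cut unchanged to an independent sample and recompute sensitivity and specificity there.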
Morizot, Julien
2014-10-01
While there are a number of short personality trait measures that have been validated for use with adults, few are specifically validated for use with adolescents. To trust such measures, it must be demonstrated that they have adequate construct validity. According to the view of construct validity as a unifying form of validity requiring the integration of different complementary sources of information, this article reports the evaluation of content, factor, convergent, and criterion validities as well as reliability of adolescents' self-reported personality traits. Moreover, this study sought to address an inherent potential limitation of short personality trait measures, namely their limited conceptual breadth. In this study, starting with items from a known measure, after the language-level was adjusted for use with adolescents, items tapping fundamental primary traits were added to determine the impact of added conceptual breadth on the psychometric properties of the scales. The resulting new measure was named the Big Five Personality Trait Short Questionnaire (BFPTSQ). A group of expert judges considered the items to have adequate content validity. Using data from a community sample of early adolescents, the results confirmed the factor validity of the Big Five structure in adolescence as well as its measurement invariance across genders. More important, the added items did improve the convergent and criterion validities of the scales, but did not negatively affect their reliability. This study supports the construct validity of adolescents' self-reported personality traits and points to the importance of conceptual breadth in short personality measures. © The Author(s) 2014.
Visual judgements of steadiness in one-legged stance: reliability and validity.
Haupstein, T; Goldie, P
2000-01-01
There is a paucity of information about the validity and reliability of clinicians' visual judgements of steadiness in one-legged stance. Such judgements are used frequently in clinical practice to support decisions about treatment in the fields of neurology, sports medicine, paediatrics and orthopaedics. The aim of the present study was to address the validity and reliability of visual judgements of steadiness in one-legged stance in a group of physiotherapists. A videotape of 20 five-second performances was shown to 14 physiotherapists with median clinical experience of 6.75 years. Validity of visual judgement was established by correlating scores obtained from an 11-point rating scale with criterion scores obtained from a force platform. In addition, partial correlations were used to control for the potential influence of body weight on the relationship between the visual judgements and criterion scores. Inter-observer reliability was quantified between the physiotherapists; intra-observer reliability was quantified between two tests four weeks apart. Mean criterion-related validity was high, regardless of whether body weight was controlled for statistically (Pearson's r = 0.84, 0.83, respectively). The standard error of estimating the criterion score was 3.3 newtons. Inter-observer reliability was high (ICC (2,1) = 0.81 at Test 1 and 0.82 at Test 2). Intra-observer reliability was high (on average ICC (2,1) = 0.88; Pearson's r = 0.90). The standard error of measurement for the 11-point scale was one unit. The finding of higher accuracy of making visual judgements than previously reported may be due to several aspects of design: use of a criterion score derived from the variability of the force signal which is more discriminating than variability of centre of pressure; use of a discriminating visual rating scale; specificity and clear definition of the phenomenon to be rated.
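The validity analysis described above (correlating ratings with a criterion score, then controlling for body weight) can be sketched as follows. This is a minimal illustration using the standard first-order partial correlation formula; the function names and the toy numbers are hypothetical and are not the study's data.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_r(x, y, z):
    """Correlation between x and y with the linear effect of z removed."""
    r_xy, r_xz, r_yz = pearson_r(x, y), pearson_r(x, z), pearson_r(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Toy data: visual ratings, criterion scores, body weight (made up).
ratings = [1, 2, 3, 4]
criterion = [2, 1, 4, 3]
weight = [1, -1, -1, 1]  # constructed to be uncorrelated with both
print(pearson_r(ratings, criterion), partial_r(ratings, criterion, weight))
```

When the control variable is unrelated to both measures, as in the toy data, the partial and zero-order correlations coincide, which mirrors the abstract's finding that controlling for body weight barely changed the coefficient (0.84 versus 0.83).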
2014-01-01
Background: Health impairments can result in disability and changed work productivity, imposing considerable costs on the employee, employer and society as a whole. A large number of instruments exist to measure health-related productivity changes; however, their methodological quality remains unclear. This systematic review critically appraised the measurement properties of generic self-reported instruments that measure health-related productivity changes in order to recommend appropriate instruments for use in occupational and economic health practice. Methods: PubMed, PsycINFO, Econlit and Embase were systematically searched for studies in which: (i) instruments measured health-related productivity changes; (ii) the aim was to evaluate instrument measurement properties; (iii) instruments were generic; (iv) ratings were self-reported; (v) full texts were available. Next, methodological quality appraisal was based on COSMIN elements: (i) internal consistency; (ii) reliability; (iii) measurement error; (iv) content validity; (v) structural validity; (vi) hypotheses testing; (vii) cross-cultural validity; (viii) criterion validity; and (ix) responsiveness. Recommendations are based on evidence syntheses. Results: This review included 25 articles assessing the reliability, validity and responsiveness of 15 different generic self-reported instruments measuring health-related productivity changes. Most studies evaluated criterion validity, none evaluated cross-cultural validity, and information on measurement error is lacking. The Work Limitation Questionnaire (WLQ) was most frequently evaluated, with moderate and strong positive evidence for content and structural validity, respectively, and negative evidence for reliability, hypothesis testing and responsiveness. Less frequently evaluated, the Stanford Presenteeism Scale (SPS) showed strong positive evidence for internal consistency and structural validity, and moderate positive evidence for hypotheses testing and criterion validity. The Productivity and Disease Questionnaire (PRODISQ) yielded strong positive evidence for content validity; evidence for other properties is lacking. The other instruments resulted in mostly fair-to-poor quality ratings with limited evidence. Conclusions: Decisions based on the content of the instrument, usage purpose, target country and population, and available evidence are recommended. Until high-quality studies are in place to accurately assess the measurement properties of the currently available instruments, the WLQ and, in a Dutch context, the PRODISQ are cautiously preferred based on their strong positive evidence for content validity. Based on its strong positive evidence for internal consistency and structural validity, the SPS is cautiously recommended. PMID:24495301
Validity of Various Methods for Determining Velocity, Force, and Power in the Back Squat.
Banyard, Harry G; Nosaka, Ken; Sato, Kimitake; Haff, G Gregory
2017-10-01
To examine the validity of 2 kinematic systems for assessing mean velocity (MV), peak velocity (PV), mean force (MF), peak force (PF), mean power (MP), and peak power (PP) during the full-depth free-weight back squat performed with maximal concentric effort. Ten strength-trained men (26.1 ± 3.0 y, 1.81 ± 0.07 m, 82.0 ± 10.6 kg) performed three 1-repetition-maximum (1RM) trials on 3 separate days, encompassing lifts performed at 6 relative intensities including 20%, 40%, 60%, 80%, 90%, and 100% of 1RM. Each repetition was simultaneously recorded by a PUSH band and commercial linear position transducer (LPT) (GymAware [GYM]) and compared with measurements collected by a laboratory-based testing device consisting of 4 LPTs and a force plate. Trials 2 and 3 were used for validity analyses. Combining all 120 repetitions indicated that the GYM was highly valid for assessing all criterion variables while the PUSH was only highly valid for estimations of PF (r = .94, CV = 5.4%, ES = 0.28, SEE = 135.5 N). At each relative intensity, the GYM was highly valid for assessing all criterion variables except for PP at 20% (ES = 0.81) and 40% (ES = 0.67) of 1RM. Moreover, the PUSH was only able to accurately estimate PF across all relative intensities (r = .92-.98, CV = 4.0-8.3%, ES = 0.04-0.26, SEE = 79.8-213.1 N). PUSH accuracy for determining MV, PV, MF, MP, and PP across all 6 relative intensities was questionable for the back squat, yet the GYM was highly valid at assessing all criterion variables, with some caution given to estimations of MP and PP performed at lighter loads.
Development of a new instrument for determining the level of chewing function in children.
Serel Arslan, S; Demir, N; Barak Dolgun, A; Karaduman, A A
2016-07-01
This study aimed to develop a chewing performance scale that classifies chewing from normal to severely impaired and to investigate its validity and reliability. The study included the developmental phase and reported the content, structural and criterion validity and the interobserver and intra-observer reliability of the chewing performance scale, which was called the Karaduman Chewing Performance Scale (KCPS). A dysphagia literature review, other questionnaires and clinical experiences were used in the developmental phase. Seven experts assessed the steps for content validity over two Delphi rounds. To test structural and criterion validity and interobserver and intra-observer reliability, two swallowing therapists evaluated chewing videos of 144 children (Group I: 61 healthy children without chewing disorders, mean age of 42.38 ± 9.36 months; Group II: 83 children with cerebral palsy who have chewing disorders, mean age of 39.09 ± 22.95 months) using the KCPS. The Behavioral Pediatrics Feeding Assessment Scale (BPFAS) was used for criterion validity. The KCPS levels, arranged from 0 to 4, were found to be necessary. The content validity index was 0.885. The KCPS levels were found to be different between groups I and II (χ² = 123.286, P < 0.001). A moderately strong positive correlation was found between the KCPS and the subscales of the BPFAS (r = 0.444-0.773, P < 0.001). An excellent positive correlation was detected between two swallowing therapists and between two examinations of one swallowing therapist (r = 0.962, P < 0.001; r = 0.990, P < 0.001, respectively). The KCPS is a valid, reliable, quick and clinically easy-to-use functional instrument for determining the level of chewing function in children. © 2016 John Wiley & Sons Ltd.
Dueñas, María; Mendonça, Liliane; Sampaio, Rute; Gouvinhas, Cláudia; Oliveira, Daniela; Castro-Lopes, José Manuel; Azevedo, Luís Filipe
2017-03-01
The Bowel Function Index (BFI) is a simple and sound bowel function and opioid-induced constipation (OIC) screening tool. We aimed to develop the translation and cultural adaptation of this measure (BFI-P) and to assess its reliability and validity for the Portuguese language and a chronic pain population. The BFI-P was created after a process including translation, back translation and cultural adaptation. Participants (n = 226) were recruited in a chronic pain clinic and were assessed at baseline and after one week. Internal consistency, test-retest reliability, responsiveness, construct (convergent and known groups) and factorial validity were assessed. Test-retest reliability had an intra-class correlation of 0.605 for BFI mean score. Internal consistency of BFI had Cronbach's alpha of 0.865. The construct validity of BFI-P was shown to be excellent and the exploratory factor analysis confirmed its unidimensional structure. The responsiveness of BFI-P was excellent, with a suggested 17-19 point and 8-12 point change in score constituting a clinically relevant change in constipation for patients with and without previous constipation, respectively. This study had some limitations, namely, the criterion validity of BFI-P was not directly assessed; and the absence of a direct criterion for OIC precluded the assessment of the criterion based responsiveness of BFI-P. Nevertheless, BFI may importantly contribute to better OIC screening and its Portuguese version (BFI-P) has been shown to have excellent reliability, internal consistency, validity and responsiveness. Further suggestions regarding statistically and clinically important change cut-offs for this instrument are presented.
Pagliarin, Karina Carlesso; Ortiz, Karin Zazo; Barreto, Simone dos Santos; Pimenta Parente, Maria Alice de Mattos; Nespoulous, Jean-Luc; Joanette, Yves; Fonseca, Rochele Paz
2015-10-15
The Montreal-Toulouse Language Assessment Battery - Brazilian version (MTL-BR) provides a general description of language processing and related components in adults with brain injury. The present study aimed at verifying the criterion-related validity of the Montreal-Toulouse Language Assessment Battery - Brazilian version (MTL-BR) by assessing its ability to discriminate between individuals with unilateral brain damage with and without aphasia. The investigation was carried out in a Brazilian community-based sample of 104 adults, divided into four groups: 26 participants with left hemisphere damage (LHD) with aphasia, 25 participants with right hemisphere damage (RHD), 28 with LHD non-aphasic, and 25 healthy adults. There were significant differences between patients with aphasia and the other groups on most total and subtotal scores on MTL-BR tasks. The results showed strong criterion-related validity evidence for the MTL-BR Battery, and provided important information regarding hemispheric specialization and interhemispheric cooperation. Future research is required to search for additional evidence of sensitivity, specificity and validity of the MTL-BR in samples with different types of aphasia and degrees of language impairment. Copyright © 2015 Elsevier B.V. All rights reserved.
Measuring Sexual Motives: A Test of the Psychometric Properties of the Sexual Motivations Scale.
Jardin, Charles; Garey, Lorra; Zvolensky, Michael J
2017-01-01
Sexual motives refer to functions served by sexual behavior. The Sex Motivations Scale (SMS) has frequently been used to assess sexual motives. At its development, the SMS demonstrated good internal consistency; convergent, divergent, and criterion validity; and configural invariance across sex, age, and Caucasians and African Americans. Yet the metric and scalar invariance of the SMS has not been examined, nor has the measurement invariance of the SMS across Hispanic and Asian Americans, sexual minority status, and relationship status been tested. The criterion validity of the SMS also has yet to be examined for nonintercourse sexual behaviors, such as sexting. The present study aimed to address these gaps in a diverse sample of 2,201 college students (77.60% female; M age = 22.06; 27.84% Caucasian). Results further affirmed the configural, metric, and scalar invariance of the SMS. The convergent and divergent validity of the SMS was supported in relation to positive and negative affect and attachment patterns; and specific SMS subscales demonstrated associations with sexual intercourse behaviors and sexting, supporting the criterion validity of the SMS. These findings suggest the relevance of the SMS in assessing sexual motives across diverse populations and behaviors.
Comparison of the Incremental Validity of the Old and New MCAT.
ERIC Educational Resources Information Center
Wolf, Fredric M.; And Others
The predictive and incremental validity of both the Old and New Medical College Admission Test (MCAT) was examined and compared with a sample of over 300 medical students. Results of zero order and incremental validity coefficients, as well as prediction models resulting from all possible subsets regression analyses using Mallows' Cp criterion,…
Validity of the Mayer-Salovey-Caruso Emotional Intelligence Test: Youth Version-Research Edition
ERIC Educational Resources Information Center
Peters, Christine; Kranzler, John H.; Rossen, Eric
2009-01-01
This study examines the criterion-related validity evidence of scores on the Mayer-Salovey-Caruso Emotional Intelligence Test: Youth Version-Research Version. The authors also investigate the relationship between scores on the MSCEIT-YV and chronological age. Results provide initial support for the construct validity of the MSCEIT-YV but also…
ERIC Educational Resources Information Center
Daniel, Mark; And Others
A study examined the relationship of aptitudes to the performance of skilled technical jobs in engine manufacturing. During the study, several approaches were utilized, including criterion-referenced validation, taxonomic validation, construct validation, and detailed analysis of the behaviors involved in performing the jobs. The study sample…
Persoskie, Alexander; Nguyen, Anh B.; Kaufman, Annette R.; Tworek, Cindy
2017-01-01
Beliefs about the relative harmfulness of one product compared to another (perceived relative harm) are central to research and regulation concerning tobacco and nicotine-containing products, but techniques for measuring such beliefs vary widely. We compared the validity of direct and indirect measures of perceived harm of e-cigarettes and smokeless tobacco (SLT) compared to cigarettes. On direct measures, participants explicitly compare the harmfulness of each product. On indirect measures, participants rate the harmfulness of each product separately, and ratings are compared. The U.S. Health Information National Trends Survey (HINTS-FDA-2015; N=3738) included direct measures of perceived harm of e-cigarettes and SLT compared to cigarettes. Indirect measures were created by comparing ratings of harm from e-cigarettes, SLT, and cigarettes on 3-point scales. Logistic regressions tested validity by assessing whether direct and indirect measures were associated with criterion variables including: ever-trying e-cigarettes, ever-trying snus, and SLT use status. Compared to the indirect measures, the direct measures of harm were more consistently associated with criterion variables. On direct measures, 26% of adults rated e-cigarettes as less harmful than cigarettes, and 11% rated SLT as less harmful than cigarettes. Direct measures appear to provide valid information about individuals’ harm beliefs, which may be used to inform research and tobacco control policy. Further validation research is encouraged. PMID:28073035
[Design and validation of a questionnaire for psychosocial nursing diagnosis in Primary Care].
Brito-Brito, Pedro Ruymán; Rodríguez-Álvarez, Cristobalina; Sierra-López, Antonio; Rodríguez-Gómez, José Ángel; Aguirre-Jaime, Armando
2012-01-01
To develop a valid, reliable and easy-to-use questionnaire for a psychosocial nursing diagnosis. The study was performed in two phases: first phase, questionnaire design and construction; second phase, validity and reliability tests. A bank of items was constructed using the NANDA classification as a theoretical framework. Each item was assigned a Likert scale or dichotomous response. The combination of responses to the items constituted the diagnostic rules to assign up to 28 labels. A group of experts carried out the validity test for content. Other validated scales were used as reference standards for the criterion validity tests. Forty-five nurses provided the questionnaire to the patients on three separate occasions over a period of three weeks, and the other validated scales only once to 188 randomly selected patients in Primary Care centres in Tenerife (Spain). Validity tests for construct confirmed the six dimensions of the questionnaire with 91% of total variance explained. Validity tests for criterion showed a specificity of 66%-100%, and showed high correlations with the reference scales when the questionnaire was assigning nursing diagnoses. Reliability tests showed agreement of 56%-91% (P<.001), and a 93% internal consistency. The Questionnaire for Psychosocial Nursing Diagnosis was called CdePS, and included 61 items. The CdePS is a valid, reliable and easy-to-use tool in Primary Care centres to improve the assigning of a psychosocial nursing diagnosis. Copyright © 2011 Elsevier España, S.L. All rights reserved.
ERIC Educational Resources Information Center
Armstrong, William B.
As part of an effort to statistically validate the placement tests used in California's San Diego Community College District (SDCCD) a study was undertaken to review the criteria- and content-related validity of the Assessment and Placement Services (APS) reading and writing tests. Evidence of criteria and content validity was gathered from…
Huang, X N; Zhang, Y; Feng, W W; Wang, H S; Cao, B; Zhang, B; Yang, Y F; Wang, H M; Zheng, Y; Jin, X M; Jia, M X; Zou, X B; Zhao, C X; Robert, J; Jing, Jin
2017-06-02
Objective: To evaluate the reliability and validity of the warning signs checklist developed by the National Health and Family Planning Commission of the People's Republic of China (NHFPC), so as to determine the screening effectiveness of warning signs on developmental problems of early childhood. Method: Stratified random sampling was used to assess the reliability and validity of the warning signs checklist, and 2 110 children 0 to 6 years of age (1 513 low-risk subjects and 597 high-risk subjects) were recruited from 11 provinces of China. The reliability evaluation for the warning signs included the test-retest reliability and interrater reliability. With the use of Age and Stage Questionnaire (ASQ) and Gesell Development Diagnosis Scale (GESELL) as the criterion scales, criterion validity was assessed by determining the correlation and consistency between the screening results of warning signs and the criterion scales. Result: In terms of the warning signs, the screening positive rates at different ages ranged from 10.8% (21/141) to 26.2% (51/137). The median (interquartile) testing time for each subject was 1 (0.6) minute. Both the test-retest reliability and interrater reliability of warning signs reached 0.7 or above, indicating that the stability was good. In terms of validity assessment, there was remarkable consistency between ASQ and warning signs, with the Kappa value of 0.63. With the use of GESELL as criterion, it was determined that the sensitivity of warning signs in children with suspected developmental delay was 82.2%, and the specificity was 77.7%. The overall Youden index was 0.6. Conclusion: The reliability and validity of warning signs checklist for screening early childhood developmental problems have met the basic requirements of psychological screening scales, with the characteristics of short testing time and easy operation.
Thus, this warning signs checklist can be used for screening psychological and behavioral problems of early childhood, especially in community settings.
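As a worked check of the figures in the abstract above, the Youden index is sensitivity plus specificity minus one. The sketch below assumes the reported overall index was derived from the reported sensitivity (82.2%) and specificity (77.7%); the function name is illustrative and not taken from the study.

```python
def youden_index(sensitivity: float, specificity: float) -> float:
    """Youden's J = sensitivity + specificity - 1, with both inputs as proportions."""
    return sensitivity + specificity - 1.0


# Values reported in the abstract (assumed to underlie the overall index).
j = youden_index(0.822, 0.777)
print(round(j, 1))  # rounds to the reported 0.6
```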
An evidence-based decision assistance model for predicting training outcome in juvenile guide dogs
Craigon, Peter J.; Blythe, Simon A.; England, Gary C. W.; Asher, Lucy
2017-01-01
Working dog organisations, such as Guide Dogs, need to regularly assess the behaviour of the dogs they train. In this study we developed a questionnaire-style behaviour assessment completed by training supervisors of juvenile guide dogs aged 5, 8 and 12 months old (n = 1,401), and evaluated aspects of its reliability and validity. Specifically, internal reliability, temporal consistency, construct validity, predictive criterion validity (comparing against later training outcome) and concurrent criterion validity (comparing against a standardised behaviour test) were evaluated. Thirty-nine questions were sourced either from previously published literature or created to meet requirements identified via Guide Dogs staff surveys and staff feedback. Internal reliability analyses revealed seven reliable and interpretable trait scales named according to the questions within them as: Adaptability; Body Sensitivity; Distractibility; Excitability; General Anxiety; Trainability and Stair Anxiety. Intra-individual temporal consistency of the scale scores between 5–8, 8–12 and 5–12 months was high. All scales excepting Body Sensitivity showed some degree of concurrent criterion validity. Predictive criterion validity was supported for all seven scales, since associations were found with training outcome at at least one age. Thresholds of z-scores on the scales were identified that were able to distinguish later training outcome by identifying 8.4% of all dogs withdrawn for behaviour and 8.5% of all qualified dogs, with 84% and 85% specificity. The questionnaire assessment was reliable and could detect traits that are consistent within individuals over time, despite juvenile dogs undergoing development during the study period. By applying thresholds to scores produced from the questionnaire this assessment could prove to be a highly valuable decision-making tool for Guide Dogs.
This is the first questionnaire-style assessment of juvenile dogs that has shown value in predicting the training outcome of individual working dogs. PMID:28614347
Miller, Joshua D; Lynam, Donald R
2012-07-01
Since its publication, the Psychopathic Personality Inventory and its revision (Lilienfeld & Andrews, 1996; Lilienfeld & Widows, 2005) have become increasingly popular such that it is now among the most frequently used self-report inventories for the assessment of psychopathy. The current meta-analysis examined the relations between the two PPI factors (factor 1: Fearless Dominance; factor 2: Self-Centered Impulsivity), as well as their relations with other validated measures of psychopathy, internalizing and externalizing forms of psychopathology, general personality traits, and antisocial personality disorder symptoms. Across 61 samples reported in 49 publications, we found support for the convergent and criterion validity of both PPI factor 2 and the PPI total score. Much weaker validation was found for PPI factor 1, which manifested limited convergent validity and a pattern of correlations with central criterion variables that was inconsistent with many conceptualizations of psychopathy. PsycINFO Database Record (c) 2012 APA, all rights reserved.
Measuring violence risk and outcomes among Mexican American adolescent females.
Cervantes, Richard C; Duenas, Norma; Valdez, Avelardo; Kaplan, Charles
2006-01-01
Central to the development of culturally competent violence prevention programs for Hispanic youth is the development of psychometrically sound violence risk and outcome measures for this population. A study was conducted to determine the psychometric properties of two commonly used violence measures, in this case for Mexican American adolescent females. The Conflict Tactics Scales (CTS2) and the Past Feelings and Acts of Violence Scale (PFAV) were analyzed to examine their interitem reliability, criterion validity, and discriminant validity. A sample of 150 low-risk and 150 high-risk adolescent females was studied. Discriminant validity was indicated by the perpetrator negotiation scale and by the victim psychological aggression and sexual coercion scales of the CTS2 and the PFAV. Analysis indicates that the CTS2 scales and the PFAV demonstrate adequate reliability, whereas strong criterion validity was evidenced by eight of the CTS2 scales and the PFAV.
Stinchfield, Randy; McCready, John; Turner, Nigel E; Jimenez-Murcia, Susana; Petry, Nancy M; Grant, Jon; Welte, John; Chapman, Heather; Winters, Ken C
2016-09-01
The DSM-5 was published in 2013 and it included two substantive revisions for gambling disorder (GD). These changes are the reduction in the threshold from five to four criteria and elimination of the illegal activities criterion. The purpose of this study was twofold. First, to assess the reliability, validity and classification accuracy of the DSM-5 diagnostic criteria for GD. Second, to compare the DSM-5 and DSM-IV on reliability, validity, and classification accuracy, including an examination of the effect of the elimination of the illegal acts criterion on diagnostic accuracy. To compare DSM-5 and DSM-IV, eight datasets from three different countries (Canada, USA, and Spain; total N = 3247) were used. All datasets were based on similar research methods. Participants were recruited from outpatient gambling treatment services to represent the group with a GD and from the community to represent the group without a GD. All participants were administered a standardized measure of diagnostic criteria. The DSM-5 yielded satisfactory reliability, validity and classification accuracy. In comparing the DSM-5 to the DSM-IV, most comparisons of reliability, validity and classification accuracy showed more similarities than differences. There was evidence of modest improvements in classification accuracy for DSM-5 over DSM-IV, particularly in reduction of false negative errors. This reduction in false negative errors was largely a function of lowering the cut score from five to four and this revision is an improvement over DSM-IV. From a statistical standpoint, eliminating the illegal acts criterion did not make a significant impact on diagnostic accuracy. From a clinical standpoint, illegal acts can still be addressed in the context of the DSM-5 criterion of lying to others.
Sindall, Paul; Lenton, John P.; Whytock, Katie; Tolfrey, Keith; Oyster, Michelle L.; Cooper, Rory A.; Goosey-Tolfrey, Victoria L.
2013-01-01
Purpose To compare the criterion validity and accuracy of a 1 Hz non-differential global positioning system (GPS) and data logger device (DL) for the measurement of wheelchair tennis court movement variables. Methods Initial validation of the DL device was performed. GPS and DL were fitted to the wheelchair and used to record distance (m) and speed (m/second) during (a) tennis field (b) linear track, and (c) match-play test scenarios. Fifteen participants were monitored at the Wheelchair British Tennis Open. Results Data logging validation showed underestimations for distance in right (DLR) and left (DLL) logging devices at speeds >2.5 m/second. In tennis-field tests, GPS underestimated distance in five drills. DLL was lower than both (a) criterion and (b) DLR in drills moving forward. Reversing drill direction showed that DLR was lower than (a) criterion and (b) DLL. GPS values for distance and average speed for match play were significantly lower than equivalent values obtained by DL (distance: 2816 (844) vs. 3952 (1109) m, P = 0.0001; average speed: 0.7 (0.2) vs. 1.0 (0.2) m/second, P = 0.0001). Higher peak speeds were observed in DL (3.4 (0.4) vs. 3.1 (0.5) m/second, P = 0.004) during tennis match play. Conclusions Sampling frequencies of 1 Hz are too low to accurately measure distance and speed during wheelchair tennis. GPS units with a higher sampling rate should be advocated in further studies. Modifications to existing DL devices may be required to increase measurement precision. Further research into the validity of movement devices during match play will further inform the demands and movement patterns associated with wheelchair tennis. PMID:23820154
38 CFR 18.442 - Admissions and recruitment.
Code of Federal Regulations, 2011 CFR
2011-07-01
... conduct periodic validity studies against the criterion of overall success in the education program or... use any test or criterion for admission that has a disproportionate, adverse effect on handicapped persons or any class of handicapped persons unless: (i) The test or criterion, as used by the recipient...
Uehara, Kosuke; Ogura, Koichi; Akiyama, Toru; Shinoda, Yusuke; Iwata, Shintaro; Kobayashi, Eisuke; Tanzawa, Yoshikazu; Yonemoto, Tsukasa; Kawano, Hirotaka; Kawai, Akira
2017-09-01
The Musculoskeletal Tumor Society (MSTS) scoring system developed in 1993 is a widely used disease-specific evaluation tool for assessment of physical function in patients with musculoskeletal tumors; however, only a few studies have confirmed its reliability and validity. The aim of this study was to validate the MSTS scoring system for the upper extremity (MSTS-UE) in Japanese patients with musculoskeletal tumors for use by others in research. Does the MSTS-UE have: (1) sufficient reliability and internal consistency; (2) adequate construct validity; and (3) reasonable criterion validity in comparison to the Toronto Extremity Salvage Score (TESS) or SF-36? Reliability was performed using test-retest analysis, and internal consistency was evaluated with Cronbach's alpha coefficient. Construct validity was evaluated using a scree plot to confirm the construct number and the Akaike information criterion network. Criterion validity was evaluated by comparing the MSTS-UE with the TESS and SF-36. The test-retest reliability with intraclass correlation coefficient (0.95; 95% CI, 0.91-0.97) was excellent, and internal consistency with Cronbach's α (0.7; 95% CI, 0.53-0.81) was acceptable. There were no ceiling and floor effects. The Akaike Information Criterion network showed that lifting ability, pain, and dexterity played central roles among the components. The MSTS-UE showed substantial correlation with the TESS scoring scale (r = 0.75; p < 0.001) and fair correlation with the SF-36 physical component summary (r = 0.37; p = 0.007). Although the MSTS-UE showed slight correlation with the SF-36 mental component summary, the emotional acceptance component of the MSTS-UE showed fair correlation (r = 0.29; p = 0.039). We can conclude that the MSTS is not an adequate measure of general health-related quality of life; however, this system was designed mainly to be a simple measure of function in a single extremity. 
To evaluate the mental state of patients with musculoskeletal tumors in the upper extremity, further study is needed.
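Several abstracts in this listing report Cronbach's alpha as an index of internal consistency. A minimal sketch of the standard formula, alpha = k/(k-1) × (1 − Σ item variances / variance of total scores), is given below. The function name and the example scores are hypothetical and are not data from any of the studies listed.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for k items.

    item_scores: list of k lists, one per item, each holding one score
    per respondent (all lists the same length).
    """
    k = len(item_scores)
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    # Total score per respondent, summed across items.
    totals = [sum(resp) for resp in zip(*item_scores)]
    item_var_sum = sum(variance(item) for item in item_scores)
    return k / (k - 1) * (1 - item_var_sum / variance(totals))


# Hypothetical data: three items answered by five respondents.
items = [
    [3, 4, 2, 5, 4],
    [2, 4, 3, 5, 3],
    [3, 5, 2, 4, 4],
]
print(cronbach_alpha(items))
```

Sample variances are used for both the items and the totals; the ratio is unchanged if population variances are used consistently instead.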
Amaya-Arias, Ana Carolina; Alzate, Juan Pablo; Eslava-Schmalbach, Javier H
2017-01-01
Background: This study aimed at determining the validity of the Pediatric Quality of Life Inventory 4.0 (PedsQL™ 4.0) for the measurement of health-related quality of life (HRQOL) in Colombian children. Methods: Validation study of measurement instruments. The PedsQL™ 4.0 was applied by convenience sampling to 375 pairs of children and adolescents between the ages of 5 and 17 and to their parents-caregivers, as well as to 125 parents-caregivers of children between the ages of 2 and 4 in five cities of Colombia (Bogota, Medellin, Cali, Barranquilla and Bucaramanga). Construct validity was assessed through the use of exploratory and confirmatory factor analysis, and criterion validity was assessed by correlations between the PedsQL™ 4.0 and the KIDSCREEN-27. Results: The instrument was applied to 375 children (ages 5–18) and 125 parents of children between the ages of 2 and 4. Factor analysis revealed four factors considered suitable for the sample in both the child and parent reports, whereas Bartlett's test of sphericity showed inter-correlation between variables. Scale and subscales showed proper indicators of internal consistency. It is recommended not to include or review some of the items in the Colombian version of the scale. Conclusions: The Spanish version for Colombia of the PedsQL™ 4.0 displays suitable indicators of criterion and construct validity, therefore becoming a valuable tool for measuring HRQOL in children in our country. Some modifications are recommended for the Colombian version of the scale. PMID:28900536
Rikli, Roberta E; Jones, C Jessie
2013-04-01
To develop and validate criterion-referenced fitness standards for older adults that predict the level of capacity needed for maintaining physical independence into later life. The proposed standards were developed for use with a previously validated test battery for older adults-the Senior Fitness Test (Rikli, R. E., & Jones, C. J. (2001). Development and validation of a functional fitness test for community--residing older adults. Journal of Aging and Physical Activity, 6, 127-159; Rikli, R. E., & Jones, C. J. (1999a). Senior fitness test manual. Champaign, IL: Human Kinetics.). A criterion measure to assess physical independence was identified. Next, scores from a subset of 2,140 "moderate-functioning" older adults from a larger cross-sectional database, together with findings from longitudinal research on physical capacity and aging, were used as the basis for proposing fitness standards (performance cut points) associated with having the ability to function independently. Validity and reliability analyses were conducted to test the standards for their accuracy and consistency as predictors of physical independence. Performance standards are presented for men and women ages 60-94 indicating the level of fitness associated with remaining physically independent until late in life. Reliability and validity indicators for the standards ranged between .79 and .97. The proposed standards provide easy-to-use, previously unavailable methods for evaluating physical capacity in older adults relative to that associated with physical independence. Most importantly, the standards can be used in planning interventions that target specific areas of weakness, thus reducing risk for premature loss of mobility and independence.
Community validation of the IDEA study cognitive screen in rural Tanzania.
Gray, William K; Paddick, Stella Maria; Collingwood, Cecilia; Kisoli, Aloyce; Mbowe, Godfrey; Mkenda, Sarah; Lissu, Carolyn; Rogathi, Jane; Kissima, John; Walker, Richard W; Mushi, Declare; Chaote, Paul; Ogunniyi, Adesola; Dotchin, Catherine L
2016-11-01
The dementia diagnosis gap in sub-Saharan Africa (SSA) is large, partly because of difficulties in screening for cognitive impairment in the community. As part of the Identification and Intervention for Dementia in Elderly Africans (IDEA) study, we aimed to validate the IDEA cognitive screen in a community-based sample in rural Tanzania. METHODS: Study participants were recruited from people who attended screening days held in villages within the rural Hai district of Tanzania. Criterion validity was assessed against the gold standard clinical dementia diagnosis using DSM-IV criteria. Construct validity was assessed against age, education, sex, grip strength and instrumental activities of daily living (IADLs). Internal consistency and floor and ceiling effects were also examined. During community screening, the IDEA cognitive screen had high criterion validity, with an area under the receiver operating characteristic curve of 0.855 (95% CI 0.794 to 0.915). Higher scores on the screen were significantly correlated with lower age, male sex, having attended school, better grip strength and improved performance in activities of daily living. Factor analysis revealed a single factor with an eigenvalue greater than one, although internal consistency was only moderate (Cronbach's alpha = 0.534). The IDEA cognitive screen had high criterion and construct validity and is suitable for use as a cognitive screening instrument in a community setting in SSA. Only moderate internal consistency may partly reflect the multi-domain nature of dementia as diagnosed clinically. Copyright © 2016 John Wiley & Sons, Ltd.
Huber, J; Hüsler, J; Dieppe, P; Günther, K P; Dreinhöfer, K; Judge, A
2016-03-01
To validate a new method to identify responders (relative effect per patient (REPP) >0.2) using the OMERACT-OARSI criteria as gold standard in a large multicentre sample. The REPP ([score before - after treatment]/score before treatment) was calculated for 845 patients of a large multicenter European cohort study for THR. The patients with a REPP >0.2 were defined as responders. The responder rate was compared to the gold standard (OMERACT-OARSI criteria) using receiver operator characteristic (ROC) curve analysis for sensitivity, specificity and percentage of appropriately classified patients. With the criterion REPP>0.2 85.4% of the patients were classified as responders, applying the OARSI-OMERACT criteria 85.7%. The new method had 98.8% sensitivity, 94.2% specificity and 98.1% of the patients were correctly classified compared to the gold standard. The external validation showed a high sensitivity and also specificity of a new criterion to identify a responder compared to the gold standard method. It is simple and has no uncertainties due to a single classification criterion. Copyright © 2015 The Authors. Published by Elsevier Ltd. All rights reserved.
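The REPP definition in the abstract above, (score before − score after) / score before with a responder threshold of 0.2, can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the function names and the patient scores are hypothetical, and the score is assumed to be one on which lower values after treatment mean improvement.

```python
def repp(score_before: float, score_after: float) -> float:
    """Relative effect per patient: (before - after) / before."""
    if score_before == 0:
        raise ValueError("REPP is undefined when the baseline score is zero")
    return (score_before - score_after) / score_before


def is_responder(score_before: float, score_after: float, threshold: float = 0.2) -> bool:
    """Responder if REPP exceeds the threshold (0.2 in the abstract)."""
    return repp(score_before, score_after) > threshold


def sensitivity_specificity(predicted, gold):
    """Sensitivity and specificity of boolean predictions against a gold standard."""
    pairs = list(zip(predicted, gold))
    tp = sum(p and g for p, g in pairs)
    tn = sum((not p) and (not g) for p, g in pairs)
    fp = sum(p and (not g) for p, g in pairs)
    fn = sum((not p) and g for p, g in pairs)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical (before, after) scores and hypothetical gold-standard labels.
scores = [(50, 30), (40, 38), (60, 20), (45, 44)]
gold = [True, False, True, False]
predicted = [is_responder(b, a) for b, a in scores]
print(sensitivity_specificity(predicted, gold))
```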
Link, William; Sauer, John R.
2016-01-01
The analysis of ecological data has changed in two important ways over the last 15 years. The development and easy availability of Bayesian computational methods has allowed and encouraged the fitting of complex hierarchical models. At the same time, there has been increasing emphasis on acknowledging and accounting for model uncertainty. Unfortunately, the ability to fit complex models has outstripped the development of tools for model selection and model evaluation: familiar model selection tools such as Akaike's information criterion and the deviance information criterion are widely known to be inadequate for hierarchical models. In addition, little attention has been paid to the evaluation of model adequacy in context of hierarchical modeling, i.e., to the evaluation of fit for a single model. In this paper, we describe Bayesian cross-validation, which provides tools for model selection and evaluation. We describe the Bayesian predictive information criterion and a Bayesian approximation to the BPIC known as the Watanabe-Akaike information criterion. We illustrate the use of these tools for model selection, and the use of Bayesian cross-validation as a tool for model evaluation, using three large data sets from the North American Breeding Bird Survey.
Eckner, James T.; Richardson, James K.; Kim, Hogene; Joshi, Monica S.; Oh, Youkeun K.; Ashton-Miller, James A.
2015-01-01
Summary Slowed reaction time (RT) represents both a risk factor for and a consequence of sport concussion. The purpose of this study was to determine the reliability and criterion validity of a novel clinical test of simple and complex RT, called RTclin, in contact sport athletes. Both tasks were adapted from the well-known ruler drop test of RT and involve manually grasping a falling vertical shaft upon its release, with the complex task employing a go/no-go paradigm based on a slight cue. In 46 healthy contact sport athletes (24 males; M = 16.3 yr., SD = 5.0; 22 women: M age= 15.0 yr., SD = 4.0) whose sports included soccer, ice hockey, American football, martial arts, wrestling, and lacrosse, the latency and accuracy of simple and complex RTclin had acceptable test-retest and inter-rater reliabilities and correlated with a computerized criterion standard, the Axon Computerized Cognitive Assessment Tool. Medium to large effect sizes were found. The novel RTclin tests have acceptable reliability and criterion validity for clinical use and hold promise as concussion assessment tools. PMID:26106803
Is Echinococcus intermedius a valid species?
USDA-ARS's Scientific Manuscript database
Medical and veterinary sciences require scientific names to discriminate pathogenic organisms in our living environment. Various species concepts have been proposed for metazoan animals. There are, however, constant controversies over their validity because of lack of a common criterion to define ...
Sierpińska, Lidia
2013-09-01
The Authentic Leadership Questionnaire (ALQ) is a standardized research instrument for the evaluation of individual elements of a leader's conduct which contribute to authentic leadership. The application of this questionnaire in Polish conditions required a validation process. The aim of the study was to evaluate the validity and reliability of the Polish version of the American research instrument for the needs of evaluation of authenticity of leadership of the nursing management in Polish hospitals. The study covered 286 nurses (143 head nurses and 143 of their subordinates) employed in 45 hospitals in Poland. Theoretical validity of the instrument was evaluated using Fisher's transformation (Pearson's r correlation coefficient), while the criterion validity of the ALQ was evaluated using Spearman's rho correlation coefficient and the BOHIPSZO questionnaire. The reliability of the ALQ was assessed by means of the Cronbach-alpha coefficient. The ALQ questionnaire applied for the evaluation of authenticity of leadership of the nursing management in Polish hospital wards shows an acceptable theoretical and criterion validity and reliability (Cronbach-alpha coefficient 0.80). The Polish version of the ALQ is valid and reliable, and may be applied in studies concerning the evaluation of authenticity of leadership of the nursing management in Polish hospital wards.
7 CFR 15b.30 - Admissions and recruitment.
Code of Federal Regulations, 2011 CFR
2011-01-01
... first year grades, but shall conduct periodic validity studies against the criterion of overall success... admitted; (2) May not make use of any test or criterion for admission that has a disproportionate, adverse effect on handicapped persons or any class of handicapped persons unless (i) the test or criterion, as...
Wilson, G. Terence; Sysko, Robyn
2013-01-01
Objective In DSM-IV, to be diagnosed with Bulimia Nervosa (BN) or the provisional diagnosis of Binge Eating Disorder (BED), an individual must experience episodes of binge eating “at least twice a week” on average, for three or six months respectively. The purpose of this review was to examine the validity and utility of the frequency criterion for BN and BED. Method Published studies evaluating the frequency criterion were reviewed. Results Our review found little evidence to support the validity or utility of the DSM-IV frequency criterion of twice a week binge eating; however, the number of studies available for our review was limited. Conclusion A number of options are available for the frequency criterion in DSM-V, and the optimal diagnostic threshold for binge eating remains to be determined. PMID:19610014
Fernández-Domínguez, Juan Carlos; de Pedro-Gómez, Joan Ernest; Morales-Asencio, José Miguel; Sastre-Fullana, Pedro; Sesé-Abad, Albert
2017-01-01
Introduction Most of the EBP measuring instruments available to date present limitations both in the operationalisation of the construct and also in the rigour of their psychometric development, as revealed in the literature review performed. The aim of this paper is to provide rigorous and adequate reliability and validity evidence of the scores of a new transdisciplinary psychometric tool, the Health Sciences Evidence-Based Practice (HS-EBP), for measuring the construct EBP in Health Sciences professionals. Methods A pilot study and a subsequent two-stage validation test sample were conducted to progressively refine the instrument until a reduced 60-item version with a five-factor latent structure. Reliability was analysed through both Cronbach’s alpha coefficient and intraclass correlations (ICC). Latent structure was contrasted using confirmatory factor analysis (CFA) following a model comparison approach. Evidence of criterion validity of the scores obtained was achieved by considering attitudinal resistance to change, burnout, and quality of professional life as criterion variables; while convergent validity was assessed using the Spanish version of the Evidence-Based Practice Questionnaire (EBPQ-19). Results Adequate evidence of both reliability and ICC was obtained for the five dimensions of the questionnaire. According to the CFA model comparison, the best fit corresponded to the five-factor model (RMSEA = 0.049; CI 90% RMSEA = [0.047; 0.050]; CFI = 0.99). Adequate criterion and convergent validity evidence was also provided. Finally, the HS-EBP showed the capability to find differences between EBP training levels as an important evidence of decision validity. Conclusions Reliability and validity evidence obtained regarding the HS-EBP confirm the adequate operationalisation of the EBP construct as a process put into practice to respond to every clinical situation arising in the daily practice of professionals in health sciences (transprofessional).
The tool could be useful for EBP individual assessment and for evaluating the impact of specific interventions to improve EBP. PMID:28486533
Fernández-Domínguez, Juan Carlos; de Pedro-Gómez, Joan Ernest; Morales-Asencio, José Miguel; Bennasar-Veny, Miquel; Sastre-Fullana, Pedro; Sesé-Abad, Albert
2017-01-01
Most of the EBP measuring instruments available to date present limitations both in the operationalisation of the construct and also in the rigour of their psychometric development, as revealed in the literature review performed. The aim of this paper is to provide rigorous and adequate reliability and validity evidence of the scores of a new transdisciplinary psychometric tool, the Health Sciences Evidence-Based Practice (HS-EBP), for measuring the construct EBP in Health Sciences professionals. A pilot study and a subsequent two-stage validation test sample were conducted to progressively refine the instrument until a reduced 60-item version with a five-factor latent structure. Reliability was analysed through both Cronbach's alpha coefficient and intraclass correlations (ICC). Latent structure was contrasted using confirmatory factor analysis (CFA) following a model comparison approach. Evidence of criterion validity of the scores obtained was achieved by considering attitudinal resistance to change, burnout, and quality of professional life as criterion variables; while convergent validity was assessed using the Spanish version of the Evidence-Based Practice Questionnaire (EBPQ-19). Adequate evidence of both reliability and ICC was obtained for the five dimensions of the questionnaire. According to the CFA model comparison, the best fit corresponded to the five-factor model (RMSEA = 0.049; CI 90% RMSEA = [0.047; 0.050]; CFI = 0.99). Adequate criterion and convergent validity evidence was also provided. Finally, the HS-EBP showed the capability to find differences between EBP training levels as an important evidence of decision validity. Reliability and validity evidence obtained regarding the HS-EBP confirm the adequate operationalisation of the EBP construct as a process put into practice to respond to every clinical situation arising in the daily practice of professionals in health sciences (transprofessional).
The tool could be useful for EBP individual assessment and for evaluating the impact of specific interventions to improve EBP.
Sullivan, Ruth; Kinra, Sanjay; Ekelund, Ulf; Bharathi, A V; Vaz, Mario; Kurpad, Anura; Collier, Tim; Reddy, K Srinath; Prabhakaran, Dorairaj; Ebrahim, Shah; Kuper, Hannah
2012-02-09
Socio-cultural differences for country-specific activities are rarely addressed in physical activity questionnaires. We examined the reliability and validity of the Indian Migration Study Physical Activity Questionnaire (IMS-PAQ) in urban and rural groups in India. A sub-sample of IMS participants (n = 479) was used to examine short term (≤ 1 month [n = 158]) and long term (> 1 month [n = 321]) IMS-PAQ reliability for levels of total, sedentary, light and moderate/vigorous activity (MVPA) intensity using intraclass correlation (ICC) and kappa coefficients (k). Criterion validity (n = 157) was examined by comparing the IMS-PAQ to a uniaxial accelerometer (ACC) worn ≥ 4 days, via Spearman's rank correlations (ρ) and k, using Bland-Altman plots to check for systematic bias. Construct validity (n = 7,000) was established using linear regression, comparing IMS-PAQ against theoretical constructs associated with physical activity (PA): BMI [kg/m2], percent body fat and pulse rate. IMS-PAQ reliability ranged from ICC 0.42-0.88 and k = 0.37-0.61 (≤ 1 month) and ICC 0.26 to 0.62; kappa 0.17 to 0.45 (> 1 month). Criterion validity was ρ = 0.18-0.48; k = 0.08-0.34. Light activity was underestimated and MVPA consistently and substantially overestimated for the IMS-PAQ vs. the accelerometer. Criterion validity was moderate for total activity and MVPA. Reliability and validity were comparable for urban and rural participants but lower in women than men. Increasing time spent in total activity or MVPA, and decreasing time in sedentary activity were associated with decreasing BMI, percent body fat and pulse rate, thereby demonstrating construct validity. IMS-PAQ reliability and validity is similar to comparable self-reported instruments. It is an appropriate tool for ranking PA of individuals in India. Some refinements may be required for sedentary populations and women in India.
2012-01-01
Background: Socio-cultural differences for country-specific activities are rarely addressed in physical activity questionnaires. We examined the reliability and validity of the Indian Migration Study Physical Activity Questionnaire (IMS-PAQ) in urban and rural groups in India. Methods: A sub-sample of IMS participants (n = 479) was used to examine short term (≤1 month [n = 158]) and long term (> 1 month [n = 321]) IMS-PAQ reliability for levels of total, sedentary, light and moderate/vigorous activity (MVPA) intensity using intraclass correlation (ICC) and kappa coefficients (k). Criterion validity (n = 157) was examined by comparing the IMS-PAQ to a uniaxial accelerometer (ACC) worn ≥4 days, via Spearman's rank correlations (ρ) and k, using Bland-Altman plots to check for systematic bias. Construct validity (n = 7,000) was established using linear regression, comparing IMS-PAQ against theoretical constructs associated with physical activity (PA): BMI [kg/m2], percent body fat and pulse rate. Results: IMS-PAQ reliability ranged from ICC 0.42-0.88 and k = 0.37-0.61 (≤1 month) and ICC 0.26 to 0.62; kappa 0.17 to 0.45 (> 1 month). Criterion validity was ρ = 0.18-0.48; k = 0.08-0.34. Light activity was underestimated and MVPA consistently and substantially overestimated for the IMS-PAQ vs. the accelerometer. Criterion validity was moderate for total activity and MVPA. Reliability and validity were comparable for urban and rural participants but lower in women than men. Increasing time spent in total activity or MVPA, and decreasing time in sedentary activity were associated with decreasing BMI, percent body fat and pulse rate, thereby demonstrating construct validity. Conclusion: IMS-PAQ reliability and validity is similar to comparable self-reported instruments. It is an appropriate tool for ranking PA of individuals in India. Some refinements may be required for sedentary populations and women in India. PMID:22321669
[Development and Validation of the Academic Resilience Inventory for Nursing Students in Taiwan].
Li, Cheng-Chieh; Wei, Chi-Fang; Tung, Yuk-Ying
2017-10-01
Failure to cope with learning pressures has been shown to influence the learning achievement and professional performance of nursing students. In order to enable nursing students to adapt successfully to their academic stress, it is essential to explore their academic resilience in the process of learning. To develop the Academic Resilience Inventory for Nursing Students (ARINS) and to test its reliability and validity. A total of 611 nursing students in central and southern Taiwan were recruited as participants. We divided the sample into two subsamples randomly using R software. The first sample was used to conduct item analysis and exploratory factor analysis. The other sample was used to conduct confirmatory factor analysis, cross validation, and criterion-related validity. There are 15 items in the ARINS, with cognitive maturity, emotional regulation, and help-seeking behavior used as the measurement indicators of academic resilience in nursing students. The assessed goodness-of-fit index indicates that the model fit the data well based upon the CFA and has good convergent validity and discriminant validity. Criterion-related validity was supported by the correlation among ARINS, learning performance and attitude, hope and optimism, and depression. The ARINS has good reliability and validity and is a suitable measure of academic resilience in nursing students. It is helpful for nursing students to examine their academic stress and coping efficacy in the learning process.
Lifesource XL-18 pedometer for measuring steps under controlled and free-living conditions.
Liu, Sam; Brooks, Dina; Thomas, Scott; Eysenbach, Gunther; Nolan, Robert Peter
2015-01-01
The primary aim was to examine the criterion and construct validity and test-retest reliability of the Lifesource XL-18 pedometer (A&D Medical, Toronto, ON, Canada) for measuring steps under controlled and free-living activities. The influence of body mass index, waist size and walking speed on the criterion validity of XL-18 was also explored. Forty adults (35-74 years) performed a 6-min walk test in the controlled condition, and the criterion validity of XL-18 was assessed by comparing it to steps counted manually. Thirty-five adults participated in the free-living condition and the construct validity of XL-18 was assessed by comparing it to Yamax SW-200 (YAMAX Health & Sports, Inc., San Antonio, TX, USA). During the controlled condition, XL-18 did not significantly differ from criterion (P > 0.05) and no systematic error was found using Bland-Altman analysis. The accuracy of XL-18 decreased with slower walking speed (P = 0.001). During the free-living condition, Bland-Altman analysis revealed that XL-18 overestimated daily steps by 327 ± 118 relative to the Yamax (P = 0.004). However, the absolute percent error (APE) (6.5 ± 0.58%) was still within an acceptable range. XL-18 did not differ statistically between pant pockets. XL-18 is suitable for measuring steps in controlled and free-living conditions. However, caution may be required when interpreting the steps recorded under slower speeds and free-living conditions.
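Several of the abstracts in this listing rely on Bland-Altman analysis to check for systematic bias between a device or questionnaire and a criterion measure. The following is a minimal sketch of the usual computation (mean difference and 95% limits of agreement), assuming paired measurements; the function name `bland_altman` and the sample numbers are hypothetical and are not taken from any of the studies.

```python
from statistics import mean, stdev


def bland_altman(test_values, criterion_values):
    """Return (bias, lower_limit, upper_limit) for paired measurements.

    bias is the mean of (test - criterion); the limits of agreement are
    bias +/- 1.96 * SD of the differences (sample standard deviation).
    """
    if len(test_values) != len(criterion_values):
        raise ValueError("paired measurements must have equal length")
    diffs = [t - c for t, c in zip(test_values, criterion_values)]
    bias = mean(diffs)
    sd = stdev(diffs)  # needs at least two pairs
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical step counts: device under test vs. manually counted criterion
device = [10, 12, 14, 16]
manual = [9, 12, 13, 14]
bias, lower, upper = bland_altman(device, manual)
print(f"bias={bias:.2f}, limits of agreement=({lower:.2f}, {upper:.2f})")
```

A bias close to zero with narrow limits suggests agreement with the criterion; a non-zero bias, as reported for the free-living condition, indicates systematic over- or underestimation.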
Validation and cross cultural adaptation of the Italian version of the Harris Hip Score.
Dettoni, Federico; Pellegrino, Pietro; La Russa, Massimo R; Bonasia, Davide E; Blonna, Davide; Bruzzone, Matteo; Castoldi, Filippo; Rossi, Roberto
2015-01-01
The Harris Hip Score (HHS) is one of the most widely used health related quality of life (HRQOL) measures for the assessment of hip pathology: in spite of this, a validation study and an official Italian version have not been provided yet. The aim of this study was to create a valid and reliable Italian version of the HHS. The score was translated and modified in Italian; then 103 patients with different hip pathologies were evaluated using this HHS version and also with the WOMAC and the SF-12 questionnaires. Content, construct and criterion validities were tested, as well as interobserver reliability, test-retest reliability and internal consistency. Cross-cultural adaptation was easy, and only minor adaptation was required in the translation process. Construct and criterion validity of the HHS Italian Version were confirmed by satisfactory values of Spearman's Rho for correlation between specific domains of HHS and WOMAC and SF-12 scores. Interobserver and test-retest reliabilities obtained values of 0.996 and 0.975 respectively; Cronbach's alpha for internal consistency was 0.816. Statistical and clinical analysis showed that HHS is highly valid and reliable in this new Italian version.
Assessment scale of risk for surgical positioning injuries
Lopes, Camila Mendonça de Moraes; Haas, Vanderlei José; Dantas, Rosana Aparecida Spadoti; de Oliveira, Cheila Gonçalves; Galvão, Cristina Maria
2016-01-01
ABSTRACT Objective: to build and validate a scale to assess the risk of surgical positioning injuries in adult patients. Method: methodological research, conducted in two phases: construction and face and content validation of the scale and field research, involving 115 patients. Results: the Risk Assessment Scale for the Development of Injuries due to Surgical Positioning contains seven items, each of which presents five subitems. The scale score ranges between seven and 35 points in which, the higher the score, the higher the patient's risk. The Content Validity Index of the scale corresponded to 0.88. The application of Student's t-test for equality of means revealed the concurrent criterion validity between the scores on the Braden scale and the constructed scale. To assess the predictive criterion validity, the association was tested between the presence of pain deriving from surgical positioning and the development of pressure ulcer, using the score on the Risk Assessment Scale for the Development of Injuries due to Surgical Positioning (p<0.001). The interrater reliability was verified using the intraclass correlation coefficient, equal to 0.99 (p<0.001). Conclusion: the scale is a valid and reliable tool, but further research is needed to assess its use in clinical practice. PMID:27579925
Soble, Jason R; Bain, Kathleen M; Bailey, K Chase; Kirton, Joshua W; Marceaux, Janice C; Critchfield, Edan A; McCoy, Karin J M; O'Rourke, Justin J F
2018-01-08
Embedded performance validity tests (PVTs) allow for continuous assessment of invalid performance throughout neuropsychological test batteries. This study evaluated the utility of the Wechsler Memory Scale-Fourth Edition (WMS-IV) Logical Memory (LM) Recognition score as an embedded PVT using the Advanced Clinical Solutions (ACS) for WAIS-IV/WMS-IV Effort System. This mixed clinical sample was comprised of 97 total participants, 71 of whom were classified as valid and 26 as invalid based on three well-validated, freestanding criterion PVTs. Overall, the LM embedded PVT demonstrated poor concordance with the criterion PVTs and unacceptable psychometric properties using ACS validity base rates (42% sensitivity/79% specificity). Moreover, 15-39% of participants obtained an invalid ACS base rate despite having a normatively-intact age-corrected LM Recognition total score. Receiver operating characteristic curve analysis revealed a Recognition total score cutoff of < 61% correct improved specificity (92%) while sensitivity remained weak (31%). Thus, results indicated the LM Recognition embedded PVT is not appropriate for use from an evidence-based perspective, and that clinicians may be faced with reconciling how a normatively intact cognitive performance on the Recognition subtest could simultaneously reflect invalid performance validity.
Alyusuf, Raja H; Prasad, Kameshwar; Abdel Satir, Ali M; Abalkhail, Ali A; Arora, Roopa K
2013-01-01
The exponential use of the internet as a learning resource, coupled with the varied quality of many websites, has led to a need to identify suitable websites for teaching purposes. The aim of this study is to develop and to validate a tool, which evaluates the quality of undergraduate medical educational websites; and apply it to the field of pathology. A tool was devised through several steps of item generation, reduction, weightage, pilot testing, post-pilot modification of the tool and validating the tool. Tool validation included measurement of inter-observer reliability; and generation of criterion related, construct related and content related validity. The validated tool was subsequently tested by applying it to a population of pathology websites. Reliability testing showed a high internal consistency reliability (Cronbach's alpha = 0.92), high inter-observer reliability (Pearson's correlation r = 0.88), intraclass correlation coefficient = 0.85 and κ = 0.75. It showed high criterion related, construct related and content related validity. The tool showed moderately high concordance with the gold standard (κ = 0.61); 92.2% sensitivity, 67.8% specificity, 75.6% positive predictive value and 88.9% negative predictive value. The validated tool was applied to 278 websites; 29.9% were rated as recommended, 41.0% as recommended with caution and 29.1% as not recommended. A systematic tool was devised to evaluate the quality of websites for medical educational purposes. The tool was shown to yield reliable and valid inferences through its application to pathology websites.
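Sensitivity, specificity, predictive values and Cohen's κ against a gold standard, as reported in this and several neighbouring abstracts, all derive from a 2×2 classification table. The sketch below shows the standard formulas; the function name `diagnostic_metrics` and the cell counts are hypothetical illustrations, since the abstracts do not report the underlying tables.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Compute accuracy indices from a 2x2 table against a gold standard.

    tp/fn: gold-standard positives rated positive/negative by the tool
    fp/tn: gold-standard negatives rated positive/negative by the tool
    """
    n = tp + fp + fn + tn
    observed = (tp + tn) / n  # observed agreement
    # chance agreement expected from the marginal totals
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "kappa": (observed - expected) / (1 - expected),
    }


# Hypothetical counts for illustration only
for name, value in diagnostic_metrics(tp=40, fp=10, fn=5, tn=45).items():
    print(f"{name}: {value:.3f}")
```

Note that the predictive values depend on the prevalence of the condition in the sample, whereas sensitivity and specificity do not, which is one reason abstracts usually report all four.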
Examining the validity of self-reports on scales measuring students' strategic processing.
Samuelstuen, Marit S; Bråten, Ivar
2007-06-01
Self-report inventories trying to measure strategic processing at a global level have been much used in both basic and applied research. However, the validity of global strategy scores is open to question because such inventories assess strategy perceptions outside the context of specific task performance. The primary aim was to examine the criterion-related and construct validity of the global strategy data obtained with the Cross-Curricular Competencies (CCC) scale. Additionally, we wanted to compare the validity of these data with the validity of data obtained with a task-specific self-report inventory focusing on the same types of strategies. The sample included 269 10th-grade students from 12 different junior high schools. Global strategy use as assessed with the CCC was compared with task-specific strategy use reported in three different reading situations. Moreover, relationships between scores on the CCC and scores on measures of text comprehension were examined and compared with relationships between scores on the task-specific strategy measure and the same comprehension measures. The comparison between the CCC strategy scores and the task-specific strategy scores suggested only modest criterion-related validity for the data obtained with the global strategy inventory. The CCC strategy scores were also not related to the text comprehension measures, indicating poor construct validity. In contrast, the task-specific strategy scores were positively related to the comprehension measures, indicating good construct validity. Attempts to measure strategic processing at a global level seem to have limited validity and utility.
Mungovan, Sean F; Peralta, Paula J; Gass, Gregory C; Scanlan, Aaron T
2018-04-12
To examine the test-retest reliability and criterion validity of a high-intensity, netball-specific fitness test. Repeated measures, within-subject design. Eighteen female netball players competing in an international competition completed a trial of the Net-Test, which consists of 14 timed netball-specific movements. Players also completed a series of netball-relevant criterion fitness tests. Ten players completed an additional Net-Test trial one week later to assess test-retest reliability using intraclass correlation coefficient (ICC), typical error of measurement (TEM), and coefficient of variation (CV). The typical error of estimate expressed as CV and Pearson correlations were calculated between each criterion test and Net-Test performance to assess criterion validity. Five movements during the Net-Test displayed moderate ICC (0.84-0.90) and two movements displayed high ICC (0.91-0.93). Seven movements and heart rate taken during the Net-Test held low CV (<5%) with values ranging from 1.7 to 9.5% across measures. Total time (41.63±2.05s) during the Net-Test possessed low CV and significant (p<0.05) correlations with 10m sprint time (1.98±0.12s; CV=4.4%, r=0.72), 20m sprint time (3.38±0.19s; CV=3.9%, r=0.79), 505 Change-of-Direction time (2.47±0.08s; CV=2.0%, r=0.80); and maximum oxygen uptake (46.59±2.58 mL·kg⁻¹·min⁻¹; CV=4.5%, r=-0.66). The Net-Test possesses acceptable reliability for the assessment of netball fitness. Further, the high criterion validity for the Net-Test suggests a range of important netball-specific fitness elements are assessed in combination. Copyright © 2018 Sports Medicine Australia. Published by Elsevier Ltd. All rights reserved.
Validation of the organizational culture assessment instrument.
Heritage, Brody; Pollock, Clare; Roberts, Lynne
2014-01-01
Organizational culture is a commonly studied area in industrial/organizational psychology due to its important role in workplace behaviour, cognitions, and outcomes. Jung et al.'s [1] review of the psychometric properties of organizational culture measurement instruments noted many instruments have limited validation data despite frequent use in both theoretical and applied situations. The Organizational Culture Assessment Instrument (OCAI) has had conflicting data regarding its psychometric properties, particularly regarding its factor structure. Our study examined the factor structure and criterion validity of the OCAI using robust analysis methods on data gathered from 328 (females = 226, males = 102) Australian employees. Confirmatory factor analysis supported a four factor structure of the OCAI for both ideal and current organizational culture perspectives. Current organizational culture data demonstrated expected reciprocally-opposed relationships between three of the four OCAI factors and the outcome variable of job satisfaction but ideal culture data did not, thus indicating possible weak criterion validity when the OCAI is used to assess ideal culture. Based on the mixed evidence regarding the measure's properties, further examination of the factor structure and broad validity of the measure is encouraged.
Williams, Stacey L.; Polaha, Jodi
2014-01-01
The purpose of this paper was to examine the validity of score interpretations of an instrument developed to measure parents’ perceptions of stigma about seeking mental health services for their children. The validity of the score interpretations of the instrument was tested in two studies. Study 1 examined confirmatory factor analysis (CFA) employing a split half approach, and construct and criterion validity using the entire sample of parents in rural Appalachia whose children were experiencing psychosocial concerns (N=347), while Study 2 further examined CFA, construct and criterion validity, as well as predictive validity of the scores on the new scale using a general sample of parents in rural Appalachia (N=184). Results of exploratory and confirmatory factor analyses revealed support for a two factor model of parents’ perceived stigma, which represented both self and public forms of stigma associated with seeking mental health services for their children, and correlated with existing measures of stigma and other psychosocial variables. Further, the new self and public stigma scale significantly predicted parents’ willingness to seek services for children. PMID:24749752
Frederick, R I
2000-01-01
Mixed group validation (MGV) is offered as an alternative to criterion group validation (CGV) to estimate the true positive and false positive rates of tests and other diagnostic signs. CGV requires perfect confidence about each research participant's status with respect to the presence or absence of pathology. MGV determines diagnostic efficiencies based on group data; knowing an individual's status with respect to pathology is not required. MGV can use relatively weak indicators to validate better diagnostic signs, whereas CGV requires perfect diagnostic signs to avoid error in computing true positive and false positive rates. The process of MGV is explained, and a computer simulation demonstrates the soundness of the procedure. MGV of the Rey 15-Item Memory Test (Rey, 1958) for 723 pre-trial criminal defendants resulted in higher estimates of true positive rates and lower estimates of false positive rates as compared with prior research conducted with CGV. The author demonstrates how MGV addresses all the criticisms Rogers (1997b) outlined for differential prevalence designs in malingering detection research. Copyright 2000 John Wiley & Sons, Ltd.
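The abstract gives no equations, so the following is only a sketch of one common way to set up a mixed-group estimate, not necessarily the author's exact procedure. It assumes two groups whose base rates of the condition (`b1`, `b2`) are known or estimated, and a sign whose true positive rate (TPR) and false positive rate (FPR) are the same in both groups. The observed proportion of positive signs in each group is then p = TPR·b + FPR·(1 − b), which gives two equations in two unknowns. The function name `mixed_group_rates` and all numbers are hypothetical.

```python
def mixed_group_rates(p1, b1, p2, b2):
    """Estimate (tpr, fpr) of a sign from two mixed groups.

    p1, p2: observed proportion of positive signs in each group
    b1, b2: assumed base rate of the condition in each group
    Solves p_i = tpr * b_i + fpr * (1 - b_i) for i = 1, 2.
    """
    if b1 == b2:
        raise ValueError("the two groups must differ in base rate")
    difference = (p1 - p2) / (b1 - b2)  # equals tpr - fpr
    fpr = p1 - b1 * difference
    tpr = fpr + difference
    return tpr, fpr


# Hypothetical groups: 60% and 20% base rates, 52% and 24% positive signs
tpr, fpr = mixed_group_rates(p1=0.52, b1=0.60, p2=0.24, b2=0.20)
print(f"TPR={tpr:.2f}, FPR={fpr:.2f}")
```

No individual's true status is needed here, only group-level proportions, which matches the contrast with criterion group validation drawn in the abstract. In practice the estimates can fall outside the interval [0, 1] when the assumed base rates are inaccurate, so they should be checked before use.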
Validation of the Organizational Culture Assessment Instrument
Heritage, Brody; Pollock, Clare; Roberts, Lynne
2014-01-01
Organizational culture is a commonly studied area in industrial/organizational psychology due to its important role in workplace behaviour, cognitions, and outcomes. Jung et al.'s [1] review of the psychometric properties of organizational culture measurement instruments noted many instruments have limited validation data despite frequent use in both theoretical and applied situations. The Organizational Culture Assessment Instrument (OCAI) has had conflicting data regarding its psychometric properties, particularly regarding its factor structure. Our study examined the factor structure and criterion validity of the OCAI using robust analysis methods on data gathered from 328 (females = 226, males = 102) Australian employees. Confirmatory factor analysis supported a four factor structure of the OCAI for both ideal and current organizational culture perspectives. Current organizational culture data demonstrated expected reciprocally-opposed relationships between three of the four OCAI factors and the outcome variable of job satisfaction but ideal culture data did not, thus indicating possible weak criterion validity when the OCAI is used to assess ideal culture. Based on the mixed evidence regarding the measure's properties, further examination of the factor structure and broad validity of the measure is encouraged. PMID:24667839
INCLEN Diagnostic Tool for Autism Spectrum Disorder (INDT-ASD): development and validation.
Juneja, Monica; Mishra, Devendra; Russell, Paul S S; Gulati, Sheffali; Deshmukh, Vaishali; Tudu, Poma; Sagar, Rajesh; Silberberg, Donald; Bhutani, Vinod K; Pinto, Jennifer M; Durkin, Maureen; Pandey, Ravindra M; Nair, M K C; Arora, Narendra K
2014-05-01
To develop and validate INCLEN Diagnostic Tool for Autism Spectrum Disorder (INDT-ASD). Diagnostic test evaluation by cross sectional design. Four tertiary pediatric neurology centers in Delhi and Thiruvananthapuram, India. Children aged 2-9 years were enrolled in the study. INDT-ASD and Childhood Autism Rating Scale (CARS) were administered in a randomly decided sequence by a trained psychologist, followed by an expert evaluation by DSM-IV TR diagnostic criteria (gold standard). Psychometric parameters of diagnostic accuracy, validity (construct, criterion and convergent) and internal consistency. 154 children (110 boys, mean age 64.2 mo) were enrolled. The overall diagnostic accuracy (AUC=0.97, 95% CI 0.93, 0.99; P<0.001) and validity (sensitivity 98%, specificity 95%, positive predictive value 91%, negative predictive value 99%) of INDT-ASD for Autism spectrum disorder were high, taking expert diagnosis using DSM-IV-TR as gold standard. The concordance rate between the INDT-ASD and expert diagnosis for 'ASD group' was 82.52% [Cohen's k=0.89; 95% CI (0.82, 0.97); P=0.001]. The internal consistency of INDT-ASD was 0.96. The convergent validity with CARS (r = 0.73, P= 0.001) and divergent validity with Binet-Kamat Test of intelligence (r = -0.37; P=0.004) were significantly high. INDT-ASD has a 4-factor structure explaining 85.3% of the variance. INDT-ASD has high diagnostic accuracy, adequate content validity, good internal consistency, high criterion validity, and high to moderate convergent validity and 4-factor construct validity for diagnosis of Autism spectrum disorder.
Romero-García, Marta; de la Cueva-Ariza, Laura; Benito-Aracil, Llucia; Lluch-Canut, Teresa; Trujols-Albet, Joan; Martínez-Momblan, Maria Antonia; Juvé-Udina, Maria-Eulàlia; Delgado-Hito, Pilar
2018-06-01
The aim of this study was to develop and validate the Nursing Intensive-Care Satisfaction Scale to measure satisfaction with nursing care from the critical care patient's perspective. Instruments that measure satisfaction with nursing care have been designed and validated without taking the patient's perspective into consideration. Despite the benefits and advances in measuring satisfaction with nursing care, no instrument is specifically designed to assess satisfaction in intensive care units. Instrument development. The population were all discharged patients (January 2013 - January 2015) from three Intensive Care Units of a third level hospital (N = 200). All assessment instruments were given to discharged patients and, 48 hours later, only the questionnaire was given again to analyse temporal stability. The validation process of the scale included the analysis of internal consistency, temporal stability; validity of construct through a confirmatory factor analysis; and criterion validity. Reliability was 0.95. The intraclass correlation coefficient for the total scale was 0.83, indicating good temporal stability. Construct validity showed an acceptable fit and factorial structure with four factors, in accordance with the theoretical model, with the Consequences factor being the best correlated with the other factors. Criterion validity presented correlations ranging from low to high (range: 0.42-0.68). The scale has been designed and validated incorporating the perspective of critical care patients. Thanks to its reliability and validity, this questionnaire can be used both in research and in clinical practice. The scale offers a possibility to assess and develop interventions to improve patient satisfaction with nursing care. © 2018 John Wiley & Sons Ltd.
Larrabee, Glenn J
2014-01-01
Bilder, Sugar, and Hellemann (2014 this issue) contend that empirical support is lacking for use of multiple performance validity tests (PVTs) in evaluation of the individual case, differing from the conclusions of Davis and Millis (2014), and Larrabee (2014), who found no substantial increase in false positive rates using a criterion of failure of ≥ 2 PVTs and/or Symptom Validity Tests (SVTs) out of multiple tests administered. Reconsideration of data presented in Larrabee (2014) supports a criterion of ≥ 2 out of up to 7 PVTs/SVTs, as keeping false positive rates close to and in most cases below 10% in cases with bona fide neurologic, psychiatric, and developmental disorders. Strategies to minimize risk of false positive error are discussed, including (1) adjusting individual PVT cutoffs or criterion for number of PVTs failed, for examinees who have clinical histories placing them at risk for false positive identification (e.g., severe TBI, schizophrenia), (2) using the history of the individual case to rule out conditions known to result in false positive errors, (3) using normal performance in domains mimicked by PVTs to show that sufficient native ability exists for valid performance on the PVT(s) that have been failed, and (4) recognizing that as the number of PVTs/SVTs failed increases, the likelihood of valid clinical presentation decreases, with a corresponding increase in the likelihood of invalid test performance and symptom report.
ERIC Educational Resources Information Center
LaBelle, Sara; Johnson, Zac D.
2018-01-01
Three studies were conducted to generate a valid and reliable instrument to measure student-to-student confirmation. Study One (N = 396) sought to establish a factor structure based on previous research. Study Two (N = 396) sought to confirm this factor structure and assess criterion-related validity. Study Three (N = 283) sought to assess…
ERIC Educational Resources Information Center
Thomas, Michael L.; Lanyon, Richard I.; Millsap, Roger E.
2009-01-01
The use of criterion group validation is hindered by the difficulty of classifying individuals on latent constructs. Latent class analysis (LCA) is a method that can be used for determining the validity of scales meant to assess latent constructs without such a priori classifications. The authors used this method to examine the ability of the L…
A Controlled Evaluation of the Distress Criterion for Binge Eating Disorder
ERIC Educational Resources Information Center
Grilo, Carlos M.; White, Marney A.
2011-01-01
Objective: Research has examined various aspects of the validity of the research criteria for binge eating disorder (BED) but has yet to evaluate the utility of Criterion C, "marked distress about binge eating." This study examined the significance of the marked distress criterion for BED using 2 complementary comparison groups. Method:…
Reliability and Validity of the Professional Counseling Performance Evaluation
ERIC Educational Resources Information Center
Shepherd, J. Brad; Britton, Paula J.; Kress, Victoria E.
2008-01-01
The definition and measurement of counsellor trainee competency is an issue that has received increased attention yet lacks quantitative study. This research evaluates item responses, scale reliability and intercorrelations, interrater agreement, and criterion-related validity of the Professional Performance Fitness Evaluation/Professional…
Accuracy of clinical observations of push-off during gait after stroke.
McGinley, Jennifer L; Morris, Meg E; Greenwood, Ken M; Goldie, Patricia A; Olney, Sandra J
2006-06-01
To determine the accuracy (criterion-related validity) of real-time clinical observations of push-off in gait after stroke. Criterion-related validity study of gait observations. Rehabilitation hospital in Australia. Eleven participants with stroke and 8 treating physical therapists. Not applicable. Pearson product-moment correlation between physical therapists' observations of push-off during gait and criterion measures of peak ankle power generation from a 3-dimensional motion analysis system. A high correlation was obtained between the observational ratings and the measurements of peak ankle power generation (Pearson r =.98). The standard error of estimation of ankle power generation was .32W/kg. Physical therapists can make accurate real-time clinical observations of push-off during gait following stroke.
Validity and extension of the SCS-CN method for computing infiltration and rainfall-excess rates
NASA Astrophysics Data System (ADS)
Mishra, Surendra Kumar; Singh, Vijay P.
2004-12-01
A criterion is developed for determining the validity of the Soil Conservation Service curve number (SCS-CN) method. According to this criterion, the existing SCS-CN method is found to be applicable when the potential maximum retention, S, is less than or equal to twice the total rainfall amount. The criterion is tested using published data of two watersheds. Separating the steady infiltration from capillary infiltration, the method is extended for predicting infiltration and rainfall-excess rates. The extended SCS-CN method is tested using 55 sets of laboratory infiltration data on soils varying from Plainfield sand to Yolo light clay, and the computed and observed infiltration and rainfall-excess rates are found to be in good agreement.
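As an illustration of the validity criterion described in this abstract, the sketch below combines it with the conventional SCS-CN runoff equation, Q = (P − Ia)² / (P − Ia + S) with Ia = λS and S = 25400/CN − 254 (in mm). The conventional form and the default λ = 0.2 are assumptions taken from standard practice, not from the abstract, and the extended method for infiltration rates is not reproduced here. The function name `scs_cn_runoff` and the example values are hypothetical.

```python
def scs_cn_runoff(rainfall_mm, curve_number, lam=0.2):
    """Return (runoff_mm, retention_mm, criterion_met) for one storm.

    retention_mm is the potential maximum retention S from the curve number.
    criterion_met applies the condition stated in the abstract: the method
    is considered applicable when S <= 2 * total rainfall.
    """
    if not 0 < curve_number <= 100:
        raise ValueError("curve number must be in (0, 100]")
    retention = 25400.0 / curve_number - 254.0  # S, in mm
    initial_abstraction = lam * retention       # Ia = lambda * S
    excess = rainfall_mm - initial_abstraction
    # no direct runoff until rainfall exceeds the initial abstraction
    runoff = 0.0 if excess <= 0 else excess ** 2 / (excess + retention)
    return runoff, retention, retention <= 2.0 * rainfall_mm


# Hypothetical storm: 50 mm of rain on a watershed with CN = 80
q, s, ok = scs_cn_runoff(50.0, 80)
print(f"Q={q:.2f} mm, S={s:.1f} mm, criterion met: {ok}")
```

For small storms on low-CN watersheds, S can exceed 2P, in which case the criterion flags the estimate as outside the range where the abstract reports the method to be applicable.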
Criterion-Referenced Testing for College-Level General Education: Some Problems and Recommendations.
ERIC Educational Resources Information Center
Benoist, Howard
1979-01-01
The adoption of a criterion-referenced assessment system and the resulting disadvantages of this form of evaluation for the college general education program are discussed, including problems in identifying assessment validation procedures. (RAO)
Ruch, Willibald; Heintz, Sonja
2017-01-01
How strongly does humor (i.e., the construct-relevant content) in the Humor Styles Questionnaire (HSQ; Martin et al., 2003) determine the responses to this measure (i.e., construct validity)? Also, how much does humor influence the relationships of the four HSQ scales, namely affiliative, self-enhancing, aggressive, and self-defeating, with personality traits and subjective well-being (i.e., criterion validity)? The present paper answers these two questions by experimentally manipulating the 32 items of the HSQ to only (or mostly) contain humor (i.e., construct-relevant content) or to substitute the humor content with non-humorous alternatives (i.e., only assessing construct-irrelevant context). Study 1 (N = 187) showed that the HSQ affiliative scale was mainly determined by humor, self-enhancing and aggressive were determined by both humor and non-humorous context, and self-defeating was primarily determined by the context. This suggests that humor is not the primary source of the variance in three of the HSQ scales, thereby limiting their construct validity. Study 2 (N = 261) showed that the relationships of the HSQ scales to the Big Five personality traits and subjective well-being (positive affect, negative affect, and life satisfaction) were consistently reduced (personality) or vanished (subjective well-being) when the non-humorous contexts in the HSQ items were controlled for. For the HSQ self-defeating scale, the pattern of relationships to personality was also altered, supporting a positive rather than a negative view of the humor in this humor style. The present findings thus call for a reevaluation of the role that humor plays in the HSQ (construct validity) and in the relationships to personality and well-being (criterion validity). PMID:28473794
Validity of field expedient devices to assess core temperature during exercise in the cold.
Bagley, James R; Judelson, Daniel A; Spiering, Barry A; Beam, William C; Bartolini, J Albert; Washburn, Brian V; Carney, Keven R; Muñoz, Colleen X; Yeargin, Susan W; Casa, Douglas J
2011-12-01
Exposure to cold environments affects human performance and physiological function. Major medical organizations recommend rectal temperature (T(REC)) to evaluate core body temperature (T(CORE)) during exercise in the cold; however, other field expedient devices claim to measure T(CORE). The purpose of this study was to determine if field expedient devices provide valid measures of T(CORE) during rest and exercise in the cold. Participants included 13 men and 12 women (age = 24 +/- 3 yr, height = 170.7 +/- 10.6 cm, mass = 73.4 +/- 16.7 kg, body fat = 18 +/- 7%) who reported being healthy and at least recreationally active. During 150 min of cold exposure, subjects sequentially rested for 30 min, cycled for 90 min (heart rate = 120-140 bpm), and rested for an additional 30 min. Investigators compared aural (T(AUR)), expensive axillary (T(AXLe)), inexpensive axillary (T(AXLi)), forehead (T(FOR)), gastrointestinal (T(GI)), expensive oral (T(ORLe)), inexpensive oral (T(ORLi)), and temporal (T(TEM)) temperatures to T(REC) every 15 min. Researchers used mean difference between each device and T(REC) (i.e., mean bias) as the primary criterion for validity. T(AUR), T(AXLe), T(AXLi), T(FOR), T(ORLe), T(ORLi), and T(TEM) provided significantly lower measures compared to T(REC) and fell below our validity criterion. T(GI) significantly exceeded T(REC) at three of eleven time points, but no significant difference existed between mean T(REC) and T(GI) across time. Only T(GI) achieved our validity criterion and compared favorably to T(REC). T(GI) offers a valid measurement with which to assess T(CORE) during rest and exercise in the cold; athletic trainers, mountain rescuers, and military medical personnel should avoid other field expedient devices in similar conditions.
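The mean-bias criterion named in this abstract (mean difference between a device and the rectal reference) might be computed as in the sketch below. The readings and the acceptance threshold `max_abs_bias` are hypothetical; the abstract does not report the threshold the authors used, and none of the numbers come from the study.

```python
from statistics import mean


def mean_bias(device, criterion):
    """Mean of (device - criterion) over paired readings taken at the same times."""
    if len(device) != len(criterion):
        raise ValueError("device and criterion readings must be paired")
    return mean(d - c for d, c in zip(device, criterion))


def meets_criterion(device, criterion, max_abs_bias):
    """True when the absolute mean bias is within a caller-chosen threshold.

    max_abs_bias is a hypothetical parameter; the abstract gives no value.
    """
    return abs(mean_bias(device, criterion)) <= max_abs_bias


# Illustrative readings in degrees Celsius (made up for this sketch).
rectal = [37.0, 37.2, 37.4]
oral = [36.5, 36.7, 36.9]
print(mean_bias(oral, rectal))
print(meets_criterion(oral, rectal, max_abs_bias=0.27))
```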
[Elaboration and validation of a tool to measure psychological well-being: WBMMS].
Massé, R; Poulin, C; Dassa, C; Lambert, J; Bélair, S; Battaglini, M A
1998-01-01
Psychological well-being scales used in epidemiologic surveys usually show high construct validity. The content validation, however, is less convincing since these scales rest on lists of items that reflect the theoretical model of the authors. In this study we present results of the construct and criterion validation of a new Well-Being Manifestations Measure Scale (WBMMS) founded on an initial list of manifestations derived from an original content validation in a general population. It is concluded that national and public health epidemiologic surveys should include both measures of positive and negative mental health.
Continual Response Measurement: Design and Validation.
ERIC Educational Resources Information Center
Baggaley, Jon
1987-01-01
Discusses reliability and validity of continual response measurement (CRM), a computer-based measurement technique, and its use in social science research. Highlights include the importance of criterion-referencing the data, guidelines for designing studies using CRM, examples typifying their deductive and inductive functions, and a discussion of…
Heritage, Brody; Gilbert, Jessica M.; Roberts, Lynne D.
2016-01-01
Job embeddedness is a construct that describes the manner in which employees can be enmeshed in their jobs, reducing their turnover intentions. Recent questions regarding the properties of quantitative job embeddedness measures, and their predictive utility, have been raised. Our study compared two competing reflective measures of job embeddedness, examining their convergent, criterion, and incremental validity, as a means of addressing these questions. Cross-sectional quantitative data from 246 Australian university employees (146 academic; 100 professional) was gathered. Our findings indicated that the two compared measures of job embeddedness were convergent when total scale scores were examined. Additionally, job embeddedness was capable of demonstrating criterion and incremental validity, predicting unique variance in turnover intention. However, this finding was not readily apparent with one of the compared job embeddedness measures, which demonstrated comparatively weaker evidence of validity. We discuss the theoretical and applied implications of these findings, noting that job embeddedness has a complementary place among established determinants of turnover intention. PMID:27199817
Criterion and incremental validity of the emotion regulation questionnaire
Ioannidis, Christos A.; Siegling, A. B.
2015-01-01
Although research on emotion regulation (ER) is developing, little attention has been paid to the predictive power of ER strategies beyond established constructs. The present study examined the incremental validity of the Emotion Regulation Questionnaire (ERQ; Gross and John, 2003), which measures cognitive reappraisal and expressive suppression, over and above the Big Five personality factors. It also extended the evidence for the measure's criterion validity to yet unexamined criteria. A university student sample (N = 203) completed the ERQ, a measure of the Big Five, and relevant cognitive and emotion-laden criteria. Cognitive reappraisal predicted positive affect beyond personality, as well as experiential flexibility and constructive self-assertion beyond personality and affect. Expressive suppression explained incremental variance in negative affect beyond personality and in experiential flexibility beyond personality and general affect. No incremental effects were found for worry, social anxiety, rumination, reflection, and preventing negative emotions. Implications for the construct validity and utility of the ERQ are discussed. PMID:25814967
Wagenlehner, Florian Martin Erich; Fröhlich, Oliver; Bschleipfer, Thomas; Weidner, Wolfgang; Perletti, Gianpaolo
2014-06-01
Anatomical damage to pelvic floor structures may cause multiple symptoms. The Integral Theory System Questionnaire (ITSQ) is a holistic questionnaire that uses symptoms to help locate damage in specific connective tissue structures as a guide to reconstructive surgery. It is based on the integral theory, which states that pelvic floor symptoms and prolapse are both caused by lax suspensory ligaments. The aim of the present study was to psychometrically validate the ITSQ. Established psychometric properties including validity, reliability, and responsiveness were considered for evaluation. Criterion validity was assessed in a cohort of 110 women with pelvic floor dysfunctions by analyzing the correlation of questionnaire responses with objective clinical data. Test-retest was performed with questionnaires from 47 patients. Cronbach's alpha and "split-half" reliability coefficients were calculated for internal consistency analysis. Psychometric properties of the ITSQ were comparable to those of previously validated Pelvic Floor Questionnaires. Face validity and content validity were approved by an expert group of the International Collaboration of Pelvic Floor surgeons. Convergent validity assessed using a Bayesian method was at least as accurate as the expert assessment of anatomical defects. Objective data measurement in patients demonstrated significant correlations with ITSQ domains, fulfilling criterion validity. Internal consistency values ranged from 0.85 to 0.89 in different scenarios. The ITSQ proved accurate and is able to serve as a holistic Pelvic Floor Questionnaire directing symptoms to site-specific pelvic floor reconstructive surgery.
The French-Canadian validation of a disease-specific, patient-reported outcome measure for lupus.
Bourré-Tessier, J; Clarke, A E; Kosinski, M; Mikolaitis-Preuss, R A; Bernatsky, S; Block, J A; Jolly, M
2014-12-01
The objective of this paper is to perform the cross-cultural validation of the French version of the LupusPRO, a disease-targeted patient-reported outcome measure, among systemic lupus erythematosus (SLE) patients in Canada. The French version of the LupusPRO and the MOS SF-36 were administered; demographic, clinical and serological characteristics were obtained. Disease activity (SELENA-SLEDAI and the Lupus Foundation of America definition of flare) and damage (SLICC/ACR SDI) were assessed. Physician disease activity and damage assessments were ascertained using visual analog scales. Internal consistency reliability (ICR), test-retest reliability (TRT), convergent and discriminant validity (against corresponding domains of the SF-36), criterion validity (against disease activity, damage or health status) and known group validity were tested. A total of 99 French-Canadian SLE patients participated (97% women, mean (SD) age 45.2 (14.5) years). The median (IQR) SELENA-SLEDAI and SDI were 3.5 (6.0) and 1.0 (2.0), respectively. The ICR of the LupusPRO domains ranged from 0.81 to 0.93 (except for lupus symptoms, procreation and coping), while TRT ranged from 0.72 to 0.95. Convergent and discriminant validity, criterion validity and known group validity against disease activity, damage and health status measures were observed. Confirmatory factor analysis showed a good fit. The LupusPRO has fair psychometric properties among French-Canadian patients with SLE. © The Author(s) 2014 Reprints and permissions: sagepub.co.uk/journalsPermissions.nav.
Wang, Yi-Wen; Tsai, Yun-Fang; Lee, Shwu-Hua; Chen, Ying-Jen; Chen, Hsiu-Fang
2016-07-01
To develop and psychometrically test the Protective Reasons against Suicide Inventory among older Chinese-speaking outpatients. Tools currently exist to test reasons for living among individuals of all ages in western countries, but few are available to assess older adults' protective reasons against suicide in Asia. A cross-sectional survey to investigate protective reasons against suicide among older Chinese-speaking outpatients. The Protective Reasons against Suicide Inventory was developed based on individual interviews with 83 older outpatients in Taiwan, the literature and the authors' clinical experiences. The resulting Inventory was examined in 2013 for content validity, face validity, construct validity, criterion-related validity, internal consistency reliability and test-retest reliability. The Inventory had excellent content validity and face validity. Factor analysis yielded a seven-factor solution, accounting for 87.7% of the variance. Scores on the global Inventory and its subscales tended to be higher in outpatients diagnosed without suicidal ideation than in outpatients diagnosed with suicidal ideation, indicating good criterion validity. Inventory reliability and the intraclass correlation coefficient were satisfactory. The Protective Reasons against Suicide Inventory can be completed in 5 minutes and is perceived as easy to complete. Moreover, the Inventory yielded highly acceptable parameters for validity and reliability. The Protective Reasons against Suicide Inventory can be used to assess older Chinese-speaking outpatients for factors that protect them from attempting suicide. © 2016 John Wiley & Sons Ltd.
Sepehry, Amir A; Lee, Philip E; Hsiung, Ging-Yuek R; Beattie, B Lynn; Feldman, Howard H; Jacova, Claudia
2017-01-01
Presented herein is evidence for criterion, content, and convergent/discriminant validity of the NIMH-Provisional Diagnostic Criteria for depression of Alzheimer's Disease (PDC-dAD) that were formulated to address depression in Alzheimer's disease (AD). Using meta-analytic and systematic review methods, we examined criterion validity evidence in epidemiological and clinical studies comparing the PDC-dAD to Diagnostic and Statistical Manual of Mental Disorders fourth edition (DSM-IV), and International Classification of Disease (ICD 9) depression diagnostic criteria. We estimated prevalence of depression by PDC, DSM, and ICD with an omnibus event rate effect-size. We also examined diagnostic agreement between PDC and DSM. To gauge content validity, we reviewed rates of symptom endorsement for each diagnostic approach. Finally, we examined the PDC's relationship with assessment scales (global cognition, neuropsychiatric, and depression definition) for convergent validity evidence. The aggregate evidence supports the validity of the PDC-dAD. Our findings suggest that depression in AD differs from other depressive disorders including Major Depressive Disorder (MDD) in that dAD is more prevalent, with generally a milder presentation and with unique features not captured by the DSM. Although the PDC are the current standard for diagnosis of depression in AD, we identified the need for their further optimization based on predictive validity evidence.
Brazilian validation of the Alberta Infant Motor Scale.
Valentini, Nadia Cristina; Saccani, Raquel
2012-03-01
The Alberta Infant Motor Scale (AIMS) is a well-known motor assessment tool used to identify potential delays in infants' motor development. Although Brazilian researchers and practitioners have used the AIMS in laboratories and clinical settings, its translation to Portuguese and validation for the Brazilian population is yet to be investigated. This study aimed to translate and validate all AIMS items with respect to internal consistency and content, criterion, and construct validity. A cross-sectional and longitudinal design was used. A cross-cultural translation was used to generate a Brazilian-Portuguese version of the AIMS. In addition, a validation process was conducted involving 22 professionals and 766 Brazilian infants (aged 0-18 months). The results demonstrated language clarity and internal consistency for the motor criteria (motor development score, α=.90; prone, α=.85; supine, α=.92; sitting, α=.84; and standing, α=.86). The analysis also revealed high discriminative power to identify typical and atypical development (motor development score, P<.001; percentile, P=.04; classification criterion, χ(2)=6.03; P=.05). Temporal stability (P=.07) (rho=.85, P<.001) was observed, and predictive power (P<.001) was limited to the group of infants aged from 3 months to 9 months. Limited predictive validity was observed, which may have been due to the restricted time that the groups were followed longitudinally. In sum, the translated version of AIMS presented adequate validity and reliability.
Saffari, Mohsen; Naderi, Maryam K; Piper, Crystal N; Koenig, Harold G
There is no valid and well-established tool to measure fatigue in people with chronic hepatitis B. The aim of this study was to translate the Multidimensional Fatigue Inventory (MFI) into Persian and examine its reliability and validity in Iranian people with chronic hepatitis B. The demographic questionnaire and MFI, as well as Chronic Liver Disease Questionnaire and EuroQol-5D (to assess criterion validity), were administered in face-to-face interviews with 297 participants. A forward-backward translation method was used to develop a culturally adapted Persian version of the questionnaire. Cronbach's α was used to assess the internal reliability of the scale. Pearson correlation was used to assess criterion validity, and known-group method was used along with factor analysis to establish construct validity. Cronbach's α for the total scale was 0.89. Convergent and discriminant validities were also established. Correlations between the MFI and the health-related quality of life scales were significant (p < .01). The scale differentiated between subgroups of persons with the hepatitis B infection in terms of age, gender, employment, education, disease duration, and stage of disease. Factor analysis indicated a four-factor solution for the scale that explained 60% of the variance. The MFI is a valid and reliable instrument to identify fatigue in Iranians with hepatitis B.
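Several records in this listing report Cronbach's α as the internal-consistency statistic. As a minimal sketch of the usual formula, α = k/(k-1) × (1 - Σ item variances / variance of total scores), the function below takes one list of scores per item; the function name and the example scores are hypothetical and are not data from any study cited here.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for k items.

    item_scores: list of k lists, one per item, each holding the scores of
    the same n respondents in the same order.
    """
    k = len(item_scores)
    if k < 2:
        raise ValueError("at least two items are required")
    n = len(item_scores[0])
    if any(len(item) != n for item in item_scores):
        raise ValueError("every item needs one score per respondent")
    # Total scale score for each respondent.
    totals = [sum(scores) for scores in zip(*item_scores)]
    item_var_sum = sum(variance(item) for item in item_scores)
    return k / (k - 1) * (1 - item_var_sum / variance(totals))


# Two items answered by four respondents (illustrative values only).
print(cronbach_alpha([[1, 2, 3, 4], [2, 1, 4, 3]]))
```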
A criterion for maximum resin flow in composite materials curing process
NASA Astrophysics Data System (ADS)
Lee, Woo I.; Um, Moon-Kwang
1993-06-01
On the basis of Springer's resin flow model, a criterion for maximum resin flow in autoclave curing is proposed. Validity of the criterion was proved for two resin systems (Fiberite 976 and Hercules 3501-6 epoxy resin). The parameter required for the criterion can be easily estimated from the measured resin viscosity data. The proposed criterion can be used in establishing the proper cure cycle to ensure maximum resin flow and, thus, the maximum compaction.
Bania, Theofani
2014-09-01
We determined the criterion validity and the retest reliability of the ActivPAL™ monitor in young people with diplegic cerebral palsy (CP). Activity monitor data were compared with the criterion of video recording for 10 participants. For the retest reliability, activity monitor data were collected from 24 participants on two occasions. Participants had to have diplegic CP and be between 14 and 22 years of age. They also had to be of Gross Motor Function Classification System level II or III. Outcomes were time spent in standing, number of steps (physical activity) and time spent in sitting (sedentary behaviour). For criterion validity, coefficients of determination were all high (r(2) ≥ 0.96), and limits of group agreement were relatively narrow, but limits of agreement for individuals were narrow only for number of steps (≥5.5%). Relative reliability was high for number of steps (intraclass correlation coefficient = 0.87) and moderate for time spent in sitting and lying, and time spent in standing (intraclass correlation coefficients = 0.60-0.66). For groups, changes of up to 7% could be due to measurement error with 95% confidence, but for individuals, changes as high as 68% could be due to measurement error. The results support the criterion validity and the retest reliability of the ActivPAL™ to measure physical activity and sedentary behaviour in groups of young people with diplegic CP but not in individuals. Copyright © 2014 John Wiley & Sons, Ltd.
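Limits of agreement for individuals, as mentioned in this abstract, are commonly obtained with the Bland-Altman approach: the mean of the paired differences plus or minus 1.96 standard deviations. The sketch below assumes that generic form; the abstract does not say exactly how its percentage limits were derived, and the step counts are invented for illustration.

```python
from statistics import mean, stdev


def limits_of_agreement(method_a, method_b, z=1.96):
    """Return (lower limit, bias, upper limit) for paired measurements.

    z = 1.96 gives the conventional 95% limits (an assumption of this sketch).
    """
    if len(method_a) != len(method_b):
        raise ValueError("measurements must be paired")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    spread = z * stdev(diffs)
    return bias - spread, bias, bias + spread


# Hypothetical step counts: monitor versus video criterion.
monitor = [10, 12, 14, 16]
video = [9, 12, 15, 16]
print(limits_of_agreement(monitor, video))
```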
Ten Issues in Criterion-Referenced Testing: A Response to Commonly Heard Criticisms.
ERIC Educational Resources Information Center
Curlette, William L.; Stallings, William M.
1979-01-01
The 10 criticisms of criterion-referenced tests addressed in this paper are: the domains tested; pedagogical influence; difficulty of items; cumbersome reports; reliability; arbitrary criteria; local objectives; labeling; predictive validity; and repeated testing. (SJL)
Procedures for Constructing and Using Criterion-Referenced Performance Tests.
ERIC Educational Resources Information Center
Campbell, Clifton P.; Allender, Bill R.
1988-01-01
Criterion-referenced performance tests (CRPT) provide a realistic method for objectively measuring task proficiency against predetermined attainment standards. This article explains the procedures of constructing, validating, and scoring CRPTs and includes a checklist for a welding test. (JOW)
De Smedt, Delphine; Clays, Els; Doyle, Frank; Kotseva, Kornelia; Prugger, Christof; Pająk, Andrzej; Jennings, Catriona; Wood, David; De Bacquer, Dirk
2013-09-01
To investigate the validity and reliability of the EuroQol-5D (EQ-5D), the 12-item Short-Form Health Survey (SF-12v2), and the Hospital Anxiety and Depression Scale (HADS) in a stable coronary population. Cross-sectional study EUROASPIRE III. Quality of life data (QoL) were available on 8745 patients hospitalized for coronary artery bypass graft (CABG), percutaneous coronary intervention (PCI), acute myocardial infarction (AMI), or myocardial ischemia. They were interviewed and examined at least 6 months after their hospital admission. Reliability and validity of the 3 instruments were tested. Internal consistency, and discriminative, convergent, criterion and construct validity were assessed. Cronbach's alpha indicated good internal consistency for all measures (0.73 to 0.87). Discriminative validity analyses confirmed significant QoL differences between known groups: age, gender, educational level. In addition, all hypothesized correlations between QoL constructs (convergent validity) and items (criterion validity) were confirmed with significant correlations. Confirmatory factor analyses indicated good construct validity for HADS and SF-12v2. On country-specific level, results were roughly similar. The EQ-5D as well as the SF-12v2 and the HADS are reliable and valid instruments for use in a stable coronary population, both on aggregate European level and on country-specific level. However, our results must be generalized with caution, because EUROASPIRE III patients might not be representative for all patients with stable coronary heart disease. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.
Stone, Lisanne L; Janssens, Jan M A M; Vermulst, Ad A; Van Der Maten, Marloes; Engels, Rutger C M E; Otten, Roy
2015-01-01
The Strengths and Difficulties Questionnaire is one of the most employed screening instruments. Although there is a large research body investigating its psychometric properties, reliability and validity are not yet fully tested using modern techniques. Therefore, we investigate reliability, construct validity, measurement invariance, and predictive validity of the parent and teacher version in children aged 4-7. Besides, we intend to replicate previous studies by investigating test-retest reliability and criterion validity. In a Dutch community sample 2,238 teachers and 1,513 parents filled out questionnaires regarding problem behaviors and parenting, while 1,831 children reported on sociometric measures at T1. These children were followed-up during three consecutive years. Reliability was examined using Cronbach's alpha and McDonald's omega, construct validity was examined by Confirmatory Factor Analysis, and predictive validity was examined by calculating developmental profiles and linking these to measures of inadequate parenting, parenting stress and social preference. Further, mean scores and percentiles were examined in order to establish norms. Omega was consistently higher than alpha regarding reliability. The original five-factor structure was replicated, and measurement invariance was established on a configural level. Further, higher SDQ scores were associated with future indices of higher inadequate parenting, higher parenting stress and lower social preference. Finally, previous results on test-retest reliability and criterion validity were replicated. This study is the first to show SDQ scores are predictively valid, attesting to the feasibility of the SDQ as a screening instrument. Future research into predictive validity of the SDQ is warranted.
Chabrera, Carolina; Areal, Joan; Font, Albert; Caro, Mónica; Bonet, Marta; Zabalegui, Adelaida
2015-01-01
The aim of this study is to develop a Spanish version of the Satisfaction With Decision scale (SWDs) and analyse the psychometric properties of validity and reliability. An observational, descriptive study and validation of a tool to measure satisfaction with the decision. Urology, Radiation oncology, and Medical oncology Departments of the Hospital Universitari Germans Trias i Pujol, Institut Català d'Oncologia and the Institut Oncològic del Vallès - Hospital General de Catalunya. A total of 170 participants diagnosed with prostate cancer who could read and write in Spanish and gave their informed consent. A translation, back-translation and cross-cultural adaptation to Spanish was performed on the SWDs. The content validity, criterion validity, construct validity and reliability (internal consistency and stability) of the Spanish version were evaluated. The SWDs contains 6 items with 5-item Likert scales. A Spanish version (ESD) was obtained that was linguistically and conceptually equivalent to the original version. For criterion validity, the correlation between the ESD and "satisfaction with the decision" measured on a linear analogue scale was significant (r=0.63, P<.01) for all items. The factorial analysis showed a unique dimension explaining 82.08% of the variance. The ESD showed excellent results in terms of internal consistency (Cronbach alpha=0.95) and good test-retest reliability, with an intraclass correlation coefficient of 0.711. The ESD is a validated Spanish scale to measure satisfaction with decisions taken in health, and demonstrates adequate validity and reliability. Copyright © 2015 Elsevier España, S.L.U. All rights reserved.
Anxiety measures validated in perinatal populations: a systematic review.
Meades, Rose; Ayers, Susan
2011-09-01
Research and screening of anxiety in the perinatal period is hampered by a lack of psychometric data on self-report anxiety measures used in perinatal populations. This paper aimed to review self-report measures that have been validated with perinatal women. A systematic search was carried out of four electronic databases. Additional papers were obtained through searching identified articles. Thirty studies were identified that reported validation of an anxiety measure with perinatal women. Most commonly validated self-report measures were the General Health Questionnaire (GHQ), State-Trait Anxiety Inventory (STAI), and Hospital Anxiety and Depression Scales (HADS). Of the 30 studies included, 11 used a clinical interview to provide criterion validity. Remaining studies reported one or more other forms of validity (factorial, discriminant, concurrent and predictive) or reliability. The STAI shows criterion, discriminant and predictive validity and may be most useful for research purposes as a specific measure of anxiety. The Kessler 10 (K-10) may be the best short screening measure due to its ability to differentiate anxiety disorders. The Depression Anxiety Stress Scales 21 (DASS-21) measures multiple types of distress, shows appropriate content, and remains to be validated against clinical interview in perinatal populations. Nineteen studies did not report sensitivity or specificity data. The early stages of research into perinatal anxiety, the multitude of measures in use, and methodological differences restrict comparison of measures across studies. There is a need for further validation of self-report measures of anxiety in the perinatal period to enable accurate screening and detection of anxiety symptoms and disorders. Copyright © 2010 Elsevier B.V. All rights reserved.
Alyusuf, Raja H.; Prasad, Kameshwar; Abdel Satir, Ali M.; Abalkhail, Ali A.; Arora, Roopa K.
2013-01-01
Background: The exponential use of the internet as a learning resource, coupled with the varied quality of many websites, has led to a need to identify suitable websites for teaching purposes. Aim: The aim of this study is to develop and to validate a tool which evaluates the quality of undergraduate medical educational websites, and to apply it to the field of pathology. Methods: A tool was devised through several steps of item generation, reduction, weightage, pilot testing, post-pilot modification of the tool and validation of the tool. Tool validation included measurement of inter-observer reliability and generation of criterion related, construct related and content related validity. The validated tool was subsequently tested by applying it to a population of pathology websites. Results and Discussion: Reliability testing showed a high internal consistency reliability (Cronbach's alpha = 0.92), high inter-observer reliability (Pearson's correlation r = 0.88), intraclass correlation coefficient = 0.85 and κ = 0.75. It showed high criterion related, construct related and content related validity. The tool showed moderately high concordance with the gold standard (κ = 0.61); 92.2% sensitivity, 67.8% specificity, 75.6% positive predictive value and 88.9% negative predictive value. The validated tool was applied to 278 websites; 29.9% were rated as recommended, 41.0% as recommended with caution and 29.1% as not recommended. Conclusion: A systematic tool was devised to evaluate the quality of websites for medical educational purposes. The tool was shown to yield reliable and valid inferences through its application to pathology websites. PMID:24392243
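The sensitivity, specificity and predictive values quoted in this abstract are all ratios taken from a 2x2 table of tool ratings against the gold standard. A minimal sketch of those definitions follows; the abstract does not report the underlying cell counts, so the counts in the example are purely illustrative and the function name is hypothetical.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard 2x2 accuracy measures against a gold standard.

    tp/fp/fn/tn are counts of true positives, false positives,
    false negatives and true negatives.
    """
    return {
        "sensitivity": tp / (tp + fn),  # positives correctly identified
        "specificity": tn / (tn + fp),  # negatives correctly identified
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }


# Illustrative counts only (not taken from the study).
for name, value in diagnostic_metrics(tp=90, fp=20, fn=10, tn=80).items():
    print(f"{name}: {value:.3f}")
```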
Davies, Kylie; Bulsara, Max K; Ramelet, Anne-Sylvie; Monterosso, Leanne
2018-05-01
To establish criterion-related construct validity and test-retest reliability for the Endotracheal Suction Assessment Tool© (ESAT©). Endotracheal tube suction performed in children can significantly affect clinical stability. Previously identified clinical indicators for endotracheal tube suction were used as criteria when designing the ESAT©. Content validity was reported previously. The final stages of psychometric testing are presented. Observational testing was used to measure construct validity and determine whether the ESAT© could guide "inexperienced" paediatric intensive care nurses' decision-making regarding endotracheal tube suction. Test-retest reliability of the ESAT© was performed at two time points. The researchers and paediatric intensive care nurse "experts" developed 10 hypothetical clinical scenarios with predetermined endotracheal tube suction outcomes. "Experienced" (n = 12) and "inexperienced" (n = 14) paediatric intensive care nurses were presented with the scenarios and the ESAT© guiding decision-making about whether to perform endotracheal tube suction for each scenario. Outcomes were compared with those predetermined by the "experts" (n = 9). Test-retest reliability of the ESAT© was measured at two consecutive time points (4 weeks apart) with "experienced" and "inexperienced" paediatric intensive care nurses using the same scenarios and tool to guide decision-making. No differences were observed between endotracheal tube suction decisions made by "experts" (n = 9), "inexperienced" (n = 14) and "experienced" (n = 12) nurses confirming the tool's construct validity. No differences were observed between groups for endotracheal tube suction decisions at T1 and T2. Criterion-related construct validity and test-retest reliability of the ESAT© were demonstrated. Further testing is recommended to confirm reliability in the clinical setting with the "inexperienced" nurse to guide decision-making related to endotracheal tube suction. 
The ESAT© is the first validated tool to systematically guide endotracheal nursing practice for the "inexperienced" nurse. © 2018 John Wiley & Sons Ltd.
ERIC Educational Resources Information Center
Wicherts, Jelte M.; Scholten, Annemarie Zand
2010-01-01
The validity of cognitive ability tests is often interpreted solely as a function of the cognitive abilities that these tests are supposed to measure, but other factors may be at play. The effects of test anxiety on the criterion-related validity (CRV) of tests were the topic of a recent study by Reeve, Heggestad, and Lievens (2009) (Reeve, C. L.,…
ERIC Educational Resources Information Center
Armstrong, William B.
In Fall 1994, the San Diego Community College District (SDCCD), in California, conducted a study to determine the validity of the Mathematics Diagnostic Testing Project (MDTP) placement test. The MDTP provides tests at four levels (i.e., algebra readiness, elementary algebra, intermediate algebra, and pre-calculus) and is used in the District for…
McKown, Clark
2007-03-01
In this study, the validity of 5 tests of children's social-emotional cognition, defined as their encoding, memory, and interpretation of social information, was tested. Participants were 126 clinic-referred children between the ages of 5 and 17. All 5 tests were evaluated in terms of their (a) concurrent validity, (b) incremental validity, and (c) clinical usefulness in predicting social functioning. Tests included measures of nonverbal sensitivity, social language, and social problem solving. Criterion measures included parent and teacher report of social functioning. Analyses support the concurrent validity of all measures, and the incremental validity and clinical usefulness of tests of pragmatic language and problem solving.
The Role of Testing in Affirmative Action.
ERIC Educational Resources Information Center
Manning, Winton H.
Graphs and charts pertaining to testing in affirmative action are presented. Data concern the following: the predictive validity of College Board admissions tests using freshman grade point average as the criterion; validity coefficients of undergraduate grade point average (UGPA) alone, Law School Admission Test (LSAT) scores, and undergraduate…
45 CFR 1170.42 - Admissions and recruitment.
Code of Federal Regulations, 2010 CFR
2010-10-01
... recipient, has been validated as a predictor of success in the education program or activity in question and... inquiry exception. When a recipient is taking remedial action to correct the effects of past... first year grades, but shall conduct periodic validity studies against the criterion of overall success...
A Model of Physical Performance for Occupational Tasks.
ERIC Educational Resources Information Center
Hogan, Joyce
This report acknowledges the problems faced by industrial/organizational psychologists who must make personnel decisions involving physically demanding jobs. The scarcity of criterion-related validation studies and the difficulty of generalizing validity are considered, and a model of physical performance that builds on Fleishman's (1984)…
Validity Arguments for Diagnostic Assessment Using Automated Writing Evaluation
ERIC Educational Resources Information Center
Chapelle, Carol A.; Cotos, Elena; Lee, Jooyoung
2015-01-01
Two examples demonstrate an argument-based approach to validation of diagnostic assessment using automated writing evaluation (AWE). "Criterion"® was developed by Educational Testing Service to analyze students' papers grammatically, providing sentence-level error feedback. An interpretive argument was developed for its use as part of…
Iwata, Shintaro; Uehara, Kosuke; Ogura, Koichi; Akiyama, Toru; Shinoda, Yusuke; Yonemoto, Tsukasa; Kawai, Akira
2016-09-01
The Musculoskeletal Tumor Society (MSTS) scoring system is a widely used functional evaluation tool for patients treated for musculoskeletal tumors. Although the MSTS scoring system has been validated in English and Brazilian Portuguese, a Japanese version of the MSTS scoring system has not yet been validated. We sought to determine whether a Japanese-language translation of the MSTS scoring system for the lower extremity had (1) sufficient reliability and internal consistency, (2) adequate construct validity, and (3) reasonable criterion validity compared with the Toronto Extremity Salvage Score (TESS) and SF-36 using psychometric analysis. The Japanese version of the MSTS scoring system was developed using accepted guidelines, which included translation of the English version of the MSTS into Japanese by five native Japanese bilingual musculoskeletal oncology surgeons and integrated into one document. One hundred patients with a diagnosis of intermediate or malignant bone or soft tissue tumors located in the lower extremity and who had undergone tumor resection with or without reconstruction or amputation participated in this study. Reliability was evaluated by test-retest analysis, and internal consistency was established by Cronbach's alpha coefficient. Construct validity was evaluated using the principal factor analysis and Akaike information criterion network. Criterion validity was evaluated by comparing the MSTS scoring system with the TESS and SF-36. Test-retest analysis showed a high intraclass correlation coefficient (0.92; 95% CI, 0.88-0.95), indicating high reliability of the Japanese version of the MSTS scoring system, although a considerable ceiling effect was observed, with 23 patients (23%) given the maximum score. Cronbach's alpha coefficient was 0.87 (95% CI, 0.82-0.90), suggesting a high level of internal consistency. 
Factor analysis revealed that all items had high loading values and communalities; we identified a central role for the items "walking" and "gait" according to the Akaike information criterion network. The total MSTS score was correlated with that of the TESS (r = 0.81; 95% CI, 0.73-0.87; p < 0.001) and the physical component summary and physical functioning of the SF-36. The Japanese-language translation of the MSTS scoring system for the lower extremity has sufficient reliability and reasonable validity. Nevertheless, the observation of a ceiling effect suggests poor ability of this system to discriminate among patients who have a high level of function.
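The internal-consistency figure in the abstract above is a Cronbach's alpha coefficient. As a minimal sketch of how that coefficient is computed from item-level scores, the following uses a small hypothetical score matrix (not the study's data); the function name `cronbach_alpha` is an assumption of this sketch.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(rows[0])
    items = list(zip(*rows))                       # regroup scores by item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(r) for r in rows])   # variance of each respondent's total
    return k / (k - 1) * (1 - item_var_sum / total_var)

# Hypothetical 0-5 ratings on three items for five respondents (illustration only).
scores = [[5, 4, 5], [3, 3, 4], [4, 4, 4], [2, 1, 2], [5, 5, 4]]
alpha = cronbach_alpha(scores)
print(round(alpha, 3))
```

Alpha rises when items covary strongly relative to their individual variances, which is why it is read as an index of internal consistency rather than of validity.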
Validation of Cost-Effectiveness Criterion for Evaluating Noise Abatement Measures
DOT National Transportation Integrated Search
1999-04-01
This project will provide the Texas Department of Transportation (TxDOT) with information about the effects of the current cost-effectiveness criterion. The project has reviewed (1) the cost-effectiveness criteria used by other states, (2) the noise b...
An evaluation of the Psychache Scale on an offender population.
Mills, Jeremy F; Green, Kate; Reddon, John R
2005-10-01
This study examined the generalizability of a self-report measure of psychache to an offender population. The factor structure, construct validity, and criterion validity of the Psychache Scale were assessed on 136 male prison inmates. The results showed the Psychache Scale to have a single underlying factor structure and to be strongly associated with measures of depression and hopelessness and moderately associated with psychiatric symptoms and the criterion variable of a history of prior suicide attempts. The variables of depression, hopelessness, and psychiatric symptoms all contributed unique variance to psychache. Discussion centers on psychache's theoretical application to the prediction of suicide.
Williams, Stacey L; Polaha, Jodi
2014-09-01
The purpose of our research was to examine the validity of score interpretations of an instrument developed to measure parents' perceptions of stigma about seeking mental health services for their children. The validity of the score interpretations of the instrument was tested in 2 studies. Study 1 employed confirmatory factor analysis (CFA), using a split half approach, and construct and criterion validity on data from the entire sample of parents in rural Appalachia whose children were experiencing psychosocial concerns (N = 347), while Study 2 employed CFA, construct and criterion validity, and predictive validity of the scores on data from a general sample of parents in rural Appalachia (N = 184). Results of exploratory and confirmatory factor analyses revealed support for a 2-factor model of parents' perceived stigma, which represented both self and public forms of stigma associated with seeking mental health services for their children, and correlated with existing measures of stigma and other psychosocial variables. Further, the new self and public stigma scale significantly predicted parents' willingness to seek services for children. PsycINFO Database Record (c) 2014 APA, all rights reserved.
Severity of illness index for surgical departments in a Cuban hospital: a revalidation study.
Armas-Bencomo, Amadys; Tamargo-Barbeito, Teddy Osmin; Fuentes-Valdés, Edelberto; Jiménez-Paneque, Rosa Eugenia
2017-03-08
In the context of the evaluation of hospital services, the incorporation of severity indices allows an essential control variable for performance comparisons in time and space through risk adjustment. The severity index for surgical services was developed in 1999 and validated as a general index for surgical services. Sixteen years later the hospital context is different in many ways and a revalidation was considered necessary to guarantee its current usefulness. To evaluate the validity and reliability of the surgical services severity index to warrant its reasonable use under current conditions. A descriptive study was carried out in the General Surgery service of the "Hermanos Ameijeiras" Clinical Surgical Hospital of Havana, Cuba during the second half of 2010. We reviewed the medical records of 511 patients discharged from this service. Items were the same as the original index as were their weighted values. Conceptual or construct validity, criterion validity and inter-rater reliability as well as internal consistency of the proposed index were evaluated. Construct validity was expressed as a significant association between the value of the severity index for surgical services and discharge status. A significant association was also found, although weak, with length of hospital stay. Criterion validity was demonstrated through the correlations between the severity index for surgical services and other similar indices. Regarding criterion validity, the Horn index showed a correlation of 0.722 (95% CI: 0.677-0.761) with our index. With the POSSUM score, correlation was 0.454 (95% CI: 0.388-0.514) with mortality risk and 0.539 (95% CI: 0.462-0.607) with morbidity risk. Internal consistency yielded a standardized Cronbach's alpha of 0.8; inter-rater reliability resulted in a reliability coefficient of 0.98 for the quantitative index and a weighted global Kappa coefficient of 0.87 for the ordinal surgical index of severity for surgical services (IGQ). 
The validity and reliability of the proposed index were satisfactory in all aspects evaluated. The surgical services severity index may be used in the original context and is easily adaptable to other contexts as well.
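The abstract above reports correlations with 95% confidence intervals but does not say how the intervals were obtained. A common approach is the Fisher z-transformation; the sketch below applies it to the reported r = 0.722, under the assumption that all 511 reviewed records entered the correlation. The function name `pearson_ci` is hypothetical.

```python
import math

def pearson_ci(r, n, z_crit=1.959964):
    """Approximate 95% CI for a Pearson correlation via Fisher's z-transformation."""
    z = math.atanh(r)                 # transform r to an approximately normal scale
    se = 1.0 / math.sqrt(n - 3)       # standard error on the z scale
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

# r and n taken from the abstract; using n = 511 for this correlation is an assumption.
lo, hi = pearson_ci(0.722, 511)
print(round(lo, 3), round(hi, 3))
```

Under these assumptions the interval comes out at roughly 0.68 to 0.76, close to the interval given in the abstract, which suggests (but does not confirm) that a similar method was used.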
Klußmann, André; Gebhardt, Hansjürgen; Rieger, Monika; Liebers, Falk; Steinberg, Ulf
2012-01-01
Upper extremity musculoskeletal symptoms and disorders are common in the working population. The economic and social impact of such disorders is considerable. Long-time, dynamic repetitive exposure of the hand-arm system during manual handling operations (MHO), alone or in combination with static and postural effort, is recognised as a cause of musculoskeletal symptoms and disorders. The assessment of these manual work tasks is crucial to estimate health risks of exposed employees. For these work tasks, a new method for the assessment of the working conditions was developed and a validation study was performed. The results suggest satisfactory criterion validity and moderate objectivity of the KIM-MHO draft 2007. The method was modified and evaluated again. It is planned to release a new version of KIM-MHO in spring 2012.
The Measurement of Negative Creativity: Metrics and Relationships
ERIC Educational Resources Information Center
Kapoor, Hansika; Khan, Azizuddin
2016-01-01
Although the dark side of creativity and negative creativity are shaping into legitimate subconstructs, measures to assess the same remain to be validated. To meet this goal, two studies assessed the convergent, predictive, and criterion-related validities of two valence-inclusive creativity measures. One measure assessed the self-report…
The Validity of the Musical Aptitude Profile for Predicting Grades in Freshman Music Theory.
ERIC Educational Resources Information Center
Harrison, Carole S.
1987-01-01
This study investigated the criterion-related validity of the Musical Aptitude Profile in relation to achievement in freshman music theory as determined by semester grades in the courses and by grades in three course components (paperwork, sight-singing and ear-training). (Author/BS)
ERIC Educational Resources Information Center
Roehling, Patricia Vincent; Robin, Arthur L.
1986-01-01
Evaluated the criterion-related validity of the Family Beliefs Inventory, a new self-report measure of unreasonable beliefs regarding parent-adolescent relationships. Distressed fathers displayed more unreasonable beliefs concerning ruination, obedience, perfectionism, and malicious intent than nondistressed fathers. Distressed adolescents…
The Marital Disaffection Scale: An Inventory for Assessing Emotional Estrangement in Marriage.
ERIC Educational Resources Information Center
Kayser, Karen
1996-01-01
Describes a self-report scale measuring levels of disaffection toward one's spouse. A questionnaire containing the Marital Disaffection Scale (MDS) and other disaffection measures of marital happiness was administered to 76 spouses. Results indicated good criterion-related validity, discriminant validity, and interitem reliability. Findings…
Validation of the Proficiency Examination for Diagnostic Radiologic Technology. Final Report.
ERIC Educational Resources Information Center
Educational Testing Service, Princeton, NJ.
The validity of the Proficiency Examination for Diagnostic Radiologic Technology was investigated, using 140 radiologic technologists who took both the written Proficiency Examination and a performance test. As an additional criterion measure of job proficiency, supervisors' assessments were obtained for 128 of the technologists. The resulting…
29 CFR 1607.14 - Technical standards for validity studies.
Code of Federal Regulations, 2011 CFR
2011-07-01
... in the design of the study and their effects identified. (5) Statistical relationships. The degree of...; or such factors should be included in the design of the study and their effects identified. (f... arduous effort involving a series of research studies, which include criterion related validity studies...
29 CFR 1607.14 - Technical standards for validity studies.
Code of Federal Regulations, 2013 CFR
2013-07-01
... in the design of the study and their effects identified. (5) Statistical relationships. The degree of...; or such factors should be included in the design of the study and their effects identified. (f... arduous effort involving a series of research studies, which include criterion related validity studies...
29 CFR 1607.14 - Technical standards for validity studies.
Code of Federal Regulations, 2014 CFR
2014-07-01
... in the design of the study and their effects identified. (5) Statistical relationships. The degree of...; or such factors should be included in the design of the study and their effects identified. (f... arduous effort involving a series of research studies, which include criterion related validity studies...
29 CFR 1607.14 - Technical standards for validity studies.
Code of Federal Regulations, 2012 CFR
2012-07-01
... in the design of the study and their effects identified. (5) Statistical relationships. The degree of...; or such factors should be included in the design of the study and their effects identified. (f... arduous effort involving a series of research studies, which include criterion related validity studies...
29 CFR 1607.14 - Technical standards for validity studies.
Code of Federal Regulations, 2010 CFR
2010-07-01
... in the design of the study and their effects identified. (5) Statistical relationships. The degree of...; or such factors should be included in the design of the study and their effects identified. (f... arduous effort involving a series of research studies, which include criterion related validity studies...
Convergent and Divergent Validity of the Learning Transfer System Inventory
ERIC Educational Resources Information Center
Holton, Elwood F., III; Bates, Reid A.; Bookter, Annette I.; Yamkovenko, V. Bogdan
2007-01-01
The Learning Transfer System Inventory (LTSI) was developed to identify a select set of factors with the potential to substantially enhance or inhibit transfer of learning to the work environment. It has undergone a variety of validation studies, including construct, criterion, and cross-cultural studies. However, the convergent and divergent…
Donaldson, Catherine; Tallis, Raymond C; Pomeroy, Valerie M
2009-06-01
Inadequate description of treatment hampers progress in stroke rehabilitation. To develop a valid, reliable, standardised treatment schedule of conventional physical therapy provided for the paretic upper limb after stroke. Eleven neurophysiotherapists participated in the established methodology: semi-structured interviews, focus groups and piloting a draft treatment schedule in clinical practice. Different physiotherapists (n=13) used the treatment schedule to record treatment given to stroke patients with mild, moderate and severe upper limb paresis. Rating of adequacy of the treatment schedule was made using a visual analogue scale (0 to 100mm). Mean (95% confidence interval) visual analogue scores were calculated (expert criterion validity). For intra-rater reliability, each physiotherapist observed a video tape of their treatment and immediately completed a treatment schedule recording form on two separate occasions, 4 to 6 weeks apart. The Kappa statistic was calculated for intra-rater reliability. The treatment schedule consists of a one-page A4 recording form and a user booklet, detailing 50 treatment activities. Expert criterion validity was 79 (95% confidence interval 74 to 84). Intra-rater Kappa was 0.81 (P<0.001). This treatment schedule can be used to document conventional physical therapy in subsequent clinical trials in the geographical area of its development. Further work is needed to investigate generalisability beyond this geographical area.
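Intra-rater reliability in the abstract above is summarised with the Kappa statistic. The following is a minimal sketch of unweighted Cohen's kappa for two sets of categorical ratings; the two rating lists are hypothetical (whether each of ten treatment activities was recorded on two occasions), and `cohen_kappa` is a name introduced here for illustration.

```python
from collections import Counter

def cohen_kappa(first, second):
    """Unweighted Cohen's kappa: chance-corrected agreement between two ratings."""
    n = len(first)
    observed = sum(a == b for a, b in zip(first, second)) / n
    c1, c2 = Counter(first), Counter(second)
    # Agreement expected by chance from the marginal category frequencies.
    expected = sum(c1[c] * c2[c] for c in set(c1) | set(c2)) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical recordings (1 = activity recorded, 0 = not recorded).
occasion_1 = [1, 1, 0, 1, 0, 0, 1, 1, 0, 1]
occasion_2 = [1, 1, 0, 1, 0, 1, 1, 1, 0, 1]
kappa = cohen_kappa(occasion_1, occasion_2)
print(round(kappa, 3))
```

Raw agreement here is 90%, but kappa is lower because part of that agreement would be expected by chance alone.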
Reliability and criterion-related validity of a new repeated agility test
Makni, E; Jemni, M; Elloumi, M; Chamari, K; Nabli, MA; Padulo, J; Moalla, W
2016-01-01
The study aimed to assess the reliability and the criterion-related validity of a new repeated sprint T-test (RSTT) that includes intense multidirectional intermittent efforts. The RSTT consisted of 7 maximal repeated executions of the agility T-test with 25 s of passive recovery rest in between. Forty-five team sports players performed two RSTTs separated by 3 days to assess the reliability of best time (BT) and total time (TT) of the RSTT. The intra-class correlation coefficient analysis revealed a high relative reliability between test and retest for BT and TT (>0.90). The standard error of measurement (<0.50) showed that the RSTT has a good absolute reliability. The minimal detectable change values for BT and TT related to the RSTT were 0.09 s and 0.58 s, respectively. To check the criterion-related validity of the RSTT, players performed a repeated linear sprint (RLS) and a repeated sprint with changes of direction (RSCD). Significant correlations between the BT and TT of the RLS, RSCD and RSTT were observed (p<0.001). The RSTT is, therefore, a reliable and valid measure of the intermittent repeated sprint agility performance. As this ability is required in all team sports, it is suggested that team sports coaches, fitness coaches and sports scientists consider this test in their training follow-up. PMID:27274109
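The abstract above reports a standard error of measurement (SEM) and minimal detectable change (MDC) without giving formulas. One widely used convention is SEM = SD × sqrt(1 − ICC) and MDC95 = 1.96 × sqrt(2) × SEM; the sketch below follows that convention as an assumption, with hypothetical input values rather than the study's data.

```python
import math

def sem_from_icc(sd, icc):
    """Standard error of measurement from the between-subject SD and the ICC."""
    return sd * math.sqrt(1.0 - icc)

def mdc95(sem):
    """Minimal detectable change at 95% confidence for a test-retest difference."""
    return 1.96 * math.sqrt(2.0) * sem

# Hypothetical values: SD of 0.30 s across players and an ICC of 0.95.
sem = sem_from_icc(sd=0.30, icc=0.95)
mdc = mdc95(sem)
print(round(sem, 3), round(mdc, 3))
```

The sqrt(2) factor reflects that a change score combines measurement error from two test occasions.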
Educational testing validity and reliability in pharmacy and medical education literature.
Hoover, Matthew J; Jung, Rose; Jacobs, David M; Peeters, Michael J
2013-12-16
To evaluate and compare the reliability and validity of educational testing reported in pharmacy education journals with that reported in the medical education literature. Descriptions of validity evidence sources (content, construct, criterion, and reliability) were extracted from articles that reported educational testing of learners' knowledge, skills, and/or abilities. Using educational testing, the findings of 108 pharmacy education articles were compared to the findings of 198 medical education articles. For pharmacy educational testing, 14 articles (13%) reported more than 1 validity evidence source while 83 articles (77%) reported 1 validity evidence source and 11 articles (10%) did not have evidence. Among validity evidence sources, content validity was reported most frequently. Compared with pharmacy education literature, more medical education articles reported both validity and reliability (59%; p<0.001). While there were more scholarship of teaching and learning (SoTL) articles in pharmacy education than in medical education, validity and reliability reporting was limited in the pharmacy education literature.
Gutiérrez Sánchez, Daniel; Cuesta-Vargas, Antonio I
2018-04-01
Many measurements have been developed to assess the quality of death (QoD). Among these, the Quality of Dying and Death Questionnaire (QODD) is the most widely studied and best validated. Informal carers and health professionals who care for the patient during their last days of life can complete this assessment tool. The aim of the study is to carry out a cross-cultural adaptation and a psychometric analysis of the QODD for the Spanish population. The translation was performed using a double forward and backward method. An expert panel evaluated the content validity. The questionnaire was tested in a sample of 72 Spanish-speaking adult carers of deceased cancer patients. A psychometric analysis was performed to evaluate internal consistency, divergent criterion-related validity with the Mini-Suffering State Examination (MSSE) and concurrent criterion-related validity with the Palliative Outcome Scale (POS). Some items were deleted and modified to create the Spanish version of the QODD (QODD-ESP-26). The instrument was readable and acceptable. The content validity index was 0.96, suggesting that all items are relevant for the measure of the QoD. This questionnaire showed high internal consistency (Cronbach's α coefficient = 0.88). Divergent validity with MSSE (r = -0.64) and convergent validity with POS (r = -0.61) were also demonstrated. The QODD-ESP-26 is a valid and reliable instrument for the assessment of the QoD of deceased cancer patients that can be used in a clinical and research setting. Copyright © 2018 Elsevier Ltd. All rights reserved.
2013-01-01
Background: Transplant recipients are expected to adhere to a lifelong immunosuppressant therapeutic regimen. However, nonadherence to treatment is an underestimated problem for which no properly validated measurement tool is available for Portuguese-speaking patients. We aimed to initially validate the Basel Assessment of Adherence to Immunosuppressive Medications Scale (BAASIS®) to accurately estimate immunosuppressant nonadherence in Brazilian transplant patients. Methods: The BAASIS® (English version) was transculturally adapted and its psychometric properties were assessed. The transcultural adaptation was performed using the Guillemin protocol. Psychometric testing included reliability (intraobserver and interobserver reproducibility, agreement, Kappa coefficient, and Cronbach’s alpha) and validity (content, criterion, and construct validities). Results: The final version of the transculturally adapted BAASIS® was pretested, and no difficulties in understanding its content were found. The intraobserver and interobserver reproducibility variances (0.007 and 0.003, respectively), Cronbach’s alpha (0.7), the Kappa coefficient (0.88) and the agreement (95.2%) suggest accuracy, precision and reliability. For construct validity, exploratory factor analysis demonstrated unidimensionality of the first three questions (r = 0.76, r = 0.80, and r = 0.68). For criterion validity, the adapted BAASIS® was correlated with another self-report instrument, the Measure of Adherence to Treatment, and showed good congruence (r = 0.65). Conclusions: The BAASIS® has adequate psychometric properties and may be employed in advance to measure adherence to posttransplant immunosuppressant treatments. This instrument will be the first one validated for use in this specific transplant population and in the Portuguese language. PMID:23692889
El-Housseiny, Azza A; Alsadat, Farah A; Alamoudi, Najlaa M; El Derwi, Douaa A; Farsi, Najat M; Attar, Moaz H; Andijani, Basil M
2016-04-14
Early recognition of dental fear is essential for the effective delivery of dental care. This study aimed to test the reliability and validity of the Arabic version of the Children's Fear Survey Schedule-Dental Subscale (CFSS-DS). A school-based sample of 1546 children was randomly recruited. The Arabic version of the CFSS-DS was completed by children during class time. The scale was tested for internal consistency and test-retest reliability. To test criterion validity, children's behavior was assessed using the Frankl scale during dental examination, and results were compared with children's CFSS-DS scores. To test the scale's construct validity, scores on "fear of going to the dentist soon" were correlated with CFSS-DS scores. Factor analysis was also used. The Arabic version of the CFSS-DS showed high reliability regarding both test-retest reliability (intraclass correlation = 0.83, p < 0.001) and internal consistency (Cronbach's α = 0.88). It showed good criterion validity: children with negative behavior had significantly higher fear scores (t = 13.67, p < 0.001). It also showed moderate construct validity (Spearman's rho correlation, r = 0.53, p < 0.001). Factor analysis identified the following factors: "fear of invasive dental procedures," "fear of less invasive dental procedures" and "fear of strangers." The Arabic version of the CFSS-DS is a reliable and valid measure of dental fear in Arabic-speaking children. Pediatric dentists and researchers may use this validated version of the CFSS-DS to measure dental fear in Arabic-speaking children.
Validation of a home food inventory among low-income Spanish- and Somali-speaking families.
Hearst, Mary O; Fulkerson, Jayne A; Parke, Michelle; Martin, Lauren
2013-07-01
To refine and validate an existing home food inventory (HFI) for low-income Somali- and Spanish-speaking families. Formative assessment was conducted using two focus groups, followed by revisions of the HFI, translation of written materials and instrument validation in participants’ homes. Twin Cities Metropolitan Area, Minnesota, USA. Thirty low-income families with children of pre-school age (fifteen Spanish-speaking; fifteen Somali-speaking) completed the HFI simultaneously with, but independently of, a trained staff member. Analysis consisted of calculation of both item-specific and average food group kappa coefficients, specificity, sensitivity and Spearman’s correlation between participants’ and staff scores as a means of assessing criterion validity of individual items, food categories and the obesogenic score. The formative assessment revealed the need for few changes/additions for food items typically found in Spanish-speaking households. Somali-speaking participants requested few additions, but many deletions, including frozen processed food items, non-perishable produce and many sweets as they were not typical food items kept in the home. Generally, all validity indices were within an acceptable range, with the exception of values associated with items such as ‘whole wheat bread’ (k = 0.16). The obesogenic score (presence of high-fat, high-energy foods) had high criterion validity with k = 0.57, sensitivity = 91.8%, specificity = 70.6% and Spearman correlation = 0.78. The revised HFI is a valid assessment tool for use among Spanish and Somali households. This instrument refinement and validation process can be replicated with other population groups.
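The abstract above assesses criterion validity by comparing participants' inventory entries against those of trained staff, using sensitivity and specificity among other indices. As a minimal sketch of those two quantities, the following treats the staff record as the reference standard; the presence/absence lists are hypothetical and the function name is an assumption of this sketch.

```python
def sensitivity_specificity(test, reference):
    """Sensitivity and specificity of binary `test` ratings against a reference."""
    tp = sum(1 for t, r in zip(test, reference) if t and r)          # both present
    fn = sum(1 for t, r in zip(test, reference) if not t and r)      # missed by test
    tn = sum(1 for t, r in zip(test, reference) if not t and not r)  # both absent
    fp = sum(1 for t, r in zip(test, reference) if t and not r)      # false alarm
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical presence (1) / absence (0) of eight food items in one home.
participant = [1, 1, 1, 0, 0, 1, 0, 0]
staff = [1, 1, 0, 0, 0, 1, 1, 0]
sens, spec = sensitivity_specificity(participant, staff)
print(sens, spec)
```

Sensitivity describes how often items the staff member found were also reported by the participant; specificity describes how often items the staff member did not find were also left unreported.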
Validation of a Spanish version of the Spine Functional Index.
Cuesta-Vargas, Antonio I; Gabel, Charles P
2014-06-27
The Spine Functional Index (SFI) is a recently published, robust and clinimetrically valid patient reported outcome measure. The purpose of this study was the adaptation and validation of a Spanish version (SFI-Sp) with cultural and linguistic equivalence. A two stage observational study was conducted. The SFI was cross-culturally adapted to Spanish through double forward and backward translation then validated for its psychometric characteristics. Participants (n = 226) with various spine conditions of >12 weeks duration completed the SFI-Sp and a region specific measure: for the back, the Roland Morris Questionnaire (RMQ) and Backache Index (BADIX); for the neck, the Neck Disability Index (NDI); for general health the EQ-5D and SF-12. The full sample was employed to determine internal consistency, concurrent criterion validity by region and health, construct validity and factor structure. A subgroup (n = 51) was used to determine reliability at seven days. The SFI-Sp demonstrated high internal consistency (α = 0.85) and reliability (r = 0.96). The factor structure was one-dimensional and supported construct validity. Criterion specific validity for function was high with the RMQ (r = 0.79), moderate with the BADIX (r = 0.59) and low with the NDI (r = 0.46). For general health, the correlation was inverse and low with the EQ-5D (r = -0.42) and inverse and fair with the Physical and Mental Components of the SF-12 (r = -0.56 and r = -0.48, respectively). The study limitations included the lack of longitudinal data regarding other psychometric properties, specifically responsiveness. The SFI-Sp was demonstrated as a valid and reliable spine-regional outcome measure. The psychometric properties were comparable to and supported those of the English version; however, further longitudinal investigations are required.
Measurement properties of depression questionnaires in patients with diabetes: a systematic review.
van Dijk, Susan E M; Adriaanse, Marcel C; van der Zwaan, Lennart; Bosmans, Judith E; van Marwijk, Harm W J; van Tulder, Maurits W; Terwee, Caroline B
2018-06-01
To conduct a systematic review on measurement properties of questionnaires measuring depressive symptoms in adult patients with type 1 or type 2 diabetes. A systematic review of the literature in MEDLINE, Embase and PsycINFO was performed. Full text, original articles, published in any language up to October 2016 were included. Eligibility for inclusion was independently assessed by three reviewers who worked in pairs. Methodological quality of the studies was evaluated by two independent reviewers using the COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) checklist. Quality of the questionnaires was rated per measurement property, based on the number and quality of the included studies and the reported results. Of 6286 unique hits, 21 studies met our criteria evaluating nine different questionnaires in multiple settings and languages. The methodological quality of the included studies was variable for the different measurement properties: 9/15 studies scored 'good' or 'excellent' on internal consistency, 2/5 on reliability, 0/1 on content validity, 10/10 on structural validity, 8/11 on hypothesis testing, 1/5 on cross-cultural validity, and 4/9 on criterion validity. For the CES-D, there was strong evidence for good internal consistency, structural validity, and construct validity; moderate evidence for good criterion validity; and limited evidence for good cross-cultural validity. The PHQ-9 and WHO-5 also performed well on several measurement properties. However, the evidence for structural validity of the PHQ-9 was inconclusive. The WHO-5 was less extensively researched and originally not developed to measure depression. Currently, the CES-D is best supported for measuring depressive symptoms in diabetes patients.
The cross-validated AUC for MCP-logistic regression with high-dimensional data.
Jiang, Dingfeng; Huang, Jian; Zhang, Ying
2013-10-01
We propose a cross-validated area under the receiver operating characteristic (ROC) curve (CV-AUC) criterion for tuning parameter selection for penalized methods in sparse, high-dimensional logistic regression models. We use this criterion in combination with the minimax concave penalty (MCP) method for variable selection. The CV-AUC criterion is specifically designed for optimizing the classification performance for binary outcome data. To implement the proposed approach, we derive an efficient coordinate descent algorithm to compute the MCP-logistic regression solution surface. Simulation studies are conducted to evaluate the finite sample performance of the proposed method and its comparison with the existing methods including the Akaike information criterion (AIC), Bayesian information criterion (BIC) or Extended BIC (EBIC). The model selected based on the CV-AUC criterion tends to have a larger predictive AUC and smaller classification error than those with tuning parameters selected using the AIC, BIC or EBIC. We illustrate the application of the MCP-logistic regression with the CV-AUC criterion on three microarray datasets from the studies that attempt to identify genes related to cancers. Our simulation studies and data examples demonstrate that the CV-AUC is an attractive method for tuning parameter selection for penalized methods in high-dimensional logistic regression models.
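As a rough illustration of the CV-AUC idea in the abstract above, the following Python sketch scores each candidate tuning parameter by the AUC of its out-of-fold predictions and keeps the one with the largest value. It is not the authors' implementation: the MCP-logistic fit itself is omitted, and the function names and the toy predictions in `oof_scores` are hypothetical.

```python
def auc(labels, scores):
    """AUC as the probability that a random positive case outscores a
    random negative case (ties count one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def select_by_cv_auc(labels, oof_scores):
    """Return the tuning parameter whose out-of-fold scores give the largest AUC."""
    return max(oof_scores, key=lambda lam: auc(labels, oof_scores[lam]))


# Toy data (made up): binary outcomes and the out-of-fold predicted
# probabilities a penalized logistic model might give at three penalty levels.
labels = [0, 0, 1, 1, 0, 1]
oof_scores = {
    0.01: [0.2, 0.6, 0.5, 0.7, 0.4, 0.3],
    0.10: [0.1, 0.3, 0.6, 0.8, 0.2, 0.7],
    1.00: [0.5, 0.5, 0.5, 0.5, 0.5, 0.5],
}
best_lambda = select_by_cv_auc(labels, oof_scores)
print(best_lambda, auc(labels, oof_scores[best_lambda]))
```

In practice the out-of-fold scores would come from refitting the penalized model on each training fold; only the selection rule is shown here.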
Psychometric Validation of the Academic Motivation Scale in a Dental Student Sample.
Orsini, Cesar; Binnie, Vivian; Evans, Phillip; Ledezma, Priscilla; Fuentes, Fernando; Villegas, Maria J
2015-08-01
The Academic Motivation Scale is one of the most frequently used instruments to assess academic motivation. It relies on the self-determination theory of human motivation. However, motivation has been understudied in dental education. Therefore, to address the lack of valid instruments to assess academic motivation in dental education and contribute to future research in the field, the aim of this study was to analyze the psychometric properties of this instrument in a sample of dental students. Participants were 989 Chilean undergraduate dental students (86% response rate) who completed a survey containing a Chilean face-valid version of the Spanish Academic Motivation Scale and three other motivation-related instruments to assess the survey's construct and criterion validity. Later, 76 of the students (out of 100 invited) took the survey again to assess its test-retest stability. The instrument's construct validity was supported by the superior goodness of fit of the seven-subscale Academic Motivation Scale over competing models through confirmatory factor analysis and by the expected correlations among its subscales. The concurrent criterion validity was supported by the confirmation of correlations between its subscales and external criteria. Adequate internal consistency and test-retest correlations were also found. The evidence from this study suggests that the Academic Motivation Scale is a preliminarily valid and reliable instrument to assess motivation in the predoctoral dental context. Future research in this area is needed to confirm or refute these results.
Hodge, Megan; Gotzke, Carrie Lynne
2014-08-01
To evaluate the criterion-related validity of the TOCS+ sentence measure (TOCS+; Hodge, Daniels & Gotzke, 2009) for children with dysarthria and cerebral palsy (CP) by comparing intelligibility and rate scores obtained concurrently from the TOCS+ and from a conversational sample. Twenty children (3 to 10 years old) diagnosed with spastic CP participated. Nineteen children also had a confirmed diagnosis of dysarthria. Children's intelligibility and speaking rate scores obtained from the TOCS+, which uses imitation of sets of randomly selected items ranging from 2-7 words (80 words in total), and from a contiguous 100-word conversational speech sample were compared. Mean intelligibility scores were 46.5% (SD = 26.4%) and 50.9% (SD = 19.1%) and mean rates in words per minute (WPM) were 90.2 (SD = 22.3) and 94.1 (SD = 25.6), respectively, for the TOCS+ and conversational samples. No significant differences were found between the two conditions for intelligibility or rate scores. Strong correlations were found between the TOCS+ and conversational samples for intelligibility (r = 0.86; p < 0.001) and WPM (r = 0.77; p < 0.001). The results support the criterion validity of the TOCS+ sentence task as a time efficient procedure for measuring intelligibility and rate in children with CP, with and without confirmed dysarthria. Children varied in their relative performance on the two speaking tasks, reflecting the complexity of factors that influence intelligibility and rate scores.
2011-01-01
Background: Since stress is hypothesized to play a role in the etiology of obesity during adolescence, research on associations between adolescent stress and obesity-related parameters and behaviours is essential. Due to the lack of a well-established recent stress checklist for use in European adolescents, the study investigated the reliability and validity of the Adolescent Stress Questionnaire (ASQ) for assessing perceived stress in European adolescents. Methods: The ASQ was translated into the languages of the participating cities (Ghent, Stockholm, Vienna, Zaragoza, Pecs and Athens) and was implemented within the HELENA cross-sectional study. A total of 1140 European adolescents provided a valid ASQ, comprising 10 component scales, used for internal reliability (Cronbach α) and construct validity (confirmatory factor analysis or CFA). Contributions of socio-demographic (gender, age, pubertal stage, socio-economic status) characteristics to the ASQ score variances were investigated. Two hundred adolescents also provided valid saliva samples for cortisol analysis to compare with the ASQ scores (criterion validity). Test-retest reliability was investigated using two ASQ assessments from 37 adolescents. Results: Cronbach α-values of the ASQ scales (0.57 to 0.88) demonstrated a moderate internal reliability of the ASQ, and intraclass correlation coefficients (0.45 to 0.84) established an insufficient test-retest reliability of the ASQ. The adolescents' gender (girls had higher stress scores than boys) and pubertal stage (those in a post-pubertal development had higher stress scores than others) significantly contributed to the variance in ASQ scores, while their age and socio-economic status did not. CFA results showed that the original scale construct fitted moderately with the data in our European adolescent population. Only in boys, four out of 10 ASQ scale scores were significant positive predictors of baseline wake-up salivary cortisol, suggesting a rather poor criterion validity of the ASQ, especially in girls. Conclusions: In our European adolescent sample, the ASQ had an acceptable internal reliability and construct validity and the adolescents' gender and pubertal stage systematically contributed to the ASQ variance, but its test-retest reliability and criterion validity were rather poor. Overall, the utility of the ASQ for assessing perceived stress in adolescents across Europe is uncertain and some aspects require further examination. PMID:21943341
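Several abstracts in this listing report internal consistency as Cronbach α. For orientation, here is a minimal Python sketch of the standard formula, α = k/(k−1) × (1 − Σ item variances / variance of total scores). The function name and the response data are made up for illustration and are not taken from any of the studies.

```python
from statistics import variance


def cronbach_alpha(rows):
    """rows: one list of item scores per respondent (same item order in each row)."""
    k = len(rows[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of the summed scale score.
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Made-up responses: 5 respondents x 4 items on a 1-5 scale.
responses = [
    [2, 3, 2, 3],
    [4, 4, 5, 4],
    [1, 2, 1, 2],
    [3, 3, 4, 3],
    [5, 4, 5, 5],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 2))
```

The sketch assumes complete data; rows with missing items would need to be handled before calling the function.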
Standards for Evaluating Criterion-Referenced Tests.
ERIC Educational Resources Information Center
Walker, Clinton B.
Standards for evaluating criterion-referenced tests are presented. Twenty-one standards, grouped in three categories, are discussed. Category one is defined as measurement properties and is comprised of conceptual validity, including description of the domain, test item agreement with objectives, and item representativeness of the objectives; and…
A new self-report inventory of dyslexia for students: criterion and construct validity.
Tamboer, Peter; Vorst, Harrie C M
2015-02-01
The validity of a Dutch self-report inventory of dyslexia was ascertained in two samples of students. Six biographical questions, 20 general language statements and 56 specific language statements were based on dyslexia as a multi-dimensional deficit. Dyslexia and non-dyslexia were assessed with two criteria: identification with test results (Sample 1) and classification using biographical information (both samples). Using discriminant analyses, these criteria were predicted with various groups of statements. All together, 11 discriminant functions were used to estimate classification accuracy of the inventory. In Sample 1, 15 statements predicted the test criterion with classification accuracy of 98%, and 18 statements predicted the biographical criterion with classification accuracy of 97%. In Sample 2, 16 statements predicted the biographical criterion with classification accuracy of 94%. Estimations of positive and negative predictive value were 89% and 99%. Items of various discriminant functions were factor analysed to find characteristic difficulties of students with dyslexia, resulting in a five-factor structure in Sample 1 and a four-factor structure in Sample 2. Answer bias was investigated with measures of internal consistency reliability. Less than 20 self-report items are sufficient to accurately classify students with and without dyslexia. This supports the usefulness of self-assessment of dyslexia as a valid alternative to diagnostic test batteries. Copyright © 2015 John Wiley & Sons, Ltd.
Ghisi, Gabriela Lima de Melo; Sandison, Nicole; Oh, Paul
2016-03-01
To develop, pilot test and psychometrically validate a shorter version of the coronary artery disease education questionnaire (CADE-Q), called CADE-Q SV. Based on previous versions of the CADE-Q, cardiac rehabilitation (CR) experts developed 20 items divided into 5 knowledge domains to comprise the first version of the CADE-Q SV. To establish content validity, they were reviewed by an expert panel (N=12). Refined items were pilot-tested in 20 patients, in which clarity was provided. A final version was generated and psychometrically tested in 132 CR patients. Test-retest reliability was assessed via the intraclass correlation coefficient (ICC), the internal consistency using Cronbach's alpha, and criterion validity with regard to patients' education and duration in CR. All ICC coefficients met the minimum recommended standard. All domains were considered internally consistent (α>0.7). Criterion validity was supported by significant differences in mean scores by educational level (p<0.01) and duration in CR (p<0.05). Knowledge about exercise and nutrition was higher than knowledge about medical condition. The CADE-Q SV was demonstrated to have good reliability and validity. This is a short, quick and appropriate tool for application in clinical and research settings, assessing patients' knowledge during CR and as part of education programming. Copyright © 2015. Published by Elsevier Ireland Ltd.
Ghisi, Gabriela Lima de Melo; Grace, Sherry L; Thomas, Scott; Evans, Michael F; Oh, Paul
2013-06-01
To develop and psychometrically validate a tool to assess information needs in cardiac rehabilitation (CR) patients. After a literature search, 60 information items divided into 11 areas of needs were identified. To establish content validity, they were reviewed by an expert panel (N=10). Refined items were pilot-tested in 34 patients on a 5-point Likert-scale from 1 "really not helpful" to 5 "very important". A final version was generated and psychometrically tested in 203 CR patients. Test-retest reliability was assessed via the intraclass correlation coefficient (ICC), the internal consistency using Cronbach's alpha, and criterion validity was assessed with regard to patient's education and duration in CR. Five items were excluded after ICC analysis as well as one area of needs. All 10 areas were considered internally consistent (Cronbach's alpha>0.7). Criterion validity was supported by significant differences in mean scores by educational level (p<0.05) and duration in CR (p<0.001). The mean total score was 4.08 ± 0.53. Patients rated safety as their greatest information need. The INCR Tool was demonstrated to have good reliability and validity. This is an appropriate tool for application in clinical and research settings, assessing patients' needs during CR and as part of education programming. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
Guo, Xinying; Wu, Xinjuan; Guo, Aimin; Zhao, Yanwei
2018-01-01
Condyloma acuminata (CA) is a sexually transmitted disease that affects quality of life (QOL). CECA10 is an English-language questionnaire for assessing QOL in patients with CA, but there is no equivalent in China. This study aimed to develop a validated and reliable Chinese version of CECA10. The Chinese CECA10 was developed from the English version by forward translation, back translation, comparison with the original, cultural adjustments, and a pre-test (5 patients). The Chinese CECA10 and EuroQol Five Dimensions Three Level Questionnaire (EQ-5D-3L) were administered to patients with CA. Content validity (item/scale content validity indexes, I-CVI/S-CVI), test–retest reliability (intraclass coefficient, ICC), internal consistency (Cronbach α), criterion validity (comparison with the Dermatology Life Quality Index, DLQI, using Spearman correlation analysis), construct validity (exploratory factor analysis), and discriminant validity (between subgroups based on number of warts, number of recurrences, or number of sites involved) were assessed. The Chinese CECA10 had good test–retest reliability (ICC = 0.98, P < .001), internal consistency (Cronbach α values of 0.88, 0.84, and 0.83 for the total questionnaire, psychological dimension, and sexual dimension, respectively), content validity (I-CVI = 1 for all items), and criterion validity (r = -0.50, P < .001). Exploratory factor analysis extracted 2 factors with a cumulative contribution of 61.75%; the factor loading with each item was >0.4. Discriminant validity was not high. The mean CECA10 and EQ-VAS scores of 211 patients with CA (28.19 ± 7.16 years; 139 males) were 34.56 ± 19.01 and 64.64 ± 19.28, respectively. The Chinese CECA10 has good reliability and validity for evaluating the QOL of Chinese patients with CA. PMID:29489693
ERIC Educational Resources Information Center
Gao, Zan; Lee, Amelia M.; Solmon, Melinda A.; Kosma, Maria; Carson, Russell L.; Zhang, Tao; Domangue, Elizabeth; Moore, Delilah
2010-01-01
The purpose of this study was to validate physical activity time in middle school physical education as measured by pedometers in relation to a criterion measure, namely, students' accelerometer determined moderate to vigorous physical activity (MVPA). Participants were 155 sixth to eighth graders participating in regularly scheduled physical…
ERIC Educational Resources Information Center
Pike, Gary R.
1989-01-01
A study investigated the appropriateness of the American College Testing Program's College Outcome Measures Program, conducted at the University of Tennessee, Knoxville, by applying the criterion of construct validity. Results indicated that while the test primarily measures individual differences, it is also sensitive to the effects of higher…
ERIC Educational Resources Information Center
Kohn, Paul M.; Milrose, Jill A.
1993-01-01
A decontaminated measure of exposures to hassles for adolescents, the Inventory of High-School Students' Recent Life Experiences (IHSSRLE), was developed and validated with 94 male and 82 female Canadian high school students. The IHSSRLE shows adequate internal consistency reliability and validity against the criterion of subjectively appraised…
A Model for Investigating Predictive Validity at Highly Selective Institutions.
ERIC Educational Resources Information Center
Gross, Alan L.; And Others
A statistical model for investigating predictive validity at highly selective institutions is described. When the selection ratio is small, one must typically deal with a data set containing relatively large amounts of missing data on both criterion and predictor variables. Standard statistical approaches are based on the strong assumption that…
ERIC Educational Resources Information Center
Kane, Michael T.; Mroch, Andrew A.
2010-01-01
In evaluating the relationship between two measures across different groups (i.e., in evaluating "differential validity") it is necessary to examine differences in correlation coefficients and in regression lines. Ordinary least squares (OLS) regression is the standard method for fitting lines to data, but its criterion for optimal fit…
Development and Validation of the Spanish-English Language Proficiency Scale (SELPS)
ERIC Educational Resources Information Center
Smyk, Ekaterina; Restrepo, M. Adelaida; Gorin, Joanna S.; Gray, Shelley
2013-01-01
Purpose: This study examined the development and validation of a criterion-referenced Spanish-English Language Proficiency Scale (SELPS) that was designed to assess the oral language skills of sequential bilingual children ages 4-8. This article reports results for the English proficiency portion of the scale. Method: The SELPS assesses syntactic…
A Note on the Incremental Validity of Aggregate Predictors.
ERIC Educational Resources Information Center
Day, H. D.; Marshall, David
Three computer simulations were conducted to show that very high aggregate predictive validity coefficients can occur when the across-case variability in absolute score stability occurring in both the predictor and criterion matrices is quite small. In light of the increase in internal consistency reliability achieved by the method of aggregation…
ERIC Educational Resources Information Center
Furey, William M.; Marcotte, Amanda M.; Hintze, John M.; Shackett, Caroline M.
2016-01-01
The study presents a critical analysis of written expression curriculum-based measurement (WE-CBM) metrics derived from 3- and 10-min test lengths. Criterion validity and classification accuracy were examined for Total Words Written (TWW), Correct Writing Sequences (CWS), Percent Correct Writing Sequences (%CWS), and Correct Minus Incorrect…
ERIC Educational Resources Information Center
O'Hare, Thomas; Shen, Ce; Sherrer, Margaret
2007-01-01
Objective: Interview data collected from 275 clients with severe mental illnesses are used to test the construct and criterion validity of the Posttraumatic Stress Disorder Symptom Scale (PSS). Method: First, exploratory and confirmatory factor analyses are used to test whether the scale reflects the posttraumatic stress disorder (PTSD) symptom…
An Evaluation of the Psychache Scale on an Offender Population
ERIC Educational Resources Information Center
Mills, Jeremy F.; Green, Kate; Reddon, John R.
2005-01-01
This study examined the generalizability of a self-report measure of psychache to an offender population. The factor structure, construct validity, and criterion validity of the Psychache Scale was assessed on 136 male prison inmates. The results showed the Psychache Scale has a single underlying factor structure and to be strongly associated with…
ERIC Educational Resources Information Center
Owens, Julie Sarno; Storer, Jennifer; Holdaway, Alex S.; Serrano, Verenea J.; Watabe, Yuko; Himawan, Lina K.; Krelko, Rebecca E.; Vause, Katherine J.; Girio-Herrera, Erin; Andrews, Nina
2015-01-01
The current study examined the utility and incremental validity of parent ratings on the Strengths and Difficulties Questionnaire and Disruptive Behavior Disorders rating scale completed at kindergarten registration in identifying risk status as defined by important criterion variables (teacher ratings, daily behavioral performance, and quarterly…
Targeting Low Career Confidence Using the Career Planning Confidence Scale
ERIC Educational Resources Information Center
McAuliffe, Garrett; Jurgens, Jill C.; Pickering, Worth; Calliotte, James; Macera, Anthony; Zerwas, Steven
2006-01-01
The authors describe the development and validation of a test of career planning confidence that makes possible the targeting of specific problem issues in employment counseling. The scale, developed using a rational process and the authors' experience with clients, was tested for criterion-related validity against 2 other measures. The scale…
Investigation of the Lollipop Test as a Pre-Kindergarten Screening Instrument.
ERIC Educational Resources Information Center
Chew, Alex L.; Morris, John D.
1987-01-01
The validity of the Lollipop Test: A Diagnostic Screening Test of School Readiness was examined for 129 pre-kindergarten subjects using the Developmental Indicator for the Assessment of Learning as the criterion. Concurrent validity was demonstrated across the test batteries. The Lollipop Test appears to be an attractive alternative…
Shin, Marlena H; Sullivan, Jennifer L; Rosen, Amy K; Solomon, Jeffrey L; Dunn, Edward J; Shimada, Stephanie L; Hayes, Jennifer; Rivard, Peter E
2014-12-01
Increasing use of Agency for Healthcare Research and Quality's Patient Safety Indicators (PSIs) for hospital performance measurement intensifies the need to critically assess their validity. Our study examined the extent to which variation in PSI composite score is related to differences in hospital organizational structures or processes (i.e., criterion validity). In site visits to three Veterans Health Administration hospitals with high and three with low PSI composite scores ("low performers" and "high performers," respectively), we interviewed a cross-section of hospital staff. We then coded interview transcripts for evidence in 13 safety-related domains and assessed variation across high and low performers. Evidence of leadership and coordination of work/communication (organizational process domains) was predominantly favorable for high performers only. Evidence in the other domains was either mixed, or there were insufficient data to rate the domains. While we found some evidence of criterion validity, the extent to which variation in PSI rates is related to differences in hospitals' organizational structures/processes needs further study. © The Author(s) 2014.
Dillon, Frank R.; Félix-Ortiz, Maria; Rice, Christopher; De La Rosa, Mario; Rojas, Patria; Duan, Rui
2009-01-01
The psychometric properties of the Multidimensional Measure of Cultural Identity Scales for Latinos (MMCISL; Félix-Ortiz, Newcomb, & Myers, 1994) have never been examined in an adult Latina sample representing various levels of nativity and nationality. The rationale for the study was to confirm the factor structure and psychometric properties of the MMCISL with a predominantly immigrant sample of Latina mothers and daughters (n = 316). Adequate reliability estimates were found for 6 of the original 10 scales. Confirmatory factor analyses provided evidence of construct validity for the reliable scales. The Preferred Latino Affiliation scale was the only scale to meet strict measurement invariance criteria across mothers and daughters. Criterion validity was evidenced by relations between the Familiarity with Latino Culture scale and all criterion variables. Implications for acculturation and cultural identity research involving the MMCISL are discussed. PMID:19364206
The psychometric properties of the WHOQOL-BREF in Japanese couples
Sun, Yi; Sugawara, Masumi; Matsumoto, Satoko; Sakai, Atsushi; Takaoka, Junko; Goto, Noriko
2015-01-01
This study investigated the psychometric properties of the Japanese version of the WHOQOL-BREF among 10,693 community-based married Japanese men and women (4376 couples) who were either expecting or raising a child. Analyses of item-response distributions, internal consistency, criterion validity, and discriminant validity indicated that the scale had acceptable reliability and performed well in preliminary tests of validity. Furthermore, dyadic confirmatory factor analysis revealed that the theoretical factor structure was valid and similar across partners, suggesting that men and women define and value quality of life in a similar way. PMID:28070365
ERIC Educational Resources Information Center
Shriver, Edgar L.; Foley, John P., Jr.
A battery of criterion referenced Job Task Performance Tests (JTPT) was developed because paper and pencil tests of job knowledge and electronic theory had very poor criterion-related or empirical validity with respect to the ability of electronic maintenance men to perform their job. Although the original JTPT required the use of actual…
Moschella, Melissa
2016-01-01
This article explains the problems with Alan Shewmon’s critique of brain death as a valid sign of human death, beginning with a critical examination of his analogy between brain death and severe spinal cord injury. The article then goes on to assess his broader argument against the necessity of the brain for adult human organismal integration, arguing that he fails to translate correctly from biological to metaphysical claims. Finally, on the basis of a deeper metaphysical analysis, I offer a revised rationale for the validity of the neurological criterion of human death. PMID:27095749
Guirao-Goris, Silamani J; Ferrer Ferrandis, Esperanza; Montejano Lozoya, Raimunda
2016-02-18
The aim of the study is to identify the construct and criterion validity of the nursing diagnosis label Sedentary Lifestyle. A cross-sectional study was conducted in a primary health care nursing consultation. Participants were all people over 50 years of age who were seen during a one-year period and volunteered to take part in the study (n=85). Objective weekly physical activity was measured in METs with an accelerometer, and objective physical performance was measured by gait speed from the EPESE battery; both measures were used as the gold standard. The RAPA physical activity questionnaire and the COOP-WONCA physical fitness chart were also used. Spearman correlation coefficients, mean comparison tests and analysis of sensitivity and specificity were used for the statistical analysis. The diagnosis "Sedentary Lifestyle" showed a positive correlation between its manifestations and physical activity measured in METs (r=0.39) and EPESE gait speed (r=0.35). The diagnosis showed a sensitivity of 85.1% and a specificity of 65.2%, and it discriminated active from inactive people when METs were used as the measure of physical activity (t=-4.4). The diagnosis "Sedentary Lifestyle" shows criterion and construct validity.
[Criterion Validity of the German Version of the CES-D in the General Population].
Jahn, Rebecca; Baumgartner, Josef S; van den Nest, Miriam; Friedrich, Fabian; Alexandrowicz, Rainer W; Wancata, Johannes
2018-04-17
The "Center of Epidemiologic Studies - Depression scale" (CES-D) is a well-known screening tool for depression. Until now the criterion validity of the German version of the CES-D had not been investigated in a sample of the adult general population. 508 study participants of the Austrian general population completed the CES-D. ICD-10 diagnoses were established by using the Schedules for Clinical Assessment in Neuropsychiatry (SCAN). Receiver Operating Characteristics (ROC) analysis was conducted. Possible gender differences were explored. Overall discriminating performance of the CES-D was sufficient (ROC-AUC 0.836). Using the traditional cut-off values of 15/16 and 21/22, the sensitivity was 43.2% and 32.4%, respectively. The cut-off value developed on the basis of our sample was 9/10, with a sensitivity of 81.1% and a specificity of 74.3%. There were no significant gender differences. This is the first study investigating the criterion validity of the German version of the CES-D in the general population. The optimal cut-off values yielded sufficient sensitivity and specificity, comparable to the values of other screening tools. © Georg Thieme Verlag KG Stuttgart · New York.
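The trade-off between cut-off value, sensitivity and specificity reported above can be sketched in a few lines of Python. Two assumptions are made here that the abstract does not confirm: a cut-off written as "15/16" is read as "scores of 16 or more screen positive", and the optimal cut-off is chosen by Youden's J (sensitivity + specificity − 1). The function names and data are invented for illustration.

```python
def sensitivity_specificity(scores, has_diagnosis, cutoff):
    """Screen positive when score >= cutoff; compare with the reference diagnosis."""
    tp = sum(1 for s, d in zip(scores, has_diagnosis) if d and s >= cutoff)
    fn = sum(1 for s, d in zip(scores, has_diagnosis) if d and s < cutoff)
    tn = sum(1 for s, d in zip(scores, has_diagnosis) if not d and s < cutoff)
    fp = sum(1 for s, d in zip(scores, has_diagnosis) if not d and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def best_cutoff_by_youden(scores, has_diagnosis, candidates):
    """Pick the candidate cut-off maximising sensitivity + specificity - 1
    (an assumed selection rule, not necessarily the one used in the study)."""
    def youden(c):
        sens, spec = sensitivity_specificity(scores, has_diagnosis, c)
        return sens + spec - 1
    return max(candidates, key=youden)


# Made-up screening scores and reference diagnoses (True = diagnosis present).
scores = [4, 8, 12, 18, 25, 11, 30, 7]
has_diagnosis = [False, False, False, False, True, True, True, False]

for cutoff in (10, 16):
    print(cutoff, sensitivity_specificity(scores, has_diagnosis, cutoff))
print(best_cutoff_by_youden(scores, has_diagnosis, range(5, 31)))
```

In the toy data, lowering the cut-off from 16 to 10 raises sensitivity and lowers specificity, which is the same direction of trade-off the abstract describes.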
Forney, K Jean; Bodell, Lindsay P; Haedt-Matt, Alissa A; Keel, Pamela K
2016-07-01
Of the two primary features of binge eating, loss of control (LOC) eating is well validated while the role of eating episode size is less clear. Given the ICD-11 proposal to eliminate episode size from the binge-eating definition, the present study examined the incremental validity of the size criterion, controlling for LOC. Interview and questionnaire data come from four studies of 243 women with bulimia nervosa (n = 141) or purging disorder (n = 102). Hierarchical linear regression tested if the largest reported episode size, coded in kilocalories, explained additional variance in eating disorder features, psychopathology, personality traits, and impairment, holding constant LOC eating frequency, age, and body mass index (BMI). Analyses also tested if episode size moderated the association between LOC eating and these variables. Holding LOC constant, episode size explained significant variance in disinhibition, trait anxiety, and eating disorder-related impairment. Episode size moderated the association of LOC eating with purging frequency and depressive symptoms, such that in the presence of larger eating episodes, LOC eating was more closely associated with these features. Neither episode size nor its interaction with LOC explained additional variance in BMI, hunger, restraint, shape concerns, state anxiety, negative urgency, or global functioning. Taken together, results support the incremental validity of the size criterion, in addition to and in combination with LOC eating, for defining binge-eating episodes in purging syndromes. Future research should examine the predictive validity of episode size in both purging and nonpurging eating disorders (e.g., binge eating disorder) to inform nosological schemes. © 2016 Wiley Periodicals, Inc. (Int J Eat Disord 2016; 49:651-662).
Scholes, Shaun; Coombs, Ngaire; Pedisic, Zeljko; Mindell, Jennifer S; Bauman, Adrian; Rowlands, Alex V; Stamatakis, Emmanuel
2014-06-15
The criterion validity of the 2008 Physical Activity and Sedentary Behavior Assessment Questionnaire (PASBAQ) was examined in a nationally representative sample of 2,175 persons aged ≥16 years in England using accelerometry. Using accelerometer minutes/day greater than or equal to 200 counts as a criterion, Spearman's correlation coefficient (ρ) for PASBAQ-assessed total activity was 0.30 (95% confidence interval (CI): 0.25, 0.35) in women and 0.20 (95% CI: 0.15, 0.26) in men. Correlations between accelerometer counts/minute of wear time and questionnaire-assessed relative energy expenditure (metabolic equivalent-minutes/day) were higher in women (ρ = 0.41, 95% CI: 0.36, 0.46) than in men (ρ = 0.32, 95% CI: 0.26, 0.38). Similar correlations were observed for minutes/day spent in vigorous activity (women: ρ = 0.39, 95% CI: 0.33, 0.46; men: ρ = 0.31, 95% CI: 0.26, 0.36) and moderate-to-vigorous activity (women: ρ = 0.42, 95% CI: 0.36, 0.48; men: ρ = 0.38, 95% CI: 0.32, 0.45). Correlations for time spent being sedentary (<100 counts/minute) were 0.30 (95% CI: 0.24, 0.35) and 0.25 (95% CI: 0.19, 0.30) in women and men, respectively. Sedentary behavior correlations showed no sex difference. The validity of sedentary behavior and total physical activity was higher in older age groups, but validity was higher in younger persons for vigorous-intensity activity. The PASBAQ is a useful and valid instrument for ranking individuals according to levels of physical activity and sedentary behavior. © The Author 2014. Published by Oxford University Press on behalf of the Johns Hopkins Bloomberg School of Public Health.
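The rank correlations reported above (Spearman's ρ between questionnaire and accelerometer measures) can be illustrated with a short Python sketch. It computes ρ as the Pearson correlation of average ranks; the confidence intervals in the abstract are not reproduced, since the abstract does not say how they were derived. All names and numbers below are hypothetical.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman's rho = Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up minutes/day of activity: self-report versus accelerometer.
questionnaire = [30, 45, 20, 60, 50]
accelerometer = [25, 40, 35, 70, 45]
rho = spearman_rho(questionnaire, accelerometer)
print(round(rho, 2))
```

Because ρ depends only on ranks, it suits the abstract's stated use of the questionnaire for ranking individuals rather than for estimating absolute activity levels.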
van Bokhorst-de van der Schueren, Marian A E; Guaitoli, Patrícia Realino; Jansma, Elise P; de Vet, Henrica C W
2014-02-01
Numerous nutrition screening tools for the hospital setting have been developed. The aim of this systematic review is to study construct or criterion validity and predictive validity of nutrition screening tools for the general hospital setting. A systematic review of English, French, German, Spanish, Portuguese and Dutch articles identified via MEDLINE, Cinahl and EMBASE (from inception to the 2nd of February 2012). Additional studies were identified by checking reference lists of identified manuscripts. Search terms included key words for malnutrition, screening or assessment instruments, and terms for hospital setting and adults. Data were extracted independently by 2 authors. Only studies expressing the (construct, criterion or predictive) validity of a tool were included. 83 studies (32 screening tools) were identified: 42 studies on construct or criterion validity versus a reference method and 51 studies on predictive validity on outcome (i.e. length of stay, mortality or complications). None of the tools performed consistently well to establish the patients' nutritional status. For the elderly, MNA performed fair to good, for the adults MUST performed fair to good. SGA, NRS-2002 and MUST performed well in predicting outcome in approximately half of the studies reviewed in adults, but not in older patients. Not one single screening or assessment tool is capable of adequate nutrition screening as well as predicting poor nutrition related outcome. Development of new tools seems redundant and will most probably not lead to new insights. New studies comparing different tools within one patient population are required. Copyright © 2013 Elsevier Ltd and European Society for Clinical Nutrition and Metabolism. All rights reserved.
Scholes, Shaun; Coombs, Ngaire; Pedisic, Zeljko; Mindell, Jennifer S.; Bauman, Adrian; Rowlands, Alex V.; Stamatakis, Emmanuel
2014-01-01
The criterion validity of the 2008 Physical Activity and Sedentary Behavior Assessment Questionnaire (PASBAQ) was examined in a nationally representative sample of 2,175 persons aged ≥16 years in England using accelerometry. Using accelerometer minutes/day greater than or equal to 200 counts as a criterion, Spearman's correlation coefficient (ρ) for PASBAQ-assessed total activity was 0.30 (95% confidence interval (CI): 0.25, 0.35) in women and 0.20 (95% CI: 0.15, 0.26) in men. Correlations between accelerometer counts/minute of wear time and questionnaire-assessed relative energy expenditure (metabolic equivalent-minutes/day) were higher in women (ρ = 0.41, 95% CI: 0.36, 0.46) than in men (ρ = 0.32, 95% CI: 0.26, 0.38). Similar correlations were observed for minutes/day spent in vigorous activity (women: ρ = 0.39, 95% CI: 0.33, 0.46; men: ρ = 0.31, 95% CI: 0.26, 0.36) and moderate-to-vigorous activity (women: ρ = 0.42, 95% CI: 0.36, 0.48; men: ρ = 0.38, 95% CI: 0.32, 0.45). Correlations for time spent being sedentary (<100 counts/minute) were 0.30 (95% CI: 0.24, 0.35) and 0.25 (95% CI: 0.19, 0.30) in women and men, respectively. Sedentary behavior correlations showed no sex difference. The validity of sedentary behavior and total physical activity was higher in older age groups, but validity was higher in younger persons for vigorous-intensity activity. The PASBAQ is a useful and valid instrument for ranking individuals according to levels of physical activity and sedentary behavior. PMID:24863551
Reliability and validity in a nutshell.
Bannigan, Katrina; Watson, Roger
2009-12-01
To explore and explain the different concepts of reliability and validity as they are related to measurement instruments in social science and health care. There are different concepts contained in the terms reliability and validity and these are often explained poorly and there is often confusion between them. To develop some clarity about reliability and validity a conceptual framework was built based on the existing literature. The concepts of reliability, validity and utility are explored and explained. Reliability contains the concepts of internal consistency and stability and equivalence. Validity contains the concepts of content, face, criterion, concurrent, predictive, construct, convergent (and divergent), factorial and discriminant. In addition, for clinical practice and research, it is essential to establish the utility of a measurement instrument. To use measurement instruments appropriately in clinical practice, the extent to which they are reliable, valid and usable must be established.
Grzybowska, Magdalena Emilia; Piaskowska-Cala, Justyna; Wydra, Dariusz Grzegorz
2017-12-29
The aim of the study was to translate into Polish the Pelvic Organ Prolapse/Incontinence Sexual Questionnaire, IUGA-Revised (PISQ-IR), which evaluates sexual function in sexually active (SA) and not SA (NSA) women with pelvic floor disorders (PFD), and to validate the Polish version. After translation, back-translation and cognitive interviews, the final version of PISQ-IR was established. The study group included 252 women with PFD (124 NSA and 128 SA). All women underwent clinical evaluation and completed the PISQ-IR. For test-retest reliability, the questionnaire was administered to 99 patients twice at an interval of 2 weeks. The analysis of criterion validity required the subjects to complete self-reported measures. Internal consistency and criterion validity were assessed separately for NSA and SA women for the PISQ-IR subscales. The mean age of the women was 60.9 ± 10.6 years and their mean BMI was 27.9 ± 4.9 kg/m². Postmenopausal women constituted 82.5% of the study group. Urinary incontinence (UI) was diagnosed in 60 women (23.8%), pelvic organ prolapse (POP) in 90 (35.7%), and UI and POP in 102 (40.5%). Fecal incontinence was reported by 45 women (17.9%). The PISQ-IR Polish version proved to have good internal consistency in NSA women (α 0.651 to 0.857) and SA women (α 0.605 to 0.887), and strong reliability in all subscales (Pearson's coefficient 0.759-0.899; p < 0.001). Criterion validity confirmed moderate to strong correlations between PISQ-IR scores and self-reported measures in SA subscales, as well as the SA summary score, and weak to moderate correlations in NSA women. The PISQ-IR Polish version is a valid tool for evaluating sexual function in women with PFD.
Dunleavy, Kim; Neil, Joseph; Tallon, Allison; Adamo, Diane E
2015-09-01
The cervical range of motion device (CROM) has been shown to provide reliable forward head position (FHP) measurement when the upper cervical angle (UCA) is controlled. However, measurement without UCA standardization is reflective of habitual patterns. Criterion validity has not been reported. The purposes of this study were to establish: (1) criterion validity of CROM FHP and UCA compared to Optotrak data, (2) relative reliability and minimal detectable change (MDC95) in patients with and without cervical pain, and (3) to compare UCA and FHP in patients with and without pain in habitual postures. (1) Within-subjects single session concurrent criterion validity design. Simultaneous CROM and OP measurement was conducted in habitual sitting posture in 16 healthy young adults. (2) Reliability and MDC95 of UCA and FHP were calculated from three trials. (3) Values for adults over 35 years with cervical pain and age-matched healthy controls were compared. (1) Forward head position distances were moderately correlated and UCA angles were highly correlated. The mean (standard deviation) differences can be expected to vary between 1·48 cm (1·74) for FHP and -1·7 (2·46)° for UCA. (2) Reliability for CROM FHP measurements was good to excellent (no pain) and moderate (pain). Cervical range of motion FHP MDC95 was moderately low (no pain), and moderate (pain). Reliability for CROM UCA measurements was excellent and MDC95 low for both groups. There was no difference in FHP distances between the pain and no pain groups, but UCA was significantly more extended in the pain group (P<0·05). Cervical range of motion FHP measurements were only moderately correlated with Optotrak data, and limits of agreement (LOA) and MDC95 were relatively large. There was also no difference in CROM FHP distance between older symptomatic and asymptomatic individuals. Cervical range of motion FHP measurement is therefore not recommended as a clinical outcome measure.
Cervical range of motion UCA measurements showed good criterion validity, excellent test-retest reliability, and achievable MDC95 in asymptomatic and symptomatic participants. Differences of more than 6° are required to exceed error. Cervical range of motion UCA shows promise as a useful reliable and valid measurement, particularly as patients with cervical pain exhibited significantly more extended angles.
Neil, Joseph; Tallon, Allison; Adamo, Diane E.
2015-01-01
Objectives The cervical range of motion device (CROM) has been shown to provide reliable forward head position (FHP) measurement when the upper cervical angle (UCA) is controlled. However, measurement without UCA standardization is reflective of habitual patterns. Criterion validity has not been reported. The purposes of this study were to establish: (1) criterion validity of CROM FHP and UCA compared to Optotrak data, (2) relative reliability and minimal detectable change (MDC95) in patients with and without cervical pain, and (3) to compare UCA and FHP in patients with and without pain in habitual postures. Methods (1) Within-subjects single session concurrent criterion validity design. Simultaneous CROM and OP measurement was conducted in habitual sitting posture in 16 healthy young adults. (2) Reliability and MDC95 of UCA and FHP were calculated from three trials. (3) Values for adults over 35 years with cervical pain and age-matched healthy controls were compared. Results (1) Forward head position distances were moderately correlated and UCA angles were highly correlated. The mean (standard deviation) differences can be expected to vary between 1·48 cm (1·74) for FHP and −1·7 (2·46)° for UCA. (2) Reliability for CROM FHP measurements was good to excellent (no pain) and moderate (pain). Cervical range of motion FHP MDC95 was moderately low (no pain), and moderate (pain). Reliability for CROM UCA measurements was excellent and MDC95 low for both groups. There was no difference in FHP distances between the pain and no pain groups, but UCA was significantly more extended in the pain group (P<0·05). Discussion Cervical range of motion FHP measurements were only moderately correlated with Optotrak data, and limits of agreement (LOA) and MDC95 were relatively large. There was also no difference in CROM FHP distance between older symptomatic and asymptomatic individuals. Cervical range of motion FHP measurement is therefore not recommended as a clinical outcome measure.
Cervical range of motion UCA measurements showed good criterion validity, excellent test–retest reliability, and achievable MDC95 in asymptomatic and symptomatic participants. Differences of more than 6° are required to exceed error. Cervical range of motion UCA shows promise as a useful reliable and valid measurement, particularly as patients with cervical pain exhibited significantly more extended angles. PMID:26917936
A New Criterion for Prediction of Hot Tearing Susceptibility of Cast Alloys
NASA Astrophysics Data System (ADS)
Nasresfahani, Mohamad Reza; Niroumand, Behzad
2014-08-01
A new criterion for prediction of hot tearing susceptibility of cast alloys is suggested which takes into account the effects of both important mechanical and metallurgical factors and is believed to be less sensitive to the presence of volume defects such as bifilms and inclusions. The criterion was validated by studying the hot tearing tendency of Al-Cu alloy. In conformity with the experimental results, the new criterion predicted reduction of hot tearing tendency with increasing the copper content.
Assessing Sleep Disturbance in Low Back Pain: The Validity of Portable Instruments
Alsaadi, Saad M.; McAuley, James H.; Hush, Julia M.; Bartlett, Delwyn J.; McKeough, Zoe M.; Grunstein, Ronald R.; Dungan, George C.; Maher, Chris G.
2014-01-01
Although portable instruments have been used in the assessment of sleep disturbance for patients with low back pain (LBP), the accuracy of the instruments in detecting sleep/wake episodes for this population is unknown. This study investigated the criterion validity of two portable instruments (Armband and Actiwatch) for assessing sleep disturbance in patients with LBP. 50 patients with LBP performed simultaneous overnight sleep recordings in a university sleep laboratory. All 50 participants were assessed by Polysomnography (PSG) and the Armband and a subgroup of 33 participants wore an Actiwatch. Criterion validity was determined by calculating epoch-by-epoch agreement, sensitivity, specificity and prevalence- and bias-adjusted kappa (PABAK) for sleep versus wake between each instrument and PSG. The relationship between PSG and the two instruments was assessed using intraclass correlation coefficients (ICC 2,1). The study participants showed symptoms of sub-threshold insomnia (mean ISI = 13.2, 95% CI = 6.36) and poor sleep quality (mean PSQI = 9.20, 95% CI = 4.27). Observed agreement with PSG was 85% and 88% for the Armband and Actiwatch. Sensitivity was 0.90 for both instruments and specificity was 0.54 and 0.67 and PABAK of 0.69 and 0.77 for the Armband and Actiwatch respectively. The ICC (95%CI) was 0.76 (0.61 to 0.86) and 0.80 (0.46 to 0.92) for total sleep time, 0.52 (0.29 to 0.70) and 0.55 (0.14 to 0.77) for sleep efficiency, 0.64 (0.45 to 0.78) and 0.52 (0.23 to 0.73) for wake after sleep onset and 0.13 (−0.15 to 0.39) and 0.33 (−0.05 to 0.63) for sleep onset latency, for the Armband and Actiwatch, respectively. The findings showed that both instruments have varied criterion validity across the sleep parameters from excellent validity for measures of total sleep time, good validity for measures of sleep efficiency and wake after onset to poor validity for sleep onset latency. PMID:24763506
Assessing sleep disturbance in low back pain: the validity of portable instruments.
Alsaadi, Saad M; McAuley, James H; Hush, Julia M; Bartlett, Delwyn J; McKeough, Zoe M; Grunstein, Ronald R; Dungan, George C; Maher, Chris G
2014-01-01
Although portable instruments have been used in the assessment of sleep disturbance for patients with low back pain (LBP), the accuracy of the instruments in detecting sleep/wake episodes for this population is unknown. This study investigated the criterion validity of two portable instruments (Armband and Actiwatch) for assessing sleep disturbance in patients with LBP. 50 patients with LBP performed simultaneous overnight sleep recordings in a university sleep laboratory. All 50 participants were assessed by Polysomnography (PSG) and the Armband and a subgroup of 33 participants wore an Actiwatch. Criterion validity was determined by calculating epoch-by-epoch agreement, sensitivity, specificity and prevalence- and bias-adjusted kappa (PABAK) for sleep versus wake between each instrument and PSG. The relationship between PSG and the two instruments was assessed using intraclass correlation coefficients (ICC 2,1). The study participants showed symptoms of sub-threshold insomnia (mean ISI = 13.2, 95% CI = 6.36) and poor sleep quality (mean PSQI = 9.20, 95% CI = 4.27). Observed agreement with PSG was 85% and 88% for the Armband and Actiwatch. Sensitivity was 0.90 for both instruments and specificity was 0.54 and 0.67 and PABAK of 0.69 and 0.77 for the Armband and Actiwatch respectively. The ICC (95%CI) was 0.76 (0.61 to 0.86) and 0.80 (0.46 to 0.92) for total sleep time, 0.52 (0.29 to 0.70) and 0.55 (0.14 to 0.77) for sleep efficiency, 0.64 (0.45 to 0.78) and 0.52 (0.23 to 0.73) for wake after sleep onset and 0.13 (-0.15 to 0.39) and 0.33 (-0.05 to 0.63) for sleep onset latency, for the Armband and Actiwatch, respectively. The findings showed that both instruments have varied criterion validity across the sleep parameters from excellent validity for measures of total sleep time, good validity for measures of sleep efficiency and wake after onset to poor validity for sleep onset latency.
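The epoch-by-epoch statistics named in this abstract can be sketched as follows. This is a minimal illustration, not the study's analysis code: the function name `epoch_agreement`, the coding of epochs as 1 = sleep and 0 = wake, and the choice of sleep as the positive class are all assumptions made for the example.

```python
def epoch_agreement(reference, device):
    """Epoch-by-epoch agreement between a reference method and a device.

    Both arguments are equal-length sequences of labels, assumed here to be
    coded 1 = sleep and 0 = wake, with sleep treated as the positive class.
    """
    if len(reference) != len(device) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    pairs = list(zip(reference, device))
    tp = sum(1 for r, d in pairs if r == 1 and d == 1)  # sleep scored as sleep
    tn = sum(1 for r, d in pairs if r == 0 and d == 0)  # wake scored as wake
    fp = sum(1 for r, d in pairs if r == 0 and d == 1)  # wake scored as sleep
    fn = sum(1 for r, d in pairs if r == 1 and d == 0)  # sleep scored as wake
    observed = (tp + tn) / len(pairs)
    return {
        "agreement": observed,
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
        # For two categories, PABAK reduces to 2 * observed agreement - 1.
        "pabak": 2 * observed - 1,
    }
```

Because PABAK for a two-category rating depends only on observed agreement, an agreement of 0.85 corresponds to a PABAK of about 0.70, which is close to the 0.69 reported for the Armband given that the agreement figure is rounded.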
Validity of the Digital Inclinometer and iPhone When Measuring Thoracic Spine Rotation.
Bucke, Jonathan; Spencer, Simon; Fawcett, Louise; Sonvico, Lawrence; Rushton, Alison; Heneghan, Nicola R
2017-09-01
Spinal axial rotation is required for many functional and sporting activities. Eighty percent of axial rotation occurs in the thoracic spine. Existing measures of thoracic spine rotation commonly involve laboratory equipment, use a seated position, and include lumbar motion. A simple performance-based outcome measure would allow clinicians to evaluate isolated thoracic spine rotation. Currently, no valid measure exists. To explore the criterion and concurrent validity of a digital inclinometer (DI) and iPhone Clinometer app (iPhone) for measuring thoracic spine rotation using the heel-sit position. Controlled laboratory study. University laboratory. A total of 23 asymptomatic healthy participants (14 men, 9 women; age = 25.82 ± 4.28 years, height = 170.26 ± 8.01 cm, mass = 67.50 ± 9.46 kg, body mass index = 23.26 ± 2.79) were recruited from a student population. We took DI and iPhone measurements of thoracic spine rotation in the heel-sit position concurrently with dual-motion analysis (laboratory measure) and ultrasound imaging of the underlying bony tissue motion (reference standard). To determine the criterion and concurrent validity, we used the Pearson product moment correlation coefficient (r, 2 tailed) and Bland-Altman plots. The DI (r = 0.88, P < .001) and iPhone (r = 0.88, P < .001) demonstrated strong criterion validity. Both also had strong concurrent validity (r = 0.98, P < .001). Bland-Altman plots illustrated mean differences of 5.82° (95% confidence interval [CI] = 20.37°, -8.73°) and 4.94° (95% CI = 19.23°, -9.35°) between the DI and iPhone, respectively, and the reference standard and 0.87° (95% CI = 6.79°, -5.05°) between the DI and iPhone. The DI and iPhone provided valid measures of thoracic spine rotation in the heel-sit position. Both can be used in clinical practice to assess thoracic spine rotation, which may be valuable when evaluating thoracic dysfunction.
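A Bland-Altman comparison of the kind mentioned in this abstract can be sketched as below. The function name `bland_altman` and the 1.96 multiplier for 95% limits of agreement are assumptions of this example; the abstract does not state how its intervals were computed, so this shows the conventional calculation rather than the authors' procedure.

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements.

    bias is the mean of the paired differences (method_a - method_b); the
    limits are bias -/+ 1.96 sample standard deviations of the differences.
    """
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need at least two paired measurements")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias, bias - spread, bias + spread
```

A correlation coefficient describes how closely two methods rank participants, whereas the bias and limits of agreement describe how far individual readings can differ, which is why validity studies of this kind commonly report both.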
Al Ansari, Ahmed; Donnon, Tyrone; Al Khalifa, Khalid; Darwish, Abdulla; Violato, Claudio
2014-01-01
Background The purpose of this study was to conduct a meta-analysis on the construct and criterion validity of multi-source feedback (MSF) to assess physicians and surgeons in practice. Methods In this study, we followed the guidelines for the reporting of observational studies included in a meta-analysis. In addition to PubMed and MEDLINE databases, the CINAHL, EMBASE, and PsycINFO databases were searched from January 1975 to November 2012. All articles listed in the references of the MSF studies were reviewed to ensure that all relevant publications were identified. All 35 articles were independently coded by two authors (AA, TD), and any discrepancies (eg, effect size calculations) were reviewed by the other authors (KA, AD, CV). Results Physician/surgeon performance measures from 35 studies were identified. A random-effects model of weighted mean effect size differences (d) resulted in: construct validity coefficients for the MSF system on physician/surgeon performance across different levels in practice ranged from d=0.14 (95% confidence interval [CI] 0.40–0.69) to d=1.78 (95% CI 1.20–2.30); construct validity coefficients for the MSF on physician/surgeon performance on two different occasions ranged from d=0.23 (95% CI 0.13–0.33) to d=0.90 (95% CI 0.74–1.10); concurrent validity coefficients for the MSF based on differences in assessor group ratings ranged from d=0.50 (95% CI 0.47–0.52) to d=0.57 (95% CI 0.55–0.60); and predictive validity coefficients for the MSF on physician/surgeon performance across different standardized measures ranged from d=1.28 (95% CI 1.16–1.41) to d=1.43 (95% CI 0.87–2.00). Conclusion The construct and criterion validity of the MSF system is supported by small to large effect size differences based on the MSF process and physician/surgeon performance across different clinical and nonclinical domain measures. PMID:24600300
Dobbin, Nick; Hunwicks, Richard; Jones, Ben; Till, Kevin; Highton, Jamie; Twist, Craig
2018-02-01
To examine the criterion and construct validity of an isometric midthigh-pull dynamometer to assess whole-body strength in professional rugby league players. Fifty-six male rugby league players (33 senior and 23 youth players) performed 4 isometric midthigh-pull efforts (ie, 2 on the dynamometer and 2 on the force platform) in a randomized and counterbalanced order. Isometric peak force was underestimated (P < .05) using the dynamometer compared with the force platform (95% LoA: -213.5 ± 342.6 N). Linear regression showed that peak force derived from the dynamometer explained 85% (adjusted R² = .85, SEE = 173 N) of the variance in the dependent variable, with the following prediction equation derived: predicted peak force = [1.046 × dynamometer peak force] + 117.594. Cross-validation revealed a nonsignificant bias (P > .05) between the predicted and peak force from the force platform and an adjusted R² (79.6%) that represented shrinkage of 0.4% relative to the cross-validation model (80%). Peak force was greater for the senior than the youth professionals using the dynamometer (2261.2 ± 222 cf 1725.1 ± 298.0 N, respectively; P < .05). The isometric midthigh pull assessed using a dynamometer underestimates criterion peak force but is capable of distinguishing muscle-function characteristics between professional rugby league players of different standards.
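The prediction equation quoted in this abstract can be applied as in the sketch below. The slope and intercept are taken from the abstract; the function and constant names are hypothetical, and the assumption that both forces are expressed in newtons follows from the units reported in the abstract.

```python
SLOPE = 1.046          # regression slope reported in the abstract
INTERCEPT_N = 117.594  # regression intercept reported in the abstract (N)


def predict_platform_peak_force(dynamometer_peak_force_n):
    """Estimate force-platform peak force (N) from a dynamometer reading (N)."""
    return SLOPE * dynamometer_peak_force_n + INTERCEPT_N
```

For a dynamometer reading of 2000 N the equation gives about 2209.6 N, which illustrates the direction of the correction: the dynamometer reading is adjusted upward toward the force-platform criterion.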
Home Healthcare Nurses' Job Satisfaction Scale: refinement and psychometric testing.
Ellenbecker, Carol H; Byleckie, James J
2005-10-01
This paper describes a study to further develop and test the psychometric properties of the Home Healthcare Nurses' Job Satisfaction Scale, including reliability and construct and criterion validity. Numerous scales have been developed to measure nurses' job satisfaction. Only one, the Home Healthcare Nurses' Job Satisfaction Scale, has been designed specifically to measure job satisfaction of home healthcare nurses. The Home Healthcare Nurses' Job Satisfaction Scale is based on a theoretical model that integrates the findings of empirical research related to job satisfaction. A convenience sample of 340 home healthcare nurses completed the Home Healthcare Nurses' Job Satisfaction Scale and the Mueller and McCloskey Satisfaction Scale, which was used to test criterion validity. Factor analysis was used for testing and refinement of the theory-based assignment of items to constructs. Reliability was assessed by Cronbach's alpha internal consistency reliability coefficients. The data were collected in 2003. Nine factors contributing to home healthcare nurses' job satisfaction emerged from the factor analysis and were strongly supported by the underlying theory. Factor loadings were all above 0.4. Cronbach's alpha coefficients for each of the nine subscales ranged from 0.64 to 0.83; the alpha for the global scale was 0.89. The correlations between the Home Healthcare Nurses' Job Satisfaction Scale and Mueller and McCloskey Satisfaction Scale was 0.79, indicating good criterion-related validity. The Home Healthcare Nurses' Job Satisfaction Scale has potential as a reliable and valid scale for measurement of job satisfaction of home healthcare nurses.
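The internal-consistency coefficient reported in this abstract is conventionally computed as shown below. This is a generic sketch of Cronbach's alpha, assuming a complete respondents-by-items score table with no missing values; the function name `cronbach_alpha` is illustrative and the code is not the authors' own.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondent rows, one score per item.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of totals)
    """
    k = len(scores[0])
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two respondents")
    if any(len(row) != k for row in scores):
        raise ValueError("every respondent must answer every item")
    item_variances = [variance([row[i] for row in scores]) for i in range(k)]
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)
```

Alpha rises toward 1 as the items covary more strongly, so the subscale values of 0.64 to 0.83 and the global value of 0.89 reported here indicate progressively tighter agreement among the items being summed.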
Rodríguez, Iván; Zambrano, Lysien; Manterola, Carlos
2016-04-01
Physiological parameters used to measure exercise intensity are oxygen uptake and heart rate. However, perceived exertion (PE) is a scale that has also been frequently applied. The objective of this study is to establish the criterion-related validity of PE scales in children during an incremental exercise test. Seven electronic databases were used. Studies aimed at assessing criterion-related validity of PE scales in healthy children during an incremental exercise test were included. Correlation coefficients were transformed into z-values and assessed in a meta-analysis by means of a fixed effects model if I² was below 50% or a random effects model if it was above 50%. Twenty-five articles that studied 1418 children (boys: 49.2%) met the inclusion criteria. Children's average age was 10.5 years old. Exercise modalities included bike, running and stepping exercises. The weighted correlation coefficient was 0.835 (95% confidence interval: 0.762-0.887) and 0.874 (95% confidence interval: 0.794-0.924) for heart rate and oxygen uptake as reference criteria. The production paradigm and scales that had not been adapted to children showed the lowest measurement performance (p < 0.05). Measuring PE could be valid in healthy children during an incremental exercise test. Child-specific rating scales showed a better performance than those that had not been adapted to this population. Further studies with better methodological quality should be conducted in order to confirm these results. Sociedad Argentina de Pediatría.
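The pooling procedure described in this abstract (Fisher z-transformation, then a fixed or random effects model chosen by I²) can be sketched as follows. The function name `pool_correlations`, the inverse-variance weights of n − 3, and the DerSimonian-Laird estimate of between-study variance are assumptions of this example; the abstract does not name the estimator that was used.

```python
import math


def pool_correlations(rs, ns, i2_threshold=50.0):
    """Pool correlations rs from studies with sample sizes ns (each n > 3)."""
    if len(rs) != len(ns) or len(rs) < 2:
        raise ValueError("need matching lists with at least two studies")
    zs = [math.atanh(r) for r in rs]        # Fisher z-transformation
    ws = [n - 3 for n in ns]                # variance of z is 1 / (n - 3)
    sum_w = sum(ws)
    z_pooled = sum(w * z for w, z in zip(ws, zs)) / sum_w
    q = sum(w * (z - z_pooled) ** 2 for w, z in zip(ws, zs))  # Cochran's Q
    df = len(rs) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    model = "fixed"
    if i2 > i2_threshold:
        # Assumed estimator: DerSimonian-Laird between-study variance.
        c = sum_w - sum(w * w for w in ws) / sum_w
        tau2 = max(0.0, (q - df) / c)
        ws = [1 / (1 / w + tau2) for w in ws]
        sum_w = sum(ws)
        z_pooled = sum(w * z for w, z in zip(ws, zs)) / sum_w
        model = "random"
    se = 1 / math.sqrt(sum_w)
    low, high = z_pooled - 1.96 * se, z_pooled + 1.96 * se
    # Back-transform the pooled z and its interval to the correlation scale.
    return {"r": math.tanh(z_pooled), "ci": (math.tanh(low), math.tanh(high)),
            "i2": i2, "model": model}
```

Pooling on the z scale and back-transforming yields intervals that are asymmetric around the pooled correlation, as in the reported 0.835 (0.762-0.887).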
Tsugawa, Yusuke; Ohbu, Sadayoshi; Cruess, Richard; Cruess, Sylvia; Okubo, Tomoya; Takahashi, Osamu; Tokuda, Yasuharu; Heist, Brian S; Bito, Seiji; Itoh, Toshiyuki; Aoki, Akiko; Chiba, Tsutomu; Fukui, Tsuguya
2011-08-01
Despite the growing importance of and interest in medical professionalism, there is no standardized tool for its measurement. The authors sought to verify the validity, reliability, and generalizability of the Professionalism Mini-Evaluation Exercise (P-MEX), a previously developed and tested tool, in the context of Japanese hospitals. A multicenter, cross-sectional evaluation study was performed to investigate the validity, reliability, and generalizability of the P-MEX in seven Japanese hospitals. In 2009-2010, 378 evaluators (attending physicians, nurses, peers, and junior residents) completed 360-degree assessments of 165 residents and fellows using the P-MEX. The content validity and criterion-related validity were examined, and the construct validity of the P-MEX was investigated by performing confirmatory factor analysis through a structural equation model. The reliability was tested using generalizability analysis. The contents of the P-MEX achieved good acceptance in a preliminary working group, and the poststudy survey revealed that 302 (79.9%) evaluators rated the P-MEX items as appropriate, indicating good content validity. The correlation coefficient between P-MEX scores and external criteria was 0.78 (P < .001), demonstrating good criterion-related validity. Confirmatory factor analysis verified high path coefficient (0.60-0.99) and adequate goodness of fit of the model. The generalizability analysis yielded a high dependability coefficient, suggesting good reliability, except when evaluators were peers or junior residents. Findings show evidence of adequate validity, reliability, and generalizability of the P-MEX in Japanese hospital settings. The P-MEX is the only evaluation tool for medical professionalism verified in both a Western and East Asian cultural context.
Charalambous, Andreas; Kaite, Charis; Constantinou, Marianna; Kouta, Christiana
2016-12-02
To translate and validate the Cancer-Related Fatigue (CRF) Scale in the Greek language. A cross-sectional descriptive design was used in order to translate and validate the CRF Scale in Greek. Factor analyses were performed to understand the psychometric properties of the scale and to establish construct, criterion and convergent validity. Outpatients' oncology clinics of two public hospitals in Cyprus. 148 patients with advanced prostate cancer undergoing chemotherapy. The Cancer Fatigue Scale (CFS) had good stability (test-retest reliability r=0.79, p<0.001) and good internal consistency (Cronbach's α coefficient for all 15 items α=0.916). Furthermore, the Kaiser-Meyer-Olkin Measure of Sampling Adequacy (KMO value) was found to be 0.743 and considered to be satisfactory (>0.5). The correlations between the CFS physical scale (CFS-FS scale) and the European Organization for Research and Treatment of Cancer (EORTC) QLQ-C30 physical subscales were found to be significant (r=-0.715). The same occurred between CFS cognitive and EORTC cognitive subscale (r=-0.579). Overall, the criterion validity was verified. The same occurs for the convergent validity of the CFS since all correlations with the Global Health Status (q29-q30) were found to be significant. This is the first validation study of the CRF Scale in Greek and warrants its use in the assessment of cancer-related fatigue in patients with prostate cancer. However, further testing and validation is needed in the early stages of the disease and in patients in later chemotherapy cycles. Published by the BMJ Publishing Group Limited.
Neijenhuijs, Koen I; Jansen, Femke; Aaronson, Neil K; Brédart, Anne; Groenvold, Mogens; Holzner, Bernhard; Terwee, Caroline B; Cuijpers, Pim; Verdonck-de Leeuw, Irma M
2018-05-07
The EORTC IN-PATSAT32 is a patient-reported outcome measure (PROM) to assess cancer patients' satisfaction with in-patient health care. The aim of this study was to investigate whether the initial good measurement properties of the IN-PATSAT32 are confirmed in new studies. Within the scope of a larger systematic review study (Prospero ID 42017057237), a systematic search was performed of Embase, Medline, PsycINFO, and Web of Science for studies that investigated measurement properties of the IN-PATSAT32 up to July 2017. Study quality was assessed, and data were extracted and synthesized according to the COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) methodology. Nine studies were included in this review. The evidence on reliability and construct validity was rated as sufficient and the quality of the evidence as moderate. The evidence on structural validity was rated as insufficient and of low quality. The evidence on internal consistency was indeterminate. Measurement error, responsiveness, criterion validity, and cross-cultural validity were not reported in the included studies. Measurement error could be calculated for two studies and was judged indeterminate. In summary, the IN-PATSAT32 performs as expected with respect to reliability and construct validity. No firm conclusions can yet be drawn as to whether the IN-PATSAT32 also performs as well with respect to structural validity and internal consistency. Further research on these measurement properties of the PROM is therefore needed as well as on measurement error, responsiveness, criterion validity, and cross-cultural validity. For future studies, it is recommended to take the COSMIN methodology into account.
Talip, Whadi-ah; Steyn, Nelia P; Visser, Marianne; Charlton, Karen E; Temple, Norman
2003-09-01
We wanted to develop and validate a test that assesses the knowledge and practices of health professionals (HPs) with regard to the role of nutrition, physical activity, and smoking cessation (lifestyle modification) in chronic diseases of lifestyle. A descriptive cross-sectional validation study was carried out. The validation design consisted of two phases, namely 1) test planning and development and 2) test evaluation. The study sample consisted of five groups of HPs: dietitians, dietetic interns, general practitioners, medical students, and nurses. The overall response rate was 58%, resulting in a sample size of 186 participants. A test was designed to evaluate the knowledge and practices of HPs. The test was first evaluated by an expert group to ensure content, construct, and face validity. Thereafter, the questionnaire was tested on five groups of HPs to test for criterion validity. Internal consistency was evaluated by Cronbach's alpha. An expert panel ensured content, construct, and face validity of the test. Groups with the most training and exposure to nutrition (dietitians and dietetic interns) had the highest group mean score, ranging from 61% to 88%, whereas those with limited nutrition training (general practitioners, medical students, and nurses) had significantly lower scores, ranging from 26% to 80%. This result demonstrated criterion validity. Internal consistency of the overall test demonstrated a Cronbach's alpha of 0.99. Most HPs identified the mass media as their main source of information on lifestyle modification. These HPs also identified lack of time, lack of patient compliance, and lack of knowledge as barriers that prevent them from providing counseling on lifestyle modification. The results of this study showed that this test instrument identifies groups of health professionals with adequate training (knowledge) in lifestyle modification and those who require further training (knowledge).
Kaite, Charis; Constantinou, Marianna; Kouta, Christiana
2016-01-01
Objective To translate and validate the Cancer-Related Fatigue (CRF) Scale in the Greek language. Design A cross-sectional descriptive design was used in order to translate and validate the CRF Scale in Greek. Factor analyses were performed to understand the psychometric properties of the scale and to establish construct, criterion and convergent validity. Setting Outpatients' oncology clinics of two public hospitals in Cyprus. Participants 148 patients with advanced prostate cancer undergoing chemotherapy. Results The Cancer Fatigue Scale (CFS) had good stability (test–retest reliability r=0.79, p<0.001) and good internal consistency (Cronbach's α coefficient for all 15 items α=0.916). Furthermore, the Kaiser-Meyer-Olkin Measure of Sampling Adequacy (KMO value) was found to be 0.743 and considered to be satisfactory (>0.5). The correlations between the CFS physical scale (CFS-FS scale) and the European Organization for Research and Treatment of Cancer (EORTC) QLQ-C30 physical subscales were found to be significant (r=−0.715). The same occurred between CFS cognitive and EORTC cognitive subscale (r=−0.579). Overall, the criterion validity was verified. The same occurs for the convergent validity of the CFS since all correlations with the Global Health Status (q29–q30) were found to be significant. Conclusions This is the first validation study of the CRF Scale in Greek and warrants its use in the assessment of cancer-related fatigue in patients with prostate cancer. However, further testing and validation is needed in the early stages of the disease and in patients in later chemotherapy cycles. PMID:27913557
Sitnikova, Kate; Dijkstra-Kersten, Sandra M A; Mokkink, Lidwine B; Terluin, Berend; van Marwijk, Harm W J; Leone, Stephanie S; van der Horst, Henriëtte E; van der Wouden, Johannes C
2017-12-01
The aim of this review is to critically appraise the evidence on measurement properties of self-report questionnaires measuring somatization in adult primary care patients and to provide recommendations about which questionnaires are most useful for this purpose. We assessed the methodological quality of included studies using the COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN) checklist. To draw overall conclusions about the quality of the questionnaires, we conducted an evidence synthesis using predefined criteria for judging the measurement properties. We found 24 articles on 9 questionnaires. Studies on the Patient Health Questionnaire-15 (PHQ-15) and the Four-Dimensional Symptom Questionnaire (4DSQ) somatization subscale prevailed and covered the broadest range of measurement properties. These questionnaires had the best internal consistency, test-retest reliability, structural validity, and construct validity. The PHQ-15 also had good criterion validity, whereas the 4DSQ somatization subscale was validated in several languages. The Bodily Distress Syndrome (BDS) checklist had good internal consistency and structural validity. Some evidence was found for good construct validity and criterion validity of the Physical Symptom Checklist (PSC-51) and good construct validity of the Symptom Check-List (SCL-90-R) somatization subscale. However, these three questionnaires were only studied in a small number of primary care studies. Based on our findings, we recommend the use of either the PHQ-15 or 4DSQ somatization subscale for somatization in primary care. Other questionnaires, such as the BDS checklist, PSC-51 and the SCL-90-R somatization subscale show promising results but have not been studied extensively in primary care. Copyright © 2017 Elsevier Inc. All rights reserved.
Dong, Lijuan; Liu, Na; Tian, Xiaoyu; Qiao, Xiaoxia; Gobbens, Robbert J J; Kane, Robert L; Wang, Cuili
2017-11-01
To translate the Tilburg Frailty Indicator (TFI) into Chinese and assess its reliability and validity. A sample of 917 community-dwelling older people, aged ≥60 years, in a Chinese city was included between August 2015 and March 2016. Construct validity was assessed using alternative measures corresponding to the TFI items, including self-rated health status (SRH), unintentional weight loss, walking speed, timed-up-and-go tests (TUGT), making telephone calls, grip strength, exhaustion, Short Portable Mental Status Questionnaire (SPMSQ), Geriatric Depression scale (GDS-15), emotional role, Adaptability Partnership Growth Affection and Resolve scale (APGAR) and Social Support Rating Scale (SSRS). Fried's phenotype and frailty index were measured to evaluate criterion validity. Adverse health outcomes (ADL and IADL disability, healthcare utilization, GDS-15, SSRS) were used to assess predictive (concurrent) validity. The internal consistency reliability was good (Cronbach's α=0.71). The test-retest reliability was strong (r=0.88). Kappa coefficients showed agreement between the TFI items and the corresponding alternative measures. Alternative measures correlated as expected with the three domains of the TFI, with the exception that alternative psychological measures had similar correlations with the psychological and physical domains of the TFI. The Chinese TFI had excellent criterion validity, with AUCs of 0.87 and 0.86 for physical phenotype and frailty index, respectively. The predictive (concurrent) validities for the adverse health outcomes and healthcare utilization were acceptable (AUCs: 0.65-0.83). The Chinese TFI has good validity and reliability as an integral instrument to measure frailty of older people living in the community in China. Copyright © 2017 Elsevier B.V. All rights reserved.
15 CFR 8b.20 - Admission and recruitment.
Code of Federal Regulations, 2014 CFR
2014-01-01
... AGAINST THE HANDICAPPED IN FEDERALLY ASSISTED PROGRAMS OPERATED BY THE DEPARTMENT OF COMMERCE Post... proportion of handicapped individuals who may be admitted; and (2) May not make use of any test or criterion... handicapped individuals unless: (i) The test or criterion, as used by the recipient, has been validated as a...
Procedures for Empirical Determination of En-Route Criterion Levels.
ERIC Educational Resources Information Center
Moncrief, Michael H.
En-route Criterion Levels (ECLs) are defined as decision rules for predicting pupil readiness to advance through an instructional sequence. This study investigated the validity of present ECLs in an individualized mathematics program and tested procedures for empirically determining optimal ECLs. Retest scores and subsequent progress were…
15 CFR 8b.20 - Admission and recruitment.
Code of Federal Regulations, 2011 CFR
2011-01-01
... AGAINST THE HANDICAPPED IN FEDERALLY ASSISTED PROGRAMS OPERATED BY THE DEPARTMENT OF COMMERCE Post... proportion of handicapped individuals who may be admitted; and (2) May not make use of any test or criterion... handicapped individuals unless: (i) The test or criterion, as used by the recipient, has been validated as a...
15 CFR 8b.20 - Admission and recruitment.
Code of Federal Regulations, 2012 CFR
2012-01-01
... AGAINST THE HANDICAPPED IN FEDERALLY ASSISTED PROGRAMS OPERATED BY THE DEPARTMENT OF COMMERCE Post... proportion of handicapped individuals who may be admitted; and (2) May not make use of any test or criterion... handicapped individuals unless: (i) The test or criterion, as used by the recipient, has been validated as a...
15 CFR 8b.20 - Admission and recruitment.
Code of Federal Regulations, 2010 CFR
2010-01-01
... AGAINST THE HANDICAPPED IN FEDERALLY ASSISTED PROGRAMS OPERATED BY THE DEPARTMENT OF COMMERCE Post... proportion of handicapped individuals who may be admitted; and (2) May not make use of any test or criterion... handicapped individuals unless: (i) The test or criterion, as used by the recipient, has been validated as a...
15 CFR 8b.20 - Admission and recruitment.
Code of Federal Regulations, 2013 CFR
2013-01-01
... AGAINST THE HANDICAPPED IN FEDERALLY ASSISTED PROGRAMS OPERATED BY THE DEPARTMENT OF COMMERCE Post... proportion of handicapped individuals who may be admitted; and (2) May not make use of any test or criterion... handicapped individuals unless: (i) The test or criterion, as used by the recipient, has been validated as a...
Translation and validation of the Canadian diabetes risk assessment questionnaire in China.
Guo, Jia; Shi, Zhengkun; Chen, Jyu-Lin; Dixon, Jane K; Wiley, James; Parry, Monica
2018-01-01
To adapt the Canadian Diabetes Risk Assessment Questionnaire for the Chinese population and to evaluate its psychometric properties. A cross-sectional study was conducted with a convenience sample of 194 individuals aged 35-74 years from October 2014 to April 2015. The Canadian Diabetes Risk Assessment Questionnaire was adapted and translated for the Chinese population. Test-retest reliability was conducted to measure stability. Criterion and convergent validity of the adapted questionnaire were assessed using 2-hr 75 g oral glucose tolerance tests and the Finnish Diabetes Risk Scores, respectively. Sensitivity and specificity were evaluated to establish its predictive validity. The test-retest reliability was 0.988. Adequate validity of the adapted questionnaire was demonstrated by positive correlations found between the scores and 2-hr 75 g oral glucose tolerance tests (r = .343, p < .001) and with the Finnish Diabetes Risk Scores (r = .738, p < .001). The area under receiver operating characteristic curve was 0.705 (95% CI .632, .778), demonstrating moderate diagnostic value at a cutoff score of 30. The sensitivity was 73%, with a positive predictive value of 57% and negative predictive value of 78%. Our results provided evidence supporting the translation consistency, content validity, convergent validity, criterion validity, sensitivity, and specificity of the translated Canadian Diabetes Risk Assessment Questionnaire with minor modifications. This paper provides clinical, practical, and methodological information on how to adapt a diabetes risk calculator between cultures for public health nurses. © 2017 Wiley Periodicals, Inc.
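The sensitivity, specificity, and predictive values reported in abstracts like the one above all derive from a 2x2 table of screening result against criterion result. The following is a minimal sketch of that arithmetic; the function name and the counts are hypothetical and are not taken from the study.

```python
def screening_metrics(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV and NPV from 2x2 counts.

    tp/fn: criterion-positive cases flagged / missed by the questionnaire
    fp/tn: criterion-negative cases flagged / correctly cleared
    """
    return {
        "sensitivity": tp / (tp + fn),  # share of true cases detected
        "specificity": tn / (tn + fp),  # share of non-cases cleared
        "ppv": tp / (tp + fp),          # share of positive screens that are true cases
        "npv": tn / (tn + fn),          # share of negative screens that are non-cases
    }


# Hypothetical counts for illustration only (not data from the study)
metrics = screening_metrics(tp=40, fp=30, fn=10, tn=120)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Unlike sensitivity and specificity, the predictive values depend on how common the condition is in the sample, so they do not transfer directly to populations with a different prevalence.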
Development and validation of the Chinese version of the Diabetes Management Self-efficacy Scale.
Vivienne Wu, Shu-Fang; Courtney, Mary; Edwards, Helen; McDowell, Jan; Shortridge-Baggett, Lillie M; Chang, Pei-Jen
2008-04-01
The purpose of this study was to translate the Diabetes Management Self-Efficacy Scale (DMSES) into Chinese and test the validity and reliability of the instrument within a Taiwanese population. A two-stage design was used for this study. Stage I consisted of a multi-stepped process of forward and backward translation, using focus groups and consensus meetings to translate the 20-item Australia/English version DMSES to Chinese and test content validity. Stage II established the psychometric properties of the Chinese version DMSES (C-DMSES) by examining the criterion, convergent and construct validity, internal consistency and stability testing. The sample for Stage II comprised 230 patients with type 2 diabetes aged 30 years or more from a diabetes outpatient clinic in Taiwan. Three items were modified to better reflect Chinese practice. The C-DMSES obtained a total average CVI score of .86. The convergent validity of the C-DMSES correlated well with the validated measure of the General Self-Efficacy Scale in measuring self-efficacy (r=.55; p<.01). Criterion-related validity showed that the C-DMSES was a significant predictor of the Summary of Diabetes Self-Care Activities scores (Beta=.58; t=10.75, p<.01). Factor analysis supported the C-DMSES being composed of four subscales. Good internal consistency (Cronbach's alpha=.77 to .93) and test-retest reliability (Pearson correlation coefficient r=.86, p<.01) were found. The C-DMSES is a brief and psychometrically sound measure for evaluation of self-efficacy towards management of diabetes by persons with type 2 diabetes in Chinese populations.
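Cronbach's alpha, the internal consistency index cited in this and several other records, can be computed from item-level responses as k/(k-1) * (1 - sum of item variances / variance of total scores). The sketch below assumes complete data with one row per respondent and uses sample variances throughout; the function name and the responses are hypothetical.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents' item scores.

    rows: list of equal-length lists, one list of item scores per respondent.
    """
    k = len(rows[0])  # number of items
    item_vars = [variance([r[i] for r in rows]) for i in range(k)]
    total_var = variance([sum(r) for r in rows])  # variance of the summed scale score
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical 5-point responses from six respondents on four items
responses = [
    [4, 5, 4, 4],
    [2, 3, 2, 3],
    [5, 5, 4, 5],
    [3, 3, 3, 2],
    [1, 2, 2, 1],
    [4, 4, 5, 4],
]
print(round(cronbach_alpha(responses), 3))
```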
Baek, Sora; Park, Hee-Won; Lee, Yookyung; Grace, Sherry L; Kim, Won-Seok
2017-10-01
To perform a translation and cross-cultural adaptation of the Cardiac Rehabilitation Barriers Scale (CRBS) for use in Korea, followed by psychometric validation. The CRBS was developed to assess patients' perception of the degree to which patient, provider and health system-level barriers affect their cardiac rehabilitation (CR) participation. The CRBS consists of 21 items (barriers to adherence) rated on a 5-point Likert scale. The first phase was to translate and cross-culturally adapt the CRBS to the Korean language. After back-translation, both versions were reviewed by a committee. The face validity was assessed through semi-structured interviews in a sample of Korean patients (n=53) with a history of acute myocardial infarction who did not participate in CR. The second phase was to assess the construct and criterion validity of the Korean translation as well as internal reliability, through administration of the translated version in 104 patients, principal component analysis with varimax rotation and cross-referencing against CR use, respectively. The length, readability, and clarity of the questionnaire were rated well, demonstrating face validity. Analysis revealed a six-factor solution, demonstrating construct validity. Cronbach's alpha was greater than 0.65. Barriers rated highest included not knowing about CR and not being contacted by a program. The mean CRBS score was significantly higher among non-attendees (2.71±0.26) than CR attendees (2.51±0.18) (p<0.01). The Korean version of CRBS has demonstrated face, content and criterion validity, suggesting it may be useful for assessing barriers to CR utilization in Korea.
Diehl, K; Görig, T; Breitbart, E W; Greinert, R; Hillhouse, J J; Stapleton, J L; Schneider, S
2018-01-01
Evidence suggests that indoor tanning may have addictive properties. However, many instruments for measuring indoor tanning addiction show poor validity and reliability. Recently, a new instrument, the Behavioral Addiction Indoor Tanning Screener (BAITS), has been developed. To test the validity and reliability of the BAITS by using a multimethod approach. We used data from the first wave of the National Cancer Aid Monitoring on Sunbed Use, which included a cognitive pretest (August 2015) and a Germany-wide representative survey (October to December 2015). In the cognitive pretest 10 users of tanning beds were interviewed and 3000 individuals aged 14-45 years were included in the representative survey. Potential symptoms of indoor tanning addiction were measured using the BAITS, a brief screening survey with seven items (answer categories: yes vs. no). Criterion validity was assessed by comparing the results of BAITS with usage parameters. Additionally, we tested internal consistency and construct validity. A total of 19·7% of current and 1·8% of former indoor tanning users were screened positive for symptoms of a potential indoor tanning addiction. We found significant associations between usage parameters and the BAITS (criterion validity). Internal consistency (reliability) was good (Kuder-Richardson-20, 0·854). The BAITS was shown to be a homogeneous construct (construct validity). Compared with other short instruments measuring symptoms of a potential indoor tanning addiction, the BAITS seems to be a valid and reliable tool. With its short length and the binary items the BAITS is easy to use in large surveys. © 2017 British Association of Dermatologists.
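For binary (yes/no) items such as those of the BAITS, internal consistency is commonly reported as the Kuder-Richardson-20 coefficient: k/(k-1) * (1 - sum(p_i * q_i) / variance of total scores), where p_i is the proportion answering "yes" to item i and q_i = 1 - p_i. A minimal sketch follows, assuming complete 0/1 data and population variances; the function name and the answers are hypothetical.

```python
from statistics import pvariance


def kr20(rows):
    """Kuder-Richardson-20 for binary (0/1) item responses.

    rows: list of equal-length lists of 0/1 answers, one list per respondent.
    """
    n = len(rows)
    k = len(rows[0])
    # proportion of respondents endorsing each item
    p = [sum(r[i] for r in rows) / n for i in range(k)]
    sum_pq = sum(pi * (1 - pi) for pi in p)
    # population variance of total scores, matching the p*q item variances
    total_var = pvariance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum_pq / total_var)


# Hypothetical yes/no answers (1 = yes) from five respondents on seven items
answers = [
    [1, 1, 1, 0, 1, 1, 1],
    [0, 0, 0, 0, 0, 1, 0],
    [1, 1, 0, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 0, 0],
    [1, 0, 1, 1, 1, 1, 0],
]
print(round(kr20(answers), 3))
```

With 0/1 items and the same variance convention, this gives the same value as Cronbach's alpha.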
ERIC Educational Resources Information Center
Cavanagh, Robert F.; Koehler, Matthew J.
2013-01-01
The impetus for this paper stems from a concern about directions and progress in the measurement of the Technological Pedagogical Content Knowledge (TPACK) framework for effective technology integration. In this paper, we develop the rationale for using a seven-criterion lens, based upon contemporary validity theory, for critiquing empirical…
ERIC Educational Resources Information Center
Woodburn, Jim; Sutcliffe, Nick
1996-01-01
The Objective Structured Clinical Examination (OSCE), initially developed for undergraduate medical education, has been adapted for assessment of clinical skills in podiatry students. A 12-month pilot study found the test had relatively low levels of reliability, high construct and criterion validity, and good stability of performance over time.…
Comparative Analysis of the Relative Validity for Subjective Time Rating Scales. Final Report.
ERIC Educational Resources Information Center
Carpenter, James B.; And Others
Since the accuracy and validity of occupational data may vary according to the rating scale format employed, the first phase of the research described in the report employed hypothetical job descriptions from which accurate criterion data could be generated. The second phase of the research required developing an occupational survey instrument…
Validity and Bias of Academic Achievement Measures in the First Year of Elementary School
ERIC Educational Resources Information Center
Hammes, Patricia Simone; Bigras, Marc; Crepaldi, Maria Aparecida
2016-01-01
We tested the criterion-related validity and potential bias of two measures of pupils' academic achievement: the Teacher Rating Scale (TRS) and the Mathematics and Literacy Achievement Tests (MLTs). These measures are representative of assessment methods largely used in the elementary school. The aims were: (1) to verify the extent to which TRS…
Validity of Suicidality Items from the Youth Risk Behavior Survey in a High School Sample
ERIC Educational Resources Information Center
May, Alexis; Klonsky, E. David
2011-01-01
The Youth Risk Behavior Survey (YRBS) is used by the United States Centers for Disease Control to estimate rates of suicidal thoughts and behaviors in adolescents. This study investigated the validity of the YRBS suicidality items by examining their relationship to criterion variables including loneliness, anxiety, depression, substance use, and…
Developing a tool to measure satisfaction among health professionals in sub-Saharan Africa
2013-01-01
Background In sub-Saharan Africa, lack of motivation and job dissatisfaction have been cited as causes of poor healthcare quality and outcomes. Measurement of health workers’ satisfaction adapted to sub-Saharan African working conditions and cultures is a challenge. The objective of this study was to develop a valid and reliable instrument to measure satisfaction among health professionals in the sub-Saharan African context. Methods A survey was conducted in Senegal and Mali in 2011 among 962 care providers (doctors, midwives, nurses and technicians) practicing in 46 hospitals (capital, regional and district). The participation rate was very high: 97% (937/962). After exploratory factor analysis (EFA), construct validity was assessed through confirmatory factor analysis (CFA). The discriminant validity of our subscales was evaluated by comparing the average variance extracted (AVE) for each of the constructs with the squared interconstruct correlation (SIC), and finally for criterion validity, each subscale was tested with two hypotheses. Two dimensions of reliability were assessed: internal consistency with Cronbach’s alpha subscales and stability over time using a test-retest process. Results Eight dimensions of satisfaction encompassing 24 items were identified and validated using a process that combined psychometric analyses and expert opinions: continuing education, salary and benefits, management style, tasks, work environment, workload, moral satisfaction and job stability. All eight dimensions demonstrated significant discriminant validity. The final model showed good performance, with a root mean square error of approximation (RMSEA) of 0.0508 (90% CI: 0.0448 to 0.0569) and a comparative fit index (CFI) of 0.9415. The concurrent criterion validity of the eight dimensions was good. Reliability was assessed based on internal consistency, which was good for all dimensions but one (moral satisfaction < 0.70). 
Test-retest showed satisfactory temporal stability (intra class coefficient range: 0.60 to 0.91). Conclusions Job satisfaction is a complex construct; this study provides a multidimensional instrument whose content, construct and criterion validities were verified to ensure its suitability for the sub-Saharan African context. When using these subscales in further studies, the variability of the reliability of the subscales should be taken into account when calculating sample sizes. The instrument will be useful in evaluative studies which will help guide interventions aimed at improving both the quality of care and its effectiveness. PMID:23826720
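The discriminant validity check described in this abstract, comparing each construct's average variance extracted (AVE) with the squared interconstruct correlation (SIC), can be sketched as follows. The sketch assumes standardized factor loadings, so that AVE is the mean of the squared loadings; the function names, loadings, and correlations are hypothetical and are not values from the study.

```python
def average_variance_extracted(loadings):
    """AVE for one construct, assuming standardized loadings."""
    return sum(l * l for l in loadings) / len(loadings)


def discriminant_validity_ok(loadings_a, loadings_b, corr_ab):
    """True when both AVEs exceed the squared correlation between the constructs."""
    sic = corr_ab ** 2
    return (average_variance_extracted(loadings_a) > sic
            and average_variance_extracted(loadings_b) > sic)


# Hypothetical standardized loadings for two subscales
workload = [0.8, 0.7, 0.9]
salary = [0.7, 0.7, 0.7]
print(discriminant_validity_ok(workload, salary, corr_ab=0.5))
print(discriminant_validity_ok(workload, salary, corr_ab=0.75))
```

The intuition is that a construct should share more variance with its own items than with any other construct.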
Roets-Merken, Lieve M; Zuidema, Sytse U; Vernooij-Dassen, Myrra J F J; Kempen, Gertrudis I J M
2014-11-01
This study investigated the psychometric properties of the Severe Dual Sensory Loss screening tool, a tool designed to help nurses and care assistants to identify hearing, visual and dual sensory impairment in older adults. Construct validity of the Severe Dual Sensory Loss screening tool was evaluated using Cronbach's alpha and factor analysis. Interrater reliability was calculated using Kappa statistics. To evaluate the predictive validity, sensitivity and specificity were calculated by comparison with the criterion standard assessment for hearing and vision. The criterion used for hearing impairment was a hearing loss of ≥40 decibel measured by pure-tone audiometry, and the criterion for visual impairment was a visual acuity of ≤0.3 diopter or a visual field of ≤0.3°. Feasibility was evaluated by the time needed to fill in the screening tool and the clarity of the instruction and items. Prevalence of dual sensory impairment was calculated. A total of 56 older adults receiving aged care and 12 of their nurses and care assistants participated in the study. Cronbach's alpha was 0.81 for the hearing subscale and 0.84 for the visual subscale. Factor analysis showed two constructs for hearing and two for vision. Kappa was 0.71 for the hearing subscale and 0.74 for the visual subscale. The predictive validity showed a sensitivity of 0.71 and a specificity of 0.72 for the hearing subscale; and a sensitivity of 0.69 and a specificity of 0.78 for the visual subscale. The optimum cut-off point for each subscale was score 1. The nurses and care assistants reported that the Severe Dual Sensory Loss screening tool was easy to use. The prevalence of hearing and vision impairment was 55% and 29%, respectively, and that of dual sensory impairment was 20%.
The Severe Dual Sensory Loss screening tool was compared with the criterion standards for hearing and visual impairment and was found to be a valid and reliable tool, enabling nurses and care assistants to identify hearing, visual and dual sensory impairment among older adults. Copyright © 2014 Elsevier Ltd. All rights reserved.
Supervisor Health and Safety Support: Scale Development and Validation
Butts, Marcus M.; Hurst, Carrie S.; Eby, Lillian T.
2013-01-01
Executive Summary Two studies were conducted to develop a psychometrically sound measure of supervisor health and safety support (SHSS). We identified three dimensions of supervisor support (physical health, psychological health, safety) and used Study 1 to develop items and establish content validity. Study 2 was used to establish the dimensionality of the new measure and provide criterion-related and discriminant validity evidence of the measure using supervisor and subordinate data. The measure had incremental validity in predicting employee performance and psychological strain outcomes above and beyond general work support variables. Implications of these findings and for workplace support theory and practice are discussed. PMID:24771991
Kaneko, Hiromasa; Funatsu, Kimito
2013-09-23
We propose predictive performance criteria for nonlinear regression models without cross-validation. The proposed criteria are the determination coefficient and the root-mean-square error for the midpoints between k-nearest-neighbor data points. These criteria can be used to evaluate predictive ability after the regression models are updated, whereas cross-validation cannot be performed in such a situation. The proposed method is effective and helpful in handling big data when cross-validation cannot be applied. By analyzing data from numerical simulations and quantitative structural relationships, we confirm that the proposed criteria enable the predictive ability of the nonlinear regression models to be appropriately quantified.
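One possible reading of the midpoint criteria described above is sketched here: for each data point, take its k nearest neighbours, form the midpoint of each pair in input space, take the mean of the two observed responses as the reference value, and score the model's predictions at those midpoints with r² and RMSE. The function name, the de-duplication of pairs, and the toy model are assumptions made for illustration and are not details confirmed by the abstract.

```python
import math


def midknn_criteria(X, y, predict, k=2):
    """r^2 and RMSE at midpoints between each point and its k nearest neighbours.

    X: list of feature vectors, y: list of responses,
    predict: callable mapping one feature vector to a predicted response.
    """
    pairs = set()
    for i, xi in enumerate(X):
        # other points ordered by Euclidean distance to xi
        others = sorted((j for j in range(len(X)) if j != i),
                        key=lambda j: math.dist(xi, X[j]))
        for j in others[:k]:
            pairs.add((min(i, j), max(i, j)))  # count each pair once (assumption)

    y_mid, y_hat = [], []
    for i, j in sorted(pairs):
        x_mid = [(a + b) / 2 for a, b in zip(X[i], X[j])]
        y_mid.append((y[i] + y[j]) / 2)  # reference value at the midpoint
        y_hat.append(predict(x_mid))

    n = len(y_mid)
    sse = sum((t - p) ** 2 for t, p in zip(y_mid, y_hat))
    mean_mid = sum(y_mid) / n
    sst = sum((t - mean_mid) ** 2 for t in y_mid)
    return 1 - sse / sst, math.sqrt(sse / n)


# Toy data and a hypothetical already-fitted model (y = 2x)
X = [[0.0], [1.0], [2.0], [3.0]]
y = [0.0, 2.0, 4.0, 6.0]
r2, rmse = midknn_criteria(X, y, predict=lambda x: 2 * x[0], k=1)
print(r2, rmse)
```

Because the midpoints are not training points, a model that merely memorizes the training data tends to score poorly on them, and no refitting is needed, unlike cross-validation.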
Monacis, Lucia; Palo, Valeria de; Griffiths, Mark D; Sinatra, Maria
2016-12-01
Background and aims The inclusion of Internet Gaming Disorder (IGD) in Section III of the fifth edition of the Diagnostic and Statistical Manual of Mental Disorders has increased the interest of researchers in the development of new standardized psychometric tools for the assessment of such a disorder. To date, the nine-item Internet Gaming Disorder Scale – Short-Form (IGDS9-SF) has only been validated in English, Portuguese, and Slovenian languages. Therefore, the aim of this investigation was to examine the psychometric properties of the IGDS9-SF in an Italian-speaking sample. Methods A total of 757 participants were recruited to the present study. Confirmatory factor analysis and multi-group analyses were applied to assess the construct validity. Reliability analyses comprised the average variance extracted, the standard error of measurement, and the factor determinacy coefficient. Convergent and criterion validities were established through the associations with other related constructs. The receiver operating characteristic curve analysis was used to determine an empirical cut-off point. Results Findings confirmed the single-factor structure of the instrument, its measurement invariance at the configural level, and the convergent and criterion validities. Satisfactory levels of reliability and a cut-off point of 21 were obtained. Discussion and conclusions The present study provides validity evidence for the use of the Italian version of the IGDS9-SF and may foster research into gaming addiction in the Italian context. PMID:27876422
Park, Yu Kyung; Ju, Hyeon Ok; Na, Hunjoo
2016-02-01
The Perinatal Post-Traumatic Stress Disorder Questionnaire (PPQ) was designed to measure post-traumatic symptoms related to childbirth and symptoms during the postnatal period. The purpose of this study was to develop a translated Korean version of the PPQ and to evaluate the reliability and validity of the Korean PPQ. Participants were 196 mothers at one to 18 months after giving birth, and data were collected by e-mail. The PPQ was translated into Korean using the translation guideline from the World Health Organization. Cronbach's alpha and split-half reliability were used to evaluate the reliability of the PPQ. Exploratory Factor Analysis (EFA), Confirmatory Factor Analysis (CFA), and known-group validity were conducted to examine construct validity. Correlations of the PPQ with the Impact of Event Scale (IES), Beck Depression Inventory II (BDI-II), and Beck Anxiety Inventory (BAI) were used to test the criterion validity of the PPQ. Cronbach's alpha and the Spearman-Brown split-half correlation coefficient were 0.91 and 0.77, respectively. EFA identified a 3-factor solution including arousal, avoidance, and intrusion factors, and CFA revealed the strongest support for the 3-factor model. The correlations of the PPQ with the IES, BDI-II, and BAI were .99, .60, and .72, respectively, indicating a high level of criterion validity. The Korean version of the PPQ is a useful tool for screening and assessing mothers experiencing emotional distress related to childbirth and during the postnatal period. The PPQ also reflects the diagnostic standards for Post Traumatic Stress Disorder well.
Hashim, Hairul Anuar; Shaharuddin, Saidatin Sabiyah; Hamidan, Shazarina; Grove, J Robert
2017-02-01
This study examined psychometric properties of a Malaysian-language Sport Anxiety Scale-2 (SAS-2) in three separate studies. Study 1 examined the criterion validity and internal consistency of SAS-2 among 119 developmental hockey players. Measures of trait anxiety and mood states along with digit vigilance, choice reaction time, and depth perception tests were administered. Regression analysis revealed that somatic anxiety and concentration disruption were significantly associated with sustained attention. Worry was significantly associated with depth perception but not sustained attention. Pearson correlation coefficients also revealed significant relationships between SAS-2 subscales and negative mood state dimensions. Study 2 examined the convergent and discriminant validity of SAS-2 by correlating it with state anxiety measured by the CSAI-2R. Significant positive relationships were obtained between SAS-2 subscales and somatic and cognitive state anxiety. Conversely, state self-confidence was negatively related to SAS-2 subscales. In addition, significant differences were observed between men and women in somatic anxiety. Study 3 examined the factorial validity of the Malaysian SAS-2 using confirmatory factor analysis in a sample of 539 young athletes. Confirmatory factor analysis results provided strong support for the SAS-2 factor structure. Path loadings exceeding 0.5 indicated convergent validity among the subscales, and low to moderate subscale intercorrelations provided evidence of discriminant validity. Overall, the results supported the criterion and construct validity of this Malaysian-language SAS-2 instrument.
Food and Nutrition (Intermediate). Performance Objectives and Criterion-Referenced Test Items.
ERIC Educational Resources Information Center
Missouri Univ., Columbia. Instructional Materials Lab.
This document contains competencies and criterion-referenced test items for the Intermediate Food and Nutrition semester course in Missouri that were derived from the duties and tasks of the Missouri homemaker and identified and validated by home economics teachers and subject matter specialists. The guide is designed to assist home economics…
Multi-Informant Assessment of Temperament in Children with Externalizing Behavior Problems
ERIC Educational Resources Information Center
Copeland, William; Landry, Kerry; Stanger, Catherine; Hudziak, James J.
2004-01-01
We examined the criterion validity of parent and self-report versions of the Junior Temperament and Character Inventory (JTCI) in children with high levels of externalizing problems. The sample included 412 children (206 participants and 206 siblings) participating in a family study of attention and aggressive behavior problems. Criterion validity…
The Validity of the Instructional Reading Level.
ERIC Educational Resources Information Center
Powell, William R.
Presented is a critical inquiry about the product of the informal reading inventory (IRI) and about some of the elements used in the process of determining that product. Recent developments on this topic are briefly reviewed. Questions are raised concerning what is a suitable criterion level for word recognition. The original criterion of 95…
Zubeidat, Ihab; Salinas, José María; Sierra, Juan Carlos; Fernández-Parra, Antonio
2007-01-01
In this study, we analyzed the reliability and validity of the Social Interaction Anxiety Scale (SIAS) and propose a separation criterion between youths with specific and generalized social anxiety and youths without social anxiety. A sample of 1012 Spanish youths attending school completed the SIAS, the Liebowitz Social Anxiety Scale, the Social Avoidance and Distress Scale, the Fear of Negative Evaluation Scale, the Youth Self-Report for Ages 11-18 and the Minnesota Multiphasic Personality Inventory-Adolescent. The factor analysis suggests the existence of three factors in the SIAS, the first two of which explain most of the variance of the construct assessed. Internal consistency is adequate in the first two factors. The SIAS features an adequate theoretical validity with the scores of different variables related to social interaction. Analysis of the criterion scores yields three groups pertaining to three clearly differentiated clusters. In the third cluster, two social anxiety groups, specific and generalized, have been identified by means of a quantitative separation criterion.
ERIC Educational Resources Information Center
Anderson, Daniel; Lai, Cheng-Fei; Nese, Joseph F. T.; Park, Bitnara Jasmine; Saez, Leilani; Jamgochian, Elisa; Alonzo, Julie; Tindal, Gerald
2010-01-01
In the following technical report, we present evidence of the technical adequacy of the easyCBM[R] math measures in grades K-2. In addition to reliability information, we present criterion-related validity evidence, both concurrent and predictive, and construct validity evidence. The results represent data gathered throughout the 2009/2010 school…
ERIC Educational Resources Information Center
Giesen, J. Martin; And Others
The study was designed to determine the reliability and criterion validity of a psychomotor performance test (the Fine Finger Dexterity Work Task Unit) with 40 partially or totally blind adults. Reliability was established by using the test-retest method. A supervisory rating was developed and the reliability established by using the split-half…
ERIC Educational Resources Information Center
Baker, Doris Luft; Biancarosa, Gina; Park, Bitnara Jasmine; Bousselot, Tracy; Smith, Jean-Louise; Baker, Scott K.; Kame'enui, Edward J.; Alonzo, Julie; Tindal, Gerald
2015-01-01
We examined the criterion validity and diagnostic efficiency of oral reading fluency (ORF), word reading accuracy, and reading comprehension (RC) for students in Grades 7 and 8 taking into account form effects of ORF, time of assessment, and individual differences, including student designations of limited English proficiency and special education…
ERIC Educational Resources Information Center
Omizo, Michael M.; And Others
1985-01-01
This study examined the predictive validity of the Coopersmith Self Esteem Inventory with adolescents relative to each of the criterion measures representing communication satisfaction toward each parent and feelings toward each parent, and the differential validity of the self-esteem, communication satisfaction, and feelings toward each parent…
Criterion Validity Evidence for the easyCBM© CCSS Math Measures: Grades 6-8. Technical Report #1402
ERIC Educational Resources Information Center
Anderson, Daniel; Rowley, Brock; Alonzo, Julie; Tindal, Gerald
2012-01-01
The easyCBM© CCSS Math tests were developed to help inform teachers' instructional decisions by providing relevant information on students' mathematical skills, relative to the Common Core State Standards (CCSS). This technical report describes a study to explore the validity of the easyCBM© CCSS Math tests by evaluating the relation between…
ERIC Educational Resources Information Center
Chen, Chia-ling; Shen, I-hsuan; Chen, Chung-yao; Wu, Ching-yi; Liu, Wen-Yu; Chung, Chia-ying
2013-01-01
This study examined criterion-related validity and clinimetric properties of the pediatric balance scale ("PBS") in children with cerebral palsy (CP). Forty-five children with CP (age range: 19-77 months) and their parents participated in this study. At baseline and at follow up, Pearson correlation coefficients were used to determine…
ERIC Educational Resources Information Center
Seo, Hyojeong; Wehmeyer, Michael L.; Shogren, Karrie A.; Hughes, Carolyn; Thompson, James R.; Little, Todd D.; Palmer, Susan B.
2017-01-01
Given the growing importance of support needs assessment in the field of intellectual disability, it is imperative to develop assessments of support needs whose scores and inferences demonstrate reliability and validity. The purpose of this study was to examine the criterion validity of scores on the "Supports Intensity Scale-Children's…
ERIC Educational Resources Information Center
McGill, Ryan J.
2015-01-01
The current study examined the incremental validity of the clinical clusters from the Woodcock-Johnson III Tests of Cognitive Abilities (WJ-III COG) for predicting scores on the Woodcock-Johnson III Tests of Achievement (WJ-III ACH). All participants were children and adolescents (N = 4,722) drawn from the nationally representative WJ-III…
Validity of the Aberrant Behavior Checklist in a Clinical Sample of Toddlers
ERIC Educational Resources Information Center
Karabekiroglu, Koray; Aman, Michael G.
2009-01-01
We investigated the congruent and criterion validity of the Aberrant Behavior Checklist (ABC) in a clinical sample of toddlers seen over 1 year in Turkey. All consecutive patients (N = 93), 14-43 months old (mean, 30.6 mos.), in a child psychiatry outpatient clinic were included. The ABC, Autism Behavior Checklist (AuBC), and Child Behavior…
Li, Jian; Loerbroks, Adrian; Jarczok, Marc N; Schöllgen, Ina; Bosch, Jos A; Mauss, Daniel; Siegrist, Johannes; Fischer, Joachim E
2012-09-01
We test the psychometric properties of a short version of the Effort-Reward Imbalance (ERI) questionnaire in addition to testing an interaction term of this model's main components on health functioning. A self-administered survey was conducted in a sample of 2,738 industrial workers (77% men with mean age 41.6 years) from a large manufacturing company in Southern Germany. The internal consistency reliability, structural validity, and criterion validity were analyzed. Satisfactory internal consistencies of the three scales, "effort", "reward", and "overcommitment", were obtained (Cronbach's alpha coefficients 0.77, 0.82, and 0.83, respectively). Confirmatory factor analysis showed a good model fit of the data with the theoretical structure (AGFI = 0.94, RMSEA = 0.060). Evidence of criterion validity was demonstrated. Importantly, a significant synergistic interaction effect of ERI and overcommitment on poor mental health functioning was observed (odds ratio 6.74 (95% CI 5.32-8.52); synergy index 1.78 (95% CI 1.25-2.55)). This short version of the ERI questionnaire is a reliable and valid tool for epidemiological research on occupational health. Copyright © 2012 Wiley Periodicals, Inc.
Howard, Matt C
2014-10-01
Computer self-efficacy is an often studied construct that has been shown to be related to an array of important individual outcomes. Unfortunately, existing measures of computer self-efficacy suffer from several deficiencies, including criterion contamination, outdated wording, and/or inadequate psychometric properties. For this reason, the current article presents the creation of a new computer self-efficacy measure. In Study 1, an over-representative item list is created and subsequently reduced through exploratory factor analysis to create an initial measure, and the discriminant validity of this initial measure is tested. In Study 2, the unidimensional factor structure of the initial measure is supported through confirmatory factor analysis and further reduced into a final, 12-item measure. In Study 3, the convergent and criterion validity of the 12-item measure is tested. Overall, this three-study process demonstrates that the new computer self-efficacy measure has superb psychometric properties and internal reliability, and demonstrates excellent evidence for several aspects of validity. It is hoped that the 12-item computer self-efficacy measure will be utilized in future research on computer self-efficacy, which is discussed in the current article.
Costa, Sebastiano; Cuzzocrea, Francesca; Hausenblas, Heather A; Larcan, Rosalba; Oliva, Patrizia
2012-12-01
Background and aims: The purpose of this study was to verify the factorial structure, internal validity, reliability, and criterion validity of the 21-item Exercise Dependence Scale-Revised (EDS-R) in an Italian sample. Methods: Italian voluntary (N = 519) users of gyms who had a history of regular exercise for over a year completed the EDS-R and measures of exercise frequency. Results and conclusions: Confirmatory factor analyses demonstrated a good fit to the hypothesized 7-factor model, and adequate internal consistency for the scale was evidenced. Criterion validity was evidenced by significant correlations among all the subscales of the EDS and exercise frequency. Finally, individuals at risk for exercise dependence reported more exercise behavior compared to the nondependent-symptomatic and nondependent-asymptomatic groups. These results suggest that the seven subscales of the Italian version of the EDS are measuring the construct of exercise dependence as defined by the DSM-IV criteria for substance dependence and also confirm previous research using the EDS-R in other languages. More research is needed to examine the psychometric properties of the EDS-R in diverse populations with various research designs.
Fermont, Jilles M; Blazeby, Jane M; Rogers, Chris A; Wordsworth, Sarah
2017-01-01
Bariatric surgery is considered an effective treatment for individuals with severe and complex obesity. Besides reducing weight and improving obesity related comorbidities such as diabetes, bariatric surgery could improve patients' health-related quality of life. However, the frequently used instrument to measure quality of life, the EQ-5D has not been validated for use in bariatric surgery, which is a major limitation to its use in this clinical context. Our study undertook a psychometric validation of the 5 level EQ-5D (EQ-5D-5L) using clinical trial data to measure health-related quality of life in patients with severe and complex obesity undergoing bariatric surgery. Health-related quality of life was assessed at baseline (before randomisation) and six months later in 189 patients in a randomised controlled trial of bariatric surgery. Patients completed two generic health-related quality of life instruments, the EQ-5D-5L and SF-12, which were used together for the validation using data from all patients in the trial as the trial is ongoing. Psychometric analyses included construct and criterion validity and responsiveness to change. Of the 189 validation patients, 141 (75%) were female, the median age was 49 years old (range 23-70 years) and body mass index ranged from 33-70 kg/m2. For construct validity, there were significant improvements in the distribution of responses in all EQ-5D dimensions between baseline and 6 months after randomisation. For criterion validity, the highest degree of correlation was between the EQ-5D pain/discomfort and SF-12 bodily pain domain. For responsiveness the EQ-5D and SF-12 showed statistically significant improvements in health-related quality of life between baseline and 6 months after randomisation. The EQ-5D-5L is a valid generic measure for measuring health-related quality of life in bariatric surgery patients.
Transcultural adaptation of the Breast Cancer Awareness Measure.
Al-Khasawneh, E M; Leocadio, M; Seshan, V; Siddiqui, S T; Khan, A N; Al-Manaseer, M M
2016-09-01
To overcome the lack of a validated and robust Arabic instrument to measure breast cancer awareness. Currently, there is no validated Arabic instrument for measuring breast cancer awareness levels. We adapted, translated and validated the Breast Cancer Awareness Measure developed by Cancer Research UK. The instrument was translated into Arabic and back-translated for validation. Validation and reliability tests were conducted using purposively sampled 972 Arab women older than 20 years, living in Oman. The adapted content was validated by a panel of medical, linguistic and cultural experts, followed by cognitive interviews (n = 10), behavioural coding (n = 30) and criterion validation (n = 646). The instrument was tested for acceptability and its subscales for internal consistency. Inter-rater reliability was estimated between two similar groups (n = 144 and n = 142) to test homogeneity. The adapted and translated instrument had a high acceptability (98.7% completed). The validation process shaped the adaptation, and resulted in strong criterion validity (R = 0.58, P < 0.01). The instrument subscales for risk factors and warning signs had high internal consistency (Cronbach's alpha 0.856 and 0.890, respectively), with all floor and ceiling effects less than 15%. The correlation measure for inter-rater reliability was 0.97 (P < 0.01). Through the incorporation of contextual characteristics and prevalent beliefs among Arab populations, the adapted Breast Cancer Awareness Measure is a robust Arabic instrument for the measurement of breast cancer awareness and early detection practices among Arab women. The purposively selected sample may not be representative of the population. Improvement of awareness and early detection of breast cancer can contribute towards reducing mortality from the disease. The adapted instrument has policy implications, since measurement of awareness levels is essential towards breast health promotion policies in Arab countries. © 2016 International Council of Nurses.
Ouyang, Liwen; Apley, Daniel W; Mehrotra, Sanjay
2016-04-01
Electronic medical record (EMR) databases offer significant potential for developing clinical hypotheses and identifying disease risk associations by fitting statistical models that capture the relationship between a binary response variable and a set of predictor variables that represent clinical, phenotypical, and demographic data for the patient. However, EMR response data may be error prone for a variety of reasons. Performing a manual chart review to validate data accuracy is time consuming, which limits the number of chart reviews in a large database. The authors' objective is to develop a new design-of-experiments-based systematic chart validation and review (DSCVR) approach that is more powerful than the random validation sampling used in existing approaches. The DSCVR approach judiciously and efficiently selects the cases to validate (i.e., validate whether the response values are correct for those cases) for maximum information content, based only on their predictor variable values. The final predictive model will be fit using only the validation sample, ignoring the remainder of the unvalidated and unreliable error-prone data. A Fisher information based D-optimality criterion is used, and an algorithm for optimizing it is developed. The authors' method is tested in a simulation comparison that is based on a sudden cardiac arrest case study with 23 041 patients' records. This DSCVR approach, using the Fisher information based D-optimality criterion, results in a fitted model with much better predictive performance, as measured by the receiver operating characteristic curve and the accuracy in predicting whether a patient will experience the event, than a model fitted using a random validation sample. The simulation comparisons demonstrate that this DSCVR approach can produce predictive models that are significantly better than those produced from random validation sampling, especially when the event rate is low. © The Author 2015. 
Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved.
Rodríguez-Martínez, Carlos E; Nino, Gustavo; Castro-Rodriguez, Jose A
2014-01-01
There is a critical need for validation studies of questionnaires designed to assess the level of control of asthma in children younger than 5 years old. To validate the Spanish version of the Test for Respiratory and Asthma Control in Kids (TRACK) questionnaire in children younger than age 5 years with symptoms consistent with asthma. In a prospective cohort validation study, parents and/or caregivers of children younger than age 5 years and with symptoms consistent with asthma, during a baseline and a follow-up visit 2 to 6 weeks later, completed the information required to assess the content validity, criterion validity, construct validity, test-retest reliability, sensitivity to change, internal consistency reliability, and usability of the TRACK questionnaire. Median (interquartile range) of the TRACK scores were significantly different between patients with well-controlled asthma, patients with not well-controlled asthma, and patients with very poorly controlled asthma (90.0 [75.0-95.0], 75.0 [55.0-85.0], and 35.0 [25.0-55.0], respectively, P < .001). TRACK scores were significantly different between patients classified as currently symptomatic and symptomatic in the recent past (42.5 [25.0-55.0] vs 85.0 [75.0-90.0]; P < .001). The intraclass correlation coefficient of the measurements was 0.755 (95% CI, 0.503-1.00). All patients whose clinical status changed showed an increase of 10 or more points in TRACK score between baseline and follow-up visits. The Cronbach α was 0.77 for the questionnaire as a whole. The Spanish version of the TRACK questionnaire has excellent sensitivity to change and usability; adequate criterion validity, construct validity, and test-retest reliability; and an acceptable internal consistency, when used in children younger than age 5 years with symptoms consistent with asthma. Copyright © 2014 American Academy of Allergy, Asthma & Immunology. Published by Elsevier Inc. All rights reserved.
Statistical methodology: II. Reliability and validity assessment in study design, Part B.
Karras, D J
1997-02-01
Validity measures the correspondence between a test and other purported measures of the same or similar qualities. When a reference standard exists, a criterion-based validity coefficient can be calculated. If no such standard is available, the concepts of content and construct validity may be used, but quantitative analysis may not be possible. The Pearson and Spearman tests of correlation are often used to assess the correspondence between tests, but do not account for measurement biases and may yield misleading results. Techniques that measure intertest differences may be more meaningful in validity assessment, and the kappa statistic is useful for analyzing categorical variables. Questionnaires often can be designed to allow quantitative assessment of reliability and validity, although this may be difficult. Inclusion of homogeneous questions is necessary to assess reliability. Analysis is enhanced by using Likert scales or similar techniques that yield ordinal data. Validity assessment of questionnaires requires careful definition of the scope of the test and comparison with previously validated tools.
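The point about correlation and measurement bias in the abstract above can be illustrated with a minimal Python sketch. The scores are hypothetical and the helper names (pearson_r, mean_difference, cohen_kappa) are assumptions introduced here for illustration, not taken from the article. The sketch shows a new test that reads a constant 5 units above the reference: the correlation is perfect even though the bias is large, and Cohen's kappa is shown as one common way to quantify agreement on categorical ratings.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def mean_difference(reference, test):
    """Mean of (test - reference): a simple estimate of systematic bias."""
    return sum(t - r for r, t in zip(reference, test)) / len(reference)

def cohen_kappa(a, b):
    """Cohen's kappa: chance-corrected agreement between two categorical ratings."""
    n = len(a)
    categories = set(a) | set(b)
    p_observed = sum(1 for x, y in zip(a, b) if x == y) / n
    p_expected = sum((a.count(c) / n) * (b.count(c) / n) for c in categories)
    return (p_observed - p_expected) / (1 - p_expected)

# Hypothetical data: the new test reads exactly 5 units too high.
reference = [10.0, 12.0, 15.0, 18.0, 20.0]
new_test = [value + 5.0 for value in reference]

r = pearson_r(reference, new_test)           # correlation ignores the offset
bias = mean_difference(reference, new_test)  # the offset shows up here

# Hypothetical categorical ratings from a reference standard and a new test.
standard = ["pos", "pos", "neg", "neg"]
candidate = ["pos", "neg", "neg", "neg"]
kappa = cohen_kappa(standard, candidate)

print(f"r = {r:.3f}, bias = {bias:.1f}, kappa = {kappa:.2f}")
```

In this toy case a validity claim based on r alone would miss a systematic offset that a difference-based summary makes visible.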
Classen, Sherrilene; Winter, Sandra M.; Velozo, Craig A.; Bédard, Michel; Lanford, Desiree N.; Brumback, Babette; Lutz, Barbara J.
2010-01-01
OBJECTIVE We report on item development and validity testing of a self-report older adult safe driving behaviors measure (SDBM). METHOD On the basis of theoretical frameworks (Precede–Proceed Model of Health Promotion, Haddon’s matrix, and Michon’s model), existing driving measures, and previous research and guided by measurement theory, we developed items capturing safe driving behavior. Item development was further informed by focus groups. We established face validity using peer reviewers and content validity using expert raters. RESULTS Peer review indicated acceptable face validity. Initial expert rater review yielded a scale content validity index (CVI) rating of 0.78, with 44 of 60 items rated ≥0.75. Sixteen unacceptable items (≤0.5) required major revision or deletion. The next CVI scale average was 0.84, indicating acceptable content validity. CONCLUSION The SDBM has relevance as a self-report to rate older drivers. Future pilot testing of the SDBM comparing results with on-road testing will define criterion validity. PMID:20437917
Ramos-Quiroga, Josep Antoni; Bosch, Rosa; Richarte, Vanesa; Valero, Sergi; Gómez-Barros, Nuria; Nogueira, Mariana; Palomar, Gloria; Corrales, Montse; Sáez-Francàs, Naia; Corominas, Margarida; Real, Alberto; Vidal, Raquel; Chalita, Pablo J; Casas, Miguel
2012-01-01
Attention deficit hyperactivity disorder (ADHD) is a common neuropsychiatric disorder in adulthood. Its diagnosis requires a retrospective evaluation of ADHD symptoms in childhood, the continuity of these symptoms in adulthood, and a differential diagnosis. For these reasons, diagnosis of ADHD in adults is a complex process which needs effective diagnostic tools. To analyse the criterion validity of the CAADID semi-structured interview, Spanish version, and the concurrent validity compared with other ADHD severity scales. An observational case-control study was conducted on 691 patients with ADHD. They were out-patients treated in a program for adults with ADHD in a hospital. A sensitivity of 98.86%, specificity 67.68%, positive predictive value 90.77% and a negative predictive value 94.87% were observed. Diagnostic precision was 91.46%. The kappa index concordance between the clinical diagnostic interview and the CAADID was 0.88. Good concurrent validity was obtained, the CAADID correlated significantly with WURS scale (r=0.522, P<.01), ADHD Rating Scale (r=0.670, P<.01) and CAARS (self-rating version; r=0.656, P<.01 and observer-report r=0.514, P<.01). CAADID is a valid and useful tool for the diagnosis of ADHD in adults for clinical, as well as for research purposes. Copyright © 2012 SEP y SEPB. Published by Elsevier España, S.L. All rights reserved.
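For readers unfamiliar with the diagnostic indices reported in the abstract above, the following Python sketch shows how sensitivity, specificity, predictive values and overall accuracy follow from a 2x2 table against a reference diagnosis. The counts are hypothetical, chosen only for easy arithmetic; they are not the CAADID study data, and the function name diagnostic_indices is an assumption made here.

```python
def diagnostic_indices(tp, fn, fp, tn):
    """Standard indices from a 2x2 table of test result vs. reference diagnosis."""
    return {
        "sensitivity": tp / (tp + fn),            # true positives among the diseased
        "specificity": tn / (tn + fp),            # true negatives among the non-diseased
        "ppv": tp / (tp + fp),                    # positive predictive value
        "npv": tn / (tn + fn),                    # negative predictive value
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
    }

# Hypothetical counts (NOT taken from the study above).
indices = diagnostic_indices(tp=90, fn=10, fp=20, tn=80)

for name, value in indices.items():
    print(f"{name}: {value:.2%}")
```

Note that the predictive values depend on how common the condition is in the sample, so figures from a case-control design do not transfer directly to a population with a different prevalence.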
Versey, Nathan G; Gore, Christopher J; Halson, Shona L; Plowman, Jamie S; Dawson, Brian T
2011-09-01
We determined the validity and reliability of heat flow thermistors, flexible thermocouple probes and general purpose thermistors compared with a calibrated reference thermometer in a stirred water bath. Validity (bias) was defined as the difference between the observed and criterion values, and reliability as the repeatability (standard deviation or typical error) of measurement. Data were logged every 5 s for 10 min at water temperatures of 14, 26 and 38 °C for ten heat flow thermistors and 24 general purpose thermistors, and at 35, 38 and 41 °C for eight flexible thermocouple probes. Statistical analyses were conducted using spreadsheets for validity and reliability, where an acceptable bias was set at ±0.1 °C. None of the heat flow thermistors, 17% of the flexible thermocouple probes and 71% of the general purpose thermistors met the validity criterion for temperature. The inter-probe reliabilities were 0.03 °C for heat flow thermistors, 0.04 °C for flexible thermocouple probes and 0.09 °C for general purpose thermistors. The within trial intra-probe reliability of all three temperature probes was 0.01 °C. The results suggest that these temperature sensors should be calibrated individually before use at relevant temperatures and the raw data corrected using individual linear regression equations.
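The definitions in the abstract above (bias as observed minus criterion, reliability as the standard deviation of repeated readings, and correction through a per-sensor linear regression) can be sketched in Python as follows. All readings are hypothetical, the ±0.1 °C tolerance is the one stated in the abstract, and the variable and function names are assumptions for illustration rather than the authors' spreadsheet method.

```python
from statistics import mean, stdev

def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return my - slope * mx, slope

# Hypothetical readings for one probe at three bath temperatures (deg C).
criterion = [14.0, 26.0, 38.0]   # reference thermometer
observed = [14.2, 26.3, 38.4]    # probe under test

# Validity: mean difference between observed and criterion values.
bias = mean(o - c for o, c in zip(observed, criterion))
within_tolerance = abs(bias) <= 0.1   # acceptable bias of +/-0.1 deg C

# Reliability: standard deviation of repeated readings at one temperature.
repeated = [26.29, 26.30, 26.31]      # hypothetical repeats
repeatability = stdev(repeated)

# Calibration: regress criterion on observed, then correct the raw readings.
intercept, slope = fit_line(observed, criterion)
corrected = [intercept + slope * o for o in observed]

print(f"bias = {bias:.2f}, ok = {within_tolerance}, sd = {repeatability:.3f}")
print("corrected:", [round(c, 2) for c in corrected])
```

Because the hypothetical error grows linearly with temperature, the fitted line removes it completely; with real sensors some residual scatter would remain after correction.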
Appearance motives to tan and not tan: evidence for validity and reliability of a new scale.
Cafri, Guy; Thompson, J Kevin; Roehrig, Megan; Rojas, Ariz; Sperry, Steffanie; Jacobsen, Paul B; Hillhouse, Joel
2008-04-01
Risk for skin cancer is increased by UV exposure and decreased by sun protection. Appearance reasons to tan and not tan have consistently been shown to be related to intentions and behaviors to UV exposure and protection. This study was designed to determine the factor structure of appearance motives to tan and not tan, evaluate the extent to which this factor structure is gender invariant, test for mean differences in the identified factors, and evaluate internal consistency, temporal stability, and criterion-related validity. Five-hundred eighty-nine females and 335 male college students were used to test confirmatory factor analysis models within and across gender groups, estimate latent mean differences, and use the correlation coefficient and Cronbach's alpha to further evaluate the reliability and validity of the identified factors. A measurement invariant (i.e., factor-loading invariant) model was identified with three higher-order factors: sociocultural influences to tan (lower order factors: media, friends, family, significant others), appearance reasons to tan (general, acne, body shape), and appearance reasons not to tan (skin aging, immediate skin damage). Females had significantly higher means than males on all higher-order factors. All subscales had evidence of internal consistency, temporal stability, and criterion-related validity. This study offers a framework and measurement instrument that has evidence of validity and reliability for evaluating appearance-based motives to tan and not tan.
Fung, Christina Hoi Ling; Nguyen, Michelle; Moineddin, Rahim; Colantonio, Angela; Wiseman-Hakes, Catherine
2014-06-01
The Daily Cognitive Communicative and Sleep Profile (DCCASP) is a seven-item instrument that captures daily subjective sleep quality, perceived mood, cognitive, and communication functions. The objective of this study was to evaluate the reliability and validity of the DCCASP. The DCCASP was self-administered daily to a convenience sample of young adults (n = 54) for two two-week blocks, interspersed with a two-week rest period. Afterwards, participants completed the Pittsburgh Sleep Quality Index (PSQI). Internal consistency and criterion validity were calculated by Cronbach's α coefficient, Concordance Correlation Coefficient (CCC), and Spearman rank (rs) correlation coefficient, respectively. Results indicated high internal consistency (Cronbach's α = 0.864-0.938) among mean ratings of sleep quality on the DCCASP. There were significant correlations between mean ratings of sleep quality and all domains (rs=0.38-0.55, p<0.0001). Criterion validity was established between mean sleep quality ratings on the DCCASP and PSQI (rs=0.40, p<0.001). The DCCASP is a reliable and valid self-report instrument to monitor daily sleep quality and perceived mood, cognitive, and communication functions over time, amongst a normative sample of young adults. Further studies on its psychometric properties are necessary to clarify its utility in a clinical population. Copyright © 2014 John Wiley & Sons, Ltd.
Miki, Emi; Yamane, Shingo; Yamaoka, Mai; Fujii, Hiroe; Ueno, Hiroka; Kawahara, Toshie; Tanaka, Keiko; Tamashiro, Hiroaki; Inoue, Eiji; Okamoto, Takatsugu; Kuriyama, Masaru
2016-09-01
The study aim was to investigate the validity and reliability of the Functional Independence Measure and Functional Assessment Measure (FIM + FAM), which is unfamiliar in Japan, by using its Japanese version (FIM + FAM-j) in patients with cerebrovascular accident (CVA). Forty-two CVA patients participated. Criterion validity was examined by correlating the full scale and subscales of FIM + FAM-j with several well-established measurements using Spearman's correlation coefficient. Reliability was evaluated by internal consistency (tested by Cronbach's alpha coefficient) and intra-rater reliability (tested by Kendall's tau correlation coefficient). Good-to-excellent criterion validity was found between the full scale and motor subscales of the FIM + FAM-j and the Barthel Index, National Institutes of Health Stroke Scale, modified Rankin Scale, and lower extremity Brunnstrom Recovery Stage. High internal consistency was observed within the full-scale FIM + FAM-j and the motor and cognitive subscales (Cronbach's alphas were 0.968, 0.954, and 0.948, respectively). Additionally, good intra-rater reliability was observed within the full scale and motor subscales, and excellent reliability for the cognitive subscales (taus were 0.83, 0.80, and 0.98, respectively). This study showed that the FIM + FAM-j demonstrated acceptable levels of validity and reliability when used for CVA as a measure of disability.
Erel, Suat; Şimşek, İbrahim Engin; Özkan, Hüseyin
2015-01-01
The aim of this study was to analyze the validity and reliability of the Turkish version (ICOAP-TR) of the intermittent and constant osteoarthritis pain (ICOAP) questionnaire in patients with knee osteoarthritis (OA). Thirty-eight volunteer patients diagnosed with knee OA answered the questionnaire twice with an interval of 2-4 days. The reliability of the measurement was assessed using Cronbach's alpha coefficient and intraclass correlation (ICC) for test-retest reliability. Criterion validity was tested against the Western Ontario and McMaster Universities Arthritis Index (WOMAC) pain score and visual analog scale (VAS) designed to assess the perceived discomfort rated by the patient. Test-retest reliability was found to be ICC=0.942 for total score, 0.902 for constant pain subscale, and 0.945 for intermittent pain subscale. Internal consistency was tested using Cronbach's alpha and was found to be 0.970 for total score, 0.948 for constant pain subscale, and 0.972 for intermittent pain subscale. For criterion validity, the correlation between the total score of ICOAP-TR and WOMAC pain subscale was r=0.779 (p<0.05), and correlation between total score of ICOAP-TR and VAS was r=0.570 (p<0.05). The ICOAP-TR is a reliable and valid instrument to be used with patients with knee OA.
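Several abstracts in this listing, including the one above, report internal consistency as Cronbach's alpha. As a minimal sketch, alpha can be computed from item-level scores as k/(k-1) times one minus the ratio of summed item variances to the variance of the total score. The item scores below are hypothetical and the function name cronbach_alpha is an assumption; this is the textbook formula, not code from any of the cited studies.

```python
from statistics import variance

def cronbach_alpha(items):
    """Cronbach's alpha from a list of items, each a list of respondent scores."""
    k = len(items)
    # Total score per respondent, summing across items.
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(item) for item in items)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))

# Hypothetical questionnaire: 3 items answered by 4 respondents.
items = [
    [1, 2, 3, 4],
    [2, 3, 4, 5],
    [2, 1, 4, 3],
]

alpha = cronbach_alpha(items)
print(f"Cronbach's alpha = {alpha:.3f}")
```

The sketch uses sample variances throughout; using population variances for both numerator and denominator gives the same ratio and therefore the same alpha.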
Development of Internet-Based Tasks for the Executive Function Performance Test.
Rand, Debbie; Lee Ben-Haim, Keren; Malka, Rachel; Portnoy, Sigal
The Executive Function Performance Test (EFPT) is a reliable and valid performance-based tool to assess executive functions (EFs). This study's objective was to develop and verify two Internet-based tasks for the EFPT. A cross-sectional study assessed the alternate-form reliability of the Internet-based bill-paying and telephone-use tasks in healthy adults and people with subacute stroke (Study 1). It also sought to establish the tasks' criterion validity for assessing EF deficits by correlating performance with that on the Trail Making Test in five groups: healthy young adults, healthy older adults, people with subacute stroke, people with chronic stroke, and young adults with attention deficit hyperactivity disorder (Study 2). The alternate-form reliability and initial construct validity for the Internet-based bill-paying task were verified. Criterion validity was established for both tasks. The Internet-based tasks are comparable to the original EFPT tasks and can be used for assessment of EF deficits. Copyright © 2018 by the American Occupational Therapy Association, Inc.
Ubbiali, Alessandro; Chiorri, Carlo; Donati, Deborah
2011-08-01
The Inventory of Interpersonal Problems-47 (IIP-47) is a brief and valid self-report measure for screening Personality Disorders (PDs). This study examined internal consistency, factor structure, criterion validity, temporal stability, and operating characteristics of the Italian version of the IIP-47 in two independent samples: PD subjects (n = 120) and nonclinical subjects (n = 475). Alpha coefficients ranged from .70 to .90. Multiple-Group Confirmatory Factor Analyses showed that the five-correlated-factor model reported in literature had the highest measurement invariance across the two groups. Criterion validity was supported by correlations among IIP-47 scale scores and scores on established measures of personality dimensions and pathology. Test-retest indices ranged from .71 to .95. PD subjects scored significantly higher than nonclinical subjects on all IIP-47 scales and cut-off scores for different levels of specificity and sensitivity are reported. It is concluded that the psychometric properties of the original IIP-47 were preserved in its Italian version.
Pechorro, Pedro; Maroco, João; Ray, James V; Gonçalves, Rui Abrunhosa; Nunes, Cristina
2018-06-01
Research on narcissism has a long tradition, but there is limited knowledge regarding its application among female youth, especially for forensic samples of incarcerated female youth. Drawing on 377 female adolescents (103 selected from forensic settings and 274 selected from school settings) from Portugal, the current study is the first to examine simultaneously the psychometric properties of a brief version of the Narcissistic Personality Inventory (NPI-13) among females drawn from incarcerated and community settings. The results support the three-factor structure model of narcissism after the removal of one item due to its low factor loading. Internal consistency, convergent validity, and discriminant validity showed promising results. In terms of criterion-related validity, significant associations were found with criterion-related variables such as age of criminal onset, conduct disorder, crime severity, violent crimes, and alcohol and drug use. The findings provide support for use of the NPI-13 among female juveniles.
Lemon, Stephenie C; Rosal, Milagros C; Welch, Garry
2011-11-01
This study assessed the psychometric properties of the Audit of Diabetes-Dependent Quality of Life (ADDQoL) modified for low-income, low-education, Spanish-speaking Puerto Ricans with type 2 diabetes residing in the northeastern United States. Cross-sectional data from 226 patients were analyzed. Scale modifications included simplification of instructions, question wording and response format, and oral administration. Reliability was assessed with Cronbach's alpha coefficient and internal structure by exploratory factor analysis. Criterion validity was assessed using correlation analysis and linear and logistic regression models assessing the association of the ADDQoL with standardized physical health status, mental health status, depression, and comorbidity indices. Two ADDQoL items were dropped. The modified scale had excellent internal consistency and supported the original scale factor structure. Criterion validity results supported the validity of this measure. The modified ADDQoL showed psychometric properties that support its use in low-income, Spanish-speaking Puerto Ricans with type 2 diabetes who reside in mainland U.S.
Cuesta-Vargas, Antonio Ignacio; González-Sánchez, Manuel
2014-10-29
Spanish is one of the five most spoken languages in the world. There is currently no published Spanish version of the Örebro Musculoskeletal Pain Questionnaire (OMPQ). The aim of the present study is to describe the process of translating the OMPQ into Spanish and to perform an analysis of reliability, internal structure, internal consistency and concurrent criterion-related validity. Translation and psychometric testing. Two independent translators translated the OMPQ into Spanish. From both translations a consensus version was achieved. A backward translation was made to verify and resolve any semantic or conceptual problems. A total of 104 patients (67 men/37 women) with a mean age of 53.48 (±11.63), suffering from chronic musculoskeletal disorders, twice completed a Spanish version of the OMPQ. Statistical analysis was performed to evaluate the reliability, the internal structure, internal consistency and concurrent criterion-related validity with reference to the gold standard questionnaire SF-12v2. All variables except "Coping" showed a rate above 0.85 on reliability. The internal structure calculation through exploratory factor analysis indicated that 75.2% of the variance can be explained with six components with an eigenvalue higher than 1 and 52.1% with only three components higher than 10% of variance explained. In the concurrent criterion-related validity, several significant correlations were seen close to 0.6, exceeding that value in the correlation between general health and total value of the OMPQ. The Spanish version of the screening questionnaire OMPQ can be used to identify Spanish patients with musculoskeletal pain at risk of developing a chronic disability.
NASA Astrophysics Data System (ADS)
Putra, Z. A. Z.; Sumarmin, R.; Violita, V.
2018-04-01
The practicum guide used for the Animal Physiology course needs to be revised and adapted to the lecture material. The existing guide is still conventional, with prescriptive instructions, and is so simple that a practicum guide that fosters the development of scientific work is needed. One way to achieve this is a guided-inquiry practicum guide. This study aims to describe the development process of the practicum guide and to determine the validity, practicality, and effectiveness of the guided-inquiry Animal Physiology practicum guide for students of the Biology Department, State University of Padang. This is development research using the Plomp model. The stages performed were problem identification and analysis, prototype development, and assessment. Data were analyzed descriptively. The data collection instruments were validation and practicality questionnaires, observation of affective and psychomotor competence, and a cognitive-domain competence test. The results show that the guided-inquiry Animal Physiology practicum guide scored 3.23 for validity (valid category), 3.30 for practicality as rated by lecturers (practical category), and 3.37 for practicality as rated by students (practical category). In the effectiveness tests, the affective domain scored 93.00% (very effective), the psychomotor domain 89.50% (very effective), and the cognitive domain 67 (pass criterion). It is concluded that the guided-inquiry practicum guide for students is valid, practical, and effective.
Validity, sensitivity and specificity of the mentation, behavior and mood subscale of the UPDRS.
Holroyd, Suzanne; Currie, Lillian J; Wooten, G Frederick
2008-06-01
The unified Parkinson's disease rating scale (UPDRS) is the most widely used tool to rate the severity and the stage of Parkinson's disease (PD). However, the mentation, behavior and mood (MBM) subscale of the UPDRS has received little investigation regarding its validity and sensitivity. Three items of this subscale were compared to criterion tests to examine validity, sensitivity and specificity. Ninety-seven patients with idiopathic PD were assessed on the UPDRS. Scores on three items of the MBM subscale (intellectual impairment, thought disorder and depression) were compared to criterion tests: the telephone interview for cognitive status (TICS), psychiatric assessment for psychosis, and the geriatric depression scale (GDS). Non-parametric tests of association were performed to examine concurrent validity of the MBM items. The sensitivities, specificities and optimal cutoff scores for each MBM item were estimated by receiver operating characteristic (ROC) curve analysis. The MBM items demonstrated low to moderate correlation with the criterion tests, and the sensitivity and specificity were not strong. Even using a score of 7.0 on the items of the MBM demonstrated a sensitivity/specificity of only 0.19/0.48 for intellectual impairment, 0.60/0.72 for thought disorder and 0.61/0.87 for depression. Using a more appropriate cutoff of 2.0 revealed sensitivities of 0.01, 0.38 and 0.13 respectively. The MBM subscale items of intellectual impairment, thought disorder and depression are not appropriate for screening or diagnostic purposes. Tools such as the TICS and the GDS should be considered instead.
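As an illustration of how sensitivity and specificity at a given cutoff are obtained, the following is a minimal sketch, assuming that a score at or above the cutoff counts as a positive screen. The function name and the toy data are hypothetical and are not taken from the study.

```python
def sensitivity_specificity(scores, has_condition, cutoff):
    """Return (sensitivity, specificity) when score >= cutoff is a positive screen.

    scores        -- item scores, one per patient (illustrative data only)
    has_condition -- booleans from the criterion test, same order as scores
    """
    tp = fn = tn = fp = 0
    for score, condition in zip(scores, has_condition):
        screen_positive = score >= cutoff
        if condition and screen_positive:
            tp += 1  # true positive
        elif condition:
            fn += 1  # false negative
        elif screen_positive:
            fp += 1  # false positive
        else:
            tn += 1  # true negative
    return tp / (tp + fn), tn / (tn + fp)


# Toy data: raising the cutoff trades sensitivity for specificity.
toy_scores = [0, 1, 2, 3, 4, 1]
toy_truth = [False, False, True, True, True, True]
print(sensitivity_specificity(toy_scores, toy_truth, cutoff=1))
print(sensitivity_specificity(toy_scores, toy_truth, cutoff=2))
```

Repeating the calculation over every possible cutoff gives the points of a ROC curve.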
Lundin, Andreas; Hallgren, Mats; Balliu, Natalja; Forsell, Yvonne
2015-01-01
The alcohol use disorders identification test (AUDIT) and AUDIT-Consumption (AUDIT-C) are commonly used in population surveys, but there are few validation studies in the general population. Validity should be estimated in samples close to the targeted population and setting. This study aims to validate AUDIT and AUDIT-C in a general population sample (PART) in Stockholm, Sweden. We used a general population subsample aged 20 to 64 that answered a postal questionnaire including AUDIT and later participated in a psychiatric interview (n = 1,093). Interviews using the Schedules for Clinical Assessment in Neuropsychiatry were used as the criterion standard. Diagnoses were set according to the fourth version of the Diagnostic and Statistical Manual of Mental Disorders (DSM-IV). Agreement between the diagnostic test and criterion standard was measured with the area under the receiver operating characteristic curve (AUC). A total of 1,086 (450 men and 636 women) of the interview participants completed AUDIT. There were 96 individuals with DSM-IV alcohol dependence, 36 with DSM-IV alcohol abuse, and 153 risk drinkers. AUCs were: DSM-IV alcohol use disorder 0.90 (AUDIT-C 0.85); DSM-IV dependence 0.94 (AUDIT-C 0.89); risk drinking 0.80 (AUDIT-C 0.80); and any criterion 0.87 (AUDIT-C 0.84). In this general population sample, AUDIT and AUDIT-C showed outstanding or excellent performance in identifying dependence, alcohol use disorder, risk drinking, and any criterion. Copyright © 2015 by the Research Society on Alcoholism.
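The AUC used above can be read as the probability that a randomly chosen person with the diagnosis scores higher on the screening test than a randomly chosen person without it, with ties counted as one half. A minimal sketch of that pairwise definition follows; the function name and the toy scores are hypothetical and do not come from the study.

```python
def auc_pairwise(scores_cases, scores_noncases):
    """Area under the ROC curve via pairwise comparison of cases and non-cases.

    Each (case, non-case) pair contributes 1 if the case scores higher,
    0.5 if the scores are tied, and 0 otherwise.
    """
    if not scores_cases or not scores_noncases:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for case in scores_cases:
        for noncase in scores_noncases:
            if case > noncase:
                wins += 1.0
            elif case == noncase:
                wins += 0.5
    return wins / (len(scores_cases) * len(scores_noncases))


# Toy screening scores (illustrative only).
print(auc_pairwise([8, 12, 15], [2, 8, 5]))
```

The quadratic loop is fine for a sketch; for large samples a rank-based computation gives the same value more efficiently.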
Criterion Related Validity of Karate Specific Aerobic Test (KSAT).
Chaabene, Helmi; Hachana, Younes; Franchini, Emerson; Tabben, Montassar; Mkaouer, Bessem; Negra, Yassine; Hammami, Mehrez; Chamari, Karim
2015-09-01
Karate is one of the most popular combat sports in the world. Physical fitness assessment on a regular basis is important for monitoring the effectiveness of the training program and the readiness of karatekas to compete. The aim of this research was to examine the criterion-related validity of the karate specific aerobic test (KSAT) as an indicator of the aerobic level of karate practitioners. Cardiorespiratory responses, aerobic performance level through both a treadmill laboratory test and the YoYo intermittent recovery test level 1 (YoYoIRTL1), as well as time to exhaustion in the KSAT (TE'KSAT), were determined in a total of fifteen healthy international karatekas (i.e. karate practitioners) (means ± SD: age: 22.2 ± 4.3 years; height: 176.4 ± 7.5 cm; body mass: 70.3 ± 9.7 kg and body fat: 13.2 ± 6%). Peak heart rate obtained from KSAT represented ~99% of maximal heart rate registered during the treadmill test, showing that KSAT imposes high physiological demands. There was no significant correlation between KSAT's TE and relative (mL/min kg) treadmill maximal oxygen uptake (r = 0.14; P = 0.69; [small]). On the other hand, there was a significant relationship between KSAT's TE and the velocity associated with VO2max (vVO2max) (r = 0.67; P = 0.03; [large]) as well as the velocity at VO2 corresponding to the second ventilatory threshold (vVO2 VAT) (r = 0.64; P = 0.04; [large]). Moreover, a significant relationship was found between KSAT's TE and both the total distance covered and parameters of intermittent endurance measured through YoYoIRTL1. The KSAT has not proved to have indirect criterion-related validity, as no significant correlations have been found between KSAT's TE and treadmill VO2max. Nevertheless, as it correlated with other aerobic fitness variables, KSAT can be considered an indicator of karate specific endurance. The establishment of the criterion-related validity of the KSAT requires further investigation.
Davison, Kirsten K.; Austin, S. Bryn; Giles, Catherine; Cradock, Angie L.; Lee, Rebekka M.; Gortmaker, Steven L.
2017-01-01
Interest in evaluating and improving children’s diets in afterschool settings has grown, necessitating the development of feasible yet valid measures for capturing children’s intake in such settings. This study’s purpose was to test the criterion validity and cost of three unobtrusive visual estimation methods compared to a plate-weighing method: direct on-site observation using a 4-category rating scale and off-site rating of digital photographs taken on-site using 4- and 10-category scales. Participants were 111 children in grades 1–6 attending four afterschool programs in Boston, MA in December 2011. Researchers observed and photographed 174 total snack meals consumed across two days at each program. Visual estimates of consumption were compared to weighed estimates (the criterion measure) using intra-class correlations. All three methods were highly correlated with the criterion measure, ranging from 0.92–0.94 for total calories consumed, 0.86–0.94 for consumption of pre-packaged beverages, 0.90–0.93 for consumption of fruits/vegetables, and 0.92–0.96 for consumption of grains. For water, which was not pre-portioned, coefficients ranged from 0.47–0.52. The photographic methods also demonstrated excellent inter-rater reliability: 0.84–0.92 for the 4-point and 0.92–0.95 for the 10-point scale. The costs of the methods for estimating intake ranged from $0.62 per observation for the on-site direct visual method to $0.95 per observation for the criterion measure. This study demonstrates that feasible, inexpensive methods can validly and reliably measure children’s dietary intake in afterschool settings. Improving precision in measures of children’s dietary intake can reduce the likelihood of spurious or null findings in future studies. PMID:25596895
Castillo-Tandazo, Wilson; Flores-Fortty, Adolfo; Feraud, Lourdes; Tettamanti, Daniel
2013-01-01
Purpose: To translate, cross-culturally adapt, and validate the Questionnaire for Diabetes-Related Foot Disease (Q-DFD), originally created and validated in Australia, for its use in Spanish-speaking patients with diabetes mellitus. Patients and methods: The translation and cross-cultural adaptation were based on international guidelines. The Spanish version of the survey was applied to a community-based (sample A) and a hospital clinic-based sample (samples B and C). Samples A and B were used to determine criterion and construct validity comparing the survey findings with clinical evaluation and medical records, respectively; while sample C was used to determine intra- and inter-rater reliability. Results: After completing the rigorous translation process, only four items were considered problematic and required a new translation. In total, 127 patients were included in the validation study: 76 to determine criterion and construct validity and 41 to establish intra- and inter-rater reliability. For an overall diagnosis of diabetes-related foot disease, a substantial level of agreement was obtained when we compared the Q-DFD with the clinical assessment (kappa 0.77, sensitivity 80.4%, specificity 91.5%, positive likelihood ratio [LR+] 9.46, negative likelihood ratio [LR−] 0.21); while an almost perfect level of agreement was obtained when it was compared with medical records (kappa 0.88, sensitivity 87%, specificity 97%, LR+ 29.0, LR− 0.13). Survey reliability showed substantial levels of agreement, with kappa scores of 0.63 and 0.73 for intra- and inter-rater reliability, respectively. Conclusion: The translated and cross-culturally adapted Q-DFD showed good psychometric properties (validity, reproducibility, and reliability) that allow its use in Spanish-speaking diabetic populations. PMID:24039434
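The likelihood ratios reported above follow directly from sensitivity and specificity: LR+ = sensitivity / (1 − specificity) and LR− = (1 − sensitivity) / specificity. The short sketch below (function name hypothetical) recomputes them from the percentages given in the abstract as a consistency check.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a binary test, with inputs given as proportions."""
    if not (0.0 <= sensitivity <= 1.0 and 0.0 < specificity < 1.0):
        raise ValueError("sensitivity must be in [0, 1] and specificity in (0, 1)")
    lr_positive = sensitivity / (1.0 - specificity)
    lr_negative = (1.0 - sensitivity) / specificity
    return lr_positive, lr_negative


# Values quoted in the abstract: clinical assessment, then medical records.
for label, sens, spec in [("clinical assessment", 0.804, 0.915),
                          ("medical records", 0.87, 0.97)]:
    lr_pos, lr_neg = likelihood_ratios(sens, spec)
    print(f"{label}: LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")
```

Rounded to two decimals, the results agree with the published figures (9.46 and 0.21; 29.0 and 0.13).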
Harris, Joshua D; Erickson, Brandon J; Cvetanovich, Gregory L; Abrams, Geoffrey D; McCormick, Frank M; Gupta, Anil K; Verma, Nikhil N; Bach, Bernard R; Cole, Brian J
2014-02-01
Condition-specific questionnaires are important components in evaluation of outcomes of surgical interventions. No condition-specific study methodological quality questionnaire exists for evaluation of outcomes of articular cartilage surgery in the knee. To develop a reliable and valid knee articular cartilage-specific study methodological quality questionnaire. Cross-sectional study. A stepwise, a priori-designed framework was created for development of a novel questionnaire. Relevant items to the topic were identified and extracted from a recent systematic review of 194 investigations of knee articular cartilage surgery. In addition, relevant items from existing generic study methodological quality questionnaires were identified. Items for a preliminary questionnaire were generated. Redundant and irrelevant items were eliminated, and acceptable items modified. The instrument was pretested and items weighed. The instrument, the MARK score (Methodological quality of ARticular cartilage studies of the Knee), was tested for validity (criterion validity) and reliability (inter- and intraobserver). A 19-item, 3-domain MARK score was developed. The 100-point scale score demonstrated face validity (focus group of 8 orthopaedic surgeons) and criterion validity (strong correlation to Cochrane Quality Assessment score and Modified Coleman Methodology Score). Interobserver reliability for the overall score was good (intraclass correlation coefficient [ICC], 0.842), and for all individual items of the MARK score, acceptable to perfect (ICC, 0.70-1.000). Intraobserver reliability ICC assessed over a 3-week interval was strong for 2 reviewers (≥0.90). The MARK score is a valid and reliable knee articular cartilage condition-specific study methodological quality instrument. This condition-specific questionnaire may be used to evaluate the quality of studies reporting outcomes of articular cartilage surgery in the knee.
Psychometric validation of a condom self-efficacy scale in Korean.
Cha, EunSeok; Kim, Kevin H; Burke, Lora E
2008-01-01
When an instrument is translated for use in cross-cultural research, it needs to account for cultural factors without distorting the psychometric properties of the instrument. To validate the psychometric properties of the condom self-efficacy scale (CSE) originally developed for American adolescents and young adults after translating the scale to Korean (CSE-K) to determine its suitability for cross-cultural research among Korean college students. A cross-sectional, correlational design was used with an exploratory survey methodology through self-report questionnaires. A convenience sample of 351 students, aged 18 to 25 years, were recruited at a university in Seoul, Korea. The participants completed the CSE-K and the intention of condom use scales after they were translated from English to Korean using a combined translation technique. A demographic and sex history questionnaire, which included an item to assess actual condom usage, was also administered. Mean, variance, reliability, criterion validity, and factorial validity using confirmatory factor analysis were assessed in the CSE-K. Norms for the CSE-K were similar, but not identical, to norms for the English version. The means of all three subscales were lower for the CSE-K than for the original CSE; however, the obtained variance in CSE-K was roughly similar with the original CSE. The Cronbach's alpha coefficient for the total scale was higher for the CSE-K (.91) than that for either the CSE (.85) or CSE in Thai (.85). Criterion validity and construct validity of the CSE-K were confirmed. The CSE-K was a reliable and valid scale in measuring condom self-efficacy among Korean college students. The findings suggest that the CSE was an appropriate instrument to conduct cross-cultural research on sexual behavior in adolescents and young adults.
Van, Connie; Costa, Daniel; Mitchell, Bernadette; Abbott, Penny; Krass, Ines
2012-01-01
Existing validated measures of pharmacist-physician collaboration focus on measuring attitudes toward collaboration and do not measure frequency of collaborative interactions. To develop and validate an instrument to measure the frequency of collaboration between pharmacists and general practitioners (GPs) from the pharmacist's perspective. An 11-item Pharmacist Frequency of Interprofessional Collaboration Instrument (FICI-P) was developed and administered to 586 pharmacists in 8 divisions of general practice in New South Wales, Australia. The initial items were informed by a review of the literature in addition to interviews of pharmacists and GPs. Items were subjected to principal component and Rasch analyses to determine each item's and the overall measure's psychometric properties and for any needed refinements. Two hundred and twenty four (38%) of pharmacist surveys were completed and returned. Principal component analysis suggested removal of 1 item for a final 1-factor solution. The refined 10-item FICI-P demonstrated internal consistency reliability at Cronbach's alpha=0.90. After collapsing the original 5-point response scale to a 4-point response scale, the refined FICI-P demonstrated fit to the Rasch model. Criterion validity of the FICI-P was supported by the correlation of FICI-P scores with scores on a previously validated Physician-Pharmacist Collaboration Instrument. Validity was also supported by predicted differences in FICI-P scores between subgroups of respondents stratified on age, colocation with GPs, and interactions during the intern-training period. The refined 10-item FICI-P was shown to have good internal consistency, criterion validity, and fit to the Rasch model. The creation of such a tool may allow for the measure of impact in the evaluation of interventions designed to improve interprofessional collaboration between GPs and pharmacists. Copyright © 2012 Elsevier Inc. All rights reserved.
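For readers unfamiliar with the internal consistency statistic reported above, Cronbach's alpha for a k-item scale is k/(k − 1) × (1 − sum of item variances / variance of the total score). A minimal sketch using only the standard library follows; the function name and the toy responses are hypothetical and unrelated to the FICI-P data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    Uses sample variances (n - 1 denominator) for items and for total scores.
    """
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    items = list(zip(*responses))  # one tuple of scores per item
    sum_item_variances = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1.0 - sum_item_variances / total_variance)


# Toy responses: three respondents, two items (illustrative only).
print(cronbach_alpha([[1, 2], [2, 1], [3, 3]]))
```

When all items rise and fall together across respondents, alpha approaches 1; weakly related items pull it toward 0.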
Saadatpour, Leila; Hemati, Simin; Habibi, Farzaneh; Behzadi, Erfan; Hashemi-Jazi, Marsa Sadat; Kheirabadi, Gholamreza; Mirbagher, Leila; Gholamrezaei, Ali
2015-09-01
Various symptoms frequently affect cancer patients' quality of life. Appropriate assessment of these symptoms provides valuable data for cancer management. This study aimed to validate the Persian version of the M. D. Anderson Symptom Inventory (MDASI-P). This cross-sectional study was conducted at four cancer treatment centers in two cities in Iran. Breast cancer and colorectal cancer patients aged 18 years and older were consecutively included in the study. The standard forward-backward translation method was applied. Patients completed the MDASI-P along with the previously validated Persian version of the European Organization for Research and Treatment of Cancer Quality of Life Questionnaire-Core 30 (EORTC QLQ-C30). Construct validity (factor analysis), criterion validity (against the EORTC QLQ-C30), and reliability (Cronbach's alpha) were analyzed. A total of 146 breast cancer and 94 colorectal cancer patients were studied. Factor analysis for the symptom severity items resulted in a three-factor solution, further reduced to a two-factor solution: general symptoms and gastrointestinal symptoms. Correlation of the MDASI-P symptom severity items with corresponding EORTC QLQ-C30 symptom items (r = 0.48-0.75) and MDASI-P interference items with corresponding EORTC QLQ-C30 functioning domains (r = -0.46 to -0.23) supported the criterion validity. Cronbach's alpha was 0.90, 0.88, and 0.77 for the total questionnaire, symptom severity items, and the interference subscale, respectively. The MDASI-P is a feasible, valid, and reliable instrument for evaluation of symptoms in Persian-speaking cancer patients and can be used to improve symptom management in these patients. Copyright © 2015 American Academy of Hospice and Palliative Medicine. Published by Elsevier Inc. All rights reserved.
Gagné, Myriam; Boulet, Louis-Philippe; Pérez, Norma; Moisan, Jocelyne
2018-04-30
To systematically identify the measurement properties of patient-reported outcome instruments (PROs) that evaluate adherence to inhaled maintenance medication in adults with asthma. We conducted a systematic review of six databases. Two reviewers independently included studies on the measurement properties of PROs that evaluated adherence in asthmatic participants aged ≥18 years. Based on the COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN), the reviewers (1) extracted data on internal consistency, reliability, measurement error, content validity, structural validity, hypotheses testing, cross-cultural validity, criterion validity, and responsiveness; (2) assessed the methodological quality of the included studies; (3) assessed the quality of the measurement properties (positive or negative); and (4) summarised the level of evidence (limited, moderate, or strong). We screened 6,068 records and included 15 studies (14 PROs). No studies evaluated measurement error or responsiveness. Based on methodological and measurement property quality assessments, we found limited positive evidence of: (a) internal consistency of the Adherence Questionnaire, Refined Medication Adherence Reason Scale (MAR-Scale), Medication Adherence Report Scale for Asthma (MARS-A), and Test of the Adherence to Inhalers (TAI); (b) reliability of the TAI; and (c) structural validity of the Adherence Questionnaire, MAR-Scale, MARS-A, and TAI. We also found limited negative evidence of: (d) hypotheses testing of Adherence Questionnaire; (e) reliability of the MARS-A; and (f) criterion validity of the MARS-A and TAI. Our results highlighted the need to conduct further high-quality studies that will positively evaluate the reliability, validity, and responsiveness of the available PROs. This article is protected by copyright. All rights reserved.
2017-01-01
Objective: To perform a translation and cross-cultural adaptation of the Cardiac Rehabilitation Barriers Scale (CRBS) for use in Korea, followed by psychometric validation. The CRBS was developed to assess patients' perception of the degree to which patient, provider and health system-level barriers affect their cardiac rehabilitation (CR) participation. Methods: The CRBS consists of 21 items (barriers to adherence) rated on a 5-point Likert scale. The first phase was to translate and cross-culturally adapt the CRBS to the Korean language. After back-translation, both versions were reviewed by a committee. The face validity was assessed through semi-structured interviews in a sample of Korean patients (n=53) with a history of acute myocardial infarction who did not participate in CR. The second phase was to assess the construct and criterion validity of the Korean translation as well as internal reliability, through administration of the translated version in 104 patients, principal component analysis with varimax rotation and cross-referencing against CR use, respectively. Results: The length, readability, and clarity of the questionnaire were rated well, demonstrating face validity. Analysis revealed a six-factor solution, demonstrating construct validity. Cronbach's alpha was greater than 0.65. Barriers rated highest included not knowing about CR and not being contacted by a program. The mean CRBS score was significantly higher among non-attendees (2.71±0.26) than CR attendees (2.51±0.18) (p<0.01). Conclusion: The Korean version of CRBS has demonstrated face, content and criterion validity, suggesting it may be useful for assessing barriers to CR utilization in Korea. PMID:29201826
Boubouchairopoulou, N; Kollias, A; Chiu, B; Chen, B; Lagou, S; Anestis, P; Stergiou, G S
2017-07-01
A pocket-size cuffless electronic device for self-measurement of blood pressure (BP) has been developed (Freescan, Maisense Inc., Zhubei, Taiwan). The device estimates BP within 10 s using three embedded electrodes and one force sensor that is applied over the radial pulse to evaluate the pulse wave. Before use, basic anthropometric characteristics are recorded on the device, and individualized initial calibration is required based on a standard BP measurement performed using an upper-arm BP monitor. The device performance in providing valid BP readings was evaluated in 313 normotensive and hypertensive adults in three study phases during which the device sensor was upgraded. A formal validation study of a prototype device against mercury sphygmomanometer was performed according to the American National Standards Institute/Association for the Advancement of Medical Instrumentation/International Organization for Standardization (ANSI/AAMI/ISO) 2013 protocol. The test device succeeded in obtaining a valid BP measurement (three successful readings within up to five attempts) in 55-72% of the participants, which reached 87% with device sensor upgrade. For the validation study, 125 adults were recruited and 85 met the protocol requirements for inclusion. The mean device-observers BP difference was 3.2±6.7 (s.d.) mm Hg for systolic and 2.6±4.6 mm Hg for diastolic BP (criterion 1). The estimated s.d. (inter-subject variability) were 5.83 and 4.17 mm Hg respectively (criterion 2). These data suggest that this prototype cuffless BP monitor provides valid self-measurements in the vast majority of adults, and satisfies the BP measurement accuracy criteria of the ANSI/AAMI/ISO 2013 validation protocol.
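As a rough illustration of criterion 1 mentioned above, the sketch below checks a mean device-observer difference and its standard deviation against limits of 5 mm Hg and 8 mm Hg. Those thresholds are the commonly cited limits for this criterion and are stated here as an assumption, since the abstract does not quote them; the function name and parameters are hypothetical. Criterion 2 depends on a lookup table in the protocol and is not reproduced.

```python
def meets_criterion_1(mean_difference, sd_difference,
                      max_abs_mean=5.0, max_sd=8.0):
    """Check a mean device-observer BP difference and its SD (both in mm Hg).

    The default limits (5 and 8 mm Hg) are assumed, commonly cited thresholds;
    consult the protocol itself before relying on them.
    """
    return abs(mean_difference) <= max_abs_mean and sd_difference <= max_sd


# Summary statistics quoted in the abstract.
print("systolic :", meets_criterion_1(3.2, 6.7))
print("diastolic:", meets_criterion_1(2.6, 4.6))
```

Under these assumed limits, both the systolic and diastolic summaries reported in the abstract fall within bounds, which is consistent with the authors' conclusion.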
Hernández-Padilla, José M; Granero-Molina, José; Márquez-Hernández, Verónica V; Suthers, Fiona; López-Entrambasaguas, Olga M; Fernández-Sola, Cayetano
2017-06-01
Rapid and accurate interpretation of cardiac arrhythmias by nurses has been linked with safe practice and positive patient outcomes. Although training in electrocardiogram rhythm recognition is part of most undergraduate nursing programmes, research continues to suggest that nurses and nursing students lack competence in recognising cardiac rhythms. In order to promote patient safety, nursing educators must develop valid and reliable assessment tools that allow the rigorous assessment of this competence before nursing students are allowed to practise without supervision. The aim of this study was to develop and psychometrically evaluate a toolkit to holistically assess competence in electrocardiogram rhythm recognition. Following a convenience sampling technique, 293 nursing students from a nursing faculty in a Spanish university were recruited for the study. The following three instruments were developed and psychometrically tested: an electrocardiogram knowledge assessment tool (ECG-KAT), an electrocardiogram skills assessment tool (ECG-SAT) and an electrocardiogram self-efficacy assessment tool (ECG-SES). Reliability and validity (content, criterion and construct) of these tools were meticulously examined. A high Cronbach's alpha coefficient demonstrated the excellent reliability of the instruments (ECG-KAT=0.89; ECG-SAT=0.93; ECG-SES=0.98). An excellent content validity index (scales' average content validity index>0.94) and very good criterion validity were evidenced for all the tools. Regarding construct validity, principal component analysis revealed that all items comprising the instruments contributed to measure knowledge, skills or self-efficacy in electrocardiogram rhythm recognition. Moreover, known-groups analysis showed the tools' ability to detect expected differences in competence between groups with different training experiences. The three-instrument toolkit developed showed excellent psychometric properties for measuring competence in electrocardiogram rhythm recognition.
Constantine, Melissa L; Pauls, Rachel N; Rogers, Rebecca R; Rockwood, Todd H
2017-12-01
The Prolapse/Incontinence Sexual Questionnaire-International Urogynecology Association (IUGA) Revised (PISQ-IR) measures sexual function in women with pelvic floor disorders (PFDs) yet is unwieldy, with six individual subscale scores for sexually active women and four for women who are not. We hypothesized that a valid and responsive summary score could be created for the PISQ-IR. Item response data from participating women who completed a revised version of the PISQ-IR at three clinical sites were used to generate item weights using magnitude estimation (ME) and Q-sort (Q) approaches. Item weights were applied to data from the original PISQ-IR validation to generate summary scores. Correlation and factor analysis methods were used to evaluate validity and responsiveness of summary scores. Weighted and nonweighted summary scores for the sexually active PISQ-IR demonstrated good criterion validity with condition-specific measures (Incontinence Severity Index = 0.12, 0.11, 0.11; Pelvic Floor Distress Inventory-20 = 0.39, 0.39, 0.12; Epidemiology of Prolapse and Incontinence Questionnaire-Q35 = 0.26, 0.25, 0.40; Female Sexual Functioning Index subscale total score = 0.72, 0.75, 0.72 for nonweighted, ME, and Q summary scores, respectively). Responsiveness evaluation showed weighted and nonweighted summary scores detected moderate effect sizes (Cohen's d > 0.5). Weighted items for those not sexually active (NSA) demonstrated significant floor effects and did not meet criterion validity. A PISQ-IR summary score for use with sexually active women, nonweighted or calculated with ME or Q item weights, is a valid and reliable measure for clinical use. The summary scores provide value for assessing clinical treatment of pelvic floor disorders.
Maćkiewicz, Marta; Cieciuch, Jan
2016-01-01
In order to adjust personality measurements to children's developmental level, we constructed the Pictorial Personality Traits Questionnaire for Children (PPTQ-C). To validate the measure, we conducted a study with a total group of 1028 children aged between 7 and 13 years old. Structural validity was established through Exploratory Structural Equation Model (ESEM). Criterion validity was confirmed with a multitrait-multimethod analysis for which we introduced the children's self-assessment scores from the Big Five Questionnaire for Children. Despite some problems with reliability, one can conclude that the PPTQ-C can be a valid instrument for measuring personality traits, particularly in a group of young children (aged ~7-10 years).
Development and validation of the Alcohol Myopia Scale.
Lac, Andrew; Berger, Dale E
2013-09-01
Alcohol myopia theory conceptualizes the ability of alcohol to narrow attention and how this demand on mental resources produces the impairments of self-inflation, relief, and excess. The current research was designed to develop and validate a scale based on this framework. People who were alcohol users rated items representing myopic experiences arising from drinking episodes in the past month. In Study 1 (N = 260), the preliminary 3-factor structure was supported by exploratory factor analysis. In Study 2 (N = 289), the 3-factor structure was substantiated with confirmatory factor analysis, and it was superior in fit to an empirically indefensible 1-factor structure. The final 14-item scale was evaluated with internal consistency reliability, discriminant validity, convergent validity, criterion validity, and incremental validity. The alcohol myopia scale (AMS) illuminates conceptual underpinnings of this theory and yields insights for understanding the tunnel vision that arises from intoxication.
ERIC Educational Resources Information Center
Meredith, Keith E.; Sabers, Darrell L.
Data required for evaluating a Criterion Referenced Measurement (CRM) is described with a matrix. The information within the matrix consists of the "pass-fail" decisions of two CRMs. By differentially defining these two CRMs, different concepts of reliability and validity can be examined. Indices suggested for analyzing the matrix are listed with…
The Development of a Criterion Instrument for Counselor Selection.
ERIC Educational Resources Information Center
Remer, Rory; Sease, William
A measure of potential performance as a counselor is needed as an adjunct to the information presently employed in selection decisions. This article deals with one possible method of development of such a potential performance criterion and the steps taken, to date, in the attempt to validate it. It includes: the overall effectiveness of the…
ERIC Educational Resources Information Center
Tibbetts, Katherine A.; And Others
This paper describes the development of a criterion-referenced, performance-based measure of third grade reading comprehension. The primary purpose of the assessment is to contribute unique and valid information for use in the formative evaluation of a whole literacy program. A secondary purpose is to supplement other program efforts to…
ERIC Educational Resources Information Center
Shields, Ann; Cicchetti, Dante
1997-01-01
Two studies examined psychometric properties of a new criterion Q-sort for children's emotion regulation and autonomy. Multitrait-multimethod matrix and factor analyses indicated impressive convergence among the emotion regulation Q-scale and established affect regulation measures. The new scale was not discriminable from measures of related…
A comparison of two patient classification instruments in an acute care hospital.
Seago, Jean Ann
2002-05-01
Patient classification systems are alternately praised and vilified by staff nurses, nurse managers, and nurse executives. Most nurses agree that substantial resources are used to create or find, implement, manage, and maintain the systems, and that the predictive ability of the instruments is intermittent. The purpose of this study is to compare the predictive validity of two types of patient classification instruments commonly used in acute care hospitals in California. Acute care hospitals in California are required by both the Joint Commission on Accreditation of Healthcare Organizations and California Title 22 to have a reliable and valid patient classification system (PCS). The two general types of systems commonly used are the summative task type PCS and the critical incident or criterion type PCS. There is little to assist nurse executives in deciding which type of PCS to choose. There is modest research demonstrating the validity and reliability of different PCSs but no published data comparing the predictive validity of the different types of systems. The unit of analysis is one patient shift called the study shift. The study shift is defined as the first day shift after the patient has been in the hospital for a full 24 hours. Data were collected using medical record review only. Both types, criterion and summative, of PCS data collection instruments were completed for all patients at both collection points. Each patient had a before and after score for each type of instrument. Three hundred forty-nine medical records for inpatients meeting the inclusion criteria were examined. The average patient age was 76 years, the average length of stay was 6.6 days with an average of 6.7 secondary diagnoses recorded. Fifty-five percent of the sample was female and the most common primary diagnosis was CHF, followed by COPD, CVA, and pneumonia. 
There was a difference in mean summative predictor score and the mean summative actual score of 1.57 points with the predictor score higher (P =.001; CI =.62--2.5). For the criterion instrument, 68.4% of the predictor criterion scores were in category 2 compared to 65.5% of the actual criterion scores. The criterion predictor agreed with the criterion actual score 45% of the time for category 1 patients, 87.3% of the time for category 2 patients, 77.1% of the time for category 3 patients and 72.7% of the time for category 4 patients, with an overall agreement between predictor and actual criterion scores of 79.9% (Kappa P <.001, indicating agreement is not by chance). The most significant finding of this study is that there are virtually no differences in the predictive ability of summative versus criterion patient classification instruments. Using the same patients, both types of instruments predicted the actual score over 78% of the time.
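The agreement statistic reported in the abstract above can be illustrated with a minimal sketch of unweighted Cohen's kappa, computed from a square table of predicted versus actual categories. The function name and the counts are hypothetical and are not taken from the study; the abstract does not say which kappa variant was used, so the unweighted form is an assumption.

```python
def cohens_kappa(table):
    """Unweighted Cohen's kappa for a square agreement table.

    table[i][j] = number of cases placed in category i by the first
    rating (e.g. predictor score) and category j by the second
    rating (e.g. actual score).
    """
    k = len(table)
    total = sum(sum(row) for row in table)
    # Observed agreement: proportion of cases on the diagonal.
    observed = sum(table[i][i] for i in range(k)) / total
    # Chance agreement: product of the marginal proportions, summed over categories.
    row_marg = [sum(table[i]) / total for i in range(k)]
    col_marg = [sum(table[i][j] for i in range(k)) / total for j in range(k)]
    expected = sum(row_marg[i] * col_marg[i] for i in range(k))
    # Undefined (division by zero) when chance agreement is exactly 1.
    return (observed - expected) / (1 - expected)


# Hypothetical 2x2 counts, for illustration only.
example_table = [[20, 5],
                 [10, 15]]
kappa = cohens_kappa(example_table)
```

Kappa discounts the agreement expected by chance from the category marginals, which is why a raw agreement percentage and kappa can differ substantially.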
ERIC Educational Resources Information Center
de Bildt, Annelies; Mulder, Erik J.; Hoekstra, Pieter J.; van Lang, Natasja D. J.; Minderaa, Ruud B.; Hartman, Catharina A.
2009-01-01
The Children's Social Behavior Questionnaire (CSBQ) was compared with the Autism Diagnostic Interview-Revised (ADI-R), Autism Diagnostic Observation Schedule (ADOS), and clinical classification in children with mild and moderate intellectual disability (ID), to investigate its criterion related validity. The contribution of the CSBQ to a…
Reliability and Validity of a New Physical Activity Self-Report Measure for Younger Children
ERIC Educational Resources Information Center
Belton, Sarahjane; Mac Donncha, Ciaran
2010-01-01
The purpose of this study was to assess the test-retest reliability and validity of a new Youth Physical Activity Self-Report measure. Heart rate and direct observation were employed as criterion measures with a sample of 79 children (aged 7-9 years). Spearman's rho correlation between self reported activity intensity and heart rate was 0.87 for…
A Criterion-Related Validation Study of the Army Core Leader Competency Model
2007-04-01
2004). Transformational and transactional leadership: A meta-analytic test of their relative validity. Journal of Applied Psychology , 89, 755- 768...performance criteria in an attempt to adjust ratings for this influence. Leader survey materials were developed and pilot tested at Ft. Drum and Ft... psychological constructs in the behavioral science realm. Numerous theories, popular literature, websites, assessments, and competency models are
Validity of Alternative Cut-Off Scores for the Back-Saver Sit and Reach Test
ERIC Educational Resources Information Center
Looney, Marilyn A.; Gilbert, Jennie
2012-01-01
The purpose of the study was to determine if currently used FITNESSGRAM[R] cut-off scores for the Back Saver Sit and Reach Test had the best criterion-referenced validity evidence for 6-12 year old children. Secondary analyses of an existing data set focused on the passive straight leg raise and Back Saver Sit and Reach Test flexibility scores of…
A Controlled Evaluation of the Distress Criterion for Binge Eating Disorder
Grilo, Carlos M.; White, Marney A.
2012-01-01
Objective Research has examined various aspects of the validity of the research criteria for binge eating disorder (BED) but has yet to evaluate the utility of criterion C “marked distress about binge eating.” This study examined the significance of the marked distress criterion for BED using two complementary comparison groups. Method A total of 1075 community volunteers completed a battery of self-report instruments as part of an internet study. Analyses compared body mass index (BMI), eating-disorder psychopathology, and depressive levels in four groups: 97 participants with BED except for the distress criterion (BED-ND), 221 participants with BED including the distress criterion (BED), 79 participants with bulimia nervosa (BN), and 489 obese participants without binge-eating or purging (NBPO). Parallel analyses compared these study groups using the broadened frequency criterion (i.e., once-weekly for binge/purge behaviors) proposed for DSM-5 and the DSM-IV twice-weekly frequency criterion. Results The BED group had significantly greater eating-disorder psychopathology and depressive levels than the BED-ND group. The BED group, but not the BED-ND group, had significantly greater eating-disorder psychopathology than the NBPO comparison group. The BN group had significantly greater eating-disorder psychopathology and depressive levels than all three other groups. The group differences existed even after controlling for depression levels, BMI, and demographic variables, although some differences between the BN and BED groups were attenuated when controlling for depression levels. Conclusions These findings provide support for the validity of the “marked distress” criterion for the diagnosis of BED. PMID:21707133
Validation of the Chinese Version of the Quality of Nursing Work Life Scale
Fu, Xia; Xu, Jiajia; Song, Li; Li, Hua; Wang, Jing; Wu, Xiaohua; Hu, Yani; Wei, Lijun; Gao, Lingling; Wang, Qiyi; Lin, Zhanyi; Huang, Huigen
2015-01-01
Quality of Nursing Work Life (QNWL) serves as a predictor of a nurse’s intent to leave and hospital nurse turnover. However, QNWL measurement tools that have been validated for use in China are lacking. The present study evaluated the construct validity of the QNWL scale in China. A cross-sectional study using convenience sampling was conducted from June 2012 to January 2013 at five hospitals in Guangzhou, which employ 1938 nurses. The participants were asked to complete the QNWL scale and the World Health Organization Quality of Life abbreviated version (WHOQOL-BREF). A total of 1922 nurses provided the final data used for analyses. Sixty-five nurses from the first investigated division were re-measured two weeks later to assess the test-retest reliability of the scale. The internal consistency reliability of the QNWL scale was assessed using Cronbach’s α. Test-retest reliability was assessed using the intraclass correlation coefficient (ICC). Criterion-related validity was assessed using the correlation of the total scores of the QNWL and the WHOQOL-BREF. Construct validity was assessed with the following indices: χ2 statistics and degrees of freedom; root mean square error of approximation (RMSEA); the Akaike information criterion (AIC); the consistent Akaike information criterion (CAIC); the goodness-of-fit index (GFI); the adjusted goodness-of-fit index; and the comparative fit index (CFI). The findings demonstrated high internal consistency (Cronbach’s α = 0.912) and test-retest reliability (intraclass correlation coefficient = 0.74) for the QNWL scale. The chi-square test (χ2 = 13879.60, df [degrees of freedom] = 813, P = 0.0001) was significant. The RMSEA value was 0.091, and AIC = 1806.00, CAIC = 7730.69, CFI = 0.93, and GFI = 0.74. The correlation coefficient between the QNWL total scores and the WHOQOL-BREF total scores was 0.605 (p<0.01). 
The QNWL scale was reliable and valid in Chinese-speaking nurses and could be used as a clinical and research instrument for measuring work-related factors among nurses in China. PMID:25950838
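Internal consistency of the kind reported above is commonly summarized with Cronbach's α, defined as α = k/(k−1) · (1 − Σ item variances / variance of the total score) for k items. The following is a minimal sketch of that formula using only the Python standard library; the function name and the response matrix are hypothetical and unrelated to the QNWL data.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a respondents-by-items score matrix.

    scores: list of respondents, each a list of k item scores.
    Uses sample variances (n - 1 denominator) throughout.
    """
    k = len(scores[0])
    # Variance of each item across respondents.
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    # Variance of the summed (total) score across respondents.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical responses: 4 respondents, 3 items.
responses = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [5, 4, 5],
]
alpha = cronbach_alpha(responses)
```

Alpha rises when items covary strongly relative to their individual variances, so it reflects both inter-item correlation and the number of items.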
Vuillerot, Carole; Meilleur, Katherine G.; Jain, Minal; Waite, Melissa; Wu, Tianxia; Linton, Melody; Datsgir, Jahannaz; Donkervoort, Sandra; Leach, Meganne E.; Rutkowski, Anne; Rippert, Pascal; Payan, Christine; Iwaz, Jean; Hamroun, Dalil; Bérard, Carole; Poirot, Isabelle; Bönnemann, Carsten G.
2016-01-01
Objective To develop and validate an English version of the Neuromuscular (NM)-Score, a classification for patients with NM diseases in each of the 3 motor function domains: D1, standing and transfers; D2, axial and proximal motor function; and D3, distal motor function. Design Validation survey. Setting Patients seen at a medical research center between June and September 2013. Participants Consecutive patients (N = 42) aged 5 to 19 years with a confirmed or suspected diagnosis of congenital muscular dystrophy. Interventions Not applicable. Main Outcome Measures An English version of the NM-Score was developed by a 9-person expert panel that assessed its content validity and semantic equivalence. Its concurrent validity was tested against criterion standards (Brooke Scale, Motor Function Measure [MFM], activity limitations for patients with upper and/or lower limb impairments [ACTIVLIM], Jebsen Test, and myometry measurements). Informant agreement between patient/caregiver (P/C)-reported and medical doctor (MD)-reported NM scores was measured by weighted kappa. Results Significant correlation coefficients were found between NM scores and criterion standards. The highest correlations were found between NM-score D1 and MFM score D1 (ρ = −.944, P<.0001), ACTIVLIM (ρ = −.895, P<.0001), and hip abduction strength by myometry (ρ = −.811, P<.0001). Informant agreement between P/C-reported and MD-reported NM scores was high for D1 (κ = .801; 95% confidence interval [CI], .701–.914) but moderate for D2 (κ = .592; 95% CI, .412–.773) and D3 (κ = .485; 95% CI, .290–.680). Correlation coefficients between the NM scores and the criterion standards did not significantly differ between P/C-reported and MD-reported NM scores. Conclusions Patients and physicians completed the English NM-Score easily and accurately. 
The English version is a reliable and valid instrument that can be used in clinical practice and research to describe the functional abilities of patients with NM diseases. PMID:24862765
Development of an opioid-related Overdose Risk Behavior Scale (ORBS).
Pouget, Enrique R; Bennett, Alex S; Elliott, Luther; Wolfson-Stofko, Brett; Almeñana, Ramona; Britton, Peter C; Rosenblum, Andrew
2017-01-01
Drug overdose has emerged as the leading cause of injury-related death in the United States, driven by prescription opioid (PO) misuse, polysubstance use, and use of heroin. To better understand opioid-related overdose risks that may change over time and across populations, there is a need for a more comprehensive assessment of related risk behaviors. Drawing on existing research, formative interviews, and discussions with community and scientific advisors, an opioid-related Overdose Risk Behavior Scale (ORBS) was developed. Military veterans reporting any use of heroin or POs in the past month were enrolled using venue-based and chain referral recruitment. The final scale consisted of 25 items grouped into 5 subscales eliciting the number of days in the past 30 during which the participant engaged in each behavior. Internal reliability, test-retest reliability and criterion validity were assessed using Cronbach's alpha, intraclass correlations (ICC) and Pearson's correlations with indicators of having overdosed during the past 30 days, respectively. Data for 220 veterans were analyzed. The 5 subscales-(A) Adherence to Opioid Dosage and Therapeutic Purposes; (B) Alternative Methods of Opioid Administration; (C) Solitary Opioid Use; (D) Use of Nonprescribed Overdose-associated Drugs; and (E) Concurrent Use of POs, Other Psychoactive Drugs and Alcohol-generally showed good internal reliability (alpha range = 0.61 to 0.88), test-retest reliability (ICC range = 0.81 to 0.90), and criterion validity (r range = 0.22 to 0.66). The subscales were internally consistent with each other (alpha = 0.84). The scale mean had an ICC value of 0.99, and correlations with validators ranged from 0.44 to 0.56. 
These results constitute preliminary evidence for the reliability and validity of the new scale. If further validated, it could help improve overdose prevention and response research and could help improve the precision of overdose education and prevention efforts.
Lee, Rebekka M; Emmons, Karen M; Okechukwu, Cassandra A; Barrett, Jessica L; Kenney, Erica L; Cradock, Angie L; Giles, Catherine M; deBlois, Madeleine E; Gortmaker, Steven L
2014-11-28
Nutrition and physical activity interventions have been effective in creating environmental changes in afterschool programs. However, accurate assessment can be time-consuming and expensive as initiatives are scaled up for optimal population impact. This study aims to determine the criterion validity of a simple, low-cost, practitioner-administered observational measure of afterschool physical activity, nutrition, and screen time practices and child behaviors. Directors from 35 programs in three cities completed the Out-of-School Nutrition and Physical Activity Observational Practice Assessment Tool (OSNAP-OPAT) on five days. Trained observers recorded snacks served and obtained accelerometer data each day during the same week. Observations of physical activity participation and snack consumption were conducted on two days. Correlations were calculated to validate weekly average estimates from OSNAP-OPAT compared to criterion measures. Weekly criterion averages are based on 175 meals served, snack consumption of 528 children, and physical activity levels of 356 children. OSNAP-OPAT validly assessed serving water (r = 0.73), fruits and vegetables (r = 0.84), juice >4oz (r = 0.56), and grains (r = 0.60) at snack; sugary drinks (r = 0.70) and foods (r = 0.68) from outside the program; and children's water consumption (r = 0.56) (all p <0.05). Reports of physical activity time offered were correlated with accelerometer estimates (minutes of moderate and vigorous physical activity r = 0.59, p = 0.02; vigorous physical activity r = 0.63, p = 0.01). The reported proportion of children participating in moderate and vigorous physical activity was correlated with observations (r = 0.48, p = 0.03), as were reports of computer (r = 0.85) and TV/movie (r = 0.68) time compared to direct observations (both p < 0.01). OSNAP-OPAT can assist researchers and practitioners in validly assessing nutrition and physical activity environments and behaviors in afterschool settings. 
Phase 1 of this measure validation was conducted during a study registered at clinicaltrials.gov NCT01396473.
Zin, Faridah Mohd; Hillaluddin, Azlin Hilma; Mustaffa, Jamaludin
2017-01-01
Objective: This study aims to develop, validate and determine the reliability of an interactive multimedia strategy to prevent tobacco use among the young (TUPY-S) from an adolescents’ perspective. Methods: A descriptive study design was utilized. A modular instruction guideline by Russel (1974) was followed in the entire process, comprising a feasibility study, a review of existing modules, specification of the objectives, identification of the construct criterion items, learner analysis and entry behavior specification, establishment of the sequence instruction and media selection, a tryout with students and a field test. Results: Feasibility was agreed among the researchers and the school authorities. Culturally suitable, rigorously developed tobacco use preventive strategies delivered using information technology (IT) are lacking in the literature. The objective of TUPY-S is to prevent tobacco use among adolescents living in Malaysia. Identified construct criterion items include knowledge, attitude, intention to use, self-efficacy, and refusal skill. The target population was early adolescents belonging to generation-Z. Content was developed from the adolescents’ perspective and delivered using IT in Malay language. Content validity, assessed by six experts in the field and module development, was good at 86%. The students’ tryout showed satisfactory face validity subjectively and objectively (85.5%) and high Cronbach's alpha reliability (0.91). Conclusion: TUPY-S was confirmed to suit early adolescents of the current generation living in Malaysia. It demonstrated good content validity among the experts, satisfactory face validity and reliability among the target population. TUPY-S is ready to be evaluated for its effectiveness among early adolescents. PMID:28612599
Bajada, Stefan; Mohanty, Khitish
2016-06-01
The Majeed scoring system is a disease-specific outcome measure that was originally designed to assess pelvic injuries. The aim of this study was to determine the psychometric properties of the Majeed scoring system for chronic sacroiliac joint pain. Internal consistency, content validity, criterion validity, construct validity and responsiveness to change were assessed prospectively for the Majeed scoring system in a cohort of 60 patients diagnosed with sacroiliac joint pain. This diagnosis was confirmed with CT-guided sacroiliac joint anaesthetic block. The overall Majeed score showed acceptable internal consistency (Cronbach alpha = 0.63). Similarly, it showed acceptable floor (0 %) and ceiling (0 %) effects. On the other hand, the domains of pain, work, sitting and sexual intercourse had high (>30 %) floor effects. Significant correlation with the physical component of the Short Form-36 (p = 0.005) and Oswestry disability index (p ≤ 0.001) was found, indicating acceptable criterion validity. The overall Majeed score showed acceptable construct validity with all five developed hypotheses showing significance (p ≤ 0.05). The overall Majeed score showed acceptable responsiveness to change with a large (≥0.80) effect size and standardized response mean. Overall the Majeed scoring system demonstrated acceptable psychometric properties for outcome assessment in chronic sacroiliac joint pain. Thus, its use in this condition is adequate. However, some domains demonstrated suboptimal performance indicating that improvement might be achieved with the development of an outcome measure specific for sacroiliac joint dysfunction and degeneration.
Validity and reliability of the Brazilian version of the Work Ability Index questionnaire.
Martinez, Maria Carmen; Latorre, Maria do Rosário Dias de Oliveira; Fischer, Frida Marina
2009-06-01
To evaluate the validity and reliability of the Portuguese language version of a work ability index. Cross sectional survey of a sample of 475 workers from an electrical company in the state of Sao Paulo, Southeastern Brazil (spread across ten municipalities in the Campinas area), carried out in 2005. The following aspects of the Brazilian version of the Work Ability Index were evaluated: construct validity, using factorial exploratory analysis, and discriminant capacity, by comparing mean Work Ability Index scores in two groups with different absenteeism levels; criterion validity, by determining the correlation between self-reported health and Work Ability Index score; and reliability, using Cronbach's alpha to determine the internal consistency of the questionnaire. Factorial analysis indicated three factors in the work ability construct: issues pertaining to 'mental resources' (20.6% of the variance), self-perceived work ability (18.9% of the variance), and presence of diseases and health-related limitations (18.4% of the variance). The index was capable of discriminating workers according to levels of absenteeism, identifying a significantly lower (p<0.0001) mean score among subjects with high absenteeism (37.2 points) when compared to those with low absenteeism (42.3 points). Criterion validity analysis showed a correlation between the index and all dimensions of health status analyzed (p<0.0001). Reliability of the index was high, with a Cronbach's alpha of 0.72. The Brazilian version of the Work Ability Index showed satisfactory psychometric properties with respect to construct validity, thus constituting an appropriate option for evaluating work ability in both individual and population-based settings.
Toward a Measure of Accountability in Nursing: A Three-Stage Validation Study.
Drach-Zahavy, Anat; Leonenko, Marina; Srulovici, Einav
2018-06-04
To develop and psychometrically evaluate a three-dimensional questionnaire suitable for evaluating personal and organizational accountability in nurses. Accountability is defined as a three-dimensional value, directing professionals to take responsibility for their decisions and actions, to be willing to explain them (transparency) and to be judged according to society's accepted values (answerability). Despite the relatively clear definition, measurement of accountability lags well behind. Existing self-report questionnaires do not fully capture the complexity of the concept; nor do they capture the different sources of accountability (e.g., personal accountability, organizational accountability). A three-stage measure development. Data were collected during 2015-2016. In Phase 1, an initial database of items (N = 74) was developed, based on literature review and qualitative study, establishing face and content validity. In Phase 2, the face, content, construct and criterion-related validity of the initial questionnaires (19 items for personal and organizational accountability questionnaire) was established with a sample of 229 nurses. In Phase 3, the final questionnaires (19 items each) were validated with a new sample of 329 nurses and established construct validity. The final version of the instruments comprised 19 items, suitable for assessing personal and organizational accountability. The questionnaire referred to the dimensions of responsibility, transparency and answerability. The findings established the instrument's content, construct and criterion-related validity, as well as good internal reliability. The questionnaire portrays accountability in nursing, by capturing nurses' subjective perceptions of accountability dimensions (responsibility, transparency, answerability), as demonstrated by personal and organizational values. This article is protected by copyright. All rights reserved.
Van Iddekinge, Chad H; Roth, Philip L; Putka, Dan J; Lanivich, Stephen E
2011-11-01
A common belief among researchers is that vocational interests have limited value for personnel selection. However, no comprehensive quantitative summaries of interests validity research have been conducted to substantiate claims for or against the use of interests. To help address this gap, we conducted a meta-analysis of relations between interests and employee performance and turnover using data from 74 studies and 141 independent samples. Overall validity estimates (corrected for measurement error in the criterion but not for range restriction) for single interest scales were .14 for job performance, .26 for training performance, -.19 for turnover intentions, and -.15 for actual turnover. Several factors appeared to moderate interest-criterion relations. For example, validity estimates were larger when interests were theoretically relevant to the work performed in the target job. The type of interest scale also moderated validity, such that corrected validities were larger for scales designed to assess interests relevant to a particular job or vocation (e.g., .23 for job performance) than for scales designed to assess a single, job-relevant realistic, investigative, artistic, social, enterprising, or conventional (i.e., RIASEC) interest (.10) or a basic interest (.11). Finally, validity estimates were largest when studies used multiple interests for prediction, either by using a single job or vocation focused scale (which tend to tap multiple interests) or by using a regression-weighted composite of several RIASEC or basic interest scales. Overall, the results suggest that vocational interests may hold more promise for predicting employee performance and turnover than researchers may have thought. (c) 2011 APA, all rights reserved.
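The abstract above reports validities "corrected for measurement error in the criterion". A common way to make such a correction is the classical disattenuation formula applied to the criterion only, r_corrected = r_observed / sqrt(r_yy), where r_yy is the reliability of the criterion measure. The sketch below assumes that formula; the abstract does not state the exact procedure the authors used, and the function name and input values are hypothetical.

```python
import math


def correct_for_criterion_unreliability(r_observed, criterion_reliability):
    """Disattenuate an observed validity coefficient for criterion unreliability.

    r_observed: observed predictor-criterion correlation.
    criterion_reliability: reliability estimate (r_yy) of the criterion,
        in the interval (0, 1].
    Predictor unreliability and range restriction are left uncorrected.
    """
    if not 0 < criterion_reliability <= 1:
        raise ValueError("criterion reliability must be in (0, 1]")
    return r_observed / math.sqrt(criterion_reliability)


# Hypothetical inputs, for illustration only.
corrected = correct_for_criterion_unreliability(0.30, 0.36)
```

Because the divisor is at most 1, the corrected coefficient is never smaller in magnitude than the observed one, and the adjustment grows as criterion reliability falls.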
The Servant Leadership Survey: Development and Validation of a Multidimensional Measure.
van Dierendonck, Dirk; Nuijten, Inge
2011-09-01
PURPOSE: The purpose of this paper is to describe the development and validation of a multi-dimensional instrument to measure servant leadership. DESIGN/METHODOLOGY/APPROACH: Based on an extensive literature review and expert judgment, 99 items were formulated. In three steps, using eight samples totaling 1571 persons from The Netherlands and the UK with a diverse occupational background, a combined exploratory and confirmatory factor analysis approach was used. This was followed by an analysis of the criterion-related validity. FINDINGS: The final result is an eight-dimensional measure of 30 items: the eight dimensions being: standing back, forgiveness, courage, empowerment, accountability, authenticity, humility, and stewardship. The internal consistency of the subscales is good. The results show that the Servant Leadership Survey (SLS) has convergent validity with other leadership measures, and also adds unique elements to the leadership field. Evidence for criterion-related validity came from studies relating the eight dimensions to well-being and performance. IMPLICATIONS: With this survey, a valid and reliable instrument to measure the essential elements of servant leadership has been introduced. ORIGINALITY/VALUE: The SLS is the first measure where the underlying factor structure was developed and confirmed across several field studies in two countries. It can be used in future studies to test the underlying premises of servant leadership theory. The SLS provides a clear picture of the key servant leadership qualities and shows where improvements can be made on the individual and organizational level; as such, it may also offer a valuable starting point for training and leadership development.
NASA Astrophysics Data System (ADS)
Wang, Cong; Shang, De-Guang; Wang, Xiao-Wei
2015-02-01
An improved high-cycle multiaxial fatigue criterion based on the critical plane was proposed in this paper. The critical plane was defined as the plane of maximum shear stress (MSS) in the proposed multiaxial fatigue criterion, which is different from the traditional critical plane based on the MSS amplitude. The proposed criterion was extended as a fatigue life prediction model that can be applicable for ductile and brittle materials. The fatigue life prediction model based on the proposed high-cycle multiaxial fatigue criterion was validated with experimental results obtained from the test of 7075-T651 aluminum alloy and some references.
Debast, Inge; Rossi, Gina; van Alphen, S P J
2018-04-01
The alternative model for personality disorders in the fifth edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5) is considered an important step toward a possibly better conceptualization of personality pathology in older adulthood, by the introduction of levels of personality functioning (Criterion A) and trait dimensions (Criterion B). Our main aim was to examine age-neutrality of the Short Form of the Severity Indices of Personality Problems (SIPP-SF; Criterion A) and Personality Inventory for DSM-5-Brief Form (PID-5-BF; Criterion B). Differential item functioning (DIF) analyses and more specifically the impact on scale level through differential test functioning (DTF) analyses made clear that the SIPP-SF was more age-neutral (6% DIF, only one of four domains showed DTF) than the PID-5-BF (25% DIF, all four tested domains had DTF) in a community sample of older and younger adults. Age differences in convergent validity also point in the direction of differences in underlying constructs. Concurrent and criterion validity in geriatric psychiatry inpatients suggest that both the SIPP-SF scales measuring levels of personality functioning (especially self-functioning) and the PID-5-BF might be useful screening measures in older adults despite age-neutrality not being confirmed.
Cross-cultural validity of a dietary questionnaire for studies of dental caries risk in Japanese.
Shinga-Ishihara, Chikako; Nakai, Yukie; Milgrom, Peter; Murakami, Kaori; Matsumoto-Nakano, Michiyo
2014-01-02
Diet is a major modifiable contributing factor in the etiology of dental caries. The purpose of this paper is to examine the reliability and cross-cultural validity of the Japanese version of the Food Frequency Questionnaire to assess dietary intake in relation to dental caries risk in Japanese. The 38-item Food Frequency Questionnaire, in which Japanese food items were added to increase content validity, was translated into Japanese, and administered to two samples. The first sample comprised 355 pregnant women with mean age of 29.2 ± 4.2 years for the internal consistency and criterion validity analyses. Factor analysis (principal components with Varimax rotation) was used to determine dimensionality. The dietary cariogenicity score was calculated from the Food Frequency Questionnaire and used for the analyses. Salivary mutans streptococci level was used as a semi-quantitative assessment of dental caries risk and measured by Dentocult SM. Dentocult SM scores were compared with the dietary cariogenicity score computed from the Food Frequency Questionnaire to examine criterion validity, and assessed by Spearman's correlation coefficient (rs) and Kruskal-Wallis test. Test-retest reliability of the Food Frequency Questionnaire was assessed with a second sample of 25 adults with mean age of 34.0 ± 3.0 years by using the intraclass correlation coefficient analysis. The Japanese language version of the Food Frequency Questionnaire showed high test-retest reliability (ICC = 0.70) and good criterion validity assessed by relationship with salivary mutans streptococci levels (rs = 0.22; p < 0.001). Factor analysis revealed four subscales that construct the questionnaire (solid sugars, solid and starchy sugars, liquid and semisolid sugars, sticky and slowly dissolving sugars). Internal consistency was low to acceptable (Cronbach's alpha = 0.67 for the total scale, 0.46-0.61 for each subscale).
Mean dietary cariogenicity scores were 50.8 ± 19.5 in the first sample, 47.4 ± 14.1, and 40.6 ± 11.3 for the first and second administrations in the second sample. The distribution of Dentocult SM score was 6.8% (score = 0), 34.4% (score = 1), 39.4% (score = 2), and 19.4% (score = 3). Participants with higher scores were more likely to have higher dietary cariogenicity scores (p < 0.001; Kruskal-Wallis test). These results provide the preliminary evidence for the reliability and validity of the Japanese language Food Frequency Questionnaire.
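The internal consistency figures reported above are Cronbach's alpha values. As a rough illustration of how such a coefficient is computed, here is a minimal sketch using only the Python standard library. The function name and the toy data are hypothetical and are not taken from the study.

```python
from statistics import variance

def cronbach_alpha(items):
    """Cronbach's alpha for a list of item columns.

    items: list of k lists, each holding one item's scores for the
    same respondents in the same order (hypothetical layout).
    """
    k = len(items)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Total score per respondent = sum across items.
    totals = [sum(scores) for scores in zip(*items)]
    # alpha = k/(k-1) * (1 - sum of item variances / variance of totals)
    item_var_sum = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_var_sum / variance(totals))
```

Two perfectly consistent items give an alpha of 1, and two unrelated items give an alpha near 0, which is a quick sanity check on the formula.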
Breakdown parameter for kinetic modeling of multiscale gas flows.
Meng, Jianping; Dongari, Nishanth; Reese, Jason M; Zhang, Yonghao
2014-06-01
Multiscale methods built purely on the kinetic theory of gases provide information about the molecular velocity distribution function. It is therefore both important and feasible to establish new breakdown parameters for assessing the appropriateness of a fluid description at the continuum level by utilizing kinetic information rather than macroscopic flow quantities alone. We propose a new kinetic criterion to indirectly assess the errors introduced by a continuum-level description of the gas flow. The analysis, which includes numerical demonstrations, focuses on the validity of the Navier-Stokes-Fourier equations and corresponding kinetic models and reveals that the new criterion can consistently indicate the validity of continuum-level modeling in both low-speed and high-speed flows at different Knudsen numbers.
Phillips, Tasha R; Sellbom, Martin; Ben-Porath, Yossef S; Patrick, Christopher J
2014-02-01
Replicating and extending research by Sellbom et al. (M. Sellbom, Y. S. Ben-Porath, C. J. Patrick, D. B. Wygant, D. M. Gartland, & K. P. Stafford, 2012, Development and Construct Validation of the MMPI-2-RF Measures of Global Psychopathy, Fearless-Dominance, and Impulsive-Antisociality, Personality Disorders: Theory, Research, and Treatment, 3, 17-38), the current study examined the criterion-related validity of three self-report indices of psychopathy that were derived from scores on the Minnesota Multiphasic Personality Inventory (MMPI)-2-Restructured Form (MMPI-2-RF; Y. S. Ben-Porath & A. Tellegen, 2008, Minnesota Multiphasic Personality Inventory-2-Restructured Form: Manual for Administration, Scoring, and Interpretation, Minneapolis, MN: University of Minnesota Press). We estimated psychopathy indices by regressing scores from the Psychopathic Personality Inventory (PPI; S. O. Lilienfeld & B. P. Andrews, 1996, Development and Preliminary Validation of a Self-Report Measure of Psychopathic Personality Traits in Noncriminal Populations, Journal of Personality Assessment, 66, 488-524) and its two distinct facets, Fearless-Dominance and Impulsive-Antisociality, onto conceptually selected MMPI-2-RF scales. Data for a newly collected sample of 230 incarcerated women were combined with existing data from Sellbom et al.'s (2012) male correctional and mixed-gender college samples to establish regression equations with optimal generalizability. Correlation and regression analyses were then used to examine associations between the MMPI-2-RF-based estimates of PPI psychopathy and criterion measures (i.e., other well-established measures of psychopathy and conceptually related personality traits), and to evaluate whether gender moderated these associations. The MMPI-2-RF-based psychopathy indices correlated as expected with criterion measures and showed only one significant moderating effect for gender, namely, in the association between psychopathy and narcissism. 
These results provide further support for the validity of the MMPI-2-RF-based estimates of PPI psychopathy, and encourage their use in research and clinical contexts.
Goode, N; Salmon, P M; Taylor, N Z; Lenné, M G; Finch, C F
2017-10-01
One factor potentially limiting the uptake of Rasmussen's (1997) Accimap method by practitioners is the lack of a contributing factor classification scheme to guide accident analyses. This article evaluates the intra- and inter-rater reliability and criterion-referenced validity of a classification scheme developed to support the use of Accimap by led outdoor activity (LOA) practitioners. The classification scheme has two levels: the system level describes the actors, artefacts and activity context in terms of 14 codes; the descriptor level breaks the system level codes down into 107 specific contributing factors. The study involved 11 LOA practitioners using the scheme on two separate occasions to code a pre-determined list of contributing factors identified from four incident reports. Criterion-referenced validity was assessed by comparing the codes selected by LOA practitioners to those selected by the method creators. Mean intra-rater reliability scores at the system (M = 83.6%) and descriptor (M = 74%) levels were acceptable. Mean inter-rater reliability scores were not consistently acceptable for both coding attempts at the system level (M T1 = 68.8%; M T2 = 73.9%), and were poor at the descriptor level (M T1 = 58.5%; M T2 = 64.1%). Mean criterion referenced validity scores at the system level were acceptable (M T1 = 73.9%; M T2 = 75.3%). However, they were not consistently acceptable at the descriptor level (M T1 = 67.6%; M T2 = 70.8%). Overall, the results indicate that the classification scheme does not currently satisfy reliability and validity requirements, and that further work is required. The implications for the design and development of contributing factors classification schemes are discussed. Copyright © 2017 Elsevier Ltd. All rights reserved.
[Validation of a scale to assess the labour quality of life in public hospitals from Tlaxcala].
Hernández-Vicente, Irma Alejandra; Lumbreras-Guzmán, Marivel; Méndez-Hernández, Pablo; Rojas-Lima, Elodia; Cervantes-Rodríguez, Margarita; Juárez-Flores, Clara Arlina
2017-01-01
To validate a scale for assessing the labour quality of life in public hospitals (LQL-PH) from Tlaxcala, Mexico. The instrument was validated among 669 health workers from six hospitals from the Ministry of Health of Tlaxcala, Mexico. Content validity was assessed by consulting experts, construct validity by factor analysis, criterion validity by comparison with other scales, and reliability with Cronbach's alpha. The factor analysis uncovered four dimensions: "individual welfare", "conditions and labour environment", "organization", and "well-being accomplished by the work"; reliability was 0.921. Workers who perceived better LQL-PH were those under 50 years old, with a temporary contract, with less seniority in the job, with a daytime weekend work schedule, and with an academic degree. The LQL-PH proved to be a psychometrically valid and reliable instrument. It is advisable to test this scale in other public and private health institutions, as well as its relationship with key health care indicators of labour performance and management.
Dowd, Kieran P.; Harrington, Deirdre M.; Donnelly, Alan E.
2012-01-01
Background The activPAL has been identified as an accurate and reliable measure of sedentary behaviour. However, only limited information is available on the accuracy of the activPAL activity count function as a measure of physical activity, while no unit calibration of the activPAL has been completed to date. This study aimed to investigate the criterion validity of the activPAL, examine the concurrent validity of the activPAL, and perform and validate a value calibration of the activPAL in an adolescent female population. The performance of the activPAL in estimating posture was also compared with sedentary thresholds used with the ActiGraph accelerometer. Methodologies Thirty adolescent females (15 developmental; 15 cross-validation) aged 15–18 years performed 5 activities while wearing the activPAL, ActiGraph GT3X, and the Cosmed K4B2. A random coefficient statistics model examined the relationship between metabolic equivalent (MET) values and activPAL counts. Receiver operating characteristic analysis was used to determine activity thresholds and for cross-validation. The random coefficient statistics model showed a concordance correlation coefficient of 0.93 (standard error of the estimate = 1.13). An optimal moderate threshold of 2997 was determined using mixed regression, while an optimal vigorous threshold of 8229 was determined using receiver operating statistics. The activPAL count function demonstrated very high concurrent validity (r = 0.96, p<0.01) with the ActiGraph count function. Levels of agreement for sitting, standing, and stepping between direct observation and the activPAL and ActiGraph were 100%, 98.1%, 99.2% and 100%, 0%, 100%, respectively. Conclusions These findings suggest that the activPAL is a valid, objective measurement tool that can be used for both the measurement of physical activity and sedentary behaviours in an adolescent female population. PMID:23094069
The psychometric properties of the Portuguese version of the Personality Inventory for DSM-5.
Pires, Rute; Sousa Ferreira, Ana; Guedes, David
2017-10-01
The DSM-5 Section III proposes a hybrid dimensional-categorical model of conceptualizing personality and its disorders that includes assessment of impairments in personality functioning (criterion A) and maladaptive personality traits (criterion B). The Personality Inventory for the DSM-5 is a new dimensional tool, composed of 220 items organized into 25 facets that delineate five higher order domains of clinically relevant personality differences, and was developed to operationalize the DSM-5 model of pathological personality traits. The current studies address the internal consistency (study 1), the test-retest reliability (study 2) and the criterion validity (studies 3 and 4) of the Portuguese version of the PID-5 in samples of native speaking psychology students. Results indicated good internal consistency reliabilities and good temporal stability reliabilities for the majority of the PID-5 traits. The correlational pattern of the PID-5 traits with two measures of personality was in accordance with theoretical expectations and showed its concurrent validity. © 2017 Scandinavian Psychological Associations and John Wiley & Sons Ltd.
Quantifying Human Movement Using the Movn Smartphone App: Validation and Field Study
2017-01-01
Background The use of embedded smartphone sensors offers opportunities to measure physical activity (PA) and human movement. Big data—which includes billions of digital traces—offers scientists a new lens to examine PA in fine-grained detail and allows us to track people’s geocoded movement patterns to determine their interaction with the environment. Objective The objective of this study was to examine the validity of the Movn smartphone app (Moving Analytics) for collecting PA and human movement data. Methods The criterion and convergent validity of the Movn smartphone app for estimating energy expenditure (EE) were assessed in both laboratory and free-living settings, compared with indirect calorimetry (criterion reference) and a stand-alone accelerometer that is commonly used in PA research (GT1m, ActiGraph Corp, convergent reference). A supporting cross-validation study assessed the consistency of activity data when collected across different smartphone devices. Global positioning system (GPS) and accelerometer data were integrated with geographical information software to demonstrate the feasibility of geospatial analysis of human movement. Results A total of 21 participants contributed to linear regression analysis to estimate EE from Movn activity counts (standard error of estimation [SEE]=1.94 kcal/min). The equation was cross-validated in an independent sample (N=42, SEE=1.10 kcal/min). During laboratory-based treadmill exercise, EE from Movn was comparable to calorimetry (bias=0.36 [−0.07 to 0.78] kcal/min, t82=1.66, P=.10) but overestimated as compared with the ActiGraph accelerometer (bias=0.93 [0.58-1.29] kcal/min, t89=5.27, P<.001). The absolute magnitude of criterion biases increased as a function of locomotive speed (F1,4=7.54, P<.001) but was relatively consistent for the convergent comparison (F1,4=1.26, P<.29). 
Furthermore, 95% limits of agreement were consistent for criterion and convergent biases, and EE from Movn was strongly correlated with both reference measures (criterion r=.91, convergent r=.92, both P<.001). Movn overestimated EE during free-living activities (bias=1.00 [0.98-1.02] kcal/min, t6123=101.49, P<.001), and biases were larger during high-intensity activities (F3,6120=1550.51, P<.001). In addition, 95% limits of agreement for convergent biases were heterogeneous across free-living activity intensity levels, but Movn and ActiGraph measures were strongly correlated (r=.87, P<.001). Integration of GPS and accelerometer data within a geographic information system (GIS) enabled creation of individual temporospatial maps. Conclusions The Movn smartphone app can provide valid passive measurement of EE and can enrich these data with contextualizing temporospatial information. Although enhanced understanding of geographic and temporal variation in human movement patterns could inform intervention development, it also presents challenges for data processing and analytics. PMID:28818819
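The bias and 95% limits of agreement reported above are the usual outputs of a Bland-Altman style comparison between a test measure and a reference. The following is a minimal sketch of that calculation, assuming paired readings and approximately normal differences. The function name and sample values are hypothetical and not from the study.

```python
from statistics import mean, stdev

def bland_altman(test, reference):
    """Return (bias, lower limit, upper limit) for paired measurements.

    bias is the mean of (test - reference); the 95% limits of agreement
    are bias +/- 1.96 * SD of the differences.
    """
    diffs = [t - r for t, r in zip(test, reference)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd
```

A positive bias means the test measure reads higher than the reference on average, which is the sense in which an app is said to overestimate energy expenditure.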
A structured interview for the DSM-III personality disorders. A preliminary report.
Stangl, D; Pfohl, B; Zimmerman, M; Bowers, W; Corenthal, C
1985-06-01
With few exceptions, published studies fail to indicate that the DSM-III personality disorders can be distinguished from each other with respect to etiology, prognosis, treatment response, or family history. The Structured Interview for the DSM-III Personality Disorders (SIDP) was developed to improve axis II diagnostic reliability, and hence allow validity testing of axis II. Sixty-three subjects were independently rated by two interviewers using the SIDP. The kappa coefficients for interrater agreement reached .70 or higher for histrionic, borderline, and dependent personalities. While it is impossible to separate the validity testing of the SIDP from validity testing of the DSM-III personality criteria themselves, preliminary results from 102 inpatient SIDP interviews suggest some criterion-based validity with respect to standard personality rating scales and some construct validity with respect to the dexamethasone suppression test.
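The interrater agreement above is reported as kappa coefficients. The abstract does not say which variant was used; the sketch below assumes Cohen's kappa for two raters assigning categorical diagnoses, with a hypothetical function name and toy ratings.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters rating the same subjects."""
    n = len(rater_a)
    # Observed proportion of subjects on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / n ** 2
    # Undefined (division by zero) when expected agreement equals 1.
    return (observed - expected) / (1 - expected)
```

Kappa discounts the agreement two raters would reach by chance, which is why it is lower than raw percent agreement for the same ratings.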
The reliability and validity of a sexual functioning questionnaire.
Corty, E W; Althof, S E; Kurit, D M
1996-01-01
The present study assessed the reliability and validity of a measure of sexual functioning, the CMSH-SFQ, for male patients and their partners. The CMSH-SFQ measures erectile and orgasmic functioning, sexual drive, frequency of sexual behavior, and sexual satisfaction. Test-retest reliability was assessed with 19 males and 19 females for the baseline CMSH-SFQ. Criterion validity was measured by comparing the answers of 25 male patients to those of their partners at baseline and follow-up. The majority of items had acceptable levels of reliability and validity. The CMSH-SFQ provides a reliable and valid device that can be used to measure global sexual functioning in men and their partners and may be used to evaluate the efficacy of treatments for sexual dysfunctions. Limitations and suggestions for use of the CMSH-SFQ are addressed.
Gentile, Douglas A; Humphrey, Jeremy; Walsh, David A
2005-06-01
This article review is organized by studies that are relevant for testing the reliability and validity of ratings systems. Specifically, the interrater reliability, consistency, temporal stability, content validity, construct validity, and criterion validity of media ratings systems are reviewed. Data that are related to testing the "forbidden fruit" and "tainted fruit" hypotheses also are reviewed. Several changes are recommended to improve the ratings systems, including the creation of a universal ratings system that could be applied equally to all media. The research reviewed here can provide a guide for how to construct a reliable, valid, and more useful ratings system. This is important because the decisions that parents make regarding their children's media use can be only as good as the information to which the parents have access.
ERIC Educational Resources Information Center
Anselmo, Giancarlo A.; Yarbrough, Jamie L.; Kovaleski, Joseph F.; Tran, Vi N.
2017-01-01
This study analyzed the relationship between benchmark scores from two curriculum-based measurement probes in mathematics (M-CBM) and student performance on a state-mandated high-stakes test. Participants were 298 students enrolled in grades 7 and 8 in a rural southeastern school. Specifically, we calculated the criterion-related and predictive…
The Information a Test Provides on an Ability Parameter. Research Report. ETS RR-07-18
ERIC Educational Resources Information Center
Haberman, Shelby J.
2007-01-01
In item-response theory, if a latent-structure model has an ability variable, then elementary information theory may be employed to provide a criterion for evaluation of the information the test provides concerning ability. This criterion may be considered even in cases in which the latent-structure model is not valid, although interpretation of…
ERIC Educational Resources Information Center
Cory, Charles H.
This report presents data concerning the validity of a set of experimental computerized and paper-and-pencil tests for measures of on-job performance on global and job elements. It reports on the usefulness of 30 experimental and operational variables for predicting marks on 42 job elements and on a global criterion for Electrician's Mate,…
ERIC Educational Resources Information Center
Neustel, Sandra
As a continuing part of its validity studies, the Association of American Medical Colleges commissioned a study of the speediness of the Medical College Admission Test (MCAT). If speed is a hidden part of the test, it is a threat to its construct validity. As a general rule, the criterion used to indicate lack of speediness is that 80% of the…
Development of the beliefs about yoga scale.
Sohl, Stephanie J; Schnur, Julie B; Daly, Leslie; Suslov, Kathryn; Montgomery, Guy H
2011-01-01
Beliefs about yoga may influence participation in yoga and outcomes of yoga interventions. There is currently no scale appropriate for assessing these beliefs in the general U.S. population. This study took the first steps in developing and validating a Beliefs About Yoga Scale (BAYS) to assess beliefs about yoga that may influence people's engagement in yoga interventions. Items were generated based on previously published research about perceptions of yoga and reviewed by experts within the psychology and yoga communities. 426 adult participants were recruited from an urban medical center to respond to these items. The mean age was 40.7 (SD=13.5) years. Participants completed the BAYS and seven additional indicators of criterion-related validity. The BAYS demonstrated internal consistency (11 items; α=0.76) and three factors emerged: expected health benefits, expected discomfort, and expected social norms. The factor structure was confirmed: χ2 (41, n=213)=72.06, p<.001; RMSEA=.06, p=.23. Criterion-related validity was supported by positive associations of the BAYS with past experiences and future intentions related to yoga. This initial analysis of the BAYS demonstrated that it is an adequately reliable and valid measure of beliefs about yoga with a three-factor structure. However, the scale may need to be modified based on the population to which it is applied.
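The fit statistics in this abstract can be cross-checked with the standard point-estimate formula for RMSEA. The sketch below assumes the common N - 1 form of the denominator (some software uses N instead); the function name is hypothetical.

```python
import math

def rmsea(chi2, df, n):
    """RMSEA point estimate: sqrt(max(chi2 - df, 0) / (df * (n - 1)))."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))

# Chi-square of 72.06 on 41 degrees of freedom with n = 213,
# the values reported in the abstract above.
value = rmsea(72.06, 41, 213)
```

With the reported chi-square, degrees of freedom, and sample size, this evaluates to just under 0.06.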
Development and initial validation of the appropriate antibiotic use self-efficacy scale.
Hill, Erin M; Watkins, Kaitlin
2018-06-04
While there are various medication self-efficacy scales that exist, none assess self-efficacy for appropriate antibiotic use. The Appropriate Antibiotic Use Self-Efficacy Scale (AAUSES) was developed, pilot tested, and its psychometric properties were examined. Following pilot testing of the scale, a 28-item questionnaire was examined using a sample (n = 289) recruited through the Amazon Mechanical Turk platform. Participants also completed other scales and items, which were used in assessing discriminant, convergent, and criterion-related validity. Test-retest reliability was also examined. After examining the scale and removing items that did not assess appropriate antibiotic use, an exploratory factor analysis was conducted on 13 items from the original scale. Three factors were retained that explained 65.51% of the variance. The scale and its subscales had adequate internal consistency. The scale had excellent test-retest reliability, as well as demonstrated convergent, discriminant, and criterion-related validity. The AAUSES is a valid and reliable scale that assesses three domains of appropriate antibiotic use self-efficacy. The AAUSES may have utility in clinical and research settings in understanding individuals' beliefs about appropriate antibiotic use and related behavioral correlates. Future research is needed to examine the scale's utility in these settings. Copyright © 2018 Elsevier B.V. All rights reserved.
Tierney, Marie; Fraser, Alexander; Purtill, Helen; Kennedy, Norelee
2013-06-01
Measuring physical activity in people with rheumatoid arthritis (RA) is of great importance in light of the increased mortality in this population due to cardiovascular disease. Validation of activity monitors in specific populations is recommended to ensure the accuracy of physical activity measurement. Thus, the purpose of this study was to determine the validity of the SenseWear Pro3 Armband (SWA) as a measure of physical activity during activities of daily living (ADL) in people with RA. Fourteen subjects (8 men and 6 women) with a diagnosis of RA were recruited from rheumatology clinics at the Mid-Western Regional Hospitals, Limerick, Ireland. Participants undertook a series of ADL of varying intensities. The SWA was compared to the criterion measures of the Oxycon Mobile indirect calorimetry system (energy expenditure in kJ) and of manual video observation (step count). Bland and Altman, intraclass correlation coefficient (ICC), and correlation analyses were done using SPSS, version 19.0. The SWA showed substantial agreement (ICC 0.717, P < 0.001) and a strong relationship (Pearson's correlation coefficient = 0.852) compared with the criterion measure when estimating energy expenditure during ADL. However, it was found that the SWA overestimated energy expenditure, particularly at higher intensity levels. The ability of the SWA to estimate step counts during ADL was poor (ICC 0.304, P = 0.038). The SWA can be considered a valid tool to estimate energy expenditure during ADL in the RA population; however, attention should be paid to its tendency to overestimate energy expenditure. Copyright © 2013 by the American College of Rheumatology.
Queri, Silvia; Eggart, Michael; Wendel, Maren; Peter, Ulrike
2017-11-28
Background An instrument was to be developed to measure participation as one possible criterion for evaluating the inclusion of elderly people with intellectual disability. The ICF was used because participation is one part of health-related functioning and, respectively, disability. Furthermore, the ICF includes environmental factors (contextual factors) and attributes to them an essential influence on health-related functioning, in particular on participation. Thus the ICF Checklist additionally identifies environmental barriers for elimination. Methodology A linking process with VINELAND-II yielded 138 ICF items for the Checklist. The sample consisted of 50 persons with a mild or moderate intellectual disability. Two-thirds were female and the average age was 68. They were asked directly about their perceived quality of life. Additionally, proxy interviews were carried out with responsible staff members concerning necessary support and behavioral deviances. The ICF Checklist was administered twice: for the current time point (t2), the current staff member rated health-related functioning, and in addition, a staff member who had known the person at least 10 years earlier (t1) rated the former functioning. Content validity was investigated with factor analysis and criterion validity with correlational analysis related to support needs, behavioral deviances and perceived quality of life. Quantitative analysis was validated by qualitative content analysis of patient documentation. Results Factor analysis shows logical variable clusters across the extracted factors but no interpretable factors. The Checklist is reliable, valid with respect to the chosen criteria and shows the expected age-related shifts. Qualitative analysis corresponds with quantitative data. Consequences/Conclusion The ICF Checklist is appropriate for managing and evaluating patient-centered care. © Georg Thieme Verlag KG Stuttgart · New York.
Montero-Marín, Jesús; García-Campayo, Javier
2010-06-02
Burnout syndrome has been clinically characterised by a series of three subtypes: frenetic, underchallenged, and worn-out, with reference to coping strategies for stress and frustration at work with different degrees of dedication. The aims of the study are to present an operating definition of these subtypes in order to assess their reliability and convergent validity with respect to a standard burnout criterion and to examine differences with regard to sex and the temporary nature of work contracts. An exploratory factor analysis was performed by the main component method on a range of items devised by experts. The sample was composed of 409 employees of the University of Zaragoza, Spain. The reliability of the scales was assessed with Cronbach's alpha, convergent validity in relation to the Maslach Burnout Inventory with Pearson's r, and differences with Student's t-test and the Mann-Whitney U test. The factorial validity and reliability of the scales were good. The subtypes presented relations of differing degrees with the criterion dimensions, which were greater when dedication to work was lower. The frenetic profile presented fewer relations with the criterion dimensions while the worn-out profile presented relations of the greatest magnitude. Sex was not influential in establishing differences. However, the temporary nature of work contracts was found to have an effect: temporary employees exhibited higher scores in the frenetic profile (p < 0.001), while permanent employees did so in the underchallenged (p = 0.018) and worn-out (p < 0.001) profiles. The classical Maslach description of burnout does not include the frenetic profile; therefore, these patients are not recognised. The developed questionnaire may be a useful tool for the design and appraisal of specific preventive and treatment approaches based on the type of burnout experienced.
Brown, Heidi Wendell; Wise, Meg E.; Westenberg, Danielle; Schmuhl, Nicholas B.; Brezoczky, Kelly Lewis; Rogers, Rebecca G.; Constantine, Melissa L.
2017-01-01
Introduction and hypothesis Fewer than 30% of women with accidental bowel leakage (ABL) seek care, despite the existence of effective, minimally invasive therapies. We developed and validated a condition-specific instrument to assess barriers to care-seeking for ABL in women. Methods Adult women with ABL completed an electronic survey about condition severity, patient activation, previous care-seeking, and demographics. The Barriers to Care-seeking for Accidental Bowel Leakage (BCABL) instrument contained 42 potential items completed at baseline and again 2 weeks later. Paired t tests evaluated test–retest reliability. Factor analysis evaluated factor structure and guided item retention. Cronbach’s alpha evaluated internal consistency. Within and across factor item means generated a summary BCABL score used to evaluate scale validity with six external criterion measures. Results Among 1,677 click-throughs, 736 (44%) entered the survey; 95% of eligible female respondents (427 out of 458) provided complete data. Fifty-three percent of respondents had previously sought care for their ABL; median age was 62 years (range 27–89); mean Vaizey score was 12.8 (SD = 5.0), indicating moderate to severe ABL. Test–retest reliability was excellent for all items. Factor extraction via oblique rotation resulted in the final structure of 16 items in six domains, within which internal consistency was high. All six external criterion measures correlated significantly with BCABL score. Conclusions The BCABL questionnaire, with 16 items mapping to six domains, has excellent criterion validity and test–retest reliability when administered electronically in women with ABL. The BCABL can be used to identify care-seeking barriers for ABL in different populations, inform targeted interventions, and measure their effectiveness. PMID:28236039
Bradford, Daniel E.; Starr, Mark J.; Shackman, Alexander J.
2015-01-01
Startle potentiation is a well‐validated translational measure of negative affect. Startle potentiation is widely used in clinical and affective science, and there are multiple approaches for its quantification. The three most commonly used approaches quantify startle potentiation as the increase in startle response from a neutral to threat condition based on (1) raw potentiation, (2) standardized potentiation, or (3) percent‐change potentiation. These three quantification approaches may yield qualitatively different conclusions about effects of independent variables (IVs) on affect when within‐ or between‐group differences exist for startle response in the neutral condition. Accordingly, we directly compared these quantification approaches in a shock‐threat task using four IVs known to influence startle response in the no‐threat condition: probe intensity, time (i.e., habituation), alcohol administration, and individual differences in general startle reactivity measured at baseline. We confirmed the expected effects of time, alcohol, and general startle reactivity on affect using self‐reported fear/anxiety as a criterion. The percent‐change approach displayed apparent artifact across all four IVs, which raises substantial concerns about its validity. Both raw and standardized potentiation approaches were stable across probe intensity and time, which supports their validity. However, only raw potentiation displayed effects that were consistent with a priori specifications and/or the self‐report criterion for the effects of alcohol and general startle reactivity. Supplemental analyses of reliability and validity for each approach provided additional evidence in support of raw potentiation. PMID:26372120
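The three quantification approaches named in this abstract can be sketched in a few lines. The abstract does not give the exact formulas, so the definitions below are assumptions: raw potentiation is the difference of condition means, standardized potentiation is the same difference after z-scoring all of a participant's trials together, and percent change is the raw difference relative to the neutral mean. The function name and trial values are hypothetical.

```python
from statistics import mean, stdev

def potentiation(neutral, threat):
    """Return (raw, standardized, percent_change) for one participant.

    neutral, threat: lists of startle magnitudes per trial.
    The standardization scheme is an assumed within-participant z-score.
    """
    raw = mean(threat) - mean(neutral)
    all_trials = neutral + threat
    m, s = mean(all_trials), stdev(all_trials)
    z_threat = [(x - m) / s for x in threat]
    z_neutral = [(x - m) / s for x in neutral]
    standardized = mean(z_threat) - mean(z_neutral)
    percent_change = 100 * raw / mean(neutral)
    return raw, standardized, percent_change
```

Under these assumed definitions, two hypothetical participants with the same raw increase of 50 units but neutral baselines of 50 and 25 get percent-change values of 100 and 200, which illustrates how differences in the neutral condition can drive the percent-change measure.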
Stockman, Ida J; Newkirk-Turner, Brandi L; Swartzlander, Elaina; Morris, Lekeitha R
2016-02-01
This study is a response to the need for evidence-based measures of spontaneous oral language to assess African American children under the age of 4 years. We determined if pass/fail status on a minimal competence core for morphosyntax (MCC-MS) was more highly related to scores on the Index of Productive Syntax (IPSyn)-the measure of convergent criterion validity-than to scores on 3 measures of divergent validity: number of different words (Watkins, Kelly, Harbers, & Hollis, 1995), Percentage of Consonants Correct-Revised (Shriberg, Austin, Lewis, McSweeney, & Wilson, 1997), and the Leiter International Performance Scale-Revised (Roid & Miller, 1997). Archival language samples for 68 African American 3-year-olds were analyzed to determine MCC-MS pass/fail status and the scores on measures of convergent and divergent validity. Higher IPSyn scores were observed for 60 children who passed the MCC-MS than for 8 children who did not. A significant positive correlation, rpb = .73, between MCC-MS pass/fail status and IPSyn scores was observed. This coefficient was higher than MCC-MS correlations with measures of divergent validity: rpb = .13 (Leiter International Performance Scale-Revised), rpb = .42 (number of different words in 100 utterances), and rpb = .46 (Percentage of Consonants Correct-Revised). The MCC-MS has convergent criterion validity with the IPSyn. Although more research is warranted, both measures can be potentially used in oral language assessments of African American 3-year-olds.
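The rpb values reported above are point-biserial correlations between a dichotomous variable (pass/fail) and a continuous score. A minimal sketch of the standard formula is given below; the function name and the data are hypothetical and do not reproduce the study's sample.

```python
import math


def point_biserial(passed, scores):
    """Point-biserial correlation between a pass/fail flag and a continuous score."""
    if len(passed) != len(scores):
        raise ValueError("passed and scores must have the same length")
    group1 = [s for p, s in zip(passed, scores) if p]
    group0 = [s for p, s in zip(passed, scores) if not p]
    if not group1 or not group0:
        raise ValueError("both pass and fail groups must be non-empty")
    n = len(scores)
    mean_all = sum(scores) / n
    # Population standard deviation of all scores (divide by n, not n - 1).
    sd_all = math.sqrt(sum((s - mean_all) ** 2 for s in scores) / n)
    if sd_all == 0:
        raise ValueError("scores must not all be identical")
    mean1 = sum(group1) / len(group1)
    mean0 = sum(group0) / len(group0)
    p = len(group1) / n  # proportion who passed
    return (mean1 - mean0) / sd_all * math.sqrt(p * (1 - p))


# Hypothetical data: three children pass, two fail.
passed = [1, 1, 1, 0, 0]
scores = [10, 12, 14, 6, 8]
print(f"r_pb = {point_biserial(passed, scores):.3f}")
```

The result is numerically identical to a Pearson correlation computed with the pass/fail flag coded 0/1.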
Johnson, Marquell; Turek, Jillian; Dornfeld, Chelsea; Drews, Jennifer; Hansen, Nicole
2016-01-01
Background The emergence of mHealth and the utilization of smartphones in physical activity interventions warrant a closer examination of validity evidence for such technology. This study examined the validity of the Samsung S Health application in measuring steps and energy expenditure. Methods Twenty-nine participants (mean age 21.69 ± 1.63) participated in the study. Participants carried a Samsung smartphone in their non-dominant hand and right pocket while walking around a 200-meter track and running on a treadmill at 2.24 m∙s−1. Steps and energy expenditure from the S Health app were compared with StepWatch 3 Step Activity Monitor steps and indirect calorimetry. Results There were no significant differences between S Health estimated steps and energy expenditure during walking and their respective criterion measures, regardless of placement. There was also no significant difference between S Health estimated steps and the criterion measure during treadmill running, regardless of placement. There were significant differences between S Health estimated energy expenditure and the criterion during treadmill running for both placements (both p < 0.001). Conclusions The S Health application measures steps and energy expenditure accurately during self-selected pace walking regardless of placement. Placement of the phone impacts the S Health application accuracy in measuring physical activity variables during treadmill running. PMID:29942556
Buchowski, Maciej S; Matthews, Charles E; Cohen, Sarah S; Signorello, Lisa B; Fowke, Jay H; Hargreaves, Margaret K; Schlundt, David G; Blot, William J
2012-08-01
Low physical activity (PA) is linked to cancer and other diseases prevalent in racial/ethnic minorities and low-income populations. This study evaluated the PA questionnaire (PAQ) used in the Southern Community Cohort Study (SCCS), a prospective investigation of health disparities between African-American and white adults. The PAQ was administered upon entry into the cohort (PAQ1) and after 12-15 months (PAQ2) in 118 participants (40-60 years old, 48% male, 74% African-American). Test-retest reliability (PAQ1 versus PAQ2) was assessed using Spearman correlations and the Wilcoxon signed rank test. Criterion validity of the PAQ was assessed via comparison with a PA monitor and a last-month PA survey (LMPAS), administered up to 4 times in the study period. The PAQ test-retest reliability ranged from 0.25-0.54 for sedentary behaviors and 0.22-0.47 for active behaviors. The criterion validity for the PAQ compared with the PA monitor ranged from 0.21-0.24 for sedentary behaviors and from 0.17-0.31 for active behaviors. The magnitude of correlations between the PAQ and the PA monitor was generally consistent between African-Americans and whites. The SCCS-PAQ has fair to moderate test-retest reliability and demonstrated some evidence of criterion validity for ranking participants by their level of sedentary and active behaviors.
Substance versus style: a new look at social desirability in motivating contexts.
Smith, D Brent; Ellingson, Jill E
2002-04-01
Although there is an emerging consensus that social desirability does not meaningfully affect criterion-related validity, several researchers have reaffirmed the argument that social desirability degrades the construct validity of personality measures. Yet, most research demonstrating the adverse consequences of faking for construct validity uses a fake-good instruction set. The consequence of such a manipulation is to exacerbate the effects of response distortion beyond what would be expected under realistic circumstances (e.g., an applicant setting). The research reported in this article was designed to assess these issues by using real-world contexts not influenced by artificial instructions. Results suggest that response distortion has little impact on the construct validity of personality measures used in selection contexts.
Maćkiewicz, Marta; Cieciuch, Jan
2016-01-01
In order to adjust personality measurements to children's developmental level, we constructed the Pictorial Personality Traits Questionnaire for Children (PPTQ-C). To validate the measure, we conducted a study with a total group of 1028 children aged between 7 and 13 years old. Structural validity was established through Exploratory Structural Equation Model (ESEM). Criterion validity was confirmed with a multitrait-multimethod analysis for which we introduced the children's self-assessment scores from the Big Five Questionnaire for Children. Despite some problems with reliability, one can conclude that the PPTQ-C can be a valid instrument for measuring personality traits, particularly in a group of young children (aged ~7–10 years). PMID:27252661
2013-01-01
Background A prospective study of a cohort of nursing staff from nursing homes was undertaken to validate the Nurse-Work Instability Scale (Nurse-WIS). Baseline investigation data was used to test reliability, construct validity and criterion validity. Method A survey of nursing staff from nursing homes was conducted using a questionnaire containing the Nurse-WIS along with other survey instruments (including SF-12, WAI, SPE). The self-reported number of days’ sick leave taken and whether a pension for reduced work capacity was drawn were recorded. The reliability of the scale was checked by item difficulty (P), item discrimination (rjt) and by internal consistency according to Cronbach’s alpha coefficient. The hypotheses for checking construct validity were tested on the basis of correlations. Pearson’s chi-square was used to test concurrent criterion validity; discriminant validity was tested by means of binary logistic regression. Results 396 persons answered the questionnaire (21.3% response rate). More than 80% were female, and most worked full-time in a rotating shift pattern. Following the test for item discrimination, two items were removed from the Nurse-WIS test. According to Cronbach’s alpha (0.927), the scale provides a high degree of measuring accuracy. All hypotheses and assumptions used to test validity were confirmed: As the Nurse-WIS risk increases, health-related quality of life, work ability and job satisfaction decline. Depressive symptoms and a poor subjective prognosis of earning capacity are also more frequent. Musculoskeletal disorders and impairments of psychological well-being are more frequent. Age also influences the Nurse-WIS result. While 12.0% of those below the age of 35 had an increased risk, the figure for those aged over 55 was 50%. Conclusion This study is the first validation study of the Nurse-WIS to date. The Nurse-WIS shows good reliability, good validity and a good level of measuring accuracy. 
It appears to be suitable for recording prevention and rehabilitation needs among health care workers. If, in the follow-up, the Nurse-WIS likewise proves to be a reliable screening instrument with good predictive validity, it could ensure that suitable action is taken at an early stage, thereby helping to counteract early retirement and the anticipated shortage of health care workers. PMID:24330532
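Internal consistency of the kind reported above is usually summarized with Cronbach's alpha, computed from the item variances and the variance of the total score. The following is a minimal sketch of that standard formula; the function name and the item responses are hypothetical and are not the Nurse-WIS data.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of per-item score lists (same respondents in each)."""
    k = len(items)
    if k < 2:
        raise ValueError("at least two items are required")
    n = len(items[0])
    if any(len(item) != n for item in items):
        raise ValueError("all items must cover the same respondents")
    # Total score per respondent, summing across items.
    totals = [sum(responses) for responses in zip(*items)]
    total_var = variance(totals)
    if total_var == 0:
        raise ValueError("total scores must vary across respondents")
    item_var_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Hypothetical responses of five people to three items.
items = [
    [1, 2, 3, 4, 5],
    [2, 2, 3, 5, 5],
    [1, 3, 3, 4, 4],
]
print(f"alpha = {cronbach_alpha(items):.3f}")
```

Alpha approaches 1 when items rise and fall together across respondents and drops toward 0 when they are unrelated.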
2011-01-01
Background The lack of culturally adapted and validated instruments for child mental health and psychosocial support in low and middle-income countries is a barrier to assessing prevalence of mental health problems, evaluating interventions, and determining program cost-effectiveness. Alternative procedures are needed to validate instruments in these settings. Methods Six criteria are proposed to evaluate cross-cultural validity of child mental health instruments: (i) purpose of instrument, (ii) construct measured, (iii) contents of construct, (iv) local idioms employed, (v) structure of response sets, and (vi) comparison with other measurable phenomena. These criteria are applied to transcultural translation and alternative validation for the Depression Self-Rating Scale (DSRS) and Child PTSD Symptom Scale (CPSS) in Nepal, which recently suffered a decade of war including conscription of child soldiers and widespread displacement of youth. Transcultural translation was conducted with Nepali mental health professionals and six focus groups with children (n = 64) aged 11-15 years old. Because of the lack of child mental health professionals in Nepal, a psychosocial counselor performed an alternative validation procedure using psychosocial functioning as a criterion for intervention. The validation sample was 162 children (11-14 years old). The Kiddie-Schedule for Affective Disorders and Schizophrenia (K-SADS) and Global Assessment of Psychosocial Disability (GAPD) were used to derive indication for treatment as the external criterion. Results The instruments displayed moderate to good psychometric properties: DSRS (area under the curve (AUC) = 0.82, sensitivity = 0.71, specificity = 0.81, cutoff score ≥ 14); CPSS (AUC = 0.77, sensitivity = 0.68, specificity = 0.73, cutoff score ≥ 20). 
The DSRS items with significant discriminant validity were "having energy to complete daily activities" (DSRS.7), "feeling that life is not worth living" (DSRS.10), and "feeling lonely" (DSRS.15). The CPSS items with significant discriminant validity were nightmares (CPSS.2), flashbacks (CPSS.3), traumatic amnesia (CPSS.8), feelings of a foreshortened future (CPSS.12), and easily irritated at small matters (CPSS.14). Conclusions Transcultural translation and alternative validation feasibly can be performed in low clinical resource settings through task-shifting the validation process to trained mental health paraprofessionals using structured interviews. This process is helpful to evaluate cost-effectiveness of psychosocial interventions. PMID:21816045
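The AUC, sensitivity, and specificity figures quoted for each cutoff score can be understood through a small sketch. The scores below are hypothetical, not the Nepal validation sample, and the function names are illustrative; the AUC is computed as the proportion of case/non-case pairs in which the case scores higher, with ties counted as half.

```python
def auc(case_scores, noncase_scores):
    """Probability that a random case scores higher than a random non-case."""
    wins = 0.0
    for c in case_scores:
        for n in noncase_scores:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(case_scores) * len(noncase_scores))


def sens_spec_at_cutoff(case_scores, noncase_scores, cutoff):
    """Sensitivity and specificity when 'score >= cutoff' is a positive screen."""
    sensitivity = sum(s >= cutoff for s in case_scores) / len(case_scores)
    specificity = sum(s < cutoff for s in noncase_scores) / len(noncase_scores)
    return sensitivity, specificity


# Hypothetical scale scores for children with and without an indication for treatment.
cases = [10, 14, 16, 18]
noncases = [6, 8, 12, 14]

print(f"AUC = {auc(cases, noncases):.3f}")
sens, spec = sens_spec_at_cutoff(cases, noncases, cutoff=14)
print(f"cutoff >= 14: sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

Raising the cutoff trades sensitivity for specificity, whereas the AUC summarizes discrimination across all possible cutoffs.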
Validation of the peak bilirubin criterion for outcome after partial hepatectomy.
van Mierlo, Kim M C; Lodewick, Toine M; Dhar, Dipok K; van Woerden, Victor; Kurstjens, Ralph; Schaap, Frank G; van Dam, Ronald M; Vyas, Soumil; Malagó, Massimo; Dejong, Cornelis H C; Olde Damink, Steven W M
2016-10-01
Postoperative liver failure (PLF) is a dreaded complication after partial hepatectomy. The peak bilirubin criterion (>7.0 mg/dL or ≥120 μmol/L) is used to define PLF. This study aimed to validate the peak bilirubin criterion as postoperative risk indicator for 90-day liver-related mortality. Characteristics of 956 consecutive patients who underwent partial hepatectomy at the Maastricht University Medical Centre or Royal Free London between 2005 and 2012 were analyzed by uni- and multivariable analyses with odds ratios (OR) and 95% confidence intervals (95%CI). Thirty-five patients (3.7%) met the postoperative peak bilirubin criterion at median day 19 with a median bilirubin level of 183 [121-588] μmol/L. Sensitivity and specificity for liver-related mortality after major hepatectomy were 41.2% and 94.6%, respectively. The positive predictive value was 22.6%. Predictors of liver-related mortality were the peak bilirubin criterion (p < 0.001, OR = 15.9 [95%CI 5.2-48.7]), moderate-severe steatosis and fibrosis (p = 0.013, OR = 8.5 [95%CI 1.6-46.6]), ASA 3-4 (p = 0.047, OR = 3.0 [95%CI 1.0-8.8]) and age (p = 0.044, OR = 1.1 [95%CI 1.0-1.1]). The peak bilirubin criterion has a low sensitivity and positive predictive value for 90-day liver-related mortality after major hepatectomy. Copyright © 2016 International Hepato-Pancreato-Biliary Association Inc. Published by Elsevier Ltd. All rights reserved.
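Sensitivity, specificity, and positive predictive value all derive from a 2×2 table of criterion status against outcome. The sketch below uses hypothetical counts chosen only for illustration (they are not the study's data) to show how a criterion can combine high specificity with a low positive predictive value when the outcome is rare.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV and NPV from 2x2 table counts."""
    if min(tp, fp, fn, tn) < 0:
        raise ValueError("counts must be non-negative")
    return {
        "sensitivity": tp / (tp + fn),  # outcome present and criterion met
        "specificity": tn / (tn + fp),  # outcome absent and criterion not met
        "ppv": tp / (tp + fp),          # criterion met and outcome present
        "npv": tn / (tn + fn),          # criterion not met and outcome absent
    }


# Hypothetical counts: 20 patients with the outcome, 180 without.
metrics = diagnostic_metrics(tp=8, fp=20, fn=12, tn=160)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```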
Pohl, Rüdiger F; Michalkiewicz, Martha; Erdfelder, Edgar; Hilbig, Benjamin E
2017-07-01
According to the recognition-heuristic theory, decision makers solve paired comparisons in which one object is recognized and the other not by recognition alone, inferring that recognized objects have higher criterion values than unrecognized ones. However, success-and thus usefulness-of this heuristic depends on the validity of recognition as a cue, and adaptive decision making, in turn, requires that decision makers are sensitive to it. To this end, decision makers could base their evaluation of the recognition validity either on the selected set of objects (the set's recognition validity), or on the underlying domain from which the objects were drawn (the domain's recognition validity). In two experiments, we manipulated the recognition validity both in the selected set of objects and between domains from which the sets were drawn. The results clearly show that use of the recognition heuristic depends on the domain's recognition validity, not on the set's recognition validity. In other words, participants treat all sets as roughly representative of the underlying domain and adjust their decision strategy adaptively (only) with respect to the more general environment rather than the specific items they are faced with.
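Recognition validity, whether for a selected set or a whole domain, can be sketched as the proportion of pairs with exactly one recognized object in which the recognized object has the higher criterion value. The code below assumes that common definition, which the abstract itself does not spell out; the function name and the objects are hypothetical, and pairs with tied criterion values are skipped as a simplifying assumption.

```python
from itertools import combinations


def recognition_validity(objects):
    """objects: list of (recognized, criterion_value) tuples.

    Returns the share of informative pairs (exactly one object recognized,
    criterion values differ) in which the recognized object scores higher.
    """
    correct = 0
    total = 0
    for (rec_a, val_a), (rec_b, val_b) in combinations(objects, 2):
        if rec_a == rec_b or val_a == val_b:
            continue  # recognition cannot discriminate, or criterion is tied
        total += 1
        recognized_val, other_val = (val_a, val_b) if rec_a else (val_b, val_a)
        if recognized_val > other_val:
            correct += 1
    if total == 0:
        raise ValueError("no pair with exactly one recognized object")
    return correct / total


# Hypothetical set: three recognized and two unrecognized objects.
selected_set = [(True, 900), (True, 500), (True, 100), (False, 300), (False, 50)]
print(f"recognition validity = {recognition_validity(selected_set):.3f}")
```

Applying the same function to a small selected set and to the full domain it was drawn from gives the two quantities the experiments contrast.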
Sertdemir, Y; Burgut, R
2009-01-01
In recent years the use of surrogate endpoints (S) has become an interesting issue. In clinical trials, it is important to get treatment outcomes as early as possible. For this reason there is a need for surrogate endpoints which are measured earlier than the true endpoint (T). However, before a surrogate endpoint can be used it must be validated. For a candidate surrogate endpoint, for example time to recurrence, the validation result may change dramatically between clinical trials. The aim of this study is to show how the validation criterion R²_trial proposed by Buyse et al. (2000) is influenced by the magnitude of the treatment effect, with an application using real data. The criterion is applied to four data sets from colon cancer clinical trials (C-01, C-02, C-03 and C-04). Each clinical trial is analyzed separately for treatment effect on survival (true endpoint) and recurrence-free survival (surrogate endpoint), and this analysis is also done for each center in each trial. Results are used for standard validation analysis. The centers were grouped by the Wald statistic into 3 equal groups. The R²_trial values were 0.641 (95% CI 0.432-0.782), 0.223 (95% CI 0.008-0.503), 0.761 (95% CI 0.550-0.872) and 0.560 (95% CI 0.404-0.687) for C-01, C-02, C-03 and C-04, respectively. The R²_trial values varied with the Wald statistics observed for the centers used in the validation process: the higher the Wald statistic group, the higher the observed R²_trial values. Recurrence-free survival is not a good surrogate for overall survival in clinical trials with nonsignificant treatment effects and is a moderate surrogate in trials with significant treatment effects. This shows that the level of significance of the treatment effect should be taken into account in the validation process of surrogate endpoints.
Revision, Criterion Validity, and Multi-group Assessment of the Reactions to Homosexuality Scale
Smolenski, Derek J.; Diamond, Pamela M.; Ross, Michael W.; Simon Rosser, B. R.
2010-01-01
Internalized homonegativity encompasses negative attitudes toward one’s own sexual orientation, and is associated with negative mental and physical health outcomes. The Reactions to Homosexuality scale (Ross & Rosser, 1996), an instrument used to measure internalized homonegativity, has been criticized for including content irrelevant to the construct of internalized homonegativity. We revised the scale using exploratory and confirmatory factor analyses, and identified a seven-item, three-factor reduced version that demonstrated measurement invariance across racial/ethnic categorizations and between English and Spanish versions. We also investigated criterion validity by estimating correlations with hypothesized outcomes associated with outness, relationship status, sexual orientation, and gay community affiliation. The evidence of measurement invariance suggests that this scale is appropriate for pluralistic treatment or study groups. PMID:20954058
Moschella, Melissa
2016-06-01
This article explains the problems with Alan Shewmon's critique of brain death as a valid sign of human death, beginning with a critical examination of his analogy between brain death and severe spinal cord injury. The article then goes on to assess his broader argument against the necessity of the brain for adult human organismal integration, arguing that he fails to translate correctly from biological to metaphysical claims. Finally, on the basis of a deeper metaphysical analysis, I offer a revised rationale for the validity of the neurological criterion of human death. © The Author 2016. Published by Oxford University Press, on behalf of the Journal of Medicine and Philosophy Inc. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Goodman, L A; Corcoran, C; Turner, K; Yuan, N; Green, B L
1998-07-01
This article reviews the psychometric properties of the Stressful Life Events Screening Questionnaire (SLESQ), a recently developed trauma history screening measure, and discusses the complexities involved in assessing trauma exposure. There are relatively few general measures of exposure to a variety of types of traumatic events, and most of those that exist have not been subjected to rigorous psychometric evaluation. The SLESQ showed good test-retest reliability, with a median kappa of .73, adequate convergent validity (with a lengthier interview) with a median kappa of .64, and good discrimination between Criterion A and non-Criterion A events. The discussion addresses some of the challenges of assessing traumatic event exposure along the dimensions of defining traumatic events, assessment methodologies, reporting consistency, and incident validation.
Lei, Pingguang; Lei, Guanghe; Tian, Jianjun; Zhou, Zengfen; Zhao, Miao; Wan, Chonghua
2014-10-01
This paper aims to develop the irritable bowel syndrome (IBS) scale of the system of Quality of Life Instruments for Chronic Diseases (QLICD-IBS) by the modular approach and validate it by both classical test theory and generalizability theory. The QLICD-IBS was developed based on programmed decision procedures with multiple nominal and focus group discussions, in-depth interviews, and quantitative statistical procedures. Data from 112 inpatients with IBS, whose QOL was measured three times before and after treatment, were used. The psychometric properties of the scale were evaluated with respect to validity, reliability, and responsiveness employing correlation analysis, factor analyses, multi-trait scaling analysis, t tests and also G studies and D studies of generalizability theory analysis. Multi-trait scaling analysis, correlation, and factor analyses confirmed good construct validity and criterion-related validity when using SF-36 as a criterion. Test-retest reliability coefficients (Pearson r and intra-class correlation (ICC)) for the overall score and all domains were higher than 0.80; the internal consistency α for all domains at two measurements was higher than 0.70 except for the social domain (0.55 and 0.67, respectively). The overall score and scores for all domains/facets had statistically significant changes after treatment, with moderate or higher effect sizes (standardized response mean (SRM) ranging from 0.72 to 1.02 at the domain level). G coefficients and indices of dependability (Ф coefficients) further confirmed the reliability of the scale with more exact variance components. The QLICD-IBS has good validity, reliability, responsiveness, and some highlights and can be used as the quality of life instrument for patients with IBS.
Okun, Michele L; Buysse, Daniel J; Hall, Martica H
2015-06-15
Although a substantial number of pregnant women report symptoms of insomnia, few studies have used a validated instrument to determine the prevalence in early gestation. Identification of insomnia in pregnancy is vital given the strong connection between insomnia and the incidence of depression, cardiovascular disease, or immune dysregulation. The goal of this paper is to provide additional psychometric evaluation and validation of the Insomnia Symptom Questionnaire (ISQ) and to establish prevalence rates of insomnia among a cohort of pregnant women during early gestation. The ISQ was evaluated in 143 pregnant women at 12 weeks gestation. The internal consistency and criterion validity of the dichotomized ISQ were compared to traditional measures of sleep from sleep diaries, actigraphy, and the Pittsburgh Sleep Quality Index using indices of sensitivity, specificity, positive and negative predictive value (PPV, NPV), and likelihood ratio (LR) tests. The ISQ identified 12.6% of the sample as meeting a case definition of insomnia, consistent with established diagnostic criteria. Good reliability was established with Cronbach α = 0.86. The ISQ had high specificity (most > 85%), but sensitivity, PPV, NPV, and LRs varied according to which sleep measure was used as the validating criterion. Insomnia is a health problem for many pregnant women at all stages in pregnancy. These data support the validity and reliability of the ISQ to identify insomnia in pregnant women. The ISQ is a short and cost-effective tool that can be quickly employed in large observational studies or in clinical practice where perinatal women are seen. A commentary on this article appears in this issue on page 593. © 2015 American Academy of Sleep Medicine.
Simulated Driving Assessment (SDA) for Teen Drivers: Results from a Validation Study
McDonald, Catherine C.; Kandadai, Venk; Loeb, Helen; Seacrist, Thomas S.; Lee, Yi-Ching; Winston, Zachary; Winston, Flaura K.
2015-01-01
Background Driver error and inadequate skill are common critical reasons for novice teen driver crashes, yet few validated, standardized assessments of teen driving skills exist. The purpose of this study was to evaluate the construct and criterion validity of a newly developed Simulated Driving Assessment (SDA) for novice teen drivers. Methods The SDA's 35-minute simulated drive incorporates 22 variations of the most common teen driver crash configurations. Driving performance was compared for 21 inexperienced teens (age 16–17 years, provisional license ≤90 days) and 17 experienced adults (age 25–50 years, license ≥5 years, drove ≥100 miles per week, no collisions or moving violations ≤3 years). SDA driving performance (Error Score) was based on driving safety measures derived from simulator and eye-tracking data. Negative driving outcomes included simulated collisions or run-off-the-road incidents. A professional driving evaluator/instructor reviewed videos of SDA performance (DEI Score). Results The SDA demonstrated construct validity: 1.) Teens had a higher Error Score than adults (30 vs. 13, p=0.02); 2.) For each additional error committed, the relative risk of a participant's propensity for a simulated negative driving outcome increased by 8% (95% CI: 1.05–1.10, p<0.01). The SDA demonstrated criterion validity: Error Score was correlated with DEI Score (r=−0.66, p<0.001). Conclusions This study supports the concept of validated simulated driving tests like the SDA to assess novice driver skill in complex and hazardous driving scenarios. The SDA, as a standard protocol to evaluate teen driver performance, has the potential to facilitate screening and assessment of teen driving readiness and could be used to guide targeted skill training. PMID:25740939
Seo, Kyoungsan; Song, Misoon; Choi, Suyoung; Kim, Se-An; Chang, Sun Ju
2017-04-01
The purpose of this study was to develop the Diabetes Self-Management Behavior for Older Koreans (DSMB-O). This scale is based on the seven relevant domains that have been identified by the American Association of Diabetes Educators (AADE) and is adjusted for sociocultural and age-related characteristics. Four phases were used to develop the DSMB-O as a criterion-referenced measure. In phases 1 and 2, the DSMB-O adopted the AADE's seven domains and established a self-report questionnaire using a small number of items that are applicable to older Koreans. In phase 3, the DSMB-O was formulated with 16 preliminary items, including seven subitems. By assessing the content validity, 14 items (including five subitems) were selected. The final phase involved evaluating the DSMB-O's psychometric properties, including test-retest reliability, content validity, and criterion-related validity, using data from 150 older Koreans with type 2 diabetes. The coefficients of agreement and Cohen's kappa for the test-retest reliability test ranged from 0.32 to 1.0 and -0.07 to 1.0, respectively. For the content validity, the values of both the item- and scale-level content validity indices were 1.0. The scores from the DSMB-O were positively correlated with the scores from the Korean version of the Summary of Diabetes Self-Care Activities Questionnaire. The DSMB-O is short and easy for older Koreans to use and has acceptable levels of reliability and validity. Hence, the DSMB-O can be a useful tool to evaluate diabetes self-management behaviors in older Koreans with type 2 diabetes. © 2016 Japan Academy of Nursing Science.
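The abstract reports both coefficients of agreement and Cohen's kappa for test-retest reliability, and kappa can fall near or below zero even when raw agreement is high. A minimal sketch of both statistics follows; the function names and the response vectors are hypothetical and are not the DSMB-O data.

```python
def percent_agreement(first, second):
    """Share of respondents giving the same answer on both occasions."""
    if len(first) != len(second) or not first:
        raise ValueError("ratings must be non-empty and of equal length")
    return sum(a == b for a, b in zip(first, second)) / len(first)


def cohen_kappa(first, second):
    """Cohen's kappa: agreement corrected for agreement expected by chance."""
    observed = percent_agreement(first, second)
    n = len(first)
    categories = set(first) | set(second)
    # Chance agreement from the marginal proportions of each category.
    expected = sum((first.count(c) / n) * (second.count(c) / n) for c in categories)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


# Hypothetical yes/no item answered twice by ten people; almost everyone says yes.
test = [1, 1, 1, 1, 1, 1, 1, 1, 1, 0]
retest = [1, 1, 1, 1, 1, 1, 1, 1, 0, 1]

print(f"agreement = {percent_agreement(test, retest):.2f}")
print(f"kappa = {cohen_kappa(test, retest):.3f}")
```

Because nearly every response falls in one category, chance agreement is already very high, so 80% raw agreement in this sketch still yields a slightly negative kappa.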
The Multimedia Activity Recall for Children and Adolescents (MARCA): development and evaluation.
Ridley, Kate; Olds, Tim S; Hill, Alison
2006-05-26
Self-report recall questionnaires are commonly used to measure physical activity, energy expenditure and time use in children and adolescents. However, self-report questionnaires show low to moderate validity, mainly due to inaccuracies in recalling activity in terms of duration and intensity. Aside from recall errors, inaccuracies in estimating energy expenditure from self-report questionnaires are compounded by a lack of data on the energy cost of everyday activities in children and adolescents. This article describes the development of the Multimedia Activity Recall for Children and Adolescents (MARCA), a computer-delivered use-of-time instrument designed to address both the limitations of self-report recall questionnaires in children, and the lack of energy cost data in children. The test-retest reliability of the MARCA was assessed using a sample of 32 children (aged 11.8 +/- 0.7 y) who undertook the MARCA twice within 24-h. Criterion validity was assessed by comparing self-reports with accelerometer counts collected on a sample of 66 children (aged 11.6 +/- 0.8 y). Content and construct validity were assessed by establishing whether data collected using the MARCA on 1429 children (aged 11.9 +/- 0.8 y) exhibited relationships and trends in children's physical activity consistent with established findings from a number of previous research studies. Test-retest reliability was high with intra-class coefficients ranging from 0.88 to 0.94. The MARCA demonstrated criterion validity comparable to other self-report instruments with Spearman coefficients ranging from rho = 0.36 to 0.45, and provided evidence of good content and construct validity. The MARCA is a valid and reliable self-report questionnaire, capable of a wide variety of flexible use-of-time analyses related to both physical activity and sedentary behaviour, and offers advantages over existing pen-and-paper questionnaires.
Is the Simple Shoulder Test a valid outcome instrument for shoulder arthroplasty?
Hsu, Jason E; Russ, Stacy M; Somerson, Jeremy S; Tang, Anna; Warme, Winston J; Matsen, Frederick A
2017-10-01
The Simple Shoulder Test (SST) is a brief, inexpensive, and widely used patient-reported outcome tool, but it has not been rigorously evaluated for patients having shoulder arthroplasty. The goal of this study was to rigorously evaluate the validity of the SST for outcome assessment in shoulder arthroplasty using a systematic review of the literature and an analysis of its properties in a series of 408 surgical cases. SST scores, 36-Item Short Form Health Survey scores, and satisfaction scores were collected preoperatively and 2 years postoperatively. Responsiveness was assessed by comparing preoperative and 2-year postoperative scores. Criterion validity was determined by correlating the SST with the 36-Item Short Form Health Survey. Construct validity was tested through 5 clinical hypotheses regarding satisfaction, comorbidities, insurance status, previous failed surgery, and narcotic use. Scores after arthroplasty improved from 3.9 ± 2.8 to 10.2 ± 2.3 (P < .001). The change in SST correlated strongly with patient satisfaction (P < .001). The SST had large Cohen's d effect sizes and standardized response means. Criterion validity was supported by significant differences between satisfied and unsatisfied patients, those with more severe and less severe comorbidities, those with workers' compensation or Medicaid and other types of insurance, those with and without previous failed shoulder surgery, and those taking and those not taking narcotic pain medication before surgery (P < .005). These data combined with a systematic review of the literature demonstrate that the SST is a valid and responsive patient-reported outcome measure for assessing the outcomes of shoulder arthroplasty. Copyright © 2017 Journal of Shoulder and Elbow Surgery Board of Trustees. Published by Elsevier Inc. All rights reserved.
Tsuno, Kanami; Yoshimasu, Kouichi; Hayashi, Takashi; Tatsuta, Nozomi; Ito, Yuki; Kamijima, Michihiro; Nakai, Kunihiko
2018-01-01
Nowadays, attention deficit hyperactivity (ADH) problems are observed commonly among school-age children. However, questionnaires specific to ADH behaviors among preschool children are very few. The aim of this study was to investigate the reliability and validity of the 25-item Behavioral Check List (BCL), which was developed from interviews of parents with children who were diagnosed as having attention-deficit/hyperactivity disorder (ADHD) and measures ADH behaviors in preschool age. We recruited 22 teachers from 10 nurseries/kindergartens in Miyagi Prefecture, Japan. A total of 138 preschool children were assessed using the BCL. To investigate inter-rater reliability, two teachers from each facility assessed seven to twenty children in their class, and intraclass correlation coefficients (ICCs) were calculated. The teachers additionally answered questions in the 1½-5 Caregiver-Teacher Report Form (C-TRF) to investigate the criterion validity of the BCL. To investigate structural validity, exploratory factor analysis with promax rotation and confirmatory factor analysis were performed. The internal consistency reliability of the BCL was good (α = 0.92) and correlation analyses also confirmed its excellent criterion validity. Although exploratory factor analysis for the BCL yielded a five-factor model whose factor structure differed from that of the original, the results were similar to the original six factors. The ICCs of the BCL ranged from 0.38 to 0.99, indicating that inter-rater reliability was not high enough in some facilities. However, it may be possible to improve it by giving raters adequate explanations when using the BCL. The present study showed acceptable levels of reliability and validity of the BCL among Japanese preschool children.
Carvalho, Flávia A; Morelhão, Priscila K; Franco, Marcia R; Maher, Chris G; Smeets, Rob J E M; Oliveira, Crystian B; Freitas Júnior, Ismael F; Pinto, Rafael Z
2017-02-01
Although there is some evidence for reliability and validity of self-report physical activity (PA) questionnaires in the general adult population, it is unclear whether we can assume similar measurement properties in people with chronic low back pain (LBP). This study aimed to determine the test-retest reliability of the International Physical Activity Questionnaire (IPAQ) long-version and the Baecke Physical Activity Questionnaire (BPAQ) and their criterion-related validity against data derived from accelerometers in patients with chronic LBP. Cross-sectional study. Patients with non-specific chronic LBP were recruited. Each participant attended the clinic twice (one week interval) and completed self-report PA questionnaires. Accelerometer measures >7 days included time spent in moderate-and-vigorous physical activity, steps/day, counts/minute, and vector magnitude counts/minute. Intraclass Correlation Coefficients (ICC) and the Bland and Altman method were used to determine reliability, and Spearman's rho correlations were used for criterion-related validity. A total of 73 patients were included in our analyses. The reliability analyses revealed that the BPAQ and its subscales have moderate to excellent reliability (ICC(2,1): 0.61 to 0.81), whereas the IPAQ and most IPAQ domains (except walking) showed poor reliability (ICC(2,1): 0.20 to 0.40). The Bland and Altman method revealed larger discrepancies for the IPAQ. For the validity analysis, questionnaire and accelerometer measures showed at best fair correlation (rho < 0.37). Although the BPAQ showed better reliability than the IPAQ long-version, neither questionnaire demonstrated acceptable validity against accelerometer data. These findings suggest that questionnaire and accelerometer PA measures should not be used interchangeably in this population. Copyright © 2016 Elsevier Ltd. All rights reserved.
Pedersen, Scott J; Kitic, Cecilia M; Bird, Marie-Louise; Mainsbridge, Casey P; Cooley, P Dean
2016-08-19
With the advent of workplace health and wellbeing programs designed to address prolonged occupational sitting, tools to measure behaviour change within this environment should derive from empirical evidence. In this study we measured aspects of validity and reliability for the Occupational Sitting and Physical Activity Questionnaire that asks employees to recount the percentage of work time they spend in the seated, standing, and walking postures during a typical workday. Three separate cohort samples (N = 236) were drawn from a population of government desk-based employees across several departmental agencies. These volunteers were part of a larger state-wide intervention study. Workplace sitting and physical activity behaviour was measured both subjectively against the International Physical Activity Questionnaire, and objectively against ActivPal accelerometers before the intervention began. Criterion validity and concurrent validity for each of the three posture categories were assessed using Spearman's rank correlation coefficients, and a bias comparison with 95 % limits of agreement. Test-retest reliability of the survey was reported with intraclass correlation coefficients. Criterion validity for this survey was strong for sitting and standing estimates, but weak for walking. Participants significantly overestimated the amount of walking they did at work. Concurrent validity was moderate for sitting and standing, but low for walking. Test-retest reliability of this survey proved to be questionable for our sample. Based on our findings we must caution occupational health and safety professionals about the use of employee self-report data to estimate workplace physical activity. While the survey produced accurate measurements for time spent sitting at work it was more difficult for employees to estimate their workplace physical activity.
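Spearman's rank correlation, used in the abstract above to relate self-reported posture time to the comparison measures, can be sketched in a few lines. This is an illustrative implementation that ranks each series (averaging ranks for ties) and then computes Pearson's correlation on the ranks; the function names and the sample values are hypothetical and are not taken from the study.

```python
import math
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length series."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical self-reported vs. device-measured sitting percentages.
self_report = [80, 65, 70, 90, 55]
device = [75, 60, 72, 85, 58]
print(f"rho = {spearman_rho(self_report, device):.2f}")
```

Because only ranks enter the calculation, rho reflects whether the two measures order participants similarly; it says nothing about systematic over- or underestimation, which is why the abstract pairs it with a bias comparison.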
Platzer, Christine; Bröder, Arndt; Heck, Daniel W
2014-05-01
Decision situations are typically characterized by uncertainty: Individuals do not know the values of different options on a criterion dimension. For example, consumers do not know which is the healthiest of several products. To make a decision, individuals can use information about cues that are probabilistically related to the criterion dimension, such as sugar content or the concentration of natural vitamins. In two experiments, we investigated how the accessibility of cue information in memory affects which decision strategy individuals rely on. The accessibility of cue information was manipulated by means of a newly developed paradigm, the spatial-memory-cueing paradigm, which is based on a combination of the looking-at-nothing phenomenon and the spatial-cueing paradigm. The results indicated that people use different decision strategies, depending on the validity of easily accessible information. If the easily accessible information is valid, people stop information search and decide according to a simple take-the-best heuristic. If, however, information that comes to mind easily has a low predictive validity, people are more likely to integrate all available cue information in a compensatory manner.
NASA Astrophysics Data System (ADS)
Hou, Yanqing; Verhagen, Sandra; Wu, Jie
2016-12-01
Ambiguity Resolution (AR) is a key technique in GNSS precise positioning. In case of weak models (i.e., low precision of data), however, the success rate of AR may be low, which may consequently introduce large errors to the baseline solution in cases of wrong fixing. Partial Ambiguity Resolution (PAR) is therefore proposed such that the baseline precision can be improved by fixing only a subset of ambiguities with high success rate. This contribution proposes a new PAR strategy that selects the subset such that the expected precision gain is maximized among a set of pre-selected subsets, while at the same time the failure rate is controlled. These pre-selected subsets are supposed to obtain the highest success rate among those with the same subset size. The strategy is called Two-step Success Rate Criterion (TSRC) as it will first try to fix a relatively large subset with the fixed failure rate ratio test (FFRT) to decide on acceptance or rejection. In case of rejection, a smaller subset will be fixed and validated by the ratio test so as to fulfill the overall failure rate criterion. It is shown how the method can be used in practice without introducing a large additional computational effort and, more importantly, how it can improve (or at least not deteriorate) the availability in terms of baseline precision compared with the classical Success Rate Criterion (SRC) PAR strategy, based on a simulation validation. In the simulation validation, significant improvements are obtained for single-GNSS on short baselines with dual-frequency observations. For dual-constellation GNSS, the improvement for single-frequency observations on short baselines is very significant, on average 68%. For the medium- to long baselines, with dual-constellation GNSS the average improvement is around 20-30%.
Gromisch, Elizabeth S; Zemon, Vance; Holtzer, Roee; Chiaravalloti, Nancy D; DeLuca, John; Beier, Meghan; Farrell, Eileen; Snyder, Stacey; Schairer, Laura C; Glukhovsky, Lisa; Botvinick, Jason; Sloan, Jessica; Picone, Mary Ann; Kim, Sonya; Foley, Frederick W
2016-10-01
Cognitive dysfunction is prevalent in multiple sclerosis. As self-reported cognitive functioning is unreliable, brief objective screening measures are needed. Utilizing widely used full-length neuropsychological tests, this study aimed to establish the criterion validity of highly abbreviated versions of the Brief Visuospatial Memory Test - Revised (BVMT-R), Symbol Digit Modalities Test (SDMT), Delis-Kaplan Executive Function System (D-KEFS) Sorting Test, and Controlled Oral Word Association Test (COWAT) in order to begin developing an MS-specific screening battery. Participants from Holy Name Medical Center and the Kessler Foundation were administered one or more of these four measures. Using test-specific criteria to identify impairment at both -1.5 and -2.0 SD, receiver-operating-characteristic (ROC) analyses of BVMT-R Trial 1, Trial 2, and Trial 1 + 2 raw data (N = 286) were run to calculate the classification accuracy of the abbreviated version, as well as the sensitivity and specificity. The same methods were used for SDMT 30-s and 60-s (N = 321), D-KEFS Sorting Free Card Sort 1 (N = 120), and COWAT letters F and A (N = 298). Using these definitions of impairment, each analysis yielded high classification accuracy (89.3 to 94.3%). BVMT-R Trial 1, SDMT 30-s, D-KEFS Free Card Sort 1, and COWAT F possess good criterion validity in detecting impairment on their respective overall measure, capturing much of the same information as the full version. Along with the first two trials of the California Verbal Learning Test - Second Edition (CVLT-II), these five highly abbreviated measures may be used to develop a brief screening battery.
ERIC Educational Resources Information Center
Geiser, Saul; Santelices, Maria Veronica
2007-01-01
High-school grades are often viewed as an unreliable criterion for college admissions, owing to differences in grading standards across high schools, while standardized tests are seen as methodologically rigorous, providing a more uniform and valid yardstick for assessing student ability and achievement. The present study challenges that…
1997-02-06
Adjudication Duration 2 2. INTRODUCTION This retrospective study analyzes relationships of variables to adjudication and processing duration in the Army...Package for Social Scientists (SPSS), Standard Version 6.1, June 1994, to determine relationships among the dependent and independent variables... consanguinity between variables. Content and criterion validity is employed to determine the measure of scientific validity. Reliability is also
Hibbard, S; Tang, P C; Latko, R; Park, J H; Munn, S; Bolz, S; Somerville, A
2000-12-01
Thematic Apperception Test (Murray, 1943) responses of 69 Asian American (hereafter, Asian) and 83 White students were coded for defenses according to the Defense Mechanism Manual (Cramer, 1991b) and studied for differential validity in predicting paper-and-pencil measures of relevant constructs. Three tests for differential validity were used: (a) differences between validity coefficients, (b) interactions between predictor and ethnicity in criterion prediction, and (c) differences between groups in mean prediction errors using a common regression equation. Modest differential validity was found. It was surprising that the DMM scales were slightly stronger predictors of their criteria among Asians than among Whites and when a common predictor was used, desirable criteria were overpredicted for Asians, whereas undesirable ones were overpredicted for Whites. The results were not affected by acculturation level or English vocabulary among the Asians.
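The first of the three differential-validity tests listed above, a difference between validity coefficients in two independent groups, is commonly carried out with Fisher's r-to-z transformation. The sketch below assumes that approach; the abstract does not say which test the authors applied. The group sizes (69 and 83) come from the abstract, while the two correlation values and the function name are hypothetical.

```python
import math
from statistics import NormalDist


def compare_independent_correlations(r1, n1, r2, n2):
    """Two-sided z-test for the difference between two independent
    Pearson correlations, using Fisher's r-to-z transformation."""
    z1, z2 = math.atanh(r1), math.atanh(r2)
    standard_error = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    z = (z1 - z2) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical validity coefficients for the two groups (not study results).
z, p = compare_independent_correlations(0.40, 69, 0.20, 83)
print(f"z = {z:.2f}, two-sided p = {p:.3f}")
```

With samples of this size the standard error is fairly large, so even a sizeable gap between two coefficients may not reach significance, which is consistent with the abstract describing the differential validity as modest.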
Validating SPICES as a Screening Tool for Frailty Risks among Hospitalized Older Adults
Aronow, Harriet Udin; Borenstein, Jeff; Haus, Flora; Braunstein, Glenn D.; Bolton, Linda Burnes
2014-01-01
Older patients are vulnerable to adverse hospital events related to frailty. SPICES, a common screening protocol to identify risk factors in older patients, alerts nurses to initiate care plans to reduce the probability of patient harm. However, there is little published validating the association between SPICES and measures of frailty and adverse outcomes. This paper used data from a prospective cohort study on frailty among 174 older adult inpatients to validate SPICES. Almost all patients met one or more SPICES criteria. The sum of SPICES was significantly correlated with age and other well-validated assessments for vulnerability, comorbid conditions, and depression. Individuals meeting two or more SPICES criteria had a risk of adverse hospital events three times greater than individuals with either no or one criterion. Results suggest that as a screening tool used within 24 hours of admission, SPICES is both valid and predictive of adverse events. PMID:24876954
Müller, Alessandra Bombarda; Valentini, Nadia Cristina; Bandeira, Paulo Felipe Ribeiro
2017-05-01
The range of stimuli provided by physical space, toys and care practices contributes to the motor, cognitive and social development of children. However, assessing the quality of child education environments is a challenge, and can be considered a health promotion initiative. This study investigated the criterion, content, and construct validity and the reliability of the Affordances in the Home Environment for Motor Development - Infant Scale (AHEMD-IS), version 3-18 months, for use in daycare settings. Content validation was conducted with the participation of seven motor development and health care experts, and face validation with 20 specialists in health and education. The results indicate the suitability of the adapted AHEMD-IS, evidencing its validity for the daycare setting as a potential tool to assess the opportunities that the collective context offers to child development. Copyright © 2017 Elsevier Inc. All rights reserved.
Construction and Initial Validation of the Multiracial Experiences Measure (MEM)
Yoo, Hyung Chol; Jackson, Kelly; Guevarra, Rudy P.; Miller, Matthew J.; Harrington, Blair
2015-01-01
This article describes the development and validation of the Multiracial Experiences Measure (MEM): a new measure that assesses uniquely racialized risks and resiliencies experienced by individuals of mixed racial heritage. Across two studies, there was evidence for the validation of the 25-item MEM with 5 subscales including Shifting Expressions, Perceived Racial Ambiguity, Creating Third Space, Multicultural Engagement, and Multiracial Discrimination. The 5-subscale structure of the MEM was supported by a combination of exploratory and confirmatory factor analyses. Evidence of criterion-related validity was partially supported with MEM subscales correlating with measures of racial diversity in one’s social network, color-blind racial attitude, psychological distress, and identity conflict. Evidence of discriminant validity was supported with MEM subscales not correlating with impression management. Implications for future research and suggestions for utilization of the MEM in clinical practice with multiracial adults are discussed. PMID:26460977
Construction and initial validation of the Multiracial Experiences Measure (MEM).
Yoo, Hyung Chol; Jackson, Kelly F; Guevarra, Rudy P; Miller, Matthew J; Harrington, Blair
2016-03-01
This article describes the development and validation of the Multiracial Experiences Measure (MEM): a new measure that assesses uniquely racialized risks and resiliencies experienced by individuals of mixed racial heritage. Across 2 studies, there was evidence for the validation of the 25-item MEM with 5 subscales including Shifting Expressions, Perceived Racial Ambiguity, Creating Third Space, Multicultural Engagement, and Multiracial Discrimination. The 5-subscale structure of the MEM was supported by a combination of exploratory and confirmatory factor analyses. Evidence of criterion-related validity was partially supported with MEM subscales correlating with measures of racial diversity in one's social network, color-blind racial attitude, psychological distress, and identity conflict. Evidence of discriminant validity was supported with MEM subscales not correlating with impression management. Implications for future research and suggestions for utilization of the MEM in clinical practice with multiracial adults are discussed. (c) 2016 APA, all rights reserved).
Validation of the Weight Concerns Scale Applied to Brazilian University Students.
Dias, Juliana Chioda Ribeiro; da Silva, Wanderson Roberto; Maroco, João; Campos, Juliana Alvares Duarte Bonini
2015-06-01
The aim of this study was to evaluate the validity and reliability of the Portuguese version of the Weight Concerns Scale (WCS) when applied to Brazilian university students. The scale was completed by 1084 university students from Brazilian public education institutions. A confirmatory factor analysis was conducted. The stability of the model in independent samples was assessed through multigroup analysis, and the invariance was estimated. Convergent, concurrent, divergent, and criterion validities as well as internal consistency were estimated. Results indicated that the one-factor model presented an adequate fit to the sample and values of convergent validity. The concurrent validity with the Body Shape Questionnaire and divergent validity with the Maslach Burnout Inventory for Students were adequate. Internal consistency was adequate, and the factorial structure was invariant in independent subsamples. The results present a simple and short instrument capable of precisely and accurately assessing concerns with weight among Brazilian university students. Copyright © 2015 Elsevier Ltd. All rights reserved.
Development and Validation of Personality Disorder Spectra Scales for the MMPI-2-RF.
Sellbom, Martin; Waugh, Mark H; Hopwood, Christopher J
2018-01-01
The purpose of this study was to develop and validate a set of MMPI-2-RF (Ben-Porath & Tellegen, 2008/2011) personality disorder (PD) spectra scales. These scales could serve the purpose of assisting with DSM-5 PD diagnosis and help link categorical and dimensional conceptions of personality pathology within the MMPI-2-RF. We developed and provided initial validity results for scales corresponding to the 10 PD constructs listed in the DSM-5 using data from student, community, clinical, and correctional samples. Initial validation efforts indicated good support for criterion validity with an external PD measure as well as with dimensional personality traits included in the DSM-5 alternative model for PDs. Construct validity results using psychosocial history and therapists' ratings in a large clinical sample were generally supportive as well. Overall, these brief scales provide clinicians using MMPI-2-RF data with estimates of DSM-5 PD constructs that can support cross-model connections between categorical and dimensional assessment approaches.
Validation of the Australian Propensity for Angry Driving Scale (Aus-PADS).
Leal, Nerida L; Pachana, Nancy A
2009-09-01
The present study used a university sample to assess the test-retest reliability and validity of the Australian Propensity for Angry Driving Scale (Aus-PADS). The scale has stability over time, and convergent validity was established, as Aus-PADS scores correlated significantly with established anger and impulsivity measures. Discriminant validity was also established, as Aus-PADS scores did not correlate with Venturesomeness scores. The Aus-PADS has demonstrated criterion validity, as scores were correlated with behavioural measures, such as yelling at other drivers, gesturing at other drivers, and feeling angry but not doing anything. Aus-PADS scores reliably predicted the frequency of these behaviours over and above other study variables. No significant relationship between aggressive driving and crash involvement was observed. It was concluded that the Aus-PADS is a reliable and valid tool appropriate for use in Australian research, and that the potential relationship between aggressive driving and crash involvement warrants further investigation with a more representative (and diverse) driver sample.
Criterion Related Validity of Karate Specific Aerobic Test (KSAT)
Chaabene, Helmi; Hachana, Younes; Franchini, Emerson; Tabben, Montassar; Mkaouer, Bessem; Negra, Yassine; Hammami, Mehrez; Chamari, Karim
2015-01-01
Background: Karate is one of the most popular combat sports in the world. Physical fitness assessment on a regular basis is important for monitoring the effectiveness of the training program and the readiness of karatekas to compete. Objectives: The aim of this research was to examine the criterion-related validity of the karate specific aerobic test (KSAT) as an indicator of the aerobic level of karate practitioners. Patients and Methods: Cardiorespiratory responses, aerobic performance level through both treadmill laboratory test and YoYo intermittent recovery test level 1 (YoYoIRTL1) as well as time to exhaustion in the KSAT test (TE’KSAT) were determined in a total of fifteen healthy international karatekas (i.e. karate practitioners) (means ± SD: age: 22.2 ± 4.3 years; height: 176.4 ± 7.5 cm; body mass: 70.3 ± 9.7 kg and body fat: 13.2 ± 6%). Results: Peak heart rate obtained from KSAT represented ~99% of maximal heart rate registered during the treadmill test showing that KSAT imposes high physiological demands. There was no significant correlation between KSAT’s TE and relative (mL/min/kg) treadmill maximal oxygen uptake (r = 0.14; P = 0.69; [small]). On the other hand, there was a significant relationship between KSAT’s TE and the velocity associated with VO2max (vVO2max) (r = 0.67; P = 0.03; [large]) as well as the velocity at VO2 corresponding to the second ventilatory threshold (vVO2 VAT) (r = 0.64; P = 0.04; [large]). Moreover, a significant relationship was found between TE’s KSAT and both the total distance covered and parameters of intermittent endurance measured through YoYoIRTL1. Conclusions: The KSAT has not proved to have indirect criterion-related validity as no significant correlations have been found between TE’s KSAT and treadmill VO2max. Nevertheless, as correlated to other aerobic fitness variables, KSAT can be considered as an indicator of karate specific endurance.
The establishment of the criterion related validity of the KSAT requires further investigation. PMID:26446345
Translating and validating a Training Needs Assessment tool into Greek
Markaki, Adelais; Antonakis, Nikos; Hicks, Carolyn M; Lionis, Christos
2007-01-01
Background: The translation and cultural adaptation of widely accepted, psychometrically tested tools is regarded as an essential component of effective human resource management in the primary care arena. The Training Needs Assessment (TNA) is a widely used, valid instrument, designed to measure professional development needs of health care professionals, especially in primary health care. This study aims to describe the translation, adaptation and validation of the TNA questionnaire into the Greek language and discuss possibilities of its use in primary care settings. Methods: A modified version of the English self-administered questionnaire consisting of 30 items was used. Internationally recommended methodology, mandating forward translation, backward translation, reconciliation and pretesting steps, was followed. Tool validation included assessing item internal consistency, using Cronbach's alpha coefficient. Reproducibility (test-retest reliability) was measured by the kappa correlation coefficient. Criterion validity was calculated for selected parts of the questionnaire by correlating respondents' research experience with relevant research item scores. An exploratory factor analysis highlighted how the items group together, using a Varimax (orthogonal) rotation and subsequent Cronbach's alpha assessment. Results: The psychometric properties of the Greek version of the TNA questionnaire for nursing staff employed in primary care were good. Internal consistency of the instrument was very good: Cronbach's alpha was found to be 0.985 (p < 0.001) and the kappa coefficient for reproducibility was found to be 0.928 (p < 0.0001). Significant positive correlations were found between respondents' current performance levels on each of the research items and amount of research involvement, indicating good criterion validity in the areas tested.
Factor analysis revealed seven factors with eigenvalues of > 1.0, KMO (Kaiser-Meyer-Olkin) measure of sampling adequacy = 0.680 and Bartlett's test of sphericity, p < 0.001. Conclusion: The translated and adapted Greek version is comparable with the original English instrument in terms of validity and reliability and it is suitable to assess professional development needs of nursing staff in Greek primary care settings. PMID:17474989
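Cronbach's alpha, the internal-consistency statistic reported in the abstract above, can be computed directly from an item-by-respondent score table. The sketch below uses the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores); the function name and the small response matrix are hypothetical and do not reproduce the TNA data.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item
    scores. Sample variances are used for items and total scores alike."""
    k = len(rows[0])
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses: five respondents, three items on a 1-5 scale.
responses = [
    [4, 5, 4],
    [2, 3, 3],
    [5, 5, 4],
    [1, 2, 2],
    [3, 3, 4],
]
print(f"alpha = {cronbach_alpha(responses):.3f}")
```

Values very close to 1, such as the 0.985 reported here, indicate that the items move almost in lockstep across respondents.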
Validity of the posttraumatic stress disorders (PTSD) checklist in pregnant women.
Gelaye, Bizu; Zheng, Yinnan; Medina-Mora, Maria Elena; Rondon, Marta B; Sánchez, Sixto E; Williams, Michelle A
2017-05-12
The PTSD Checklist-civilian (PCL-C) is one of the most commonly used self-report measures of PTSD symptoms; however, little is known about its validity when used in pregnancy. This study aims to evaluate the reliability and validity of the PCL-C as a screen for detecting PTSD symptoms among pregnant women. A total of 3372 pregnant women who attended their first prenatal care visit in Lima, Peru participated in the study. We assessed the reliability of the PCL-C items using Cronbach's alpha. Criterion validity and performance characteristics of PCL-C were assessed against an independent, blinded Clinician-Administered PTSD Scale (CAPS) interview using measures of sensitivity, specificity and receiver operating characteristics (ROC) curves. We tested construct validity using exploratory and confirmatory factor analytic approaches. The reliability of the PCL-C was excellent (Cronbach's alpha = 0.90). ROC analysis showed that a cut-off score of 26 offered optimal discriminatory power, with a sensitivity of 0.86 (95% CI: 0.78-0.92) and a specificity of 0.63 (95% CI: 0.62-0.65). The area under the ROC curve was 0.75 (95% CI: 0.71-0.78). A three-factor solution was extracted using exploratory factor analysis and was further complemented with three other models using confirmatory factor analysis (CFA). In a CFA, a three-factor model based on DSM-IV symptom structure had reasonable fit statistics with comparative fit index of 0.86 and root mean square error of approximation of 0.09. The Spanish-language version of the PCL-C may be used as a screening tool for pregnant women. The PCL-C has good reliability, criterion validity and factorial validity. The optimal cut-off score obtained by maximizing the sensitivity and specificity should be considered cautiously; women who screened positive may require further investigation to confirm PTSD diagnosis.
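Choosing a screening cut-off by jointly maximizing sensitivity and specificity, as described in the abstract above, is often done with Youden's J statistic (sensitivity + specificity - 1). The sketch below assumes that criterion; the abstract does not name the exact rule used. The function names, scores, and reference diagnoses are hypothetical and unrelated to the PCL-C data.

```python
def sensitivity_specificity(scores, has_condition, cutoff):
    """Sensitivity and specificity when 'score >= cutoff' screens positive."""
    pairs = list(zip(scores, has_condition))
    true_pos = sum(1 for s, y in pairs if y and s >= cutoff)
    false_neg = sum(1 for s, y in pairs if y and s < cutoff)
    true_neg = sum(1 for s, y in pairs if not y and s < cutoff)
    false_pos = sum(1 for s, y in pairs if not y and s >= cutoff)
    return true_pos / (true_pos + false_neg), true_neg / (true_neg + false_pos)


def best_cutoff_by_youden(scores, has_condition):
    """Observed score that maximizes Youden's J = sens + spec - 1.
    Ties are resolved in favour of the lowest such score."""
    def youden_j(cutoff):
        sens, spec = sensitivity_specificity(scores, has_condition, cutoff)
        return sens + spec - 1
    return max(sorted(set(scores)), key=youden_j)


# Hypothetical screening scores with a reference-standard diagnosis.
scores = [24, 25, 27, 30, 35, 18, 20, 22, 23, 26]
diagnosis = [True, True, True, True, True, False, False, False, False, False]

cutoff = best_cutoff_by_youden(scores, diagnosis)
sens, spec = sensitivity_specificity(scores, diagnosis, cutoff)
print(f"cut-off = {cutoff}, sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

A cut-off chosen this way is tuned to the sample at hand, which is one reason the abstract advises treating the optimal score cautiously.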
Biofeedback in Partial Weight Bearing: Validity of 3 Different Devices.
van Lieshout, Remko; Stukstette, Mirelle J; de Bie, Rob A; Vanwanseele, Benedicte; Pisters, Martijn F
2016-11-01
Study Design: Controlled laboratory study to assess criterion-related validity, with a cross-sectional within-subject design. Background: Patients with orthopaedic conditions have difficulties complying with partial weight-bearing instructions. Technological advances have resulted in biofeedback devices that offer real-time feedback. However, the accuracy of these devices is mostly unknown. Inaccurate feedback can result in incorrect lower-limb loading and may lead to delayed healing. Objectives: To investigate validity of peak force measurements obtained using 3 different biofeedback devices under varying levels of partial weight-bearing categories. Methods: Validity of 3 biofeedback devices (OpenGo science, SmartStep, and SensiStep) was assessed. Healthy participants were instructed to walk at a self-selected speed with crutches under 3 different weight-bearing conditions, categorized as a percentage range of body weight: 1% to 20%, greater than 20% to 50%, and greater than 50% to 75%. Peak force data from the biofeedback devices were compared with the peak vertical ground reaction force measured with a force plate. Criterion validity was estimated using simple and regression-based Bland-Altman 95% limits of agreement and weighted kappas. Results: Fifty-five healthy adults (58% male) participated. Agreement with the gold standard was substantial for the SmartStep, moderate for OpenGo science, and slight for SensiStep (weighted κ = 0.76, 0.58, and 0.19, respectively). For the 1% to 20% and greater than 20% to 50% weight-bearing categories, both the OpenGo science and SmartStep had acceptable limits of agreement. For the weight-bearing category greater than 50% to 75%, none of the devices had acceptable agreement. Conclusion: The OpenGo science and SmartStep provided valid feedback in the lower weight-bearing categories, and the SensiStep showed poor validity of feedback in all weight-bearing categories. J Orthop Sports Phys Ther 2016;46(11):-1. Epub 12 Oct 2016.
doi:10.2519/jospt.2016.6625.
Jin, X F; Wang, J; Li, Y J; Liu, J F; Ni, D F
2016-09-20
Objective: To cross-culturally translate the Questionnaire of Olfactory Disorders (QOD) into a simplified Chinese version and evaluate its reliability and validity in clinical practice. Method: The simplified Chinese version of the QOD was evaluated for test-retest reliability, split-half reliability, and internal consistency. It was then evaluated for validity, including content validity, criterion-related validity, and responsiveness. Criterion-related validity was assessed using the Medical Outcomes Study 36-Item Short Form Health Survey (SF-36) and the World Health Organization Quality of Life-BREF (WHOQOL-BREF) for comparison. Result: A total of 239 patients with olfactory dysfunction were enrolled and tested, of whom 195 completed all three surveys (QOD, SF-36, WHOQOL-BREF). The test-retest reliabilities of the QOD-parosmia statements (QOD-P), QOD-quality of life (QOD-QoL), and QOD-visual analogue scale (QOD-VAS) sections were 0.799 (P < 0.01), 0.781 (P < 0.01), and 0.488 (P < 0.01), respectively, and the Cronbach's α coefficients were 0.477, 0.812, and 0.889, respectively. The split-half reliability of the QOD-QoL was 0.89. There was no correlation between the QOD-P section and the SF-36, but there were statistically significant correlations between the QOD-QoL and QOD-VAS sections and the SF-36. There was no correlation between the QOD-P section and the WHOQOL-BREF, but there were statistically significant correlations between the QOD-QoL and QOD-VAS sections with the SF-36 in most sections. Conclusion: The simplified Chinese version of the QOD was shown to be a reliable and valid questionnaire for evaluating patients with olfactory dysfunction living in mainland China. The QOD-P section needs further modification to properly suit patients with a Chinese cultural and knowledge background. Copyright© by the Editorial Department of Journal of Clinical Otorhinolaryngology Head and Neck Surgery.
Downer, Jason T.; Booren, Leslie M.; Lima, Olivia K.; Luckner, Amy E.; Pianta, Robert C.
2012-01-01
This paper introduces the Individualized Classroom Assessment Scoring System (inCLASS), an observation tool that targets children’s interactions in preschool classrooms with teachers, peers, and tasks. In particular, initial evidence is reported of the extent to which the inCLASS meets the following psychometric criteria: inter-rater reliability, normal distributions and adequate range, construct validity, and criterion-related validity. These initial findings suggest that the inCLASS has the potential to provide an authentic, contextualized assessment of young children’s classroom behaviors. Future directions for research with the inCLASS are discussed. PMID:23175598
Meunier, Jean-Christophe; Roskam, Isabelle
2009-01-01
This study presents a validation of a scale that assesses parents' childrearing behavior toward young children. The scale was validated on 565 parents of 2- to 7-year-old children. The current results replicated the factor solution of the original scale designed for parents of school-aged children. The scale demonstrated good psychometric properties: moderate to high internal consistency, the expected relations with criterion variables (parental self-efficacy beliefs, child's behavior and personality), and discriminative properties according to the parents' gender and educational level, the child's age and gender, and the difference between referred and nonreferred children.
Propagation of an ultrashort, intense laser pulse in a relativistic plasma
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ritchie, B.; Decker, C.D.
1997-12-31
A Maxwell-relativistic fluid model is developed for the propagation of an ultrashort, intense laser pulse through an underdense plasma. The separability of plasma and optical frequencies (ω_p and ω, respectively) for small ω_p/ω is not assumed; thus the validity of multiple-scales theory (MST) can be tested. The theory is valid when ω_p/ω is of order unity or for cases in which ω_p/ω ≪ 1 but strongly relativistic motion causes higher-order plasma harmonics to be generated which overlap the region of the first-order laser harmonic, such that MST would not be expected to be valid although its principal validity criterion ω_p/ω ≪ 1 holds.
Reliability and validity of a combat exposure index for Vietnam era veterans.
Janes, G R; Goldberg, J; Eisen, S A; True, W R
1991-01-01
The reliability and validity of a self-report measure of combat exposure are examined in a cohort of male-male twin pairs who served in the military during the Vietnam era. Test-retest reliability for a five-level ordinal index of combat exposure is assessed by use of 192 duplicate sets of responses. The chance-corrected proportion in agreement (as measured by the kappa coefficient) is .84. As a measure of criterion-related validity, the combat index is correlated with the award of combat-related military medals ascertained from the military records. The probability of receiving a Purple Heart, Bronze Star, Commendation Medal and Combat Infantry Badge is associated strongly with the combat exposure index. These results show that this simple index is a reliable and valid measure of combat exposure.
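The chance-corrected agreement mentioned above can be illustrated with a short sketch. It assumes the unweighted form of Cohen's kappa (the abstract does not say whether a weighted variant was used for the ordinal index), and the function name and sample ratings are hypothetical.

```python
from collections import Counter


def cohen_kappa(first, second):
    """Unweighted Cohen's kappa for two equally long lists of category labels."""
    if len(first) != len(second) or not first:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(first)
    # Observed proportion of exact agreement between the two administrations.
    observed = sum(a == b for a, b in zip(first, second)) / n
    # Agreement expected by chance, from the marginal category frequencies.
    first_counts = Counter(first)
    second_counts = Counter(second)
    expected = sum(first_counts[c] * second_counts[c] for c in first_counts) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical duplicate responses on a five-level index (levels 0 to 4).
test_responses = [0, 1, 2, 3, 4, 0, 1, 2, 3, 4]
retest_responses = [0, 1, 2, 3, 4, 0, 1, 2, 3, 3]
print(cohen_kappa(test_responses, retest_responses))
```

A value of 1 indicates perfect agreement and 0 indicates agreement no better than chance, which is the scale on which the reported .84 should be read.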
Cross-cultural validity of a dietary questionnaire for studies of dental caries risk in Japanese
2014-01-01
Background Diet is a major modifiable contributing factor in the etiology of dental caries. The purpose of this paper is to examine the reliability and cross-cultural validity of the Japanese version of the Food Frequency Questionnaire to assess dietary intake in relation to dental caries risk in Japanese. Methods The 38-item Food Frequency Questionnaire, in which Japanese food items were added to increase content validity, was translated into Japanese, and administered to two samples. The first sample comprised 355 pregnant women with mean age of 29.2 ± 4.2 years for the internal consistency and criterion validity analyses. Factor analysis (principal components with Varimax rotation) was used to determine dimensionality. The dietary cariogenicity score was calculated from the Food Frequency Questionnaire and used for the analyses. Salivary mutans streptococci level was used as a semi-quantitative assessment of dental caries risk and measured by Dentocult SM. Dentocult SM scores were compared with the dietary cariogenicity score computed from the Food Frequency Questionnaire to examine criterion validity, and assessed by Spearman’s correlation coefficient (rs) and Kruskal-Wallis test. Test-retest reliability of the Food Frequency Questionnaire was assessed with a second sample of 25 adults with mean age of 34.0 ± 3.0 years by using the intraclass correlation coefficient analysis. Results The Japanese language version of the Food Frequency Questionnaire showed high test-retest reliability (ICC = 0.70) and good criterion validity assessed by relationship with salivary mutans streptococci levels (rs = 0.22; p < 0.001). Factor analysis revealed four subscales that construct the questionnaire (solid sugars, solid and starchy sugars, liquid and semisolid sugars, sticky and slowly dissolving sugars). Internal consistency was low to acceptable (Cronbach’s alpha = 0.67 for the total scale, 0.46-0.61 for each subscale).
Mean dietary cariogenicity scores were 50.8 ± 19.5 in the first sample, and 47.4 ± 14.1 and 40.6 ± 11.3 for the first and second administrations in the second sample. The distribution of Dentocult SM score was 6.8% (score = 0), 34.4% (score = 1), 39.4% (score = 2), and 19.4% (score = 3). Participants with higher scores were more likely to have higher dietary cariogenicity scores (p < 0.001; Kruskal-Wallis test). Conclusions These results provide preliminary evidence for the reliability and validity of the Japanese language Food Frequency Questionnaire. PMID:24383547
Spyridou, Andria; Schauer, Maggie; Ruf-Leuschner, Martina
2015-02-21
Prenatal assessment for psychosocial risk factors and prevention and intervention is scarce and, in most cases, nonexistent in obstetrical care. In this study we aimed to evaluate if the KINDEX, a short instrument developed in Germany, is a useful tool in the hands of non-trained medical staff, in order to identify and refer women at psychosocial risk to the adequate mental health and social services. We also examined the criterion-related concurrent validity of the tool through a validation interview carried out by an expert clinical psychologist. Our final objective was to achieve the cultural adaptation of the KINDEX Greek Version and to offer a valid tool for the psychosocial risk assessment to the obstetric care providers. Two obstetricians and five midwives carried out 93 KINDEX interviews (duration 20 minutes) with pregnant women to assess psychosocial risk factors present during pregnancy. Afterwards they referred women whom they identified as having two or more psychosocial risk factors to the mental health attention unit of the hospital. During the validation procedure an expert clinical psychologist carried out diagnostic interviews with a randomized subsample of 50 pregnant women based on established diagnostic instruments for stress and psychopathology, like the PSS-14, ESI, PDS, HSCL-25. Significant correlations between the results obtained through the assessment using the KINDEX and the risk areas of stress, psychopathology and trauma load assessed in the validation interview demonstrate the criterion-related concurrent validity of the KINDEX. The referral accuracy of the medical staff is confirmed through comparisons between pregnant women who have and have not been referred to the mental health attention unit. Prenatal screenings for psychosocial risks like the KINDEX are feasible in public health settings in Greece. In addition, validity was confirmed in high correlations between the KINDEX results and the results of the validation interviews.
The KINDEX Greek version can be considered a valid tool, which can be used by non-trained medical staff providing obstetrical care to identify high-risk women and refer them to adequate mental health and social services. These kinds of assessments are indispensable for the promotion of a healthy family environment and child development.
Myers, Michael J; Yancy, Haile F; Araneta, Michael; Armour, Jennifer; Derr, Janice; Hoostelaere, Lawrence A D; Farmer, Doris; Jackson, Falana; Kiessling, William M; Koch, Henry; Lin, Huahua; Liu, Yan; Mowlds, Gabrielle; Pinero, David; Riter, Ken L; Sedwick, John; Shen, Yuelian; Wetherington, June; Younkins, Ronsha
2006-01-01
A method trial was initiated to validate the use of a commercial DNA forensic kit to extract DNA from animal feed as part of a PCR-based method. Four different PCR primer pairs (one bovine pair, one porcine pair, one ovine primer pair, and one multispecies pair) were also evaluated. Each laboratory was required to analyze a total of 120 dairy feed samples either not fortified (control, true negative) or fortified with bovine meat and bone meal, porcine meat and bone meal (PMBM), or lamb meal. Feeds were fortified with the animal meals at a concentration of 0.1% (wt/wt). Ten laboratories participated in this trial, and each laboratory was required to evaluate two different primer pairs, i.e., each PCR primer pair was evaluated by five different laboratories. The method was considered to be validated for a given animal source when three or more laboratories achieved at least 97% accuracy (29 correct of 30 samples for 96.7% accuracy, rounded up to 97%) in detecting the fortified samples for that source. Using this criterion, the method was validated for the bovine primer because three laboratories met the criterion, with an average accuracy of 98.9%. The average false-positive rate was 3.0% in these laboratories. A fourth laboratory was 80% accurate in identifying the samples fortified with bovine meat and bone meal. A fifth laboratory was not able to consistently extract the DNA from the feed samples and did not achieve the criterion for accuracy for either the bovine or multispecies PCR primers. For the porcine primers, the method was validated, with four laboratories meeting the criterion for accuracy with an average accuracy of 99.2%. The fifth laboratory had a 93.3% accuracy outcome for the porcine primer. Collectively, these five laboratories had a 1.3% false-positive rate for the porcine primer. 
No laboratory was able to meet the criterion for accuracy with the ovine primers, most likely because of problems with the synthesis of the primer pair; none of the positive control DNA samples could be detected with the ovine primers. The multispecies primer pair was validated in three laboratories for use with bovine meat and bone meal and lamb meal but not with PMBM. The three laboratories had an average accuracy of 98.9% for bovine meat and bone meal, 97.8% for lamb meal, and 63.3% for PMBM. When examined on an individual laboratory basis, one of these four laboratories could not identify a single feed sample containing PMBM by using the multispecies primer, whereas the other laboratory identified only one PMBM-fortified sample, suggesting that the limit of detection for PMBM with this primer pair is around 0.1% (wt/wt). The results of this study demonstrated that the DNA forensic kit can be used to extract DNA from animal feed, which can then be used for PCR analysis to detect animal-derived protein present in the feed sample.
González-Sánchez, Manuel; Ruiz-Muñoz, Maria; Li, Guang Zhi; Cuesta-Vargas, Antonio I
2018-08-01
To perform a cross-cultural adaptation and validation of the Foot Function Index (FFI) questionnaire to develop the Chinese version. Three hundred and six patients with foot and ankle neuromusculoskeletal diseases participated in this observational study. Construct validity, internal consistency and criterion validity were calculated for the FFI Chinese version after the translation and transcultural adaptation process. Internal consistency ranged from 0.996 to 0.998. Test-retest analysis ranged from 0.985 to 0.994; minimal detectable change 90: 2.270; standard error of measurement: 0.973. Load distribution of the three factors had an eigenvalue greater than 1. Chi-square value was 9738.14 (p < 0.001). Correlations among the three factors were significant between Factor 1 and the other two: r = -0.634 (Factor 2) and r = -0.191 (Factor 3). Foot Function Index (Taiwan Version), Short-Form 12 (Version 2) and EuroQol-5D were used for criterion validity. Factors 1 and 2 showed significant correlation with 15/16 and 14/16 scales and subscales, respectively. Foot Function Index Chinese version psychometric characteristics were good to excellent. Chinese researchers and clinicians may use this tool for foot and ankle assessment and monitoring. Implications for rehabilitation A cross-cultural adaptation of the FFI has been done from the original version to Chinese. Consistent results and satisfactory psychometric properties of the Foot Function Index Chinese version have been reported. Chinese-speaking researchers and clinicians could use the FFI-Ch as a tool to assess patients with foot disease.
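The minimal detectable change and the standard error of measurement reported above can be related through the conventional formula MDC90 = z × √2 × SEM. The abstract does not state which formula or z value was used, so the sketch below is an illustration under that assumption; the function name and the default z = 1.65 are choices made here.

```python
import math


def minimal_detectable_change(sem, z=1.65):
    """Return z * sqrt(2) * SEM, the conventional minimal detectable change.

    z = 1.65 is an assumed rounding of the 90% confidence multiplier.
    """
    if sem < 0:
        raise ValueError("SEM must be non-negative")
    return z * math.sqrt(2) * sem


# SEM taken from the abstract; the result is only a consistency check.
mdc90 = minimal_detectable_change(0.973)
print(round(mdc90, 3))
```

With SEM = 0.973 and z = 1.65 the result is about 2.27, in line with the reported value of 2.270; using z = 1.645 instead gives about 2.26.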
from the Adolescents’ Perspective in Malaysia
Mohd Zin, Faridah; Hillaluddin, Azlin Hilma; Mustaffa, Jamaludin
2017-05-01
Objective: This study aims to develop, validate and determine the reliability of an interactive multimedia strategy to prevent tobacco use among the young (TUPY-S) from an adolescents’ perspective. Methods: A descriptive study design was utilized. A modular instruction guideline by Russel (1974) was followed in the entire process, comprising a feasibility study, a review of existing modules, specification of the objectives, identification of the construct criterion items, learner analysis and entry behavior specification, establishment of the sequence instruction and media selection, a tryout with students and a field test. Result: Feasibility was agreed among the researchers and the school authorities. Culturally suitable, rigorously developed tobacco use preventive strategies delivered using information technology (IT) are lacking in the literature. The objective of TUPY-S is to prevent tobacco use among adolescents living in Malaysia. Identified construct criterion items include knowledge, attitude, intention to use, self-efficacy, and refusal skill. The target population was early adolescents belonging to generation-Z. Content was developed from the adolescents’ perspective and delivered using IT in Malay language. Content validity, assessed by six experts in the field and module development, was good at 86%. The students’ tryout showed satisfactory face validity subjectively and objectively (85.5%) and high Cronbach's alpha reliability (0.91). Conclusion: TUPY-S was confirmed to suit early adolescents of the current generation living in Malaysia. It demonstrated good content validity among the experts, and satisfactory face validity and reliability among the target population. TUPY-S is ready to be evaluated for its effectiveness among early adolescents.
Wang, Lin; Hui, Stanley Sai-chuen; Wong, Stephen Heung-sang
2014-11-15
The current study aimed to examine the validity of various published bioelectrical impedance analysis (BIA) equations in estimating FFM among Chinese children and adolescents and to develop BIA equations for the estimation of fat-free mass (FFM) appropriate for Chinese children and adolescents. A total of 255 healthy Chinese children and adolescents aged 9 to 19 years old (127 males and 128 females) from Tianjin, China, participated in the BIA measurement at 50 kHz between the hand and the foot. The criterion measure of FFM was also employed using dual-energy X-ray absorptiometry (DEXA). FFM estimated from 24 published BIA equations was cross-validated against the criterion measure from DEXA. Multiple linear regression was conducted to examine an alternative BIA equation for the studied population. FFM estimated from the 24 published BIA equations yielded high correlations with the directly measured FFM from DEXA. However, none of the 24 equations was statistically equivalent with the DEXA-measured FFM. Using multiple linear regression and cross-validation against DEXA measurement, an alternative prediction equation was determined as follows: FFM (kg) = 1.613 + 0.742 × height² (cm²)/impedance (Ω) + 0.151 × body weight (kg); R² = 0.95; SEE = 2.45 kg; CV = 6.5; 93.7% of the residuals of all the participants fell within the 95% limits of agreement. BIA was highly correlated with FFM in Chinese children and adolescents. When the newly developed BIA equations are applied, BIA can provide a practical and valid measurement of body composition in Chinese children and adolescents.
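The prediction equation reported above can be written as a small function. This is a minimal sketch: the coefficients come from the abstract, while the function name, the input checks and the example values are hypothetical and chosen only for illustration.

```python
def estimate_ffm_kg(height_cm, impedance_ohm, weight_kg):
    """Fat-free mass (kg) from the regression equation quoted in the abstract.

    FFM = 1.613 + 0.742 * height^2 / impedance + 0.151 * weight,
    with height in cm, impedance in ohms (measured at 50 kHz) and weight in kg.
    """
    if height_cm <= 0 or impedance_ohm <= 0 or weight_kg <= 0:
        raise ValueError("height, impedance and weight must be positive")
    return 1.613 + 0.742 * height_cm ** 2 / impedance_ohm + 0.151 * weight_kg


# Hypothetical participant: 150 cm tall, 600 ohm impedance, 40 kg body weight.
example_ffm = estimate_ffm_kg(150.0, 600.0, 40.0)
print(round(example_ffm, 2))
```

For these illustrative inputs the estimate is roughly 35.5 kg. The equation was derived for participants aged 9 to 19 years, so applying it outside that range would go beyond what the abstract supports.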
Wang, Lin; Hui, Stanley Sai-chuen; Wong, Stephen Heung-sang
2014-01-01
Background The current study aimed to examine the validity of various published bioelectrical impedance analysis (BIA) equations in estimating FFM among Chinese children and adolescents and to develop BIA equations for the estimation of fat-free mass (FFM) appropriate for Chinese children and adolescents. Material/Methods A total of 255 healthy Chinese children and adolescents aged 9 to 19 years old (127 males and 128 females) from Tianjin, China, participated in the BIA measurement at 50 kHz between the hand and the foot. The criterion measure of FFM was also employed using dual-energy X-ray absorptiometry (DEXA). FFM estimated from 24 published BIA equations was cross-validated against the criterion measure from DEXA. Multiple linear regression was conducted to examine an alternative BIA equation for the studied population. Results FFM estimated from the 24 published BIA equations yielded high correlations with the directly measured FFM from DEXA. However, none of the 24 equations was statistically equivalent with the DEXA-measured FFM. Using multiple linear regression and cross-validation against DEXA measurement, an alternative prediction equation was determined as follows: FFM (kg) = 1.613 + 0.742 × height² (cm²)/impedance (Ω) + 0.151 × body weight (kg); R² = 0.95; SEE = 2.45 kg; CV = 6.5; 93.7% of the residuals of all the participants fell within the 95% limits of agreement. Conclusions BIA was highly correlated with FFM in Chinese children and adolescents. When the newly developed BIA equations are applied, BIA can provide a practical and valid measurement of body composition in Chinese children and adolescents. PMID:25398209
Pellegrino, Federica; Groff, Elena; Bastiani, Luca; Fattori, Bruno; Sotti, Guido
2015-04-01
Xerostomia is the most common acute and late side effect of radiation treatment for head and neck cancer. Affecting taste perception, chewing, swallowing and speech, xerostomia is also the major cause of decreased quality of life. The aims of this study were to validate the Italian translation of the self-reported eight-item xerostomia questionnaire (XQ) and determine its psychometric properties in patients treated with radiotherapy for head and neck cancer. An observational cross-sectional study was conducted in the Radiotherapy Unit of the Veneto Institute of Oncology - IOV in Padua. The XQ was translated according to international guidelines and filled out by 102 patients. Construct validity was assessed using principal component analysis, internal consistency using Cronbach's α coefficient and test-retest reliability at 1-month interval using the intraclass correlation coefficient (ICC). Criterion-related validity was evaluated to compare the Italian version of XQ with the European Organization for Research and Treatment of Cancer (EORTC) Core Quality-of-Life Questionnaire (QLQ-C30) and its Head and Neck Cancer Module (QLQ-H&N35). Cronbach's α for the Italian version of XQ was strong at α = 0.93, test-retest reliability was also strong (0.79) and factor analysis confirmed that the questionnaire was one-dimensional. Criterion-related validity was excellent with high association with the EORTC QLQ-H&N35 xerostomia and sticky saliva scales. The Italian version of XQ has excellent psychometric properties and can be used to evaluate the impact of emerging radiation delivery techniques aiming at preventing xerostomia.
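Internal consistency figures such as the Cronbach's α quoted above are computed from item-level responses. The sketch below uses the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of total scores); the function name and the toy response matrix are hypothetical and unrelated to the study data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    if len(responses) < 2:
        raise ValueError("at least two respondents are required")
    k = len(responses[0])
    if k < 2 or any(len(row) != k for row in responses):
        raise ValueError("every respondent needs the same number (>= 2) of items")
    # Sample variance of each item across respondents.
    item_variances = [variance([row[j] for row in responses]) for j in range(k)]
    # Sample variance of the summed scale score.
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("alpha is undefined when total scores do not vary")
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Toy data: three respondents answering a two-item scale.
toy_responses = [[1, 1], [2, 3], [3, 2]]
print(cronbach_alpha(toy_responses))
```

Values closer to 1 indicate that the items vary together, which is the sense in which α = 0.93 is described as strong.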
Park, Young-Jae; Lee, Jin-Moo; Yoo, Seung-Yeon; Park, Young-Bae
2016-04-01
To examine whether color parameters of tongue inspection (TI) using a digital camera were reliable and valid, and to examine which color parameters serve as predictors of symptom patterns in terms of East Asian medicine (EAM). Two hundred female subjects' tongue substances were photographed by a mega-pixel digital camera. In addition to being photographed, the subjects were asked to complete Yin deficiency, Phlegm pattern, and Cold-Heat pattern questionnaires. Using three sets of digital imaging software, each digital image was exposure- and white balance-corrected, and finally L* (luminance), a* (red-green balance), and b* (yellow-blue balance) values of the tongues were calculated. To examine intra- and inter-rater reliabilities and criterion validity of the color analysis method, three raters were asked to calculate color parameters for 20 digital image samples. Finally, four hierarchical regression models were formed. Color parameters showed good or excellent reliability (0.627-0.887 for intra-class correlation coefficients) and significant criterion validity (0.523-0.718 for Spearman's correlation). In the hierarchical regression models, age was a significant predictor of Yin deficiency (β = 0.192), and b* value of the tip of the tongue was a determinant predictor of Yin deficiency, Phlegm, and Heat patterns (β = -0.212, -0.172, and -0.163). Luminance (L*) was predictive of Yin deficiency (β = -0.172) and Cold (β = 0.173) pattern. Our results suggest that color analysis of the tongue using the L*a*b* system is reliable and valid, and that color parameters partially serve as symptom pattern predictors in EAM practice.
Hoenig, Helen M; Amis, Kristopher; Edmonds, Carol; Morgan, Michelle S; Landerman, Lawrence; Caves, Kevin
2017-01-01
Background There is limited research about the effects of video quality on the accuracy of assessments of physical function. Methods A repeated measures study design was used to assess reliability and validity of the finger-nose test (FNT) and the finger-tapping test (FTT) carried out with 50 veterans who had impairment in gross and/or fine motor coordination. Videos were scored by expert raters under eight differing conditions, including in-person, high definition video with slow motion review and standard speed videos with varying bit rates and frame rates. Results FTT inter-rater reliability was excellent with slow motion video (ICC 0.98-0.99) and good (ICC 0.59) under the normal speed conditions. Inter-rater reliability for FNT 'attempts' was excellent (ICC 0.97-0.99) for all viewing conditions; for FNT 'misses' it was good to excellent (ICC 0.89) with slow motion review but substantially worse (ICC 0.44) on the normal speed videos. FTT criterion validity (i.e. compared to slow motion review) was excellent (β = 0.94) for the in-person rater and good ( β = 0.77) on normal speed videos. Criterion validity for FNT 'attempts' was excellent under all conditions ( r ≥ 0.97) and for FNT 'misses' it was good to excellent under all conditions ( β = 0.61-0.81). Conclusions In general, the inter-rater reliability and validity of the FNT and FTT assessed via video technology is similar to standard clinical practices, but is enhanced with slow motion review and/or higher bit rate.
da Silva, Wanderson Roberto; Dias, Juliana Chioda Ribeiro; Maroco, João; Campos, Juliana Alvares Duarte Bonini
2014-09-01
This study aimed at evaluating the validity, reliability, and factorial invariance of the complete (34-item) and shortened (8-item and 16-item) versions of the Body Shape Questionnaire (BSQ) when applied to Brazilian university students. A total of 739 female students with a mean age of 20.44 (standard deviation=2.45) years participated. Confirmatory factor analysis was conducted to verify the degree to which the one-factor structure satisfies the proposal for the BSQ's expected structure. Two items of the 34-item version were excluded because they had factor weights (λ)<.40. All models had adequate convergent validity (average variance extracted=.43-.58; composite reliability=.85-.97) and internal consistency (α=.85-.97). The 8-item B version was considered the best shortened BSQ version (Akaike information criterion=84.07, Bayes information criterion=157.75, Browne-Cudeck criterion=84.46), with strong invariance for independent samples (Δχ²λ(7)=5.06, Δχ²Cov(8)=5.11, Δχ²Res(16)=19.30). Copyright © 2014 Elsevier Ltd. All rights reserved.
Buekenhout, Imke; Leitão, José; Gomes, Ana A
2018-05-24
Month ordering tasks have been used in experimental settings to obtain measures of working memory (WM) capacity in older/clinical groups based solely on their face validity. We sought to assess the appropriateness of using a month ordering task in other contexts, including clinical settings, as a psychometrically sound WM assessment. To this end, we constructed a month ordering task (ucMOT), studied its reliability (internal consistency and temporal stability), and gathered construct-related and criterion-related validity evidence for its use as a WM assessment. The ucMOT proved to be internally consistent and temporally stable, and analyses of the criterion-related validity evidence revealed that its scores predicted the efficiency of language comprehension processes known to depend crucially on WM resources, namely, processes involved in pronoun interpretation. Furthermore, all ucMOT items discriminated between younger and older age groups; the global scores were significantly correlated with scores on well-established WM tasks and presented lower correlations with instruments that evaluate different (although related) processes, namely, inhibition and processing speed. We conclude that the ucMOT possesses solid psychometric properties. Accordingly, we acquired normative data for the Portuguese population, which we present as a regression-based algorithm that yields z scores adjusted for age, gender, and years of formal education. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
López-Villalobos, José A; Andrés-De Llano, Jesús; López-Sánchez, María V; Rodríguez-Molinero, Luis; Garrido-Redondo, Mercedes; Sacristán-Martín, Ana M; Martínez-Rivera, María T; Alberola-López, Susana
2017-02-01
The aim of this research is to analyze Attention Deficit Hyperactivity Disorder Rating Scales IV (ADHD RS-IV) criterion validity and its clinical usefulness for the assessment of Attention Deficit Hyperactivity Disorder (ADHD) as a function of assessment method and age. A sample was obtained from an epidemiological study (n = 1095, 6-16 years). Clinical cases of ADHD (ADHD-CL) were selected by dimensional ADHD RS-IV and later by clinical interview (DSM-IV). ADHD-CL cases were compared with four categorical results of ADHD RS-IV provided by parents (CATPA), teachers (CATPR), either parents or teachers (CATPAOPR) and both parents and teachers (CATPA&PR). Criterion validity and clinical usefulness of the answer modalities to ADHD RS-IV were studied. ADHD-CL rate was 6.9% in childhood, 6.2% in preadolescence and 6.9% in adolescence. Alternative methods to the clinical interview led to increased numbers of ADHD cases in all age groups analyzed, in the following sequence: CATPAOPR > CATPR > CATPA > CATPA&PR > ADHD-CL. CATPA&PR was the procedure with the greatest validity, specificity and clinical usefulness in all three age groups, particularly in childhood. Isolated use of ADHD RS-IV leads to an increase in ADHD cases compared to clinical interview, and varies depending on the procedure used.
The Work-Health-Check (WHC): a brief new tool for assessing psychosocial stress in the workplace.
Gadinger, M C; Schilling, O; Litaker, D; Fischer, J E
2012-01-01
Brief, psychometrically robust questionnaires assessing work-related psychosocial stressors are lacking. The purpose of the study is to evaluate the psychometric properties of a brief new questionnaire for assessing sources of work-related psychosocial stress. Managers and blue- and white-collar workers (n = 628 at measurement point one, n = 459 at measurement point two) were sampled from an online panel of a German marketing research institute. We either developed or identified appropriate items from existing questionnaires for ten scales, which are conceptually based in work stress models and reflected either work-related demands or resources. Factorial structure was evaluated by confirmatory factor analyses (CFA). Scale reliability was assessed by Cronbach's Alpha and test-retest correlations; correlations with work-related efforts demonstrated convergent and discriminant validity for the demand and resource scales, respectively. Scale correlations with health indicators tested criterion validity. All scales had satisfactory reliability (Cronbach's Alpha: 0.74-0.93, retest reliabilities: 0.66-0.81). CFA supported the anticipated factorial structure. Significant correlations between job-related efforts and demand scales (mean r=0.44) and non-significant correlations with the resource scales (mean r=0.07) suggested good convergent and discriminant validity, respectively. Scale correlations with health indicators demonstrated good criterion validity. The WHC appears to be a brief, psychometrically robust instrument for assessing work-related psychosocial stressors.
Development and Validation of the Five-by-Five Resilience Scale.
DeSimone, Justin A; Harms, P D; Vanhove, Adam J; Herian, Mitchel N
2017-09-01
This article introduces a new measure of resilience and five related protective factors. The Five-by-Five Resilience Scale (5×5RS) is developed on the basis of theoretical and empirical considerations. Two samples ( N = 475 and N = 613) are used to assess the factor structure, reliability, convergent validity, and criterion-related validity of the 5×5RS. Confirmatory factor analysis supports a bifactor model. The 5×5RS demonstrates adequate internal consistency as evidenced by Cronbach's alpha and empirical reliability estimates. The 5×5RS correlates positively with the Connor-Davidson Resilience Scale (CD-RISC), a commonly used measure of resilience. The 5×5RS exhibits similar criterion-related validity to the CD-RISC as evidenced by positive correlations with satisfaction with life, meaning in life, and secure attachment style as well as negative correlations with rumination and anxious or avoidant attachment styles. 5×5RS scores are positively correlated with healthy behaviors such as exercise and negatively correlated with sleep difficulty and symptomology of anxiety and depression. The 5×5RS incrementally explains variance in some criteria above and beyond the CD-RISC. Item responses are modeled using the graded response model. Information estimates demonstrate the ability of the 5×5RS to assess individuals within at least one standard deviation of the mean on relevant latent traits.
PKIX Certificate Status in Hybrid MANETs
NASA Astrophysics Data System (ADS)
Muñoz, Jose L.; Esparza, Oscar; Gañán, Carlos; Parra-Arnau, Javier
Certificate status validation is a hard problem in general but it is particularly complex in Mobile Ad-hoc Networks (MANETs) because we require solutions to manage both the lack of fixed infrastructure inside the MANET and the possible absence of connectivity to trusted authorities when the certification validation has to be performed. In this sense, certificate acquisition is usually assumed as an initialization phase. However, certificate validation is a critical operation since the node needs to check the validity of certificates in real-time, that is, when a particular certificate is going to be used. In such MANET environments, it may happen that the node is placed in a part of the network that is disconnected from the source of status data at the moment the status checking is required. Proposals in the literature suggest the use of caching mechanisms so that the node itself or a neighbour node has some status checking material (typically on-line status responses or lists of revoked certificates). However, to the best of our knowledge the only criterion to evaluate the cached (obsolete) material is the time. In this paper, we analyse how to deploy a certificate status checking PKI service for hybrid MANET and we propose a new criterion based on risk to evaluate cached status data that is much more appropriate and absolute than time because it takes into account the revocation process.
Spathis, Jemima Grace; Connick, Mark James; Beckman, Emma Maree; Newcombe, Peter Anthony; Tweedy, Sean Michael
2015-01-01
Paralympic throwing events for athletes with physical impairments comprise seated and standing javelin, shot put, discus and seated club throwing. Identification of talented throwers would enable prediction of future success and promote participation; however, a valid and reliable talent identification battery for Paralympic throwing has not been reported. This study evaluates the reliability and validity of a talent identification battery for Paralympic throws. Participants were non-disabled so that impairment would not confound analyses, and results would provide an indication of normative performance. Twenty-eight non-disabled participants (13 M; 15 F) aged 23.6 years (±5.44) performed five kinematically distinct criterion throws (three seated, two standing) and nine talent identification tests (three anthropometric, six motor); 23 were tested a second time to evaluate test-retest reliability. Talent identification test-retest reliability was evaluated using Intra-class Correlation Coefficient (ICC) and Bland-Altman plots (Limits of Agreement). Spearman's correlation assessed strength of association between criterion throws and talent identification tests. Reliability was generally acceptable (mean ICC = 0.89), but two seated talent identification tests require more extensive familiarisation. Correlation strength (mean rs = 0.76) indicated that the talent identification tests can be used to validly identify individuals with competitively advantageous attributes for each of the five kinematically distinct throwing activities. Results facilitate further research in this understudied area.
Sajjad, Madiha; Khan, Rehan Ahmed; Yasmeen, Rahila
2018-01-01
To develop a tool to evaluate faculty perceptions of assessment quality in an undergraduate medical program. The Assessment Implementation Measure (AIM) tool was developed by a mixed method approach. A preliminary questionnaire developed through literature review was submitted to a panel of 10 medical education experts for a three-round 'Modified Delphi technique'. Panel agreement of > 75% was considered the criterion for inclusion of items in the questionnaire. Cognitive pre-testing with five faculty members was conducted. A pilot study was done with 30 randomly selected faculty members. Content validity index (CVI) was calculated for individual items (I-CVI) and composite scale (S-CVI). Cronbach's alpha was calculated to determine the internal consistency reliability of the tool. The final AIM tool had 30 items after the Delphi process. S-CVI was 0.98 with the S-CVI/Avg method and 0.86 by the S-CVI/UA method, suggesting good content validity. An I-CVI below 0.9 was taken as the criterion for item deletion. Cognitive pre-testing revealed good item interpretation. Cronbach's alpha calculated for the AIM was 0.9, whereas Cronbach's alpha for the four domains ranged from 0.67 to 0.80. 'AIM' is a relevant and useful instrument with good content validity and reliability of results, and may be used to evaluate teachers' perceptions about assessment quality.
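The item-level and scale-level content validity indices named above can be sketched as follows. The abstract does not describe the rating scale, so the code assumes the common convention of a 4-point relevance scale on which ratings of 3 or 4 count as "relevant"; the function names and the small rating matrix are hypothetical.

```python
def item_cvi(expert_ratings, relevant_from=3):
    """I-CVI: share of experts rating the item as relevant (assumed: rating >= 3)."""
    if not expert_ratings:
        raise ValueError("at least one expert rating is required")
    return sum(r >= relevant_from for r in expert_ratings) / len(expert_ratings)


def scale_cvi(rating_matrix, relevant_from=3):
    """Return (list of I-CVIs, S-CVI/Avg, S-CVI/UA) for items rated by experts."""
    i_cvis = [item_cvi(ratings, relevant_from) for ratings in rating_matrix]
    # S-CVI/Avg: mean of the item-level indices.
    s_cvi_avg = sum(i_cvis) / len(i_cvis)
    # S-CVI/UA: share of items that every expert rated as relevant.
    s_cvi_ua = sum(v == 1.0 for v in i_cvis) / len(i_cvis)
    return i_cvis, s_cvi_avg, s_cvi_ua


# Hypothetical ratings: three items, each rated by four experts on a 1-4 scale.
example_ratings = [[4, 4, 3, 4], [4, 3, 2, 4], [3, 4, 4, 4]]
i_cvis, s_cvi_avg, s_cvi_ua = scale_cvi(example_ratings)
print(i_cvis, round(s_cvi_avg, 3), round(s_cvi_ua, 3))
```

Because universal agreement is a stricter requirement than average agreement, S-CVI/UA is never larger than S-CVI/Avg, which matches the ordering of the two reported values (0.86 and 0.98).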