reliability external validity: Topics by Science.gov

Sample records for reliability external validity

Translation and validation of the German version of the Bournemouth Questionnaire for Neck Pain.

PubMed

Soklic, Marina; Peterson, Cynthia; Humphreys, B Kim

2012-01-25

Clinical outcome measures are important tools to monitor patient improvement during treatment as well as to document changes for research purposes. The short-form Bournemouth questionnaire for neck pain patients (BQN) was developed from the biopsychosocial model and measures pain, disability, cognitive and affective domains. It has been shown to be a valid and reliable outcome measure in English, French and Dutch and more sensitive to change compared to other questionnaires. The purpose of this study was to translate and validate a German version of the Bournemouth questionnaire for neck pain patients. German translation and back translation into English of the BQN was done independently by four persons and overseen by an expert committee. Face validity of the German BQN was tested on 30 neck pain patients in a single chiropractic practice. Test-retest reliability was evaluated on 31 medical students and chiropractors before and after a lecture. The German BQN was then assessed on 102 first-time neck pain patients at two chiropractic practices for internal consistency, external construct validity, external longitudinal construct validity and sensitivity to change compared to the German versions of the Neck Disability Index (NDI) and the Neck Pain and Disability Scale (NPAD). Face validity testing led to minor changes to the German BQN. The Intraclass Correlation Coefficient for the test-retest reliability was 0.99. The internal consistency was strong for all 7 items of the BQN, with Cronbach's α values of 0.79 and 0.80 for the pre- and post-treatment total scores. External construct validity and external longitudinal construct validity using Pearson's correlation coefficient showed statistically significant correlations for all 7 scales of the BQN with the other questionnaires. The German BQN showed greater responsiveness compared to the other questionnaires for all scales.
The German BQN is a valid and reliable outcome measure that has been successfully translated and culturally adapted. It is shorter, easier to use, and more responsive to change than the NDI and NPAD.
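For readers unfamiliar with the internal-consistency statistic reported above, the following is a minimal sketch of how Cronbach's α is conventionally computed from item-level scores. The function name and the response data are hypothetical and are not taken from the study; the study's own software and data are not described in the abstract.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(scores[0])
    # Variance of each item across respondents
    item_vars = [pvariance([row[i] for row in scores]) for i in range(k)]
    # Variance of the respondents' total scores
    total_var = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical responses: 4 respondents x 7 items scored 0-10 (illustrative only)
responses = [
    [7, 6, 8, 5, 6, 7, 6],
    [3, 2, 4, 3, 2, 3, 4],
    [5, 5, 6, 4, 5, 6, 5],
    [8, 7, 9, 8, 7, 8, 9],
]
print(round(cronbach_alpha(responses), 3))
```

The ratio of summed item variances to total-score variance is the same whether population or sample variances are used, provided the choice is consistent.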
[Reliability and external validity of a questionnaire to assess knowledge about cardiovascular risk and cardiovascular disease in patients attending Spanish community pharmacies].

PubMed

Amariles, Pedro; Pino-Marín, Daniel; Sabater-Hernández, Daniel; García-Jiménez, Emilio; Roig-Sánchez, Inés; Faus, María José

2016-11-01

To determine the test-retest reliability of a questionnaire, with a preliminary validation, to assess knowledge of cardiovascular risk (CVR) and cardiovascular disease in patients attending community pharmacies in Spain, and to complement its external validity by establishing the relationship between an educational activity and the increase in knowledge about CVR and cardiovascular disease. Sub-analysis of a controlled clinical study, EMDADER-CV, in which a questionnaire about knowledge concerning CVR was applied at 4 different times. Spanish community pharmacies. There were 323 patients in the control group, from the 640 who completed the study. Intraclass correlation coefficient to assess the reliability in 3 comparisons (post-educational activity with week 16, post-educational activity with week 32, and week 16 with week 32); and the non-parametric Friedman test to establish the relationship between an oral and written educational activity and increasing knowledge. For the 323 patients in the 3 comparisons, the intraclass correlation coefficient values were 0.624, 0.608 and 0.801, respectively (fair-good to excellent reliability). The Friedman test also showed a statistically significant relationship between educational activity and increased knowledge (p < .0001). According to the intraclass correlation coefficient, the questionnaire aimed at assessing the knowledge on CVR and cardiovascular disease has a reliability between acceptable and excellent, which, added to the previous validation, shows that the instrument meets the criteria of validity and reliability. Furthermore, the questionnaire showed the ability to relate an increase in knowledge to an educational intervention, a feature that complements its external validity. Copyright © 2016 Elsevier España, S.L.U. All rights reserved.
Space Shuttle Propulsion System Reliability

NASA Technical Reports Server (NTRS)

Welzyn, Ken; VanHooser, Katherine; Moore, Dennis; Wood, David

2011-01-01

This session includes the following presentations: (1) External Tank (ET) System Reliability and Lessons, (2) Space Shuttle Main Engine (SSME), Reliability Validated by a Million Seconds of Testing, (3) Reusable Solid Rocket Motor (RSRM) Reliability via Process Control, and (4) Solid Rocket Booster (SRB) Reliability via Acceptance and Testing.
The Modified Cognitive Constructions Coding System: Reliability and Validity Assessments

ERIC Educational Resources Information Center

Moran, Galia S.; Diamond, Gary M.

2006-01-01

The cognitive constructions coding system (CCCS) was designed for coding clients' expressed problem constructions on four dimensions: intrapersonal-interpersonal, internal-external, responsible-not responsible, and linear-circular. This study introduces, and examines the reliability and validity of, a modified version of the CCCS--a version that…
Reliability and validity of the closed kinetic chain upper extremity stability test.

PubMed

Lee, Dong-Rour; Kim, Laurentius Jongsoon

2015-04-01

[Purpose] The purpose of this study was to examine the reliability and validity of the Closed Kinetic Chain Upper Extremity Stability (CKCUES) test. [Subjects and Methods] A sample of 40 subjects (20 males, 20 females) with and without pain in the upper limbs was recruited. The subjects were tested twice, three days apart, to assess the reliability of the CKCUES test. The CKCUES test was performed four times, and the average was calculated using the data of the last 3 tests. In order to test the validity of the CKCUES test, peak torque of internal/external shoulder rotation was measured using an isokinetic dynamometer, and maximum grip strength was measured using a hand dynamometer, and their Pearson correlation coefficients with the average values of the CKCUES test were calculated. [Results] The reliability of the CKCUES test was very high (ICC=0.97). The correlations between the CKCUES test and maximum grip strength (r=0.78-0.79), and the peak torque of internal/external shoulder rotation (r=0.87-0.94) were high, indicating its validity. [Conclusion] The reliability and validity of the CKCUES test were high. The CKCUES test is expected to be used for clinical tests on upper limb stability at low cost.
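The validity evidence in the record above rests on Pearson correlation coefficients between the test averages and dynamometer measurements. As a minimal sketch of that statistic, the function below computes Pearson's r from two paired lists. The function name and the sample values are hypothetical and do not come from the study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sum of cross-products and sums of squares about the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical paired measurements (illustrative only):
# mean touch counts on a stability test vs. grip strength in kg
touches = [18.0, 21.3, 24.7, 26.0, 29.3]
grip_kg = [28.0, 31.5, 33.0, 38.5, 41.0]
print(round(pearson_r(touches, grip_kg), 3))
```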
Reliability and Validity of the Yale Global Tic Severity Scale

ERIC Educational Resources Information Center

Storch, Eric A.; Murphy, Tanya K.; Geffken, Gary R.; Sajid, Muhammad; Allen, Pam; Roberti, Jonathan W.; Goodman, Wayne K.

2005-01-01

To investigate the reliability and validity of the Yale Global Tic Severity Scale (YGTSS), 28 youth aged 6 to 17 years with Tourette's syndrome (TS) participated in the study. Data included clinician reports of tics and obsessive-compulsive disorder (OCD) severity, parent reports of tics, internalizing and externalizing problems, and child reports…
Issues in cross-cultural validity: example from the adaptation, reliability, and validity testing of a Turkish version of the Stanford Health Assessment Questionnaire.

PubMed

Küçükdeveci, Ayse A; Sahin, Hülya; Ataman, Sebnem; Griffiths, Bridget; Tennant, Alan

2004-02-15

Guidelines have been established for cross-cultural adaptation of outcome measures. However, invariance across cultures must also be demonstrated through analysis of Differential Item Functioning (DIF). This is tested in the context of a Turkish adaptation of the Health Assessment Questionnaire (HAQ). Internal construct validity of the adapted HAQ is assessed by Rasch analysis; reliability, by internal consistency and the intraclass correlation coefficient; external construct validity, by association with impairments and American College of Rheumatology functional stages. Cross-cultural validity is tested through DIF by comparison with data from the UK version of the HAQ. The adapted version of the HAQ demonstrated good internal construct validity through fit of the data to the Rasch model (mean item fit 0.205; SD 0.998). Reliability was excellent (alpha = 0.97) and external construct validity was confirmed by expected associations. DIF for culture was found in only 1 item. Cross-cultural validity was found to be sufficient for use in international studies between the UK and Turkey. Future adaptation of instruments should include analysis of DIF at the field testing stage in the adaptation process.
Validation of the Middlesex Elderly Assessment of Mental State (MEAMS) as a cognitive screening test in patients with acquired brain injury in Turkey.

PubMed

Kutlay, Sehim; Küçükdeveci, Ayse A; Elhan, Atilla H; Yavuzer, Gunes; Tennant, Alan

2007-02-28

Assessment of cognitive impairment with a valid cognitive screening tool is essential in neurorehabilitation. The aim of this study was to test the reliability and validity of the Turkish-adapted version of the Middlesex Elderly Assessment of Mental State (MEAMS) among acquired brain injury patients in Turkey. Some 155 patients with acquired brain injury admitted for rehabilitation were assessed by the adapted version of MEAMS at admission and discharge. Reliability was tested by internal consistency, intra-class correlation coefficient (ICC) and person separation index; internal construct validity by Rasch analysis; external construct validity by associations with physical and cognitive disability (FIM); and responsiveness by Effect Size. Reliability was found to be good with Cronbach's alpha of 0.82 at both admission and discharge; and likewise an ICC of 0.80. Person separation index was 0.813. Internal construct validity was good by fit of the data to the Rasch model (mean item fit -0.178; SD 1.019). Items were substantially free of differential item functioning. External construct validity was confirmed by expected associations with physical and cognitive disability. Effect size was 0.42 compared with 0.22 for cognitive FIM. The reliability and validity of the Turkish version of MEAMS as a cognitive impairment screening tool in acquired brain injury has been demonstrated.
The Development and Piloting of Parallel Scales Measuring External and Internal HIV and Tuberculosis Stigma Among Healthcare Workers in the Free State Province, South Africa.

PubMed

Wouters, Edwin; Rau, Asta; Engelbrecht, Michelle; Uebel, Kerry; Siegel, Jacob; Masquillier, Caroline; Kigozi, Gladys; Sommerland, Nina; Yassi, Annalee

2016-05-15

The dual burden of tuberculosis and human immunodeficiency virus (HIV) is severely impacting the South African healthcare workforce. However, the use of on-site occupational health services is hampered by stigma among the healthcare workforce. The success of stigma-reduction interventions is difficult to evaluate because of a dearth of appropriate scientific tools to measure stigma in this specific professional setting. The current pilot study aimed to develop and test a range of scales measuring different aspects of stigma (internal and external stigma toward tuberculosis as well as HIV) in a South African healthcare setting. The study employed data of a sample of 200 staff members of a large hospital in Bloemfontein, South Africa. Confirmatory factor analysis produced 7 scales, displaying internal construct validity: (1) colleagues' external HIV stigma, (2) colleagues' actions against external HIV stigma, (3) respondent's external HIV stigma, (4) respondent's internal HIV stigma, (5) colleagues' external tuberculosis stigma, (6) respondent's external tuberculosis stigma, and (7) respondent's internal tuberculosis stigma. Subsequent analyses (reliability analysis, structural equation modeling) demonstrated that the scales displayed good psychometric properties in terms of reliability and external construct validity. The study outcomes support the use of the developed scales as a valid and reliable means to measure levels of tuberculosis- and HIV-related stigma among the healthcare workforce in a resource-limited context. Future studies should build on these findings to fine-tune the instruments and apply them to larger study populations across a range of different resource-limited healthcare settings with high HIV and tuberculosis prevalence. © The Author 2016. Published by Oxford University Press for the Infectious Diseases Society of America. All rights reserved.
The Development and Piloting of Parallel Scales Measuring External and Internal HIV and Tuberculosis Stigma Among Healthcare Workers in the Free State Province, South Africa

PubMed Central

Wouters, Edwin; Rau, Asta; Engelbrecht, Michelle; Uebel, Kerry; Siegel, Jacob; Masquillier, Caroline; Kigozi, Gladys; Sommerland, Nina; Yassi, Annalee

2016-01-01

Background: The dual burden of tuberculosis and human immunodeficiency virus (HIV) is severely impacting the South African healthcare workforce. However, the use of on-site occupational health services is hampered by stigma among the healthcare workforce. The success of stigma-reduction interventions is difficult to evaluate because of a dearth of appropriate scientific tools to measure stigma in this specific professional setting. Methods: The current pilot study aimed to develop and test a range of scales measuring different aspects of stigma (internal and external stigma toward tuberculosis as well as HIV) in a South African healthcare setting. The study employed data of a sample of 200 staff members of a large hospital in Bloemfontein, South Africa. Results: Confirmatory factor analysis produced 7 scales, displaying internal construct validity: (1) colleagues’ external HIV stigma, (2) colleagues’ actions against external HIV stigma, (3) respondent’s external HIV stigma, (4) respondent’s internal HIV stigma, (5) colleagues’ external tuberculosis stigma, (6) respondent’s external tuberculosis stigma, and (7) respondent’s internal tuberculosis stigma. Subsequent analyses (reliability analysis, structural equation modeling) demonstrated that the scales displayed good psychometric properties in terms of reliability and external construct validity. Conclusions: The study outcomes support the use of the developed scales as a valid and reliable means to measure levels of tuberculosis- and HIV-related stigma among the healthcare workforce in a resource-limited context. Future studies should build on these findings to fine-tune the instruments and apply them to larger study populations across a range of different resource-limited healthcare settings with high HIV and tuberculosis prevalence. PMID: 27118854
Initial Evidence for the Reliability and Validity of the Student Risk Screening Scale for Internalizing and Externalizing Behaviors at the Elementary Level

ERIC Educational Resources Information Center

Lane, Kathleen Lynne; Oakes, Wendy P.; Harris, Pamela J.; Menzies, Holly Mariah; Cox, Meredith; Lambert, Warren

2012-01-01

We report findings of an exploratory validation study of a revised instrument: the Student Risk Screening Scale-Internalizing and Externalizing (SRSS-IE). The SRSS-IE was modified to include seven additional items reflecting characteristics of internalizing behaviors, with proposed items generated from the current literature base, review of…
Development and validation of the Stirling Eating Disorder Scales.

PubMed

Williams, G J; Power, K G; Miller, H R; Freeman, C P; Yellowlees, A; Dowds, T; Walker, M; Parry-Jones, W L

1994-07-01

The development and reliability/validity check of an 80-item, 8-scale measure for use with eating disorder patients is presented. The Stirling Eating Disorder Scales (SEDS) assess anorexic dietary behavior, anorexic dietary cognitions, bulimic dietary behavior, bulimic dietary cognitions, high perceived external control, low assertiveness, low self-esteem, and self-directed hostility. The SEDS were administered to 82 eating disorder patients and 85 controls. Results indicate that the SEDS are acceptable in terms of internal consistency, reliability, group validity, and concurrent validity.
German Translation and Validation of the Cognitive Style Questionnaire Short Form (CSQ-SF-D)

PubMed Central

Huys, Quentin J. M.; Renz, Daniel; Petzschner, Frederike; Berwian, Isabel; Stoppel, Christian; Haker, Helene

2016-01-01

Background: The Cognitive Style Questionnaire is a valuable tool for the assessment of hopeless cognitive styles in depression research, with predictive power in longitudinal studies. However, it is very burdensome to administer. Even the short form is still long, and neither this nor the original version exist in validated German translations. Methods: The questionnaire was translated from English to German, back-translated and commented on by clinicians. The reliability, factor structure and external validity of an online form of the questionnaire were examined on 214 participants. External validity was measured on a subset of 90 subjects. Results: The resulting CSQ-SF-D had good to excellent reliability, both across items and subscales, and similar external validity to the original English version. The internality subscale appeared less robust than other subscales. A detailed analysis of individual item performance suggests that stable results could be achieved with a very short form (CSQ-VSF-D) including only 27 of the 72 items. Conclusions: The CSQ-SF-D is a validated and freely distributed translation of the CSQ-SF into German. This should make efficient assessment of cognitive style in German samples more accessible to researchers. PMID: 26934499
German Translation and Validation of the Cognitive Style Questionnaire Short Form (CSQ-SF-D).

PubMed

Huys, Quentin J M; Renz, Daniel; Petzschner, Frederike; Berwian, Isabel; Stoppel, Christian; Haker, Helene

2016-01-01

The Cognitive Style Questionnaire is a valuable tool for the assessment of hopeless cognitive styles in depression research, with predictive power in longitudinal studies. However, it is very burdensome to administer. Even the short form is still long, and neither this nor the original version exist in validated German translations. The questionnaire was translated from English to German, back-translated and commented on by clinicians. The reliability, factor structure and external validity of an online form of the questionnaire were examined on 214 participants. External validity was measured on a subset of 90 subjects. The resulting CSQ-SF-D had good to excellent reliability, both across items and subscales, and similar external validity to the original English version. The internality subscale appeared less robust than other subscales. A detailed analysis of individual item performance suggests that stable results could be achieved with a very short form (CSQ-VSF-D) including only 27 of the 72 items. The CSQ-SF-D is a validated and freely distributed translation of the CSQ-SF into German. This should make efficient assessment of cognitive style in German samples more accessible to researchers.
How Sharp is a Unicorn's Horn?

ERIC Educational Resources Information Center

Johnston, Peter H.; Allington, Richard L.

1983-01-01

Criticizes a study of the reliability and validity of curriculum-based reading inventories by L. S. Fuchs, D. Fuchs, and S. L. Deno and raises questions regarding the study's internal and external validity. (AEA)
The Screening Test for Emotional Problems--Teacher-Report Version (Step-T): Studies of Reliability and Validity

ERIC Educational Resources Information Center

Erford, Bradley T.; Butler, Caitlin; Peacock, Elizabeth

2015-01-01

The Screening Test for Emotional Problems-Teacher Version (STEP-T) was designed to identify students aged 7-17 years with wide-ranging emotional disturbances. Coefficients alpha and test-retest reliability were adequate for all subscales except Anxiety. The hypothesized five-factor model fit the data very well and external aspects of validity were…
Validity and reliability of isometric muscle strength measurements of hip abduction and abduction with external hip rotation in a bent-hip position using a handheld dynamometer with a belt.

PubMed

Aramaki, Hidefumi; Katoh, Munenori; Hiiragi, Yukinobu; Kawasaki, Tsubasa; Kurihara, Tomohisa; Ohmi, Yorikatsu

2016-07-01

[Purpose] This study aimed to investigate the relatedness, reliability, and validity of isometric muscle strength measurements of hip abduction and abduction with an external hip rotation in a bent-hip position using a handheld dynamometer with a belt. [Subjects and Methods] Twenty healthy young adults, with a mean age of 21.5 ± 0.6 years were included. Isometric hip muscle strength in the subjects' right legs was measured under two posture positions using two devices: a handheld dynamometer with a belt and an isokinetic dynamometer. Reliability was evaluated using an intra-class correlation coefficient (ICC); relatedness and validity were evaluated using Pearson's product moment correlation coefficient. Differences in measurements of devices were assessed by two-way ANOVA. [Results] ICC (1, 1) was ≥0.9; significant positive correlations in measurements were found between the two devices under both conditions. No main effect was found between the measurement values. [Conclusion] Our findings revealed that there was relatedness, reliability, and validity of this method for isometric muscle strength measurements using a handheld dynamometer with a belt.
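The record above reports reliability as ICC (1, 1), the one-way random-effects, single-measurement form of the intraclass correlation coefficient. The sketch below shows the textbook computation from between-subject and within-subject mean squares. The function name and the repeated measurements are hypothetical; the abstract does not give the raw data or the software used.

```python
def icc_1_1(data):
    """ICC(1,1): one-way random effects, single measurement.

    data: list of subjects, each a list of k repeated measurements.
    """
    n = len(data)
    k = len(data[0])
    grand_mean = sum(sum(row) for row in data) / (n * k)
    subject_means = [sum(row) / k for row in data]
    # Between-subjects mean square
    msb = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    # Within-subjects mean square
    msw = sum(
        (x - m) ** 2 for row, m in zip(data, subject_means) for x in row
    ) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)


# Hypothetical strength readings, two trials per subject (illustrative only)
trials = [
    [201.0, 198.5],
    [240.2, 244.0],
    [180.4, 183.1],
    [222.7, 219.9],
    [260.3, 257.8],
]
print(round(icc_1_1(trials), 3))
```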
The Utrecht questionnaire (U-CEP) measuring knowledge on clinical epidemiology proved to be valid.

PubMed

Kortekaas, Marlous F; Bartelink, Marie-Louise E L; de Groot, Esther; Korving, Helen; de Wit, Niek J; Grobbee, Diederick E; Hoes, Arno W

2017-02-01

Knowledge on clinical epidemiology is crucial to practice evidence-based medicine. We describe the development and validation of the Utrecht questionnaire on knowledge on Clinical epidemiology for Evidence-based Practice (U-CEP); an assessment tool to be used in the training of clinicians. The U-CEP was developed in two formats: two sets of 25 questions and a combined set of 50. The validation was performed among postgraduate general practice (GP) trainees, hospital trainees, GP supervisors, and experts. Internal consistency, internal reliability (item-total correlation), item discrimination index, item difficulty, content validity, construct validity, responsiveness, test-retest reliability, and feasibility were assessed. The questionnaire was externally validated. Internal consistency was good with a Cronbach alpha of 0.8. The median item-total correlation and mean item discrimination index were satisfactory. Both sets were perceived as relevant to clinical practice. Construct validity was good. Both sets were responsive but failed on test-retest reliability. One set took 24 minutes and the other 33 minutes to complete, on average. External GP trainees had comparable results. The U-CEP is a valid questionnaire to assess knowledge on clinical epidemiology, which is a prerequisite for practicing evidence-based medicine in daily clinical practice. Copyright © 2016 Elsevier Inc. All rights reserved.
Translation, Adaptation, and Preliminary Validation of the Female Sexual Function Index into Spanish (Colombia).

PubMed

Vallejo-Medina, Pablo; Pérez-Durán, Claudia; Saavedra-Roa, Alejandro

2018-04-01

The Female Sexual Function Index (FSFI) subjectively explores the dimensions of female sexual functioning. This research undertook to adapt and validate the FSFI to Spanish language in a Colombian sample. To this effect, this study was conducted in two steps, namely: (1) cultural adaptation of the scale with the collaboration of seven experts; and (2) preliminary validation of the scale in a sample of 925 participants. Reliability indices were appropriate in this sample, and external validity in relation to other measures showed significant relationships. Findings suggest that the FSFI is reliable and valid in Spanish for a Colombian population. Further research is needed to establish the test-retest reliability and discriminant validity of this Spanish version.
Initial Evidence for the Reliability and Validity of the Student Risk Screening Scale for Internalizing and Externalizing Behaviors at the Middle School Level

ERIC Educational Resources Information Center

Lane, Kathleen Lynne; Oakes, Wendy Peia; Carter, Erik W.; Lambert, Warren E.; Jenkins, Abbie B.

2013-01-01

We reported findings of an exploratory validation study of a revised universal screening instrument: the Student Risk Screening Scale--Internalizing and Externalizing (SRSS-IE) for use with middle school students. Tested initially for use with elementary-age students, the SRSS-IE was adapted to include seven additional items reflecting…

A Validation of the Student Risk Screening Scale for Internalizing and Externalizing Behaviors: Patterns in Rural and Urban Elementary Schools

ERIC Educational Resources Information Center

Lane, Kathleen Lynne; Menzies, Holly M.; Oakes, Wendy P.; Lambert, Warren; Cox, Meredith; Hankins, Katy

2012-01-01

We report findings of two studies, one conducted in a rural school district (N = 982) and a second conducted in an urban district (N = 1,079), offering additional evidence of the reliability and validity of a revised instrument, the Student Risk Screening Scale-Internalizing and Externalizing (SRSS-IE), to accurately detect internalizing and…
Validation of the Dutch Eating Behaviour Questionnaire (DEBQ) among Maltese women.

PubMed

Dutton, Elaine; Dovey, Terence M

2016-12-01

The main aim of this study was to assess the dimensional structure of the Maltese version of the Dutch Eating Behaviour Questionnaire (DEBQ) and evaluate the instrument's validity and reliability among Maltese women (N = 586). Exploratory factor analysis reflected the theoretical structure of three factors (emotional, restrained and external eating), which was supported by a confirmatory factor analysis. Minor issues with specific items in the Emotional and External eating scales were identified and discussed. Criterion-related validity was ascertained through correlations with the EAT-26. The study also assessed the DEBQ's predictive value in differentiating between BMI groups and between dieters and weight maintainers. The results suggest that the Maltese DEBQ is a psychometrically valid and reliable instrument for assessing eating behaviours with women in the Maltese community. The study also highlights the critical role of Emotional and Restrained eating in dieting and overweight Maltese women. Copyright © 2016 Elsevier Ltd. All rights reserved.
Valid and Reliable Science Content Assessments for Science Teachers

NASA Astrophysics Data System (ADS)

Tretter, Thomas R.; Brown, Sherri L.; Bush, William S.; Saderholm, Jon C.; Holmes, Vicki-Lynn

2013-03-01

Science teachers' content knowledge is an important influence on student learning, highlighting an ongoing need for programs, and assessments of those programs, designed to support teacher learning of science. Valid and reliable assessments of teacher science knowledge are needed for direct measurement of this crucial variable. This paper describes multiple sources of validity and reliability (Cronbach's alpha greater than 0.8) evidence for physical, life, and earth/space science assessments, part of the Diagnostic Teacher Assessments of Mathematics and Science (DTAMS) project. Validity was strengthened by systematic synthesis of relevant documents, extensive use of external reviewers, and field tests with 900 teachers during the assessment development process. Subsequent results from 4,400 teachers, analyzed with Rasch IRT modeling techniques, offer construct and concurrent validity evidence.
Measuring Long-Distance Romantic Relationships: A Validity Study

ERIC Educational Resources Information Center

Pistole, M. Carole; Roberts, Amber

2011-01-01

This study investigated aspects of construct validity for the scores of a new long-distance romantic relationship measure. A single-factor structure of the long-distance romantic relationship index emerged, with convergent and discriminant evidence of external validity, high internal consistency reliability, and applied utility of the scores.…
External Validation and Evaluation of Reliability and Validity of the Modified Seoul National University Renal Stone Complexity Scoring System to Predict Stone-Free Status After Retrograde Intrarenal Surgery.

PubMed

Park, Juhyun; Kang, Minyong; Jeong, Chang Wook; Oh, Sohee; Lee, Jeong Woo; Lee, Seung Bae; Son, Hwancheol; Jeong, Hyeon; Cho, Sung Yong

2015-08-01

The modified Seoul National University Renal Stone Complexity scoring system (S-ReSC-R) for retrograde intrarenal surgery (RIRS) was developed as a tool to predict stone-free rate (SFR) after RIRS. We externally validated the S-ReSC-R. We retrospectively reviewed 159 patients who underwent RIRS. The S-ReSC-R was assigned from 1 to 12 according to the location and number of sites involved. The stone-free status was defined as no evidence of a stone or with clinically insignificant residual fragment stones less than 2 mm. Interobserver and test-retest reliabilities were evaluated. Statistical performance of the prediction model was assessed by its predictive accuracy, predictive probability, and clinical usefulness. Overall SFR was 73.0%. The SFRs were 86.7%, 70.2%, and 48.6% in low-score (1-2), intermediate-score (3-4), and high-score (5-12) groups, respectively (p<0.001). External validation of S-ReSC-R revealed an area under the curve (AUC) of 0.731 (95% CI 0.650-0.813). The AUC of the three-tiered S-ReSC-R was 0.701 (95% CI 0.609-0.794). The calibration plot showed that the predicted probability of SFR had a concordance comparable to that of observed frequency. The Hosmer-Lemeshow goodness of fit test revealed a p-value of 0.01 for the S-ReSC-R and 0.90 for the three-tiered S-ReSC-R. Interobserver and test-retest reliabilities revealed an almost perfect level of agreement. The present study proved the predictive value of S-ReSC-R to predict SFR following RIRS in an independent cohort. Interobserver and test-retest reliabilities confirmed that S-ReSC-R was reliable and valid.
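The AUC reported above can be read as the probability that a randomly chosen stone-free patient receives a more favourable score than a randomly chosen patient with residual stones. The sketch below computes an AUC in that pairwise form, counting ties as one half. The function name and the scores are hypothetical, and negating the score (so that a lower complexity score counts as a stronger prediction of stone-free status) is an assumption made for illustration, not a description of the authors' analysis.

```python
def auc(positive_scores, negative_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    A pair counts 1 if the positive case scores higher, 0.5 on a tie.
    """
    wins = 0.0
    for p in positive_scores:
        for q in negative_scores:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical complexity scores on a 1-12 scale (illustrative only)
stone_free = [1, 2, 2, 3, 4]
residual = [3, 5, 6, 8]

# Assumption: lower complexity predicts stone-free status, so negate the scores
print(auc([-s for s in stone_free], [-s for s in residual]))
```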
Publishing nutrition research: validity, reliability, and diagnostic test assessment in nutrition-related research.

PubMed

Gleason, Philip M; Harris, Jeffrey; Sheean, Patricia M; Boushey, Carol J; Bruemmer, Barbara

2010-03-01

This is the sixth in a series of monographs on research design and analysis. The purpose of this article is to describe and discuss several concepts related to the measurement of nutrition-related characteristics and outcomes, including validity, reliability, and diagnostic tests. The article reviews the methodologic issues related to capturing the various aspects of a given nutrition measure's reliability, including test-retest, inter-item, and interobserver or inter-rater reliability. Similarly, it covers content validity, indicators of absolute vs relative validity, and internal vs external validity. With respect to diagnostic assessment, the article summarizes the concepts of sensitivity and specificity. The hope is that dietetics practitioners will be able to both use high-quality measures of nutrition concepts in their research and recognize these measures in research completed by others. Copyright 2010 American Dietetic Association. Published by Elsevier Inc. All rights reserved.
Reliability, validity, sensitivity and specificity of Gujarati version of the Roland-Morris Disability Questionnaire.

PubMed

Nambi, S Gopal

2013-01-01

The most common instrument developed to assess the functional status of patients with nonspecific low back pain is the Roland-Morris Disability Questionnaire (RMDQ). Clinical and epidemiological research related to low back pain in the Gujarati population would be facilitated by the availability of well-established outcome measures. The aim was to determine the reliability, validity, sensitivity and specificity of the Gujarati version of the RMDQ for use in nonspecific chronic low back pain. This was a reliability, validity, sensitivity and specificity study of the Gujarati version of the RMDQ. Thirty outpatients with nonspecific chronic low back pain were assessed with the RMDQ. Reliability was assessed using internal consistency and the intraclass correlation coefficient (ICC). Internal construct validity was assessed by Rasch analysis, and external construct validity by association with pain and spinal movement. A clinical calculator was used to determine the sensitivity and specificity. Internal consistency of the RMDQ was found to be adequate (> 0.65) at both time points, with high ICCs at both time points as well. Internal construct validity of the scale was good, indicating a single underlying construct. Expected associations with pain and spinal movement confirmed external construct validity. At a cutoff point of 0.5, sensitivity and specificity were 80% and 84%, respectively, with a positive predictive value (PPV) of 83.33% and a negative predictive value (NPV) of 80.76%. The questionnaire is at the ordinal level. The RMDQ is a one-dimensional, ordinal measure that works well in the Gujarati population.
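To make the diagnostic measures in the abstract above concrete, the following sketch derives sensitivity, specificity, PPV and NPV from a 2x2 table. The counts are hypothetical: they were chosen only because they reproduce percentages of the same size as those reported, and they are not the study's data (the study assessed 30 patients). The function name `diagnostic_metrics` is likewise an illustrative choice.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard 2x2 diagnostic accuracy measures, returned as proportions."""
    return {
        "sensitivity": tp / (tp + fn),  # true positives among all with the condition
        "specificity": tn / (tn + fp),  # true negatives among all without it
        "ppv": tp / (tp + fp),          # positive results that are correct
        "npv": tn / (tn + fn),          # negative results that are correct
    }


# Hypothetical counts (NOT study data).
metrics = diagnostic_metrics(tp=40, fp=8, fn=10, tn=42)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

Note that PPV and NPV, unlike sensitivity and specificity, depend on how common the condition is in the sample.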
Validation of the breast evaluation questionnaire for breast hypertrophy and breast reduction.

PubMed

Lewin, Richard; Elander, Anna; Lundberg, Jonas; Hansson, Emma; Thorarinsson, Andri; Claudelin, Malin; Bladh, Helena; Lidén, Mattias

2018-06-13

There is a lack of published, validated questionnaires for evaluating psychosocial morbidity in patients with breast hypertrophy undergoing breast reduction surgery. The objective was to validate the breast evaluation questionnaire (BEQ), originally developed for the assessment of breast augmentation patients, for the assessment of psychosocial morbidity in patients with breast hypertrophy undergoing breast reduction surgery. Design: Validation study. Subjects: Women with macromastia. Methods: The validation of the BEQ, adapted to breast reduction, was performed in several steps. Content validity, reliability, construct validity and responsiveness were assessed. The original version was adjusted according to the results for content validity, which led to item reduction and a modified BEQ (mBEQ) that was then assessed for reliability, construct validity and responsiveness. Internal and external validation was performed for the modified BEQ. Convergent validity was tested against Breast-Q (reduction) and discriminant validity was tested against the SF-36. Known-groups validation revealed significant differences between the normal population and patients undergoing breast reduction surgery. The BEQ showed good reliability by test-retest analysis and high responsiveness. The modified BEQ may be a reliable, valid and responsive instrument for assessing women who undergo breast reduction.
Authenticity, Validity and Reliability in A-Level English Literature

ERIC Educational Resources Information Center

Hodgson, John

2017-01-01

This article discusses the use of assessment by teachers to replace external marking. It shows how professional participation and moderation can provide reliability in summative assessment, even in public examinations for older students. It draws on historical experiences of assessment for A-level English literature.
Validity and reliability of a low-cost digital dynamometer for measuring isometric strength of lower limb.

PubMed

Romero-Franco, Natalia; Jiménez-Reyes, Pedro; Montaño-Munuera, Juan A

2017-11-01

Lower limb isometric strength is a key parameter to monitor the training process or recognise muscle weakness and injury risk. However, valid and reliable methods to evaluate it often require high-cost tools. The aim of this study was to analyse the concurrent validity and reliability of a low-cost digital dynamometer for measuring isometric strength in lower limb. Eleven physically active and healthy participants performed maximal isometric strength for: flexion and extension of ankle, flexion and extension of knee, flexion, extension, adduction, abduction, internal and external rotation of hip. Data obtained by the digital dynamometer were compared with the isokinetic dynamometer to examine its concurrent validity. Data obtained by the digital dynamometer from 2 different evaluators and 2 different sessions were compared to examine its inter-rater and intra-rater reliability. Intra-class correlation (ICC) for validity was excellent in every movement (ICC > 0.9). Intra and inter-tester reliability was excellent for all the movements assessed (ICC > 0.75). The low-cost digital dynamometer demonstrated strong concurrent validity and excellent intra and inter-tester reliability for assessing isometric strength in the main lower limb movements.
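The abstract above reports intraclass correlation coefficients but does not say which ICC form was used. As a sketch under that caveat, the code below computes ICC(2,1) (two-way random effects, absolute agreement, single measurement) from a table with one row per participant and one column per rater or device. The choice of ICC(2,1), the function name `icc_2_1`, and the data are assumptions made for illustration.

```python
def icc_2_1(data):
    """ICC(2,1): two-way random effects, absolute agreement, single measure.
    `data` is a list of rows (subjects), each a list of k ratings."""
    n, k = len(data), len(data[0])
    grand = sum(sum(row) for row in data) / (n * k)
    row_means = [sum(row) / k for row in data]
    col_means = [sum(row[j] for row in data) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between raters
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_err = ss_total - ss_rows - ss_cols                    # residual

    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical force readings (NOT study data): low-cost device vs. reference device.
readings = [[30.1, 31.0], [42.5, 41.8], [25.3, 26.1], [38.0, 38.4], [33.7, 32.9]]
print(round(icc_2_1(readings), 3))
```

Because this form measures absolute agreement, a constant offset between the two devices lowers the coefficient even when the readings are perfectly correlated.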
The Italian version of the Mouth Handicap in Systemic Sclerosis scale (MHISS) is valid, reliable and useful in assessing oral health-related quality of life (OHRQoL) in systemic sclerosis (SSc) patients.

PubMed

Maddali Bongi, S; Del Rosso, A; Miniati, I; Galluccio, F; Landi, G; Tai, G; Matucci-Cerinic, M

2012-09-01

In systemic sclerosis (SSc), mouth and face involvement leads to problems in oral health-related quality of life (OHRQoL). The Mouth Handicap in Systemic Sclerosis scale (MHISS) is a 12-item questionnaire specifically quantifying mouth disability in SSc, organized in 3 subscales. Our aim was to validate the Italian version of the MHISS by assessing its test-retest reliability and internal and external consistency in Italian SSc patients. Forty SSc patients (7 dSSc, 33 lSSc; age and disease duration: 57.27 ± 11.41, 9.4 ± 4.4 years; 22 with sicca syndrome) were evaluated with the MHISS. The MHISS was translated following a forward-backward translation procedure, with independent translations and counter-translation. Test-retest reliability was evaluated by comparing the results of two administrations with the intraclass correlation coefficient (ICC). Internal consistency was assessed by Cronbach's α and external consistency by comparison with mouth opening. The MHISS has good test-retest reliability (ICC: 0.93) and internal consistency (Cronbach's α: 0.99). Good external consistency was confirmed by correlation with mouth opening (rho: -0.3869, p: 0.0137). Total MHISS score was 17.65 ± 5.20, with scores of subscale 1 (reduced mouth opening) of 6.60 ± 2.85 and scores of subscales 2 (sicca syndrome) and 3 (aesthetic concerns) of 7.82 ± 2.59 and 3.22 ± 1.14. Total and subscale 2 scores are higher in dSSc than in lSSc. This result may be due to the higher presence of sicca syndrome in dSSc than in lSSc (p = 0.0109). Our results support the validity and reliability of the MHISS in Italian SSc patients as a specific measure of OHRQoL in SSc.
Development and validity of a scale to measure workplace culture of health.

PubMed

Kwon, Youngbum; Marzec, Mary L; Edington, Dee W

2015-05-01

To describe the development of and test the validity and reliability of the Workplace Culture of Health (COH) scale. Exploratory factor analysis and confirmatory factor analysis were performed on data from a health care organization (N = 627). To verify the factor structure, confirmatory factor analysis was performed on a second data set from a medical equipment manufacturer (N = 226). The COH scale included a structure of five orthogonal factors: senior leadership and policies, programs and rewards, quality assurance, supervisor support, and coworker support. With regard to construct validity (convergent and discriminant) and reliability, two different US companies showed the same factorial structure, satisfactory fit statistics, and suitable internal and external consistency. The COH scale represents a reliable and valid scale to assess the workplace environment and culture for supporting health.
Transcultural adaptation to Spanish of the instrument "Effectiveness of Auditory Rehabilitation" for the assessment of quality of life in patients using hearing aids.

PubMed

Cardemil, Felipe; Esquivel, Patricia; Aguayo, Lorena; Barría, Tamara; Fuente, Adrian; Carvajal, Rocío; Fromín, Rose; Villalobos, Iván; Yueh, Bevan

2013-01-01

It is becoming increasingly important to have reliable and valid questionnaires. This becomes especially important when evaluating hearing loss. The objective was to culturally adapt the "Effectiveness of Auditory Rehabilitation" (EAR) questionnaire for the Spanish-speaking population. This instrument assesses quality of life and hearing aspects in patients using hearing aids. Cross-sectional validation study. A cultural adaptation through the use of English to Spanish translations and re-translations was carried out. The validity and reliability of the newly adapted instrument were evaluated. A total of 69 individuals (44 older adults and 25 younger adults) were examined. The pure-tone averages (PTA, 500, 1,000 and 2,000 Hz) were 47.3 dB HL and 47.1 dB HL for the left and right ears, respectively. The mean maximum speech discrimination scores in silence for monosyllables were 83.3% and 82.9% for the left and right ears, respectively. Internal consistency presented Cronbach alpha values of 0.85 and 0.77 for the internal and external dimensions, respectively. The intraclass correlation coefficients were 0.80 for the internal module and 0.85 for the external module. Construct validity reported a correlation coefficient of 0.71 at baseline and 0.76 at 3 months after the initial assessment for the internal module, and 0.62 at baseline and 0.74 at 3 months after the initial assessment for the external module. The effect sizes were 1.3 and 1.1 for the internal and external modules, respectively. The Spanish version of the EAR questionnaire seems to be a reliable and valid instrument. The evaluation of audiological aspects, as well as aspects relating to aesthetics and comfort, is the main strength of this instrument. Finally, the EAR scale is more sensitive to change than other scales. Copyright © 2013 Elsevier España, S.L. All rights reserved.
Development of a job stressor scale for nurses caring for patients with intractable neurological diseases.

PubMed

Ando, Yukako; Kataoka, Tsuyoshi; Okamura, Hitoshi; Tanaka, Katsutoshi; Kobayashi, Toshio

2013-12-01

The purpose of this research is to verify the reliability and validity of a job stressor scale for nurses caring for patients with intractable neurological diseases. A mail survey was conducted using a self-report questionnaire. The subjects were 263 nurses and assistant nurses working in wards specializing in intractable neurological diseases. The response rate was 71.9% (valid response rate, 66.2%). With regard to reliability, internal consistency and stability were assessed. Internal consistency was examined via Cronbach's alpha. For stability, the test-retest method was performed and stability was examined via intraclass correlation coefficients. With regard to validity, factor validity, criterion-related validity, and content validity were assessed. Exploratory factor analysis was used for factor validity. For criterion-related validity, an existing scale was used as an external criterion; concurrent validity was examined via Spearman's rank correlation coefficients. The analysis yielded a scale of 26 items with an eight-factor structure. Cronbach's alpha for the 26 items was 0.90; with the exception of two factors, alpha for all of the individual sub-factors was high at 0.7 or higher. The intraclass correlation coefficient for the 26 items was 0.89 (p < 0.001). With regard to criterion-related validity, concurrent validity was confirmed and the correlation coefficient with an external criterion was 0.73 (p < 0.001). For content validity, subjects who responded that "The questionnaire represents a stressor well or to a degree" accounted for 81% of the total responses. Reliability and validity were confirmed, so the scale created in the current research is a usable scale.
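Cronbach's alpha, used in the abstract above and in several other records in this listing to examine internal consistency, can be sketched as follows. The formula is the standard one, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores); the function name `cronbach_alpha` and the response data are hypothetical and unrelated to any study listed here.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.
    Uses sample variances (n - 1 denominator) throughout."""
    k = len(rows[0])
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical 5 respondents x 3 items (NOT study data).
responses = [
    [3, 4, 3],
    [2, 2, 3],
    [5, 4, 5],
    [1, 2, 1],
    [4, 5, 4],
]
print(round(cronbach_alpha(responses), 3))
```

Alpha rises when items covary strongly relative to their individual variances, which is why it is read as a measure of how consistently the items track one underlying construct.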
The revised Generalized Expectancy for Success Scale: a validity and reliability study.

PubMed

Hale, W D; Fiedler, L R; Cochran, C D

1992-07-01

The Generalized Expectancy for Success Scale (GESS; Fibel & Hale, 1978) was revised and assessed for reliability and validity. The revised version was administered to 199 college students along with other conceptually related measures, including the Rosenberg Self-Esteem Scale, the Life Orientation Test, and Rotter's Internal-External Locus of Control Scale. One subsample of students also completed the Eysenck Personality Inventory, while another subsample performed a criterion-related task that involved risk taking. Item analysis yielded 25 items with correlations of .45 or higher with the total score. Results indicated high internal consistency and test-retest reliability.
The Validity and Reliability Test of the Indonesian Version of Gastroesophageal Reflux Disease Quality of Life (GERD-QOL) Questionnaire.

PubMed

Siahaan, Laura A; Syam, Ari F; Simadibrata, Marcellus; Setiati, Siti

2017-01-01

To obtain a valid and reliable GERD-QOL questionnaire for Indonesian application. At the initial stage, the GERD-QOL questionnaire was first translated into the Indonesian language and the translated questionnaire was subsequently translated back into the original language (back-to-back translation). The results were evaluated by the research team, and an Indonesian version of the GERD-QOL questionnaire was thereby developed. Ninety-one patients who had been clinically diagnosed with GERD based on the Montreal criteria were interviewed using the Indonesian version of the GERD-QOL questionnaire and the SF-36 questionnaire. Validity was evaluated using construct validity and external validity, and reliability was tested by internal consistency and test-retest methods. The Indonesian version of the GERD-QOL questionnaire had good internal consistency reliability, with a Cronbach alpha of 0.687-0.842, and good test-retest reliability, with an intra-class correlation coefficient of 0.756-0.936 (p<0.05). The questionnaire was also demonstrated to have good validity, with a proven high correlation to each question of the SF-36 (p<0.05). The Indonesian version of the GERD-QOL questionnaire has been proven valid and reliable to evaluate the quality of life of GERD patients.
Externalizing disorders: cluster 5 of the proposed meta-structure for DSM-V and ICD-11.

PubMed

Krueger, R F; South, S C

2009-12-01

The extant major psychiatric classifications DSM-IV and ICD-10 are purportedly atheoretical and largely descriptive. Although this achieves good reliability, the validity of a medical diagnosis is greatly enhanced by an understanding of the etiology. In an attempt to group mental disorders on the basis of etiology, five clusters have been proposed. We consider the validity of the fifth cluster, externalizing disorders, within this proposal. We reviewed the literature in relation to 11 validating criteria proposed by the Study Group of the DSM-V Task Force, in terms of the extent to which these criteria support the idea of a coherent externalizing spectrum of disorders. This cluster distinguishes itself by the central role of disinhibitory personality in mental disorders spread throughout sections of the current classifications, including substance dependence, antisocial personality disorder and conduct disorder. Shared biomarkers, co-morbidity and course offer additional evidence for a valid cluster of externalizing disorders. Externalizing disorders meet many of the salient criteria proposed by the Study Group of the DSM-V Task Force to suggest a classification cluster.
Reliability and Validity of the Hip Stability Isometric Test (HipSIT): A New Method to Assess Hip Posterolateral Muscle Strength.

PubMed

Almeida, Gabriel Peixoto Leão; das Neves Rodrigues, Helena Larissa; de Freitas, Bruno Wesley; de Paula Lima, Pedro Olavo

2017-12-01

Study Design Cross-sectional study. Background The Hip Stability Isometric Test (HipSIT) evaluates the strength of the hip posterolateral stabilizers in a position that favors greater activation of the gluteus maximus and gluteus medius and lower activation of the tensor fascia lata. Objectives To check the validity and reliability of the HipSIT and to evaluate the HipSIT in women with patellofemoral pain (PFP). Methods The HipSIT was evaluated with a handheld dynamometer. During testing, the participants were sidelying, with their legs positioned at 45° of hip flexion and 90° of knee flexion. Participants were instructed to raise the knee of the upper leg while keeping the upper and lower heels in contact. To establish reliability and validity, 49 women were tested with the HipSIT by 2 different evaluators on day 1, and then again 7 days later. The strength of the hip extensors, abductors, and external rotators was also evaluated. Twenty women with unilateral PFP were also evaluated. Results The HipSIT has excellent intrarater and interrater reliability. The standard error of measurement was 0.01 kgf/kg, and the minimal detectable change was 0.036 kgf/kg. The HipSIT showed good validity in isolated hip abduction, external rotation, and extension (P<.01). Women with PFP showed a 10% deficit in the HipSIT results for the symptomatic limb (P = .01). Conclusion The HipSIT showed excellent interrater and intrarater reliability, moderate to good validity in women, and was able to identify strength deficits in women with PFP. J Orthop Sports Phys Ther 2017;47(12):906-913. Epub 9 Oct 2017. doi:10.2519/jospt.2017.7274.
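The abstract above reports a standard error of measurement (SEM) and a minimal detectable change (MDC) without stating how they were derived. A common convention, assumed here, is SEM = SD * sqrt(1 - ICC) and MDC at the 95% level = 1.96 * sqrt(2) * SEM. The function names and input values below are illustrative only.

```python
import math


def sem(sd, icc):
    """Standard error of measurement from a sample SD and a reliability coefficient."""
    return sd * math.sqrt(1.0 - icc)


def mdc95(sem_value):
    """Minimal detectable change at 95% confidence for a test-retest difference."""
    return 1.96 * math.sqrt(2.0) * sem_value


# Hypothetical inputs (NOT study data): SD in kgf/kg and an ICC.
example_sem = sem(sd=0.10, icc=0.96)
print(round(example_sem, 3), round(mdc95(example_sem), 3))
```

Under this convention the MDC is about 2.77 times the SEM, so an SEM near 0.013 kgf/kg would correspond to an MDC near 0.036 kgf/kg; whether the study used this exact formula is not stated in the abstract.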
Reliability, Validity, and Clinical Utility of the Dominic Interactive for Adolescents-Revised: A DSM-5-Based Self-Report Screen for Mental Disorders, Borderline Personality Traits, and Suicidality.

PubMed

Bergeron, Lise; Smolla, Nicole; Berthiaume, Claude; Renaud, Johanne; Breton, Jean-Jacques; St-Georges, Marie; Morin, Pauline; Zavaglia, Elissa; Labelle, Réal

2017-03-01

The Dominic Interactive for Adolescents-Revised (DIA-R) is a multimedia self-report screen for 9 mental disorders, borderline personality traits, and suicidality defined by the fifth edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5). This study aimed to examine the reliability and the validity of this instrument. French- and English-speaking adolescents aged 12 to 15 years (N = 447) were recruited from schools and clinical settings in Montreal and were evaluated twice. The internal consistency was estimated by Cronbach alpha coefficients and the test-retest reliability by intraclass correlation coefficients. Cutoff points on the DIA-R scales were determined by using clinically relevant measures for defining external validation criteria: the Schedule for Affective Disorders and Schizophrenia for School-Aged Children, the Beck Hopelessness Scale, and the Abbreviated-Diagnostic Interview for Borderlines. Receiver operating characteristic (ROC) analyses provided accuracy estimates (area under the ROC curve, sensitivity, specificity, likelihood ratio) to evaluate the ability of the DIA-R scales to predict external criteria. For most of the DIA-R scales, reliability coefficients were excellent or moderate. High or moderate accuracy estimates from ROC analyses demonstrated the ability of the DIA-R thresholds to predict psychopathological conditions. These thresholds were generally able to discriminate between clinical and school subsamples. However, the validity of the obsessions/compulsions scale was too low. Findings clearly support the reliability and the validity of the DIA-R. This instrument may be useful to assess a wide range of adolescents' mental health problems in the continuum of services. This conclusion applies to all scales, except the obsessions/compulsions one.
Measuring Emotions in Students' Learning and Performance: The Achievement Emotions Questionnaire (AEQ)

ERIC Educational Resources Information Center

Pekrun, Reinhard; Goetz, Thomas; Frenzel, Anne C.; Barchfeld, Petra; Perry, Raymond P.

2011-01-01

Aside from test anxiety scales, measurement instruments assessing students' achievement emotions are largely lacking. This article reports on the construction, reliability, internal validity, and external validity of the Achievement Emotions Questionnaire (AEQ) which is designed to assess various achievement emotions experienced by students in…

Self-reported competency--validation of the Norwegian version of the patient competency rating scale for traumatic brain injury.

PubMed

Sveen, Unni; Andelic, Nada; Bautz-Holter, Erik; Røe, Cecilie

2015-01-01

To evaluate the psychometric properties of the Norwegian version of the Patient Competency Rating Scale (PCRS) in patients with traumatic brain injury (TBI) at 12 months post-injury. Demographic and injury-related data were registered upon admission to the hospital in 148 TBI patients with mild, moderate, or severe TBI. At 12 months post-injury, competency in activities and global functioning were measured using the PCRS patient version and the Glasgow Outcome Scale-Extended (GOSE). Descriptive reliability statistics, factor analysis and Rasch modeling were applied to explore the psychometric properties of the PCRS. External validity was evaluated using the GOSE. The PCRS can be divided into three subscales that reflect interpersonal/emotional, cognitive, and activities of daily living competency. The three-factor solution explained 56.6% of the variance in functioning. The internal consistency was very good, with a Cronbach's α of 0.95. Item 30, "controlling my laughter", did not load above 0.40 on any factors and did not fit the Rasch model. The external validity of the subscales was acceptable, with correlations between 0.50 and 0.52 with the GOSE. The Norwegian version of the PCRS is reliable, has an acceptable construct and external validity, and can be recommended for use during the later phases of TBI.
Measuring Eccentric Strength of the Shoulder External Rotators Using a Handheld Dynamometer: Reliability and Validity

PubMed Central

Johansson, Fredrik R.; Skillgate, Eva; Lapauw, Mattis L.; Clijmans, Dorien; Deneulin, Valentijn P.; Palmans, Tanneke; Engineer, Human Kinetic; Cools, Ann M.

2015-01-01

Context Shoulder strength assessment plays an important role in the clinical examination of the shoulder region. Eccentric strength measurements are of special importance in guiding the clinician in injury prevention or return-to-play decisions after injury. Objective To examine the absolute and relative reliability and validity of a standardized eccentric strength-measurement protocol for the glenohumeral external rotators. Design Descriptive laboratory study. Setting Testing environment at the Department of Rehabilitation Sciences and Physiotherapy of Ghent University, Belgium. Patients or Other Participants Twenty-five healthy participants (9 men and 16 women) without any history of shoulder pain were tested by 2 independent assessors using a handheld dynamometer (HHD) and underwent an isokinetic testing procedure. Intervention(s) The clinical protocol used an HHD, a DynaPort accelerometer to measure acceleration and angular velocity of testing 30°/s over 90° of range of motion, and a Biodex dynamometer to measure isokinetic activity. Main Outcome Measure(s) Three eccentric strength measurements: (1) tester 1 with the HHD, (2) tester 2 with the HHD, and (3) Biodex isokinetic strength measurement. Results The intratester reliability was excellent (0.879 and 0.858), whereas the intertester reliability was good, with an intraclass correlation coefficient between testers of 0.714. Pearson product moment correlation coefficients of 0.78 and 0.70 were noted between the HHD and the isokinetic data, showing good validity of this new procedure. Conclusions Standardized eccentric rotator cuff strength can be tested and measured in the clinical setting with good-to-excellent reliability and validity using an HHD. PMID:25974381
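The validity result in the abstract above rests on Pearson product-moment correlations between handheld and isokinetic measurements. A minimal sketch of that statistic follows; the function name `pearson_r` and the paired values are hypothetical and are not the study's data.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


# Hypothetical paired strength readings (NOT study data).
handheld = [12.1, 15.4, 9.8, 14.0, 11.2, 16.3]
isokinetic = [13.0, 15.1, 10.9, 13.2, 12.5, 17.0]
print(round(pearson_r(handheld, isokinetic), 3))
```

Pearson's r captures linear association only; unlike an agreement-type ICC, it is unaffected by a constant offset or scale difference between the two instruments.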
Analysis of internal and external validity criteria for a computerized visual search task: A pilot study.

PubMed

Richard's, María M; Introzzi, Isabel; Zamora, Eliana; Vernucci, Santiago

2017-01-01

Inhibition is one of the main executive functions because of its fundamental role in cognitive and social development. Given the importance of reliable, computerized measurements for assessing inhibitory performance, this research analyzes the internal and external validity criteria of a computerized conjunction search task designed to evaluate the role of perceptual inhibition. A sample of 41 children (21 females and 20 males), aged between 6 and 11 years (M = 8.49, SD = 1.47), of middle socio-economic level and intentionally selected from a privately managed school in Mar del Plata (Argentina), was assessed. The Conjunction Search Task from the TAC Battery and the Coding and Symbol Search tasks from the Wechsler Intelligence Scale for Children were used. Overall, the results confirm that the perceptual inhibition task from the TAC shows solid internal and external validity, making it a valid instrument for measuring this process.
[Clinical and empirical findings with the OPD-CA].

PubMed

Winter, Sibylle; Jelen, Anna; Pressel, Christine; Lenz, Klaus; Lehmkuhl, Ulrike

2011-01-01

60 clinical patients (5-17 years) were diagnosed with an interview-manual of OPD-CA (Winter, 2004). For clinical validity a comparison of patients with internal (N=17) and external disorders (N=19) was shown. References for clinical validity resulted from the comparison of the groups, especially for the axes "conflict" and "prerequisites for treatment". Patients with internal disorders showed the conflict desire for care versus autarchy significantly more often than patients with external disorders. On the other hand patients with external disorders displayed the conflict submission versus control significantly more often. Significant differences were also found for the axis "prerequisites for treatment". Patients with internal disorders had better "prerequisites for treatment" in the domains experience of illness and the prerequisites for therapy. For the axes "interpersonal relation", "structure" and "prerequisites for treatment" satisfactory data for validity and reliability were found. The clinical validity points to the usefulness of OPD-CA-manual for psychodynamic diagnostics in childhood and adolescence.
Forecasting Emergency Department Crowding: An External, Multi-Center Evaluation

PubMed Central

Hoot, Nathan R.; Epstein, Stephen K.; Allen, Todd L.; Jones, Spencer S.; Baumlin, Kevin M.; Chawla, Neal; Lee, Anna T.; Pines, Jesse M.; Klair, Amandeep K.; Gordon, Bradley D.; Flottemesch, Thomas J.; LeBlanc, Larry J.; Jones, Ian; Levin, Scott R.; Zhou, Chuan; Gadd, Cynthia S.; Aronsky, Dominik

2009-01-01

Objective To apply a previously described tool to forecast ED crowding at multiple institutions, and to assess its generalizability for predicting the near-future waiting count, occupancy level, and boarding count. Methods The ForecastED tool was validated using historical data from five institutions external to the development site. A sliding-window design separated the data for parameter estimation and forecast validation. Observations were sampled at consecutive 10-minute intervals during 12 months (n = 52,560) at four sites and 10 months (n = 44,064) at the fifth. Three outcome measures – the waiting count, occupancy level, and boarding count – were forecast 2, 4, 6, and 8 hours beyond each observation, and forecasts were compared to observed data at corresponding times. The reliability and calibration were measured following previously described methods. After linear calibration, the forecasting accuracy was measured using the median absolute error (MAE). Results The tool was successfully used for five different sites. Its forecasts were more reliable, better calibrated, and more accurate at 2 hours than at 8 hours. The reliability and calibration of the tool were similar between the original development site and external sites; the boarding count was an exception, which was less reliable at four out of five sites. Some variability in accuracy existed among institutions; when forecasting 4 hours into the future, the MAE of the waiting count ranged between 0.6 and 3.1 patients, the MAE of the occupancy level ranged between 9.0 and 14.5% of beds, and the MAE of the boarding count ranged between 0.9 and 2.7 patients. Conclusion The ForecastED tool generated potentially useful forecasts of input and throughput measures of ED crowding at five external sites, without modifying the underlying assumptions. Noting the limitation that this was not a real-time validation, ongoing research will focus on integrating the tool with ED information systems. PMID:19716629
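The accuracy measure named in the abstract above, the median absolute error (which the abstract abbreviates as MAE), can be sketched as below. The function name `median_absolute_error` and the forecast and observed counts are hypothetical; they do not come from the ForecastED evaluation.

```python
from statistics import median


def median_absolute_error(forecast, observed):
    """Median of the absolute differences between paired forecasts and observations."""
    return median(abs(f - o) for f, o in zip(forecast, observed))


# Hypothetical waiting counts (NOT study data): forecast vs. later observed values.
forecast_counts = [10, 12, 9, 15]
observed_counts = [11, 12, 12, 13]
print(median_absolute_error(forecast_counts, observed_counts))
```

Using the median rather than the mean makes the summary less sensitive to a few unusually large forecasting misses.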
The Student Risk Screening Scale for Early Childhood: An Initial Validation Study

ERIC Educational Resources Information Center

Lane, Kathleen Lynne; Oakes, Wendy Peia; Menzies, Holly Mariah; Major, Rebecca; Allegra, Laurie; Powers, Lisa; Schatschneider, Chris

2015-01-01

We report findings of two exploratory validation studies of a revised instrument: the "Student Risk Screening Scale for Early Childhood" version (SRSS-EC). The SRSS-EC was modified to reflect characteristics of externalizing and internalizing behaviors manifested by preschool-age children. In Study 1, we explored the reliability of…
Preliminary findings on the reliability and validity of the Cantonese Birmingham Cognitive Screen in patients with acute ischemic stroke

PubMed Central

Pan, Xiaoping; Chen, Haobo; Bickerton, Wai-Ling; Lau, Johnny King Lam; Kong, Anthony Pak Hin; Rotshtein, Pia; Guo, Aihua; Hu, Jianxi; Humphreys, Glyn W

2015-01-01

Background There are no currently effective cognitive assessment tools for patients who have suffered stroke in the People’s Republic of China. The Birmingham Cognitive Screen (BCoS) has been shown to be a promising tool for revealing patients’ poststroke cognitive deficits in specific domains, which facilitates more individually designed rehabilitation in the long run. Hence we examined the reliability and validity of a Cantonese version BCoS in patients with acute ischemic stroke, in Guangzhou. Method A total of 98 patients with acute ischemic stroke were assessed with the Cantonese version of the BCoS, and an additional 133 healthy individuals were recruited as controls. Apart from the BCoS, the patients also completed a number of external cognitive tests, including the Montreal Cognitive Assessment Test (MoCA), Mini Mental State Examination (MMSE), Albert’s cancellation test, the Rey–Osterrieth Complex Figure Test, and six gesture matching tasks. Cutoff scores for failing each subtest, ie, deficits, were computed based on the performance of the controls. The validity and reliability of the Cantonese BCoS were examined, as well as interrater and test–retest reliability. We also compared the proportions of cases being classified as deficits in controlled attention, memory, character writing, and praxis, between patients with and without spoken language impairment. Results Analyses showed high test–retest reliability and agreement across independent raters on the qualitative aspects of measurement. Significant correlations were observed between the subtests of the Cantonese BCoS and the other external cognitive tests, providing evidence for convergent validity of the Cantonese BCoS. The screen was also able to generate measures of cognitive functions that were relatively uncontaminated by the presence of aphasia. Conclusion This study suggests good reliability and validity of the Cantonese version of the BCoS. The Cantonese BCoS is a very promising tool for the detection of cognitive problems in Cantonese speakers. PMID:26396522
[Development and validity of workplace bullying in nursing-type inventory (WPBN-TI)].

PubMed

Lee, Younju; Lee, Mihyoung

2014-04-01

The purpose of this study was to develop an instrument to assess bullying of nurses, and to test the validity and reliability of the instrument. The initial thirty items of WPBN-TI were identified through a review of the literature on types of bullying related to nursing and in-depth interviews with 14 nurses who experienced bullying at work. Sixteen items were developed through 2 content validity tests by 9 experts and 10 nurses. The final WPBN-TI instrument was evaluated by 458 nurses from five general hospitals in the Incheon metropolitan area. SPSS 18.0 was used to assess the instrument based on internal consistency reliability, construct validity, and criterion validity. WPBN-TI consisted of 16 items with three distinct factors (verbal and nonverbal bullying, work-related bullying, and external threats), which explained 60.3% of the total variance. The convergent validity and determinant validity for WPBN-TI were 100.0% and 89.7%, respectively. Known-groups validity of WPBN-TI was supported by the mean difference between groups defined by subjective perception of bullying. Criterion validity for WPBN-TI was satisfactory (more than .70). The reliability of WPBN-TI was a Cronbach's α of .91. With high validity and reliability, WPBN-TI is suitable for determining types of bullying in the nursing workplace.
Reliability and validity of goniometric iPhone applications for the assessment of active shoulder external rotation.

PubMed

Mitchell, Katy; Gutierrez, Simran Bakshi; Sutton, Stacy; Morton, Stephanie; Morgenthaler, Andrea

2014-10-01

The purpose of this study was to determine the reliability and validity of two smartphone applications, (1) GetMyROM (inclinometry-based) and (2) DrGoniometry (photo-based), in the measurement of active shoulder external rotation (ER) as compared to standard goniometry (SG). Ninety-four Texas Woman's University Doctor of Physical Therapy students from the School of Physical Therapy - Houston campus were recruited to participate in this study. Two iPhone applications were compared to SG using both novice and experienced raters. Active shoulder ER range of motion was measured over two time periods in random order by blinded novice and experienced raters. Intra-rater reliability using novice raters for the two applications ranged from an intraclass correlation coefficient (ICC) of 0.79 to 0.81 with SG at 0.82. Inter-rater reliability (novice/expert) for the two applications ranged from an ICC of 0.92 to 0.94 with SG at 0.91. Concurrent validity (when compared to SG) ranged from 0.93 to 0.94. There were no significant differences between the novice and experienced raters. Both applications were found to be reliable and comparable to SG. A photo-based application potentially offers a superior method of measurement, as visualizing the landmarks may be simplified in this format and it provides a record of measurement. Further study using patient populations may find that the two studied applications are useful as an adjunct for clinical practice.
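The intraclass correlation coefficients reported in records like the one above can be illustrated with a short sketch. The abstract does not say which ICC form was used, so the example assumes the two-way random-effects, absolute-agreement, single-measurement form, often written ICC(2,1); the function name `icc_2_1` and the toy ratings are hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1) for a table with one row per subject and one column per rater.

    Assumed form: two-way random effects, absolute agreement, single rating.
    """
    n = len(ratings)       # number of subjects
    k = len(ratings[0])    # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical data: the second rater reads exactly one unit higher each time.
offset_ratings = [[1, 2], [2, 3], [3, 4]]
icc_offset = icc_2_1(offset_ratings)
```

Because this form measures absolute agreement, a constant offset between raters lowers the coefficient even when the two raters rank subjects identically; a consistency-type ICC would not penalize it.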
Reliability, validity, and utility of the Minnesota Multiphasic Personality Inventory-2-Restructured Form (MMPI-2-RF) in assessments of bariatric surgery candidates.

PubMed

Tarescavage, Anthony M; Wygant, Dustin B; Boutacoff, Lana I; Ben-Porath, Yossef S

2013-12-01

In the current study, we examined the reliability, validity, and clinical utility of Minnesota Multiphasic Personality Inventory-2-Restructured Form (MMPI-2-RF; Ben-Porath & Tellegen, 2011) scores in a sample of 759 bariatric surgery candidates. We provide descriptives for all scales, internal consistency and standard error of measurement estimates for all substantive scales, external correlates of substantive scales using chart review and self-report criteria, and relative risk ratios to assess the clinical utility of the instrument. Results generally support the reliability, validity, and clinical utility of MMPI-2-RF scale scores in the psychological evaluation of bariatric surgery candidates. Limitations, future directions, and practical application of these results are discussed. (c) 2013 APA, all rights reserved.
Validation of EncephalApp, Smartphone-Based Stroop Test, for the Diagnosis of Covert Hepatic Encephalopathy.

PubMed

Bajaj, Jasmohan S; Heuman, Douglas M; Sterling, Richard K; Sanyal, Arun J; Siddiqui, Muhammad; Matherly, Scott; Luketic, Velimir; Stravitz, R Todd; Fuchs, Michael; Thacker, Leroy R; Gilles, HoChong; White, Melanie B; Unser, Ariel; Hovermale, James; Gavis, Edith; Noble, Nicole A; Wade, James B

2015-10-01

Detection of covert hepatic encephalopathy (CHE) is difficult, but point-of-care testing could increase rates of diagnosis. We aimed to validate the ability of the smartphone app EncephalApp, a streamlined version of Stroop App, to detect CHE. We evaluated face validity, test-retest reliability, and external validity. Patients with cirrhosis (n = 167; 38% with overt HE [OHE]; mean age, 55 years; mean Model for End-Stage Liver Disease score, 12) and controls (n = 114) were each given a paper and pencil cognitive battery (standard) along with EncephalApp. EncephalApp has Off and On states; results measured were OffTime, OnTime, OffTime+OnTime, and number of runs required to complete 5 off and on runs. Thirty-six patients with cirrhosis underwent driving simulation tests, and EncephalApp results were correlated with the simulation results. Test-retest reliability was analyzed in a subgroup of patients. The test was performed before and after transjugular intrahepatic portosystemic shunt placement, and before and after correction for hyponatremia, to determine external validity. All patients with cirrhosis performed worse on paper and pencil and EncephalApp tests than controls. Patients with cirrhosis and OHE performed worse than those without OHE. Age-dependent EncephalApp cutoffs (younger or older than 45 years) were set. An OffTime+OnTime value of >190 seconds identified all patients with CHE with an area under the receiver operator characteristic value of 0.91; the area under the receiver operator characteristic value was 0.88 for diagnosis of CHE in those without OHE. EncephalApp times correlated with crashes and illegal turns in driving simulation tests. Test-retest reliability was high (intraclass coefficient, 0.83) among 30 patients retested 1-3 months apart. OffTime+OnTime increased significantly (206 vs 255 seconds, P = .007) among 10 patients retested 33 ± 7 days after transjugular intrahepatic portosystemic shunt placement. OffTime+OnTime decreased significantly (242 vs 225 seconds, P = .03) in 7 patients tested before and after correction for hyponatremia (126 ± 3 to 132 ± 4 meq/L, P = .01) 10 ± 5 days apart. A smartphone app called EncephalApp has good face validity, test-retest reliability, and external validity for the diagnosis of CHE. Copyright © 2015 AGA Institute. Published by Elsevier Inc. All rights reserved.
Validation of EncephalApp, Smartphone-based Stroop Test, for the Diagnosis of Covert Hepatic Encephalopathy

PubMed Central

Bajaj, Jasmohan S; Heuman, Douglas M; Sterling, Richard K; Sanyal, Arun J; Siddiqui, Muhammad; Matherly, Scott; Luketic, Velimir; Stravitz, R Todd; Fuchs, Michael; Thacker, Leroy R; Gilles, HoChong; White, Melanie B; Unser, Ariel; Hovermale, James; Gavis, Edith; Noble, Nicole A; Wade, James B

2014-01-01

Background & Aims: Detection of covert hepatic encephalopathy (CHE) is difficult but point of care testing could increase rates of diagnosis. We aimed to validate the ability of the smartphone app EncephalApp, a streamlined version of Stroop App, to detect CHE. We evaluated face validity, test–retest reliability, and external validity. Methods: Patients with cirrhosis (n=167; 38% with overt HE [OHE]; mean age, 55 years; mean model for end-stage liver disease score, 12) and controls (n=114) were each given a paper and pencil cognitive battery (standard) along with EncephalApp. EncephalApp has Off and On states; results measured were: OffTime, OnTime, OffTime+OnTime, and number of runs required to complete 5 off and on runs. Thirty-six patients with cirrhosis underwent driving simulation tests, and EncephalApp results were correlated with the simulation results. Test–retest reliability was analyzed in a subgroup of patients. The test was performed before and after transjugular intra-hepatic portosystemic shunt placement, and before and after correction for hyponatremia, to determine external validity. Results: All patients with cirrhosis performed worse on paper and pencil and EncephalApp tests than controls. Patients with cirrhosis and OHE performed worse than those without OHE. Age-dependent EncephalApp cut-offs (younger or older than 45 years) were set. An OffTime+OnTime value of >190 seconds identified all patients with CHE with an area under the receiver operator characteristic (AUROC) value of 0.91; the AUROC value was 0.88 for diagnosis of CHE in those without OHE. EncephalApp times correlated with crashes and illegal turns in driving simulation tests. Test–retest reliability was high (intra-class coefficient, 0.83) among 30 patients retested 1–3 months apart. OffTime+OnTime increased significantly (206 vs 255, P=.007) among 10 patients retested 33±7 days after transjugular intra-hepatic portosystemic shunt placement. OffTime+OnTime decreased significantly (242 vs 225, P=.03) in 7 patients tested before and after correction for hyponatremia (126±3 to 132±4 meq/L, P=.01), 10±5 days apart. Conclusions: A smartphone app called EncephalApp has good face validity, test–retest reliability, and external validity for the diagnosis of CHE. PMID:24846278
Development and Psychometric Properties of the Math and Me Survey: Measuring Third through Sixth Graders' Attitudes toward Mathematics

ERIC Educational Resources Information Center

Adelson, Jill L.; McCoach, D. Betsy

2011-01-01

The Math and Me Survey was designed to measure elementary students' attitudes toward mathematics. The authors conducted content validation, exploratory factor analysis, confirmatory factor analysis, item response theory, reliability, and external validity analyses to improve it and to test its psychometric properties. The final Math and Me Survey…
Reliability and Validity of the Migraine Disability Assessment Scale among Migraine and Tension Type Headache in Iranian Patients

PubMed Central

Asgari, Fatemeh; Haghdoost, Faraidoon; Masjedi, Samaneh Sadat; Manouchehri, Navid; Banihashemi, Mahboobeh; Ghorbani, Abbas; Najafi, Mohammad Reza; Saadatnia, Mohammad; Lipton, Richard B.

2014-01-01

Introduction. MIDAS is a valid and reliable short questionnaire for assessment of headache related disability. Linguistic validation of Persian MIDAS and assessment of psychometric properties between tension type headache (TTH) and migraine were the aims of this study. Methods. Patients with migraine or TTH were included. At the first visit, we administered a headache symptom questionnaire, MIDAS, and SF-36. Patients filled out MIDAS at the second and third visits, within three and eight weeks after the baseline visit. Internal consistency (Cronbach α) and test-retest reproducibility (Spearman correlation coefficient) were used to assess reliability. Convergent validity and MIDAS capability to differentiate between chronic and episodic headaches (migraine and TTH) were also assessed. Results. The 267 participants had episodic migraine (EM-64%), chronic migraine (CM-13.5%), episodic TTH (ETTH-13.5%), and chronic TTH (CTTH-9%). Internal consistency reliability was 0.8 for the entire sample, 0.72 for TTH, and 0.82 for migraine. Test-retest reliability for all questions between visit 1 and visit 2 varied from 0.54 to 0.71. Convergent validity was assessed using SF-36 as an external referent. Patients with episodic headaches (EM and ETTH) had significantly lower MIDAS scores than those with chronic headaches (CM and CTTH). Conclusion. Persian MIDAS is a valid and reliable questionnaire for migraine and TTH that can differentiate between episodic headache and chronic headache. PMID:24527462
Health-related quality of life in children with dysphonia and validation of the French Pediatric Voice Handicap Index.

PubMed

Oddon, P A; Boucekine, M; Boyer, L; Triglia, J M; Nicollas, R

2018-01-01

Voice disorders are common in the pediatric population and can negatively affect children's quality of life. The pediatric voice handicap Index (pVHI) is a valid instrument to assess parental perception of their children's voice but it has not been translated into French. The aim of the present study was to adapt a French version of the pVHI and to evaluate its psychometric properties including construct validity, reliability, and some aspects of external validity. We performed a cross sectional study including 32 dysphonic children and 60 children with no history of voice problems between 3 and 12 years of age. The original pVHI was translated into French according to forward-backward rules and then administered to parents or caregivers. Construct validity and internal consistency were explored using confirmatory factor analysis and Cronbach's alpha. The questionnaire was filled out twice to assess test-retest reliability using the intra-class correlation coefficient. The external validity was explored by comparing the French pVHI total and subscale scores between dysphonic and asymptomatic children. Correlations between the French pVHI and both the perceptual GRBAS scale and the health-related quality of life (HRQOL) survey "Vécu et Santé Perçu de l'Adolescent et de l'Enfant" (VSP-Ap) were also computed. The structure of the French pVHI showed a good fit with excellent reliability (α = 0.929) and high test-retest reliability. Significant differences were found between the group of dysphonic children and the control group (p < 0.001). The French pVHI scores were positively correlated to all parameters of the GRBAS scale (p < 0.05). Significant negative correlations were found between the Functional domain of the pVHI and various domains of the VSP-Ap such as Leisure Activities, Schooling and Sentimental Relationship (p < 0.05). The French pVHI is considered to be a valid and reliable instrument to assess voice-related quality of life in children with voice disorder. We recommend its use in the multidimensional protocols for assessing voice disorder in the pediatric population. Copyright © 2017. Published by Elsevier B.V.
Patterns of Cognitive Strengths and Weaknesses: Identification Rates, Agreement, and Validity for Learning Disabilities Identification

PubMed Central

Miciak, Jeremy; Fletcher, Jack M.; Stuebing, Karla; Vaughn, Sharon; Tolar, Tammy D.

2014-01-01

Purpose: Few empirical investigations have evaluated LD identification methods based on a pattern of cognitive strengths and weaknesses (PSW). This study investigated the reliability and validity of two proposed PSW methods: the concordance/discordance method (C/DM) and cross battery assessment (XBA) method. Methods: Cognitive assessment data for 139 adolescents demonstrating inadequate response to intervention were used to empirically classify participants as meeting or not meeting PSW LD identification criteria using the two approaches, permitting an analysis of: (1) LD identification rates; (2) agreement between methods; and (3) external validity. Results: LD identification rates varied between the two methods depending upon the cut point for low achievement, with low agreement for LD identification decisions. Comparisons of groups that met and did not meet LD identification criteria on external academic variables were largely null, raising questions of external validity. Conclusions: This study found low agreement and little evidence of validity for LD identification decisions based on PSW methods. An alternative may be to use multiple measures of academic achievement to guide intervention. PMID:24274155
Validation of the Italian version of the Stanford Presenteeism Scale in nurses.

PubMed

Cicolini, Giancarlo; Della Pelle, Carlo; Cerratti, Francesca; Franza, Marcello; Flacco, Maria E

2016-07-01

To ascertain the validity and reliability of the Italian version of the Stanford Presenteeism Scale (SPS-6). Presenteeism has been associated with a work productivity reduction, a lower quality of work and an increased risk of developing health disorders. It is particularly high among nurses and needs valid tools to be assessed. A validation study was carried out from July to September 2014. A three-section tool, made of a demographic form, the Stanford Presenteeism Scale (SPS-6) and the Perceived Stress Scale (PSS-10) was administered to a sample of nurses, enrolled in three Italian hospitals. Cronbach's α for the entire sample (229 nurses) was found to be 0.72. A significant negative correlation between SPS and perceived stress scores evidenced the external validity. The factor analysis showed a two-component solution, accounting for 71.2% of the variance. The confirmatory factor analysis showed an adequate fit. The Italian SPS-6 is a valid and reliable tool for workplace surveys. Since the validity and reliability of SPS-6 has been confirmed for the Italian version, we have now a valid tool that can measure the levels of presenteeism among Italian nurses. © 2016 John Wiley & Sons Ltd.
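Several records in this list report internal consistency as Cronbach's α. As a minimal sketch of how that coefficient is usually computed, the standard formula is α = k/(k−1) · (1 − Σ item variances / variance of total scores). The function name `cronbach_alpha` and the toy scores below are hypothetical and are not taken from any of the studies.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a table with one row per respondent
    and one column per questionnaire item."""
    k = len(scores[0])
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    # Sample variance of each item (column) across respondents.
    item_variances = [variance([row[j] for row in scores]) for j in range(k)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses: 3 respondents, 2 items.
toy_scores = [[1, 2], [2, 1], [3, 3]]
alpha = cronbach_alpha(toy_scores)
```

The sketch uses sample variances (n − 1 denominator) throughout; using population variances for both terms gives the same ratio, so the choice does not change the result as long as it is applied consistently.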
Adaptation of the ESPA29 Parental Socialization Styles Scale to the Basque language: evidence of validity.

PubMed

López-Jáuregui, Alicia; Oliden, Paula Elosua

2009-11-01

The aim of this study is to adapt the ESPA29 scale of parental socialization styles in adolescence to the Basque language. The study of its psychometric properties is based on the search for evidence of internal and external validity. The first focuses on the assessment of the dimensionality of the scale by means of exploratory factor analysis. The relationship between the dimensions of parental socialization styles and gender and age guarantees the external validity of the scale. The study of the equivalence of the adapted and original versions is based on comparisons of the reliability coefficients and on factor congruence. The results allow us to conclude that the two scales are equivalent.
Self-reported quality of life measure is reliable and valid in adult patients suffering from schizophrenia with executive impairment.

PubMed

Baumstarck, Karine; Boyer, Laurent; Boucekine, Mohamed; Aghababian, Valérie; Parola, Nathalie; Lançon, Christophe; Auquier, Pascal

2013-06-01

Impaired executive functions are among the most widely observed deficits in patients suffering from schizophrenia. The use of self-reported outcomes for evaluating treatment and managing care of these patients has been questioned. The aim of this study was to provide new evidence about the suitability of self-reported outcomes for use in this specific population by exploring the internal structure, reliability and external validity of a specific quality of life (QoL) instrument, the Schizophrenia Quality of Life questionnaire (SQoL18). Design: cross-sectional study. Inclusion criteria: age over 18 years, diagnosis of schizophrenia according to the DSM-IV criteria. Data collection: sociodemographic (age, gender, and education level) and clinical data (duration of illness, Positive and Negative Syndrome Scale, Calgary Depression Scale for Schizophrenia); QoL (SQoL18); and executive performance (Stroop test, lexical and verbal fluency, and trail-making test). Non-impaired and impaired populations were defined for each of the three tests. For the six groups, psychometric properties were compared to those reported from the reference population assessed in the validation study. One hundred and thirteen consecutive patients were enrolled. The factor analysis performed in the impaired groups showed that the questionnaire structure adequately matched the initial structure of the SQoL18. The unidimensionality of the dimensions was preserved, and the internal/external validity indices were close to those of the non-impaired groups and the reference population. Our study suggests that executive dysfunction did not compromise the reliability or validity of the self-reported disease-specific QoL questionnaire. Copyright © 2013 Elsevier B.V. All rights reserved.
External validation of change formulae in neuropsychology with neuroimaging biomarkers: a methodological recommendation and preliminary clinical data.

PubMed

Duff, Kevin; Suhrie, Kayla R; Dalley, Bonnie C A; Anderson, Jeffrey S; Hoffman, John M

2018-06-08

Within neuropsychology, a number of mathematical formulae (e.g. reliable change index, standardized regression based) have been used to determine if change across time has reliably occurred. When these formulae have been compared, they often produce different results, but 'different' results do not necessarily indicate which formulae are 'best.' The current study sought to further our understanding of change formulae by comparing them to clinically relevant external criteria (amyloid deposition and hippocampal volume). In a sample of 25 older adults with varying levels of cognitive intactness, participants were tested twice across one week with a brief cognitive battery. Seven different change scores were calculated for each participant. An amyloid PET scan (to get a composite of amyloid deposition) and an MRI (to get hippocampal volume) were also obtained. Deviation-based change formulae (e.g. simple discrepancy score, reliable change index with or without correction for practice effects) were all identical in their relationship to the two neuroimaging biomarkers, and all were non-significant. Conversely, regression-based change formulae (e.g. simple and complex indices) showed stronger relationships to amyloid deposition and hippocampal volume. These results highlight the need for external validation of the various change formulae used by neuropsychologists in clinical settings and research projects. The findings also preliminarily suggest that regression-based change formulae may be more relevant than deviation-based change formulae in this context.
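The abstract above names the reliable change index (RCI) without giving its formula. A minimal sketch follows, assuming the commonly cited Jacobson-Truax form, in which the observed change is divided by the standard error of the difference, with an optional subtraction of a mean practice effect. The function name, parameter names, and example numbers are hypothetical, and the study's seven change scores may have been computed differently.

```python
import math


def reliable_change_index(baseline, retest, sd_baseline, retest_r,
                          practice_effect=0.0):
    """Reliable change index under the assumed Jacobson-Truax form.

    sd_baseline     -- standard deviation of baseline scores in a reference sample
    retest_r        -- test-retest reliability coefficient of the measure
    practice_effect -- mean retest gain in the reference sample (0 = no correction)
    """
    # Standard error of measurement for a single score.
    sem = sd_baseline * math.sqrt(1.0 - retest_r)
    # Standard error of the difference between two scores.
    se_diff = math.sqrt(2.0) * sem
    return (retest - baseline - practice_effect) / se_diff


# Hypothetical example: a 10-point gain on a scale with SD 10 and r = .80.
rci = reliable_change_index(50, 60, sd_baseline=10, retest_r=0.8)
```

By convention, an absolute RCI above 1.96 is often read as change unlikely to arise from measurement error alone at the .05 level. Regression-based formulae instead predict the retest score from the baseline score (and possibly other covariates) and standardize the residual.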

Measuring Perceived Barriers to Physical Activity in Adolescents.

PubMed

Gunnell, Katie E; Brunet, Jennifer; Wing, Erin K; Bélanger, Mathieu

2015-05-01

Perceived barriers to moderate-to-vigorous physical activity (PA) may contribute to the low rates of moderate-to-vigorous PA in adolescents. We examined the psychometric properties of scores from the perceived barriers to moderate-to-vigorous PA scale (PB-MVPA) by examining composite reliability and validity evidence based on the internal structure of the PB-MVPA and relations with other variables. This study was a cross-sectional analysis of data collected in 2013 from adolescents (N = 507; Mage = 12.40, SD = .62) via self-report scales. Using exploratory and confirmatory factor analyses, we found that perceived barriers were best represented as two factors representing internal (e.g., "I am not interested in physical activity") and external (e.g., "I need equipment I don't have") dimensions. Composite reliability was over .80. Using multiple regression to examine the relationship between perceived barriers and moderate-to-vigorous PA, we found that perceived internal barriers were inversely related to moderate-to-vigorous PA (β = -.32, p < .05). Based on the results of analyses of variance, there were no known-group sex differences for perceived internal and external barriers (p > .26). The PB-MVPA scale demonstrated evidence of score reliability and validity. To improve the understanding of the impact of perceived barriers on moderate-to-vigorous PA in adolescents, researchers should examine internal and external barriers separately.
Reliable and valid tools for measuring surgeons' teaching performance: residents' vs. self evaluation.

PubMed

Boerebach, Benjamin C M; Arah, Onyebuchi A; Busch, Olivier R C; Lombarts, Kiki M J M H

2012-01-01

In surgical education, there is a need for educational performance evaluation tools that yield reliable and valid data. This paper describes the development and validation of robust evaluation tools that provide surgeons with insight into their clinical teaching performance. We investigated (1) the reliability and validity of 2 tools for evaluating the teaching performance of attending surgeons in residency training programs, and (2) whether surgeons' self evaluation correlated with the residents' evaluation of those surgeons. We surveyed 343 surgeons and 320 residents as part of a multicenter prospective cohort study of faculty teaching performance in residency training programs. The reliability and validity of the SETQ (System for Evaluation of Teaching Qualities) tools were studied using standard psychometric techniques. We then estimated the correlations between residents' and surgeons' evaluations. The response rate was 87% among surgeons and 84% among residents, yielding 2625 residents' evaluations and 302 self evaluations. The SETQ tools yielded reliable and valid data on 5 domains of surgical teaching performance, namely, learning climate, professional attitude towards residents, communication of goals, evaluation of residents, and feedback. The correlations between surgeons' self and residents' evaluations were low, with coefficients ranging from 0.03 for evaluation of residents to 0.18 for communication of goals. The SETQ tools for the evaluation of surgeons' teaching performance appear to yield reliable and valid data. The lack of strong correlations between surgeons' self and residents' evaluations suggests the need for using external feedback sources in informed self evaluation of surgeons. Copyright © 2012 Association of Program Directors in Surgery. Published by Elsevier Inc. All rights reserved.
The Interpersonal Shame Inventory for Asian Americans: Scale Development and Psychometric Properties

PubMed Central

Wong, Y. Joel; Kim, Bryan S. K.; Nguyen, Chi P.; Cheng, Janice Ka Yan; Saw, Anne

2016-01-01

This article reports the development and psychometric properties of the Interpersonal Shame Inventory (ISI), a culturally salient and clinically relevant measure of interpersonal shame for Asian Americans. Across 4 studies involving Asian American college students, the authors provided evidence for this new measure’s validity and reliability. Exploratory factor analyses and confirmatory factor analyses provided support for a model with 2 correlated factors: external shame (arising from concerns about others’ negative evaluations) and family shame (arising from perceptions that one has brought shame to one’s family), corresponding to 2 subscales: ISI-E and ISI-F, respectively. Evidence for criterion-related, concurrent, discriminant, and incremental validity was demonstrated by testing the associations between external shame and family shame and immigration/international status, generic state shame, face concerns, thwarted belongingness, perceived burdensomeness, self-esteem, depressive symptoms, and suicide ideation. External shame and family shame also exhibited differential relations with other variables. Mediation findings were consistent with a model in which family shame mediated the effects of thwarted belongingness on suicide ideation. Further, the ISI subscales demonstrated high alpha coefficients and test–retest reliability. These findings are discussed in light of the conceptual, methodological, and clinical contributions of the ISI. PMID:24188650
Development and validation of a prognostic nomogram for colorectal cancer after radical resection based on individual patient data from three large-scale phase III trials

PubMed Central

Akiyoshi, Takashi; Maeda, Hiromichi; Kashiwabara, Kosuke; Kanda, Mitsuro; Mayanagi, Shuhei; Aoyama, Toru; Hamada, Chikuma; Sadahiro, Sotaro; Fukunaga, Yosuke; Ueno, Masashi; Sakamoto, Junichi; Saji, Shigetoyo; Yoshikawa, Takaki

2017-01-01

Background: Few prediction models have so far been developed and assessed for the prognosis of patients who undergo curative resection for colorectal cancer (CRC). Materials and Methods: We prepared a clinical dataset including 5,530 patients who participated in three major randomized controlled trials as a training dataset and 2,263 consecutive patients who were treated at a cancer-specialized hospital as a validation dataset. All subjects underwent radical resection for CRC which was histologically diagnosed to be adenocarcinoma. The main outcomes that were predicted were the overall survival (OS) and disease free survival (DFS). The identification of the variables in this nomogram was based on a Cox regression analysis and the model performance was evaluated by Harrell's c-index. The calibration plot and its slope were also studied. For the external validation assessment, risk group stratification was employed. Results: The multivariate Cox model identified the following variables: sex, age, pathological T and N factor, tumor location, size, lymph node dissection, postoperative complications and adjuvant chemotherapy. The c-index was 0.72 (95% confidence interval [CI] 0.66-0.77) for the OS and 0.74 (95% CI 0.69-0.78) for the DFS. The proposed stratification in the risk groups demonstrated a significant distinction between the Kaplan–Meier curves for OS and DFS in the external validation dataset. Conclusions: We established a clinically reliable nomogram to predict the OS and DFS in patients with CRC using large scale and reliable independent patient data from phase III randomized controlled trials. The external validity was also confirmed on the practical dataset. PMID:29228760
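The nomogram above was evaluated with Harrell's c-index. As a rough illustration of what that statistic measures, the sketch below counts, among comparable patient pairs, how often the patient with the shorter observed survival time was assigned the higher predicted risk. The function name and toy data are hypothetical; the sketch ignores pairs with tied survival times and does not reproduce the authors' software or their confidence-interval calculation.

```python
def harrell_c_index(times, events, risks):
    """Simplified Harrell's concordance index.

    times  -- observed follow-up time per patient
    events -- 1 if the event (e.g. death) was observed, 0 if censored
    risks  -- predicted risk score per patient (higher = worse prognosis)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            # Pair is comparable when patient i had the event strictly earlier.
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical data: risk scores perfectly ordered against survival time.
c_perfect = harrell_c_index([1, 2, 3], [1, 1, 1], [3.0, 2.0, 1.0])
```

A value of 0.5 corresponds to predictions no better than chance and 1.0 to perfect ranking, which puts the reported 0.72 and 0.74 in context.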
International development and psychometric properties of the Child and Adolescent Trauma Screen (CATS).

PubMed

Sachser, Cedric; Berliner, Lucy; Holt, Tonje; Jensen, Tine K; Jungbluth, Nathaniel; Risch, Elizabeth; Rosner, Rita; Goldbeck, Lutz

2017-03-01

Systematic screening is a powerful means by which children and adolescents with posttraumatic stress symptoms (PTSS) can be detected. Reliable and valid measures based on current diagnostic criteria are needed. This study investigated the internal consistency and construct validity of the Child and Adolescent Trauma Screen (CATS) in three samples of trauma-exposed children in the US (self-reports: n = 249; caregiver reports: n = 267; pre-school n = 190), in Germany (self-reports: n = 117; caregiver reports: n = 95) and in Norway (self-reports: n = 109; caregiver reports: n = 62). Internal consistency was calculated using Cronbach's α. Convergent-discriminant validity was investigated using bivariate correlation coefficients with measures of depression, anxiety and externalizing symptoms. CFA was used to investigate the DSM-5 factor structure. In all three language samples the 20 item symptom score of the self-report and the caregiver report showed good to excellent reliability, with α ranging between .88 and .94. The convergent-discriminant validity pattern showed medium to strong correlations with measures of depression (r = .62 to .82) and anxiety (r = .40 to .77) and low to medium correlations with externalizing symptoms (r = -.15 to .43) within informants in all language versions. Using CFA the underlying DSM-5 factor structure with four symptom clusters (re-experiencing, avoidance, negative alterations in mood and cognitions, hyperarousal) was supported (n = 475 for self-report; n = 424 for caregiver reports). The external validation of the CATS with a DSM-5 based semi-structured clinical interview and corresponding determination of cut-points is pending. The CATS has satisfactory psychometric properties. Clinicians may consider the CATS as a screening tool and for symptom monitoring. Copyright © 2016 Elsevier B.V. All rights reserved.
Adaptation of Panic-Related Psychopathology Measures to Russian

ERIC Educational Resources Information Center

Kotov, Roman; Schmidt, Norman B.; Zvolensky, Michael J.; Vinogradov, Alexander; Antipova, Anna V.

2005-01-01

The study reports results of adaptation of panic-related psychopathology measures to Russian, including the Anxiety Sensitivity Index (ASI), the Agoraphobic Cognitions Questionnaire (ACQ), and the Mobility Inventory for Agoraphobia (MIA). Psychometric properties (e.g., reliability, factor structure, endorsement) and external validity of the…
A Cost Analysis Model for Army Sponsored Graduate Dental Education Programs.

DTIC Science & Technology

1997-04-01

characteristics of a good measurement tool? Cooper and Emory in their textbook, Business Research Methods, state there are three major criteria for evaluating...a measurement tool: validity, reliability, and practicality (Cooper and Emory 1995). Validity can be compartmentalized into internal and external...tremendous expense? The AEGD-1 year program is used extensively as a recruiting tool to encourage senior dental students to join the Army Dental Corps. The
The multiple sclerosis work difficulties questionnaire: translation and cross-cultural adaptation to Turkish and assessment of validity and reliability.

PubMed

Kahraman, Turhan; Özdoğar, Asiye Tuba; Honan, Cynthia Alison; Ertekin, Özge; Özakbaş, Serkan

2018-05-09

To linguistically and culturally adapt the Multiple Sclerosis Work Difficulties Questionnaire-23 (MSWDQ-23) for use in Turkey, and to examine its reliability and validity. Following standard forward-back translation of the MSWDQ-23, it was administered to 124 people with multiple sclerosis (MS). Validity was evaluated using related outcome measures including those related to employment status and expectations, disability level, fatigue, walking, and quality of life. Randomly selected participants were asked to complete the MSWDQ-23 again to assess test-retest reliability. Confirmatory factor analysis on the MSWDQ-23 demonstrated a good fit for the data, and the internal consistency of each subscale was excellent. The test-retest reliability for the total score and for the psychological/cognitive barriers, physical barriers, and external barriers subscales was high. The MSWDQ-23 and its subscales were positively correlated with the employment, disability level, walking, and fatigue outcome measures. This study suggests that the Turkish version of the MSWDQ-23 has high reliability and adequate validity, and that it can be used to determine the difficulties faced by people with multiple sclerosis in the workplace. Moreover, the study provides evidence about the test-retest reliability of the questionnaire. Implications for rehabilitation: Multiple sclerosis affects young people of working age. Understanding work-related problems is crucial to improve the likelihood that people with multiple sclerosis maintain their jobs. The Multiple Sclerosis Work Difficulties Questionnaire-23 (MSWDQ-23) is a valid and reliable measure of perceived workplace difficulties in people with multiple sclerosis: we presented its validation in Turkish. Professionals working in the field of vocational rehabilitation may benefit from using the MSWDQ-23 to predict the current work outcomes and future employment expectations.
Cross-cultural adaptation, validity, and reliability of the Parenting Styles and Dimensions Questionnaire - Short Version (PSDQ) for use in Brazil.

PubMed

Oliveira, Thaís D; Costa, Danielle de S; Albuquerque, Maicon R; Malloy-Diniz, Leandro F; Miranda, Débora M; de Paula, Jonas J

2018-06-11

The Parenting Styles and Dimensions Questionnaire (PSDQ) is used worldwide to assess three styles (authoritative, authoritarian, and permissive) and seven dimensions of parenting. In this study, we adapted the short version of the PSDQ for use in Brazil and investigated its validity and reliability. Participants were 451 mothers of children aged 3 to 18 years, though sample size varied with analyses. The translation and adaptation of the PSDQ followed a rigorous methodological approach. Then, we investigated the content, criterion, and construct validity of the adapted instrument. The scale content validity index (S-CVI) was considered adequate (0.97). There was evidence of internal validity, with the PSDQ dimensions showing strong correlations with their higher-order parenting styles. Confirmatory factor analysis endorsed the three-factor, second-order solution (i.e., three styles consisting of seven dimensions). The PSDQ showed convergent validity with the validated Brazilian version of the Parenting Styles Inventory (Inventário de Estilos Parentais - IEP), as well as external validity, as it was associated with several instruments measuring sociodemographic and behavioral/emotional-problem variables. The PSDQ is an effective and reliable psychometric instrument to assess childrearing strategies according to Baumrind's model of parenting styles.
Dental Students' Perceptions of Risk Factors for Musculoskeletal Disorders: Adapting the Job Factors Questionnaire for Dentistry.

PubMed

Presoto, Cristina D; Wajngarten, Danielle; Domingos, Patrícia A S; Campos, Juliana A D B; Garcia, Patrícia P N S

2018-01-01

The aims of this study were to adapt the Job Factors Questionnaire to the field of dentistry, evaluate its psychometric properties, evaluate dental students' perceptions of work/study risk factors for musculoskeletal disorders, and determine the influence of gender and academic level on those perceptions. All 580 students enrolled in two Brazilian dental schools in 2015 were invited to participate in the study. A three-factor structure (Repetitiveness, Work Posture, and External Factors) was tested through confirmatory factor analysis. Convergent validity was estimated using the average variance extracted (AVE), discriminant validity was based on the correlational analysis of the factors, and reliability was assessed. A causal model was created using structural equation modeling to evaluate the influence of gender and academic level on students' perceptions. A total of 480 students completed the questionnaire for an 83% response rate. The responding students' average age was 21.6 years (SD=2.98), and 74.8% were women. Higher scores were observed on the Work Posture factor items. The refined model presented proper fit to the studied sample. Convergent validity was compromised only for External Factors (AVE=0.47), and discriminant validity was compromised for Work Posture and External Factors (r² = 0.69). Reliability was adequate. Academic level did not have a significant impact on the factors, but the women students exhibited greater perception. Overall, the adaptation resulted in a useful instrument for assessing perceptions of risk factors for musculoskeletal disorders. Gender was found to significantly influence all three factors, with women showing greater perception of the risk factors.
A Public-Private Partnership Develops and Externally Validates a 30-Day Hospital Readmission Risk Prediction Model

PubMed Central

Choudhry, Shahid A.; Li, Jing; Davis, Darcy; Erdmann, Cole; Sikka, Rishi; Sutariya, Bharat

2013-01-01

Introduction: Preventing the occurrence of hospital readmissions is needed to improve quality of care and foster population health across the care continuum. Hospitals are being held accountable for improving transitions of care to avert unnecessary readmissions. Advocate Health Care in Chicago and Cerner (ACC) collaborated to develop all-cause, 30-day hospital readmission risk prediction models to identify patients that need interventional resources. Ideally, prediction models should encompass several qualities: they should have high predictive ability; use reliable and clinically relevant data; use vigorous performance metrics to assess the models; be validated in populations where they are applied; and be scalable in heterogeneous populations. However, a systematic review of prediction models for hospital readmission risk determined that most performed poorly (average C-statistic of 0.66) and efforts to improve their performance are needed for widespread usage. Methods: The ACC team incorporated electronic health record data, utilized a mixed-method approach to evaluate risk factors, and externally validated their prediction models for generalizability. Inclusion and exclusion criteria were applied on the patient cohort and then split for derivation and internal validation. Stepwise logistic regression was performed to develop two predictive models: one for admission and one for discharge. The prediction models were assessed for discrimination ability, calibration, overall performance, and then externally validated. Results: The ACC Admission and Discharge Models demonstrated modest discrimination ability during derivation, internal and external validation post-recalibration (C-statistic of 0.76 and 0.78, respectively), and reasonable model fit during external validation for utility in heterogeneous populations. Conclusions: The ACC Admission and Discharge Models embody the design qualities of ideal prediction models. The ACC plans to continue its partnership to further improve and develop valuable clinical models. PMID:24224068
Is laser speckle contrast analysis (LASCA) the new kid on the block in systemic sclerosis? A systematic literature review and pilot study to evaluate reliability of LASCA to measure peripheral blood perfusion in scleroderma patients.

PubMed

Cutolo, Maurizio; Vanhaecke, Amber; Ruaro, Barbara; Deschepper, Ellen; Ickinger, Claudia; Melsens, Karin; Piette, Yves; Trombetta, Amelia Chiara; De Keyser, Filip; Smith, Vanessa

2018-06-06

A reliable tool to evaluate flow is paramount in systemic sclerosis (SSc). We describe a systematic literature review on the reliability of laser speckle contrast analysis (LASCA) to measure peripheral blood perfusion (PBP) in SSc, together with an additional pilot study investigating the intra- and inter-rater reliability of LASCA. A systematic search was performed in 3 electronic databases, according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. In the pilot study, 30 SSc patients and 30 healthy subjects (HS) underwent LASCA assessment. Intra-rater reliability was assessed by having a first anchor rater perform the measurements at 2 time-points, and inter-rater reliability by having the anchor rater and a team of second raters perform the measurements in 15 SSc patients and 30 HS. The measurements were repeated with a second anchor rater in the other 15 SSc patients, as external validation. Only 1 of the 14 records of interest identified through the systematic search was included in the final analysis. In the additional pilot study, the intra-class correlation coefficient (ICC) for intra-rater reliability of the first anchor rater was 0.95 in SSc and 0.93 in HS, and the ICC for inter-rater reliability was 0.97 in SSc and 0.93 in HS. Intra- and inter-rater reliability of the second anchor rater was 0.78 and 0.87. The identified literature regarding the reliability of LASCA measurements reports good to excellent inter-rater agreement. This pilot study confirmed the reliability of LASCA measurements with good to excellent inter-rater agreement and additionally found good to excellent intra-rater reliability. Furthermore, similar results were found in the external validation. Copyright © 2018. Published by Elsevier B.V.
Validity and reliability of the session-RPE method for quantifying training in Australian football: a comparison of the CR10 and CR100 scales.

PubMed

Scott, Tannath J; Black, Cameron R; Quinn, John; Coutts, Aaron J

2013-01-01

The purpose of this study was to examine and compare the criterion validity and test-retest reliability of the CR10 and CR100 rating of perceived exertion (RPE) scales for team sport athletes that undertake high-intensity, intermittent exercise. Twenty-one male Australian football (AF) players (age: 19.0 ± 1.8 years, body mass: 83.92 ± 7.88 kg) participated in the first part (part A) of this study, which examined the construct validity of the session-RPE (sRPE) method for quantifying training load in AF. Ten male athletes (age: 16.1 ± 0.5 years) participated in the second part of the study (part B), which compared the test-retest reliability of the CR10 and CR100 RPE scales. In part A, the validity of the sRPE method was assessed by examining the relationships between sRPE and objective measures of internal (i.e., heart rate) and external training load (i.e., distance traveled), collected from AF training sessions. Part B of the study assessed the reliability of sRPE by examining the test-retest reliability of sRPE during 3 different intensities of controlled intermittent running (10, 11.5, and 13 km·h⁻¹). Results from part A demonstrated strong correlations for CR10- and CR100-derived sRPE with measures of internal training load (Banister's TRIMP and Edwards' TRIMP) (CR10: r = 0.83 and 0.83, and CR100: r = 0.80 and 0.81, p < 0.05). Correlations between sRPE and external training load (distance, higher speed running and player load) for both the CR10 (r = 0.81, 0.71, and 0.83) and CR100 (r = 0.78, 0.69, and 0.80) were significant (p < 0.05). Results from part B demonstrated poor reliability for both the CR10 (31.9% CV) and CR100 (38.6% CV) RPE scales after short bouts of intermittent running. Collectively, these results suggest both CR10- and CR100-derived sRPE methods have good construct validity for assessing training load in AF. The poor levels of reliability revealed under field testing indicate that the sRPE method may not be sensitive enough to detect small changes in exercise intensity during brief intermittent running bouts. Despite this limitation, the sRPE remains a valid method to quantify training loads in high-intensity, intermittent team sport.
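The reliability figures in the abstract above are reported as a coefficient of variation (CV). The abstract does not say how the CV was derived, so the following is only a sketch of the simplest definition (sample standard deviation divided by the mean, as a percentage); the function name and the ratings are hypothetical and are not data from the study.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return the coefficient of variation of repeated measurements, in percent."""
    average = mean(values)
    if average == 0:
        raise ValueError("CV is undefined when the mean is zero")
    # Sample standard deviation (n - 1 denominator) relative to the mean.
    return 100 * stdev(values) / average


# Hypothetical repeated ratings from one athlete at one running speed.
repeated_ratings = [4, 6]
cv_percent = coefficient_of_variation(repeated_ratings)
```

Under this definition a larger CV means that repeated ratings of the same effort scatter more widely, which is the sense in which values above 30% are described as poor reliability.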
The Brighton musculoskeletal Patient-Reported Outcome Measure (BmPROM): An assessment of validity, reliability, and responsiveness.

PubMed

Bryant, Elizabeth; Murtagh, Shemane; Finucane, Laura; McCrum, Carol; Mercer, Christopher; Smith, Toby; Canby, Guy; Rowe, David A; Moore, Ann P

2018-05-11

In response to the need for a freely available, stand-alone, validated outcome measure for use within musculoskeletal (MSK) physiotherapy practice, sensitive enough to measure clinical effectiveness, we developed an MSK patient reported outcome measure. This study examined the validity and reliability of the newly developed Brighton musculoskeletal Patient-Reported Outcome Measure (BmPROM) within physiotherapy outpatient settings. Two hundred twenty-four patients attending physiotherapy outpatient departments in South East England with an MSK condition participated in this study. The BmPROM was assessed for user friendliness (rated feedback, N = 224), reliability (internal consistency and test-retest reliability, n = 42), validity (internal and external construct validity, N = 224), and responsiveness (internal, n = 25). Exploratory factor analysis indicated that a two-factor model provides a good fit to the data. Factors were representative of "Functionality" and "Wellbeing". Correlations observed between the BmPROM and SF-36 domains provided evidence of convergent validity. Reliability results indicated that both subscales were internally consistent with alphas above the acceptable limits for both "Functionality" (α = .85, 95% CI [.81, .88]) and "Wellbeing" (α = .80, 95% CI [.75, .84]). Test-retest analyses (n = 42) demonstrated a high degree of reliability for both "Functionality" (ICC = .84; 95% CI [.72, .91]) and "Wellbeing" scores (ICC = .84; 95% CI [.72, .91]). Further examination of test-retest reliability through the Bland-Altman analysis demonstrated that the difference between "Functionality" and "Wellbeing" test scores did not vary as a function of absolute test score. Large treatment effect sizes were found for both subscales (Functionality d = 1.10; Wellbeing d = 1.03). The BmPROM is a reliable and valid outcome measure for use in evaluating physiotherapy treatment of MSK conditions. Copyright © 2018 John Wiley & Sons, Ltd.
[Internal consistency and criterion validity and reliability of the Mexican Version of the Child Behavior Checklist 1.5-5 (CBCL/1.5-5)].

PubMed

Albores-Gallo, Lilia; Hernández-Guzmán, Laura; Hasfura-Buenaga, Cecilia; Navarro-Luna, Enrique

To investigate the validity and internal consistency of the Mexican version of the CBCL/1.5-5 that assesses the most common psychopathology in pre-school children in clinical and epidemiological settings. A total of 438 parents from two groups, clinical-psychiatric (N = 62) and community (N = 376), completed the CBCL/1.5-5/Mexican version. The internal consistency was high for total problems (α=0.95) and for the internalized (α=0.89) and externalized (α=0.91) subscales. The test-retest reliability (one week) using the intraclass correlation coefficient (ICC) was ≥ 0.95 for the internalized, externalized, and total problems subscales. The ROC curve for the criterion status of clinically-referred vs. non-referred using the total problems scale ≥ 24 resulted in an AUC (area under curve) of 0.77, a specificity of 0.73, and a sensitivity of 0.70. The CBCL/1.5-5/Mexican version is a reliable and valid tool. Copyright © 2016 Sociedad Chilena de Pediatría. Publicado por Elsevier España, S.L.U. All rights reserved.
Developing an index to measure the voluntariness of consent to research.

PubMed

Dugosh, Karen L; Festinger, David S; Marlowe, Douglas B; Clements, Nicolle T

2014-10-01

The goals of the current study were to expand the content domain and further validate the Coercion Assessment Scale (CAS), a measure of perceived coercion for criminally involved substance abusers being recruited into research. Unlike the few existing measures of this construct, the CAS identifies specific external sources of pressure that may influence one's decision to participate. In Phase 1, we conducted focus groups with criminal justice clients and stakeholders to expand the instrument by identifying additional sources of pressure. In Phase 2, we evaluated the expanded measure (i.e., endorsement rates, reliability, validity) in an ongoing research trial. Results identified new sources of pressure and provided evidence supporting the CAS's utility and reliability over time as well as convergent and discriminative validity. © The Author(s) 2014.
External Validation of the Acoustic Voice Quality Index Version 03.01 With Extended Representativity.

PubMed

Barsties, Ben; Maryn, Youri

2016-07-01

The Acoustic Voice Quality Index (AVQI) is an objective method to quantify the severity of overall voice quality in concatenated continuous speech and sustained phonation segments. Recently, AVQI was successfully modified to be more representative and ecologically valid because the internal consistency of AVQI was balanced out through equal proportion of the 2 speech types. The present investigation aims to explore its external validation in a large data set. An expert panel of 12 speech-language therapists rated the voice quality of 1058 concatenated voice samples varying from normophonia to severe dysphonia. The Spearman rank-order correlation coefficients (r) were used to measure concurrent validity. The AVQI's diagnostic accuracy was evaluated with several estimates of its receiver operating characteristics (ROC). Finally, 8 of the 12 experts were chosen because of reliability criteria. A strong correlation was identified between AVQI and auditory-perceptual rating (r = 0.815, P = .000). It indicated that 66.4% of the auditory-perceptual rating's variation was explained by AVQI. Additionally, the ROC results showed again the best diagnostic outcome at a threshold of AVQI = 2.43. This study highlights external validation and diagnostic precision of the AVQI version 03.01 as a robust and ecologically valid measurement to objectify voice quality. © The Author(s) 2016.
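The 66.4% figure in the abstract above is consistent with squaring the reported correlation coefficient (the coefficient of determination). A short check, with a hypothetical helper name:

```python
def variance_explained_percent(r):
    """Percentage of variation accounted for, taken as the squared correlation."""
    return round(r ** 2 * 100, 1)


# r = 0.815 is the correlation reported in the abstract.
explained = variance_explained_percent(0.815)  # 0.815 ** 2 = 0.664225
```

Squaring a rank-order (Spearman) coefficient is only an approximate guide to shared variance, so the percentage is best read as a descriptive summary.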
Insightful practice: a reliable measure for medical revalidation

PubMed Central

Guthrie, Bruce; Sullivan, Frank M; Mercer, Stewart W; Russell, Andrew; Bruce, David A

2012-01-01

Background: Medical revalidation decisions need to be reliable if they are to reassure on the quality and safety of professional practice. This study tested an innovative method in which general practitioners (GPs) were assessed on their reflection and response to a set of externally specified feedback. Setting and participants: 60 GPs and 12 GP appraisers in the Tayside region of Scotland, UK. Methods: A feedback dataset was specified as (1) GP-specific data collected by GPs themselves (patient and colleague opinion; open book self-evaluated knowledge test; complaints) and (2) externally collected practice-level data provided to GPs (clinical quality and prescribing safety). GPs' perceptions of whether the feedback covered UK General Medical Council specified attributes of a ‘good doctor’ were examined using a mapping exercise. GPs' professionalism was examined in terms of appraiser assessment of GPs' level of insightful practice, defined as: engagement with, insight into and appropriate action on feedback data. The reliability of assessment of insightful practice and subsequent recommendations on GPs' revalidation by face-to-face and anonymous assessors were investigated using Generalisability G-theory. Main outcome measures: Coverage of General Medical Council attributes by specified feedback and reliability of assessor recommendations on doctors' suitability for revalidation. Results: Face-to-face assessment proved unreliable. Anonymous global assessment by three appraisers of insightful practice was highly reliable (G=0.85), as were revalidation decisions using four anonymous assessors (G=0.83). Conclusions: Unlike face-to-face appraisal, anonymous assessment of insightful practice offers a valid and reliable method to decide GP revalidation. Further validity studies are needed. PMID:22653078
Introducing the Professionalism Mini-Evaluation Exercise (P-MEX) in Japan: results from a multicenter, cross-sectional study.

PubMed

Tsugawa, Yusuke; Ohbu, Sadayoshi; Cruess, Richard; Cruess, Sylvia; Okubo, Tomoya; Takahashi, Osamu; Tokuda, Yasuharu; Heist, Brian S; Bito, Seiji; Itoh, Toshiyuki; Aoki, Akiko; Chiba, Tsutomu; Fukui, Tsuguya

2011-08-01

Despite the growing importance of and interest in medical professionalism, there is no standardized tool for its measurement. The authors sought to verify the validity, reliability, and generalizability of the Professionalism Mini-Evaluation Exercise (P-MEX), a previously developed and tested tool, in the context of Japanese hospitals. A multicenter, cross-sectional evaluation study was performed to investigate the validity, reliability, and generalizability of the P-MEX in seven Japanese hospitals. In 2009-2010, 378 evaluators (attending physicians, nurses, peers, and junior residents) completed 360-degree assessments of 165 residents and fellows using the P-MEX. The content validity and criterion-related validity were examined, and the construct validity of the P-MEX was investigated by performing confirmatory factor analysis through a structural equation model. The reliability was tested using generalizability analysis. The contents of the P-MEX achieved good acceptance in a preliminary working group, and the poststudy survey revealed that 302 (79.9%) evaluators rated the P-MEX items as appropriate, indicating good content validity. The correlation coefficient between P-MEX scores and external criteria was 0.78 (P < .001), demonstrating good criterion-related validity. Confirmatory factor analysis verified high path coefficients (0.60-0.99) and adequate goodness of fit of the model. The generalizability analysis yielded a high dependability coefficient, suggesting good reliability, except when evaluators were peers or junior residents. Findings show evidence of adequate validity, reliability, and generalizability of the P-MEX in Japanese hospital settings. The P-MEX is the only evaluation tool for medical professionalism verified in both a Western and East Asian cultural context.
Preliminary Validity of the Eyberg Child Behavior Inventory With Filipino Immigrant Parents

PubMed Central

Coffey, Dean M.; Javier, Joyce R.; Schrager, Sheree M.

2016-01-01

Filipinos are an understudied minority affected by significant behavioral health disparities. We evaluate evidence for the reliability, construct validity, and convergent validity of the Eyberg Child Behavior Inventory (ECBI) in 6- to 12- year old Filipino children (N = 23). ECBI scores demonstrated high internal consistency, supporting a single-factor model (pre-intervention α =.91; post-intervention α =.95). Results document convergent validity with the Child Behavior Checklist Externalizing scale at pretest (r = .54, p < .01) and posttest (r = .71, p < .001). We conclude that the ECBI is a promising tool to measure behavior problems in Filipino children. PMID:27087739

Preliminary Validity of the Eyberg Child Behavior Inventory With Filipino Immigrant Parents.

PubMed

Coffey, Dean M; Javier, Joyce R; Schrager, Sheree M

Filipinos are an understudied minority affected by significant behavioral health disparities. We evaluate evidence for the reliability, construct validity, and convergent validity of the Eyberg Child Behavior Inventory (ECBI) in 6- to 12-year-old Filipino children (N = 23). ECBI scores demonstrated high internal consistency, supporting a single-factor model (pre-intervention α = .91; post-intervention α = .95). Results document convergent validity with the Child Behavior Checklist Externalizing scale at pretest (r = .54, p < .01) and posttest (r = .71, p < .001). We conclude that the ECBI is a promising tool to measure behavior problems in Filipino children.
Utility of the Personality Inventory for DSM-5-Brief Form (PID-5-BF) in the Measurement of Maladaptive Personality and Psychopathology.

PubMed

Anderson, Jaime L; Sellbom, Martin; Salekin, Randall T

2018-07-01

The Diagnostic and Statistical Manual of Mental Disorders-Fifth edition (DSM-5) Personality and Personality Disorders workgroup developed the Personality Inventory for the DSM-5 (PID-5) for the assessment of the alternative trait model for DSM-5. Along with this measure, the American Psychiatric Association published an abbreviated version, the PID-5-Brief form (PID-5-BF). Although this measure is available on the DSM-5 website for use, only two studies have evaluated its psychometric properties and validity and no studies have examined the U.S. version of this measure. The current study evaluated the reliability, factor structure, and construct validity of PID-5-BF scale scores. This included an evaluation of the scales' associations with Section II PDs, a well-validated dimensional measure of personality psychopathology, and broad externalizing and internalizing psychopathology measures. We found support for the reliability of PID-5-BF scales as well as for the factor structure of the measure. Furthermore, a series of correlation and regression analyses showed conceptually expected associations between PID-5-BF and external criterion variables. Finally, we compared the correlations with external criterion measures to those of the full-length PID-5 and PID-5-Short form. Intraclass correlation analyses revealed a comparable pattern of correlations across all three measures, thereby supporting the use of the PID-5-BF as a screening measure of dimensional maladaptive personality traits.
Pre- and Post-Planned Evaluation: Which Is Preferable?

ERIC Educational Resources Information Center

Strasser, Stephen; Deniston, O. Lynn

1978-01-01

Factors involved in pre-planned and post-planned evaluation of program effectiveness are compared: (1) reliability and cost of data; (2) internal and external validity; (3) obtrusiveness and threat; (4) goal displacement and program direction. A model to help program administrators decide which approach is more appropriate is presented. (Author/MH)
Can I Trust ORE Reports?

ERIC Educational Resources Information Center

Feedback, 1984

1984-01-01

This issue of FEEDBACK, a newsletter produced by the Austin Independent School District Office of Research and Evaluation (ORE), illustrates the accuracy, validity, and fairness of ORE reports. The independence of the reports is explained. Internal and external quality controls are used to ensure reliability and accuracy of the reports.…
Implementing the undergraduate mini-CEX: a tailored approach at Southampton University.

PubMed

Hill, Faith; Kendall, Kathleen; Galbraith, Kevin; Crossley, Jim

2009-04-01

The mini-clinical evaluation exercise (mini-CEX) is widely used in the UK to assess clinical competence, but there is little evidence regarding its implementation in the undergraduate setting. This study aimed to estimate the validity and reliability of the undergraduate mini-CEX and discuss the challenges involved in its implementation. A total of 3499 mini-CEX forms were completed. Validity was assessed by estimating associations between mini-CEX score and a number of external variables, examining the internal structure of the instrument, checking competency domain response rates and profiles against expectations, and by qualitative evaluation of stakeholder interviews. Reliability was evaluated by overall reliability coefficient (R), estimation of the standard error of measurement (SEM), and from stakeholders' perceptions. Variance component analysis examined the contribution of relevant factors to students' scores. Validity was threatened by various confounding variables, including: examiner status; case complexity; attachment specialty; patient gender, and case focus. Factor analysis suggested that competency domains reflect a single latent variable. Maximum reliability can be achieved by aggregating scores over 15 encounters (R = 0.73; 95% confidence interval [CI] +/- 0.28 based on a 6-point assessment scale). Examiner stringency contributed 29% of score variation and student attachment aptitude 13%. Stakeholder interviews revealed staff development needs but the majority perceived the mini-CEX as more reliable and valid than the previous long case. The mini-CEX has good overall utility for assessing aspects of the clinical encounter in an undergraduate setting. Strengths include fidelity, wide sampling, perceived validity, and formative observation and feedback. Reliability is limited by variable examiner stringency, and validity by confounding variables, but these should be viewed within the context of overall assessment strategies.
External validation of the DHAKA score and comparison with the current IMCI algorithm for the assessment of dehydration in children with diarrhoea: a prospective cohort study.

PubMed

Levine, Adam C; Glavis-Bloom, Justin; Modi, Payal; Nasrin, Sabiha; Atika, Bita; Rege, Soham; Robertson, Sarah; Schmid, Christopher H; Alam, Nur H

2016-10-01

Dehydration due to diarrhoea is a leading cause of child death worldwide, yet no clinical tools for assessing dehydration have been validated in resource-limited settings. The Dehydration: Assessing Kids Accurately (DHAKA) score was derived for assessing dehydration in children with diarrhoea in a low-income country setting. In this study, we aimed to externally validate the DHAKA score in a new population of children and compare its accuracy and reliability to the current Integrated Management of Childhood Illness (IMCI) algorithm. DHAKA was a prospective cohort study done in children younger than 60 months presenting to the International Centre for Diarrhoeal Disease Research, Bangladesh, with acute diarrhoea (defined by WHO as three or more loose stools per day for less than 14 days). Local nurses assessed children and classified their dehydration status using both the DHAKA score and the IMCI algorithm. Serial weights were obtained and dehydration status was established by percentage weight change with rehydration. We did regression analyses to validate the DHAKA score and compared the accuracy and reliability of the DHAKA score and IMCI algorithm with receiver operator characteristic (ROC) curves and the weighted κ statistic. This study was registered with ClinicalTrials.gov, number NCT02007733. Between March 22, 2015, and May 15, 2015, 496 patients were included in our primary analyses. On the basis of our criterion standard, 242 (49%) of 496 children had no dehydration, 184 (37%) of 496 had some dehydration, and 70 (14%) of 496 had severe dehydration. In multivariable regression analyses, each 1-point increase in the DHAKA score predicted an increase of 0·6% in the percentage dehydration of the child and increased the odds of both some and severe dehydration by a factor of 1·4. Both the accuracy and reliability of the DHAKA score were significantly greater than those of the IMCI algorithm. 
The DHAKA score is the first clinical tool for assessing dehydration in children with acute diarrhoea to be externally validated in a low-income country. Further validation studies in a diverse range of settings and paediatric populations are warranted. Funding: National Institutes of Health Fogarty International Center. Copyright © 2016 The Author(s). Published by Elsevier Ltd. This is an Open Access article under the CC BY license.
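The per-point effects reported in this abstract can be turned into a small worked calculation. The sketch below assumes the per-point effect is constant across the score range, so that odds factors multiply and percentage changes add; the function names are hypothetical and are not taken from the study.

```python
# Per-point effects as reported in the abstract.
ODDS_FACTOR_PER_POINT = 1.4        # odds of some/severe dehydration multiply by 1.4
PCT_DEHYDRATION_PER_POINT = 0.6    # percentage dehydration rises by 0.6 points

def odds_multiplier(point_increase):
    """Factor by which the odds change for a given score increase.

    Assumes a constant per-point odds factor (linear in log-odds).
    """
    return ODDS_FACTOR_PER_POINT ** point_increase

def added_percent_dehydration(point_increase):
    """Predicted rise in percentage dehydration for a given score increase.

    Assumes a constant per-point slope.
    """
    return PCT_DEHYDRATION_PER_POINT * point_increase

if __name__ == "__main__":
    for points in (1, 2, 3):
        print(points,
              round(odds_multiplier(points), 3),
              round(added_percent_dehydration(points), 1))
```

Under these assumptions, a child scoring 3 points higher would have about 2.7 times the odds of dehydration and a predicted percentage dehydration about 1.8 points higher.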
External validation of the DHAKA score and comparison with the current IMCI algorithm for the assessment of dehydration in children with diarrhoea: a prospective cohort study

PubMed Central

Levine, Adam C; Glavis-Bloom, Justin; Modi, Payal; Nasrin, Sabiha; Atika, Bita; Rege, Soham; Robertson, Sarah; Schmid, Christopher H; Alam, Nur H

2016-01-01

Summary. Background: Dehydration due to diarrhoea is a leading cause of child death worldwide, yet no clinical tools for assessing dehydration have been validated in resource-limited settings. The Dehydration: Assessing Kids Accurately (DHAKA) score was derived for assessing dehydration in children with diarrhoea in a low-income country setting. In this study, we aimed to externally validate the DHAKA score in a new population of children and compare its accuracy and reliability to the current Integrated Management of Childhood Illness (IMCI) algorithm. Methods: DHAKA was a prospective cohort study done in children younger than 60 months presenting to the International Centre for Diarrhoeal Disease Research, Bangladesh, with acute diarrhoea (defined by WHO as three or more loose stools per day for less than 14 days). Local nurses assessed children and classified their dehydration status using both the DHAKA score and the IMCI algorithm. Serial weights were obtained and dehydration status was established by percentage weight change with rehydration. We did regression analyses to validate the DHAKA score and compared the accuracy and reliability of the DHAKA score and IMCI algorithm with receiver operator characteristic (ROC) curves and the weighted κ statistic. This study was registered with ClinicalTrials.gov, number NCT02007733. Findings: Between March 22, 2015, and May 15, 2015, 496 patients were included in our primary analyses. On the basis of our criterion standard, 242 (49%) of 496 children had no dehydration, 184 (37%) of 496 had some dehydration, and 70 (14%) of 496 had severe dehydration. In multivariable regression analyses, each 1-point increase in the DHAKA score predicted an increase of 0·6% in the percentage dehydration of the child and increased the odds of both some and severe dehydration by a factor of 1·4. Both the accuracy and reliability of the DHAKA score were significantly greater than those of the IMCI algorithm. 
Interpretation: The DHAKA score is the first clinical tool for assessing dehydration in children with acute diarrhoea to be externally validated in a low-income country. Further validation studies in a diverse range of settings and paediatric populations are warranted. Funding: National Institutes of Health Fogarty International Center. PMID:27567350
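The weighted κ statistic mentioned in this abstract measures agreement between two ordinal classifications while giving partial credit for near misses. The sketch below is a minimal illustration with linear disagreement weights. The abstract does not state which weighting scheme the authors used, so the linear weights, the function name and the sample ratings are all assumptions.

```python
def weighted_kappa(rater_a, rater_b, n_categories):
    """Linear-weighted Cohen's kappa for two lists of ordinal codes 0..k-1."""
    k = n_categories
    n = len(rater_a)
    # Cross-tabulate the two raters' classifications.
    observed = [[0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1
    row_totals = [sum(observed[i]) for i in range(k)]
    col_totals = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    observed_disagreement = 0.0
    expected_disagreement = 0.0
    for i in range(k):
        for j in range(k):
            weight = abs(i - j) / (k - 1)  # 0 on the diagonal, 1 at the extremes
            observed_disagreement += weight * observed[i][j]
            # Expected count under independence of the two raters.
            expected_disagreement += weight * row_totals[i] * col_totals[j] / n
    return 1.0 - observed_disagreement / expected_disagreement

if __name__ == "__main__":
    # Hypothetical ratings: 0 = no, 1 = some, 2 = severe dehydration.
    first = [0, 0, 1, 1, 2, 2]
    second = [0, 1, 1, 2, 2, 2]
    print(weighted_kappa(first, second, 3))
```

A value of 1 indicates perfect agreement and 0 indicates agreement no better than chance. The function is undefined when the expected disagreement is zero, for example when both raters always use the same single category.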
Adaptation and Validation of the Brief Sexual Opinion Survey (SOS) in a Colombian Sample and Factorial Equivalence with the Spanish Version

PubMed Central

Sierra, Juan Carlos; Soler, Franklin

2016-01-01

Attitudes toward sexuality are a key variable for sexual health. It is important for psychology and education to have adapted and validated questionnaires to evaluate these attitudes. Therefore, the objective of this research was to adapt, validate and calculate the equivalence of the Colombia Sexual Opinion Survey as compared to the same survey from Spain. To this end, a total of eight experts were consulted and 1,167 subjects from Colombia and Spain answered the Sexual Opinion Survey, the Sexual Assertiveness Scale, the Massachusetts General Hospital-Sexual Functioning Questionnaire, and the Sexuality Scale. The evaluation was conducted online and the results show adequate qualitative and quantitative properties of the items, with adequate reliability and external validity and compliance with strong invariance between the two countries. Consequently, the Colombia Sexual Opinion Survey is a valid and reliable scale and its scores can be compared with those from the Spain survey, with minimum bias. PMID:27627114
Development and validation of a new instrument for testing functional health literacy in Japanese adults.

PubMed

Nakagami, Katsuyuki; Yamauchi, Toyoaki; Noguchi, Hiroyuki; Maeda, Tohru; Nakagami, Tomoko

2014-06-01

This study aimed to develop a reliable and valid measure of functional health literacy in a Japanese clinical setting. Test development consisted of three phases: generation of an item pool, consultation with experts to assess content validity, and comparison with external criteria (the Japanese Health Knowledge Test) to assess criterion validity. A trial version of the test was administered to 535 Japanese outpatients. Internal consistency reliability, calculated by Cronbach's alpha, was 0.81, and concurrent validity was moderate. Receiver Operating Characteristics and Item Response Theory were used to classify patients as having adequate, marginal, or inadequate functional health literacy. Both inadequate and marginal functional health literacy were associated with older age, lower income, lower educational attainment, and poor health knowledge. The time required to complete the test was 10-15 min. This test should enable health workers to better identify patients with inadequate health literacy. © 2013 Wiley Publishing Asia Pty Ltd.
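Internal consistency of the kind reported here is usually summarised with Cronbach's alpha, which compares the sum of the item variances with the variance of the total score. The sketch below shows the standard formula on made-up data; the function name and the sample responses are hypothetical and are not from the study.

```python
from statistics import variance

def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(responses[0])
    # Sample variance of each item across respondents.
    item_variances = [variance([resp[i] for resp in responses]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(resp) for resp in responses])
    return (k / (k - 1)) * (1.0 - sum(item_variances) / total_variance)

if __name__ == "__main__":
    # Hypothetical data: three respondents answering two items.
    sample = [[1, 2], [2, 1], [3, 3]]
    print(round(cronbach_alpha(sample), 3))
```

Alpha rises toward 1 as the items covary more strongly. It needs at least two items and at least two respondents, and it is undefined when every respondent has the same total score.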
Adaptation and Validation of the Brief Sexual Opinion Survey (SOS) in a Colombian Sample and Factorial Equivalence with the Spanish Version.

PubMed

Vallejo-Medina, Pablo; Marchal-Bertrand, Laurent; Gómez-Lugo, Mayra; Espada, José Pedro; Sierra, Juan Carlos; Soler, Franklin; Morales, Alexandra

2016-01-01

Attitudes toward sexuality are a key variable for sexual health. It is important for psychology and education to have adapted and validated questionnaires to evaluate these attitudes. Therefore, the objective of this research was to adapt, validate and calculate the equivalence of the Colombia Sexual Opinion Survey as compared to the same survey from Spain. To this end, a total of eight experts were consulted and 1,167 subjects from Colombia and Spain answered the Sexual Opinion Survey, the Sexual Assertiveness Scale, the Massachusetts General Hospital-Sexual Functioning Questionnaire, and the Sexuality Scale. The evaluation was conducted online and the results show adequate qualitative and quantitative properties of the items, with adequate reliability and external validity and compliance with strong invariance between the two countries. Consequently, the Colombia Sexual Opinion Survey is a valid and reliable scale and its scores can be compared with those from the Spain survey, with minimum bias.
Validation of a short food frequency questionnaire to evaluate nutritional lifestyles in hypercholesterolemic patients.

PubMed

Béliard, Sophie; Coudert, Mathieu; Valéro, René; Charbonnier, Laurie; Duchêne, Emilie; Allaert, François André; Bruckert, Éric

2012-12-01

The purpose of our study was to develop and validate a short food frequency questionnaire (FFQ) which could assess the nutritional lifestyles of hypercholesterolemic patients consulting in daily practice. The questionnaire explores 11 nutrient categories. One hundred and thirty-one patients were recruited for the construct validity and 58 patients for the external validity in La Pitié Hospital, Paris. The reference method used was the diet history. To measure the internal consistency and to test the sensitivity to change on a large scale, the questionnaire was used in an observational study conducted in Spain in 1048 moderately hypercholesterolemic patients. Psychometric analyses included construct validity, internal consistency, test-retest reliability, external validity and sensitivity to change. Validation of the questionnaire indicated a good internal consistency (Cronbach Coefficient Alpha at 0.69) and test-retest reliability (intraclass correlation coefficient=0.89). The correlation between the scores of the FFQ and those of the diet history was significant with a Pearson correlation coefficient at 0.3 (P=0.029). The comparison between the ranking of the patients showed an agreement of 72% with a kappa of 0.48 [0.10; 0.69]. The sensitivity to change was good, with scores improving one and four months after nutritional advice: 28.2% of patients ranked in group 1 at inclusion versus 61.3% (P<0.0001) at one month and 75.2% (P<0.0001) at four months. In conclusion, we developed and validated a food questionnaire for hypercholesterolemic patients, which can be used as a therapeutic education tool in daily practice or in clinical research. Copyright © 2012. Published by Elsevier Masson SAS.
Breaking Out of the Lab: Measuring Real-Time Responses to Televised Political Content in Real-World Settings.

PubMed

Maier, Jürgen; Hampe, J Felix; Jahn, Nico

2016-01-01

Real-time response (RTR) measurement is an important technique for analyzing human processing of electronic media stimuli. Although it has been demonstrated that RTR data are reliable and internally valid, some argue that they lack external validity. The reason for this is that RTR measurement is restricted to a laboratory environment due to its technical requirements. This paper introduces a smartphone app that 1) captures real-time responses using the dial technique and 2) provides a solution for one of the most important problems in RTR measurement, the (automatic) synchronization of RTR data. In addition, it explores the reliability and validity of mobile RTR measurement by comparing the real-time reactions of two samples of young and well-educated voters to the 2013 German televised debate. Whereas the first sample participated in a classical laboratory study, the second sample was equipped with our mobile RTR system and watched the debate at home. Results indicate that the mobile RTR system yields similar results to the lab-based RTR measurement, providing evidence that laboratory studies using RTR are externally valid. In particular, the argument that the artificial reception situation creates artificial results has to be questioned. In addition, we conclude that RTR measurement outside the lab is possible. Hence, mobile RTR opens the door for large-scale studies to better understand the processing and impact of electronic media content.
The integral inventory for depression, a new, self-rated clinimetric instrument for the emotional and painful dimensions in major depressive disorder.

PubMed

Dueñas, Héctor; Lara, Carmen; Walton, Richard J; Granger, Renee E; Dossenbach, Martin; Raskin, Joel

2011-09-01

To assess the reliability and validity of the Integral Inventory for Depression (IID) scale using post hoc analyses of data from a multi-country study (ClinicalTrials.gov: NCT00561509) of patients with major depressive disorder (MDD). Patients (N = 1629) completed the IID (comprising two separate dimensions for emotional and physically painful symptoms; maximum score of 65) and a reference scale (16-item Quick Inventory of Depressive Symptomatology Self-Report) at baseline and at follow-up (8 and 24 weeks). Physicians rated MDD symptoms using the Clinical Global Impressions of Severity scale at each visit. Inter-item correlation, internal consistency, external validity, factor structure, and exploratory analysis of an optimal severity cut-off point were assessed. The IID displayed two distinct dimensions (i.e. painful and emotional) with little item redundancy and good internal consistency (Cronbach's α > 0.83 at each visit). The IID displayed good external validity (Pearson's correlation coefficients >0.60 at each visit) and statistically significant agreement (McNemar's test; P < 0.001 at follow-up) with the reference scale. Results suggest that a cut-off score of ≤24 had adequate precision (>80%) to identify patients with and without moderate MDD. Results suggest that the IID may be a reliable and valid tool for assessing emotional and painful symptoms of MDD.
Breaking Out of the Lab

PubMed Central

Maier, Jürgen; Hampe, J. Felix; Jahn, Nico

2016-01-01

Real-time response (RTR) measurement is an important technique for analyzing human processing of electronic media stimuli. Although it has been demonstrated that RTR data are reliable and internally valid, some argue that they lack external validity. The reason for this is that RTR measurement is restricted to a laboratory environment due to its technical requirements. This paper introduces a smartphone app that 1) captures real-time responses using the dial technique and 2) provides a solution for one of the most important problems in RTR measurement, the (automatic) synchronization of RTR data. In addition, it explores the reliability and validity of mobile RTR measurement by comparing the real-time reactions of two samples of young and well-educated voters to the 2013 German televised debate. Whereas the first sample participated in a classical laboratory study, the second sample was equipped with our mobile RTR system and watched the debate at home. Results indicate that the mobile RTR system yields similar results to the lab-based RTR measurement, providing evidence that laboratory studies using RTR are externally valid. In particular, the argument that the artificial reception situation creates artificial results has to be questioned. In addition, we conclude that RTR measurement outside the lab is possible. Hence, mobile RTR opens the door for large-scale studies to better understand the processing and impact of electronic media content. PMID:27274577
Measurement of body temperature in adult patients: comparative study of accuracy, reliability and validity of different devices.

PubMed

Rubia-Rubia, J; Arias, A; Sierra, A; Aguirre-Jaime, A

2011-07-01

We compared a range of alternative devices with core body temperature measured at the pulmonary artery to identify the most valid and reliable instrument for measuring temperature in routine conditions in health services. The study included 201 patients from the intensive care unit of the Candelaria University Hospital, Canary Islands, admitted to hospital between April 2006 and July 2007. All patients (or their families) gave informed consent. Readings from gallium-in-glass, reactive strip and digital in axilla, infra-red ear and frontal thermometers were compared with the pulmonary artery core temperature simultaneously. External factors suspected of having an influence on the differences were explored. The cut-off point readings for each thermometer were fixed for the maximum negative predictive value in comparison with the core temperature. The validity, reliability, accuracy, external influence, the waste they generated, ease of use, speed, durability, security, comfort and cost of each thermometer were evaluated. An ad hoc overall valuation score was obtained from these parameters for each instrument. For an error of ± 0.2°C and concordance with respect to fever, the gallium-in-glass thermometer gave the best results. The largest area under the receiver operating characteristic (ROC) curve was obtained by the digital axillar thermometer with probe (0.988 ± 0.007). The minimum difference between readings was given by the infrared ear thermometer, in comparison with the core temperature (-0.1 ± 0.3°C). Age, weight, level of consciousness, male sex, environmental temperature and vaso-constrictor medication increase the difference in the readings and fever treatment reduces it, although this is not the same for all thermometers. The compact digital axillar thermometer and the digital thermometer with probe obtained the highest overall valuation score. 
If we only evaluate the aspects of validity, reliability, accuracy and external influence, the best thermometer would be the gallium-in-glass after 12 min. The gallium-in-glass thermometer is less accurate after only 5 min in comparison with the reading taken after being placed for 12 min. If we add the evaluation of waste production, ease-of-use, speed, durability, security, patient comfort and costs, the thermometers that obtain the highest score are the compact digital and digital with probe in right axilla. Copyright © 2010 Elsevier Ltd. All rights reserved.
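The area under the ROC curve reported in this abstract can be read as the probability that a randomly chosen febrile patient has a higher device reading than a randomly chosen non-febrile patient. The sketch below computes it by direct pairwise comparison. The function name and the temperature readings are hypothetical, and the abstract does not say how the authors computed the area.

```python
def roc_auc(readings_positive, readings_negative):
    """Area under the ROC curve via pairwise comparison (ties count one half)."""
    wins = 0.0
    for pos in readings_positive:
        for neg in readings_negative:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(readings_positive) * len(readings_negative))

if __name__ == "__main__":
    # Hypothetical axillary readings in degrees Celsius.
    with_fever = [38.5, 39.0, 38.2]
    without_fever = [36.8, 37.0, 38.2]
    print(round(roc_auc(with_fever, without_fever), 3))
```

A value of 0.5 corresponds to a device that cannot separate the two groups, and a value near 1 to almost perfect separation. The pairwise form is quadratic in the sample size, which is acceptable for a sample of a few hundred patients.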
Are cannabis prevalence estimates comparable across countries and regions? A cross-cultural validation using search engine query data.

PubMed

Steppan, Martin; Kraus, Ludwig; Piontek, Daniela; Siciliano, Valeria

2013-01-01

Prevalence estimation of cannabis use is usually based on self-report data. Although there is evidence on the reliability of this data source, its cross-cultural validity is still a major concern. External objective criteria are needed for this purpose. In this study, cannabis-related search engine query data are used as an external criterion. Data on cannabis use were taken from the 2007 European School Survey Project on Alcohol and Other Drugs (ESPAD). Provincial data came from three Italian nation-wide studies using the same methodology (2006-2008; ESPAD-Italia). Information on cannabis-related search engine query data was based on Google search volume indices (GSI). (1) Reliability analysis was conducted for GSI. (2) Latent measurement models of "true" cannabis prevalence were tested using perceived availability, web-based cannabis searches and self-reported prevalence as indicators. (3) Structure models were set up to test the influences of response tendencies and geographical position (latitude, longitude). In order to test the stability of the models, analyses were conducted on country level (Europe, US) and on provincial level in Italy. Cannabis-related GSI were found to be highly reliable and constant over time. The overall measurement model was highly significant in both data sets. On country level, no significant effects of response bias indicators and geographical position on perceived availability, web-based cannabis searches and self-reported prevalence were found. On provincial level, latitude had a significant positive effect on availability indicating that perceived availability of cannabis in northern Italy was higher than expected from the other indicators. Although GSI showed weaker associations with cannabis use than perceived availability, the findings underline the external validity and usefulness of search engine query data as external criteria. 
The findings suggest an acceptable relative comparability of national (provincial) prevalence estimates of cannabis use that are based on a common survey methodology. Search engine query data are too weak an indicator to serve as the sole basis for prevalence estimates, but in combination with other sources (waste water analysis, sales of cigarette paper) they may provide satisfactory estimates. Copyright © 2012. Published by Elsevier B.V.
QSAR study of curcumine derivatives as HIV-1 integrase inhibitors.

PubMed

Gupta, Pawan; Sharma, Anju; Garg, Prabha; Roy, Nilanjan

2013-03-01

A QSAR study was performed on curcumine derivatives as HIV-1 integrase inhibitors using multiple linear regression. The statistically significant model had a squared correlation coefficient (r²) of 0.891 and a cross-validated r² (r² cv) of 0.825. The developed model revealed that electronic, shape, size, geometry, substitution information and hydrophilicity were important atomic properties for determining the inhibitory activity of these molecules. The model was also tested successfully for external validation (r² pred = 0.849) as well as Tropsha's test for model predictability. Furthermore, domain analysis was carried out to evaluate the prediction reliability of external set molecules. The model was statistically robust and had good predictive power, and can be used for screening new molecules.
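The external-validation statistic quoted in this abstract is commonly computed by comparing prediction errors on the external set with the spread of the external activities around the training-set mean. The sketch below uses that common definition. The abstract does not give the formula the authors applied, so the definition, the function name and the sample values are assumptions.

```python
def r2_pred(observed_external, predicted_external, training_mean):
    """Predictive r-squared for an external test set.

    Uses 1 - PRESS / SS, where SS is taken around the training-set mean
    (one common convention in QSAR work; assumed here, not confirmed by the source).
    """
    press = sum((obs - pred) ** 2
                for obs, pred in zip(observed_external, predicted_external))
    ss = sum((obs - training_mean) ** 2 for obs in observed_external)
    return 1.0 - press / ss

if __name__ == "__main__":
    # Hypothetical activity values for three external molecules.
    observed = [1.0, 2.0, 3.0]
    predicted = [1.5, 2.0, 2.5]
    print(r2_pred(observed, predicted, training_mean=2.0))
```

Values close to 1 mean the model predicts unseen molecules almost as well as it fits them. Values near or below 0 mean it does no better than predicting the training mean.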
Development and validation of the ExPRESS instrument for primary health care providers' evaluation of external supervision.

PubMed

Schriver, Michael; Cubaka, Vincent Kalumire; Vedsted, Peter; Besigye, Innocent; Kallestrup, Per

2018-01-01

External supervision of primary health care facilities to monitor and improve services is common in low-income countries. Currently there are no tools to measure the quality of support in external supervision in these countries. To develop a provider-reported instrument to assess the support delivered through external supervision in Rwanda and other countries. "External supervision: Provider Evaluation of Supervisor Support" (ExPRESS) was developed in 18 steps, primarily in Rwanda. Content validity was optimised using systematic search for related instruments, interviews, translations, and relevance assessments by international supervision experts as well as local experts in Nigeria, Kenya, Uganda and Rwanda. Construct validity and reliability were examined in two separate field tests, the first using exploratory factor analysis and a test-retest design, the second for confirmatory factor analysis. We included 16 items in section A ('The most recent experience with an external supervisor'), and 13 items in section B ('The overall experience with external supervisors'). Item-content validity index was acceptable. In field test I, test-retest had acceptable kappa values and exploratory factor analysis suggested relevant factors in sections A and B used for model hypotheses. In field test II, models were tested by confirmatory factor analysis fitting a 4-factor model for section A, and a 3-factor model for section B. ExPRESS is a promising tool for evaluation of the quality of support of primary health care providers in external supervision of primary health care facilities in resource-constrained settings. ExPRESS may be used as specific feedback to external supervisors to help identify and address gaps in the supervision they provide. Further studies should determine optimal interpretation of scores and the number of respondents needed per supervisor to obtain precise results, as well as test the functionality of section B.
[Interpersonal attention management inventory: a new instrument to capture different self- and external perception skills].

PubMed

Blaser, Klaus; Zlabinger, Milena; Hinterberger, Thilo

2014-01-01

The Interpersonal Attention Management Inventory (IAMI) represents a new instrument to capture self- and external perception skills. The underlying theoretical model assumes 3 mental locations of attention (the intrapersonal space, the extrapersonal space, and the external intrapersonal space of the other). The IAMI was studied regarding its factor structure; it was shortened, and statistical values as well as first reference values were calculated based on a larger sample (n = 1089). By factor analysis, the superordinate scales could be largely validated. The shortened version with 31 items and 3 superordinate scales shows a high reliability of the global value (Cronbach's α = 0.81) and, regarding the convergent validity, a modest correlation (r = 0.41) of the global value and mindfulness, measured with the Freiburg Mindfulness Inventory (FMI). Further validation studies are invited so that the IAMI can be used as an instrument for (course) diagnosis in the therapy of psychiatric disorders as well as for research in social neuroscience, e.g., in investigations on mindfulness, compassion, empathy, theory of mind, and self-boundaries.
Emotional and tangible social support in a German population-based sample: Development and validation of the Brief Social Support Scale (BS6).

PubMed

Beutel, Manfred E; Brähler, Elmar; Wiltink, Jörg; Michal, Matthias; Klein, Eva M; Jünger, Claus; Wild, Philipp S; Münzel, Thomas; Blettner, Maria; Lackner, Karl; Nickels, Stefan; Tibubos, Ana N

2017-01-01

The aim of the study was the development and validation of the psychometric properties of a six-item bi-factorial instrument for the assessment of social support (emotional and tangible support) with a population-based sample. A cross-sectional data set of N = 15,010 participants enrolled in the Gutenberg Health Study (GHS) in 2007-2012 was divided into two sub-samples. The GHS is a population-based, prospective, observational single-center cohort study in the Rhein-Main-Region in western Mid-Germany. The first sub-sample was used for scale development by performing an exploratory factor analysis. In order to test construct validity, confirmatory factor analyses were run to compare the extracted bi-factorial model with the one-factor solution. Reliability of the scales was indicated by calculating internal consistency. External validity was tested by investigating demographic characteristics, health behavior, and distress using analysis of variance, Spearman and Pearson correlation analysis, and logistic regression analysis. Based on an exploratory factor analysis, a set of six items was extracted representing two independent factors. The two-factor structure of the Brief Social Support Scale (BS6) was confirmed by the results of the confirmatory factor analyses. Fit indices of the bi-factorial model were good and better compared to the one-factor solution. External validity was demonstrated for the BS6. The BS6 is a reliable and valid short scale that can be applied in social surveys due to its brevity to assess emotional and practical dimensions of social support.

Translation, Adaptation and Cross Language Validation of Tinnitus Handicap Inventory in Urdu.

PubMed

Aqeel, Muhammad; Ahmed, Ammar

2017-12-01

Tinnitus is characterized as the perception of auditory sounds in the absence of an external stimulus. Tinnitus can have a considerable effect on a person's quality of life, and is considered very difficult to quantify. The aim of this study was to investigate the reliability and validity of the Urdu translation of the Tinnitus Handicap Inventory (THI) in Pakistan. The THI was designed to assess the presence of various auditory sounds without an external stimulus. The scale consists of 25 items in three subscales: functional, emotional, and catastrophic. The study comprised two stages, a preliminary study and a main study. The results of the preliminary study revealed that the overall scale had high internal consistency [alpha coefficient of the Urdu version of the THI (THI-U) = 0.99, alpha coefficient of the English version of the THI = 0.98]. The test-retest correlation of the overall scale over a fifteen-day interval was 0.99. The main study was performed on 110 tinnitus patients. Its results showed high internal consistency for the Urdu version (α = 0.93). The subscales of the THI-U also demonstrated good internal consistency reliability (α = 0.81 to 0.86). High to moderate correlations were noted between tinnitus symptom ratings. A confirmatory factor analysis was used to validate the three subscales of the THI-U; high inter-correlations were found between the subscales, and a three-factor model for the THI-U was most tenable. The THI-U might provide important information about specific facets of tinnitus distress, alongside diagnostic interviews, in clinical practice.
Validation of an instrument to assess barriers to care-seeking for accidental bowel leakage in women: the BCABL questionnaire

PubMed Central

Brown, Heidi Wendell; Wise, Meg E.; Westenberg, Danielle; Schmuhl, Nicholas B.; Brezoczky, Kelly Lewis; Rogers, Rebecca G.; Constantine, Melissa L.

2017-01-01

Introduction and hypothesis: Fewer than 30% of women with accidental bowel leakage (ABL) seek care, despite the existence of effective, minimally invasive therapies. We developed and validated a condition-specific instrument to assess barriers to care-seeking for ABL in women. Methods: Adult women with ABL completed an electronic survey about condition severity, patient activation, previous care-seeking, and demographics. The Barriers to Care-seeking for Accidental Bowel Leakage (BCABL) instrument contained 42 potential items completed at baseline and again 2 weeks later. Paired t tests evaluated test–retest reliability. Factor analysis evaluated factor structure and guided item retention. Cronbach’s alpha evaluated internal consistency. Within and across factor item means generated a summary BCABL score used to evaluate scale validity with six external criterion measures. Results: Among 1,677 click-throughs, 736 (44%) entered the survey; 95% of eligible female respondents (427 out of 458) provided complete data. Fifty-three percent of respondents had previously sought care for their ABL; median age was 62 years (range 27–89); mean Vaizey score was 12.8 (SD = 5.0), indicating moderate to severe ABL. Test–retest reliability was excellent for all items. Factor extraction via oblique rotation resulted in the final structure of 16 items in six domains, within which internal consistency was high. All six external criterion measures correlated significantly with BCABL score. Conclusions: The BCABL questionnaire, with 16 items mapping to six domains, has excellent criterion validity and test–retest reliability when administered electronically in women with ABL. The BCABL can be used to identify care-seeking barriers for ABL in different populations, inform targeted interventions, and measure their effectiveness. PMID:28236039
Measuring strategies for learning regulation in medical education: scale reliability and dimensionality in a Swedish sample.

PubMed

Edelbring, Samuel

2012-08-15

The degree of learners' self-regulated learning and dependence on external regulation influence learning processes in higher education. These regulation strategies are commonly measured by questionnaires developed in other settings than in which they are being used, thereby requiring renewed validation. The aim of this study was to psychometrically evaluate the learning regulation strategy scales from the Inventory of Learning Styles with Swedish medical students (N = 206). The regulation scales were evaluated regarding their reliability, scale dimensionality and interrelations. The primary evaluation focused on dimensionality and was performed with Mokken scale analysis. To assist future scale refinement, additional item analysis, such as item-to-scale correlations, was performed. Scale scores in the Swedish sample displayed good reliability in relation to published results: Cronbach's alpha: 0.82, 0.72, and 0.65 for self-regulation, external regulation and lack of regulation scales respectively. The dimensionalities in scales were adequate for self-regulation and its subscales, whereas external regulation and lack of regulation displayed less unidimensionality. The established theoretical scales were largely replicated in the exploratory analysis. The item analysis identified two items that contributed little to their respective scales. The results indicate that these scales have an adequate capacity for detecting the three theoretically proposed learning regulation strategies in the medical education sample. Further construct validity should be sought by interpreting scale scores in relation to specific learning activities. Using established scales for measuring students' regulation strategies enables a broad empirical base for increasing knowledge on regulation strategies in relation to different disciplinary settings and contributes to theoretical development.
Development and testing of the KERNset: an instrument to assess the quality of telephone triage in out-of-hours primary care services.

PubMed

Smits, Marleen; Keizer, Ellen; Ram, Paul; Giesen, Paul

2017-12-02

Telephone triage is a core but vulnerable part of the care process at out-of-hours general practitioner (GP) cooperatives. In the Netherlands, different instruments have been used for assessing the quality of telephone triage. These instruments focussed mainly on communicational aspects, and less on the medical quality of triage decisions. Our aim was to develop and test a minimum set of items to assess the quality of telephone triage. A national survey among all GP cooperatives in the Netherlands was performed to examine the most important aspects of telephone triage. Next, corresponding items from existing instruments were searched on these topics. Subsequently, an expert panel judged these items on importance, completeness and formulation. The concept KERNset consisted of 24 items about the telephone conversation: 13 medical, ten communicational and one regarding both types. It was pilot tested on measurement characteristics, reliability, validity and variation between triagists. In this pilot study, 114 anonymous calls from four GP cooperatives spread across the Netherlands were judged by three out of eight raters, both internal and external raters. Cronbach's alpha was .94 for the medical items and .75 for the communicational items. Inter-rater reliability: complete agreement between the external raters was 45% and reasonable agreement 73% (difference of maximally one point on the five-point scale). Intra-rater reliability: complete agreement within raters was 55% and reasonable agreement 84%. There were hardly any differences between internal and external raters, but there were differences in strictness between individual raters. The construct validity was confirmed by the high correlation between the general impression of the call and the items of the KERNset. Of the differences within items 19% could be explained by differences between triage nurses, which means the KERNset is able to demonstrate differences between triage nurses. 
The KERNset can be used to assess the quality of telephone triage. The validity is good and differences between calls and between triage nurses can be measured. A more intensive training for raters could improve the reliability.
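The two agreement figures reported above (complete agreement, and "reasonable" agreement within one point on the five-point scale) can be illustrated with a short sketch. It assumes agreement is counted over every pair of raters who scored the same item, which the abstract does not specify; the function name and the scores are hypothetical.

```python
from itertools import combinations


def agreement_rates(ratings_per_item, tolerance=1):
    """Return (complete, reasonable) agreement over all rater pairs.

    ratings_per_item: one tuple of scores per rated item, one score per rater.
    tolerance: largest score difference still counted as "reasonable".
    Pairwise counting is an assumption, not the study's documented method.
    """
    complete = reasonable = pairs = 0
    for ratings in ratings_per_item:
        for a, b in combinations(ratings, 2):
            pairs += 1
            diff = abs(a - b)
            complete += diff == 0            # identical scores
            reasonable += diff <= tolerance  # at most one point apart
    return complete / pairs, reasonable / pairs


# Hypothetical five-point scores from three raters on four items.
scores = [(3, 3, 4), (5, 4, 2), (1, 1, 1), (2, 4, 4)]
complete, reasonable = agreement_rates(scores)
print(f"complete: {complete:.2f}, reasonable: {reasonable:.2f}")
```

Reasonable agreement can never be lower than complete agreement, since every identical pair also falls within the tolerance.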
Technical Adequacy of the Disruptive Behavior Rating Scale-2nd Edition--Self-Report

ERIC Educational Resources Information Center

Erford, Bradley T.; Miller, Emily M.; Isbister, Katherine

2015-01-01

This study provides preliminary analysis of the Disruptive Behavior Rating Scale-2nd Edition--Self-Report, which was designed to screen individuals aged 10 years and older for anxiety and behavior symptoms. Score reliability and internal and external facets of validity were good for a screening-level test.
Standardization of the Functional Assessment and Intervention Program (FAIP) with Children Who Have Externalizing Behaviors

ERIC Educational Resources Information Center

Hartwig, Laurie; Heathfield, Lora Tuesday; Jenson, William R.

2004-01-01

The purpose of this study was to develop standardization data for the Functional Assessment Intervention Program (FAIP; University of Utah, Utah State University, & Utah State Office of Education, 1999), a computerized, functional behavioral assessment expert system. Reliability, validity, and utility analyses were conducted with students serving…
Technical Analysis of Teacher Responses to the Self-Evaluation Scale-Teacher (SES-T) Version

ERIC Educational Resources Information Center

Erford, Bradley T.; Lowe, Samantha; Chang, Catherine Y.

2011-01-01

The Self-Evaluation Scale--Teacher version, used to assess teacher perceived self-esteem of students, was analyzed. A unidimensional model emerged from exploratory factor analysis, with cautious acceptance of data fit. Reliability and external aspects of validity were supported by the Self-Evaluation Scale--Teacher data.
Valid and reliable authentic assessment of culminating student performance in the biomedical sciences.

PubMed

Oh, Deborah M; Kim, Joshua M; Garcia, Raymond E; Krilowicz, Beverly L

2005-06-01

There is increasing pressure, both from institutions central to the national scientific mission and from regional and national accrediting agencies, on natural sciences faculty to move beyond course examinations as measures of student performance and to instead develop and use reliable and valid authentic assessment measures for both individual courses and for degree-granting programs. We report here on a capstone course developed by two natural sciences departments, Biological Sciences and Chemistry/Biochemistry, which engages students in an important culminating experience, requiring synthesis of skills and knowledge developed throughout the program while providing the departments with important assessment information for use in program improvement. The student work products produced in the course, a written grant proposal, and an oral summary of the proposal, provide a rich source of data regarding student performance on an authentic assessment task. The validity and reliability of the instruments and the resulting student performance data were demonstrated by collaborative review by content experts and a variety of statistical measures of interrater reliability, including percentage agreement, intraclass correlations, and generalizability coefficients. The high interrater reliability reported when the assessment instruments were used for the first time by a group of external evaluators suggests that the assessment process and instruments reported here will be easily adopted by other natural science faculty.
The significance of motivation in periodontal treatment: validity and reliability of the motivation assessment scale among patients undergoing periodontal treatment.

PubMed

Pac, A; Oruba, Z; Olszewska-Czyż, I; Chomyszyn-Gajewska, M

2014-03-01

The individual evaluation of patients' motivation should be introduced to the protocol of periodontal treatment, as it could impact positively on effective treatment planning and treatment outcomes. However, a standardised tool measuring the extent of periodontal patients' motivation has not yet been proposed in the literature. Thus, the objective of the present study was to determine the validity and reliability of the Zychlińscy motivation scale adjusted to the needs of periodontology. Cross sectional study. Department of Periodontology and Oral Medicine, Dental University Clinic, Jagiellonian University, Krakow, Poland. 199 adult periodontal patients, aged 20-78. 14-item questionnaire. The items were adopted from the original Zychlińscy motivation assessment scale. Validity and reliability of the proposed motivation assessment instrument. The assessed Cronbach's alpha of 0.79 indicates the scale is a reliable tool. Principal component analysis revealed a model with three factors, which explained half of the total variance. Those factors represented: the patient's attitude towards treatment and oral hygiene practice; previous experiences during treatment; and the influence of external conditions on the patient's attitude towards treatment. The proposed scale proved to be a reliable and accurate tool for the evaluation of periodontal patients' motivation.
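Cronbach's alpha, the internal-consistency statistic reported above, compares the sum of the item variances with the variance of the total score. The following is a minimal sketch using sample variances; the response matrix is invented for illustration and has nothing to do with the study's data.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a respondents-by-items score matrix.

    rows: one list of item scores per respondent (all the same length).
    """
    k = len(rows[0])                                  # number of items
    item_vars = [variance(col) for col in zip(*rows)] # variance of each item
    total_var = variance([sum(r) for r in rows])      # variance of total scores
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical answers of four respondents to a three-item scale.
responses = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [5, 5, 4],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 3))
```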
Geographic Information Systems to Assess External Validity in Randomized Trials.

PubMed

Savoca, Margaret R; Ludwig, David A; Jones, Stedman T; Jason Clodfelter, K; Sloop, Joseph B; Bollhalter, Linda Y; Bertoni, Alain G

2017-08-01

To support claims that RCTs can reduce health disparities (i.e., are translational), it is imperative that methodologies exist to evaluate the tenability of external validity in RCTs when probabilistic sampling of participants is not employed. Typically, attempts at establishing post hoc external validity are limited to a few comparisons across convenience variables, which must be available in both sample and population. A Type 2 diabetes RCT was used as an example of a method that uses a geographic information system to assess external validity in the absence of a priori probabilistic community-wide diabetes risk sampling strategy. A geographic information system, 2009-2013 county death certificate records, and 2013-2014 electronic medical records were used to identify community-wide diabetes prevalence. Color-coded diabetes density maps provided visual representation of these densities. Chi-square goodness of fit statistic/analysis tested the degree to which distribution of RCT participants varied across density classes compared to what would be expected, given simple random sampling of the county population. Analyses were conducted in 2016. Diabetes prevalence areas as represented by death certificate and electronic medical records were distributed similarly. The simple random sample model was not a good fit for death certificate record (chi-square, 17.63; p=0.0001) and electronic medical record data (chi-square, 28.92; p<0.0001). Generally, RCT participants were oversampled in high-diabetes density areas. Location is a highly reliable "principal variable" associated with health disparities. It serves as a directly measurable proxy for high-risk underserved communities, thus offering an effective and practical approach for examining external validity of RCTs. Copyright © 2017 American Journal of Preventive Medicine. Published by Elsevier Inc. All rights reserved.
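The goodness-of-fit test described above compares observed participant counts per density class with the counts expected under simple random sampling of the population. The sketch below computes the statistic in plain Python; the counts, the population shares, and the helper names are hypothetical, and the p-value helper uses a closed form that holds only for an even number of degrees of freedom.

```python
from math import exp, factorial


def chi_square_gof(observed, expected_props):
    """Chi-square goodness-of-fit statistic.

    observed: participant counts per class.
    expected_props: population share of each class (should sum to 1).
    """
    n = sum(observed)
    expected = [p * n for p in expected_props]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable, even df only."""
    if df <= 0 or df % 2:
        raise ValueError("closed form requires a positive even df")
    half = x / 2
    return exp(-half) * sum(half ** k / factorial(k) for k in range(df // 2))


# Hypothetical example: three density classes (low, medium, high).
observed = [20, 30, 50]          # trial participants per class
population = [0.40, 0.35, 0.25]  # share of county population per class
stat = chi_square_gof(observed, population)
p_value = chi2_sf_even_df(stat, df=len(observed) - 1)
print(round(stat, 2), p_value)
```

For odd degrees of freedom a statistics library would be needed for the p-value. In this invented example the large statistic comes mostly from the oversampled high-density class.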
Fixing the Problem With Empathy: Development and Validation of the Affective and Cognitive Measure of Empathy.

PubMed

Vachon, David D; Lynam, Donald R

2016-04-01

Low empathy is a criterion for most externalizing disorders, and empathy training is a regular component of treatment for aggressive people, from school bullies to sex offenders. However, recent meta-analytic evidence suggests that current measures of empathy explain only 1% of the variance in aggressive behavior. A new assessment of empathy was developed to more fully represent the empathy construct and better predict important outcomes--particularly aggressive behavior and externalizing psychopathology. Across three independent samples (N = 210-708), the 36-item Affective and Cognitive measure of Empathy (ACME) was internally consistent, structurally reliable, and invariant across sex. The ACME bore significant associations to important outcomes, which were incremental relative to other measures of empathy and generalizable across sex. Importantly, the affective scales of the ACME--particularly a new "Affective Dissonance" scale--yielded moderate to strong associations with aggressive behavior and externalizing disorders. The ACME is a short, reliable, and useful measure of empathy. © The Author(s) 2015.
Assessment of abdominal muscle function using the Biodex System-4. Validity and reliability in healthy volunteers and patients with giant ventral hernia.

PubMed

Gunnarsson, U; Johansson, M; Strigård, K

2011-08-01

The decrease in recurrence rates in ventral hernia surgery has led to a redirection of focus towards other important patient-related endpoints. One such endpoint is abdominal wall function. The aim of the present study was to evaluate the reliability and external validity of abdominal wall strength measurement using the Biodex System-4 with a back abdomen unit. Ten healthy volunteers and ten patients with ventral hernias exceeding 10 cm were recruited. Test-retest reliability, both with and without girdle, was evaluated by comparison of measurements at two test occasions 1 week apart. Reliability was calculated by the intraclass correlation coefficient (ICC) method. Validity was evaluated by correlation with the well-established International Physical Activity Questionnaire (IPAQ) and a self-assessment of abdominal wall strength. One person in the healthy group was excluded after the first test due to neck problems following minor trauma. The reliability was excellent (>0.75), with ICC values between 0.92 and 0.97 for the different modalities tested. No differences were seen between testing with and without a girdle. Validity was also excellent both when calculated as correlation to self-assessment of abdominal wall strength, and to IPAQ, giving Kendall tau values of 0.51 and 0.47, respectively, and corresponding P values of 0.002 and 0.004. Measurement of abdominal muscle function using the Biodex System-4 is a reliable and valid method to assess this important patient-related endpoint. Further investigations will be made to explore the potential of this technique in the evaluation of the results of ventral hernia surgery, and to compare muscle function after different abdominal wall reconstruction techniques.
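A test-retest ICC of the kind reported above can be derived from the mean squares of a two-way layout (subjects by test occasions). The abstract does not say which ICC form was used; the sketch below assumes the two-way random-effects, absolute-agreement, single-measurement form, often written ICC(2,1), and the strength values are hypothetical.

```python
def icc_2_1(data):
    """ICC(2,1): two-way random effects, absolute agreement, single measure.

    data: one row per subject, one column per test occasion.
    """
    n, k = len(data), len(data[0])
    grand = sum(map(sum, data)) / (n * k)
    row_means = [sum(r) / k for r in data]        # per-subject means
    col_means = [sum(c) / n for c in zip(*data)]  # per-occasion means

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for r in data for x in r)
    ss_err = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)                   # between-subject mean square
    ms_cols = ss_cols / (k - 1)                   # between-occasion mean square
    ms_err = ss_err / ((n - 1) * (k - 1))         # residual mean square
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


# Hypothetical strength measurements for four subjects, two occasions.
strength = [[10, 11], [20, 19], [30, 32], [40, 41]]
icc = icc_2_1(strength)
print(round(icc, 3))
```

Because this form penalises systematic shifts between occasions as well as random error, it is a conservative choice when the real question is whether repeated measurements give the same values.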
Clinimetrics of ultrasound pathologies in osteoarthritis: systematic literature review and meta-analysis.

PubMed

Oo, W M; Linklater, J M; Daniel, M; Saarakkala, S; Samuels, J; Conaghan, P G; Keen, H I; Deveza, L A; Hunter, D J

2018-05-01

The aims of this study were to systematically review clinimetrics of commonly assessed ultrasound pathologies in knee, hip and hand osteoarthritis (OA), and to conduct a meta-analysis for each clinimetric. Medline, Embase, and Cochrane Library databases were searched from their inceptions to September 2016. According to the Outcome Measures in Rheumatology (OMERACT) Instrument Selection Algorithm, data extraction focused on ultrasound technical features and performance metrics. Methodological quality was assessed with modified 19-item Downs and Black score and 11-item Quality Appraisal of Diagnostic Reliability (QAREL) score. Separate meta-analyses were performed for clinimetrics: (1) inter-rater/intra-rater reliability; (2) construct validity; (3) criterion validity; and (4) internal/external responsiveness. Statistical Package for the Social Sciences (SPSS), Excel and Comprehensive Meta-analysis were used. Our search identified 1126 records; of these, 100 were eligible, including a total of 8542 patients and 32,373 joints. The average Downs and Black score was 13.01, and average QAREL was 5.93. The stratified meta-analysis was performed only for knee OA, which demonstrated moderate to substantial reliability [minimum kappa > 0.44(0.15,0.74), minimum intraclass correlation coefficient (ICC) > 0.82(0.73-0.89)], weak construct validity against pain (r = 0.12 to 0.27), function (r = 0.15 to 0.23), and blood biomarkers (r = 0.01 to 0.21), but weak to strong correlation with plain radiography (r = 0.13 to 0.60), strong association with Magnetic Resonance Imaging (MRI) [minimum r = 0.60(0.52,0.67)] and strong discrimination against symptomatic patients (OR = 3.08 to 7.46). There was strong criterion validity against cartilage histology [r = 0.66(-0.05,0.93)], and small to moderate internal [standardized mean difference(SMD) = 0.20 to 0.58] and external (r = 0.35 to 0.43) responsiveness to interventions. 
Ultrasound demonstrated strong criterion validity with cartilage histology, poor to strong correlation with patient findings and MRI, moderate reliability, and low responsiveness to interventions. CRD42016039954. Copyright © 2018 Osteoarthritis Research Society International. All rights reserved.
Cross-cultural Adaptation, Reliability, and Validity of the Yoruba Version of the Roland-Morris Disability Questionnaire.

PubMed

Mbada, Chidozie Emmanuel; Idowu, Opeyemi Ayodiipo; Ogunjimi, Olawale Richard; Ayanniyi, Olusola; Orimolade, Elkanah Ayodele; Oladiran, Ajibola Babatunde; Johnson, Olubusola Esther; Akinsulore, Adesanmi; Oni, Temitope Olawale

2017-04-01

A translation, cross-cultural adaptation, and psychometric analysis. The aim of this study was to translate, cross-culturally adapt, and validate the Yoruba version of the RMDQ. The Roland-Morris Disability Questionnaire (RMDQ) is a valid outcome tool for low back pain (LBP) in clinical and research settings. There seems to be no valid and reliable version of the RMDQ in the Nigerian languages. Following the Guillemin criteria, the English version of the RMDQ was forward and back translated. Two Yoruba translated versions of the RMDQ were assessed for clarity, common language usage, and conceptual equivalence. Consequently, a harmonized Yoruba version was produced and was pilot-tested among 20 patients with nonspecific long-term LBP (NSLBP) for cognitive debriefing. The final version of the Yoruba RMDQ was tested for its construct validity and test-retest reliability among 120 and 87 patients with NSLBP, respectively. A Pearson product moment correlation coefficient (r) of 0.82 was obtained for reliability of the Yoruba version of the RMDQ. The test-retest reliability of the Yoruba RMDQ yielded Cronbach alpha 0.932, while the intraclass correlation (ICC) ranged between 0.896 and 0.956. The analysis of the global scores of both the English and Yoruba versions of the RMDQ yielded an ICC value of 0.995 (95% confidence interval 0.996-0.997), with the item-by-item Kappa agreement ranging between 0.824 and 1.000. The external validity of RMDQ using Quadruple Visual Analogue Scale was r = -0.596 (P = 0.001). The Yoruba version of the RMDQ had no floor/ceiling effects, as no patient achieved either of the maximum or the minimum possible scores. The Yoruba version of the RMDQ has excellent reliability and validity and may be an appropriate outcome tool for clinical and research purposes among Yoruba-speaking patients with LBP. 3.
Convergent and discriminant validity and reliability of the pediatric anxiety rating scale in youth with autism spectrum disorders.

PubMed

Storch, Eric A; Wood, Jeffrey J; Ehrenreich-May, Jill; Jones, Anna M; Park, Jennifer M; Lewin, Adam B; Murphy, Tanya K

2012-11-01

The psychometric properties of the Pediatric Anxiety Rating Scale (PARS), a clinician-administered measure for assessing severity of anxiety symptoms, were examined in 72 children and adolescents diagnosed with an autism spectrum disorder (ASD). The internal consistency of the PARS was 0.59, suggesting that the items were related but not repetitive. The PARS showed high 26-day test-retest (ICC = 0.83) and inter-rater reliability (ICC = 0.86). The PARS was strongly correlated with clinician-ratings of overall anxiety severity and parent-report anxiety measures, supporting convergent validity. Results for divergent validity were mixed. Although the PARS was not associated with the sum of the Social and Communication items on the Autism Diagnostic Observation System, it was moderately correlated with parent-reported inattention, aggression and externalizing behavior. Overall, these results suggest that the psychometric properties of the PARS are adequate for assessing anxiety symptoms in youth with ASD, although additional clarification of divergent validity is needed.
Validation Evidence of the Motivation for Teaching Scale in Secondary Education.

PubMed

Abós, Ángel; Sevil, Javier; Martín-Albo, José; Aibar, Alberto; García-González, Luis

2018-04-10

Grounded in self-determination theory, the aim of this study was to develop a scale with adequate psychometric properties to assess motivation for teaching and to explain some outcomes of secondary education teachers at work. The sample comprised 584 secondary education teachers. Analyses supported the five-factor model (intrinsic motivation, identified regulation, introjected regulation, external regulation and amotivation) and indicated the presence of a continuum of self-determination. Evidence of reliability was provided by Cronbach's alpha, composite reliability and average variance extracted. Multigroup confirmatory factor analyses supported the partial invariance (configural and metric) of the scale in different sub-samples, in terms of gender and type of school. Concurrent validity was analyzed by a structural equation modeling that explained 71% of the work dedication variance and 69% of the boredom at work variance. Work dedication was positively predicted by intrinsic motivation (β = .56, p < .001) and external regulation (β = .29, p < .001) and negatively predicted by introjected regulation (β = -.22, p < .001) and amotivation (β = -.49, p < .001). Boredom at work was negatively predicted by intrinsic motivation (β = -.28, p < .005) and positively predicted by amotivation (β = .68, p < .001). The Motivation for Teaching Scale in Secondary Education (Spanish acronym EME-ES, Escala de Motivación por la Enseñanza en Educación Secundaria) is discussed as a valid and reliable instrument. This is the first specific scale in the work context of secondary teachers that has integrated the five-factor structure together with their dedication and boredom at work.
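Composite reliability and average variance extracted, two of the reliability indices named above, are usually computed from the standardized loadings of a factor's items. The sketch below uses the common formulas, under the assumptions that loadings are standardized and error terms are uncorrelated; the loadings are hypothetical and do not come from the EME-ES.

```python
def composite_reliability(loadings):
    """Composite reliability from standardized factor loadings."""
    total = sum(loadings)
    error = sum(1 - l ** 2 for l in loadings)  # error variance of each item
    return total ** 2 / (total ** 2 + error)


def average_variance_extracted(loadings):
    """Mean squared standardized loading across a factor's items."""
    return sum(l ** 2 for l in loadings) / len(loadings)


# Hypothetical standardized loadings for one three-item factor.
loadings = [0.7, 0.8, 0.9]
cr = composite_reliability(loadings)
ave = average_variance_extracted(loadings)
print(round(cr, 3), round(ave, 3))
```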
Student Internalizing and Externalizing Behavior Screeners: Evidence for Reliability, Validity, and Usability in Elementary Schools

ERIC Educational Resources Information Center

Hartman, Kelsey; Gresham, Frank M.; Byrd, Shelby

2017-01-01

Universal screening for emotional and behavioral risk in schools facilitates early identification and intervention for students as part of multitiered systems of support. Early identification has the potential to mitigate adverse outcomes of emotional and behavioral disorders. The purpose of this study was to extend existing research on the…
A Critique of the Use of Self-Evaluation in a Compulsory Accreditation System

ERIC Educational Resources Information Center

Van Kemenade, Everard; Hardjono, Teun W.

2010-01-01

Self-evaluation is supposed to be a valid, reliable and easy-to-use instrument to commit professionals to external quality assurance. The writing of a self-evaluation report is the first step in most higher education accreditation systems all over the world. Research on accreditation in the Netherlands and Flanders shows that professionals…
An Overview of Student Teachers' Academic Intrinsic Motivation

ERIC Educational Resources Information Center

Uyulgan, Melis Arzu; Akkuzu, Nalan

2014-01-01

Student teachers' desire to learn is affected by a variety of motivational factors. In this study, the effect of some internal and external variables on Academic Intrinsic Motivation (AIM) was explored. First, the validity and reliability of the scale of AIM was determined, then the effect on AIM of variables such as grade levels, academic grade…
Measuring Transgender Individuals' Comfort with Gender Identity and Appearance: Development and Validation of the Transgender Congruence Scale

ERIC Educational Resources Information Center

Kozee, Holly B.; Tylka, Tracy L.; Bauerband, L. Andrew

2012-01-01

Our study used the construct of congruence to conceptualize the degree to which transgender individuals feel genuine, authentic, and comfortable with their gender identity and external appearance. In Study 1, the Transgender Congruence scale (TCS) was developed, and data from 162 transgender individuals were used to estimate the reliability and…

Development of a use estimation process at a metropolitan park district

Treesearch

Andrew J. Mowen

2001-01-01

The need for a committed system to monitor and track visitation over time is increasingly recognized by agencies and organizations that must be responsive to staffing, budgeting, and relations with external stakeholders. This paper highlights a process that one metropolitan park agency uses to monitor visitation, discusses the role of validity and reliability in the...
Validity and reliability of naturalistic driving scene categorization Judgments from crowdsourcing.

PubMed

Cabrall, Christopher D D; Lu, Zhenji; Kyriakidis, Miltos; Manca, Laura; Dijksterhuis, Chris; Happee, Riender; de Winter, Joost

2018-05-01

A common challenge with processing naturalistic driving data is that humans may need to categorize great volumes of recorded visual information. By means of the online platform CrowdFlower, we investigated the potential of crowdsourcing to categorize driving scene features (i.e., presence of other road users, straight road segments, etc.) at greater scale than a single person or a small team of researchers would be capable of. In total, 200 workers from 46 different countries participated in 1.5 days. Validity and reliability were examined, both with and without embedding researcher generated control questions via the CrowdFlower mechanism known as Gold Test Questions (GTQs). By employing GTQs, we found significantly more valid (accurate) and reliable (consistent) identification of driving scene items from external workers. Specifically, at a small scale CrowdFlower Job of 48 three-second video segments, an accuracy (i.e., relative to the ratings of a confederate researcher) of 91% on items was found with GTQs compared to 78% without. A difference in bias was found, where without GTQs, external workers returned more false positives than with GTQs. At a larger scale CrowdFlower Job making exclusive use of GTQs, 12,862 three-second video segments were released for annotation. Infeasible (and self-defeating) to check the accuracy of each at this scale, a random subset of 1012 categorizations was validated and returned similar levels of accuracy (95%). In the small scale Job, where full video segments were repeated in triplicate, the percentage of unanimous agreement on the items was found significantly more consistent when using GTQs (90%) than without them (65%). Additionally, in the larger scale Job (where a single second of a video segment was overlapped by ratings of three sequentially neighboring segments), a mean unanimity of 94% was obtained with validated-as-correct ratings and 91% with non-validated ratings. 
Because the video segments overlapped in full for the small scale Job, and in part for the larger scale Job, it should be noted that such reliability reported here may not be directly comparable. Nonetheless, such results are both indicative of high levels of obtained rating reliability. Overall, our results provide compelling evidence for CrowdFlower, via use of GTQs, being able to yield more accurate and consistent crowdsourced categorizations of naturalistic driving scene contents than when used without such a control mechanism. Such annotations in such short periods of time present a potentially powerful resource in driving research and driving automation development. Copyright © 2017 Elsevier Ltd. All rights reserved.
Psychometric Properties of the Arabic Version of the Drug Use Disorders Identification Test (DUDIT) in Clinical, Prison Inmate, and Student Samples.

PubMed

Sfendla, Anis; Zouini, Btissame; Lemrani, Dina; Berman, Anne H; Senhaji, Meftaha; Kerekes, Nóra

2017-04-01

The study aimed to validate the Arabic version of the Drug Use Disorders Identification Test (DUDIT) by (1) assessing its factor structure, (2) determining structural validity, (3) evaluating item-total and inter-item correlation, and (4) assessing its predictive validity. The study population included 169 prison inmates, 51 patients with clinical diagnosis of substance use disorder, and 53 students (N = 273). All participants completed the self-report version of the Arabic DUDIT. After exploratory factor analysis, internal consistency of the Arabic DUDIT was determined and external validation was performed. Principal factor analysis showed that Arabic DUDIT exhibited only one factor, which explained 66.9% of the variance. Reliability based on Cronbach's alpha was .95. When compared to the DSM-IV substance use disorder diagnosis in a clinical sample, DUDIT had an area under the curve (AUC) of .98, with a sensitivity of .98 and a specificity of .90. The Arabic version of DUDIT is a valid and reliable tool for screening for drug use in Arabic-speaking countries.
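Sensitivity, specificity and AUC, as reported above, all compare screening scores with a reference diagnosis. The sketch below computes the first two at a given cut-off and obtains the AUC as the probability that a randomly chosen diagnosed case scores higher than a randomly chosen non-case (ties count one half). The scores, the diagnoses and the cut-off of 8 are hypothetical; the abstract does not state the cut-off used.

```python
def sensitivity_specificity(scores, has_disorder, cutoff):
    """Sensitivity and specificity when 'score >= cutoff' screens positive."""
    pairs = list(zip(scores, has_disorder))
    tp = sum(s >= cutoff and d for s, d in pairs)      # true positives
    fn = sum(s < cutoff and d for s, d in pairs)       # false negatives
    tn = sum(s < cutoff and not d for s, d in pairs)   # true negatives
    fp = sum(s >= cutoff and not d for s, d in pairs)  # false positives
    return tp / (tp + fn), tn / (tn + fp)


def auc(scores, has_disorder):
    """Area under the ROC curve via pairwise case/non-case comparisons."""
    cases = [s for s, d in zip(scores, has_disorder) if d]
    controls = [s for s, d in zip(scores, has_disorder) if not d]
    wins = sum((c > n) + 0.5 * (c == n) for c in cases for n in controls)
    return wins / (len(cases) * len(controls))


# Hypothetical screening scores and reference diagnoses.
scores = [2, 5, 7, 9, 12, 15, 20, 25]
diagnosed = [False, False, False, True, False, True, True, True]
sens, spec = sensitivity_specificity(scores, diagnosed, cutoff=8)
area = auc(scores, diagnosed)
print(sens, spec, area)
```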
CADASTER QSPR Models for Predictions of Melting and Boiling Points of Perfluorinated Chemicals.

PubMed

Bhhatarai, Barun; Teetz, Wolfram; Liu, Tao; Öberg, Tomas; Jeliazkova, Nina; Kochev, Nikolay; Pukalov, Ognyan; Tetko, Igor V; Kovarich, Simona; Papa, Ester; Gramatica, Paola

2011-03-14

Quantitative structure property relationship (QSPR) studies on the melting point (MP) and boiling point (BP) of per- and polyfluorinated chemicals (PFCs) are presented. The training and prediction chemicals used for developing and validating the models were selected from the Syracuse PhysProp database and the literature. The available experimental data sets were split in two different ways: a) random selection on response value, and b) structural similarity verified by self-organizing-map (SOM), in order to propose reliable predictive models, developed only on the training sets and externally verified on the prediction sets. Individual models based on linear and non-linear approaches, developed by different CADASTER partners on 0D-2D Dragon descriptors, E-state descriptors and fragment based descriptors, as well as a consensus model and their predictions, are presented. In addition, the predictive performance of the developed models was verified on a blind external validation set (EV-set) prepared using the PERFORCE database on 15 MP and 25 BP data, respectively. This database contains only long chain perfluoro-alkylated chemicals, particularly monitored by regulatory agencies like US-EPA and EU-REACH. QSPR models with internal and external validation on two different external prediction/validation sets and a study of the applicability domain, highlighting the robustness and high accuracy of the models, are discussed. Finally, MPs for an additional 303 PFCs and BPs for 271 PFCs, for which experimental measurements are unknown, were predicted. Copyright © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Reliability and Validity of Wisconsin Upper Respiratory Symptom Survey, Korean Version

PubMed Central

Yang, Su-Young; Kang, Weechang; Yeo, Yoon; Park, Yang-Chun

2011-01-01

Background: The Wisconsin Upper Respiratory Symptom Survey (WURSS) is a self-administered questionnaire developed in the United States to evaluate the severity of the common cold and its reliability has been validated. We developed a Korean language version of this questionnaire by using a sequential forward and backward translation approach. The purpose of this study was to validate the Korean version of the Wisconsin Upper Respiratory Symptom Survey (WURSS-K) in Korean patients with common cold. Methods: This multicenter prospective study enrolled 107 participants who were diagnosed with common cold and consented to participate in the study. The WURSS-K includes 1 global illness severity item, 32 symptom-based items, 10 functional quality-of-life (QOL) items, and 1 item assessing global change. The SF-8 was used as an external comparator. Results: The participants were 54 women and 53 men aged 18 to 42 years. The WURSS-K showed good reliability in 10 domains, with Cronbach’s alphas ranging from 0.67 to 0.96 (mean: 0.84). Comparison of the reliability coefficients of the WURSS-K and WURSS yielded a Pearson correlation coefficient of 0.71 (P = 0.02). Validity of the WURSS-K was evaluated by comparing it with the SF-8, which yielded a Pearson correlation coefficient of −0.267 (P < 0.001). The Guyatt’s responsiveness index of the WURSS-K ranged from 0.13 to 0.46, and the correlation coefficient with the WURSS was 0.534 (P < 0.001), indicating that there was close correlation between the WURSS-K and WURSS. Conclusions: The WURSS-K is a reliable, valid, and responsive disease-specific questionnaire for assessing symptoms and QOL in Korean patients with common cold. PMID:21691034
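The validity comparison above rests on a Pearson correlation between scores on the new questionnaire and an external comparator. A minimal sketch of that coefficient follows; the paired scores are invented. The coefficient comes out negative whenever one instrument scores higher for worse health and the other scores higher for better health, which is assumed here only for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical paired scores for five patients.
symptom_score = [10, 20, 30, 40, 50]  # higher = more severe symptoms
qol_score = [48, 45, 40, 38, 30]      # higher = better quality of life
r = pearson_r(symptom_score, qol_score)
print(round(r, 3))
```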
Validation of the psychometrics properties of a French quality of life questionnaire among a cohort of renal transplant recipients less than one year.

PubMed

Beauger, Davy; Fruit, Dorothée; Villeneuve, Claire; Laroche, Marie-Laure; Jouve, Elisabeth; Rousseau, Annick; Boyer, Laurent; Gentile, Stéphanie

2016-09-01

Renal transplantation is considered the treatment of choice for patients with end-stage renal disease. Health-related quality of life (HRQoL) of renal transplant recipients (RTR) is very important to assess, especially during the first year after transplantation. To provide new evidence about the suitability of HRQoL measures in RTR during the first post-transplant year, we explored the internal structure, reliability and external validity of a French specific HRQoL instrument, the Renal Transplant Quality of life Questionnaire Second Version (RTQ V2). The data came from the French multicenter cohort of renal transplant patients followed during 4 years (EPIGREN). The HRQoL of RTR was assessed five times (at 1, 3, 6, 9 and 12 months after transplantation) with the RTQ V2, a specific instrument consisting of 32 items describing five dimensions. Socio-demographic information, clinical characteristics and HRQoL (i.e., RTQ V2 and SF-36) were collected. For the five times, psychometric properties of the RTQ V2 were compared to those reported from the reference population assessed in the validation study. Three hundred and thirty-four patients were enrolled. The proportions of well-projected items, item-internal consistency, item-discriminant validity, floor and ceiling effects, Cronbach's alpha coefficients and item goodness-of-fit statistics were satisfactory for each dimension at the five times of the study. The suitability indices of construct validity were higher than 90 % for each time (minimum-maximum: 90.8-97.4 %). The external validity was less satisfactory, with suitability indices ranging from 46.7 % at M1 to 66.7 % at M12. However, the discrepancies with the reference population (mainly for the gender) appeared logical considering the scientific literature on HRQoL of RTR during the first post-transplant year and may not compromise the external validity. 
These results support the validity and reliability of the RTQ V2 for evaluating HRQoL in RTR during the first post-transplant year, and confirm that the RTQ V2 is a useful tool to assess HRQoL early after transplantation.
External validation of a forest inventory and analysis volume equation and comparisons with estimates from multiple stem-profile models

Treesearch

Christopher M. Oswalt; Adam M. Saunders

2009-01-01

Sound estimation procedures are desideratum for generating credible population estimates to evaluate the status and trends in resource conditions. As such, volume estimation is an integral component of the U.S. Department of Agriculture, Forest Service, Forest Inventory and Analysis (FIA) program's reporting. In effect, reliable volume estimation procedures are...
[A Validation Study of the Modified Korean Version of Ethical Leadership at Work Questionnaire (K-ELW)].

PubMed

Kim, Jeong-Eon; Park, Eun-Jun

2015-04-01

The purpose of this study was to validate the Korean version of the Ethical Leadership at Work questionnaire (K-ELW) that measures RNs' perceived ethical leadership of their nurse managers. The strong validation process suggested by Benson (1998), including translation and cultural adaptation stage, structural stage, and external stage, was used. Participants were 241 RNs who reported their perceived ethical leadership using both the pre-version of K-ELW and a previously known Ethical Leadership Scale, and interactional justice of their managers, as well as their own demographics, organizational commitment and organizational citizenship behavior. Data analyses included descriptive statistics, Pearson correlation coefficients, reliability coefficients, exploratory factor analysis, and confirmatory factor analysis. SPSS 19.0 and Amos 18.0 versions were used. A modified K-ELW was developed from construct validity evidence and included 31 items in 7 domains: People orientation, task responsibility fairness, relationship fairness, power sharing, concern for sustainability, ethical guidance, and integrity. Convergent validity, discriminant validity, and concurrent validity were supported according to the correlation coefficients of the 7 domains with other measures. The results of this study provide preliminary evidence that the modified K-ELW can be adopted in Korean nursing organizations, and reliable and valid ethical leadership scores can be expected.
Development and validation of the Survey of Organizational Research Climate (SORC).

PubMed

Martinson, Brian C; Thrush, Carol R; Lauren Crain, A

2013-09-01

Development and targeting efforts by academic organizations to effectively promote research integrity can be enhanced if they are able to collect reliable data to benchmark baseline conditions, to assess areas needing improvement, and to subsequently assess the impact of specific initiatives. To date, no standardized and validated tool has existed to serve this need. A web- and mail-based survey was administered in the second half of 2009 to 2,837 randomly selected biomedical and social science faculty and postdoctoral fellows at 40 academic health centers in top-tier research universities in the United States. Measures included the Survey of Organizational Research Climate (SORC) as well as measures of perceptions of organizational justice. Exploratory and confirmatory factor analyses yielded seven subscales of organizational research climate, all of which demonstrated acceptable internal consistency (Cronbach's α ranging from 0.81 to 0.87) and adequate test-retest reliability (Pearson r ranging from 0.72 to 0.83). A broad range of correlations between the seven subscales and five measures of organizational justice (unadjusted regression coefficients ranging from 0.13 to 0.95) document both construct and discriminant validity of the instrument. The SORC demonstrates good internal (alpha) and external reliability (test-retest) as well as both construct and discriminant validity.
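Several records in this listing report internal consistency as Cronbach's alpha. As a reading aid, the following is a minimal Python sketch of how that coefficient is commonly computed from item-level scores. It is not taken from any of the listed studies; the function name `cronbach_alpha` and the toy data are illustrative assumptions.

```python
from statistics import pvariance


def cronbach_alpha(items):
    """Cronbach's alpha for a scale.

    items: list of k lists, one per item, each holding that item's
    scores for the same n respondents (hypothetical data layout).
    """
    k = len(items)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    n = len(items[0])
    # Total scale score for each respondent.
    totals = [sum(item[i] for item in items) for i in range(n)]
    sum_item_variances = sum(pvariance(item) for item in items)
    total_variance = pvariance(totals)
    return k / (k - 1) * (1 - sum_item_variances / total_variance)


# Toy example: three items that move together perfectly.
scores = [[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]]
print(cronbach_alpha(scores))
```

Population variance is used for both the items and the totals; using sample variance for both gives the same ratio and therefore the same alpha.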
Development and Validation of the Survey of Organizational Research Climate (SORC)

PubMed Central

Martinson, Brian C.; Thrush, Carol R.; Crain, A. Lauren

2012-01-01

Background Development and targeting efforts by academic organizations to effectively promote research integrity can be enhanced if they are able to collect reliable data to benchmark baseline conditions, to assess areas needing improvement, and to subsequently assess the impact of specific initiatives. To date, no standardized and validated tool has existed to serve this need. Methods A web- and mail-based survey was administered in the second half of 2009 to 2,837 randomly selected biomedical and social science faculty and postdoctoral fellows at 40 academic health centers in top-tier research universities in the United States. Measures included the Survey of Organizational Research Climate (SORC) as well as measures of perceptions of organizational justice. Results Exploratory and confirmatory factor analyses yielded seven subscales of organizational research climate, all of which demonstrated acceptable internal consistency (Cronbach’s α ranging from 0.81 to 0.87) and adequate test-retest reliability (Pearson r ranging from 0.72 to 0.83). A broad range of correlations between the seven subscales and five measures of organizational justice (unadjusted regression coefficients ranging from .13 to .95) document both construct and discriminant validity of the instrument. Conclusions The SORC demonstrates good internal (alpha) and external reliability (test-retest) as well as both construct and discriminant validity. PMID:23096775
Validation of the Spanish Version of the COPD-Q Questionnaire on COPD Knowledge.

PubMed

Puente-Maestu, Luis; Chancafe-Morgan, Jorge; Calle, Myriam; Rodríguez-Hermosa, Juan L; Malo de Molina, Rosa; Ortega-González, Ángel; Fuster, Antonia; Márquez-Martín, Eduardo; Marcos, Pedro J; Ramírez, Laura; Ray, Shaunta'; Franks, Andrea

2016-01-01

Although recognition of the importance of educating chronic obstructive pulmonary disease (COPD) patients has grown in recent years, their understanding of this disease is not being measured due to a lack of specific instruments. The aim of this study was to validate the COPD-Q questionnaire, a 13-item instrument for determining COPD knowledge. The COPD-Q was translated and backtranslated, and subsequently submitted to logic and content validation by a group of COPD experts and 8 COPD patients. Reliability was studied in an independent group of 59 patients with severe COPD seen in the pulmonology ward or clinics of 6 hospitals in Spain (Andalusia, Baleares, Castilla-La Mancha, Galicia and Madrid). This sample was also used for other internal and external validations. The mean age of the group was approximately 70 years and their health awareness was low-to-medium. The number of correct answers was 8.3 (standard deviation: 1.9), median 8, range 3-13. Floor and ceiling effects were 0% and 1.5%, respectively. Internal consistency of the questionnaire was good (Cronbach's alpha=0.85) and reliability was also high, with a kappa coefficient >0.6 for all items and an intraclass correlation coefficient of 0.84 for the total score. The 13-item COPD-Q is a valid, applicable and reliable instrument for determining patients' knowledge of COPD. Copyright © 2014 SEPAR. Published by Elsevier Espana. All rights reserved.
The psychometric validation of the Social Problem-Solving Inventory--Revised with UK incarcerated sexual offenders.

PubMed

Wakeling, Helen C

2007-09-01

This study examined the reliability and validity of the Social Problem-Solving Inventory--Revised (SPSI-R; D'Zurilla, Nezu, & Maydeu-Olivares, 2002) with a population of incarcerated sexual offenders. An availability sample of 499 adult male sexual offenders was used. The SPSI-R had good reliability measured by internal consistency and test-retest reliability, and adequate validity. Construct validity was determined via factor analysis. An exploratory factor analysis extracted a two-factor model. This model was then tested against the theory-driven five-factor model using confirmatory factor analysis. The five-factor model was selected as the better fitting of the two, and confirmed the model according to social problem-solving theory (D'Zurilla & Nezu, 1982). The SPSI-R had good convergent validity; significant correlations were found between SPSI-R subscales and measures of self-esteem, impulsivity, and locus of control. SPSI-R subscales were however found to significantly correlate with a measure of socially desirable responding. This finding is discussed in relation to recent research suggesting that impression management may not invalidate self-report measures (e.g. Mills & Kroner, 2005). The SPSI-R was sensitive to sexual offender intervention, with problem-solving improving pre to post-treatment in both rapists and child molesters. The study concludes that the SPSI-R is a reasonably internally valid and appropriate tool to assess problem-solving in sexual offenders. However future research should cross-validate the SPSI-R with other behavioural outcomes to examine the external validity of the measure. Furthermore, future research should utilise a control group to determine treatment impact.
Risk score to predict gastrointestinal bleeding after acute ischemic stroke.

PubMed

Ji, Ruijun; Shen, Haipeng; Pan, Yuesong; Wang, Penglian; Liu, Gaifen; Wang, Yilong; Li, Hao; Singhal, Aneesh B; Wang, Yongjun

2014-07-25

Gastrointestinal bleeding (GIB) is a common and often serious complication after stroke. Although several risk factors for post-stroke GIB have been identified, no reliable or validated scoring system is currently available to predict GIB after acute stroke in routine clinical practice or clinical trials. In the present study, we aimed to develop and validate a risk model (acute ischemic stroke associated gastrointestinal bleeding score, the AIS-GIB score) to predict in-hospital GIB after acute ischemic stroke. The AIS-GIB score was developed from data in the China National Stroke Registry (CNSR). Eligible patients in the CNSR were randomly divided into derivation (60%) and internal validation (40%) cohorts. External validation was performed using data from the prospective Chinese Intracranial Atherosclerosis Study (CICAS). Independent predictors of in-hospital GIB were obtained using multivariable logistic regression in the derivation cohort, and β-coefficients were used to generate a point scoring system for the AIS-GIB. The area under the receiver operating characteristic curve (AUROC) and the Hosmer-Lemeshow goodness-of-fit test were used to assess model discrimination and calibration, respectively. A total of 8,820, 5,882, and 2,938 patients were enrolled in the derivation, internal validation and external validation cohorts, respectively. The overall in-hospital GIB rate after AIS was 2.6%, 2.3%, and 1.5% in the derivation, internal, and external validation cohorts, respectively. An 18-point AIS-GIB score was developed from the set of independent predictors of GIB including age, gender, history of hypertension, hepatic cirrhosis, peptic ulcer or previous GIB, pre-stroke dependence, admission National Institutes of Health stroke scale score, Glasgow Coma Scale score and stroke subtype (Oxfordshire). The AIS-GIB score showed good discrimination in the derivation (0.79; 95% CI, 0.764-0.825), internal (0.78; 95% CI, 0.74-0.82) and external (0.76; 95% CI, 0.71-0.82) validation cohorts. The AIS-GIB score was well calibrated in the derivation (P = 0.42), internal (P = 0.45) and external (P = 0.86) validation cohorts. The AIS-GIB score is a valid clinical grading scale to predict in-hospital GIB after AIS. Further studies on the effect of the AIS-GIB score on reducing GIB and improving outcome after AIS are warranted.
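The discrimination figures quoted in the record above are areas under the ROC curve. As a hedged illustration only, the sketch below computes an AUROC as the probability that a randomly chosen case scores higher than a randomly chosen non-case (ties count as one half). The function name `auroc` and the toy scores are assumptions for illustration and do not reproduce the study's data or software.

```python
def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: risk scores (higher means higher predicted risk).
    labels: 1 for patients with the event, 0 for those without.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one case and one non-case")
    # Count concordant pairs; a tied pair contributes one half.
    concordant = sum(
        (p > n) + 0.5 * (p == n) for p in positives for n in negatives
    )
    return concordant / (len(positives) * len(negatives))


# Toy example with hypothetical point scores.
print(auroc([1, 2, 3, 4], [0, 1, 0, 1]))
```

The pairwise form is quadratic in the sample size, which is acceptable for a small illustration; a rank-based formulation would be preferable for cohorts of the size reported above.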
Methodology, Methods, and Metrics for Testing and Evaluating Augmented Cognition Systems

DOE Office of Scientific and Technical Information (OSTI.GOV)

Greitzer, Frank L.

The augmented cognition research community seeks cognitive neuroscience-based solutions to improve warfighter performance by applying and managing mitigation strategies to reduce workload and improve the throughput and quality of decisions. The focus of augmented cognition mitigation research is to define, demonstrate, and exploit neuroscience and behavioral measures that support inferences about the warfighter’s cognitive state that prescribe the nature and timing of mitigation. A research challenge is to develop valid evaluation methodologies, metrics and measures to assess the impact of augmented cognition mitigations. Two considerations are external validity, which is the extent to which the results apply to operational contexts; and internal validity, which reflects the reliability of performance measures and the conclusions based on analysis of results. The scientific rigor of the research methodology employed in conducting empirical investigations largely affects the validity of the findings. External validity requirements also compel us to demonstrate operational significance of mitigations. Thus it is important to demonstrate effectiveness of mitigations under specific conditions. This chapter reviews some cognitive science and methodological considerations in designing augmented cognition research studies and associated human performance metrics and analysis methods to assess the impact of augmented cognition mitigations.
Development and validation of a scoring index to predict the presence of lesions in capsule endoscopy in patients with suspected Crohn's disease of the small bowel: a Spanish multicenter study.

PubMed

Egea-Valenzuela, Juan; González Suárez, Begoña; Sierra Bernal, Cristian; Juanmartiñena Fernández, José Francisco; Luján-Sanchís, Marisol; San Juan Acosta, Mileidis; Martínez Andrés, Blanca; Pons Beltrán, Vicente; Sastre Lozano, Violeta; Carretero Ribón, Cristina; de Vera Almenar, Félix; Sánchez Cuenca, Joaquín; Alberca de Las Parras, Fernando; Rodríguez de Miguel, Cristina; Valle Muñoz, Julio; Férnandez-Urién Sainz, Ignacio; Torres González, Carolina; Borque Barrera, Pilar; Pérez-Cuadrado Robles, Enrique; Alonso Lázaro, Noelia; Martínez García, Pilar; Prieto de Frías, César; Carballo Álvarez, Fernando

2018-05-01

Capsule endoscopy (CE) is the first-line investigation in cases of suspected Crohn's disease (CD) of the small bowel, but the factors associated with a higher diagnostic yield remain unclear. Our aim is to develop and validate a scoring index to assess the risk of the patients in this setting on the basis of biomarkers. Data on fecal calprotectin, C-reactive protein, and other biomarkers from a population of 124 patients with suspected CD of the small bowel studied by CE and included in a PhD study were used to build a scoring index. This was first used on this population (internal validation process) and after that on a different set of patients from a multicenter study (external validation process). An index was designed in which every biomarker is assigned a score. Three risk groups have been established (low, intermediate, and high). In the internal validation analysis (124 individuals), patients had a 10, 46.5, and 81% probability of showing inflammatory lesions in CE in the low-risk, intermediate-risk, and high-risk groups, respectively. In the external validation analysis, including 410 patients from 12 Spanish hospitals, this probability was 15.8, 49.7, and 80.6% for the low-risk, intermediate-risk, and high-risk groups, respectively. Results from the internal validation process show that the scoring index is coherent, and results from the external validation process confirm its reliability. This index can be a useful tool for selecting patients before CE studies in cases of suspected CD of the small bowel.
Multisite external validation of a risk prediction model for the diagnosis of blood stream infections in febrile pediatric oncology patients without severe neutropenia.

PubMed

Esbenshade, Adam J; Zhao, Zhiguo; Aftandilian, Catherine; Saab, Raya; Wattier, Rachel L; Beauchemin, Melissa; Miller, Tamara P; Wilkes, Jennifer J; Kelly, Michael J; Fernbach, Alison; Jeng, Michael; Schwartz, Cindy L; Dvorak, Christopher C; Shyr, Yu; Moons, Karl G M; Sulis, Maria-Luisa; Friedman, Debra L

2017-10-01

Pediatric oncology patients are at an increased risk of invasive bacterial infection due to immunosuppression. The risk of such infection in the absence of severe neutropenia (absolute neutrophil count ≥ 500/μL) is not well established and a validated prediction model for blood stream infection (BSI) risk offers clinical usefulness. A 6-site retrospective external validation was conducted using a previously published risk prediction model for BSI in febrile pediatric oncology patients without severe neutropenia: the Esbenshade/Vanderbilt (EsVan) model. A reduced model (EsVan2) excluding 2 less clinically reliable variables also was created using the initial EsVan model derivative cohort, and was validated using all 5 external validation cohorts. One data set was used only in sensitivity analyses due to missing some variables. From the 5 primary data sets, there were a total of 1197 febrile episodes and 76 episodes of bacteremia. The overall C statistic for predicting bacteremia was 0.695, with a calibration slope of 0.50 for the original model and a calibration slope of 1.0 when recalibration was applied to the model. The model performed better in predicting high-risk bacteremia (gram-negative or Staphylococcus aureus infection) versus BSI alone, with a C statistic of 0.801 and a calibration slope of 0.65. The EsVan2 model outperformed the EsVan model across data sets with a C statistic of 0.733 for predicting BSI and a C statistic of 0.841 for high-risk BSI. The results of this external validation demonstrated that the EsVan and EsVan2 models are able to predict BSI across multiple performance sites and, once validated and implemented prospectively, could assist in decision making in clinical practice. Cancer 2017;123:3781-3790. © 2017 American Cancer Society.
A computable phenotype for asthma case identification in adult and pediatric patients: External validation in the Chicago Area Patient-Outcomes Research Network (CAPriCORN).

PubMed

Afshar, Majid; Press, Valerie G; Robison, Rachel G; Kho, Abel N; Bandi, Sindhura; Biswas, Ashvini; Avila, Pedro C; Kumar, Harsha Vardhan Madan; Yu, Byung; Naureckas, Edward T; Nyenhuis, Sharmilee M; Codispoti, Christopher D

2017-10-13

Comprehensive, rapid, and accurate identification of patients with asthma for clinical care and engagement in research efforts is needed. The original development and validation of a computable phenotype for asthma case identification occurred at a single institution in Chicago and demonstrated excellent test characteristics. However, its application in a diverse payer mix, across different health systems and multiple electronic health record vendors, and in both children and adults was not examined. The objective of this study is to externally validate the computable phenotype across diverse Chicago institutions to accurately identify pediatric and adult patients with asthma. A cohort of 900 asthma and control patients was identified from the electronic health record between January 1, 2012 and November 30, 2014. Two physicians at each site independently reviewed the patient chart to annotate cases. The inter-observer reliability between the physician reviewers had a κ-coefficient of 0.95 (95% CI 0.93-0.97). The accuracy, sensitivity, specificity, negative predictive value, and positive predictive value of the computable phenotype were all above 94% in the full cohort. The excellent positive and negative predictive values in this multi-center external validation study establish a useful tool to identify asthma cases in the electronic health record for research and care. This computable phenotype could be used in large-scale comparative-effectiveness trials.
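The record above reports accuracy, sensitivity, specificity, predictive values and an inter-observer κ. The following Python sketch shows one common way these quantities are derived from a 2x2 table and from two reviewers' labels. The function names and all counts are hypothetical and chosen for illustration; they are not the study's data.

```python
from collections import Counter


def diagnostic_metrics(tp, fp, fn, tn):
    """Standard test characteristics from 2x2 confusion-table counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters beyond chance."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement from each rater's marginal label frequencies.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical counts: phenotype result versus chart-review reference.
print(diagnostic_metrics(tp=90, fp=5, fn=10, tn=95))
# Hypothetical labels from two chart reviewers (1 = asthma, 0 = control).
print(cohen_kappa([1, 1, 0, 0], [1, 1, 0, 0]))
```

Predictive values depend on the case mix of the cohort, so figures obtained from a sample enriched with cases need not carry over to a population with a different prevalence.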
Examination of the validity and reliability of the French version of the Brief Self-Control Scale

PubMed Central

Brevers, Damien; Foucart, Jennifer; Verbanck, Paul; Turel, Ofir

2017-01-01

This study aims to develop and to validate a French version of the Brief Self-Control Scale (BSCS; Tangney et al., 2004). This instrument is usually applied as a unidimensional self-report measure for assessing trait self-control, which captures one’s dispositional ability to resist short-term temptation in order to reach more valuable long-term goals. Data were collected from two independent samples of French-speaking individuals (n1 = 287; n2 = 160). Results indicated that the French version of the BSCS can be treated as unidimensional, like the original questionnaire. Data also showed consistent acceptable reliability and reasonable test-retest stability. Acceptable external validity of constructs was supported by relationships with self-reported measures of impulsivity (UPPS), including urgency, lack of premeditation, and lack of perseverance. Overall, the findings suggest that the average score of the French version of the BSCS is a viable option for assessing trait self-control in French speaking populations. PMID:29200467
Examination of the validity and reliability of the French version of the Brief Self-Control Scale.

PubMed

Brevers, Damien; Foucart, Jennifer; Verbanck, Paul; Turel, Ofir

2017-10-01

This study aims to develop and to validate a French version of the Brief Self-Control Scale (BSCS; Tangney et al., 2004). This instrument is usually applied as a unidimensional self-report measure for assessing trait self-control, which captures one's dispositional ability to resist short-term temptation in order to reach more valuable long-term goals. Data were collected from two independent samples of French-speaking individuals (n1 = 287; n2 = 160). Results indicated that the French version of the BSCS can be treated as unidimensional, like the original questionnaire. Data also showed consistent acceptable reliability and reasonable test-retest stability. Acceptable external validity of constructs was supported by relationships with self-reported measures of impulsivity (UPPS), including urgency, lack of premeditation, and lack of perseverance. Overall, the findings suggest that the average score of the French version of the BSCS is a viable option for assessing trait self-control in French speaking populations.
Predicting the need for institutional care shortly after admission to rehabilitation: Rasch analysis and predictive validity of the BRASS Index.

PubMed

Panella, L; La Porta, F; Caselli, S; Marchisio, S; Tennant, A

2012-09-01

Effective discharge planning is increasingly recognised as a critical component of hospital-based Rehabilitation. The BRASS index is a risk screening tool for identification, shortly after hospital admission, of patients who are at risk of post-discharge problems. To evaluate the internal construct validity and reliability of the Blaylock Risk Assessment Screening Score (BRASS) within the rehabilitation setting. Observational prospective study. Rehabilitation ward of an Italian district hospital. One hundred and four consecutively admitted patients. Using classical psychometric methods and Rasch analysis (RA), the internal construct validity and reliability of the BRASS were examined. Also, external and predictive validity of the Rasch-modified BRASS (RMB) score were determined. Reliability of the original BRASS was low (Cronbach's alpha=0.595) and factor analyses showed that it was clearly multidimensional. A RA, based on a reduced 7-BRASS item set (RMB), satisfied the model's expectations. Reliability was 0.777. The RMB scores strongly correlated with the original BRASS (rho=0.952; P<0.000) and with FIM™ admission scores (rho=-0.853; P<0.000). A RMB score of 12 was associated with an increased risk of nursing home admission (RR=2.1, 95%CI=1.7-2.5), whereas a score of 17 was associated with a higher risk of length of stay >28 days (RR=7.6, 95%CI=1.8-31.9). This study demonstrated that the original BRASS was multidimensional and unreliable. However, the RMB holds adequate internal construct validity and is sufficiently reliable as a predictor of discharge problems for group, but not individual use. The application of tools and methods (such as the BRASS Index) developed under the biomedical paradigm in a Physical and Rehabilitation Medicine setting may have limitations. Further research is needed to develop, within the rehabilitation setting, a valid measuring tool of risk of post-discharge problems at the individual level.
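The record above quotes relative risks with 95% confidence intervals. As a hedged sketch only, the Python below computes a relative risk from a 2x2 table with a confidence interval built on the log scale, which is one widely used approach; the abstract does not say which method the authors used. The function name and the counts are hypothetical.

```python
import math


def relative_risk(a, b, c, d, z=1.96):
    """Relative risk with an approximate 95% CI (log method).

    a, b: exposed group with and without the outcome.
    c, d: unexposed group with and without the outcome.
    z: normal quantile; 1.96 gives an approximate 95% interval.
    """
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    rr = risk_exposed / risk_unexposed
    # Standard error of log(RR).
    se = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    lower = math.exp(math.log(rr) - z * se)
    upper = math.exp(math.log(rr) + z * se)
    return rr, lower, upper


# Hypothetical counts: 20 of 100 high-score patients versus
# 10 of 100 low-score patients experiencing the outcome.
rr, lower, upper = relative_risk(20, 80, 10, 90)
print(round(rr, 2), round(lower, 2), round(upper, 2))
```

An interval whose lower bound exceeds 1, as in both relative risks quoted above, indicates an association that is unlikely to be explained by sampling variation alone.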

The SATISPSY-22: development and validation of a French hospitalized patients' satisfaction questionnaire in psychiatry.

PubMed

Zendjidjian, X Y; Auquier, P; Lançon, C; Loundou, A; Parola, N; Faugère, M; Boyer, L

2015-01-01

The aim of our study was to develop a specific French self-administered instrument for measuring hospitalized patients' satisfaction in psychiatry based on exclusive patient point of view: the SATISPSY-22. The development of the SATISPSY was undertaken in three steps: item generation, item reduction, and validation. The content of the SATISPSY was derived from 80 interviews with patients hospitalized in psychiatry. Using item response and classical test theories, item reduction was performed in 2 hospitals on 270 responders. The validation was based on construct validity, reliability, and some aspects of external validity. The SATISPSY contains 22 items describing 6 dimensions (staff, quality of care, personal experience, information, activity, and food). The six-factor structure accounted for 78.0% of the total variance. Each item achieved the 0.40 standard for item-internal consistency, and the Cronbach's alpha coefficients were > 0.70. Scores of dimensions were strongly positively correlated with Visual Analogue Scale scores. Significant associations with socioeconomic and clinical indicators showed good discriminant and external validity. INFIT statistics ranged from 0.71 to 1.25. The SATISPSY-22 presents satisfactory psychometric properties, enabling patient feedback to be incorporated in a continuous quality health care improvement strategy. Copyright © 2014 Elsevier Masson SAS. All rights reserved.
Design and validation of a self-administered test to assess bullying (bull-M) in high school Mexicans: a pilot study.

PubMed

Ramos-Jimenez, Arnulfo; Wall-Medrano, Abraham; Villar, Oscar Esparza-Del; Hernández-Torres, Rosa P

2013-04-11

Bullying (Bull) is a public health problem worldwide, and Mexico is not exempt. However, its epidemiology and early detection in our country is limited, in part, by the lack of validated tests to ensure the respondents' anonymity. The aim of this study was to validate a self-administered test (Bull-M) for assessing Bull among high-school Mexicans. Experts and school teachers from highly violent areas of Ciudad Juarez (Chihuahua, México) reported common Bull behaviors. Then, a 10-item test was developed based on twelve of these behaviors; the students' and peers' participation in Bull acts and in some somatic consequences in Bull victims with a 5-point Likert frequency scale. Validation criteria were: content (CV, judges); reliability [Cronbach's alpha (CA), test-retest (Spearman correlation, rs)]; construct [principal component (PCA), confirmatory factor (CFA), goodness-of-fit (GF) analysis]; and convergent (Bull-M vs. Bull-S test) validity. Bull-M showed good reliability (CA = 0.75, rs = 0.91; p < 0.001). Two factors were identified (PCA) and confirmed (CFA): "bullying me (victim)" and "bullying others (aggressor)". GF indices were: Root mean square error of approximation (0.031), GF index (0.97), and normalized fit index (0.92). Bull-M was as good as Bull-S for measuring Bull prevalence. Bull-M has good reliability and convergent validity and a bi-modal factor structure for detecting Bull victims and aggressors; however, its external validity and sensitivity should be analyzed on a wider and different population.
A tool for sexual minority mental health research: The Patient Health Questionnaire (PHQ-9) as a depressive symptom severity measure for sexual minority women in Viet Nam

PubMed Central

Nguyen, Trang Quynh; Bandeen-Roche, Karen; Bass, Judith K; German, Danielle; Nguyen, Nam Thi Thu; Knowlton, Amy R

2016-01-01

In a context with limited attention to mental health and prevalent sexual prejudice, valid measurements are a key first step to understanding the psychological suffering of sexual minority populations. We adapted the Patient Health Questionnaire as a depressive symptom severity measure for Vietnamese sexual minority women, ensuring its cultural relevance and suitability for internet-based research. Psychometric evaluation found that the scale is mostly unidimensional and has good convergent validity, good external construct validity, and excellent reliability. The sample’s high endorsement of scale items emphasizes the need to study minority stress and mental health in this population. PMID:27642381
A tool for sexual minority mental health research: The Patient Health Questionnaire (PHQ-9) as a depressive symptom severity measure for sexual minority women in Viet Nam.

PubMed

Nguyen, Trang Quynh; Bandeen-Roche, Karen; Bass, Judith K; German, Danielle; Nguyen, Nam Thi Thu; Knowlton, Amy R

In a context with limited attention to mental health and prevalent sexual prejudice, valid measurements are a key first step to understanding the psychological suffering of sexual minority populations. We adapted the Patient Health Questionnaire as a depressive symptom severity measure for Vietnamese sexual minority women, ensuring its cultural relevance and suitability for internet-based research. Psychometric evaluation found that the scale is mostly unidimensional and has good convergent validity, good external construct validity, and excellent reliability. The sample's high endorsement of scale items emphasizes the need to study minority stress and mental health in this population.
A Methodological Approach to Small Area Estimation for the Behavioral Risk Factor Surveillance System

PubMed Central

Xu, Fang; Wallace, Robyn C.; Garvin, William; Greenlund, Kurt J.; Bartoli, William; Ford, Derek; Eke, Paul; Town, G. Machell

2016-01-01

Public health researchers have used a class of statistical methods to calculate prevalence estimates for small geographic areas with few direct observations. Many researchers have used Behavioral Risk Factor Surveillance System (BRFSS) data as a basis for their models. The aims of this study were to 1) describe a new BRFSS small area estimation (SAE) method and 2) investigate the internal and external validity of the BRFSS SAEs it produced. The BRFSS SAE method uses 4 data sets (the BRFSS, the American Community Survey Public Use Microdata Sample, Nielsen Claritas population totals, and the Missouri Census Geographic Equivalency File) to build a single weighted data set. Our findings indicate that internal and external validity tests were successful across many estimates. The BRFSS SAE method is one of several methods that can be used to produce reliable prevalence estimates in small geographic areas. PMID:27418213
Validity, reliability and responsiveness of the EQ-5D in German stroke patients undergoing rehabilitation.

PubMed

Hunger, Matthias; Sabariego, Carla; Stollenwerk, Björn; Cieza, Alarcos; Leidl, Reiner

2012-09-01

To analyse the psychometric properties of the EQ-5D in German stroke survivors undergoing neurological rehabilitation. The EQ-5D, the Hospital Anxiety and Depression Scale (HADS) and the Stroke Impact Scale (SIS) were completed before (210 subjects) and after (183 subjects) a patient education programme in seven rehabilitation clinics in Bavaria, Germany. A postal follow-up was conducted after 6 months. Acceptance, validity, reliability and responsiveness of the EQ-5D were tested. The SIS subscales were used as external anchors to classify the patients into change groups between the measurements. The proportion of missing answers ranged from 4.7 to 8.6%. Between 16 and 19% reported no problems in any EQ-5D dimension. At baseline, correlations between EQ-5D index and the SIS subscales ranged from 0.15 (communication) to 0.60 (mobility). Correlations with the EQ VAS were slightly smaller. All scores were reliable in test-retest with intraclass correlations ranging from 0.67 to 0.81. EQ-5D index and EQ VAS were consistently responsive only to improvements in health, showing small to medium effect sizes (0.27-0.42). The EQ-5D has shown reasonable validity, reliability and more limited responsiveness in stroke patients with mild to moderate limitations of functional status, allowing it to be used in clinical trials in rehabilitation.
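Responsiveness in the record above is summarised as an effect size. The abstract does not state which definition was used, so the Python sketch below assumes one common choice, the standardized effect size (mean change divided by the baseline standard deviation). The function name and the scores are hypothetical.

```python
from statistics import mean, stdev


def standardized_effect_size(baseline, follow_up):
    """Mean change divided by the SD of baseline scores.

    baseline, follow_up: paired scores for the same patients, in the
    same order (an assumed data layout for this illustration).
    """
    changes = [after - before for before, after in zip(baseline, follow_up)]
    return mean(changes) / stdev(baseline)


# Hypothetical index scores before and after rehabilitation.
before = [1, 2, 3, 4, 5]
after = [2, 3, 4, 5, 6]
print(round(standardized_effect_size(before, after), 3))
```

Under the usual rule of thumb, values near 0.2 are read as small and values near 0.5 as medium, which matches the "small to medium" description of the 0.27-0.42 range above.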
SEQUenCE: a service user-centred quality of care instrument for mental health services.

PubMed

Hester, Lorraine; O'Doherty, Lorna Jane; Schnittger, Rebecca; Skelly, Niamh; O'Donnell, Muireann; Butterly, Lisa; Browne, Robert; Frorath, Charlotte; Morgan, Craig; McLoughlin, Declan M; Fearon, Paul

2015-08-01

To develop a quality of care instrument that is grounded in the service user perspective and validate it in a mental health service. The instrument (SEQUenCE (SErvice user QUality of CarE)) was developed through analysis of focus group data and clinical practice guidelines, and refined through field-testing and psychometric analyses. All participants were attending an independent mental health service in Ireland. Participants had a diagnosis of bipolar affective disorder (BPAD) or a psychotic disorder. Twenty-nine service users participated in six focus group interviews. Seventy-one service users participated in field-testing: 10 judged the face validity of an initial 61-item instrument; 28 completed a revised 52-item instrument from which 12 items were removed following test-retest and convergent validity analyses; 33 completed the resulting 40-item instrument. The main outcome measures were the test-retest reliability, internal consistency and convergent validity of the instrument. The final instrument showed acceptable test-retest reliability at 5-7 days (r = 0.65; P < 0.001), good convergent validity with the Verona Service Satisfaction Scale (r = 0.84, P < 0.001) and good internal consistency (Cronbach's alpha = 0.87). SEQUenCE is a valid, reliable scale that is grounded in the service user perspective and suitable for routine use. It may serve as a useful tool in individual care planning, service evaluation and research. The instrument was developed and validated with service users with a diagnosis of either BPAD or a psychotic disorder; it does not yet have established external validity for other diagnostic groups. © The Author 2015. Published by Oxford University Press in association with the International Society for Quality in Health Care; all rights reserved.
Evaluating the spoken English proficiency of graduates of foreign medical schools.

PubMed

Boulet, J R; van Zanten, M; McKinley, D W; Gary, N E

2001-08-01

The purpose of this study was to gather additional evidence for the validity and reliability of spoken English proficiency ratings provided by trained standardized patients (SPs) in a high-stakes clinical skills examination. Over 2500 candidates who took the Educational Commission for Foreign Medical Graduates' (ECFMG) Clinical Skills Assessment (CSA) were studied. The CSA consists of 10 or 11 timed clinical encounters. Standardized patients evaluate spoken English proficiency and interpersonal skills in every encounter. Generalizability theory was used to estimate the consistency of spoken English ratings. Validity coefficients were calculated by correlating summary English ratings with CSA scores and other external criterion measures. Mean spoken English ratings were also compared by various candidate background variables. The reliability of the spoken English ratings, based on 10 independent evaluations, was high. The magnitudes of the associated variance components indicated that the evaluation of a candidate's spoken English proficiency is unlikely to be affected by the choice of cases or SPs used in a given assessment. Proficiency in spoken English was related to native language (English versus other) and scores from the Test of English as a Foreign Language (TOEFL). The pattern of the relationships, both within assessment components and with external criterion measures, suggests that valid measures of spoken English proficiency are obtained. This result, combined with the high reproducibility of the ratings over encounters and SPs, supports the use of trained SPs to measure spoken English skills in a simulated medical environment.
Quality of life and psychological health indicators in the national social life, health, and aging project.

PubMed

Shiovitz-Ezra, Sharon; Leitsch, Sara; Graber, Jessica; Karraker, Amelia

2009-11-01

The National Social Life, Health, and Aging Project (NSHAP) measures seven indicators of quality of life (QoL) and psychological health. The measures used for happiness, self-esteem, depression, and loneliness are well established in the literature. Conversely, measures of anxiety, stress, and self-reported emotional health were modified for their use in this unique project. The purpose of this paper is to provide (a) an overview of NSHAP's QoL assessment and (b) evidence for the adequacy of the modified measures. First, we examined the psychometric properties of the modified measures. Second, the established QoL measures were used to examine the concurrent validity of the modified measures. Finally, gender- and age-group differences were examined for each modified measure. The anxiety index exhibited good internal reliability and concurrent validity. Consistent with the literature, a single-factor structure best fit the data. Stress was satisfactory in terms of concurrent validity but with only fair internal consistency. Self-reported emotional health exhibited good concurrent validity and moderate external validity. The modified indices used in NSHAP tended to exhibit good internal reliability and concurrent validity. These measures can confidently be used in the exploration of QoL and psychological health in later life and its many correlates.
Adaptation and Validation of a Chinese Version of Patient Health Engagement Scale for Patients with Chronic Disease.

PubMed

Zhang, Yaying; Graffigna, Guendalina; Bonanomi, Andrea; Choi, Kai-Chow; Barello, Serena; Mao, Pan; Feng, Hui

2017-01-01

The Patient Health Engagement Scale (PHE-s) was designed to assess the emotional and psychological attitudes of patients' engagement along their healthcare management journey. The aim of this study was to validate a culturally adapted Chinese version of the PHE-s (CPHE-s). Three hundred and seventy-seven patients with chronic disease were recruited from eight community health centers in Hunan Province, China. The original Italian PHE-s was translated into Mandarin Chinese using a standardized forward-backward translation. Rasch analysis indicated unidimensionality and good item fit for the PHE-s. The internal consistency was 0.89 and the weighted Kappa coefficients of the items (test-retest reliability) ranged from 0.52 to 0.79. Both principal component analysis and confirmatory factor analysis supported a single-factor structure of the PHE-s. In testing the external validity, the PHE-s showed a significant moderate correlation with patient activation but not with medicine adherence behavior, which requires further exploration. The results suggest that the PHE-s is a reliable and valid instrument to assess the level of patient engagement in his or her own health management among chronic patients in China. Reliability and validity should be further assessed in other patient cohorts in China, and future work should test changes after patient engagement interventions and explore their clinical relevance.
Validity of selected physical activity questions in white Seventh-day Adventists and non-Adventists.

PubMed

Singh, P N; Tonstad, S; Abbey, D E; Fraser, G E

1996-08-01

The validity and reliability of selected physical activity questions were assessed in both Seventh-day Adventist (N = 131) and non-Adventist (N = 101) study groups. Vigorous activity questions similar to those used by others and new questions that measured moderate and light activities were included. Validation was external, comparing questionnaire data with treadmill exercise time, resting heart rate, and body mass index (kg.m-2), and internal, comparing data with other similar questions. Both Adventist and non-Adventist males showed significant age-adjusted correlations between treadmill time and a "Run-Walk-Jog Index" (R = 0.28, R = 0.48, respectively). These correlations increased substantially when restricting analysis to exercise speeds exceeding 3 mph (R = 0.39, R = 0.71, respectively). Frequency of sweating and a vigorous physical activity index also correlated significantly with treadmill time in males. Correlations were generally weaker in females. Moderate- and light-intensity questions were not correlated with physical fitness. Internal correlations R = 0.50-0.78) between the above three vigorous activity questions were significant in all groups, and correlations (R = 0.14-0.60) for light and moderate activity questions were also documented. Test-retest reliability coefficients were high for vigorous activity questions (R = 0.48-0.85) and for one set of moderate activity questions (R = 0.43-0.75). No important differences in validity and reliability were found between Adventist and non-Adventists, but the validity of vigorous activity measures was generally weaker in females.
Adaptation and Validation of a Chinese Version of Patient Health Engagement Scale for Patients with Chronic Disease

PubMed Central

Zhang, Yaying; Graffigna, Guendalina; Bonanomi, Andrea; Choi, Kai-chow; Barello, Serena; Mao, Pan; Feng, Hui

2017-01-01

The Patient Health Engagement Scale (PHE-s) was designed to assess the emotional and psychological attitudes of patients' engagement along their healthcare management journey. The aim of this study was to validate a culturally adapted Chinese version of the PHE-s (CPHE-s). Three hundred and seventy-seven patients with chronic disease were recruited from eight community health centers in Hunan Province, China. The original Italian PHE-s was translated into Mandarin Chinese using a standardized forward–backward translation. Rasch analysis indicated unidimensionality and good item fit for the PHE-s. The internal consistency was 0.89 and the weighted Kappa coefficients of the items (test–retest reliability) ranged from 0.52 to 0.79. Both principal component analysis and confirmatory factor analysis supported a single-factor structure of the PHE-s. In testing the external validity, the PHE-s showed a significant moderate correlation with patient activation but not with medicine adherence behavior, which requires further exploration. The results suggest that the PHE-s is a reliable and valid instrument to assess the level of patient engagement in his or her own health management among chronic patients in China. Reliability and validity should be further assessed in other patient cohorts in China, and future work should test changes after patient engagement interventions and explore their clinical relevance. PMID:28220090
Reliability and validity of the German version of the Structured Interview of Personality Organization (STIPO)

PubMed Central

2013-01-01

Background The assessment of personality organization and its observable behavioral manifestations, i.e. personality functioning, has a long tradition in psychodynamic psychiatry. Recently, the DSM-5 Levels of Personality Functioning Scale has moved it into the focus of psychiatric diagnostics. Based on Kernberg’s concept of personality organization the Structured Interview of Personality Organization (STIPO) was developed for diagnosing personality functioning. The STIPO covers seven dimensions: (1) identity, (2) object relations, (3) primitive defenses, (4) coping/rigidity, (5) aggression, (6) moral values, and (7) reality testing and perceptual distortions. The English version of the STIPO has previously shown satisfactory psychometric properties. Methods Validity and reliability of the German version of the 100-item instrument have been evaluated in 122 psychiatric patients. All patients were diagnosed according to the Diagnostic and Statistical Manual for Mental Disorders (DSM-IV) and were assessed by means of the STIPO. Moreover, all patients completed eight questionnaires that served as criteria for external validity of the STIPO. Results Interrater reliability varied between intraclass correlations of .89 and 1.0; Cronbach’s α for the seven dimensions was .69 to .93. All a priori selected questionnaire scales correlated significantly with the corresponding STIPO dimensions. Patients with personality disorder (PD) revealed significantly higher STIPO scores (i.e. worse personality functioning) than patients without PD; patients with cluster B PD showed significantly higher STIPO scores than patients with cluster C PD. Conclusions Interrater reliability, Cronbach’s α, concurrent validity, and differential validity of the STIPO are satisfactory. The STIPO represents an appropriate instrument for the assessment of personality functioning in clinical and research settings. PMID:23941404
Validation of the Arabic Version of the Infant Feeding Intentions Scale Among Lebanese Women.

PubMed

Yehya, Nadine; Tamim, Hani; Shamsedine, Lama; Ayash, Soumaya; Abdel Khalek, Lama; Abou Ezzi, Amanda; Nabulsi, Mona

2017-05-01

The Infant Feeding Intentions (IFI) scale was shown to reliably measure maternal intentions to initiate breastfeeding and continue exclusive breastfeeding until 1, 3, or 6 months in English and Spanish but not in Arab contexts. Research aim: This study aimed to validate an Arabic version of the IFI scale (IFI-A) and examine its ability to predict exclusive breastfeeding at 1, 3, or 6 months in pregnant Lebanese women. The internal consistency reliability and construct validity of the IFI-A scale were tested on 50 pregnant women (Group 1), whereas its predictive ability was tested on 196 pregnant women (Group 2), who were surveyed monthly about their infants' nutrition method until 6 months. The IFI-A scale's Cronbach's alpha internal consistency reliability was .82. Its corrected item-total correlations ranged from .26 for Item 2 ("at least give breastfeeding a try") to .86 for Item 4 ("will be exclusively breastfeeding at 3 months"). Exploratory factor analysis revealed that it is unidimensional. IFI-A scores correlated significantly with exclusive breastfeeding duration in Group 1 (r = .624; p = .001) and with participants' breastfeeding attitude (r = .390; p < .001) and previous breastfeeding duration (r = .237; p = .011) in Group 2, thus confirming its external construct validity. In adjusted analysis, the IFI-A scale predicted exclusive breastfeeding at 3 months, albeit weakly (odds ratio = 1.16; 95% confidence interval [0.99, 1.36]), but not at 1 or 6 months. The IFI-A scale is a reliable and valid tool to assess maternal feeding intentions and predict exclusive breastfeeding at 3 months in the Arab context. Further studies are needed in other Arab contexts to confirm our findings.
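Several records in this listing report Cronbach's alpha and corrected item-total correlations without showing how they are obtained. The following is a minimal sketch of the standard textbook formulas, not the analysis code of any study above; the function names and the response matrix are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import mean, variance


def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def cronbach_alpha(rows):
    """Cronbach's alpha; rows holds one list of k item scores per respondent."""
    k = len(rows[0])
    # Sample variance of each item across respondents
    item_vars = [variance([r[i] for r in rows]) for i in range(k)]
    # Sample variance of the total (summed) score
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def corrected_item_total(rows, item):
    """Correlation of one item with the sum of the remaining items."""
    x = [r[item] for r in rows]
    rest = [sum(r) - r[item] for r in rows]
    return pearson(x, rest)


# Hypothetical data: 5 respondents answering a 4-item scale
responses = [
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [5, 5, 4, 4],
    [1, 2, 1, 2],
    [3, 3, 3, 4],
]

print("alpha:", round(cronbach_alpha(responses), 3))
for i in range(4):
    print("item", i + 1, "r:", round(corrected_item_total(responses, i), 3))
```

The item-total correlation is "corrected" because the item itself is left out of the total it is correlated with, which avoids inflating the coefficient.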
Recommendations for the Definition of Clinical Responder in Insulin Preservation Studies

PubMed Central

Gitelman, Stephen E.; Palmer, Jerry P.

2014-01-01

Clinical responder studies should contribute to the translation of effective treatments and interventions to the clinic. Since ultimately this translation will involve regulatory approval, we recommend that clinical trials prespecify a responder definition that can be assessed against the requirements and suggestions of regulatory agencies. In this article, we propose a clinical responder definition to specifically assist researchers and regulatory agencies in interpreting the clinical importance of statistically significant findings for studies of interventions intended to preserve β-cell function in newly diagnosed type 1 diabetes. We focus on studies of 6-month β-cell preservation in type 1 diabetes as measured by 2-h–stimulated C-peptide. We introduce criteria (bias, reliability, and external validity) for the assessment of responder definitions to ensure they meet U.S. Food and Drug Administration and European Medicines Agency guidelines. Using data from several published TrialNet studies, we evaluate our definition (no decrease in C-peptide) against published alternatives and determine that our definition has minimum bias with external validity. We observe that reliability could be improved by using changes in C-peptide later than 6 months beyond baseline. In sum, to support efficacy claims of β-cell preservation therapies in type 1 diabetes submitted to U.S. and European regulatory agencies, we recommend use of our definition. PMID:24722251
MBS Measurement Tool for Swallow Impairment—MBSImp: Establishing a Standard

PubMed Central

Martin-Harris, Bonnie; Brodsky, Martin B.; Michel, Yvonne; Castell, Donald O.; Schleicher, Melanie; Sandidge, John; Maxwell, Rebekah; Blair, Julie

2014-01-01

The aim of this study was to test reliability, content, construct, and external validity of a new modified barium swallowing study (MBSS) tool (MBSImp) that is used to quantify swallowing impairment. Multiple regression, confirmatory factor, and correlation analyses were used to analyze 300 in- and outpatients with heterogeneous medical and surgical diagnoses who were sequentially referred for MBS exams at a university medical center and private tertiary care community hospital. Main outcome measures were the MBSImp and index scores of aspiration, health status, and quality of life. Inter- and intrarater concordance were 80% or greater for blinded scoring of MBSSs. Regression analysis revealed contributions of eight of nine swallow types to impressions of overall swallowing impairment (p ≤ 0.05). Factor analysis revealed 13 significant components (loadings ≥ 0.5) that formed two impairment groupings (oral and pharyngeal). Significant correlations were found between Oral and Pharyngeal Impairment scores and Penetration-Aspiration Scale scores, and indexes of intake status, nutrition, health status, and quality of life. The MBSImp demonstrated clinical practicality, favorable inter- and intrarater reliability following standardized training, content, and external validity. This study reflects potential for establishment of a new standard for quantification and comparison of oropharyngeal swallowing impairment across patient diagnoses as measured on MBSS. PMID:18855050
Italian validation of the Amsterdam Preoperative Anxiety and Information Scale.

PubMed

Buonanno, Pasquale; Laiola, Anna; Palumbo, Chiara; Spinelli, Gianmario; Terminiello, Virginia; Servillo, Giuseppe

2017-07-01

Preoperative anxiety is usually experienced by patients awaiting surgical procedures and it can negatively impact patient outcomes. The Amsterdam Preoperative Anxiety and Information Scale (APAIS) is a questionnaire created to identify anxious patients and their need for information: it has been translated and validated in many languages because of its reliability and ease of completion. To date, no Italian version of the APAIS has been produced; our aim was to translate and validate the APAIS in Italian. We produced an Italian version of the APAIS and we administered it to 110 patients undergoing elective surgery; we explored its structure by factor analysis and its reliability by Cronbach's alpha. We analyzed its external validity by comparing it with Spielberger's State-Trait Anxiety Inventory (STAI). Sensitivity, specificity, and positive and negative predictive values of the Italian version of the APAIS were determined. The Italian version of the APAIS confirmed the original structure of the questionnaire and its internal consistency; it correlated well with the STAI-Y1, the STAI subscale that explores "state" anxiety. An APAIS score of 14 was found to be the best cutoff for distinguishing anxious from non-anxious patients. The Italian translation of the APAIS showed psychometric properties similar to the original version. Its reliability and its efficiency make it a powerful tool for detecting anxiety and need for information in the Italian population as well.
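As an illustration of how sensitivity, specificity and predictive values follow from a score cutoff, here is a minimal sketch. It assumes that a score at or above the cutoff counts as a positive screen (the abstract does not say whether the cutoff is inclusive), and the scores and reference labels are invented for the example; they are not data from the study.

```python
def diagnostic_accuracy(scores, reference_positive, cutoff):
    """Sensitivity, specificity, PPV and NPV for the rule score >= cutoff.

    Assumes every cell of the 2x2 table is populated; an empty row or
    column would raise ZeroDivisionError.
    """
    tp = fp = tn = fn = 0
    for score, truth in zip(scores, reference_positive):
        screened_positive = score >= cutoff  # inclusive cutoff is an assumption
        if screened_positive and truth:
            tp += 1
        elif screened_positive and not truth:
            fp += 1
        elif not screened_positive and truth:
            fn += 1
        else:
            tn += 1
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Hypothetical questionnaire scores and reference-standard labels
scores = [10, 12, 14, 15, 18, 9, 16, 13]
anxious = [False, False, True, True, True, False, False, True]

result = diagnostic_accuracy(scores, anxious, cutoff=14)
print(result)
```

A "best" cutoff is typically chosen by repeating this calculation over candidate cutoffs and comparing the resulting trade-off between sensitivity and specificity.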
Family burden in inherited ichthyosis: creation of a specific questionnaire.

PubMed

Dufresne, Hélène; Hadj-Rabia, Smail; Méni, Cécile; Sibaud, Vincent; Bodemer, Christine; Taïeb, Charles

2013-02-15

The concept of individual burden, associated with disease, has been introduced recently to determine the "disability" caused by the pathology in the broadest sense of the word (psychological, social, economic, physical). Inherited ichthyoses belong to a large heterogeneous group of Mendelian Disorders of Cornification. Skin symptoms have a major impact on patients' Quality of Life but little is known about the burden of the disease on the families of patients. To develop and validate a specific burden questionnaire for the families of patients affected by ichthyosis. Two steps were required. First, the creation of the questionnaire, which followed a strict methodological process involving a multidisciplinary team and families. Second, the validation of the questionnaire, including the assessment of its reliability, external validity, reproducibility and sensitivity, was carried out on a population of patients affected by autosomal recessive congenital ichthyosis. A population of parents of patients affected by ichthyosis was enrolled to answer the new questionnaire in association with the Short Form 12 questionnaire (SF-12) and a clinical severity score was completed for each patient. Ninety-four families were interviewed to construct the verbatim in order to create the questionnaire and a cognitive debriefing was carried out. The concept of burden could be structured around five components: "economic", "daily life", "familial and personal relationship", "work", and "psychological impact". As a result, the reproducible 25-item "Family Burden Ichthyosis" (FBI) questionnaire was created. Forty-two questionnaires were analyzable for psychometric validation. Reliability (Cronbach's alpha coefficient = 0.89) reflected the good homogeneity of the questionnaire. The correlation between mental dimensions of the SF-12 and the FBI questionnaire was statistically significant, which confirmed the external validity. The mean FBI score was 71.7 ± 18.8 and a significant difference in the FBI score was shown between two groups of severity, underlining a good sensitivity of the questionnaire. The internal and external validity of the "FBI" questionnaire was confirmed and it is correlated to the severity of ichthyosis. Ichthyoses, and other chronic pathologies, are difficult to assess by clinical or Quality of Life aspects alone as their impact can be multidimensional. "FBI" takes them all into consideration in order to explain every angle of the handicap generated.
Validation of two complementary oral-health related quality of life indicators (OIDP and OSS 0-10) in two qualitatively distinct samples of the Spanish population

PubMed Central

Montero, J; Bravo, M; Albaladejo, A

2008-01-01

Background Oral health-related quality of life can be assessed positively, by measuring satisfaction with mouth, or negatively, by measuring oral impact on the performance of daily activities. The study objective was to validate two complementary indicators, i.e., the OIDP (Oral Impacts on Daily Performances) and Oral Satisfaction 0–10 Scale (OSS), in two qualitatively different socio-demographic samples of the Spanish adult population, and to analyse the factors affecting both perspectives of well-being. Methods A cross-sectional study was performed, recruiting a Validation Sample from randomly selected Health Centres in Granada (Spain), representing the general population (n = 253), and a Working Sample (n = 561) randomly selected from active Regional Government staff, i.e., representing the more privileged end of the socio-demographic spectrum of this reference population. All participants were examined according to WHO methodology and completed an in-person interview on their oral impacts and oral satisfaction using the OIDP and OSS 0–10 respectively. The reliability and validity of the two indicators were assessed. An alternative method of describing the causes of oral impacts is presented. Results The reliability coefficient (Cronbach's alpha) of the OIDP was above the recommended 0.7 threshold in both Validation and Occupational samples (0.79 and 0.71 respectively). Test-retest analysis confirmed the external reliability of the OSS (Intraclass Correlation Coefficient, 0.89; p < 0.001). Some subjective factors (perceived need for dental treatment, complaints about mouth and intermediate impacts) were strongly associated with both indicators, supporting their construct and criterion validity. The main cause of oral impact was dental pain. Several socio-demographic, behavioural and clinical variables were identified as modulating factors. Conclusion OIDP and OSS are valid and reliable subjective measures of oral impacts and oral satisfaction, respectively, in an adult Spanish population. Exploring these issues simultaneously may provide useful insights into how satisfaction and impact on well-being are constructed. PMID:19019208
[Reliability and validity of the Japanese version of the Thinking Style Inventory].

PubMed

Ochiai, Jun; Maie, Yuko; Wada, Yuichi

2016-06-01

This study examined the internal and external validity of the Japanese version of the Thinking Styles Inventory (TSI: Hiruma, 2000), which was originally developed by Sternberg and Wagner (1991) based on the framework of Sternberg's (1988) theory of mental self-government. The term "thinking style" refers to the concept that individuals differ in how they organize, direct, and manage their own thinking activities. We administered the Japanese version of the TSI to Japanese participants (N = 655: Age range 20-84 years). The results of item analysis, reliability analysis, and factor analysis, were consistent with the general ideas of the theory. In addition, there were significant relationships between certain thinking styles and 3 participant characteristics: age, gender, and working arrangement. Furthermore, some thinking styles were positively correlated with social skill. Implications of these results for the nature of Japanese thinking styles are discussed.

Classification systems for lower extremity amputation prediction in subjects with active diabetic foot ulcer: a systematic review and meta-analysis.

PubMed

Monteiro-Soares, M; Martins-Mendes, D; Vaz-Carneiro, A; Sampaio, S; Dinis-Ribeiro, M

2014-10-01

We systematically review the available systems used to classify diabetic foot ulcers in order to synthesize their methodological qualitative issues and accuracy to predict lower extremity amputation, as this may represent a critical point in these patients' care. Two investigators searched, in EBSCO, ISI, PubMed and SCOPUS databases, and independently selected studies published until May 2013 and reporting prognostic accuracy and/or reliability of specific systems for patients with diabetic foot ulcer in order to predict lower extremity amputation. We included 25 studies reporting a prevalence of lower extremity amputation between 6% and 78%. Eight different diabetic foot ulcer descriptions and seven prognostic stratification classification systems were addressed with a variable (1-9) number of factors included, especially peripheral arterial disease (n = 12) or infection at the ulcer site (n = 10) or ulcer depth (n = 10). The Meggitt-Wagner, S(AD)SAD and Texas University Classification systems were the most extensively validated, whereas ten classifications were derived or validated only once. Reliability was reported in a single study, and accuracy measures were reported in five studies with another eight allowing their calculation. Pooled accuracy ranged from 0.65 (for gangrene) to 0.74 (for infection). There are numerous classification systems for diabetic foot ulcer outcome prediction, but only a few studies evaluated their reliability or external validity. Studies rarely validated several systems simultaneously and only a few reported accuracy measures. Further studies assessing reliability and accuracy of the available systems and their composing variables are needed. Copyright © 2014 John Wiley & Sons, Ltd.
QSAR Modeling of Rat Acute Toxicity by Oral Exposure

PubMed Central

Zhu, Hao; Martin, Todd M.; Ye, Lin; Sedykh, Alexander; Young, Douglas M.; Tropsha, Alexander

2009-01-01

Few Quantitative Structure-Activity Relationship (QSAR) studies have successfully modeled large, diverse rodent toxicity endpoints. In this study, a comprehensive dataset of 7,385 compounds with their most conservative lethal dose (LD50) values has been compiled. A combinatorial QSAR approach has been employed to develop robust and predictive models of acute toxicity in rats caused by oral exposure to chemicals. To enable fair comparison between the predictive power of models generated in this study versus a commercial toxicity predictor, TOPKAT (Toxicity Prediction by Komputer Assisted Technology), a modeling subset of the entire dataset was selected that included all 3,472 compounds used in the TOPKAT’s training set. The remaining 3,913 compounds, which were not present in the TOPKAT training set, were used as the external validation set. QSAR models of five different types were developed for the modeling set. The prediction accuracy for the external validation set was estimated by determination coefficient R² of linear regression between actual and predicted LD50 values. The use of the applicability domain threshold implemented in most models generally improved the external prediction accuracy but expectedly led to the decrease in chemical space coverage; depending on the applicability domain threshold, R² ranged from 0.24 to 0.70. Ultimately, several consensus models were developed by averaging the predicted LD50 for every compound using all 5 models. The consensus models afforded higher prediction accuracy for the external validation dataset with the higher coverage as compared to individual constituent models. The validated consensus LD50 models developed in this study can be used as reliable computational predictors of in vivo acute toxicity. PMID:19845371
Quantitative structure-activity relationship modeling of rat acute toxicity by oral exposure.

PubMed

Zhu, Hao; Martin, Todd M; Ye, Lin; Sedykh, Alexander; Young, Douglas M; Tropsha, Alexander

2009-12-01

Few quantitative structure-activity relationship (QSAR) studies have successfully modeled large, diverse rodent toxicity end points. In this study, a comprehensive data set of 7385 compounds with their most conservative lethal dose (LD50) values has been compiled. A combinatorial QSAR approach has been employed to develop robust and predictive models of acute toxicity in rats caused by oral exposure to chemicals. To enable fair comparison between the predictive power of models generated in this study versus a commercial toxicity predictor, TOPKAT (Toxicity Prediction by Komputer Assisted Technology), a modeling subset of the entire data set was selected that included all 3472 compounds used in TOPKAT's training set. The remaining 3913 compounds, which were not present in the TOPKAT training set, were used as the external validation set. QSAR models of five different types were developed for the modeling set. The prediction accuracy for the external validation set was estimated by determination coefficient R² of linear regression between actual and predicted LD50 values. The use of the applicability domain threshold implemented in most models generally improved the external prediction accuracy but expectedly led to the decrease in chemical space coverage; depending on the applicability domain threshold, R² ranged from 0.24 to 0.70. Ultimately, several consensus models were developed by averaging the predicted LD50 for every compound using all five models. The consensus models afforded higher prediction accuracy for the external validation data set with the higher coverage as compared to individual constituent models. The validated consensus LD50 models developed in this study can be used as reliable computational predictors of in vivo acute toxicity.
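The consensus step and the external-validation metric described in the two records above can be sketched as follows. This is an illustration under simplifying assumptions, not the authors' implementation: every model is assumed to return a prediction for every compound (no applicability-domain filtering), and all values and names are hypothetical. For a simple linear regression of actual on predicted values, the determination coefficient equals the squared Pearson correlation, which is what the sketch computes.

```python
from math import sqrt
from statistics import mean


def consensus_prediction(predictions_by_model):
    """Average the per-compound predictions of several models.

    predictions_by_model: one list per model, all in the same compound order.
    """
    return [mean(values) for values in zip(*predictions_by_model)]


def r_squared(actual, predicted):
    """R^2 of a simple linear regression between actual and predicted values,
    computed as the squared Pearson correlation."""
    ma, mp = mean(actual), mean(predicted)
    sap = sum((a - ma) * (p - mp) for a, p in zip(actual, predicted))
    saa = sum((a - ma) ** 2 for a in actual)
    spp = sum((p - mp) ** 2 for p in predicted)
    return (sap / sqrt(saa * spp)) ** 2


# Hypothetical toxicity values for four external-set compounds
actual = [2.1, 3.4, 1.8, 2.9]
model_a = [2.0, 3.0, 2.2, 2.5]
model_b = [2.4, 3.6, 1.6, 3.1]
model_c = [1.9, 3.3, 2.0, 3.0]

consensus = consensus_prediction([model_a, model_b, model_c])
print("consensus:", [round(v, 2) for v in consensus])
print("R^2 model_a:", round(r_squared(actual, model_a), 3))
print("R^2 consensus:", round(r_squared(actual, consensus), 3))
```

Averaging tends to cancel the uncorrelated errors of the individual models, which is one common explanation for why a consensus prediction can outperform its constituents on an external set.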
Risk prediction models of breast cancer: a systematic review of model performances.

PubMed

Anothaisintawee, Thunyarat; Teerawattananon, Yot; Wiratkapun, Chollathip; Kasamesup, Vijj; Thakkinstian, Ammarin

2012-05-01

An increasing number of risk prediction models have been developed for estimating breast cancer risk in individual women. However, the performance of those models is questionable. We therefore conducted a study to systematically review previous risk prediction models. The results of this review help to identify the most reliable model and indicate the strengths and weaknesses of each model to guide future model development. We searched MEDLINE (PubMed) from 1949 and EMBASE (Ovid) from 1974 until October 2010. Observational studies that constructed models using regression methods were selected. Information about model development and performance was extracted. Twenty-five out of 453 studies were eligible. Of these, 18 developed prediction models and 7 validated existing prediction models. Up to 13 variables were included in the models, and sample sizes for each study ranged from 550 to 2,404,636. Internal validation was performed in four models, while five models had external validation. The Gail and the Rosner and Colditz models were the significant models that were subsequently modified by other scholars. Calibration performance of most models was fair to good (expected/observed ratio: 0.87-1.12), but discriminatory accuracy was poor to fair both in internal validation (concordance statistics: 0.53-0.66) and in external validation (concordance statistics: 0.56-0.63). Most models yielded relatively poor discrimination in both internal and external validation. This poor discriminatory accuracy of existing models might be due to a lack of knowledge about risk factors, heterogeneous subtypes of breast cancer, and different distributions of risk factors across populations. In addition, the concordance statistic itself is insensitive to improvements in discrimination. Therefore, newer methods such as the net reclassification index should be considered for evaluating the improvement in performance of a newly developed model.
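The two performance measures named in the abstract above, the concordance statistic (discrimination) and the expected/observed ratio (calibration), can be illustrated as follows. This is a sketch for a binary outcome only; the function names are hypothetical, and the reviewed studies may have used other estimators.

```python
def concordance_statistic(risks, outcomes):
    """Share of case/non-case pairs in which the case has the higher
    predicted risk; tied pairs count as one half.

    risks: predicted probabilities; outcomes: 1 for a case, 0 otherwise.
    """
    cases = [r for r, y in zip(risks, outcomes) if y == 1]
    non_cases = [r for r, y in zip(risks, outcomes) if y == 0]
    score = 0.0
    for case_risk in cases:
        for other_risk in non_cases:
            if case_risk > other_risk:
                score += 1.0
            elif case_risk == other_risk:
                score += 0.5
    return score / (len(cases) * len(non_cases))


def expected_observed_ratio(risks, outcomes):
    """Expected number of events (sum of predicted risks) divided by
    the observed number of events."""
    return sum(risks) / sum(outcomes)
```

A concordance statistic of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, so the reported range of 0.53 to 0.66 sits close to chance; an expected/observed ratio of 1.0 means the predicted and observed event counts agree.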
Interrater Reliability in Large-Scale Assessments--Can Teachers Score National Tests Reliably without External Controls?

ERIC Educational Resources Information Center

Pantzare, Anna Lind

2015-01-01

In most large-scale assessment systems a set of rather expensive external quality controls are implemented in order to guarantee the quality of interrater reliability. This study empirically examines if teachers' ratings of national tests in mathematics can be reliable without using monitoring, training, or other methods of external quality…
Development and External Validation of a Prognostic Nomogram for Metastatic Uveal Melanoma

PubMed Central

Valpione, Sara; Moser, Justin C.; Parrozzani, Raffaele; Bazzi, Marco; Mansfield, Aaron S.; Mocellin, Simone; Pigozzo, Jacopo; Midena, Edoardo; Markovic, Svetomir N.; Aliberti, Camillo; Campana, Luca G.; Chiarion-Sileni, Vanna

2015-01-01

Background: Approximately 50% of patients with uveal melanoma (UM) will develop metastatic disease, usually involving the liver. The outcome of metastatic UM (mUM) is generally poor and no standard therapy has been established. Additionally, clinicians lack a validated prognostic tool to evaluate these patients. The aim of this work was to develop a reliable prognostic nomogram for clinicians. Patients and Methods: Two cohorts of mUM patients, from Veneto Oncology Institute (IOV) (N=152) and Mayo Clinic (MC) (N=102), were analyzed to develop and externally validate a prognostic nomogram. Results: The median survival of mUM was 17.2 months in the IOV cohort and 19.7 in the MC cohort. Percentage of liver involvement (HR 1.6), elevated levels of serum LDH (HR 1.6), and a WHO performance status=1 (HR 1.5) or 2–3 (HR 4.6) were associated with worse prognosis. Longer disease-free interval from diagnosis of UM to that of mUM conferred a survival advantage (HR 0.9). The nomogram had a concordance probability of 0.75 (SE .006) in the development dataset (IOV), and 0.80 (SE .009) in the external validation (MC). Nomogram predictions were well calibrated. Conclusions: The nomogram, which includes percentage of liver involvement, LDH levels, WHO performance status and disease-free interval, accurately predicts the prognosis of mUM and could be useful for decision-making and risk stratification for clinical trials. PMID:25780931
Design and validation of a self-administered test to assess bullying (bull-M) in high school Mexicans: a pilot study

PubMed Central

2013-01-01

Background: Bullying (Bull) is a public health problem worldwide, and Mexico is not exempt. However, its epidemiology and early detection in our country are limited, in part, by the lack of validated tests to ensure the respondents’ anonymity. The aim of this study was to validate a self-administered test (Bull-M) for assessing Bull among high-school Mexicans. Methods: Experts and school teachers from highly violent areas of Ciudad Juarez (Chihuahua, México) reported common Bull behaviors. Then, a 10-item test was developed based on twelve of these behaviors; the students’ and peers’ participation in Bull acts and in some somatic consequences in Bull victims with a 5-point Likert frequency scale. Validation criteria were: content (CV, judges); reliability [Cronbach’s alpha (CA), test-retest (Spearman correlation, rs)]; construct [principal component (PCA), confirmatory factor (CFA), goodness-of-fit (GF) analysis]; and convergent (Bull-M vs. Bull-S test) validity. Results: Bull-M showed good reliability (CA = 0.75, rs = 0.91; p < 0.001). Two factors were identified (PCA) and confirmed (CFA): “bullying me (victim)” and “bullying others (aggressor)”. GF indices were: Root mean square error of approximation (0.031), GF index (0.97), and normalized fit index (0.92). Bull-M was as good as Bull-S for measuring Bull prevalence. Conclusions: Bull-M has good reliability and convergent validity and a bi-modal factor structure for detecting Bull victims and aggressors; however, its external validity and sensitivity should be analyzed on a wider and different population. PMID:23577755
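Cronbach's alpha, the internal-consistency statistic reported in the abstract above and in several other records on this page, can be computed from an item-score matrix as sketched below. The function name and input layout are assumptions made for illustration; the sketch uses the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores) with sample variances throughout.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    Hypothetical input layout: scores[i][j] is respondent i's answer to item j.
    """
    k = len(scores[0])
    items = list(zip(*scores))                      # one tuple of scores per item
    item_variance_sum = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)
```

Alpha approaches 1 when items rise and fall together across respondents, which is why it is read as a measure of internal consistency rather than of stability over time (the latter being what test-retest correlations address).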
A New Measure to Assess Psychopathic Personality in Children: The Child Problematic Traits Inventory.

PubMed

Colins, Olivier F; Andershed, Henrik; Frogner, Louise; Lopez-Romero, Laura; Veen, Violaine; Andershed, Anna-Karin

2014-01-01

Understanding the development of psychopathic personality from childhood to adulthood is crucial for understanding the development and stability of severe and long-lasting conduct problems and criminal behavior. This paper describes the development of a new teacher-rated instrument to assess psychopathic personality from age three to 12, the Child Problematic Traits Inventory (CPTI). The reliability and validity of the CPTI were tested in a Swedish general population sample of 2,056 3- to 5-year-olds (mean age = 3.86; SD = .86; 53% boys). The CPTI items loaded distinctively on three theoretically proposed factors: a Grandiose-Deceitful factor, a Callous-Unemotional factor, and an Impulsive-Need for Stimulation factor. The three CPTI factors showed reliability in internal consistency and external validity, in terms of expected correlations with theoretically relevant constructs (e.g., fearlessness). The interaction between the three CPTI factors was a stronger predictor of concurrent conduct problems than any of the three individual CPTI factors, showing that it is important to assess all three factors of the psychopathic personality construct in early childhood. In conclusion, the CPTI seems to reliably and validly assess a constellation of traits that is similar to psychopathic personality as manifested in adolescence and adulthood.
The comprehensive care project: measuring physician performance in ambulatory practice.

PubMed

Holmboe, Eric S; Weng, Weifeng; Arnold, Gerald K; Kaplan, Sherrie H; Normand, Sharon-Lise; Greenfield, Sheldon; Hood, Sarah; Lipner, Rebecca S

2010-12-01

To investigate the feasibility, reliability, and validity of comprehensively assessing physician-level performance in ambulatory practice. Ambulatory-based general internists in 13 states participated in the assessment. We assessed physician-level performance, adjusted for patient factors, on 46 individual measures, an overall composite measure, and composite measures for chronic, acute, and preventive care. Between- versus within-physician variation was quantified by intraclass correlation coefficients (ICC). External validity was assessed by correlating performance on a certification exam. Medical records for 236 physicians were audited for seven chronic and four acute care conditions, and six age- and gender-appropriate preventive services. Performance on the individual and composite measures varied substantially within (range 5-86 percent compliance on 46 measures) and between physicians (ICC range 0.12-0.88). Reliabilities for the composite measures were robust: 0.88 for chronic care and 0.87 for preventive services. Higher certification exam scores were associated with better performance on the overall (r = 0.19; p<.01), chronic care (r = 0.14, p = .04), and preventive services composites (r = 0.17, p = .01). Our results suggest that reliable and valid comprehensive assessment of the quality of chronic and preventive care can be achieved by creating composite measures and by sampling feasible numbers of patients for each condition. © Health Research and Educational Trust.
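The abstract above quantifies between- versus within-physician variation with intraclass correlation coefficients. One common form, the one-way random-effects ICC for balanced groups, is sketched below. This is an assumption about the general technique rather than the study's actual model, which adjusted for patient factors; the function name and input layout are hypothetical.

```python
def icc_oneway(groups):
    """One-way random-effects ICC for balanced data.

    groups: list of equal-length lists, e.g. one list of patient-level
    scores per physician (hypothetical layout).
    """
    n = len(groups)          # number of groups (e.g. physicians)
    k = len(groups[0])       # observations per group (e.g. patients)
    grand_mean = sum(sum(g) for g in groups) / (n * k)
    group_means = [sum(g) / k for g in groups]
    # Between-group and within-group mean squares from one-way ANOVA.
    ms_between = k * sum((m - grand_mean) ** 2 for m in group_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)
```

Values near 1 indicate that most of the variation lies between groups, while values near 0 (or below) indicate that observations within a group differ about as much as observations from different groups.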
Validation of a French version of the pure procrastination scale (PPS).

PubMed

Rebetez, Marie My Lien; Rochat, Lucien; Gay, Philippe; Van der Linden, Martial

2014-08-01

Procrastination is a widespread phenomenon that affects everyone's day-to-day life and interferes with the clinical treatment of several psychopathological states. To assess this construct, Steel (2010) developed the Pure Procrastination Scale (PPS), a short scale intended to capture the general notion of dysfunctional delay. The aim of the current study was to present a French version of this questionnaire. To this end, the 12 items of the PPS were translated into French and data were collected from an online survey in a sample of 245 French-speaking individuals from the general population. The results revealed that one item had problematic face validity; it was therefore removed. Exploratory and confirmatory analyses performed on the resulting 11-item version of the French PPS indicated that the scale was composed of two factors ("voluntary delay" and "observed delay") depending on a common, higher-order construct ("general procrastination"). Good internal consistency and test-retest reliability were found. External validity was supported by specific relationships with measures of personality traits, impulsivity, and subjective well-being. The French PPS therefore presents satisfactory psychometric properties and may be considered a reliable and valid instrument for research, teaching and clinical practice. Copyright © 2014 Elsevier Inc. All rights reserved.
Development and validation of a primary sclerosing cholangitis-specific patient-reported outcomes instrument: The PSC PRO.

PubMed

Younossi, Zobair M; Afendy, Arian; Stepanova, Maria; Racila, Andrei; Nader, Fatema; Gomel, Rachel; Safer, Ricky; Lenderking, William R; Skalicky, Anne; Kleinman, Leah; Myers, Robert P; Subramanian, G Mani; McHutchison, John G; Levy, Cynthia; Bowlus, Christopher L; Kowdley, Kris; Muir, Andrew J

2017-11-20

Primary sclerosing cholangitis (PSC) is a chronic liver disease associated with inflammation and biliary fibrosis that leads to cholangitis, cirrhosis, and impaired quality of life. Our objective was to develop and validate a PSC-specific patient-reported outcome (PRO) instrument. We developed a 42-item PSC PRO instrument that contains two modules (Symptoms and Impact of Symptoms) and conducted an external validation. Reliability and validity were evaluated using clinical data and a battery of other validated instruments. Test-retest reliability was assessed in a subgroup of patients who repeated the PSC PRO after the first administration. One hundred two PSC subjects (44 ± 13 years; 32% male, 74% employed, 39% with cirrhosis, 14% with a history of decompensated cirrhosis, 38% history of depression, and 68% with inflammatory bowel disease [IBD]) completed PSC PRO and other PRO instruments (Short Form 36 V2 [SF-36], Chronic Liver Disease Questionnaire [CLDQ], Primary Biliary Cholangitis - 40 [PBC-40], and five dimensions [5-D Itch]). PSC PRO demonstrated excellent internal consistency (Cronbach alphas, 0.84-0.94) and discriminant validity (41 of 42 items had the highest correlations with their own domains). There were good correlations between PSC PRO domains and relevant domains of SF-36, CLDQ, and PBC-40 (R = 0.69-0.90; all P < 0.0001), but lower (R = 0.31-0.60; P < 0.001) with 5-D Itch. Construct validity showed that PSC PRO can differentiate patients according to the presence and severity of cirrhosis and history of depression (P < 0.05), but not by IBD (P > 0.05). Test-retest reliability was assessed in 53 subjects who repeated PSC PRO within a median (interquartile range) of 37 (27-47) days. There was excellent reliability for most domains with intraclass correlations (0.71-0.88; all P < 0.001). PSC PRO is a self-administered disease-specific instrument developed according to U.S. Food and Drug Administration guidelines. This preliminary validation study suggests good psychometric properties. Further validation of the instrument in a larger and more diverse sample of PSC patients is needed. (Hepatology 2017). © 2017 by the American Association for the Study of Liver Diseases.
VizieR Online Data Catalog: Planck Sunyaev-Zeldovich sources (PSZ2) (Planck+, 2016)

NASA Astrophysics Data System (ADS)

Planck Collaboration; Ade, P. A. R.; Aghanim, N.; Arnaud, M.; Ashdown, M.; Aumont, J.; Baccigalupi, C.; Banday, A. J.; Barreiro, R. B.; Barrena, R.; Bartlett, J. G.; Bartolo, N.; Battaner, E.; Battye, R.; Benabed, K.; Benoit, A.; Benoit-Levy, A.; Bernard, J.-P.; Bersanelli, M.; Bielewicz, P.; Bikmaev, I.; Bohringer, H.; Bonaldi, A.; Bonavera, L.; Bond, J. R.; Borrill, J.; Bouchet, F. R.; Bucher, M.; Burenin, R.; Burigana, C.; Butler, R. C.; Calabrese, E.; Cardoso, J.-F.; Carvalho, P.; Catalano, A.; Challinor, A.; Chamballu, A.; Chary, R.-R.; Chiang, H. C.; Chon, G.; Christensen, P. R.; Clements, D. L.; Colombi, S.; Colombo, L. P. L.; Combet, C.; Comis, B.; Couchot, F.; Coulais, A.; Crill, B. P.; Curto, A.; Cuttaia, F.; Dahle, H.; Danese, L.; Davies, R. D.; Davis, R. J.; de Bernardis, P.; De Rosa, A.; de Zotti, G.; Delabrouille, J.; Desert, F.-X.; Dickinson, C.; Diego, J. M.; Dolag, K.; Dole, H.; Donzelli, S.; Dore, O.; Douspis, M.; Ducout, A.; Dupac, X.; Efstathiou, G.; Eisenhardt, P. R. M.; Elsner, F.; Ensslin, T. A.; Eriksen, H. K.; Falgarone, E.; Fergusson, J.; Feroz, F.; Ferragamo, A.; Finelli, F.; Forni, O.; Frailis, M.; Fraisse, A. A.; Franceschi, E.; Frejsel, A.; Galeotta, S.; Galli, S.; Ganga, K.; Genova-Santos, R. T.; Giard, M.; Giraud-Heraud, Y.; Gjerlow, E.; Gonzalez-Nuevo, J.; Gorski, K. M.; Grainge, K. J. B.; Gratton, S.; Gregorio, A.; Gruppuso, A.; Gudmundsson, J. E.; Hansen, F. K.; Hanson, D.; Harrison, D. L.; Hempel, A.; Henrot-Versille, S.; Hernandez-Monteagudo, C.; Herranz, D.; Hildebrandt, S. R.; Hivon, E.; Hobson, M.; Holmes, W. A.; Hornstrup, A.; Hovest, W.; Huffenberger, K. M.; Hurier, G.; Jaffe, A. H.; Jaffe, T. R.; Jin, T.; Jones, W. C.; Juvela, M.; Keihanen, E.; Keskitalo, R.; Khamitov, I.; Kisner, T. S.; Kneissl, R.; Knoche, J.; Kunz, M.; Kurki-Suonio, H.; Lagache, G.; Lamarre, J.-M.; Lasenby, A.; Lattanzi, M.; Lawrence, C. R.; Leonardi, R.; Lesgourgues, J.; Levrier, F.; Liguori, M.; Lilje, P. B.; Linden-Vornle, M.; Lopez-Caniego, M.; Lubin, P. M.; Macias-Perez, J. F.; Maggio, G.; Maino, D.; Mak, D. S. Y.; Mandolesi, N.; Mangilli, A.; Martin, P. G.; Martinez-Gonzalez, E.; Masi, S.; Matarrese, S.; Mazzotta, P.; McGehee, P.; Mei, S.; Melchiorri, A.; Melin, J.-B.; Mendes, L.; Mennella, A.; Migliaccio, M.; Mitra, S.; Miville-Deschenes, M.-A.; Moneti, A.; Montier, L.; Morgante, G.; Mortlock, D.; Moss, A.; Munshi, D.; Murphy, J. A.; Naselsky, P.; Nastasi, A.; Nati, F.; Natoli, P.; Netterfield, C. B.; Norgaard-Nielsen, H. U.; Noviello, F.; Novikov, D.; Novikov, I.; Olamaie, M.; Oxborrow, C. A.; Paci, F.; Pagano, L.; Pajot, F.; Paoletti, D.; Pasian, F.; Patanchon, G.; Pearson, T. J.; Perdereau, O.; Perotto, L.; Perrott, Y. C.; Perrotta, F.; Pettorino, V.; Piacentini, F.; Piat, M.; Pierpaoli, E.; Pietrobon, D.; Plaszczynski, S.; Pointecouteau, E.; Polenta, G.; Pratt, G. W.; Prezeau, G.; Prunet, S.; Puget, J.-L.; Rachen, J. P.; Reach, W. T.; Rebolo, R.; Reinecke, M.; Remazeilles, M.; Renault, C.; Renzi, A.; Ristorcelli, I.; Rocha, G.; Rosset, C.; Rossetti, M.; Roudier, G.; Rozo, E.; Rubino-Martin, J. A.; Rumsey, C.; Rusholme, B.; Rykoff, E. S.; Sandri, M.; Santos, D.; Saunders, R. D. E.; Savelainen, M.; Savini, G.; Schammel, M. P.; Scott, D.; Seiffert, M. D.; Shellard, E. P. S.; Shimwell, T. W.; Spence, R. L. D.; Stanford, S. A.; Stern, D.; Stolyarov, V.; Stompor, R.; Streblyanska, A.; Sudiwala, R.; Sunyaev, R.; Sutton, D.; Suur-Uski, A.-S.; Sygnet, J.-F.; Tauber, J. A.; Terenzi, L.; Toffolatti, L.; Tomasi, M.; Tramonte, D.; Tristram, M.; Tucci, M.; Tuovinen, J.; Umana, G.; Valenziano, L.; Valiviita, J.; van Tent, B.; Vielva, P.; Villa, F.; Wade, L. A.; Wandelt, B. D.; Wehus, I. K.; White, S. D. M.; Wright, E. L.; Yvon, D.; Zacchei, A.; Zonca, A.

2017-01-01

Three pipelines are used to detect SZ clusters: two independent implementations of the Matched Multi-Filter (MMF1 and MMF3), and PowellSnakes (PwS). The main catalogue is constructed as the union of the catalogues from the three detection methods. The completeness and reliability of the catalogues have been assessed through internal and external validation as described in section 4 of the paper. (5 data files).
Variability in GCSE Controlled Assessments Subject to High Levels of Control: Ipsos MORI's "Evaluation of the Introduction of Controlled Assessment" (2011) and Its Implications for Controlled Assessments in English Literature

ERIC Educational Resources Information Center

Vowles, C. G.

2012-01-01

Controlled assessment (CA) was introduced as a valid and reliable replacement for coursework in GCSE English and English Literature assessments in 2009. I argue that CA lacks clear definition, typically mimics externally-assessed public examinations and, when interrogated through the Crooks eight-link chain model, is undermined by several threats…
Psychometric Properties of the Bermond-Vorst Alexithymia Questionnaire (BVAQ) in the General Population and a Clinical Population.

PubMed

de Vroege, Lars; Emons, Wilco H M; Sijtsma, Klaas; van der Feltz-Cornelis, Christina M

2018-01-01

The Bermond-Vorst Alexithymia Questionnaire (BVAQ) has been validated in student samples and small clinical samples, but not in the general population; thus, representative general-population norms are lacking. We examined the factor structure of the BVAQ in Longitudinal Internet Studies for the Social Sciences panel data from the Dutch general population (N = 974). Factor analyses revealed a first-order five-factor model and a second-order two-factor model. However, in the second-order model, the factor interpreted as analyzing ability loaded on both the affective factor and the cognitive factor. Further analyses showed that the first-order test scores are more reliable than the second-order test scores. External and construct validity were addressed by comparing BVAQ scores with a clinical sample of patients suffering from somatic symptom and related disorder (SSRD) (N = 235). BVAQ scores differed significantly between the general population and patients suffering from SSRD, suggesting acceptable construct validity. Age was positively associated with alexithymia. Males showed higher levels of alexithymia. The BVAQ is a reliable alternative measure for measuring alexithymia.
The German Version of the Dutch Eating Behavior Questionnaire: Psychometric Properties, Measurement Invariance, and Population-Based Norms

PubMed Central

Hilbert, Anja; de Zwaan, Martina; Braehler, Elmar; Kersting, Anette

2016-01-01

The Dutch Eating Behavior Questionnaire (DEBQ) is an internationally widely used instrument assessing different eating styles that may contribute to weight gain and overweight: emotional eating, external eating, and restraint. This study aimed to evaluate the psychometric properties of the 30-item German version of the DEBQ including its measurement invariance across gender, age, and BMI-status in a representative German population sample. Furthermore, we examined the distribution of eating styles in the general population and provide population-based norms for DEBQ scales. A representative sample of the German general population (N = 2513, age ≥ 14 years) was assessed with the German version of the DEBQ along with information on sociodemographic characteristics and body weight and height. The German version of the DEBQ demonstrates good item characteristics and reliability (restraint: α = .92, emotional eating: α = .94, external eating: α = .89). The 3-factor structure of the DEBQ could be replicated in exploratory and confirmatory factor analyses and results of multi-group confirmatory factor analyses supported its metric and scalar measurement invariance across gender, age, and BMI-status. External eating was the most prevalent eating style in the German general population. Women scored higher on emotional and restrained eating scales than men, and overweight individuals scored higher in all three eating styles compared to normal weight individuals. Small differences across age were found for external eating. Norms were provided according to gender, age, and BMI-status. Our findings suggest that the German version of the DEBQ has good reliability and construct validity, and is suitable to reliably measure eating styles across age, gender, and BMI-status. Furthermore, the results demonstrate a considerable variation of eating styles across gender and BMI-status. PMID:27656879
The German Version of the Dutch Eating Behavior Questionnaire: Psychometric Properties, Measurement Invariance, and Population-Based Norms.

PubMed

Nagl, Michaela; Hilbert, Anja; de Zwaan, Martina; Braehler, Elmar; Kersting, Anette

The Dutch Eating Behavior Questionnaire (DEBQ) is an internationally widely used instrument assessing different eating styles that may contribute to weight gain and overweight: emotional eating, external eating, and restraint. This study aimed to evaluate the psychometric properties of the 30-item German version of the DEBQ including its measurement invariance across gender, age, and BMI-status in a representative German population sample. Furthermore, we examined the distribution of eating styles in the general population and provide population-based norms for DEBQ scales. A representative sample of the German general population (N = 2513, age ≥ 14 years) was assessed with the German version of the DEBQ along with information on sociodemographic characteristics and body weight and height. The German version of the DEBQ demonstrates good item characteristics and reliability (restraint: α = .92, emotional eating: α = .94, external eating: α = .89). The 3-factor structure of the DEBQ could be replicated in exploratory and confirmatory factor analyses and results of multi-group confirmatory factor analyses supported its metric and scalar measurement invariance across gender, age, and BMI-status. External eating was the most prevalent eating style in the German general population. Women scored higher on emotional and restrained eating scales than men, and overweight individuals scored higher in all three eating styles compared to normal weight individuals. Small differences across age were found for external eating. Norms were provided according to gender, age, and BMI-status. Our findings suggest that the German version of the DEBQ has good reliability and construct validity, and is suitable to reliably measure eating styles across age, gender, and BMI-status. Furthermore, the results demonstrate a considerable variation of eating styles across gender and BMI-status.
Developing evaluation scales for horticultural therapy.

PubMed

Im, Eun-Ae; Park, Sin-Ae; Son, Ki-Cheol

2018-04-01

This study developed evaluation scales for measuring the effects of horticultural therapy in practical settings. Qualitative and quantitative research, including three preliminary studies and a main study, was conducted. In the first study, a total of 779 horticultural therapists answered an open-ended questionnaire based on 58 items about elements of occupational therapy and seven factors about singularity of horticultural therapy. In the second study, 20 horticultural therapists participated in in-depth interviews. In the third study, a Delphi method was conducted with 24 horticultural therapists to build a model of assessment indexes and ensure the validity. In the final study, the reserve scales were tested by 121 horticultural therapists in their practical settings for 1045 clients, to verify their reliability and validity. Preliminary questions in the effects area of horticultural therapy were developed in the first study, and validity for the components in the second study. In the third study, an expert Delphi survey was conducted as part of content validity verification of the preliminary tool of horticultural therapy for physical, cognitive, psychological-emotional, and social areas. In the final study, the evaluation tool, which verified the construct, convergence, discriminant, and predictive validity and reliability test, was used to finalise the evaluation tool. The effects of horticultural therapy were classified as four different aspects, namely, physical, cognitive, psycho-emotional, and social, based on previous studies on the effects of horticultural therapy. 98 questions in the four aspects were selected as reserve scales. The reliability of each scale was calculated as 0.982 in physical, 0.980 in cognitive, 0.965 in psycho-emotional, and 0.972 in social aspects based on the Cronbach's test of intra-item internal consistency and half reliability of Spearman-Brown. This study was the first to demonstrate validity and reliability by simultaneously developing four measures of horticultural therapy effectiveness, namely, physical, cognitive, psychological-emotional, and social, both locally and externally. It is especially worthwhile in that it can be applied in common to people. Copyright © 2018 Elsevier Ltd. All rights reserved.
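The "half reliability of Spearman-Brown" mentioned in the abstract above refers to split-half reliability with the Spearman-Brown correction, r_full = 2r / (1 + r), where r is the correlation between scores on the two halves of a scale. The sketch below uses an odd/even item split, which is one common choice and an assumption here; the abstract does not state how the halves were formed, and the function names are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman_brown(r_half):
    """Project the correlation between two half-tests to full-test length."""
    return 2 * r_half / (1 + r_half)


def split_half_reliability(scores):
    """Split-half reliability using an odd/even item split (assumed split).

    scores[i][j] is respondent i's score on item j.
    """
    first_half = [sum(row[0::2]) for row in scores]   # items 1, 3, 5, ...
    second_half = [sum(row[1::2]) for row in scores]  # items 2, 4, 6, ...
    return spearman_brown(pearson(first_half, second_half))
```

The correction is needed because each half contains only half the items, and shorter scales are less reliable; the formula estimates what the reliability would be at the full scale length.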
The 7 up 7 down inventory: a 14-item measure of manic and depressive tendencies carved from the General Behavior Inventory.

PubMed

Youngstrom, Eric A; Murray, Greg; Johnson, Sheri L; Findling, Robert L

2013-12-01

The aim of this study was to develop and validate manic and depressive scales carved from the full-length General Behavior Inventory (GBI). The brief version was designed to be applicable for youths and adults and to improve separation between mania and depression dimensions. Data came from 9 studies (2 youth clinical samples, aggregate N = 738, and 7 nonclinical adult samples, aggregate N = 1,756). Items with high factor loadings on the 2 extracted dimensions of mania and depression were identified from both data sets, and final item selection was based on internal reliability criteria. Confirmatory factor analyses described the 2-factor model's fit. Criterion validity was compared between mania and depression scales, and with the full-length GBI scales. For both mania and depression factors, 7 items produced a psychometrically adequate measure applicable across both aggregate samples. Internal reliability of the Mania scale was .81 (youth) and .83 (adult) and for Depression was .93 (youth) and .95 (adult). By design, the brief scales were less strongly correlated with each other than were the original GBI scales. Construct validity of the new instrument was supported in observed discriminant and convergent relationships with external correlates and discrimination of diagnostic groups. The new brief GBI, the 7 Up 7 Down Inventory, demonstrates sound psychometric properties across a wide age range, showing expected relationships with external correlates. The new instrument provides a clearer separation of manic and depressive tendencies than the original. (PsycINFO Database Record (c) 2013 APA, all rights reserved).
Reliability and Validity of Composite Scores from the NIH Toolbox Cognition Battery in Adults

PubMed Central

Heaton, Robert K.; Akshoomoff, Natacha; Tulsky, David; Mungas, Dan; Weintraub, Sandra; Dikmen, Sureyya; Beaumont, Jennifer; Casaletto, Kaitlin B.; Conway, Kevin; Slotkin, Jerry; Gershon, Richard

2014-01-01

This study describes psychometric properties of the NIH Toolbox Cognition Battery (NIHTB-CB) Composite Scores in an adult sample. The NIHTB-CB was designed for use in epidemiologic studies and clinical trials for ages 3 to 85. A total of 268 self-described healthy adults were recruited at four university-based sites, using stratified sampling guidelines to target demographic variability for age (20–85 years), gender, education, and ethnicity. The NIHTB-CB contains seven computer-based instruments assessing five cognitive sub-domains: Language, Executive Function, Episodic Memory, Processing Speed, and Working Memory. Participants completed the NIHTB-CB, corresponding gold standard validation measures selected to tap the same cognitive abilities, and sociodemographic questionnaires. Three Composite Scores were derived for both the NIHTB-CB and gold standard batteries: “Crystallized Cognition Composite,” “Fluid Cognition Composite,” and “Total Cognition Composite” scores. NIHTB Composite Scores showed acceptable internal consistency (Cronbach’s alphas = 0.84 Crystallized, 0.83 Fluid, 0.77 Total), excellent test–retest reliability (r: 0.86–0.92), strong convergent (r: 0.78–0.90) and discriminant (r: 0.19–0.39) validities versus gold standard composites, and expected age effects (r = 0.18 crystallized, r = − 0.68 fluid, r = − 0.26 total). Significant relationships with self-reported prior school difficulties and current health status, employment, and presence of a disability provided evidence of external validity. The NIH Toolbox Cognition Battery Composite Scores have excellent reliability and validity, suggesting they can be used effectively in epidemiologic and clinical studies. PMID:24960398
Validation of multisource electronic health record data: an application to blood transfusion data.

PubMed

Hoeven, Loan R van; Bruijne, Martine C de; Kemper, Peter F; Koopman, Maria M W; Rondeel, Jan M M; Leyte, Anja; Koffijberg, Hendrik; Janssen, Mart P; Roes, Kit C B

2017-07-14

Although data from electronic health records (EHR) are often used for research purposes, systematic validation of these data prior to their use is not standard practice. Existing validation frameworks discuss validity concepts without translating these into practical implementation steps or addressing the potential influence of linking multiple sources. Therefore, we developed a practical approach for validating routinely collected data from multiple sources and applied it to a blood transfusion data warehouse to evaluate its usability in practice. The approach consists of identifying existing validation frameworks for EHR data or linked data, selecting validity concepts from these frameworks and establishing quantifiable validity outcomes for each concept. The approach distinguishes external validation concepts (e.g. concordance with external reports, previous literature and expert feedback) and internal consistency concepts which use expected associations within the dataset itself (e.g. completeness, uniformity and plausibility). In an example case, the selected concepts were applied to a transfusion dataset and specified in more detail. Application of the approach to a transfusion dataset resulted in a structured overview of data validity aspects. This allowed improvement of these aspects through further processing of the data and in some cases adjustment of the data extraction. For example, the proportion of transfused products that could not be linked to the corresponding issued products was initially 2.2% but could be reduced to 0.17% by adjusting data extraction criteria. This stepwise approach for validating linked multisource data provides a basis for evaluating data quality and enhancing interpretation. When the process of data validation is adopted more broadly, this contributes to increased transparency and greater reliability of research based on routinely collected electronic health records.

Translation and validation of the vertigo symptom scale into German: A cultural adaption to a wider German-speaking population

PubMed Central

2012-01-01

Background Dizziness and comorbid anxiety may cause severe disability of patients with vestibulopathy, but can be addressed effectively with rehabilitation. For an individually adapted treatment, a structured assessment is needed. The Vertigo Symptom Scale (VSS) with two subscales assessing vertigo symptoms (VSS-VER) and associated symptoms (VSS-AA) might be used for this purpose. As there was no validated VSS available in German, the aim of the study was the translation and cross-cultural adaptation in German (VSS-G) and the investigation of its reliability, internal and external validity. Methods The VSS was translated into German according to recognized guidelines. Psychometric properties were tested on 52 healthy controls and 202 participants with vestibulopathy. Internal validity and reliability were investigated with factor analysis, Cronbach’s α and ICC estimations. Discriminant validity was analysed with the Mann–Whitney-U-Test between patients and controls and the ROC-Curve. Convergent validity was estimated with the correlation with the Hospital Anxiety Subscale (HADS-A), Dizziness Handicap Inventory (DHI) and frequency of dizziness. Results Internal validity: factor analysis confirmed the structure of two subscales. Reliability: VSS-G: α = 0.904 and ICC (CI) = 0.926 (0.826, 0.965). Discriminant validity: VSS-VER differentiates patients and controls, ROC (CI) = 0.99 (0.98, 1.00). Convergent validity: VSS-G correlates with DHI (r = 0.554) and frequency (T = 0.317). HADS-A correlates with VSS-AA (r = 0.452) but not with VSS-VER (r = 0.186). Conclusions The VSS-G showed satisfactory psychometric properties to assess the severity of vertigo or vertigo-related symptoms. The VSS-VER can differentiate between healthy subjects and patients with vestibular disorders. The VSS-AA showed some screening properties with high sensitivity for patients with abnormal anxiety. PMID:22747644
Validity and reliability of the EQ-5D self-report questionnaire in Chinese-speaking patients with rheumatic diseases in Singapore.

PubMed

Luo, N; Chew, L H; Fong, K Y; Koh, D R; Ng, S C; Yoon, K H; Vasoo, S; Li, S C; Thumboo, J

2003-09-01

We assessed the psychometric properties of a Singaporean Chinese version of the EQ-5D, a health-related quality of life (HRQoL) instrument. Consecutive outpatients with rheumatic diseases seen for routine follow-up consultations at the National University Hospital, Singapore were interviewed twice within 2 weeks using a standardised questionnaire containing the EQ-5D, the Short-Form 36 Health Survey (SF-36), the Learned Helplessness Subscale, a pain Visual Analogue Scale (VAS) and assessing demographic and psychosocial characteristics. To assess the validity of the EQ-5D, 13 hypotheses relating the EQ-5D self-classifier (5 dimensions) or visual analogue scale (EQ-VAS) to SF-36 scores or other variables were examined using the Mann-Whitney U test, Kruskal-Wallis or Spearman's correlation coefficient. Test-retest reliability was assessed using Cohen's kappa. Forty-eight subjects were studied (osteoarthritis: 16; rheumatoid arthritis: 22; systemic lupus erythematosus: 8; spondyloarthropathy: 2; female: 93.8%; mean age: 56.4 years). Seven of 13 a-priori hypotheses relating EQ-5D to external variables were fulfilled, supporting the validity of the EQ-5D. For example, subjects reporting moderate or extreme problems for EQ-5D dimensions generally had lower median SF-36 scores than those without such problems. Cohen's kappa for test-retest reliability of the self-classifier ranged from 0.41 to 1.00 (n = 42; median interval: 7 days, interquartile range: 7 to 11 days). The Singaporean Chinese EQ-5D self-classifier appears to be a valid measure of HRQoL in Singaporeans with rheumatic diseases; however, the reliability of the EQ-VAS requires further investigation. These data provide a basis for further studies of the Singaporean Chinese EQ-5D.
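The test-retest statistic named in the record above, Cohen's kappa, can be illustrated with a minimal sketch. It assumes unweighted kappa on paired categorical ratings (for example, one EQ-5D dimension scored at two visits); the function name and the sample ratings are hypothetical and are not taken from the study.

```python
from collections import Counter


def cohens_kappa(first, second):
    """Unweighted Cohen's kappa for two paired lists of categorical ratings."""
    if len(first) != len(second) or not first:
        raise ValueError("ratings must be paired and non-empty")
    n = len(first)
    # Observed proportion of exact agreement between the two administrations.
    observed = sum(a == b for a, b in zip(first, second)) / n
    # Agreement expected by chance, from the marginal category frequencies.
    c1, c2 = Counter(first), Counter(second)
    expected = sum(c1[k] * c2[k] for k in c1) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (observed - expected) / (1.0 - expected)


# Hypothetical ratings (levels 1-3) for eight respondents at two visits.
visit_1 = [1, 1, 2, 2, 3, 1, 2, 3]
visit_2 = [1, 1, 2, 3, 3, 1, 2, 2]
kappa = cohens_kappa(visit_1, visit_2)
```

Kappa corrects raw agreement for the agreement expected from the marginal frequencies alone, which is why it can be much lower than the percentage of identical answers.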
Translation and validation of the Spanish version of the Health of the Nation Outcome Scales for People with Learning Disabilities (HoNOS-LD).

PubMed

Esteba-Castillo, Susanna; Torrents-Rodas, David; García-Alba, Javier; Ribas-Vidal, Núria; Novell-Alsina, Ramon

2016-12-21

The Health of the Nation Outcome Scales for People with Learning Disabilities (HoNOS-LD) is a brief instrument that assesses functioning in people with intellectual development disorder and mental health problems/behaviour disorders. The aim of the present study was to examine the evidence on the validity of the scores based on the Spanish version of the HoNOS-LD. The study included 111 participants that were assessed by the Spanish version of the HoNOS-LD and other questionnaires that measured different variables related to the scale. Thirty-three participants were assessed by 2 examiners, and retested 7 days later, in order to study inter-examiner reliability and test-retest reliabilities. Based on clinical and conceptual criteria, and on the results of the parallel analysis, a factorial solution with one factor was selected. Internal consistency was good (Omega coefficient of 0.87). Inter-examiner and test-retest reliabilities were excellent (intraclass correlation coefficients of 0.95 and 0.98, respectively). Correlations between sections of the HoNOS-LD and the related instruments showed the expected direction, and were highly significant (P<.001), and the HoNOS-LD score increased with the intensity of the support required by the participants. These results showed evidence of the validity of association with other external variables. The Spanish version of the HoNOS-LD is a brief, valid and reliable instrument, which will enable a routine assessment of functioning for different uses, including diagnosis and intervention. Copyright © 2016 SEP y SEPB. Publicado por Elsevier España, S.L.U. All rights reserved.
Evaluative measurement properties of the patient-specific functional scale for primary shoulder complaints in physical therapy practice.

PubMed

Koehorst, Marije L S; van Trijffel, Emiel; Lindeboom, Robert

2014-08-01

Clinical measurement, longitudinal. To assess the test-retest reliability, construct validity, and responsiveness of the Patient-Specific Functional Scale (PSFS) in patients with a primary shoulder complaint. Health measurement outcomes have become increasingly important for evaluating treatment. Patient-specific questionnaires are useful tools for determining treatment goals and evaluating treatment in individual patients. These questionnaires have not yet been validated in patients with nonspecific shoulder pain. Patients completed the PSFS, the numeric pain rating scale, and the Shoulder Pain and Disability Index at baseline, and after 1 week and 4 to 6 weeks. Test-retest reliability was determined using intraclass correlation coefficients. To assess convergent validity, change scores of the PSFS were correlated with the numeric pain rating scale and Shoulder Pain and Disability Index change scores. Responsiveness was assessed by calculating the area under the curve, the minimal clinically important change, and minimal detectable change, using the global rating of change as an external criterion. Fifty patients (37 men; mean age, 47.7 years) participated in the study. Reliability was high (intraclass correlation coefficient = 0.87; 95% confidence interval [CI]: 0.72, 0.94). The correlations between the change scores of the PSFS and those of the Shoulder Pain and Disability Index and numeric pain rating scale were 0.45 (95% CI: 0.17, 0.80) and 0.55 (95% CI: 0.29, 0.73), respectively. The area under the curve for the PSFS was 0.67 (95% CI: 0.51, 0.83). The minimal detectable change and minimal clinically important change were 0.97 and 1.29 points, respectively. These results suggest that the PSFS is a reliable, valid, and responsive instrument that can be used as an evaluative instrument in patients with a primary shoulder complaint.
Cultural adaptation and validation of a German version of the Arthritis Impact Measurement Scales (AIMS2).

PubMed

Rosemann, T; Szecsenyi, J

2007-10-01

To validate a translated and culturally adapted version of the Arthritis Impact Measurement Scale (AIMS) 2 in primary care patients with osteoarthritis (OA) of the hip and knee. The AIMS2 was translated into German and culturally adapted. The questionnaire then was administered to 220 primary care patients with OA of the knee or hip. Two hundred and nine questionnaires were returned and analysed. Test-retest reliability was tested in 50 randomly selected patients, of whom 42 completed the questionnaire a second time after 2 weeks. Item-scale correlations were reasonably good as well as the discriminative power of separate scales. The assessment of internal consistency reliability also revealed satisfactory values; Cronbach's alpha was 0.77 or higher for all scales. The test-retest reliability, estimated in an intraclass correlation coefficient (ICC), exceeded 0.90, except the "social activities" scale (0.87). Since only patients with OA of the lower limb were enrolled, substantial floor effects occurred in the "arm function" (28.2%) and the "hand and finger function" scale (29.2%). The principal factor analysis confirmed the postulated three-factor structure with a physical, physiological and social dimension, explaining 48.5%, 13.9% and 6.8% of the variation, respectively. External validity was assessed by calculating correlations to the Western Ontario and McMaster (WOMAC) osteoarthritis questionnaire, a pain visual analogue scale (VAS) and the Kellgren score as well as to disease duration. Spearman's "R" achieved satisfactory values for the corresponding WOMAC scales and the pain-VAS. Correlations with disease duration as well as with the radiological grading were low. The GERMAN-AIMS2 is a reliable and valid instrument to assess the quality of life (QoL) in primary care patients suffering from OA.
Inter-agency communication and operations capabilities during a hospital functional exercise: reliability and validity of a measurement tool.

PubMed

Savoia, Elena; Biddinger, Paul D; Burstein, Jon; Stoto, Michael A

2010-01-01

As proxies for actual emergencies, drills and exercises can raise awareness, stimulate improvements in planning and training, and provide an opportunity to examine how different components of the public health system would combine to respond to a challenge. Despite these benefits, there remains a substantial need for widely accepted and prospectively validated tools to evaluate agencies' and hospitals' performance during such events. Unfortunately, to date, few studies have focused on addressing this need. The purpose of this study was to assess the validity and reliability of a qualitative performance assessment tool designed to measure hospitals' communication and operational capabilities during a functional exercise. The study population included 154 hospital personnel representing nine hospitals that participated in a functional exercise in Massachusetts in June 2008. A 25-item questionnaire was developed to assess the following three hospital functional capabilities: (1) inter-agency communication; (2) communication with the public; and (3) disaster operations. Analyses were conducted to examine internal consistency, associations among scales, the empirical structure of the items, and inter-rater agreement. Twenty-two questions were retained in the final instrument, which demonstrated reliability with alpha coefficients of 0.83 or higher for all scales. A three-factor solution from the principal components analysis accounted for 57% of the total variance, and the factor structure was consistent with the original hypothesized domains. Inter-rater agreement between participants' self reported scores and external evaluators' scores ranged from moderate to good. The resulting 22-item performance measurement tool reliably measured hospital capabilities in a functional exercise setting, with preliminary evidence of concurrent and criterion-related validity.
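The alpha coefficients reported in the record above are a measure of internal consistency. A minimal sketch of the standard Cronbach's alpha formula follows, assuming complete data with one row per respondent and one column per item; the function name and the sample scores are hypothetical and do not come from the study.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    if len(rows) < 2:
        raise ValueError("need at least two respondents")
    k = len(rows[0])
    if k < 2 or any(len(r) != k for r in rows):
        raise ValueError("need at least two items and equal-length rows")
    # Sample variance of each item across respondents.
    item_vars = [variance([r[i] for r in rows]) for i in range(k)]
    # Sample variance of the respondents' total scores.
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical scores: four respondents answering a three-item scale.
scores = [[2, 3, 3], [4, 4, 5], [1, 2, 2], [5, 5, 4]]
alpha = cronbach_alpha(scores)
```

Alpha rises when items vary together, because the variance of the total score then exceeds the sum of the individual item variances.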
Verification of the reliability and validity of a Japanese version of the Quality of Life in Childhood Epilepsy Questionnaire (QOLCE-J).

PubMed

Moriguchi, Eri; Ito, Mikiko; Nagai, Toshisaburo

2015-11-01

A Japanese version of the Quality of Life in Childhood Epilepsy Questionnaire (QOLCE-J) was developed using international guidelines as a QOL scale for childhood epilepsy; its reliability and validity were examined, focusing on its applicability to Japanese pediatric epilepsy patients. A pilot test questionnaire survey was conducted involving parents of pediatric epilepsy patients aged 4-15 undergoing outpatient treatment. 278 responses were obtained and analyzed. Internal consistency for the 16 QOLCE-J subscales, except for , was sufficient, and a high overall coefficient α was obtained. The intraclass correlation coefficient was also high, supporting the test-retest reliability of this version. Regarding associations among the subscales, high correlations of r>0.7 were observed among , , and , representing cognitive and behavioral aspects, and among these and . In contrast, correlations among others were moderate or weaker. Furthermore, correlations of r>0.35 were observed among the subscales of the SDQ (Strength and Difficulties Questionnaire) used as an external criterion and the QOLCE-J, confirming the criterion validity of the study version. Analysis of associations between the total QOLCE-J score and pathology of epilepsy found significant correlations with age of onset, frequency of seizures, ADL, and symptoms of antiepileptic side effects. The QOLCE has mostly been used in treatment-resistant pediatric patients; the influence of interictal-period factors observed here, such as symptoms of antiepileptic side effects, suggests its usefulness for pediatric patients with seizures under control. The QOLCE-J, with sufficient reliability and validity, may be applicable as a QOL scale for Japanese children with epilepsy. Copyright © 2015 The Japanese Society of Child Neurology. Published by Elsevier B.V. All rights reserved.
Clinical audit project in undergraduate medical education curriculum: an assessment validation study

PubMed Central

Steketee, Carole; Mak, Donna

2016-01-01

Objectives To evaluate the merit of the Clinical Audit Project (CAP) in an assessment program for undergraduate medical education using a systematic assessment validation framework. Methods A cross-sectional assessment validation study at one medical school in Western Australia, with retrospective qualitative analysis of the design, development, implementation and outcomes of the CAP, and quantitative analysis of assessment data from four cohorts of medical students (2011- 2014). Results The CAP is fit for purpose with clear external and internal alignment to expected medical graduate outcomes. Substantive validity in students’ and examiners’ response processes is ensured through relevant methodological and cognitive processes. Multiple validity features are built-in to the design, planning and implementation process of the CAP. There is evidence of high internal consistency reliability of CAP scores (Cronbach’s alpha > 0.8) and inter-examiner consistency reliability (intra-class correlation>0.7). Aggregation of CAP scores is psychometrically sound, with high internal consistency indicating one common underlying construct. Significant but moderate correlations between CAP scores and scores from other assessment modalities indicate validity of extrapolation and alignment between the CAP and the overall target outcomes of medical graduates. Standard setting, score equating and fair decision rules justify consequential validity of CAP scores interpretation and use. Conclusions This study provides evidence demonstrating that the CAP is a meaningful and valid component in the assessment program. This systematic framework of validation can be adopted for all levels of assessment in medical education, from individual assessment modality, to the validation of an assessment program as a whole. PMID:27716612
Clinical audit project in undergraduate medical education curriculum: an assessment validation study.

PubMed

Tor, Elina; Steketee, Carole; Mak, Donna

2016-09-24

To evaluate the merit of the Clinical Audit Project (CAP) in an assessment program for undergraduate medical education using a systematic assessment validation framework. A cross-sectional assessment validation study at one medical school in Western Australia, with retrospective qualitative analysis of the design, development, implementation and outcomes of the CAP, and quantitative analysis of assessment data from four cohorts of medical students (2011- 2014). The CAP is fit for purpose with clear external and internal alignment to expected medical graduate outcomes. Substantive validity in students' and examiners' response processes is ensured through relevant methodological and cognitive processes. Multiple validity features are built-in to the design, planning and implementation process of the CAP. There is evidence of high internal consistency reliability of CAP scores (Cronbach's alpha > 0.8) and inter-examiner consistency reliability (intra-class correlation>0.7). Aggregation of CAP scores is psychometrically sound, with high internal consistency indicating one common underlying construct. Significant but moderate correlations between CAP scores and scores from other assessment modalities indicate validity of extrapolation and alignment between the CAP and the overall target outcomes of medical graduates. Standard setting, score equating and fair decision rules justify consequential validity of CAP scores interpretation and use. This study provides evidence demonstrating that the CAP is a meaningful and valid component in the assessment program. This systematic framework of validation can be adopted for all levels of assessment in medical education, from individual assessment modality, to the validation of an assessment program as a whole.
The Social Attribution Task - Multiple Choice (SAT-MC): Psychometric comparison with social cognitive measures for schizophrenia research.

PubMed

Johannesen, Jason K; Fiszdon, Joanna M; Weinstein, Andrea; Ciosek, David; Bell, Morris D

2018-04-01

The Social Attribution Task-Multiple Choice (SAT-MC) tests the ability to extract social themes from viewed object motion. This form of animacy perception is thought to aid the development of social inference, but appears impaired in schizophrenia. The current study was undertaken to examine psychometric equivalence of two forms of the SAT-MC and to compare their performance against social cognitive tests recommended for schizophrenia research. Thirty-two schizophrenia (SZ) and 30 substance use disorder (SUD) participants completed both SAT-MC forms, the Bell-Lysaker Emotion Recognition Task (BLERT), Hinting Task, The Awareness of Social Inference Test (TASIT), Ambiguous Intentions and Hostility Questionnaire (AIHQ) and questionnaire measures of interpersonal function. Test sensitivity, construct and external validity, test-retest reliability, and internal consistency were evaluated. SZ scored significantly lower than SUD on both SAT-MC forms, each classifying ~60% of SZ as impaired, compared with ~30% of SUD. SAT-MC forms demonstrated good test-retest and parallel form reliability, minimal practice effect, high internal consistency, and similar patterns of correlation with social cognitive and external validity measures. The SAT-MC compared favorably to recommended social cognitive tests across psychometric features and, with exception of TASIT, was most sensitive to impairment in schizophrenia when compared to a chronic substance use sample. Published by Elsevier B.V.
Reliability assessment of fiber optic communication lines depending on external factors and diagnostic errors

NASA Astrophysics Data System (ADS)

Bogachkov, I. V.; Lutchenko, S. S.

2018-05-01

The article deals with the method for the assessment of the fiber optic communication lines (FOCL) reliability taking into account the effect of the optical fiber tension, the temperature influence and the built-in diagnostic equipment errors of the first kind. The reliability is assessed in terms of the availability factor using the theory of Markov chains and probabilistic mathematical modeling. To obtain a mathematical model, the following steps are performed: the FOCL state is defined and validated; the state graph and system transitions are described; the system transition of states that occur at a certain point is specified; the real and the observed time of system presence in the considered states are identified. According to the permissible value of the availability factor, it is possible to determine the limiting frequency of FOCL maintenance.
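The availability factor mentioned in the record above can be illustrated with a deliberately simplified sketch. It assumes a two-state (working/failed) continuous-time Markov chain with constant failure and repair rates, which is a textbook simplification and not the authors' model: it ignores fiber tension, temperature and diagnostic errors. All names and numbers are hypothetical.

```python
def steady_state_availability(failure_rate, repair_rate):
    """Long-run fraction of time a two-state Markov system spends working.

    failure_rate: transitions per hour from working to failed (lambda).
    repair_rate: transitions per hour from failed to working (mu).
    """
    if failure_rate < 0 or repair_rate <= 0:
        raise ValueError("failure_rate must be >= 0 and repair_rate > 0")
    # Balance equation: lambda * P(working) = mu * P(failed), P(working) + P(failed) = 1.
    return repair_rate / (failure_rate + repair_rate)


def meets_permissible_availability(failure_rate, repair_rate, permissible):
    """Check the computed availability against a required minimum value."""
    return steady_state_availability(failure_rate, repair_rate) >= permissible


# Hypothetical line: one failure per 1000 hours, mean repair time of 10 hours.
availability = steady_state_availability(0.001, 0.1)
acceptable = meets_permissible_availability(0.001, 0.1, permissible=0.99)
```

In this simplified model the availability depends only on the ratio of the two rates, so shortening the mean repair time has the same effect as a proportional reduction in the failure rate.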
Reliability and Validity of the Brief Problem Monitor, an Abbreviated Form of the Child Behavior Checklist

PubMed Central

Piper, Brian J.; Gray, Hilary M.; Raber, Jacob; Birkett, Melissa A.

2014-01-01

Aim The parent form of the 113 item Child Behavior Checklist (CBCL) is widely utilized by child psychiatrists and psychologists. This report examines the reliability and validity of a recently developed abbreviated version of the CBCL, the Brief Problem Monitor (BPM). Methods Caregivers (N=567) completed the CBCL online and the 19 BPM items were examined separately. Results Internal consistency of the BPM was high (Cronbach’s alpha=0.91) and satisfactory for the Internalizing (0.78), Externalizing (0.86), and Attention (0.87) scales. High correlations between the CBCL and BPM were identified for the total score (r=0.95) as well as the Internalizing (0.86), Externalizing (0.93), and Attention (0.97) scales. The BPM and scales were sensitive and identified significantly higher behavioral and emotional problems among children whose caregiver reported a psychiatric diagnosis of Attention Deficit Hyperactivity Disorder, bipolar, depression, anxiety, developmental disabilities, or Autism Spectrum Disorders relative to a comparison group that had not been diagnosed with these disorders. BPM ratings also differed by the socioeconomic status and education of the caregiver. Mothers with higher annual incomes rated their children as having 38.8% fewer total problems (Cohen’s d=0.62) as well as 42.8% lower Internalizing (d=0.53), 44.1% less Externalizing (d=0.62), and 30.9% decreased Attention (d=0.39). A similar pattern was evident for maternal education (d=0.30 to 0.65). Conclusion Overall, these findings provide strong psychometric support for the BPM, although the differences based on the characteristics of the parent indicate that additional information from other sources (e.g., teachers) should be obtained to complement parental reports. PMID:24735087
Reliability, validity and feasibility of nail ultrasonography in psoriatic arthritis.

PubMed

Arbault, Anaïs; Devilliers, Hervé; Laroche, Davy; Cayot, Audrey; Vabres, Pierre; Maillefert, Jean-Francis; Ornetti, Paul

2016-10-01

To determine the feasibility, reliability and validity of nail ultrasonography in psoriatic arthritis as an outcome measure. Pilot prospective single-centre study of eight ultrasonography parameters in B mode and power Doppler concerning the distal interphalangeal (DIP) joint, the matrix, the bed and nail plate. Intra-observer and inter-observer reliability was evaluated for the seven quantitative parameters (ICC and kappa). Correlations between ultrasonography and clinical variables were examined to assess external validity. Feasibility was assessed by the time to carry out the examination and the percentage of missing data. Twenty-seven patients with psoriatic arthritis (age 55.0±16.2 years, disease duration 13.4±9.4 years) were included. Of these, 67% presented nail involvement on ultrasonography vs 37% on physical examination (P<0.05). Reliability was good (ICC and weighted kappa>0.75) for the seven quantitative parameters, except for synovitis of the DIP joint in B mode. The synovitis of the DIP joint revealed by ultrasonography correlated with the total number of clinical synovitis and Doppler US of the nail (matrix and bed). Doppler US of the matrix correlated with VAS pain but not with the ASDAS-CRP or with clinical enthesitis. No significant correlation was found with US nail thickness. The feasibility and reliability of ultrasonography of the nail in psoriatic arthritis appear to be satisfactory. Among the eight parameters evaluated, power Doppler of the matrix, which correlated with local inflammation (DIP joint and bed) and with VAS pain, could become an interesting outcome measure, provided that it is also sensitive to change. Copyright © 2015 Société française de rhumatologie. Published by Elsevier SAS. All rights reserved.
Using Item Response Theory to Develop a 60-Item Representation of the NEO PI-R Using the International Personality Item Pool: Development of the IPIP-NEO-60.

PubMed

Maples-Keller, Jessica L; Williamson, Rachel L; Sleep, Chelsea E; Carter, Nathan T; Campbell, W Keith; Miller, Joshua D

2017-10-31

Given advantages of freely available and modifiable measures, an increase in the use of measures developed from the International Personality Item Pool (IPIP), including the 300-item representation of the Revised NEO Personality Inventory (NEO PI-R; Costa & McCrae, 1992a) has occurred. The focus of this study was to use item response theory to develop a 60-item, IPIP-based measure of the Five-Factor Model (FFM) that provides equal representation of the FFM facets and to test the reliability and convergent and criterion validity of this measure compared to the NEO Five Factor Inventory (NEO-FFI). In an undergraduate sample (n = 359), scores from the NEO-FFI and IPIP-NEO-60 demonstrated good reliability and convergent validity with the NEO PI-R and IPIP-NEO-300. Additionally, across criterion variables in the undergraduate sample as well as a community-based sample (n = 757), the NEO-FFI and IPIP-NEO-60 demonstrated similar nomological networks across a wide range of external variables (r_ICC = .96). Finally, as expected, in an MTurk sample the IPIP-NEO-60 demonstrated advantages over the Big Five Inventory-2 (Soto & John, 2017; n = 342) with regard to the Agreeableness domain content. The results suggest strong reliability and validity of the IPIP-NEO-60 scores.
Development and Validation of a Self-reported Questionnaire for Measuring Internet Search Dependence

PubMed Central

Wang, Yifan; Wu, Lingdan; Zhou, Hongli; Xu, Jiaojing; Dong, Guangheng

2016-01-01

Internet search has become the most common way that people deal with issues and problems in everyday life. The wide use of Internet search has largely changed the way people search for and store information. There is a growing interest in the impact of Internet search on users’ affect, cognition, and behavior. Thus, it is essential to develop a tool to measure the changes in psychological characteristics as a result of long-term use of Internet search. The aim of this study is to develop a Questionnaire on Internet Search Dependence (QISD) and test its reliability and validity. We first proposed a preliminary structure and items of the QISD based on literature review, supplemental investigations, and interviews. And then, we assessed the psychometric properties and explored the factor structure of the initial version via exploratory factor analysis (EFA). The EFA results indicated that four dimensions of the QISD were very reliable, i.e., habitual use of Internet search, withdrawal reaction, Internet search trust, and external storage under Internet search. Finally, we tested the factor solution obtained from EFA through confirmatory factor analysis (CFA). The results of CFA confirmed that the four dimensions model fits the data well. In all, this study suggests that the 12-item QISD is of high reliability and validity and can serve as a preliminary tool to measure the features of Internet search dependence. PMID:28066753
Development and Validation of a Self-reported Questionnaire for Measuring Internet Search Dependence.

PubMed

Wang, Yifan; Wu, Lingdan; Zhou, Hongli; Xu, Jiaojing; Dong, Guangheng

2016-01-01

Internet search has become the most common way that people deal with issues and problems in everyday life. The wide use of Internet search has largely changed the way people search for and store information. There is a growing interest in the impact of Internet search on users' affect, cognition, and behavior. Thus, it is essential to develop a tool to measure the changes in psychological characteristics as a result of long-term use of Internet search. The aim of this study is to develop a Questionnaire on Internet Search Dependence (QISD) and test its reliability and validity. We first proposed a preliminary structure and items of the QISD based on literature review, supplemental investigations, and interviews. And then, we assessed the psychometric properties and explored the factor structure of the initial version via exploratory factor analysis (EFA). The EFA results indicated that four dimensions of the QISD were very reliable, i.e., habitual use of Internet search, withdrawal reaction, Internet search trust, and external storage under Internet search. Finally, we tested the factor solution obtained from EFA through confirmatory factor analysis (CFA). The results of CFA confirmed that the four dimensions model fits the data well. In all, this study suggests that the 12-item QISD is of high reliability and validity and can serve as a preliminary tool to measure the features of Internet search dependence.
Evaluation of validity and reliability of a methodology for measuring human postural attitude and its relation to temporomandibular joint disorders

PubMed Central

Fernández, Ramón Fuentes; Carter, Pablo; Muñoz, Sergio; Silva, Héctor; Venegas, Gonzalo Hernán Oporto; Cantin, Mario; Ottone, Nicolás Ernesto

2016-01-01

INTRODUCTION Temporomandibular joint disorders (TMJDs) are caused by several factors such as anatomical, neuromuscular and psychological alterations. A relationship has been established between TMJDs and postural alterations, a type of anatomical alteration. An anterior position of the head requires hyperactivity of the posterior neck region and shoulder muscles to prevent the head from falling forward. This compensatory muscular function may cause fatigue, discomfort and trigger point activation. To our knowledge, a method for assessing human postural attitude in more than one plane has not been reported. Thus, the aim of this study was to design a methodology to measure the external human postural attitude in frontal and sagittal planes, with proper validity and reliability analyses. METHODS The variable postures of 78 subjects (36 men, 42 women; age 18–24 years) were evaluated. The postural attitudes of the subjects were measured in the frontal and sagittal planes, using an acromiopelvimeter, grid panel and Fox plane. RESULTS The method we designed for measuring postural attitudes had adequate reliability and validity, both qualitatively and quantitatively, based on Cohen’s Kappa coefficient (> 0.87) and Pearson’s correlation coefficient (r = 0.824, > 80%). CONCLUSION This method exhibits adequate metrical properties and can therefore be used in further research on the association of human body posture with skeletal types and TMJDs. PMID:26768173
Evaluation of validity and reliability of a methodology for measuring human postural attitude and its relation to temporomandibular joint disorders.

PubMed

Fuentes Fernández, Ramón; Carter, Pablo; Muñoz, Sergio; Silva, Héctor; Oporto Venegas, Gonzalo Hernán; Cantin, Mario; Ottone, Nicolás Ernesto

2016-04-01

Temporomandibular joint disorders (TMJDs) are caused by several factors such as anatomical, neuromuscular and psychological alterations. A relationship has been established between TMJDs and postural alterations, a type of anatomical alteration. An anterior position of the head requires hyperactivity of the posterior neck region and shoulder muscles to prevent the head from falling forward. This compensatory muscular function may cause fatigue, discomfort and trigger point activation. To our knowledge, a method for assessing human postural attitude in more than one plane has not been reported. Thus, the aim of this study was to design a methodology to measure the external human postural attitude in frontal and sagittal planes, with proper validity and reliability analyses. The variable postures of 78 subjects (36 men, 42 women; age 18-24 years) were evaluated. The postural attitudes of the subjects were measured in the frontal and sagittal planes, using an acromiopelvimeter, grid panel and Fox plane. The method we designed for measuring postural attitudes had adequate reliability and validity, both qualitatively and quantitatively, based on Cohen's Kappa coefficient (> 0.87) and Pearson's correlation coefficient (r = 0.824, > 80%). This method exhibits adequate metrical properties and can therefore be used in further research on the association of human body posture with skeletal types and TMJDs. Copyright © Singapore Medical Association.
The Comprehensive Care Project: Measuring Physician Performance in Ambulatory Practice

PubMed Central

Holmboe, Eric S; Weng, Weifeng; Arnold, Gerald K; Kaplan, Sherrie H; Normand, Sharon-Lise; Greenfield, Sheldon; Hood, Sarah; Lipner, Rebecca S

2010-01-01

Objective: To investigate the feasibility, reliability, and validity of comprehensively assessing physician-level performance in ambulatory practice. Data Sources/Study Setting: Ambulatory-based general internists in 13 states participated in the assessment. Study Design: We assessed physician-level performance, adjusted for patient factors, on 46 individual measures, an overall composite measure, and composite measures for chronic, acute, and preventive care. Between- versus within-physician variation was quantified by intraclass correlation coefficients (ICC). External validity was assessed by correlating performance on a certification exam. Data Collection/Extraction Methods: Medical records for 236 physicians were audited for seven chronic and four acute care conditions, and six age- and gender-appropriate preventive services. Principal Findings: Performance on the individual and composite measures varied substantially within (range 5–86 percent compliance on 46 measures) and between physicians (ICC range 0.12–0.88). Reliabilities for the composite measures were robust: 0.88 for chronic care and 0.87 for preventive services. Higher certification exam scores were associated with better performance on the overall (r = 0.19, p < .01), chronic care (r = 0.14, p = .04), and preventive services composites (r = 0.17, p = .01). Conclusions: Our results suggest that reliable and valid comprehensive assessment of the quality of chronic and preventive care can be achieved by creating composite measures and by sampling feasible numbers of patients for each condition. PMID:20819110
A cluster analytic study of the Wechsler Intelligence Test for Children-IV in children referred for psychoeducational assessment due to persistent academic difficulties.

PubMed

Hale, Corinne R; Casey, Joseph E; Ricciardi, Philip W R

2014-02-01

Wechsler Intelligence Test for Children-IV core subtest scores of 472 children were cluster analyzed to determine if reliable and valid subgroups would emerge. Three subgroups were identified. Clusters were reliable across different stages of the analysis as well as across algorithms and samples. With respect to external validity, the Globally Low cluster differed from the other two clusters on Wechsler Individual Achievement Test-II Word Reading, Numerical Operations, and Spelling subtests, whereas the latter two clusters did not differ from one another. The clusters derived have been identified in studies using previous WISC editions. Clusters characterized by poor performance on subtests historically associated with the VIQ (i.e., VCI + WMI) and PIQ (i.e., POI + PSI) did not emerge, nor did a cluster characterized by low scores on PRI subtests. Picture Concepts represented the highest subtest score in every cluster, failing to vary in a predictable manner with the other PRI subtests.

The German version of the Posttraumatic Stress Disorder Checklist for DSM-5 (PCL-5): psychometric properties and diagnostic utility.

PubMed

Krüger-Gottschalk, Antje; Knaevelsrud, Christine; Rau, Heinrich; Dyer, Anne; Schäfer, Ingo; Schellong, Julia; Ehring, Thomas

2017-11-28

The Posttraumatic Stress Disorder (PTSD) Checklist (PCL, now PCL-5) has recently been revised to reflect the new diagnostic criteria of the disorder. A clinical sample of trauma-exposed individuals (N = 352) was assessed with the Clinician Administered PTSD Scale for DSM-5 (CAPS-5) and the PCL-5. Internal consistencies and test-retest reliability were computed. To investigate diagnostic accuracy, we calculated receiver operating curves. Confirmatory factor analyses (CFA) were performed to analyze the structural validity. Results showed high internal consistency (α = .95), high test-retest reliability (r = .91) and a high correlation with the total severity score of the CAPS-5, r = .77. In addition, the recommended cutoff of 33 on the PCL-5 showed high diagnostic accuracy when compared to the diagnosis established by the CAPS-5. CFAs comparing the DSM-5 model with alternative models (the three-factor solution, the dysphoria, anhedonia, externalizing behavior and hybrid model) to account for the structural validity of the PCL-5 remained inconclusive. Overall, the findings show that the German PCL-5 is a reliable instrument with good diagnostic accuracy. However, more research evaluating the underlying factor structure is needed.
Critical Analysis of Strategies for Determining Rigor in Qualitative Inquiry.

PubMed

Morse, Janice M

2015-09-01

Criteria for determining the trustworthiness of qualitative research were introduced by Guba and Lincoln in the 1980s when they replaced terminology for achieving rigor, reliability, validity, and generalizability with dependability, credibility, and transferability. Strategies for achieving trustworthiness were also introduced. This landmark contribution to qualitative research remains in use today, with only minor modifications in format. Despite the significance of this contribution over the past four decades, the strategies recommended to achieve trustworthiness have not been critically examined. Recommendations for where, why, and how to use these strategies have not been developed, and how well they achieve their intended goal has not been examined. We do not know, for example, what impact these strategies have on the completed research. In this article, I critique these strategies. I recommend that qualitative researchers return to the terminology of social sciences, using rigor, reliability, validity, and generalizability. I then make recommendations for the appropriate use of the strategies recommended to achieve rigor: prolonged engagement, persistent observation, and thick, rich description; inter-rater reliability, negative case analysis; peer review or debriefing; clarifying researcher bias; member checking; external audits; and triangulation. © The Author(s) 2015.
[Design and validation of a brief questionnaire to assess young´s sexual knowledge].

PubMed

Leon-Larios, Fátima; Gómez-Baya, Diego

2018-06-01

Very few instruments have been developed to assess sexual knowledge and practices. Most research to date has been carried out with adolescent samples rather than with university students, who are also at a particularly risky stage. The aim of this study was to design and validate a brief questionnaire assessing young people's sexual knowledge, practices and behaviors, in order to inform health education programs in the university context. We created a specific questionnaire about sexual patterns in university adolescents and a brief questionnaire consisting of 9 true/false items about contraception, sexuality and sexually transmitted diseases. We carried out a pilot study, a reliability analysis (KR-20), and validity analyses using factor analysis and associations with other variables. A total of 566 students from the University of Seville participated during 2015/16. One item was eliminated because of poor comprehension (only 13.9% correct answers) and weak or non-significant associations (p > 0.05). The final scale comprised 8 items and had good internal consistency reliability (KR-20 = 0.57) as well as both factorial and external validity. A three-factor model showed good fit to the data, χ²(14, N = 566) = 17.48, p = 0.232, comparative fit index (CFI) = 0.97, root mean square error of approximation (RMSEA) = 0.02. Participants with less knowledge about sexuality were those who had not received any information (M = 6.82, SD = 1.41), had no partner (M = 6.87, SD = 1.35), had had an abortion (M = 6.43, SD = 1.95), used no contraceptive method (M = 6.66, SD = 0.58) or used coitus interruptus (M = 6.55, SD = 1.39), and had less frequent sexual relations, e.g., once or twice a year (M = 6.49, SD = 1.70). This questionnaire is a short instrument to assess students' practices and knowledge about sexuality and contraception. The analyses of reliability and validity have shown the good psychometric properties of this instrument.
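
For readers unfamiliar with the KR-20 coefficient mentioned in the abstract above, the following is a minimal sketch of how it is usually computed for true/false items. It is not the authors' code; the function name `kr20` and the tiny response matrix are hypothetical, and the sketch assumes the common formulation that uses population variances throughout.

```python
from statistics import pvariance


def kr20(responses):
    """Kuder-Richardson 20 for dichotomous (0/1) items.

    responses: list of rows, one per respondent, each a list of 0/1 scores.
    Assumption: population variance is used for the total scores, which is
    consistent with p * (1 - p) being the population variance of a 0/1 item.
    """
    n_people = len(responses)
    n_items = len(responses[0])

    # Variance of each respondent's total score.
    totals = [sum(row) for row in responses]
    total_var = pvariance(totals)

    # Sum of item variances p_j * q_j, where p_j is the proportion scoring 1.
    pq_sum = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in responses) / n_people
        pq_sum += p * (1 - p)

    return (n_items / (n_items - 1)) * (1 - pq_sum / total_var)


# Hypothetical data: 4 respondents, 3 true/false items (1 = correct).
example = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
print(round(kr20(example), 3))
```

KR-20 is the special case of Cronbach's alpha for binary items, so the same conventional reading of the coefficient applies.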
The importance of molecular structures, endpoints' values, and predictivity parameters in QSAR research: QSAR analysis of a series of estrogen receptor binders.

PubMed

Li, Jiazhong; Gramatica, Paola

2010-11-01

Quantitative structure-activity relationship (QSAR) methodology aims to explore the relationship between molecular structures and experimental endpoints, producing a model for the prediction of new data; the predictive performance of the model must be checked by external validation. Clearly, the quality of the chemical structure information and of the experimental endpoints, as well as the statistical parameters used to verify external predictivity, have a strong influence on QSAR model reliability. Here, we emphasize the importance of these three aspects by analyzing our models of estrogen receptor binders (Endocrine Disruptor Knowledge Base, EDKB). Endocrine disrupting chemicals, which mimic or antagonize endogenous hormones such as estrogens, are a hot topic in environmental and toxicological sciences. QSAR is of great value in predicting estrogenic activity and exploring the interactions between the estrogen receptor and ligands. We verified our previously published model by additional external validation on new EDKB chemicals. Having found some errors in the 3D molecular conformations used, we redeveloped the model using the same data set with corrected structures, the same method (ordinary least squares regression, OLS) and DRAGON descriptors. The new model, based on some different descriptors, is more predictive on external prediction sets. Three different formulas for calculating the correlation coefficient for the external prediction set (Q²EXT) were compared, and the results indicated that the new proposal of Consonni et al. gave more reasonable results, consistent with the conclusions from the regression line, Williams plot and root mean square error (RMSE) values.
Finally, the importance of reliable endpoint values has been highlighted by comparing the classification assignments of EDKB with those of another database of estrogen receptor binders (METI): we found that 16.1% of the assignments for the common compounds were opposite (20 of 124 common compounds). In order to verify the real assignments for these inconsistent compounds, we predicted these samples, as a blind external set, with our regression models and compared the results with the two databases. The results indicated that most of the predictions were consistent with METI. Furthermore, we built a kNN classification model using the 104 consistent compounds to predict the inconsistent ones, and most of the predictions were also in agreement with the METI database.
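
The abstract above does not spell out the three external-validation formulas it compares. As a hedged illustration, the sketch below computes the three variants commonly written Q²F1, Q²F2 and Q²F3 in the QSAR literature, the last being the form usually attributed to Consonni et al.; that these are exactly the formulas compared in the paper is an assumption. The function name `q2_external` and the numbers are hypothetical.

```python
def q2_external(y_train, y_ext, y_pred_ext):
    """Three common external-predictivity coefficients for a regression model.

    y_train:    observed responses of the training set
    y_ext:      observed responses of the external prediction set
    y_pred_ext: model predictions for the external prediction set
    """
    n_tr, n_ext = len(y_train), len(y_ext)
    mean_tr = sum(y_train) / n_tr
    mean_ext = sum(y_ext) / n_ext

    # Prediction error sum of squares on the external set.
    press = sum((y - p) ** 2 for y, p in zip(y_ext, y_pred_ext))

    # Reference sums of squares used by each variant.
    ss_ext_about_train = sum((y - mean_tr) ** 2 for y in y_ext)
    ss_ext_about_ext = sum((y - mean_ext) ** 2 for y in y_ext)
    ss_train = sum((y - mean_tr) ** 2 for y in y_train)

    return {
        # External deviations measured from the training-set mean.
        "Q2_F1": 1 - press / ss_ext_about_train,
        # External deviations measured from the external-set mean.
        "Q2_F2": 1 - press / ss_ext_about_ext,
        # Per-object errors scaled by the training-set variance.
        "Q2_F3": 1 - (press / n_ext) / (ss_train / n_tr),
    }


# Hypothetical values, for illustration only.
result = q2_external(
    y_train=[1.0, 2.0, 3.0, 4.0],
    y_ext=[2.0, 4.0],
    y_pred_ext=[2.5, 3.5],
)
for name, value in result.items():
    print(name, round(value, 3))
```

The three variants differ only in the reference variance they divide by, which is why they can disagree when the external set is small or distributed differently from the training set.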
Current and historical individual data about exposure of workers in the rayon industry to carbon disulfide and their validity in calculating the cumulative dose.

PubMed

Göen, Thomas; Schramm, Axel; Baumeister, Thomas; Uter, Wolfgang; Drexler, Hans

2014-08-01

The objective of the study was to investigate how exposure to carbon disulfide (CS2) in a rayon-manufacturing plant has changed within two decades and whether it is possible to calculate valid data for the individual cumulative exposure. The data for CS2 concentration in air and biological exposure monitoring (2-thio-1,3-thiazolidine-4-carboxylic acid (TTCA) in urine) from two cross-sectional studies, performed in 1992 (n = 362) and 2009 (n = 212) in a German rayon-manufacturing plant, were compared to data obtained from company-internal measurements between the studies. Using the data from the cross-sectional studies and company-internal data, the cumulative external exposure and the cumulative internal exposure were calculated for each worker. External and internal CS2 exposure of the employees decreased from 1992 (medians 4.0 ppm and 1.63 mg TTCA/g creatinine) to 2009 (medians 2.5 ppm and 0.86 mg/g). However, company-internal CS2 data do not show a consistent trend for this period. The annual medians of the company-internal measurements of external exposure to CS2 varied between 2.7 and 8.4 ppm, with median values generally exceeding 5 ppm since 2000. The annual medians for the company-internal biomonitoring assessment ranged between 1.2 and 2.8 mg/g creatinine. The cumulative CS2 exposure ranged from 8.5 to 869.5 ppm years for external exposure and between 1.30 and 176.2 mg/g creatinine years for the internal exposure. Significant correlations were found between the current air pollution and the internal exposure in 2009, and also between the cumulative external and internal CS2 exposure. Current exposure data, usually collected in cross-sectional studies, rarely allow a reliable statement on the cumulative dose, because of higher exposure in the past and fluctuating exposure over time. On the other hand, company-internal exposure data may be affected by non-representative measurement strategies.
Some verification of the reliability of cumulative exposure data may be possible by testing the correlation between cumulative exposure data of external assessment and biological monitoring.
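
To make the "ppm years" unit in the abstract above concrete, here is a minimal sketch of a cumulative-dose calculation and of the correlation check it suggests. The abstract does not give the authors' exact procedure, so the approach (annual level multiplied by time exposed, then summed) is an assumption, and the function names and all numbers are hypothetical.

```python
from math import sqrt


def cumulative_exposure(periods):
    """Sum of level * duration over exposure periods.

    periods: list of (level, duration_in_years) pairs, e.g. an annual median
    in ppm and the fraction of that year the worker was employed.
    Returns the cumulative dose in level-years (e.g. ppm years).
    """
    return sum(level * years for level, years in periods)


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical worker: one full year at 4.0 ppm, then half a year at 2.5 ppm.
print(cumulative_exposure([(4.0, 1.0), (2.5, 0.5)]))

# Hypothetical cohort: cumulative external (ppm years) against cumulative
# internal (mg/g creatinine years) exposure for four workers.
external = [10.0, 40.0, 120.0, 300.0]
internal = [2.1, 7.9, 25.0, 58.0]
print(round(pearson_r(external, internal), 3))
```

A high correlation between the two cumulative series is the kind of plausibility check the abstract proposes; it does not by itself show that either series is unbiased.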
Measuring treatment satisfaction in MS: Is the Treatment Satisfaction Questionnaire for Medication fit for purpose?

PubMed

Vermersch, Patrick; Hobart, Jeremy; Dive-Pouletty, Catherine; Bozzi, Sylvie; Hass, Steven; Coyle, Patricia K

2017-04-01

The Treatment Satisfaction Questionnaire for Medication (TSQM) was designed to assess patient treatment satisfaction in chronic diseases. Its performance has not been examined in multiple sclerosis (MS). The 14 items of the TSQM cover four domains: Effectiveness, Side Effects, Convenience, and Global Satisfaction. To evaluate performance of the TSQM in patients with relapsing MS, using data collected from the TENERE study (NCT00883337), in which 324 patients received oral teriflunomide or subcutaneous interferon beta-1a for ⩾48 weeks. Five measurement properties were examined using traditional psychometric methods: data completeness, scale-to-sample targeting, scaling assumptions, reliability (including test-retest), and construct validity (internal: item-level scaling success, confirmatory factor analysis, and exploratory factor analysis; external: convergence, discrimination, and group differences). There were few (<2%) missing item data; domain scores could be computed for all patients. Score distributions were skewed toward higher satisfaction; two domains had marked ceiling effects. Scaling assumptions were supported. Internal consistency reliability was high (Cronbach's α > 0.90). Internal validity tests supported item groupings. Correlations supported convergent and discriminant construct validity; hypothesis testing supported group differences validity. This investigation found the TSQM to be a useful tool, exhibiting good psychometric measurement properties in patients with relapsing MS in the TENERE study.
Assessing physiotherapists' communication skills for promoting patient autonomy for self-management: reliability and validity of the communication evaluation in rehabilitation tool.

PubMed

Murray, Aileen; Hall, Amanda; Williams, Geoffrey C; McDonough, Suzanne M; Ntoumanis, Nikos; Taylor, Ian; Jackson, Ben; Copsey, Bethan; Hurley, Deirdre A; Matthews, James

2018-02-27

To assess the inter-rater reliability and concurrent validity of the Communication Evaluation in Rehabilitation Tool, which aims to externally assess physiotherapists' competency in using Self-Determination Theory-based communication strategies in practice. Audio recordings of initial consultations between 24 physiotherapists and 24 patients with chronic low back pain in four hospitals in Ireland were obtained as part of a larger randomised controlled trial. Three raters, all of whom had PhDs in psychology and expertise in motivation and physical activity, independently listened to the 24 audio recordings and completed the 18-item Communication Evaluation in Rehabilitation Tool. Inter-rater reliability between all three raters was assessed using intraclass correlation coefficients. Concurrent validity was assessed using Pearson's r correlations with a reference standard, the Health Care Climate Questionnaire. The total score for the Communication Evaluation in Rehabilitation Tool is an average of all 18 items. Total scores demonstrated good inter-rater reliability (Intraclass Correlation Coefficient (ICC) = 0.8) and concurrent validity with the Health Care Climate Questionnaire total score (range: r = 0.7-0.88). Item-level scores of the Communication Evaluation in Rehabilitation Tool identified five items that need improvement. Results provide preliminary evidence to support future use and testing of the Communication Evaluation in Rehabilitation Tool. Implications for Rehabilitation: Promoting patient autonomy is a learned skill, and while interventions exist to train clinicians in these skills, there are no tools to assess how well clinicians use these skills when interacting with a patient. The lack of robust assessment has severe implications regarding both the fidelity of clinician training packages and resulting outcomes for promoting patient autonomy.
This study has developed a novel measurement tool, the Communication Evaluation in Rehabilitation Tool, and a comprehensive user manual to assess how well health care providers use autonomy-supportive communication strategies in real-world clinical settings. This tool has demonstrated good inter-rater reliability and concurrent validity in its initial testing phase. The Communication Evaluation in Rehabilitation Tool can be used in future studies to assess autonomy-supportive communication and undergo further measurement property testing as per our recommendations.
Validation of the American version of the CareGiver Oncology Quality of Life (CarGOQoL) questionnaire.

PubMed

Kaveney, Sarah C; Baumstarck, Karine; Minaya-Flores, Patricia; Shannon, Tarrah; Symes, Philip; Loundou, Anderson; Auquier, Pascal

2016-05-28

The CareGiver Oncology Quality of Life (CarGOQoL) questionnaire, a 29-item, multidimensional, self-administered questionnaire, was validated using a large French sample. We reported the linguistic validation process and the metric validity of the English version of CarGOQoL in the United States. The translation process consisted of 3 consecutive steps: forward-backward translation, acceptability testing, and cognitive interviews. The psychometric testing was applied to caregivers of consecutive patients with representative cancers who were recruited from the Regional Cancer Center in northwestern Pennsylvania. All individuals completed the CarGOQoL at baseline, day 30, and day 90. Internal consistency, reliability, external validity, reproducibility, and sensitivity to change were tested. The translated version was validated on a total of 87 American cancer caregivers. The dimensions of the CarGOQoL generally demonstrated a high internal consistency (Cronbach's alpha > 0.70 for all but four domain scores). External validity testing revealed that the CarGOQoL index score correlated significantly with all SF-36 dimension scores except the physical composite score (Pearson's correlation: 0.28-0.70). Reproducibility was satisfactory at day 30 (intraclass correlation coefficient: 0.46-0.94) and day 90 (0.43-0.92). Four specific dimensions of CarGOQoL showed responsiveness: Psychological well-being, Relationships with health care system, Social support, and Finances. The American version of the CarGOQoL constitutes a useful instrument to measure QoL in caregivers of cancer patients in the United States.
Reconceptualising the external validity of discrete choice experiments.

PubMed

Lancsar, Emily; Swait, Joffre

2014-10-01

External validity is a crucial but under-researched topic when considering using discrete choice experiment (DCE) results to inform decision making in clinical, commercial or policy contexts. We present the theory and tests traditionally used to explore external validity that focus on a comparison of final outcomes and review how this traditional definition has been empirically tested in health economics and other sectors (such as transport, environment and marketing) in which DCE methods are applied. While an important component, we argue that the investigation of external validity should be much broader than a comparison of final outcomes. In doing so, we introduce a new and more comprehensive conceptualisation of external validity, closely linked to process validity, that moves us from the simple characterisation of a model as being or not being externally valid on the basis of predictive performance, to the concept that external validity should be an objective pursued from the initial conceptualisation and design of any DCE. We discuss how such a broader definition of external validity can be fruitfully used and suggest innovative ways in which it can be explored in practice.
Turbomachine Sealing and Secondary Flows - Part 3. Part 3; Review of Power-Stream Support, Unsteady Flow Systems, Seal and Disk Cavity Flows, Engine Externals, and Life and Reliability Issues

NASA Technical Reports Server (NTRS)

Hendricks, R. C.; Steinetz, B. M.; Zaretsky, E. V.; Athavale, M. M.; Przekwas, A. J.

2004-01-01

The issues and components supporting the engine power stream are reviewed. It is essential that companies pay close attention to engine sealing issues, particularly on the high-pressure spool or high-pressure pumps. Small changes in these systems are reflected throughout the entire engine. Although cavity, platform, and tip sealing are complex and have a significant effect on component and engine performance, computational tools (e.g., NASA-developed INDSEAL, SCISEAL, and ADPAC) are available to help guide the designer and the experimenter. Gas turbine engine and rocket engine externals must all function efficiently with a high degree of reliability in order for the engine to run but often receive little attention until they malfunction. Within the open literature statistically significant data for critical engine components are virtually nonexistent; the classic approach is deterministic. Studies show that variations with loading can have a significant effect on component performance and life. Without validation data they are just studies. These variations and deficits in statistical databases require immediate attention.
Assessing child and adolescent pragmatic language competencies: toward evidence-based assessments.

PubMed

Russell, Robert L; Grizzle, Kenneth L

2008-06-01

Using language appropriately and effectively in social contexts requires pragmatic language competencies (PLCs). Increasingly, deficits in PLCs are linked to child and adolescent disorders, including autism spectrum, externalizing, and internalizing disorders. As the role of PLCs expands in diagnosis and treatment of developmental psychopathology, psychologists and educators will need to appraise and select clinical and research PLC instruments for use in assessments and/or studies. To assist in this appraisal, 24 PLC instruments, containing 1,082 items, are assessed by addressing four questions: (1) Can PLC domains targeted by assessment items be reliably identified?, (2) What are the core PLC domains that emerge across the 24 instruments?, (3) Do PLC questionnaires and tests assess similar PLC domains?, and (4) Do the instruments achieve content, structural, diagnostic, and ecological validity? Results indicate that test and questionnaire items can be reliably categorized into PLC domains, that PLC domains featured in questionnaires and tests significantly differ, and that PLC instruments need empirical confirmation of their dimensional structure, content validity across all developmental age bands, and ecological validity. Progress in building a better evidence base for PLC assessments should be a priority in future research.
A psychometric evaluation of the Temperament in Middle Childhood Questionnaire (TMCQ) in a Swedish sample.

PubMed

Nystrom, Beatrice; Bengtsson, Hans

2017-12-01

Personality is generally considered to be biologically founded in temperament, and temperamental qualities have proven to be relatively stable across childhood and into adulthood (Caspi, Roberts & Shiner, ). Temperament predicts important developmental outcomes such as academic performance (Muris, ), and social functioning (Eisenberg, Fabes, Guthrie & Reiser, ), and it has also been found to be strongly related to the etiology and maintenance of internalizing and externalizing psychopathology in children (Muris, Meesters & Blijlevens, ; Nigg, ). To allow for the possibility of making early interventions, identification of potential risk factors (such as temperamental dispositions) is of great importance (Rettew & McKee, ). As temperament is multidimensional and has many different manifestations, parents and teachers are valuable sources in providing information about children's temperament (Rothbart & Bates, ; Tackett, Slobodskaya, Mar et al., ), and caregiver questionnaires are frequently used in child personality research. However, such questionnaires are only useful if their reliability and validity have been established. The aim of the present study was to examine the psychometric properties of the Temperament in Middle Childhood Questionnaire (TMCQ; Simonds, Kieras, Rueda & Rothbart, ), which focuses specifically on the ages between 7 and 11 years. The TMCQ is the least validated of the Rothbart measures, and although reliability data have been presented, together with some validity data, for a computerized self-report version of the questionnaire (Simonds & Rothbart, ), information about the reliability and validity for the caregiver version is scant. In the present paper, we report such data for a Swedish sample. © 2017 Scandinavian Psychological Associations and John Wiley & Sons Ltd.
National audit of continence care: laying the foundation.

PubMed

Mian, Sarah; Wagg, Adrian; Irwin, Penny; Lowe, Derek; Potter, Jonathan; Pearson, Michael

2005-12-01

National audit provides a basis for establishing performance against national standards, benchmarking against other service providers and improving standards of care. For effective audit, clinical indicators are required that are valid, feasible to apply and reliable. This study describes the methods used to develop clinical indicators of continence care in preparation for a national audit. To describe the methods used to develop and test clinical indicators of continence care with regard to validity, feasibility and reliability. A multidisciplinary working group developed clinical indicators that measured the structure, process and outcome of care as well as case-mix variables. Literature searching, consensus workshops and a Delphi process were used to develop the indicators. The indicators were tested in 15 secondary care sites, 15 primary care sites and 15 long-term care settings. The process of development produced indicators that received a high degree of consensus within the Delphi process. Testing of the indicators demonstrated an internal reliability of 0.7 and an external reliability of 0.6. Data collection required significant investment in terms of staff time and training. The method used produced indicators that achieved a high degree of acceptance from health care professionals. The reliability of data collection was high for this audit and was similar to the level seen in other successful national audits. Data collection for the indicators was feasible; however, time and staffing were identified as limitations to such data collection. The study has described a systematic method for developing clinical indicators for national audit. The indicators proved robust and reliable in primary and secondary care as well as long-term care settings.
Target analyte quantification by isotope dilution LC-MS/MS directly referring to internal standard concentrations--validation for serum cortisol measurement.

PubMed

Maier, Barbara; Vogeser, Michael

2013-04-01

Isotope dilution LC-MS/MS methods used in the clinical laboratory typically involve multi-point external calibration in each analytical series. Our aim was to test the hypothesis that determination of target analyte concentrations directly derived from the relation of the target analyte peak area to the peak area of a corresponding stable isotope labelled internal standard compound [direct isotope dilution analysis (DIDA)] may not be inferior to conventional external calibration with respect to accuracy and reproducibility. Quality control samples and human serum pools were analysed in a comparative validation protocol for cortisol as an exemplary analyte by LC-MS/MS. Accuracy and reproducibility were compared between quantification involving a six-point external calibration function and a result calculation based merely on peak area ratios of unlabelled and labelled analyte. Both quantification approaches resulted in similar accuracy and reproducibility. For specified analytes, reliable analyte quantification directly derived from the ratio of peak areas of labelled and unlabelled analyte, without the need for a time-consuming multi-point calibration series, is possible. This DIDA approach is of considerable practical importance for the application of LC-MS/MS in the clinical laboratory, where short turnaround times often have high priority.
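
The two quantification routes compared in the abstract above can be sketched as follows. This is an illustration only, not the authors' validated procedure: it assumes that labelled and unlabelled analyte give the same detector response (response factor of 1) and that the external calibration function is a straight line fitted by least squares. All function names, peak areas and concentrations are hypothetical.

```python
def dida_concentration(area_analyte, area_internal_std, internal_std_conc):
    """Direct isotope dilution: concentration from the peak-area ratio alone.

    Assumption: equal response of labelled and unlabelled analyte.
    """
    return (area_analyte / area_internal_std) * internal_std_conc


def fit_calibration(concs, ratios):
    """Least-squares line ratio = slope * conc + intercept from calibrators."""
    n = len(concs)
    mean_c, mean_r = sum(concs) / n, sum(ratios) / n
    slope = (
        sum((c - mean_c) * (r - mean_r) for c, r in zip(concs, ratios))
        / sum((c - mean_c) ** 2 for c in concs)
    )
    intercept = mean_r - slope * mean_c
    return slope, intercept


def calibrated_concentration(ratio, slope, intercept):
    """Invert the calibration line to read a concentration off a sample ratio."""
    return (ratio - intercept) / slope


# Hypothetical sample: internal standard spiked at 200 units, analyte peak
# area half that of the internal standard.
direct = dida_concentration(50_000, 100_000, 200.0)

# Hypothetical calibrators measured with the same internal standard amount.
slope, intercept = fit_calibration([50.0, 100.0, 200.0], [0.25, 0.5, 1.0])
via_curve = calibrated_concentration(50_000 / 100_000, slope, intercept)

print(direct, round(via_curve, 6))
```

When the calibration line has a slope of one over the internal standard concentration and a zero intercept, as in these idealised numbers, both routes return the same value; real data would show how far that idealisation holds.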
Measuring emotions during epistemic activities: the Epistemically-Related Emotion Scales.

PubMed

Pekrun, Reinhard; Vogl, Elisabeth; Muis, Krista R; Sinatra, Gale M

2017-09-01

Measurement instruments assessing multiple emotions during epistemic activities are largely lacking. We describe the construction and validation of the Epistemically-Related Emotion Scales, which measure surprise, curiosity, enjoyment, confusion, anxiety, frustration, and boredom occurring during epistemic cognitive activities. The instrument was tested in a multinational study of emotions during learning from conflicting texts (N = 438 university students from the United States, Canada, and Germany). The findings document the reliability, internal validity, and external validity of the instrument. A seven-factor model best fit the data, suggesting that epistemically-related emotions should be conceptualised in terms of discrete emotion categories, and the scales showed metric invariance across the North American and German samples. Furthermore, emotion scores changed over time as a function of conflicting task information and related significantly to perceived task value and use of cognitive and metacognitive learning strategies.
Examining the Relations Among the DSM-5 Alternative Model of Personality, the Five-Factor Model, and Externalizing and Internalizing Behavior.

PubMed

Sleep, Chelsea E; Hyatt, Courtland S; Lamkin, Joanna; Maples-Keller, Jessica L; Miller, Joshua D

2017-01-26

Given long-standing criticisms of the DSM's reliance on categorical models of psychopathology, including the poor reliability and validity of personality-disorder diagnoses, the American Psychiatric Association (APA) published an alternative model (AM) of personality disorders in Section III of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5; APA, 2013), which, in part, comprises 5 pathological trait domains based on the 5-factor model (FFM). However, the empirical profiles and discriminant validity of the AM traits remain in question. We recruited a sample of undergraduates (N = 340) for the current study to compare the relations found between a measure of the DSM-5 AM traits (i.e., the Personality Inventory for DSM-5; PID-5; Krueger, Derringer, Markon, Watson, & Skodol, 2012) and a measure of the FFM (i.e., the International Personality Item Pool; IPIP; Goldberg, 1999) in relation to externalizing and internalizing symptoms. In general, the domains from the 2 measures were significantly related and demonstrated similar patterns of relations with these criteria, such that Antagonism/low Agreeableness and Disinhibition/low Conscientiousness were related to externalizing behaviors, whereas Negative Affectivity/Neuroticism was most significantly related to internalizing symptoms. However, the PID-5 demonstrated large interrelations among its domains and poorer discriminant validity than the IPIP. These results provide additional support that the conception of the trait model included in the DSM-5 AM is an extension of the FFM, but highlight some of the issues that arise due to the PID-5's more limited discriminant validity. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
The Psychometric Properties of the Center for Epidemiologic Studies Depression Scale in Chinese Primary Care Patients: Factor Structure, Construct Validity, Reliability, Sensitivity and Responsiveness.

PubMed

Chin, Weng Yee; Choi, Edmond P H; Chan, Kit T Y; Wong, Carlos K H

2015-01-01

The Center for Epidemiologic Studies Depression Scale (CES-D) is a commonly used instrument to measure depressive symptomatology. Despite this, the evidence for its psychometric properties remains poorly established in Chinese populations. The aim of this study was to validate the use of the CES-D in Chinese primary care patients by examining factor structure, construct validity, reliability, sensitivity and responsiveness. The psychometric properties were assessed amongst a sample of 3686 Chinese adult primary care patients in Hong Kong. Three competing factor structure models were examined using confirmatory factor analysis. The original CES-D four-factor model had adequate fit; however, the data were better fitted by a bi-factor model. For the internal construct validity, corrected item-total correlations were 0.4 for most items. The convergent validity was assessed by examining the correlations between the CES-D, the Patient Health Questionnaire 9 (PHQ-9) and the Short Form-12 Health Survey (version 2) Mental Component Summary (SF-12 v2 MCS). The CES-D had a strong correlation with the PHQ-9 (coefficient: 0.78) and SF-12 v2 MCS (coefficient: -0.75). Internal consistency was assessed by McDonald's omega hierarchical (ωH). The ωH value for the general depression factor was 0.855. The ωH values for "somatic", "depressed affect", "positive affect" and "interpersonal problems" were 0.434, 0.038, 0.738 and 0.730, respectively. For the two-week test-retest reliability, the intraclass correlation coefficient was 0.91. The CES-D was sensitive in detecting differences between known groups, with the AUC > 0.7. Internal responsiveness of the CES-D to detect positive and negative changes was satisfactory (with p value < 0.01 and all effect size statistics > 0.2). The CES-D was externally responsive, with the AUC > 0.7. The CES-D appears to be a valid, reliable, sensitive and responsive instrument for screening and monitoring depressive symptoms in adult Chinese primary care patients. In its original four-factor and bi-factor structure, the CES-D is supported for cross-cultural comparisons of depression in multi-center studies.
Recommendations for the definition of clinical responder in insulin preservation studies.

PubMed

Beam, Craig A; Gitelman, Stephen E; Palmer, Jerry P

2014-09-01

Clinical responder studies should contribute to the translation of effective treatments and interventions to the clinic. Since ultimately this translation will involve regulatory approval, we recommend that clinical trials prespecify a responder definition that can be assessed against the requirements and suggestions of regulatory agencies. In this article, we propose a clinical responder definition to specifically assist researchers and regulatory agencies in interpreting the clinical importance of statistically significant findings for studies of interventions intended to preserve β-cell function in newly diagnosed type 1 diabetes. We focus on studies of 6-month β-cell preservation in type 1 diabetes as measured by 2-h-stimulated C-peptide. We introduce criteria (bias, reliability, and external validity) for the assessment of responder definitions to ensure they meet U.S. Food and Drug Administration and European Medicines Agency guidelines. Using data from several published TrialNet studies, we evaluate our definition (no decrease in C-peptide) against published alternatives and determine that our definition has minimum bias with external validity. We observe that reliability could be improved by using changes in C-peptide later than 6 months beyond baseline. In sum, to support efficacy claims of β-cell preservation therapies in type 1 diabetes submitted to U.S. and European regulatory agencies, we recommend use of our definition. © 2014 by the American Diabetes Association. Readers may use this article as long as the work is properly cited, the use is educational and not for profit, and the work is not altered.
Validation of a new international quality-of-life instrument specific to cosmetics and physical appearance: BeautyQoL questionnaire.

PubMed

Beresniak, Ariel; de Linares, Yolaine; Krueger, Gerald G; Talarico, Sergio; Tsutani, Kiichiro; Duru, Gérard; Berger, Geneviève

2012-11-01

The objective was to develop a new quality-of-life (QoL) instrument with international validity that specifically assesses cosmetic products and physical appearance. In the first phase, semidirected interviews involved 309 subjects. In the second phase, an acceptability study was performed on 874 subjects. Thereafter, we recruited a total of 3231 subjects, each of whom completed the BeautyQoL questionnaire, a clinical checklist for the skin, the generic QoL 36-Item Short Form Health Survey, and a sociodemographic questionnaire. A retest was performed 8 days later on a subgroup of 652 subjects. The study populations were in France, the United Kingdom, Germany, Spain, Sweden, Italy, Russia, the United States, Brazil, Japan, India, China, and South Africa, representing 16 languages, and were drawn from the general healthy adult population, including women and men. The outcomes assessed were psychometric properties, construct validity, reproducibility, and internal and external consistency. General acceptability was very good in the 16 languages, with a very low rate of no answers. The validation phase reduced the questionnaire to 42 questions structured in the following 5 dimensions that explained 76.7% of the total variance: social life, self-confidence, mood, energy, and attractiveness. Internal consistency was high (Cronbach α coefficients, 0.93-0.98). Reproducibility at 8 days was satisfactory in all dimensions. Results of external validity testing revealed that BeautyQoL scores correlated significantly with all 36-Item Short Form Health Survey scores except for physical function. These results demonstrate the validity and reliability of the BeautyQoL questionnaire as the very first international instrument specific to cosmetic products and physical appearance.
[Evaluation of a two-dimensional scale for the assessment of fear avoidance beliefs in elderly chronic low back pain patients].

PubMed

Quint, S; Raich, M; Luckmann, J

2011-06-01

There is evidence on the importance of fear avoidance beliefs (FAB) as prognostic risk factors in elderly patients suffering from chronic low back pain (CLBP). However, so far there is no validated German instrument for measuring FAB in elderly CLBP patients. The aim of the study presented was to evaluate the psychometric properties of the Catastrophizing Avoidance Scale D-65+ (CAS-D-65+) within a population of elderly patients with CLBP. A cross-sectional study was conducted with measurement repeated after 4 weeks in 68 CLBP patients aged 64 years and older. The CAS-D-65+ was analyzed by performing an item analysis and assessing retest reliability. For validation, standardized assessment methods (Tampa Scale of Kinesiophobia [TSK], Photography of Daily Activity - Short electronic Version [Phoda-SeV], 5-Item-FAB, pain, disability, well-being and strain) were used. Internal consistency (Cronbach's α) ranged from 0.87 to 0.92 for total scale and from 0.71 to 0.89 for the sub-scales "catastrophizing" and "avoidance"; retest reliability (r(tt)) ranged from 0.67 for the sub-scale "catastrophizing" to 0.70 for total scale and sub-scale "avoidance". The CAS-D-65+ showed moderate and strong effect sizes (Cohen's d) with other related FAB scales and external criteria. As shown in this study, the CAS-D-65+ is a reliable and valid instrument for the assessment of FAB in older patients with CLBP.

Development of a multidimensional labour satisfaction questionnaire: dimensions, validity, and internal reliability

PubMed Central

Smith, L

2001-01-01

Background: No published quantitative instrument exists to measure maternal satisfaction with the quality of different models of labour care in the UK. Methods: A quantitative psychometric multidimensional maternal satisfaction questionnaire, the Women's Views of Birth Labour Satisfaction Questionnaire (WOMBLSQ), was developed using principal components analysis with varimax rotation of successive versions. Internal reliability and content and construct validity were assessed. Results: Of 300 women sent the first version (WOMBLSQ1), 120 (40%) replied; of 300 sent WOMBLSQ2, 188 (62.7%) replied; of 500 women sent WOMBLSQ3, 319 (63.8%) replied; and of 2400 women sent WOMBLSQ4, 1683 (70.1%) replied. The latter two versions consisted of 10 dimensions in addition to general satisfaction. These were (Cronbach's alpha): professional support in labour (0.91), expectations of labour (0.90), home assessment in early labour (0.90), holding the baby (0.87), support from husband/partner (0.83), pain relief in labour (0.83), pain relief immediately after labour (0.65), knowing labour carers (0.82), labour environment (0.80), and control in labour (0.62). There were moderate correlations (range 0.16–0.73) between individual dimensions and the general satisfaction scale (0.75). Scores on individual dimensions were significantly related to a range of clinical and demographic variables. Conclusion: This multidimensional labour satisfaction instrument has good validity and internal reliability. It could be used to assess care in labour across different models of maternity care, or as a prelude to in-depth exploration of specific areas of concern. Its external reliability and transferability to care outside the South West region need further evaluation, particularly in terms of ethnicity and social class. Key Words: Women's Views of Birth Labour Satisfaction Questionnaire (WOMBLSQ); labour; questionnaire PMID:11239139
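The Cronbach's alpha values reported per dimension above follow a standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), where k is the number of items. The following is a minimal sketch of that calculation, assuming complete responses stored as one list of item scores per respondent; the function name and the example scores are hypothetical and are not taken from the WOMBLSQ study.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    Assumes complete data (no missing items) and at least two respondents.
    """
    k = len(responses[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    if any(len(row) != k for row in responses):
        raise ValueError("every respondent must answer all items")
    # Sample variance of each item (column) across respondents.
    item_variance_sum = sum(variance(col) for col in zip(*responses))
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical 5 respondents x 3 items on a 1-5 scale (illustration only).
example = [[3, 4, 3], [2, 2, 3], [5, 4, 5], [1, 2, 1], [4, 4, 4]]
print(f"alpha = {cronbach_alpha(example):.2f}")
```

The sketch uses sample variances (n - 1 denominator) for both items and totals; statistical packages may differ in how they handle missing responses.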
Development of a new assessment tool for cervical myelopathy using hand-tracking sensor: Part 1: validity and reliability.

PubMed

Alagha, M Abdulhadi; Alagha, Mahmoud A; Dunstan, Eleanor; Sperwer, Olaf; Timmins, Kate A; Boszczyk, Bronek M

2017-04-01

The aim was to assess the reliability and validity of a hand motion sensor, Leap Motion Controller (LMC), in the 15-s hand grip-and-release test, as compared against human inspection of an external digital camera recording. Fifty healthy participants were asked to fully grip-and-release their dominant hand as rapidly as possible for two trials with a 10-min rest in-between, while wearing a non-metal wrist splint. Each test lasted for 15 s, and a digital camera was used to film the anterolateral side of the hand on the first test. Three assessors counted the frequency of grip-and-release (G-R) cycles independently and in a blinded fashion. The mean of the three counts was compared with that measured by LMC using the Bland-Altman method. Test-retest reliability was examined by comparing the two 15-s tests. The mean number of G-R cycles recorded was: 47.8 ± 6.4 (test 1, video observer); 47.7 ± 6.5 (test 1, LMC); and 50.2 ± 6.5 (test 2, LMC). Bland-Altman indicated good agreement, with a low bias (0.15 cycles) and narrow limits of agreement. The ICC showed high inter-rater agreement and the coefficient of repeatability for the number of cycles was ±5.393, with a mean bias of 3.63. LMC appears to be valid and reliable in the 15-s grip-and-release test. This serves as a first step towards the development of an objective myelopathy assessment device and platform for the assessment of neuromotor hand function in general. Further assessment in a clinical setting and to gauge healthy benchmark values is warranted.
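The Bland-Altman comparison mentioned above summarises agreement between two measurement methods by the mean of the paired differences (the bias) and the 95% limits of agreement, conventionally bias ± 1.96 standard deviations of the differences. A minimal sketch follows, assuming paired counts from the two methods; the function name and the cycle counts are hypothetical and are not the study's data.

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b):
    """Return (bias, lower_limit, upper_limit) for paired measurements."""
    if len(method_a) != len(method_b):
        raise ValueError("paired measurements must have equal length")
    if len(method_a) < 2:
        raise ValueError("need at least two pairs")
    # Per-participant difference between the two methods.
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    # 95% limits of agreement use the sample SD of the differences.
    half_width = 1.96 * stdev(diffs)
    return bias, bias - half_width, bias + half_width


# Hypothetical grip-and-release counts (illustration only).
video_counts = [48, 52, 41, 47, 55, 44]
sensor_counts = [47, 52, 42, 47, 54, 44]
bias, lower, upper = bland_altman(video_counts, sensor_counts)
print(f"bias = {bias:.2f}, limits of agreement = ({lower:.2f}, {upper:.2f})")
```

A bias near zero with narrow limits suggests the two methods can be used interchangeably for the quantity measured; whether the limits are narrow enough is a clinical judgement rather than a statistical one.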
Quality of life after multiple trauma: validation and population norm of the Polytrauma Outcome (POLO) chart.

PubMed

Lefering, R; Tecic, T; Schmidt, Y; Pirente, N; Bouillon, B; Neugebauer, E

2012-08-01

Due to an increasing number of survivors after multiple injuries in Western countries, the health-related quality of life (QoL) is considered to be an important outcome parameter. Up to now, measuring instruments used in this field lacked validity and comparability. Within 6 years, our working group developed a new modular instrument, called the Polytrauma Outcome (POLO) chart. This study documents the validation of the trauma-specific module specifically designed for trauma patients, the Trauma Outcome Profile (TOP). A total of 172 multiply injured patients (mean Injury Severity Score [ISS] 26.7) recruited from eight trauma centres participating in the German Trauma Registry were compared with 166 marginally injured patients (mean ISS 3.9). The mean follow-up was 24.2 and 26.4 months, respectively. The validation questionnaires used were the Beck Depression Inventory (BDI), the State-Trait Anxiety Inventory (STAI), Impact of Event Scale-Revised (IES-R), Social Support Questionnaire (F-SOZU-K-22), Barthel Index of Activities of Daily Living (ADL) and the Short Form Health Survey (SF-36). The internal consistency of the different dimensions of QoL assessed with the TOP was good. Factor analysis provides evidence of the construct validity of the questionnaire. Correlation with external measures gives evidence of criterion validity for the various dimensions of QoL and similar exceedance of proposed cut-off points within TOP and external measures is verified. The TOP module is a reliable and valid instrument to assess health-related QoL in patients with multiple injuries. It can be used stand-alone or as part of the POLO chart together with the Glasgow Outcome Scale (GOS), the EuroQoL and the SF-36 as a regular systematic follow-up instrument.
Math Anxiety Assessment with the Abbreviated Math Anxiety Scale: Applicability and Usefulness: Insights from the Polish Adaptation

PubMed Central

Cipora, Krzysztof; Szczygieł, Monika; Willmes, Klaus; Nuerk, Hans-Christoph

2015-01-01

Math anxiety has an important impact on mathematical development and performance. However, although math anxiety is supposed to be a transcultural trait, assessment instruments are scarce and are validated mainly for Western cultures so far. Therefore, we aimed at examining the transcultural generality of math anxiety by a thorough investigation of the validity of math anxiety assessment in Eastern Europe. We investigated the validity and reliability of a Polish adaptation of the Abbreviated Math Anxiety Scale (AMAS), known to have very good psychometric characteristics in its original, American-English version as well as in its Italian and Iranian adaptations. We also observed high reliability, both for internal consistency and test-retest stability of the AMAS in the Polish sample. The results also show very good construct, convergent and discriminant validity: The factorial structure in Polish adult participants (n = 857) was very similar to the one previously found in other samples; AMAS scores correlated moderately in expected directions with state and trait anxiety, self-assessed math achievement and skill as well as temperamental traits of emotional reactivity, briskness, endurance, and perseverance. Average scores obtained by participants as well as gender differences and correlations with external measures were also similar across cultures. Beyond the cultural comparison, we used path model analyses to show that math anxiety relates to math grades and self-competence when controlling for trait anxiety. The current study shows transcultural validity of math anxiety assessment with the AMAS. PMID:26648893
Reliability and validity of the Spanish version of the Minnesota Multiphasic Personality Inventory-Adolescent (MMPI-A).

PubMed

Zubeidat, Ihab; Sierra, Juan Carlos; Salinas, José María; Rojas-García, Antonio

2011-01-01

The aim of this study was to determine the test-retest reliability and internal consistency of the scales of the Spanish version of the Minnesota Multiphasic Personality Inventory-Adolescent (MMPI-A; Butcher et al., 1992). Two samples of 939 and 109 Spanish adolescents ages 14 to 18 years were assessed with the MMPI-A in their school environment. The first sample responded to the inventory once, whereas the second sample responded to it on 2 occasions with a 2-week interval between sessions. Results showed no significant differences in means or variances between the first and the second test administration for most MMPI-A scales. Test-retest reliability ranged between .62 (Amorality, Ma(1)) and .92 (Immaturity, IMM); most correlations exceeded .70. Internal consistency values for the MMPI-A scales in the pretest and posttest were very similar overall. External validity of the MMPI-A was demonstrated through several significant correlations between its scales and YSR/11-18 syndromes and social interaction measures. The highest correlations were established between the Anxious/Depressed YSR/11-18 scale and other MMPI-A scales such as Schizophrenia (Sc), Welsh's Anxiety (A), Adolescent-Anxiety (A-anx) and Adolescent-Alienation (A-aln), and between the Social Avoidance and Distress Scale and the MMPI-A Adolescent-Social Discomfort (A-sod) scale.
Child Attitude Toward Illness Scale (CATIS): A systematic review of the literature.

PubMed

Ramsey, Rachelle R; Ryan, Jamie L; Fedele, David A; Mullins, Larry L; Chaney, John M; Wagner, Janelle L

2016-06-01

The objective of this study was to systematically review the literature utilizing the Child Attitude Toward Illness Scale (CATIS) as a measure of illness attitudes within pediatric chronic illness, including epilepsy, and provide recommendations for its use. This review includes an examination of the psychometric properties of the CATIS and the relationship between the CATIS and psychological, academic, behavioral, and illness variables. Electronic searches were conducted using Medline and PsycINFO to identify twenty-two relevant publications. The CATIS was identified as a reliable and valid self-report assessment tool across chronic illnesses, including pediatric epilepsy. Although originally developed for children ages 8-12, the CATIS has demonstrated reliability and validity in youth ages 8-22. The CATIS scores were reliably associated with cognitive appraisal variables and internalizing symptoms. Initial support exists for the relation between illness attitudes and externalizing behavior, academic functioning, and psychosocial care needs. Mixed findings were reported with regard to the relation between illness attitudes and demographic and disease variables, as well as both social and family functioning. The CATIS is a psychometrically sound self-report instrument for measuring illness attitudes and demonstrates clinical utility for examining adjustment outcomes across chronic illnesses, particularly pediatric epilepsy. Copyright © 2016 Elsevier Inc. All rights reserved.
Development of Depression Profile: a new psychometric instrument to selectively evaluate depressive symptoms based on the neurocircuitry theory.

PubMed

Faludi, Gábor; Gonda, Xenia; Kliment, Edit; Bekes, Vera; Mészáros, Veronika; Oláh, Attila

2010-06-01

Although we have several self-report instruments available to assess depression, they yield a composite score and thus do not allow for the differential examination of major symptom clusters associated with depression. However, such an instrument would be a useful tool in subtyping depression and selecting the most appropriate pharmacotherapy for each patient. The neurocircuitry theory describes the biochemical and neuroanatomic background associated with the major symptoms of depression. Based on the neurocircuitry theory, our team has developed a new instrument, the Depression Profile, to selectively assess depressive symptom clusters associated with different neurotransmitter systems and neuroanatomic structures. The aim of our study was to investigate the psychometric characteristics of the Depression Profile. A total of 339 patients consecutively admitted with DSM-IV major depression in our hospital completed the Depression Profile in the first two weeks of their hospitalisation. In addition, 81 patients in an adult outpatient unit also completed the Zung Self-rating Depression Scale. Internal consistency of the Depression Profile was tested with item analysis. The external validity of the Depression Profile against the Zung Self-rating Depression Scale was tested using Pearson correlations. The internal consistency of the Depression Profile proved to be excellent. The Cronbach alpha values of the scales met the minimum level expected given the number of items in the scales. In testing for convergent validity, all Pearson correlation coefficients between the Depression Profile subscales and the Zung Self-rating Depression Scale were significant and moderate to high, which indicates the good external validity of our instrument. The initial psychometric evaluation of the Depression Profile indicates that our instrument has good reliability and internal and external validity. The instrument also proved to be useful in clinical work to aid the choice of medications and determine the subtype of depressive episodes. Further studies, possibly with biochemical and neuroimaging methodology, are needed to validate the 9 main symptom clusters of the Depression Profile subscales with respect to their neuroanatomical and neurochemical bases.
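Convergent (external) validity of the kind described above is commonly quantified with the Pearson correlation between scores on the new instrument and scores on an established one. A minimal sketch follows, assuming paired total scores per patient; the function name and the scores are hypothetical and are not data from the Depression Profile study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two paired score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two pairs")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant scores")
    return sxy / sqrt(sxx * syy)


# Hypothetical subscale and reference-scale scores (illustration only).
new_scale = [12, 18, 9, 22, 15, 20]
reference_scale = [40, 55, 38, 61, 47, 52]
print(f"r = {pearson_r(new_scale, reference_scale):.2f}")
```

The coefficient alone does not establish validity; studies of this kind also report significance and interpret the magnitude against what theory predicts for the two constructs.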
Measurement of supernatural belief: sex differences and locus of control.

PubMed

Randall, T M; Desrosiers, M

1980-10-01

Although we live in an age dominated by science and technology, there exists an increasingly popular anti-science sentiment. This study describes the development of a scale to assess the degree of personal acceptance of supernatural causality versus acceptance of scientific explanation. In addition to the psychometric data concerning validity and reliability of the scale, data are presented which showed the personality factor of supernaturalism to be independent of orthodox religious attitudes. Results indicated a significantly greater supernatural acceptance for women, and a positive relation of supernaturalism with external locus of control.
Sources of self-efficacy belief: development and validation of two scales.

PubMed

Liu, Ou Lydia; Wilson, Mark

2010-01-01

Self-efficacy belief has been an instrumental affective factor in predicting student behavior and achievement in academic settings. Although there is abundant literature on efficacy belief per se, the sources of efficacy belief have not been fully researched. Very few instruments exist to quantify the sources of efficacy beliefs. To fill this void, we developed two scales for the two main sources of self-efficacy belief: past performance and social persuasion. Pilot test data were collected from 255 middle school students. A self-efficacy measure was also administered to the students as a criterion measure. The Rasch rating scale model was used to analyze the data. Information on item fit, item design, content validity, external validity, internal consistency, and person separation reliability was examined. The two scales displayed satisfactory psychometric properties. Applications and limitations of these two scales are also discussed.
Validity of the Modified Child Psychopathy Scale for Juvenile Justice Center Residents.

PubMed

Verschuere, Bruno; Candel, Ingrid; Van Reenen, Lique; Korebrits, Andries

2012-06-01

Adult psychopathy has proven to be an important clinical and forensic construct, but much less is known about juvenile psychopathy. In the present study, we examined the construct validity of the self-report modified Child Psychopathy Scale (mCPS; Lynam, Psychological Bulletin 120(2), 209-234, 1997) in a sample of 57 adolescents residing in a Dutch juvenile justice center, aged between 13 and 22 years. The mCPS total score was reliably related to high externalizing problems, low empathy, high anger and aggression, high impulsivity, high (violent) delinquency, and high alcohol/drug use. Unique relations were found for the antisocial-impulsive (mCPS Factor 2), but not the callous-unemotional facet of psychopathy (mCPS Factor 1). Our findings support the validity of the mCPS in that it encompasses the antisocial-impulsive facet of psychopathy, but it is less clear whether the mCPS sufficiently captures the affective-interpersonal facet of psychopathy.
QSAR modeling: where have you been? Where are you going to?

PubMed

Cherkasov, Artem; Muratov, Eugene N; Fourches, Denis; Varnek, Alexandre; Baskin, Igor I; Cronin, Mark; Dearden, John; Gramatica, Paola; Martin, Yvonne C; Todeschini, Roberto; Consonni, Viviana; Kuz'min, Victor E; Cramer, Richard; Benigni, Romualdo; Yang, Chihae; Rathman, James; Terfloth, Lothar; Gasteiger, Johann; Richard, Ann; Tropsha, Alexander

2014-06-26

Quantitative structure-activity relationship modeling is one of the major computational tools employed in medicinal chemistry. However, throughout its entire history it has drawn both praise and criticism concerning its reliability, limitations, successes, and failures. In this paper, we discuss (i) the development and evolution of QSAR; (ii) the current trends, unsolved problems, and pressing challenges; and (iii) several novel and emerging applications of QSAR modeling. Throughout this discussion, we provide guidelines for QSAR development, validation, and application, which are summarized in best practices for building rigorously validated and externally predictive QSAR models. We hope that this Perspective will help communications between computational and experimental chemists toward collaborative development and use of QSAR models. We also believe that the guidelines presented here will help journal editors and reviewers apply more stringent scientific standards to manuscripts reporting new QSAR studies, as well as encourage the use of high quality, validated QSARs for regulatory decision making.
QSAR Modeling: Where have you been? Where are you going to?

PubMed Central

Cherkasov, Artem; Muratov, Eugene N.; Fourches, Denis; Varnek, Alexandre; Baskin, Igor I.; Cronin, Mark; Dearden, John; Gramatica, Paola; Martin, Yvonne C.; Todeschini, Roberto; Consonni, Viviana; Kuz'min, Victor E.; Cramer, Richard; Benigni, Romualdo; Yang, Chihae; Rathman, James; Terfloth, Lothar; Gasteiger, Johann; Richard, Ann; Tropsha, Alexander

2014-01-01

Quantitative Structure-Activity Relationship modeling is one of the major computational tools employed in medicinal chemistry. However, throughout its entire history it has drawn both praise and criticism concerning its reliability, limitations, successes, and failures. In this paper, we discuss: (i) the development and evolution of QSAR; (ii) the current trends, unsolved problems, and pressing challenges; and (iii) several novel and emerging applications of QSAR modeling. Throughout this discussion, we provide guidelines for QSAR development, validation, and application, which are summarized in best practices for building rigorously validated and externally predictive QSAR models. We hope that this Perspective will help communications between computational and experimental chemists towards collaborative development and use of QSAR models. We also believe that the guidelines presented here will help journal editors and reviewers apply more stringent scientific standards to manuscripts reporting new QSAR studies, as well as encourage the use of high quality, validated QSARs for regulatory decision making. PMID:24351051
Assessment of Fearless Dominance and Impulsive Antisociality via Normal Personality Measures: Convergent Validity, Criterion Validity, and Developmental Change

PubMed Central

Witt, Edward A.; Donnellan, M. Brent; Blonigen, Daniel M.; Krueger, Robert F.; Conger, Rand D.

2009-01-01

This report provides evidence for the reliability, validity, and developmental course of the psychopathic personality traits of Fearless Dominance (FD) and Impulsive Antisociality (IA) as assessed by items from the Multidimensional Personality Questionnaire (MPQ; Patrick, Curtin, & Tellegen, 2002). In Study 1, MPQ-based measures of FD and IA were strongly correlated with their corresponding composite scores from the Psychopathic Personality Inventory-Revised (Lilienfeld & Widows, 2005). In Study 2, FD and IA had relatively distinct associations with measures of normal and maladaptive personality traits. In Study 3, FD and IA had substantial retest coefficients during the transition to adulthood and both traits showed average declines with an especially substantial drop in IA. In Study 4, FD and IA were correlated with measures of internalizing and externalizing problems in ways consistent with previous research and theory. Collectively, these results provide important information about the assessment of FD and IA. PMID:19365767
Development and validation of the Family Motivational Climate Questionnaire (FMC-Q).

PubMed

Alonso Tapia, Jesús; Simón Rueda, Cecilia; Asensio Fuentes, César

2013-01-01

The goal of this study was to develop and validate the Family Motivational Climate Questionnaire (FMCQ). Parental involvement (PI) affects children's academic orientations. However, PI questionnaires had not considered parenting behaviours from the perspective of motivational theories. It was therefore decided to develop the FMCQ. The sample comprised 570 secondary-school students. To validate the FMCQ, confirmatory factor analyses, reliability analysis, and correlation and regression analyses were conducted. Children's attributions to parents of perceived change in motivational variables affecting achievement were used as external criteria. Results support most of the hypotheses related either to the FMCQ structure or to its moderating role as a predictor of school achievement and of attribution to parents of changes in different motivational variables: interest, effort, perceived ability, success expectancies, resilience, and satisfaction. The results underline the importance of acting on FMC components in order to improve children's motivation and achievement.
[Validation of the Scale of Hope in Terminal Illness for relatives brief version (SHTI-b). Validity and reliability analysis].

PubMed

Villacieros, M; Bermejo, J C; Hassoun, H

2017-12-29

Bermejo and Villacieros' Scale of Hope in Terminal Disease (SHTD) specifically collects meanings of hope facing terminal disease, including considerations relating to psycho-emotional support and that have a transcendental sense. The objective of this paper is to validate the SHTD abbreviated and rephrased to adapt all the items to a single domain. Starting from the published SHTD, an exploratory factor analysis (EFA) was carried out with a sample of 177 valid questionnaires. In a second study, with another sample of 180 valid questionnaires, a confirmatory factor analysis (CFA) and a correlation analysis with other measurements of spiritual wellbeing (Functional Assessment of Chronic Illness Therapy-Sp) and hope (Herth Hope Index) were done. A bidimensional model with satisfactory goodness of fit index values was obtained (GFI = 0.991; CFI = 0.984; SRMR = 0.08; RMSEA = 0.057); the Relations of Transcendence factor obtained a Cronbach's alpha of 0.872 and Personal Relations an alpha of 0.762. The correlations of the SHTI-rb with external measures were: r = 0.527 with FACIT; r = 0.266 with HHI; r = 0.667 with the Spirituality subscale of FACIT; and r = 0.348 with the Interrelation factor of HHI. The Relations of Transcendence subscale correlated with both Layout and Expectation and Interrelation of HHI (r = 0.162 and r = 0.329 respectively), while the scale of Personal Relations only correlated with Interrelation of HHI (r = 0.244). The Scale of Hope in Terminal Illness for relatives (brief version) is a valid and reliable specific instrument for terminal patients.
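Several records in this listing report Cronbach's alpha as their internal-consistency statistic. As a rough illustration only, the sketch below computes alpha from a respondents-by-items score table using the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The function name and the demo data are hypothetical and are not taken from any of the studies.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(scores[0])
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of the respondents' total scores.
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical data: 5 respondents answering 3 items (not from any study above).
demo_scores = [[2, 3, 3], [4, 4, 5], [1, 2, 2], [3, 3, 4], [5, 5, 5]]
demo_alpha = cronbach_alpha(demo_scores)
```

Alpha rises with both the number of items and their inter-item correlation, so values from subscales of different lengths are not directly comparable.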
Development of the Return-to-Work Obstacles and Self-Efficacy Scale (ROSES) and Validation with Workers Suffering from a Common Mental Disorder or Musculoskeletal Disorder.

PubMed

Corbière, Marc; Negrini, Alessia; Durand, Marie-José; St-Arnaud, Louise; Briand, Catherine; Fassier, Jean-Baptiste; Loisel, Patrick; Lachance, Jean-Philippe

2017-09-01

Introduction: Common mental disorders (CMDs) and musculoskeletal disorders (MSDs) lead the list of causes for work absence in several countries. Current research is starting to look at workers on sick leave as a single population, regardless of the nature of the disease or accident. The purpose of this study is to report the validation of the Return to Work Obstacles and Self-Efficacy Scale (ROSES) for people with MSDs and CMDs, based on the disability paradigm. Methods: Using a prospective design, the ROSES' reliability and validity were investigated in a Canadian sample of workers on sick leave due to MSDs (n = 206) and CMDs (n = 157). Results: Exploratory and confirmatory factor analyses revealed 46 items spread across 10 conceptual dimensions (e.g., Fears of a relapse, Job demands, Difficult relation with the immediate supervisor), with satisfactory alpha coefficients and test-retest reliability for all subscales. Finally, several dimensions of ROSES also predict the participant's return to work (RTW) within 6 months for MSDs (e.g., job demands), and CMDs (e.g., difficult relation with the immediate supervisor), even when adjusted for several variables (e.g., age, severity of symptoms). Apart from the job demands dimension, when the ROSES dimension is more external to the individual, only the perception of obstacles remains significant to predict RTW, whereas the opposite holds when the dimension is more internal (e.g., fears of a relapse). Conclusion: The ROSES demonstrated satisfactory results regarding its validity and reliability with people having MSDs or CMDs, at the time of the return-to-work process.
Rasch-built Overall Disability Scale for patients with chemotherapy-induced peripheral neuropathy (CIPN-R-ODS).

PubMed

Binda, D; Vanhoutte, E K; Cavaletti, G; Cornblath, D R; Postma, T J; Frigeni, B; Alberti, P; Bruna, J; Velasco, R; Argyriou, A A; Kalofonos, H P; Psimaras, D; Ricard, D; Pace, A; Galiè, E; Briani, C; Dalla Torre, C; Lalisang, R I; Boogerd, W; Brandsma, D; Koeppen, S; Hense, J; Storey, D; Kerrigan, S; Schenone, A; Fabbri, S; Rossi, E; Valsecchi, M G; Faber, C G; Merkies, I S J; Galimberti, S; Lanzani, F; Mattavelli, L; Piatti, M L; Bidoli, P; Cazzaniga, M; Cortinovis, D; Lucchetta, M; Campagnolo, M; Bakkers, M; Brouwer, B; Boogerd, W; Grant, R; Reni, L; Piras, B; Pessino, A; Padua, L; Granata, G; Leandri, M; Ghignotti, I; Plasmati, R; Pastorelli, F; Heimans, J J; Eurelings, M; Meijer, R J; Grisold, W; Lindeck Pozza, E; Mazzeo, A; Toscano, A; Russo, M; Tomasello, C; Altavilla, G; Penas Prado, M; Dominguez Gonzalez, C; Dorsey, S G

2013-09-01

Chemotherapy-induced peripheral neuropathy (CIPN) is a common neurological side-effect of cancer treatment and may lead to declines in patients' daily functioning and quality of life. To date, there are no modern clinimetrically well-evaluated outcome measures available to assess disability in CIPN patients. The objective of the study was to develop an interval-weighted scale to capture activity limitations and participation restrictions in CIPN patients using the Rasch methodology and to determine its validity and reliability properties. A preliminary Rasch-built Overall Disability Scale (pre-R-ODS) comprising 146 items was assessed twice (interval: 2-3 weeks; test-retest reliability) in 281 CIPN patients with a stable clinical condition. The obtained data were subjected to Rasch analyses to determine whether model expectations would be met, and if necessary, adaptations were made to obtain proper model fit (internal validity). External validity was obtained by correlating the CIPN-R-ODS with the National Cancer Institute-Common Toxicity Criteria (NCI-CTC) neuropathy scales and the Pain-Intensity Numeric-Rating-Scale (PI-NRS). The preliminary R-ODS did not meet the Rasch model's expectations. Items displaying misfit statistics, disordered thresholds, item bias or local dependency were systematically removed. The final CIPN-R-ODS consisting of 28 items fulfilled all the model's expectations with proper validity and reliability, and was unidimensional. The final CIPN-R-ODS is a Rasch-built disease-specific, interval measure suitable to detect disability in CIPN patients and bypasses the shortcomings of classical test theory ordinal-based measures. Its use is recommended in future clinical trials in CIPN. Copyright © 2013 Elsevier Ltd. All rights reserved.
Psychometric properties of the abbreviated version of the Scale to Assess Unawareness in Mental Disorder in schizophrenia

PubMed Central

2013-01-01

Background: The Scale to Assess Unawareness in Mental Disorder (SUMD) is widely used in clinical trials and epidemiological studies but more rarely in clinical practice because of its length (74 items). In clinical practice, it is necessary to provide shorter instruments. The aim of this study was to investigate the validity and reliability of the abbreviated version of the SUMD. Methods: Design: We used data from four cross-sectional studies conducted in several psychiatric hospitals in France. Inclusion criteria: a diagnosis of schizophrenia based on DSM-IV criteria. Data collection: socio-demographic and clinical data (including duration of illness, Positive and Negative Syndrome Scale, and the Calgary Depression Scale); quality of life; SUMD. Statistical analysis: confirmatory factor analyses, item-dimension correlations, Cronbach’s alpha coefficients, Rasch statistics, relationships between the SUMD and other parameters. We tested two different scoring models and considered the response ‘not applicable’ as ‘0’ or as missing data. Results: Five hundred and thirty-one patients participated in this study. The 3-factor structure of the SUMD (awareness of the disease, consequences and need for treatment; awareness of positive symptoms; and awareness of negative symptoms) was confirmed using LISREL confirmatory factor analysis for the two models. Internal item consistency and reliability were satisfactory for all dimensions. External validity testing revealed that dimension scores correlated significantly with all PANSS scores, especially with the G12 item (lack of judgement and awareness). Significant associations with age, disease duration, education level, and living arrangements showed good discriminant validity. Conclusion: The abbreviated version of the SUMD appears to be a valid and reliable instrument for measuring insight in patients with schizophrenia and may be used by clinicians to accurately assess insight in clinical settings. PMID:24053640
Adaptation and Validation of the Psychological Need Thwarting Scale in Spanish Physical Education Teachers.

PubMed

Cuevas, Ricardo; Sánchez-Oliva, David; Bartholomew, Kimberley J; Ntoumanis, Nikos; García-Calvo, Tomás

2015-07-20

Drawing from self-determination theory (SDT; Deci & Ryan, 1985; Ryan & Deci, 2002), the aim of the study was to adapt and validate a Spanish version of the Psychological Need Thwarting Scale (PNTS; Bartholomew, Ntoumanis, Ryan, & Thøgersen-Ntoumani, 2011) in the educational domain. Psychological need thwarting and burnout were assessed in 619 physical education teachers from several high schools in Spain. Overall, the adapted measure demonstrated good content, factorial (χ²/df = 4.87, p < .01, CFI = .95, IFI = .96, TLI = .94, RMSEA = .08, SRMR = .05), and external validity, as well as internal consistency (α ≥ .81) and invariance across gender. Moreover, burnout was strongly predicted by teachers' perceptions of competence (β = .53, p ≤ .01), autonomy (β = .34, p ≤ .01), and relatedness (β = .31, p ≤ .01) need thwarting. In conclusion, these results support the Spanish version of the PNTS as a valid and reliable instrument for assessing the understudied concept of psychological need thwarting in teachers.

Hybrid time-variant reliability estimation for active control structures under aleatory and epistemic uncertainties

NASA Astrophysics Data System (ADS)

Wang, Lei; Xiong, Chuang; Wang, Xiaojun; Li, Yunlong; Xu, Menghui

2018-04-01

Considering that multi-source uncertainties from inherent nature as well as the external environment are unavoidable and severely affect the controller performance, the dynamic safety assessment with high confidence is of great significance for scientists and engineers. In view of this, the uncertainty quantification analysis and time-variant reliability estimation corresponding to the closed-loop control problems are conducted in this study under a mixture of random, interval, and convex uncertainties. By combining the state-space transformation and the natural set expansion, the boundary laws of controlled response histories are first confirmed with specific implementation of random items. For nonlinear cases, the collocation set methodology and the fourth-order Runge-Kutta algorithm are introduced as well. Inspired by the first-passage model in random process theory as well as by the static probabilistic reliability ideas, a new definition of the hybrid time-variant reliability measurement is provided for the vibration control systems and the related solution details are further expounded. Two engineering examples are eventually presented to demonstrate the validity and applicability of the methodology developed.
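For readers unfamiliar with the integrator named in the record above, the following is a minimal sketch of the classical fourth-order Runge-Kutta scheme for a scalar ordinary differential equation dy/dt = f(t, y). It is not the authors' implementation: the function names and the toy decay problem are assumptions chosen purely for illustration, whereas the study applies the scheme to nonlinear controlled structural responses.

```python
import math


def rk4_step(f, t, y, h):
    """Advance y by one step of size h using the classical RK4 formulas."""
    k1 = f(t, y)
    k2 = f(t + h / 2, y + h / 2 * k1)
    k3 = f(t + h / 2, y + h / 2 * k2)
    k4 = f(t + h, y + h * k3)
    return y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)


def rk4_integrate(f, t0, y0, t_end, n_steps):
    """Integrate from t0 to t_end in n_steps equal steps; return y(t_end)."""
    h = (t_end - t0) / n_steps
    t, y = t0, y0
    for _ in range(n_steps):
        y = rk4_step(f, t, y, h)
        t += h
    return y


# Toy problem (hypothetical): dy/dt = -y with y(0) = 1, exact solution exp(-t).
approx = rk4_integrate(lambda t, y: -y, 0.0, 1.0, 1.0, 10)
exact = math.exp(-1.0)
```

The global error of this scheme scales with the fourth power of the step size, which is why a coarse grid of ten steps already tracks the exact solution closely on this toy problem.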
Psychometric Properties and Factor Structure of the German Version of the Clinician-Administered PTSD Scale for DSM-5.

PubMed

Müller-Engelmann, Meike; Schnyder, Ulrich; Dittmann, Clara; Priebe, Kathlen; Bohus, Martin; Thome, Janine; Fydrich, Thomas; Pfaltz, Monique C; Steil, Regina

2018-05-01

The Clinician-Administered PTSD Scale (CAPS) is a widely used diagnostic interview for posttraumatic stress disorder (PTSD). Following fundamental modifications in the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5), the CAPS had to be revised. This study examined the psychometric properties (internal consistency, interrater reliability, convergent and discriminant validity, and structural validity) of the German version of the CAPS-5 in a trauma-exposed sample (n = 223 with PTSD; n = 51 without PTSD). The results demonstrated high internal consistency (αs = .65-.93) and high interrater reliability (ICCs = .81-.89). With regard to convergent and discriminant validity, we found high correlations between the CAPS severity score and both the Posttraumatic Diagnostic Scale sum score (r = .87) and the Beck Depression Inventory total score (r = .72). Regarding the underlying factor structure, the hybrid model demonstrated the best fit, followed by the anhedonia model. However, we encountered some nonpositive estimates for the correlations of the latent variables (factors) for both models. The model with the best fit without methodological problems was the externalizing behaviors model, but the results also supported the DSM-5 model. Overall, the results demonstrate that the German version of the CAPS-5 is a psychometrically sound measure.
Development and Validation of the Escala de Actitudes Emprendedoras para Estudiantes (EAEE).

PubMed

Oliver, Amparo; Galiana, Laura

2015-03-17

During the last few years, entrepreneurship has gained an important role in many economic and social policies, with the consequent growth of entrepreneurial research in many social areas. However, in the Spanish psychometric context, there is no updated scale that includes recent contributions to the entrepreneurship attitudes literature. The aim of this study is to present and validate a new scale named Escala de Actitudes Emprendedoras para Estudiantes, EAEE (Entrepreneurial Attitudes Scale for Students, EASS), in two samples of Spanish high school and university students. Data come from a cross-sectional survey of 524 high school and undergraduate students from Valencia (Spain). Two confirmatory factor analyses (CFAs) were estimated, together with reliability and validity evidence of the scale. Results offered evidence of the adequate psychometric properties of the EASS. The CFAs showed adequate overall and analytical fit indexes (χ²(120) = 163.19 (p < .01), GFI = .906, CFI = .959, SRMR = .044, RMSEA = .040 [CI .022-.054]); reliability indices were appropriate for most of the entrepreneurial attitudes (α ranged from .63 to .87 across the different dimensions); and external evidence relating entrepreneurial dimensions to personality traits was similar to that found in previous studies. The scale could be a useful instrument both for previous diagnosis and effectiveness assessment of programs on entrepreneurship promotion.
Identifying developmental coordination disorder: MOQ-T validity as a fast screening instrument based on teachers' ratings and its relationship with praxic and visuospatial working memory deficits.

PubMed

Giofrè, David; Cornoldi, Cesare; Schoemaker, Marina M

2014-12-01

The present study aimed to test the validity of the Italian adaptation of the Motor Observation Questionnaire for Teachers (MOQ-T, Schoemaker, Flapper, Reinders-Messelink, & De Kloet, 2008) as a fast screening instrument, based on teachers' ratings, for detecting symptoms of developmental coordination disorder and to study its relationship with praxic and visuospatial working memory deficits. In a first study on a large sample of children, we assessed the reliability and structure of the Italian adaptation of the MOQ-T. Results showed a good reliability of the questionnaire and a hierarchical structure with two first-order factors (reflecting motor and handwriting skills), which are influenced by a second-order factor (general motor function) at the top. In a second study, we looked at the external validity of the MOQ-T and found that children with symptoms of Developmental Coordination Disorder (children with high scores on the MOQ-T) also had difficulty reproducing gestures, either imitating others or in response to verbal prompts. Our results also showed that children with high MOQ-T scores had visuospatial WM impairments. The theoretical and clinical implications of these findings are discussed. Copyright © 2014 Elsevier Ltd. All rights reserved.
Persian version of the Moorong Self-Efficacy Scale: psychometric study among subjects with physical disability.

PubMed

Rajati, Fatemeh; Ghanbari, Masoud; Hasandokht, Tolou; Hosseini, Seyed Younes; Akbarzadeh, Rasool; Ashtarian, Hossein

2017-11-01

Self-efficacy plays a key role in varying areas of human conditions which can be measured by different scales. The present study aimed to evaluate the psychometric properties of the Moorong Self-Efficacy Scale (MSES) in Iranian Subjects with Physical Disability (SWPD). Data were collected by face-to-face interviews and self-report surveys from 214 subjects. The face and content validity, and reliability were evaluated. Discrimination was evaluated between the sub-groups of disability levels, physical activity, and health condition levels. The concurrent, convergent, divergent, and construct validity were assessed by short form health survey scale (SF-36), general self-efficacy scale (GSES), hospital anxiety and depression scale (HADS), respectively. Replicability of the exploratory factor analysis was evaluated. SPSS software was used for statistical analysis. There were acceptable face and content validity, and reliability. Furthermore, significant correlation was found between PSES and SF-36 (p < 0.001). Self-efficacy was statistically different among the disability levels (p = 0.02), physical activity levels (p < 0.001), and health status (p = 0.001). The correlation of Persian Self-Efficacy Scale (PSES) scores with GSES (r = 0.61, p < 0.001), and HADS (r = -0.53, p < 0.001) was significant. This scale yielded a two-dimensional structure, with a good internal replicability. The external replicability was satisfactory when we compared factor loadings with the original study. The PSES is a valid, reliable and sensitive tool to measure the self-efficacy among SWPD for planning and managing of disability problems. Implications for rehabilitation: Psychometric properties of the Persian version of the Self-Efficacy Scale (PSES) appear to be similar to those of the original English version.
The PSES has been shown to be valid and reliable in Persian-speaking people with physical disabilities and can be used for patients with a wider range of physical disabilities than Spinal Cord Injury (SCI) alone. The PSES can be used in clinical practice and research work to evaluate patients' confidence in performing daily activities.
Development and validation of a new population-based simulation model of osteoarthritis in New Zealand.

PubMed

Wilson, R; Abbott, J H

2018-04-01

To describe the construction and preliminary validation of a new population-based microsimulation model developed to analyse the health and economic burden and cost-effectiveness of treatments for knee osteoarthritis (OA) in New Zealand (NZ). We developed the New Zealand Management of Osteoarthritis (NZ-MOA) model, a discrete-time state-transition microsimulation model of the natural history of radiographic knee OA. In this article, we report on the model structure, derivation of input data, validation of baseline model parameters against external data sources, and validation of model outputs by comparison of the predicted population health loss with previous estimates. The NZ-MOA model simulates both the structural progression of radiographic knee OA and the stochastic development of multiple disease symptoms. Input parameters were sourced from NZ population-based data where possible, and from international sources where NZ-specific data were not available. The predicted distributions of structural OA severity and health utility detriments associated with OA were externally validated against other sources of evidence, and uncertainty resulting from key input parameters was quantified. The resulting lifetime and current population health-loss burden was consistent with estimates of previous studies. The new NZ-MOA model provides reliable estimates of the health loss associated with knee OA in the NZ population. The model structure is suitable for analysis of the effects of a range of potential treatments, and will be used in future work to evaluate the cost-effectiveness of recommended interventions within the NZ healthcare system. Copyright © 2018 Osteoarthritis Research Society International. Published by Elsevier Ltd. All rights reserved.
The Nature of Science Instrument-Elementary (NOSI-E): the end of the road?

PubMed

Peoples, Shelagh M; O'Dwyer, Laura M

2014-01-01

This research continues prior work published in this journal (Peoples, O'Dwyer, Shields and Wang, 2013). The first paper described the scale development, psychometric analyses and part-validation of a theoretically-grounded Rasch-based instrument, the Nature of Science Instrument-Elementary (NOSI-E). The NOSI-E was designed to measure elementary students' understanding of the Nature of Science (NOS). In the first paper, evidence was provided for three of the six validity aspects (content, substantive and generalizability) needed to support the construct validity of the NOSI-E. The research described in this paper examines two additional validity aspects (structural and external). The purpose of this study was to determine which of three competing internal models provides reliable, interpretable, and responsive measures of students' understanding of NOS. One postulate is that the NOS construct is unidimensional; alternatively, the NOS construct is composed of five independent unidimensional constructs (the consecutive approach). Lastly, the NOS construct is multidimensional and composed of five inter-related but separate dimensions. The vast body of evidence supported the claim that the NOS construct is multidimensional. Measures from the multidimensional model were positively related to student science achievement and students' perceptions of their classroom environment; this provided supporting evidence for the external validity aspect of the NOS construct. As US science education moves toward students learning science through engaging in authentic scientific practices and building learning progressions (NRC, 2012), it will be important to assess whether this new approach to teaching science is effective, and the NOSI-E may be used as a measure of the impact of this reform.
A metabolic fingerprinting approach based on selected ion flow tube mass spectrometry (SIFT-MS) and chemometrics: A reliable tool for Mediterranean origin-labeled olive oils authentication.

PubMed

Bajoub, Aadil; Medina-Rodríguez, Santiago; Ajal, El Amine; Cuadros-Rodríguez, Luis; Monasterio, Romina Paula; Vercammen, Joeri; Fernández-Gutiérrez, Alberto; Carrasco-Pancorbo, Alegría

2018-04-01

Selected ion flow tube mass spectrometry (SIFT-MS) in combination with chemometrics was used to authenticate the geographical origin of Mediterranean virgin olive oils (VOOs) produced under geographical origin labels. In particular, 130 oil samples from six different Mediterranean regions (Kalamata (Greece); Toscana (Italy); Meknès and Tyout (Morocco); and Priego de Córdoba and Baena (Spain)) were considered. The headspace volatile fingerprints were measured by SIFT-MS in full scan with H₃O⁺, NO⁺ and O₂⁺ as precursor ions and the results were subjected to chemometric treatments. Principal Component Analysis (PCA) was used for preliminary multivariate data analysis and Partial Least Squares-Discriminant Analysis (PLS-DA) was applied to build different models (considering the three reagent ions) to classify samples according to the country of origin and regions (within the same country). The multi-class PLS-DA models showed very good performance in terms of fitting accuracy (98.90-100%) and prediction accuracy (96.70-100% accuracy for cross validation and 97.30-100% accuracy for external validation (test set)). Considering the two-class PLS-DA models, the one for the Spanish samples showed 100% sensitivity, specificity and accuracy in calibration, cross validation and external validation; the model for Moroccan oils also showed very satisfactory results (with perfect scores for almost every parameter in all the cases). Copyright © 2017 Elsevier Ltd. All rights reserved.
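The sensitivity, specificity and accuracy figures quoted for the two-class models above are all derived from a confusion matrix. The sketch below shows that arithmetic on a held-out (external validation) set; the function name and the toy labels are hypothetical, and the PLS-DA classifier itself is not reproduced here.

```python
def binary_metrics(y_true, y_pred):
    """Return (sensitivity, specificity, accuracy) for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / len(pairs)
    return sensitivity, specificity, accuracy


# Hypothetical test-set labels: 1 = sample from the target origin, 0 = other.
demo_true = [1, 1, 1, 0, 0]
demo_pred = [1, 1, 0, 0, 1]
demo_sens, demo_spec, demo_acc = binary_metrics(demo_true, demo_pred)
```

Reporting these three figures separately for calibration, cross validation and an external test set, as the record does, makes it easier to spot a model that fits its training data well but generalizes poorly.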
Development and validation of the Bullying and Cyberbullying Scale for Adolescents: A multi-dimensional measurement model.

PubMed

Thomas, Hannah J; Scott, James G; Coates, Jason M; Connor, Jason P

2018-05-03

Intervention on adolescent bullying is reliant on valid and reliable measurement of victimization and perpetration experiences across different behavioural expressions. This study developed and validated a survey tool that integrates measurement of both traditional and cyber bullying to test a theoretically driven multi-dimensional model. Adolescents from 10 mainstream secondary schools completed a baseline and follow-up survey (N = 1,217; M age = 14 years; 66.2% male). The Bullying and Cyberbullying Scale for Adolescents (BCS-A) developed for this study comprised parallel victimization and perpetration subscales, each with 20 items. Additional measures of bullying (Olweus Global Bullying and the Forms of Bullying Scale [FBS]), as well as measures of internalizing and externalizing problems, school connectedness, social support, and personality, were used to further assess validity. Factor structure was determined, and then the suitability of items was assessed according to the following criteria: (1) factor interpretability, (2) item correlations, (3) model parsimony, and (4) measurement equivalence across victimization and perpetration experiences. The final models comprised four factors: physical, verbal, relational, and cyber. The final scale was revised to two 13-item subscales. The BCS-A demonstrated acceptable concurrent and convergent validity (internalizing and externalizing problems, school connectedness, social support, and personality), as well as predictive validity over 6 months. The BCS-A has sound psychometric properties. This tool establishes measurement equivalence across types of involvement and behavioural forms common among adolescents. An improved measurement method could add greater rigour to the evaluation of intervention programmes and also enable interventions to be tailored to subscale profiles. © 2018 The British Psychological Society.
Development and validation of a measure of display rule knowledge: the display rule assessment inventory.

PubMed

Matsumoto, David; Yoo, Seung Hee; Hirayama, Satoko; Petrova, Galina

2005-03-01

As one component of emotion regulation, display rules, which reflect the regulation of expressive behavior, have been the topic of many studies. Despite their theoretical and empirical importance, however, to date there is no measure of display rules that assesses a full range of behavioral responses that are theoretically possible when emotion is elicited. This article reports the development of a new measure of display rules that surveys 5 expressive modes: expression, deamplification, amplification, qualification, and masking. Two studies provide evidence for its internal and temporal reliability and for its content, convergent, discriminant, external, and concurrent predictive validity. Additionally, Study 1, involving American, Russian, and Japanese participants, demonstrated predictable cultural differences on each of the expressive modes. Copyright 2005 APA, all rights reserved.
Development and validation of a scale to measure perceived control of internal states.

PubMed

Pallant, J F

2000-10-01

One of the key developments in the psychological literature on control has been the growing recognition of the multidimensional nature of the control construct. Recent research suggests that perceived control of internal states may be just as important as perceived control of external events. The Perceived Control of Internal States Scale was developed to provide a measure of the degree to which people feel they have control of their internal states (emotions, thoughts, physical reactions). I report the results of 2 studies (N = 689), supporting the reliability, construct, and incremental validity of the scale. The buffering effects of perceived control for people facing major life events were also explored, with higher levels of perceived control being associated with fewer physical and psychological symptoms of strain.
Sexual compulsivity scale: adaptation and validation in the Spanish population.

PubMed

Ballester-Arnal, Rafael; Gómez-Martínez, Sandra; Llario, M Dolores-Gil; Salmerón-Sánchez, Pedro

2013-01-01

Sexual compulsivity has been studied in relation to high-risk behavior for sexually transmitted infections. The aim of this study was the adaptation and validation of the Sexual Compulsivity Scale to a sample of Spanish young people. This scale was applied to 1,196 (891 female, 305 male) Spanish college students. The results of principal components factor analysis using a varimax rotation indicated a two-factor solution. The reliability of the Sexual Compulsivity Scale was found to be high. Moreover, the scale showed good temporal stability. External correlates were examined through Pearson correlations between the Sexual Compulsivity Scale and other constructs related with HIV prevention. The authors' results suggest that the Sexual Compulsivity Scale is an appropriate measure for assessing sexual compulsivity, showing adequate psychometric properties in the Spanish population.
Reliability of Measurement of Glenohumeral Internal Rotation, External Rotation, and Total Arc of Motion in 3 Test Positions

PubMed Central

Kevern, Mark A.; Beecher, Michael; Rao, Smita

2014-01-01

Context: Athletes who participate in throwing and racket sports consistently demonstrate adaptive changes in glenohumeral-joint internal and external rotation in the dominant arm. Measurements of these motions have demonstrated excellent intrarater and poor interrater reliability. Objective: To determine intrarater reliability, interrater reliability, and standard error of measurement for shoulder internal rotation, external rotation, and total arc of motion using an inclinometer in 3 testing procedures in National Collegiate Athletic Association Division I baseball and softball athletes. Design: Cross-sectional study. Setting: Athletic department. Patients or Other Participants: Thirty-eight players participated in the study. Shoulder internal rotation, external rotation, and total arc of motion were measured by 2 investigators in 3 test positions. The standard supine position was compared with a side-lying test position, as well as a supine test position without examiner overpressure. Results: Excellent intrarater reliability was noted for all 3 test positions and ranges of motion, with intraclass correlation coefficient values ranging from 0.93 to 0.99. Results for interrater reliability were less favorable. Reliability for internal rotation was highest in the side-lying position (0.68) and reliability for external rotation and total arc was highest in the supine-without-overpressure position (0.774 and 0.713, respectively). The supine-with-overpressure position yielded the lowest interrater reliability results in all positions. The side-lying position had the most consistent results, with very little variation among intraclass correlation coefficient values for the various test positions. Conclusions: The results of our study clearly indicate that the side-lying test procedure is of equal or greater value than the traditional supine-with-overpressure method. PMID:25188316
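The record above lists the standard error of measurement (SEM) among its objectives. A common way to obtain it from a reliability coefficient is SEM = SD * sqrt(1 - ICC), with the minimal detectable change at 95% confidence given by 1.96 * sqrt(2) * SEM. The sketch below applies those two formulas; whether the authors used exactly this form is an assumption, and the standard deviation in the example is hypothetical because the abstract does not report one.

```python
import math


def standard_error_of_measurement(sd, icc):
    """SEM from the between-subject standard deviation and a reliability coefficient."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc is expected to lie in [0, 1]")
    return sd * math.sqrt(1.0 - icc)


def minimal_detectable_change(sem, z=1.96):
    """Smallest change exceeding measurement error at the confidence level implied by z."""
    return z * math.sqrt(2.0) * sem


# Hypothetical example: SD of 10 degrees of rotation and an ICC of 0.75.
demo_sem = standard_error_of_measurement(10.0, 0.75)
demo_mdc = minimal_detectable_change(demo_sem)
```

Because SEM is expressed in the units of the measurement, it shows directly how much of an observed change in range of motion could be noise, which an ICC alone does not.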
External validation of Vascular Study Group of New England risk predictive model of mortality after elective abdominal aorta aneurysm repair in the Vascular Quality Initiative and comparison against established models.

PubMed

Eslami, Mohammad H; Rybin, Denis V; Doros, Gheorghe; Siracuse, Jeffrey J; Farber, Alik

2018-01-01

The purpose of this study is to externally validate a recently reported Vascular Study Group of New England (VSGNE) risk predictive model of postoperative mortality after elective abdominal aortic aneurysm (AAA) repair and to compare its predictive ability across different patients' risk categories and against the established risk predictive models using the Vascular Quality Initiative (VQI) AAA sample. The VQI AAA database (2010-2015) was queried for patients who underwent elective AAA repair. The VSGNE cases were excluded from the VQI sample. The external validation of a recently published VSGNE AAA risk predictive model, which includes only preoperative variables (age, gender, history of coronary artery disease, chronic obstructive pulmonary disease, cerebrovascular disease, creatinine levels, and aneurysm size) and planned type of repair, was performed using the VQI elective AAA repair sample. The predictive value of the model was assessed via the C-statistic. Hosmer-Lemeshow method was used to assess calibration and goodness of fit. This model was then compared with the Medicare, Vascular Governance Northwest model, and Glasgow Aneurysm Score for predicting mortality in VQI sample. The Vuong test was performed to compare the model fit between the models. Model discrimination was assessed in different risk group VQI quintiles. Data from 4431 cases from the VSGNE sample with the overall mortality rate of 1.4% was used to develop the model. The internally validated VSGNE model showed a very high discriminating ability in predicting mortality (C = 0.822) and good model fit (Hosmer-Lemeshow P = .309) among the VSGNE elective AAA repair sample. External validation on 16,989 VQI cases with an overall 0.9% mortality rate showed very robust predictive ability of mortality (C = 0.802). Vuong tests yielded a significant fit difference favoring the VSGNE over the Medicare model (C = 0.780), Vascular Governance Northwest (0.774), and Glasgow Aneurysm Score (0.639).
Across the 5 risk quintiles, the VSGNE model predicted observed mortality significantly with great accuracy. This simple VSGNE AAA risk predictive model showed very high discriminative ability in predicting mortality after elective AAA repair among a large external independent sample of AAA cases performed by a diverse array of physicians nationwide. The risk score based on this simple VSGNE model can reliably stratify patients according to their risk of mortality after elective AAA repair better than other established models. Copyright © 2017 Society for Vascular Surgery. Published by Elsevier Inc. All rights reserved.
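The C-statistic reported above measures discrimination: the share of (death, survivor) pairs in which the model assigned the higher predicted risk to the patient who died. The following is a minimal sketch of that calculation, not the study's code; the function name `c_statistic` and the toy risks and outcomes are hypothetical and are not taken from the VSGNE or VQI data.

```python
from itertools import product


def c_statistic(predicted_risk, died):
    """Concordance (C) statistic for a binary outcome.

    Counts, over all (event, non-event) pairs, how often the event case
    received the higher predicted risk; tied predictions count as half.
    """
    events = [p for p, d in zip(predicted_risk, died) if d == 1]
    non_events = [p for p, d in zip(predicted_risk, died) if d == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    score = 0.0
    for e, n in product(events, non_events):
        if e > n:
            score += 1.0
        elif e == n:
            score += 0.5
    return score / (len(events) * len(non_events))


# Hypothetical toy data for illustration only.
risks = [0.01, 0.02, 0.05, 0.10, 0.03, 0.20]
outcomes = [0, 0, 1, 0, 0, 1]
c = c_statistic(risks, outcomes)
print(round(c, 3))
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation. The pairwise loop is quadratic in sample size, so a rank-based formulation would be preferable for a sample as large as the 16,989 VQI cases.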
Absorption in Sport: A Cross-Validation Study

PubMed Central

Koehn, Stefan; Stavrou, Nektarios A. M.; Cogley, Jeremy; Morris, Tony; Mosek, Erez; Watt, Anthony P.

2017-01-01

Absorption has been identified as readiness for experiences of deep involvement in the task. Conceptually, absorption is a key psychological construct, incorporating experiential, cognitive, and motivational components. However, no operationalization of the construct has been provided to facilitate research in this area. The purpose of this research was the development and examination of the psychometric properties of a sport-specific measure of absorption that evolved from the use of the modified Tellegen Absorption Scale (MODTAS; Jamieson, 2005) in mainstream psychology. The study aimed to provide evidence of the psychometric properties, reliability, and validity of the Measure of Absorption in Sport Contexts (MASCs). The psychometric examination included a calibration sample from Scotland and a cross-validation sample from Australia using a cross-sectional design. The item pool was developed based on existing items from the modified Tellegen Absorption Scale (Jamieson, 2005). The MODTAS items were reworded and translated into a sport context. The Scottish sample consisted of 292 participants and the Australian sample of 314 participants. Congeneric model testing and confirmatory factor analysis for both samples and multi-group invariance testing across samples were used. In the cross-validation sample the MASC subscales showed acceptable internal consistency and construct reliability (≥0.70). Excellent fit indices were found for the final 18-item, six-factor measure in the cross-validation sample, χ²(120) = 197.486, p < 0.001; CFI = 0.957; TLI = 0.945; RMSEA = 0.045; SRMR = 0.044. Multi-group invariance testing revealed no differences in item meaning, except for two items. The MASC and the Dispositional Flow Scale-2 showed moderate-to-strong positive correlations in both samples, r = 0.38, p < 0.001 and r = 0.42, p < 0.001, supporting the external validity of the MASC. 
This article provides initial evidence in support of the psychometric properties, reliability, and validity of the sport-specific measure of absorption. The MASC provides rich research opportunities in sport psychology that can enhance the theoretical understanding between absorption and related constructs and facilitate future intervention studies. PMID:28883802
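The RMSEA quoted in this abstract can be cross-checked against the reported chi-square. The sketch below uses the common point-estimate formula RMSEA = sqrt(max(χ² − df, 0) / (df · (N − 1))). It assumes that the cross-validation sample is the Australian sample of 314 participants and that no cases were dropped before model fitting; the abstract does not state the analysed N explicitly, and some software divides by N rather than N − 1.

```python
import math


def rmsea(chi_square, df, n):
    """Point estimate of RMSEA from a model chi-square.

    Uses the (N - 1) convention; negative numerators are truncated to zero.
    """
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))


# Values as reported in the abstract; n = 314 is an assumption (see lead-in).
value = rmsea(chi_square=197.486, df=120, n=314)
print(round(value, 3))
```

Under these assumptions the result rounds to the reported 0.045, which suggests the reading χ²(120) = 197.486 is the intended one.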
Validity of the Malaise Inventory in general population samples.

PubMed

Rodgers, B; Pickles, A; Power, C; Collishaw, S; Maughan, B

1999-06-01

The Malaise Inventory is a commonly used self-completion scale for assessing psychiatric morbidity. There is some evidence that it may represent two separate psychological and somatic subscales rather than a single underlying factor of distress. This paper provides further information on the factor structure of the Inventory and on the reliability and validity of the total scale and two sub-scales. Two general population samples completed the full Inventory: over 11,000 subjects from the National Child Development Study at ages 23 and 33, and 544 mothers of adolescents included in the Isle of Wight epidemiological surveys. The internal consistency of the full 24-item scale and the 15-item psychological subscale were found to be acceptable, but the eight-item somatic sub-scale was less reliable. Factor analysis of all 24 items identified a first main general factor and a second more purely psychological factor. Receiver operating characteristic (ROC) analysis indicated that the validity of the scale held for men and women separately and for different socio-economic groups, by reference to external criteria covering current or recent psychiatric morbidity and service use, and that the psychological sub-scale had no greater validity than the full scale. This study did not support the separate scoring of a somatic sub-scale of the Malaise Inventory. Use of the 15-item psychological sub-scale can be justified on the grounds of reduced time and cost for completion, with little loss of reliability or validity, but this approach would not significantly enhance the properties of the Inventory by comparison with the full 24-item scale. Inclusion of somatic items may be more problematic when the full scale is used to compare particular sub-populations with different propensities for physical morbidity, such as different age groups, and in these circumstances it would be a sensible precaution to utilise the 15-item psychological sub-scale.
Can Findings from Randomized Controlled Trials of Social Skills Training in Autism Spectrum Disorder Be Generalized? The Neglected Dimension of External Validity

ERIC Educational Resources Information Center

Jonsson, Ulf; Olsson, Nora Choque; Bölte, Sven

2016-01-01

Systematic reviews have traditionally focused on internal validity, while external validity often has been overlooked. In this study, we systematically reviewed determinants of external validity in the accumulated randomized controlled trials of social skills group interventions for children and adolescents with autism spectrum disorder. We…
External Validity, Internal Validity, and Organizational Reality: A Response to Robert L. Cardy (Commentary).

ERIC Educational Resources Information Center

Steinfatt, Thomas M.

1991-01-01

Responds to an article in the same issue of this journal which defends the applied value of laboratory studies to managers. Agrees that external validity is often irrelevant, and maintains that the problem of making inferences from any subject sample in management communication is one that demands internal, not external, validity. (SR)
Quality assessment of gasoline using comprehensive two-dimensional gas chromatography combined with unfolded partial least squares: A reliable approach for the detection of gasoline adulteration.

PubMed

Parastar, Hadi; Mostafapour, Sara; Azimi, Gholamhasan

2016-01-01

Comprehensive two-dimensional gas chromatography and flame ionization detection combined with unfolded-partial least squares is proposed as a simple, fast and reliable method to assess the quality of gasoline and to detect its potential adulterants. The data for the calibration set are first baseline corrected using a two-dimensional asymmetric least squares algorithm. The number of significant partial least squares components to build the model is determined using the minimum value of root-mean square error of leave-one out cross validation, which was 4. In this regard, blends of gasoline with kerosene, white spirit and paint thinner as frequently used adulterants are used to make calibration samples. Appropriate statistical parameters of regression coefficient of 0.996-0.998, root-mean square error of prediction of 0.005-0.010 and relative error of prediction of 1.54-3.82% for the calibration set show the reliability of the developed method. In addition, the developed method is externally validated with three samples in validation set (with a relative error of prediction below 10.0%). Finally, to test the applicability of the proposed strategy for the analysis of real samples, five real gasoline samples collected from gas stations are used for this purpose and the gasoline proportions were in range of 70-85%. Also, the relative standard deviations were below 8.5% for different samples in the prediction set. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
The Psychometric Properties of the Center for Epidemiologic Studies Depression Scale in Chinese Primary Care Patients: Factor Structure, Construct Validity, Reliability, Sensitivity and Responsiveness

PubMed Central

2015-01-01

Background: The Center for Epidemiologic Studies Depression Scale (CES-D) is a commonly used instrument to measure depressive symptomatology. Despite this, the evidence for its psychometric properties remains poorly established in Chinese populations. The aim of this study was to validate the use of the CES-D in Chinese primary care patients by examining factor structure, construct validity, reliability, sensitivity and responsiveness. Methods and Results: The psychometric properties were assessed amongst a sample of 3686 Chinese adult primary care patients in Hong Kong. Three competing factor structure models were examined using confirmatory factor analysis. The original CES-D four-structure model had adequate fit; however, the data were better fit by a bi-factor model. For the internal construct validity, corrected item-total correlations were 0.4 for most items. The convergent validity was assessed by examining the correlations between the CES-D, the Patient Health Questionnaire 9 (PHQ-9) and the Short Form-12 Health Survey (version 2) Mental Component Summary (SF-12 v2 MCS). The CES-D had a strong correlation with the PHQ-9 (coefficient: 0.78) and SF-12 v2 MCS (coefficient: -0.75). Internal consistency was assessed by McDonald’s omega hierarchical (ωH). The ωH value for the general depression factor was 0.855. The ωH values for “somatic”, “depressed affect”, “positive affect” and “interpersonal problems” were 0.434, 0.038, 0.738 and 0.730, respectively. For the two-week test-retest reliability, the intraclass correlation coefficient was 0.91. The CES-D was sensitive in detecting differences between known groups, with the AUC >0.7. Internal responsiveness of the CES-D to detect positive and negative changes was satisfactory (with p value <0.01 and all effect size statistics >0.2). The CES-D was externally responsive, with the AUC >0.7. 
Conclusions: The CES-D appears to be a valid, reliable, sensitive and responsive instrument for screening and monitoring depressive symptoms in adult Chinese primary care patients. In its original four-factor and bi-factor structure, the CES-D is supported for cross-cultural comparisons of depression in multi-center studies. PMID:26252739

External model validation of binary clinical risk prediction models in cardiovascular and thoracic surgery.

PubMed

Hickey, Graeme L; Blackstone, Eugene H

2016-08-01

Clinical risk-prediction models serve an important role in healthcare. They are used for clinical decision-making and measuring the performance of healthcare providers. To establish confidence in a model, external model validation is imperative. When designing such an external model validation study, thought must be given to patient selection, risk factor and outcome definitions, missing data, and the transparent reporting of the analysis. In addition, there are a number of statistical methods available for external model validation. Execution of a rigorous external validation study rests in proper study design, application of suitable statistical methods, and transparent reporting. Copyright © 2016 The American Association for Thoracic Surgery. Published by Elsevier Inc. All rights reserved.
Validation of an innovative instrument of Positive Oral Health and Well-Being (POHW).

PubMed

Zini, Avraham; Büssing, Arndt; Chay, Cindy; Badner, Victor; Weinstock-Levin, Tamar; Sgan-Cohen, Harold D; Cochardt, Philip; Friedmann, Anton; Ziskind, Karin; Vered, Yuval

2016-04-01

Most existing measures of oral health focus solely on negative oral health, illness, and deficiencies and ignore positive oral health. In an attempt to commence exploration of this challenging field, an innovative instrument was developed, the "Positive Oral Health and Well-Being" (POHW) index. This study aimed to validate this instrument and to explore an initial model of the pathway between oral health attributes and positive oral health. A cross-sectional, multicenter study (Israel, USA, and Germany), was conducted. Our conceptual model suggests that positive oral health attributes, which integrate with positive unawareness or positive awareness on the one hand and with positive perception on the other hand, may result via appropriate oral health behavior on positive oral health. The 17-item self-administered index was built on a theoretical concept by four experts from Israel and Germany. Reliability, factor, and correlation analyses were performed. For external correlations and to measure construct validity of the instrument, we utilized the oral health impact profile-14, self-perceived oral impairment, life satisfaction, self-perceived well-being, sociodemographic and behavioral data, and oral health status indices. Four hundred and seventy participants took part in our three-center study. The combined data set reliability analyses detected two items which were not contributing to the index reliability. Thus, we tested a 15-item construct, and a Cronbach's α value of 0.933 was revealed. Primary factor analysis of the whole sample indicated three subconstructs which could explain 60 % of variance. Correlation analyses demonstrated that the POHW and OHIP-14 were strongly and negatively associated. The POHW correlated strongly and positively with general well-being, moderately with life satisfaction, and weakly with the perceived importance of regular dental checkups. 
It correlated moderately and negatively with perceived oral impairment, and marginally and negatively with dental caries experience (DMFT) and periodontal health status (CPI) scores. When DMFT and CPI clinical measurements were categorized, a higher score of POHW was revealed for better oral health. Our study introduced a new instrument with good reliability and sound correlations with external measures. This instrument is the first to allow measurability of positive instead of impaired oral health. We utilized subjective-psychological and functional-social measures. The current results indicate that by further exploring our conceptual model, POHW may be of importance for identifying patients with good and poor oral health, and building an effective and inexpensive strategy for prevention, by being able to evaluate the effect of interventions in a standardized way.
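Several records in this listing, including the one above, report Cronbach's α as the measure of internal consistency. As a rough illustration of what that number summarises, the sketch below computes α = k/(k − 1) · (1 − Σ item variances / variance of total scores) from a respondents-by-items table. The function name `cronbach_alpha` and the toy responses are hypothetical and do not reproduce any data from the studies listed here.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    Uses sample variances (n - 1 denominator) for items and total scores.
    """
    k = len(rows[0])
    if k < 2 or any(len(r) != k for r in rows):
        raise ValueError("need at least two items and equal-length rows")
    item_variance_sum = sum(variance(col) for col in zip(*rows))
    total_variance = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical responses: 5 respondents x 4 items on a 1-5 scale.
responses = [
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [3, 3, 4, 3],
    [5, 5, 5, 4],
    [1, 2, 1, 2],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 3))
```

Because α rises with the number of items as well as with inter-item correlation, comparisons between scales of different length (for example a 15-item and a 24-item version) should be read with that in mind.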
[The Arnett Inventory of Sensation Seeking (AISS): a French-speaking validation and psychometric examination in young students].

PubMed

Cazenave, N; Paquette, L

2010-10-01

In French-speaking countries, the concept of sensation seeking has been most widely assessed using the Zuckerman Sensation Seeking Scale form V (SSS), since this instrument was validated (in French) more than 15 years ago. This instrument has received several criticisms which limit the internal and external consistencies. Indeed, five limitations of conception and form could reduce the fact that many researchers have found the SSSV to be valid and useful and, more importantly, the conclusions that can be drawn from studies in which it has been used (e.g., tautological relationships, a forced-choice format, out-of-date language in some items). Arnett thus developed a new measurement (Arnett Inventory of Sensation Seeking, AISS) based on a new conceptualization of sensation seeking, which is characterized by the need for novelty and intensity of stimulation, whereas sensation seeking, as developed by Zuckerman, is marked by a need for novelty and complexity of stimulation. The AISS has been translated and validated in Spanish and in German. Both studies found support for the bi-dimensional structure of the instrument. Currently, there is no French-speaking version of the AISS, and because of the cultural differences between English- and French-speaking populations, we cannot simply translate the instrument without examining the reliability and the factorial validity. Hence, we followed the seven steps of the cross-cultural validation methodology for psychological questionnaires presented by Vallerand. Questionnaires were distributed to 782 young adults. Out of these questionnaires, 737 (94%) were returned. One hundred and sixteen questionnaires were removed because of missing data. Thus, a total of 621 young adults were included in the study. They were aged from 18 to 28 years (M=23.32, SD=2.79). They completed the SSS and the AISS. 
We conducted a confirmatory factor analysis (CFA) on the data set, using Amos 6.0, to assess the validity of the bi-dimensional structure; we also examined the internal consistencies, and tested the potential gender differences. The analyses show that the fit indices, associated with the model with 20 items proposed by Arnett, were poor. We therefore had to modify it and delete some items in order to provide a more satisfactory account of the data. The fit indices from the confirmatory factor analysis were adequate for a two-factor structure with six items on each subscale. Pearson's correlation coefficients supported convergent validity of the questionnaire. Internal consistency reliabilities Cronbach's α were calculated for each of the factors and for the total scale. The reliability coefficients for the Intensity and Novelty subscales were 0.621 and 0.567, respectively, whereas the reliability of the overall scale was 0.646. In order to assess the differences between both sexes, we carried out a multivariate analysis of variance with gender as independent variables, and intensity, novelty and the total score of the revised AISS as dependent variables. Men scored higher than women on the Total Scale and on the Intensity subscale, but no gender relationship was found on Novelty subscale. These findings replicated research supporting the construct validity and reliability of the AISS in previous psychometric examinations. The results of this preliminary study yielded sufficient support for the validity of the French translation of the AISS, but further analyses, such as test-retest reliability and discriminant validity should be conducted. Copyright © 2010 L'Encéphale, Paris. Published by Elsevier Masson SAS. All rights reserved.
Psychometric evaluation of the Mental Health Continuum-Short Form (MHC-SF) in Chinese adolescents - a methodological study.

PubMed

Guo, Cheng; Tomson, Göran; Guo, Jizhi; Li, Xiangyun; Keller, Christina; Söderqvist, Fredrik

2015-12-10

In epidemiological surveillance of mental health there is good reason to also include scales that measure the presence of well-being rather than merely symptoms of ill health. The Mental Health Continuum-Short Form (MHC-SF) is a self-reported scale to measure emotional, psychological and social well-being and conduct categorical diagnosis of positive mental health. This particular instrument includes the three core components of the World Health Organization's definition of mental health and had previously not been psychometrically evaluated on adolescents in China. In total 5,399 students (51.1% female) from schools in the urban areas of Weifang in China were included in the study (mean age = 15.13, SD = 1.56). Participants completed a comprehensive questionnaire with several scales, among them the MHC-SF. Statistical analyses to evaluate reliability, structural validity, measurement invariance, presence of floor and ceiling effects and to some extent external validity of the MHC-SF were carried out. The Cronbach's α coefficients for sub-scales as well as the total scale were all above 0.80, indicating good reliability. Confirmatory factor analysis confirmed the three-dimensional structure of the Chinese version of MHC-SF and supported the configural and metric invariance across gender and age. Noteworthy ceiling effects were observed for single items and sub-scales although not for the total scale. More importantly, observed floor effects were negligible. The stronger correlation found between MHC-SF and Minneapolis-Manchester Quality of Life Instrument (as measure of positive mental health) than between MHC-SF and Hospital Anxiety Depression Scale (as measure of mental illness and distress) yielded support for external validity. 
In conclusion, the main findings of this study are in line with studies from other countries that evaluated the psychometric properties of the MHC-SF and show that this instrument, that includes the three core components of the WHO definition of mental health, is useful in assessing positive adolescent mental health also in China.
Validation of the German Version of the Social Functioning Scale (SFS) for schizophrenia.

PubMed

Iffland, Jona R; Lockhofen, Denise; Gruppe, Harald; Gallhofer, Bernd; Sammer, Gebhard; Hanewald, Bernd

2015-01-01

Deficits in social functioning are a core symptom of schizophrenia and an important criterion for evaluating the success of treatment. However, there is little agreement regarding its measurement. A common, often cited instrument for assessing self-reported social functioning is the Social Functioning Scale (SFS). The study aimed to investigate the reliability and validity of the German translation. 101 patients suffering from schizophrenia (SZ) and 101 matched controls (C) (60 male / 41 female, 35.8 years in both groups) completed the German version. In addition, demographic, clinical, and functional data were collected. Internal consistency was investigated calculating Cronbach's alpha for SFS full scale (α: .81) and all subscales (α: .59-.88). Significant bivariate correlation coefficients were found between all subscales as well as between all subscales and full scale (p <.01). For the total sample, principal component analysis gave evidence to prefer a single-factor solution (eigenvalue ≥ 1) accounting for 48.5 % of the variance. For the subsamples, a two-component solution (SZ; 57.0 %) and a three-component solution (C; 65.6 %) fitted best, respectively. For SZ and C, significant associations were found between SFS and external criteria. The main factor "group" emerged as being significant. C showed higher values on both subscales and full scale. The sensitivity of the SFS was examined using discriminant analysis. 86.5% of the participants could be categorized correctly to their actual group. The German translation of the SFS turned out to be a reliable and valid questionnaire comparable to the original English version. This is in line with Spanish and Norwegian translations of the SFS. Concluding, the German version of the SFS is well suited to become a useful and practicable instrument for the assessment of social functioning in both clinical practice and research. It accomplishes commonly used external assessment scales.
The Teamwork Assessment Scale: A Novel Instrument to Assess Quality of Undergraduate Medical Students' Teamwork Using the Example of Simulation-based Ward-Rounds.

PubMed

Kiesewetter, Jan; Fischer, Martin R

2015-01-01

Simulation-based teamwork trainings are considered a powerful training method to advance teamwork, which becomes more relevant in medical education. The measurement of teamwork is of high importance and several instruments have been developed for various medical domains to meet this need. To our knowledge, no theoretically-based and easy-to-use measurement instrument has been published or developed specifically for simulation-based teamwork trainings of medical students. Internist ward-rounds function as an important example of teamwork in medicine. The purpose of this study was to provide a validated, theoretically-based instrument that is easy-to-use. Furthermore, this study aimed to identify if and when rater scores relate to performance. Based on a theoretical framework for teamwork behaviour, items regarding four teamwork components (Team Coordination, Team Cooperation, Information Exchange, Team Adjustment Behaviours) were developed. In study one, three ward-round scenarios, simulated by 69 students, were videotaped and rated independently by four trained raters. The instrument was tested for the embedded psychometric properties and factorial structure. In study two, the instrument was tested for construct validity with an external criterion with a second set of 100 students and four raters. In study one, the factorial structure matched the theoretical components but was unable to separate Information Exchange and Team Cooperation. The preliminary version showed adequate psychometric properties (Cronbach's α=.75). In study two, the instrument showed physician rater scores were more reliable in measurement than those of student raters. Furthermore, a close correlation between the scale and clinical performance as an external criterion was shown (r=.64) and the sufficient psychometric properties were replicated (Cronbach's α=.78). 
The validation allows for use of the simulated teamwork assessment scale in undergraduate medical ward-round trainings to reliably measure teamwork by physicians. Further studies are needed to verify the applicability of the instrument.
The Teamwork Assessment Scale: A Novel Instrument to Assess Quality of Undergraduate Medical Students' Teamwork Using the Example of Simulation-based Ward-Rounds

PubMed Central

Kiesewetter, Jan; Fischer, Martin R.

2015-01-01

Background: Simulation-based teamwork trainings are considered a powerful training method to advance teamwork, which becomes more relevant in medical education. The measurement of teamwork is of high importance and several instruments have been developed for various medical domains to meet this need. To our knowledge, no theoretically-based and easy-to-use measurement instrument has been published or developed specifically for simulation-based teamwork trainings of medical students. Internist ward-rounds function as an important example of teamwork in medicine. Purposes: The purpose of this study was to provide a validated, theoretically-based instrument that is easy-to-use. Furthermore, this study aimed to identify if and when rater scores relate to performance. Methods: Based on a theoretical framework for teamwork behaviour, items regarding four teamwork components (Team Coordination, Team Cooperation, Information Exchange, Team Adjustment Behaviours) were developed. In study one, three ward-round scenarios, simulated by 69 students, were videotaped and rated independently by four trained raters. The instrument was tested for the embedded psychometric properties and factorial structure. In study two, the instrument was tested for construct validity with an external criterion with a second set of 100 students and four raters. Results: In study one, the factorial structure matched the theoretical components but was unable to separate Information Exchange and Team Cooperation. The preliminary version showed adequate psychometric properties (Cronbach’s α=.75). In study two, the instrument showed physician rater scores were more reliable in measurement than those of student raters. Furthermore, a close correlation between the scale and clinical performance as an external criterion was shown (r=.64) and the sufficient psychometric properties were replicated (Cronbach’s α=.78). 
Conclusions: The validation allows for use of the simulated teamwork assessment scale in undergraduate medical ward-round trainings to reliably measure teamwork by physicians. Further studies are needed to verify the applicability of the instrument. PMID:26038684
Preliminary validation of the Perceived Locus of Causality scale for academic motivation in the context of university studies (PLOC-U).

PubMed

Sánchez de Miguel, Manuel; Lizaso, Izarne; Hermosilla, Daniel; Alcover, Carlos-Maria; Goudas, Marios; Arranz-Freijó, Enrique

2017-12-01

Research has shown that self-determination theory can be useful in the study of motivation in sport and other forms of physical activity. The Perceived Locus of Causality (PLOC) scale was originally designed to study both. The current research presents and validates the new PLOC-U scale to measure academic motivation in the university context. We tested levels of self-determination before and after academic examinations. Also, we analysed degree of internalization of extrinsic motivation in students' practical activities. Two hundred and eighty-seven Spanish university students participated in the study. Data were collected at two time points to check the reliability and stability of PLOC-U by a test-retest procedure. Confirmatory factor analysis was performed on the PLOC-U. Also convergent validity was tested against the Academic Motivation Scale (EME-E). Confirmatory factor analysis showed optimum fit and good reliability of PLOC-U. It also presented excellent convergent validity with the EME-E and good stability over time. Our findings did not show any significant correlation between self-determination and expected results before academic examinations, but it did so afterwards, revealing greater regulation by and integration of extrinsic motivation. The high score obtained for extrinsic motivation points to a greater regulation associated with an external contingency (rewards in the practical coursework). PLOC-U is a good instrument for the measurement of academic motivation and provides a new tool to analyse self-determination among university students. © 2017 The British Psychological Society.
Use of ATR-FTIR spectroscopy coupled with chemometrics for the authentication of avocado oil in ternary mixtures with sunflower and soybean oils.

PubMed

Jiménez-Sotelo, Paola; Hernández-Martínez, Maylet; Osorio-Revilla, Guillermo; Meza-Márquez, Ofelia Gabriela; García-Ochoa, Felipe; Gallardo-Velázquez, Tzayhrí

2016-07-01

Avocado oil is a high-value and nutraceutical oil whose authentication is very important since the addition of low-cost oils could lower its beneficial properties. Mid-FTIR spectroscopy combined with chemometrics was used to detect and quantify adulteration of avocado oil with sunflower and soybean oils in a ternary mixture. Thirty-seven laboratory-prepared adulterated samples and 20 pure avocado oil samples were evaluated. The adulterated oil amount ranged from 2% to 50% (w/w) in avocado oil. A soft independent modelling class analogy (SIMCA) model was developed to discriminate between pure and adulterated samples. The model showed recognition and rejection rate of 100% and proper classification in external validation. A partial least square (PLS) algorithm was used to estimate the percentage of adulteration. The PLS model showed values of R(2) > 0.9961, standard errors of calibration (SEC) in the range of 0.3963-0.7881, standard errors of prediction (SEP estimated) between 0.6483 and 0.9707, and good prediction performances in external validation. The results showed that mid-FTIR spectroscopy could be an accurate and reliable technique for qualitative and quantitative analysis of avocado oil in ternary mixtures.
Citrate Content of Bone as a Measure of Postmortem Interval: An External Validation Study.

PubMed

Brown, Michael A; Bunch, Ann W; Froome, Charles; Gerling, Rebecca; Hennessy, Shawn; Ellison, Jeffrey

2017-12-26

The postmortem interval (PMI) of skeletal remains is a crucial piece of information that can help establish the time dimension in criminal cases. Unfortunately, the accurate and reliable determination of PMI from bone continues to evade forensic investigators despite concerted efforts over the past decades to develop suitable qualitative and quantitative methods. A relatively new PMI method based on the analysis of citrate content of bone was developed by Schwarcz et al. The main objective of our research was to determine whether this work could be externally validated. Thirty-one bone samples were obtained from the Forensic Anthropology Center, University of Tennessee, Knoxville, and the Onondaga County Medical Examiner's Office. Results from analyzing samples with PMI greater than 2 years suggest that the hypothetical relationship between the citrate content of bone and PMI is much weaker than reported. It was also observed that the average absolute error between the PMI value estimated using the equation proposed by Schwarcz et al. and the actual ("true") PMI of the sample was negative indicating an underestimation in PMI. These findings are identical to those reported by Kanz et al. Despite these results this method may still serve as a technique to sort ancient from more recent skeletal cases, after further, similar validation studies have been conducted. © 2017 American Academy of Forensic Sciences.
Automatic personality assessment through social media language.

PubMed

Park, Gregory; Schwartz, H Andrew; Eichstaedt, Johannes C; Kern, Margaret L; Kosinski, Michal; Stillwell, David J; Ungar, Lyle H; Seligman, Martin E P

2015-06-01

Language use is a psychologically rich, stable individual difference with well-established correlations to personality. We describe a method for assessing personality using an open-vocabulary analysis of language from social media. We compiled the written language from 66,732 Facebook users and their questionnaire-based self-reported Big Five personality traits, and then we built a predictive model of personality based on their language. We used this model to predict the 5 personality factors in a separate sample of 4,824 Facebook users, examining (a) convergence with self-reports of personality at the domain- and facet-level; (b) discriminant validity between predictions of distinct traits; (c) agreement with informant reports of personality; (d) patterns of correlations with external criteria (e.g., number of friends, political attitudes, impulsiveness); and (e) test-retest reliability over 6-month intervals. Results indicated that language-based assessments can constitute valid personality measures: they agreed with self-reports and informant reports of personality, added incremental validity over informant reports, adequately discriminated between traits, exhibited patterns of correlations with external criteria similar to those found with self-reported personality, and were stable over 6-month intervals. Analysis of predictive language can provide rich portraits of the mental life associated with traits. This approach can complement and extend traditional methods, providing researchers with an additional measure that can quickly and cheaply assess large groups of participants with minimal burden. (c) 2015 APA, all rights reserved.
Development and psychometric properties of a Calcium Intake Questionnaire based on the social cognitive theory (CIQ-SCT) for Iranian women.

PubMed

Nematollahi, Mahin; Eslami, Ahmad Ali

2018-01-01

Background: Osteoporosis is common among women, which may be largely due to low calcium intake. This article reports the development, cultural adaptation and psychometric properties of a Calcium Intake Questionnaire based on the social cognitive theory (CIQ-SCT) among Iranian women. Methods: In 2016, this cross-sectional study was carried out among 400 women younger than 50 years in Isfahan, Iran. After a literature review, a preliminary 35-item questionnaire was developed. Then, forward-backward translation and cultural adaptation of the tool were conducted. The Content Validity Index was confirmed by an expert panel and face validity was evaluated in a pilot study. Exploratory and confirmatory factor analyses (EFA & CFA) were conducted on the calibration and validation samples, respectively. Reliability was also assessed using an internal consistency test. Results: After determining content and face validity, 20 items with 5 factors (self-efficacy, outcome expectations, social support and self-regulation) were obtained. Cronbach's alpha for the instrument was 0.901. In EFA, we identified a 4-factor model explaining a total variance of 72.3%. The CFA results (CMIN/DF=1.850, CFI=0.946, TLI=0.938, RMSEA=0.069 [90% CI: 0.057-0.081]) indicated that the model fit the social cognitive theory. Self-regulation was identified as the best predictor of calcium intake. Conclusion: The CIQ-SCT showed acceptable levels of reliability and validity in explaining calcium intake based on the constructs of social cognitive theory. Further psychometric testing in different populations is recommended to confirm the external validity of the instrument.
A comparison of the psychometric properties of the psychopathic personality inventory full-length and short-form versions.

PubMed

Kastner, Rebecca M; Sellbom, Martin; Lilienfeld, Scott O

2012-03-01

The Psychopathic Personality Inventory (PPI) has shown promising construct validity as a measure of psychopathy. Because of its relative efficiency, a short-form version of the PPI (PPI-SF) was developed and has proven useful in many psychopathy studies. The validity of the PPI-SF, however, has not been thoroughly examined, and no studies have directly compared the validity of the short form with that of the full-length version. The current study was designed to compare the psychometric properties of both PPI versions, with an emphasis on convergent and discriminant validity in predicting external criteria conceptually relevant to psychopathy. We used both prison (n = 558) and college samples (n = 322) for this investigation. PPI scale scores were more reliable and more strongly correlated with the conceptually relevant criterion measures compared with the PPI-SF, particularly in the prison sample. There were no differences in relative discriminant validity. Thus, overall, the PPI full-length version showed more evidence of construct validity than did the short form, and the consequences of this psychometric difference should be considered when evaluating the clinical utility of each measure.
Identification student’s misconception of heat and temperature using three-tier diagnostic test

NASA Astrophysics Data System (ADS)

Suliyanah; Putri, H. N. P. A.; Rohmawati, L.

2018-03-01

The objective of this research is to develop a Three-Tier Diagnostic Test (TTDT) to identify students' misconceptions of heat and temperature. The stages of development are analysis, planning, design, development, evaluation and revision. The results of this study show that (1) the quality of the three-tier diagnostic test instrument developed is good, with the following details: (a) Internal validity of 88.19%, belonging to the valid category. (b) External validity: the empirical construct validity test using the Pearson Product Moment yielded 0.43, and the test produced 6.1% false positives and 5.9% false negatives, so the instrument was valid. (c) Test reliability by Cronbach’s Alpha of 0.98, which means acceptable. (d) A difficulty level of 80%, meaning the test is quite difficult. (2) Based on the II test, student misconceptions on the temperature of heat and displacement materials were highest at 84% and lowest at 21%, with non-misconceptions at 7%. (3) The most frequent cause of misconception among students is associative thinking (22%) and the least frequent is incomplete reasoning (11%). The Three-Tier Diagnostic Test (TTDT) could identify students' misconceptions of heat and temperature.
Test-retest reliability of a balance testing protocol with external perturbations in young healthy adults.

PubMed

Robbins, Shawn M; Caplan, Ryan M; Aponte, Daniel I; St-Onge, Nancy

2017-10-01

External perturbations are utilized to challenge balance and mimic realistic balance threats in patient populations. The reliability of such protocols has not been established. The purpose was to examine test-retest reliability of balance testing with external perturbations. Healthy adults (n=34; mean age 23 years) underwent balance testing over two visits. Participants completed ten balance conditions in which the following parameters were combined: perturbation or non-perturbation, single or double leg, and eyes open or closed. Three trials were collected for each condition. Data were collected on a force plate and external perturbations were applied by translating the plate. Force plate center of pressure (CoP) data were summarized using 13 different CoP measures. Test-retest reliability was examined using intraclass correlation coefficients (ICC) and Bland-Altman plots. CoP measures of total speed and excursion in both anterior-posterior and medial-lateral directions generally had acceptable ICC values for perturbation conditions (ICC=0.46 to 0.87); however, many other CoP measures (e.g. range, area of ellipse) had unacceptable test-retest reliability (ICC<0.70). Improved CoP measures were present on the second visit indicating a potential learning effect. Non-perturbation conditions generally produced more reliable CoP measures than perturbation conditions during double leg standing, but not single leg standing. Therefore, changes to balance testing protocols that include external perturbations should be made to improve test-retest reliability and diminish learning including more extensive participant training and increasing the number of trials. CoP measures that consider all data points (e.g. total speed) are more reliable than those that only consider a few data points. Copyright © 2017 Elsevier B.V. All rights reserved.
Prediction models for intracranial hemorrhage or major bleeding in patients on antiplatelet therapy: a systematic review and external validation study.

PubMed

Hilkens, N A; Algra, A; Greving, J P

2016-01-01

ESSENTIALS: Prediction models may help to identify patients at high risk of bleeding on antiplatelet therapy. We identified existing prediction models for bleeding and validated them in patients with cerebral ischemia. Five prediction models were identified, all of which had some methodological shortcomings. Performance in patients with cerebral ischemia was poor. Background Antiplatelet therapy is widely used in secondary prevention after a transient ischemic attack (TIA) or ischemic stroke. Bleeding is the main adverse effect of antiplatelet therapy and is potentially life threatening. Identification of patients at increased risk of bleeding may help target antiplatelet therapy. This study sought to identify existing prediction models for intracranial hemorrhage or major bleeding in patients on antiplatelet therapy and evaluate their performance in patients with cerebral ischemia. We systematically searched PubMed and Embase for existing prediction models up to December 2014. The methodological quality of the included studies was assessed with the CHARMS checklist. Prediction models were externally validated in the European Stroke Prevention Study 2, comprising 6602 patients with a TIA or ischemic stroke. We assessed discrimination and calibration of included prediction models. Five prediction models were identified, of which two were developed in patients with previous cerebral ischemia. Three studies assessed major bleeding, one studied intracerebral hemorrhage and one gastrointestinal bleeding. None of the studies met all criteria of good quality. External validation showed poor discriminative performance, with c-statistics ranging from 0.53 to 0.64 and poor calibration. A limited number of prediction models is available that predict intracranial hemorrhage or major bleeding in patients on antiplatelet therapy. The methodological quality of the models varied, but was generally low. Predictive performance in patients with cerebral ischemia was poor. 
In order to reliably predict the risk of bleeding in patients with cerebral ischemia, development of a prediction model according to current methodological standards is needed. © 2015 International Society on Thrombosis and Haemostasis.
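The c-statistics quoted in this record (0.53 to 0.64) measure discrimination: the probability that a randomly chosen patient with the outcome received a higher predicted risk than a randomly chosen patient without it. The following is a minimal sketch of that pairwise definition, not the authors' code; the function name `c_statistic` and the toy inputs are illustrative assumptions.

```python
def c_statistic(scores, outcomes):
    """Concordance (c) statistic for a binary outcome.

    scores   -- predicted risks, one per patient (hypothetical values)
    outcomes -- 1 if the event occurred, 0 otherwise
    """
    events = [s for s, y in zip(scores, outcomes) if y == 1]
    non_events = [s for s, y in zip(scores, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")

    concordant = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                concordant += 1.0   # event ranked above non-event
            elif e == n:
                concordant += 0.5   # ties count as half
    return concordant / (len(events) * len(non_events))


# Toy data: 3 of the 4 event/non-event pairs are ranked correctly.
example = c_statistic([0.10, 0.40, 0.35, 0.80], [0, 0, 1, 1])
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which is why values in the 0.5 to 0.65 range are read as poor discrimination.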
Qualitative and Semiquantitative Assessment of Exposure to Engineered Nanomaterials within the French EpiNano Program: Inter- and Intramethod Reliability Study.

PubMed

Guseva Canu, Irina; Jezewski-Serra, Delphine; Delabre, Laurène; Ducamp, Stéphane; Iwatsubo, Yuriko; Audignon-Durand, Sabine; Ducros, Cécile; Radauceanu, Anca; Durand, Catherine; Witschger, Olivier; Flahaut, Emmanuel

2017-01-01

The relatively recent development of industries working with nanomaterials has created challenges for exposure assessment. In this article, we propose a relatively simple approach to assessing nanomaterial exposures for the purposes of epidemiological studies of workers in these industries. This method consists of an onsite industrial hygiene visit of facilities carried out individually and a description of workstations where nano-objects and their agglomerates and aggregates (NOAA) are present using a standardized tool, the Onsite technical logbook. To assess its reliability, we implemented this approach for assessing exposure to NOAA in workplaces at seven workstations which synthesize and functionalize carbon nanotubes. The prediction of exposure to NOAA using this method exhibited substantial agreement with that of the reference method, the latter being based on an onsite group visit, an expert's report and exposure measurements (Cohen kappa = 0.70, sensitivity = 0.88, specificity = 0.92). Intramethod comparison of results for exposure prediction showed moderate agreement between the three evaluators (two program team evaluators and one external evaluator) (weighted Fleiss kappa = 0.60, P = 0.003). Interevaluator reliability of the semiquantitative exposure characterization results was excellent between the two evaluators from the program team (Spearman rho = 0.93, P = 0.03) and fair when these two evaluators' results were compared with the external evaluator's results. The project was undertaken within the framework of the French epidemiological surveillance program EpiNano. This study allowed a first reliability assessment of the EpiNano method. However, to further validate this method a comparison with robust quantitative exposure measurement data is necessary. © The Author 2017. Published by Oxford University Press on behalf of the British Occupational Hygiene Society.
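The agreement figures in this record (Cohen kappa, sensitivity, specificity) can all be derived from a 2x2 cross-classification of a candidate method against a reference method. The sketch below assumes binary exposed/not-exposed calls; the function name `agreement_stats` and the example labels are hypothetical and are not the study's data.

```python
def agreement_stats(reference, predicted):
    """Return (kappa, sensitivity, specificity) for two binary ratings.

    reference -- calls from the reference method (1 = exposed, 0 = not)
    predicted -- calls from the method being evaluated
    """
    pairs = list(zip(reference, predicted))
    tp = sum(1 for r, p in pairs if r == 1 and p == 1)
    tn = sum(1 for r, p in pairs if r == 0 and p == 0)
    fp = sum(1 for r, p in pairs if r == 0 and p == 1)
    fn = sum(1 for r, p in pairs if r == 1 and p == 0)
    n = tp + tn + fp + fn

    observed = (tp + tn) / n
    # Agreement expected by chance, from the marginal totals.
    expected = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / n ** 2
    kappa = (observed - expected) / (1 - expected)

    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return kappa, sensitivity, specificity


# Hypothetical calls for eight workstations.
kappa, sens, spec = agreement_stats(
    [1, 1, 1, 1, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0, 0, 1],
)
```

Kappa corrects raw agreement for the agreement expected by chance, so it can be much lower than the percentage of matching calls when one category dominates.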
Are Validity and Reliability "Relevant" in Qualitative Evaluation Research?

ERIC Educational Resources Information Center

Goodwin, Laura D.; Goodwin, William L.

1984-01-01

The views of prominent qualitative methodologists on the appropriateness of validity and reliability estimation for the measurement strategies employed in qualitative evaluations are summarized. A case is made for the relevance of validity and reliability estimation. Definitions of validity and reliability for qualitative measurement are presented…
Temporal and external validation of a prediction model for adverse outcomes among inpatients with diabetes.

PubMed

Adderley, N J; Mallett, S; Marshall, T; Ghosh, S; Rayman, G; Bellary, S; Coleman, J; Akiboye, F; Toulis, K A; Nirantharakumar, K

2018-06-01

To temporally and externally validate our previously developed prediction model, which used data from University Hospitals Birmingham to identify inpatients with diabetes at high risk of adverse outcome (mortality or excessive length of stay), in order to demonstrate its applicability to other hospital populations within the UK. Temporal validation was performed using data from University Hospitals Birmingham and external validation was performed using data from both the Heart of England NHS Foundation Trust and Ipswich Hospital. All adult inpatients with diabetes were included. Variables included in the model were age, gender, ethnicity, admission type, intensive therapy unit admission, insulin therapy, albumin, sodium, potassium, haemoglobin, C-reactive protein, estimated GFR and neutrophil count. Adverse outcome was defined as excessive length of stay or death. Model discrimination in the temporal and external validation datasets was good. In temporal validation using data from University Hospitals Birmingham, the area under the curve was 0.797 (95% CI 0.785-0.810), sensitivity was 70% (95% CI 67-72) and specificity was 75% (95% CI 74-76). In external validation using data from Heart of England NHS Foundation Trust, the area under the curve was 0.758 (95% CI 0.747-0.768), sensitivity was 73% (95% CI 71-74) and specificity was 66% (95% CI 65-67). In external validation using data from Ipswich, the area under the curve was 0.736 (95% CI 0.711-0.761), sensitivity was 63% (95% CI 59-68) and specificity was 69% (95% CI 67-72). These results were similar to those for the internally validated model derived from University Hospitals Birmingham. The prediction model to identify patients with diabetes at high risk of developing an adverse event while in hospital performed well in temporal and external validation. The externally validated prediction model is a novel tool that can be used to improve care pathways for inpatients with diabetes. 
Further research to assess clinical utility is needed. © 2018 Diabetes UK.
Microstructural Modeling of Brittle Materials for Enhanced Performance and Reliability.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Teague, Melissa Christine; Rodgers, Theron

Brittle failure is often influenced by difficult to measure and variable microstructure-scale stresses. Recent advances in photoluminescence spectroscopy (PLS), including improved confocal laser measurement and rapid spectroscopic data collection, have established the potential to map stresses with microscale spatial resolution (<2 microns). Advanced PLS was successfully used to investigate both residual and externally applied stresses in polycrystalline alumina at the microstructure scale. The measured average stresses matched those estimated from beam theory to within one standard deviation, validating the technique. Modeling the residual stresses within the microstructure produced general agreement in comparison with the experimentally measured results. Microstructure scale modeling is primed to take advantage of advanced PLS to enable its refinement and validation, eventually enabling microstructure modeling to become a predictive tool for brittle materials.

A Rapid Assessment Tool for affirming good practice in midwifery education programming.

PubMed

Fullerton, Judith T; Johnson, Peter; Lobe, Erika; Myint, Khine Haymar; Aung, Nan Nan; Moe, Thida; Linn, Nay Aung

2016-03-01

To design a criterion-referenced assessment tool that could be used globally in a rapid assessment of good practices and bottlenecks in midwifery education programs. A standard tool development process was followed to generate standards and reference criteria, followed by external review and field testing to document psychometric properties. Review of standards and scoring criteria was conducted by stakeholders around the globe. Field testing of the tool was conducted in Myanmar. Eleven of Myanmar's 22 midwifery education programs participated in the assessment. The clinimetric tool was demonstrated to have content validity and high inter-rater reliability in use. A globally validated tool and an accompanying user guide and handbook are now available for conducting rapid assessments of compliance with good practice criteria in midwifery education programming. Copyright © 2016 The Authors. Published by Elsevier Ltd. All rights reserved.
Reliability and validity in a nutshell.

PubMed

Bannigan, Katrina; Watson, Roger

2009-12-01

To explore and explain the different concepts of reliability and validity as they are related to measurement instruments in social science and health care. There are different concepts contained in the terms reliability and validity and these are often explained poorly and there is often confusion between them. To develop some clarity about reliability and validity a conceptual framework was built based on the existing literature. The concepts of reliability, validity and utility are explored and explained. Reliability contains the concepts of internal consistency and stability and equivalence. Validity contains the concepts of content, face, criterion, concurrent, predictive, construct, convergent (and divergent), factorial and discriminant. In addition, for clinical practice and research, it is essential to establish the utility of a measurement instrument. To use measurement instruments appropriately in clinical practice, the extent to which they are reliable, valid and usable must be established.
If It Doesn't Work, Why Do We Still Do It? The Continuing Use of Subtalar Joint Neutral Theory in the Face of Overpowering Critical Research.

PubMed

Harradine, Paul; Gates, Lucy; Bowen, Catherine

2018-03-01

The use of subtalar joint neutral (STJN) in the assessment and treatment of foot-related musculoskeletal symptomology is common in daily practice and still widely taught. The main pioneer of this theory was Dr Merton L. Root, and it has been labeled with a variety of names: "the foot morphology theory," "the subtalar joint neutral theory," or simply "Rootian theory" or "Root model." The theory's core concepts still underpin a common approach to musculoskeletal assessment of the foot, as well as the consequent design of foot orthoses. The available literature continues to point to Dr Root's theory as the most prevalently utilized. Concurrently, the worth of this theory has been challenged due to its poor reliability and limited external validity. This Viewpoint reviews the main clinical areas of the STJN theory, and concludes with a possible explanation and concerns for its ongoing use. To support our view, we will discuss (1) historical inaccuracies, (2) challenges with reliability, and (3) concerns with validity. J Orthop Sports Phys Ther 2018;48(3):130-132. doi:10.2519/jospt.2018.0604.
Assessing individual differences in proneness to shame and guilt: development of the Self-Conscious Affect and Attribution Inventory.

PubMed

Tangney, J P

1990-07-01

Individual differences in proneness to shame and proneness to guilt are thought to play an important role in the development of both adaptive and maladaptive interpersonal and intrapersonal processes. But little empirical research has addressed these issues, largely because no reliable, valid measure has been available to researchers interested in differentiating proneness to shame from proneness to guilt. The Self-Conscious Affect and Attribution Inventory (SCAAI) was developed to assess characteristic affective, cognitive, and behavioral responses associated with shame and guilt among a young adult population. The SCAAI also includes indices of externalization of cause or blame, detachment/unconcern, pride in self, and pride in behavior. Data from 3 independent studies of college students and 1 study of noncollege adults provide support for the reliability of the main SCAAI subscales. Moreover, the pattern of relations among the SCAAI subscales and the relation of SCAAI subscales to 2 extant measures of shame and guilt support the validity of this new measure. The SCAAI appears to provide related but functionally distinct indices of proneness to shame and guilt in a way that these previous measures have not.
Benchmarking Treatment Response in Tourette's Disorder: A Psychometric Evaluation and Signal Detection Analysis of the Parent Tic Questionnaire.

PubMed

Ricketts, Emily J; McGuire, Joseph F; Chang, Susanna; Bose, Deepika; Rasch, Madeline M; Woods, Douglas W; Specht, Matthew W; Walkup, John T; Scahill, Lawrence; Wilhelm, Sabine; Peterson, Alan L; Piacentini, John

2018-01-01

This study assessed the psychometric properties of a parent-reported tic severity measure, the Parent Tic Questionnaire (PTQ), and used the scale to establish guidelines for delineating clinically significant tic treatment response. Participants were 126 children ages 9 to 17 who participated in a randomized controlled trial of Comprehensive Behavioral Intervention for Tics (CBIT). Tic severity was assessed using the Yale Global Tic Severity Scale (YGTSS), Hopkins Motor/Vocal Tic Scale (HMVTS) and PTQ; positive treatment response was defined by a score of 1 (very much improved) or 2 (much improved) on the Clinical Global Impressions - Improvement (CGI-I) scale. Cronbach's alpha and intraclass correlations (ICC) assessed internal consistency and test-retest reliability, with correlations evaluating validity. Receiver- and Quality-Receiver Operating Characteristic analyses assessed the efficiency of percent and raw-reduction cutoffs associated with positive treatment response. The PTQ demonstrated good internal consistency (α = 0.80 to 0.86), excellent test-retest reliability (ICC = .84 to .89), good convergent validity with the YGTSS and HM/VTS, and good discriminant validity from hyperactive, obsessive-compulsive, and externalizing (i.e., aggression and rule-breaking) symptoms. A 55% reduction and 10-point decrease in PTQ Total score were optimal for defining positive treatment response. Findings help standardize tic assessment and provide clinicians with greater clarity in determining clinically meaningful tic symptom change during treatment. Copyright © 2017. Published by Elsevier Ltd.
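Internal consistency in this record is reported as Cronbach's alpha. As a worked illustration of the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), here is a minimal sketch; the function name `cronbach_alpha` and the toy responses are assumptions for illustration and are unrelated to the PTQ data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a respondents-by-items score table.

    responses -- list of rows, one per respondent, each a list of k item scores
    """
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")

    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in responses]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in responses])

    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Toy data: three items that move together perfectly across respondents.
alpha = cronbach_alpha([[1, 1, 1], [2, 2, 2], [3, 3, 3]])
```

Alpha rises when items covary strongly relative to their individual variances, which is why it is read as a measure of how consistently the items track one underlying score.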
A reliable DNA barcode reference library for the identification of the North European shelf fish fauna.

PubMed

Knebelsberger, Thomas; Landi, Monica; Neumann, Hermann; Kloppmann, Matthias; Sell, Anne F; Campbell, Patrick D; Laakmann, Silke; Raupach, Michael J; Carvalho, Gary R; Costa, Filipe O

2014-09-01

Valid fish species identification is an essential step both for fundamental science and fisheries management. The traditional identification is mainly based on external morphological diagnostic characters, leading to inconsistent results in many cases. Here, we provide a sequence reference library based on mitochondrial cytochrome c oxidase subunit I (COI) for a valid identification of 93 North Atlantic fish species originating from the North Sea and adjacent waters, including many commercially exploited species. Neighbour-joining analysis based on K2P genetic distances formed nonoverlapping clusters for all species with a ≥99% bootstrap support each. Identification was successful for 100% of the species as the minimum genetic distance to the nearest neighbour always exceeded the maximum intraspecific distance. A barcoding gap was apparent for the whole data set. Within-species distances ranged from 0 to 2.35%, while interspecific distances varied between 3.15 and 28.09%. Distances between congeners were on average 51-fold higher than those within species. The validation of the sequence library by applying BOLD's barcode index number (BIN) analysis tool and a ranking system demonstrated high taxonomic reliability of the DNA barcodes for 85% of the investigated fish species. Thus, the sequence library presented here can be confidently used as a benchmark for identification of at least two-thirds of the typical fish species recorded for the North Sea. © 2014 John Wiley & Sons Ltd.
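The K2P (Kimura two-parameter) distance and the barcoding-gap criterion mentioned in this record can be written down compactly. The sketch below uses the standard K2P formula d = -0.5 * ln((1 - 2P - Q) * sqrt(1 - 2Q)), where P and Q are the proportions of transitions and transversions between two aligned sequences. The function names and the short test sequences are illustrative assumptions; only the distance ranges in the last call come from the abstract.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")

    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1    # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


def has_barcoding_gap(intraspecific, interspecific):
    """True if every between-species distance exceeds every within-species one."""
    return min(interspecific) > max(intraspecific)


# Ranges reported in the abstract, expressed as proportions.
gap = has_barcoding_gap([0.0, 0.0235], [0.0315, 0.2809])
```

With the reported ranges the smallest interspecific distance (3.15%) exceeds the largest intraspecific distance (2.35%), which is the condition the abstract describes as a barcoding gap.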
A new framework to enhance the interpretation of external validation studies of clinical prediction models.

PubMed

Debray, Thomas P A; Vergouwe, Yvonne; Koffijberg, Hendrik; Nieboer, Daan; Steyerberg, Ewout W; Moons, Karel G M

2015-03-01

It is widely acknowledged that the performance of diagnostic and prognostic prediction models should be assessed in external validation studies with independent data from "different but related" samples as compared with that of the development sample. We developed a framework of methodological steps and statistical methods for analyzing and enhancing the interpretation of results from external validation studies of prediction models. We propose to quantify the degree of relatedness between development and validation samples on a scale ranging from reproducibility to transportability by evaluating their corresponding case-mix differences. We subsequently assess the models' performance in the validation sample and interpret the performance in view of the case-mix differences. Finally, we may adjust the model to the validation setting. We illustrate this three-step framework with a prediction model for diagnosing deep venous thrombosis using three validation samples with varying case mix. While one external validation sample merely assessed the model's reproducibility, two other samples rather assessed model transportability. The performance in all validation samples was adequate, and the model did not require extensive updating to correct for miscalibration or poor fit to the validation settings. The proposed framework enhances the interpretation of findings at external validation of prediction models. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.
Development and psychometric validation of the Nausea/Vomiting Symptom Assessment patient-reported outcome (PRO) instrument for adults with secondary hyperparathyroidism.

PubMed

McHorney, Colleen A; Bensink, Mark E; Burke, Laurie B; Belozeroff, Vasily; Gwaltney, Chad

2017-01-01

We developed the Nausea/Vomiting Symptom Assessment (NVSA©) patient-reported outcome (PRO) instrument to capture patients' experience with nausea and vomiting while on calcimimetic therapy to treat secondary hyperparathyroidism (SHPT) related to end-stage kidney disease. This report summarizes the content validity and psychometric validation of the NVSA©. The two NVSA© items were drafted by two health outcomes researchers, one medical development lead, and one regulatory lead; the instrument yields three scores: the number of days of vomiting or nausea per week, the number of vomiting episodes per week, and the mean severity of nausea. An eight-week prospective observational study was conducted at ten dialysis centers in the U.S. with 91 subjects. Criterion measures included in the study were the Functional Living Index-Emesis, Kidney Disease Quality of Life Instrument, EQ-5D-5L, Static Patient Global Assessment, and Patient Global Rating of Change. Analyses included assessment of score distributions, convergent and known-groups validity, test-retest reliability, ability to detect change, and thresholds for meaningful change. Qualitative interviews verified that the NVSA© captures relevant aspects of nausea and vomiting. Patients understood the NVSA© instructions, items, and response scales. Correlations between the NVSA© and related and unrelated measures indicated strong convergent and discriminant validity, respectively. Mean differences between externally-defined vomiting/nausea groups supported known-groups validity. The scores were stable in subjects who reported no change on the Patient Global Rating of Change, indicating sufficient test-retest reliability. The no-change group had mean differences and effect sizes close to zero; mean differences were mostly positive for a worsening group and mostly negative for the improvement group with predominantly medium or large effect sizes. 
Preliminary thresholds for meaningful worsening were 0.90 days for number of days of vomiting or nausea per week, 1.20 for number of episodes of vomiting per week, and 0.40 for mean severity of nausea. The NVSA© instrument demonstrated content validity, convergent and known-groups validity, test-retest reliability, and the ability to detect change. Preliminary thresholds for minimally important change should be further refined with additional interventional research. The NVSA© may be used to support study endpoints in clinical trials comparing the nausea/vomiting profile of novel SHPT therapies.
Beware of external validation! - A Comparative Study of Several Validation Techniques used in QSAR Modelling.

PubMed

Majumdar, Subhabrata; Basak, Subhash C

2018-04-26

Proper validation is an important aspect of QSAR modelling. External validation is one of the widely used validation methods in QSAR where the model is built on a subset of the data and validated on the rest of the samples. However, its effectiveness for datasets with a small number of samples but large number of predictors remains suspect. Calculating hundreds or thousands of molecular descriptors using currently available software has become the norm in QSAR research, owing to computational advances in the past few decades. Thus, for n chemical compounds and p descriptors calculated for each molecule, the typical chemometric dataset today has high value of p but small n (i.e. n < p). Motivated by the evidence of inadequacies of external validation in estimating the true predictive capability of a statistical model in recent literature, this paper performs an extensive and comparative study of this method with several other validation techniques. We compared four validation methods: leave-one-out, K-fold, external and multi-split validation, using statistical models built using the LASSO regression, which simultaneously performs variable selection and modelling. We used 300 simulated datasets and one real dataset of 95 congeneric amine mutagens for this evaluation. External validation metrics have high variation among different random splits of the data, hence are not recommended for predictive QSAR models. LOO has the overall best performance among all validation methods applied in our scenario. Results from external validation are too unstable for the datasets we analyzed. Based on our findings, we recommend using the LOO procedure for validating QSAR predictive models built on high-dimensional small-sample data. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.
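The contrast drawn in this record, a single random train/test split ("external validation" in the abstract's sense) versus leave-one-out (LOO), can be illustrated on a toy problem. The sketch below is an assumption-laden simplification: it swaps in one-predictor ordinary least squares for the LASSO models used in the study, uses simulated data, and all function names are hypothetical. It only shows why split-based error estimates depend on the particular split while LOO yields one deterministic estimate.

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def mse(model, xs, ys):
    """Mean squared prediction error of a fitted line on (xs, ys)."""
    slope, intercept = model
    return mean([(y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys)])


def loo_mse(xs, ys):
    """Leave-one-out: refit without each point, then predict that point."""
    errors = []
    for i in range(len(xs)):
        model = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        errors.append(mse(model, [xs[i]], [ys[i]]))
    return mean(errors)


def split_mse(xs, ys, seed, test_fraction=0.25):
    """One random train/test split; the estimate depends on the seed."""
    rng = random.Random(seed)
    idx = list(range(len(xs)))
    rng.shuffle(idx)
    n_test = max(1, int(len(xs) * test_fraction))
    test, train = idx[:n_test], idx[n_test:]
    model = fit_line([xs[i] for i in train], [ys[i] for i in train])
    return mse(model, [xs[i] for i in test], [ys[i] for i in test])


# Simulated small-sample data (illustrative only).
data_rng = random.Random(0)
xs = [float(i) for i in range(20)]
ys = [2.0 * x + data_rng.gauss(0, 3) for x in xs]

split_estimates = [split_mse(xs, ys, seed) for seed in range(10)]
loo_estimate = loo_mse(xs, ys)
```

Comparing `min(split_estimates)` and `max(split_estimates)` against `loo_estimate` gives a rough feel for the split-to-split variation the authors describe; it does not reproduce their high-dimensional (n < p) setting.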
Brief report: the Utrecht-Management of Identity Commitments Scale (U-MICS): gender and age measurement invariance and convergent validity of the Turkish version.

PubMed

Morsunbul, Umit; Crocetti, Elisabetta; Cok, Figen; Meeus, Wim

2014-08-01

The purpose of this study was to evaluate the factor structure and convergent validity of the Turkish version of the Utrecht-Management of Identity Commitments Scale (U-MICS). Participants were 1201 (59.6% females) youth aged between 12 and 24 years (M(age) = 17.53 years, SD(age) = 3.25). Results indicated that the three-factor model consisting of commitment, in-depth exploration, and reconsideration of commitment provided a very good fit to the data and applied equally well to boys and girls as well as to three age groups (early adolescents, middle adolescents, and emerging adults). Significant relations between identity processes and self-concept clarity, personality, internalizing and externalizing problem behaviors, and parental relationships supported convergent validity. Thus, the Turkish version of U-MICS is a reliable tool for assessing identity in Turkish-speaking respondents. Copyright © 2014 The Foundation for Professionals in Services for Adolescents. Published by Elsevier Ltd. All rights reserved.
Validation of a short Korean version of the UPPS-P Impulsive Behavior Scale.

PubMed

Lim, Sun Young; Kim, Seog Ju

2018-04-23

The purpose of the present study was to validate a Korean version of the short UPPS-P Impulsive Behavior Scale (UPPS-P). This study included 724 undergraduate students who completed the following questionnaires: the Korean UPPS-P, Beck Depression Inventory, State-Trait Anxiety Inventory, Eating Disorder Inventory-2, Alcohol Use Disorder Identification Test, and Canadian Problem Gambling Index. A confirmatory factor analysis supported a 5-factor interrelated model. The internal consistency coefficients for the 5 factors of the short Korean UPPS-P were acceptable (.65-.78 across the subscales), and the subscales of the short Korean UPPS-P were strongly correlated with the long UPPS-P subscales. External validity was demonstrated by associations between the subfactors of impulsivity and various psychopathologies, including depression, anxiety, binge eating, alcohol abuse, and gambling. The present results indicate that the short Korean version of the UPPS-P may be a useful and reliable alternative to the original long-form UPPS-P. © 2018 John Wiley & Sons Australia, Ltd.
Assessing Jail Inmates’ Proneness to Shame and Guilt: Feeling Bad About the Behavior or the Self?

PubMed Central

Tangney, June P.; Stuewig, Jeffrey; Mashek, Debra; Hastings, Mark

2011-01-01

This study of 550 jail inmates (379 male and 171 female) held on felony charges examines the reliability and validity of the Test of Self-Conscious Affect–Socially Deviant Version (TOSCA-SD; Hanson & Tangney, 1996) as a measure of offenders’ proneness to shame and proneness to guilt. Discriminant validity (e.g., vis-à-vis self-esteem, negative affect, social desirability/impression management) and convergent validity (e.g., vis-à-vis correlations with empathy, externalization of blame, anger, psychological symptoms, and substance use problems) were supported, paralleling results from community samples. Further, proneness to shame and guilt were differentially related to widely used risk measures from the field of criminal justice (e.g., criminal history, psychopathy, violence risk, antisocial personality). Guilt-proneness appears to be a protective factor, whereas there was no evidence that shame-proneness serves an inhibitory function. Subsequent analyses indicate these findings generalize quite well across gender and race. Implications for intervention and sentencing practices are discussed. PMID:21743757
External validation of Global Evaluative Assessment of Robotic Skills (GEARS).

PubMed

Aghazadeh, Monty A; Jayaratna, Isuru S; Hung, Andrew J; Pan, Michael M; Desai, Mihir M; Gill, Inderbir S; Goh, Alvin C

2015-11-01

We demonstrate the construct validity, reliability, and utility of Global Evaluative Assessment of Robotic Skills (GEARS), a clinical assessment tool designed to measure robotic technical skills, in an independent cohort using an in vivo animal training model. Using a cross-sectional observational study design, 47 voluntary participants were categorized as experts (>30 robotic cases completed as primary surgeon) or trainees. The trainee group was further divided into intermediates (≥5 but ≤30 cases) or novices (<5 cases). All participants completed a standardized in vivo robotic task in a porcine model. Task performance was evaluated by two expert robotic surgeons and self-assessed by the participants using the GEARS assessment tool. Kruskal-Wallis test was used to compare the GEARS performance scores to determine construct validity; Spearman's rank correlation measured interobserver reliability; and Cronbach's alpha was used to assess internal consistency. Performance evaluations were completed on nine experts and 38 trainees (14 intermediate, 24 novice). Experts demonstrated superior performance compared to intermediates and novices overall and in all individual domains (p < 0.0001). In comparing intermediates and novices, the overall performance difference trended toward significance (p = 0.0505), while the individual domains of efficiency and autonomy were significantly different between groups (p = 0.0280 and 0.0425, respectively). Interobserver reliability between expert ratings was confirmed with a strong correlation observed (r = 0.857, 95% CI [0.691, 0.941]). Expert and participant scoring showed less agreement (r = 0.435, 95% CI [0.121, 0.689] and r = 0.422, 95% CI [0.081, 0.672]). Internal consistency was excellent for experts and participants (α = 0.96, 0.98, 0.93). In an independent cohort, GEARS was able to differentiate between different robotic skill levels, demonstrating excellent construct validity.
As a standardized assessment tool, GEARS maintained consistency and reliability for an in vivo robotic surgical task and may be applied for skills evaluation in a broad range of robotic procedures.
Bumper and grille airbags concept for enhanced vehicle compatibility in side impact: phase II.

PubMed

Barbat, Saeed; Li, Xiaowei; Prasad, Priya

2013-01-01

Fundamental physics and numerous field studies have shown a higher injury and fatality risk for occupants in smaller and lighter vehicles when struck by heavier, taller and higher vehicles. The consensus is that the significant parameters influencing compatibility in front-to-side crashes are geometric interaction, vehicle stiffness, and vehicle mass. The objective of this research is to develop a concept of deployable bumper and grille airbags for improved vehicle compatibility in side impact. The external airbags, deployed upon signals from sensors, may help mitigate the effect of weight, geometry and stiffness differences and reduce side intrusions. However, a highly reliable pre-crash sensing system is required to enable the reliable deployment, which is currently not technologically feasible. Analytical and numerical methods and hardware testing were used to help develop the deployable external airbags concept. Various Finite Element (FE) models at different stages were developed and an extensive number of iterations were conducted to help optimize airbag and inflator parameters to achieve desired targets. The concept development was executed and validated in two phases. This paper covers Phase II ONLY, which includes: (1) Re-design of the airbag geometry, pressure, and deployment strategies; (2) Further validation using a Via sled test of a 48 kph perpendicular side impact of an SUV-type impactor against a stationary car with US-SID-H3 crash dummy in the struck side; (3) Design of the reaction surface necessary for the bumper airbag functionality. The concept was demonstrated through live deployment of external airbags with a reaction surface in a full-scale perpendicular side impact of an SUV against a stationary passenger car at 48 kph. This research investigated only the concept of the inflatable devices since pre-crash sensing development was beyond the scope of this research. 
The concept design parameters of the bumper and grille airbags are presented in this paper. Full vehicle-to-vehicle crash test results, Via sled test, and simulation results are also presented. Head peak acceleration, Head Injury Criteria (HIC), Thoracic Trauma Index (TTI), and Pelvic acceleration for the SID-H3 dummy and structural intrusion profiles were used as performance metrics for the bumper and grille airbags. Results obtained from the Via sled tests and the full vehicle-to-vehicle tests with bumper and grille airbags were compared to those of baseline test results with no external airbags.
Early detection of lung cancer recurrence after stereotactic ablative radiation therapy: radiomics system design

NASA Astrophysics Data System (ADS)

Dammak, Salma; Palma, David; Mattonen, Sarah; Senan, Suresh; Ward, Aaron D.

2018-02-01

Stereotactic ablative radiotherapy (SABR) is the standard treatment recommendation for Stage I non-small cell lung cancer (NSCLC) patients who are inoperable or who refuse surgery. This option is well tolerated by even unfit patients and has a low recurrence risk post-treatment. However, SABR induces changes in the lung parenchyma that can appear similar to those of recurrence, and the difference between the two at an early follow-up time point is not easily distinguishable for an expert physician. We hypothesized that a radiomics signature derived from standard-of-care computed tomography (CT) imaging can detect cancer recurrence within six months of SABR treatment. This study reports on the design phase of our work, with external validation planned in future work. In this study, we performed cross-validation experiments with four feature selection approaches and seven classifiers on an 81-patient data set. We extracted 104 radiomics features from the consolidative and the peri-consolidative regions on the follow-up CT scans. The best results were achieved using the sum of estimated Mahalanobis distances (Maha) for supervised forward feature selection and a trainable automatic radial basis support vector classifier (RBSVC). This system produced an area under the receiver operating characteristic curve (AUC) of 0.84, an error rate of 16.4%, a false negative rate of 12.7%, and a false positive rate of 20.0% for leave-one-patient-out cross-validation. This suggests that once validated on an external data set, radiomics could reliably detect post-SABR recurrence and form the basis of a tool assisting physicians in making salvage treatment decisions.
Just add water: Accuracy of analysis of diluted human milk samples using mid-infrared spectroscopy.

PubMed

Smith, R W; Adamkin, D H; Farris, A; Radmacher, P G

2017-01-01

To determine the maximum dilution of human milk (HM) that yields reliable results for protein, fat and lactose when analyzed by mid-infrared spectroscopy. De-identified samples of frozen HM were obtained. Milk was thawed and warmed (40°C) prior to analysis. Undiluted (native) HM was analyzed by mid-infrared spectroscopy for macronutrient composition: total protein (P), fat (F), carbohydrate (C); Energy (E) was calculated from the macronutrient results. Subsequent analyses were done with 1 : 2, 1 : 3, 1 : 5 and 1 : 10 dilutions of each sample with distilled water. Additional samples were sent to a certified lab for external validation. Quantitatively, F and P showed statistically significant but clinically non-critical differences in 1 : 2 and 1 : 3 dilutions. Differences at higher dilutions were statistically significant and deviated from native values enough to render those dilutions unreliable. External validation studies also showed statistically significant but clinically unimportant differences at 1 : 2 and 1 : 3 dilutions. The Calais Human Milk Analyzer can be used with HM samples diluted 1 : 2 and 1 : 3 and return results within 5% of values from undiluted HM. At a 1 : 5 or 1 : 10 dilution, however, results vary as much as 10%, especially with P and F. At the 1 : 2 and 1 : 3 dilutions these differences appear to be insignificant in the context of nutritional management. However, the accuracy and reliability of the 1 : 5 and 1 : 10 dilutions are questionable.
The WOMB (Women's views of birth) antenatal satisfaction questionnaire: development, dimensions, internal reliability, and validity.

PubMed Central

Smith, L F

1999-01-01

BACKGROUND: Antenatal services continue to change, stimulated by the Changing Childbirth report. Women's views should be an important component of assessing the quality of such services. To date, no published quantitative multidimensional assessment instrument has been available to measure their satisfaction with care. AIM: To develop a valid, reliable, multidimensional questionnaire to assess quality of antenatal care. METHOD: A multidimensional satisfaction questionnaire was developed using psychometric methods. Following fieldwork to pilot a questionnaire, three successive versions of it were given by midwives to pregnant women in their final trimester in nine trusts in the old South Western region of England. Their replies were analysed by principal components analysis (PCA) with varimax rotation; internal reliability was assessed by Cronbach's alpha. Face, content, and construct validity were all assessed during development. RESULTS: Out of 196 women, 134 (68.4%) returned the pilot questionnaires. One hundred and seventy-two (57.3%) out of 300 women returned version 1 of the WOMB (WOMen's views of Birth) antenatal satisfaction questionnaire proper, 283 (56.6%) out of 500 returned version 2, and 328 (65.6%) out of 500 returned the final development version. This final version consisted of 11 dimensions in addition to a general satisfaction one. These were [Cronbach's alpha]: five related to antenatal clinic characteristics (travelling to clinic [0.75], waiting at clinic [0.90], clinic environment [0.69], timing of appointment [0.78], car parking [0.85]), three 'professional' characteristics (professional competence [0.80], knowing carers [0.79], information provided [0.81]), antenatal classes [0.76], social support from other pregnant women [0.83], checking for the baby's heart beat [0.63]. There were significant moderate correlations (range = 0.24 to 0.77) between individual dimensions and the general satisfaction dimension. 
Women's dimension scores were significantly related to age, parity, social class, and best educational achievement. CONCLUSION: This multidimensional satisfaction instrument has good face, content, and construct validity, and excellent internal reliability. It could be used to generally assess antenatal services or to screen them to detect areas where further in-depth qualitative enquiry is merited. Its sensitivity to change over time, external reliability, and transferability to non-Caucasian groups needs to be assessed. PMID:10824341
The Internet for neurosurgeons: current resources and future challenges.

PubMed

Hughes, Mark A; Brennan, Paul M

2011-06-01

Our professional and personal lives depend increasingly on access to information via the Internet. As an open access resource, the Internet is on the whole unbridled by censorship and can facilitate the rapid propagation of ideas and discoveries. At the same time, this liberty in sharing information, being unregulated and often free from external validation, can be oppressive; overloading the user and hindering effective decision-making. It can be difficult, if not impossible, to reliably ascertain the provenance of data and opinion. We must, therefore, discern what is useful, relevant, and above all reliable if we are to harness the Internet's potential to improve training, delivery of care, research, and provision of patient information. This article profiles the resources currently available to neurosurgeons, asks how we can sort the informational wheat from the chaff, and explores where future developments might further influence neurosurgical practice.
Structural exploration for the refinement of anticancer matrix metalloproteinase-2 inhibitor designing approaches through robust validated multi-QSARs

NASA Astrophysics Data System (ADS)

Adhikari, Nilanjan; Amin, Sk. Abdul; Saha, Achintya; Jha, Tarun

2018-03-01

Matrix metalloproteinase-2 (MMP-2) is a promising pharmacological target for designing potential anticancer drugs. MMP-2 plays a critical role in apoptosis by cleaving the DNA repair enzyme poly(ADP-ribose) polymerase (PARP). Moreover, MMP-2 expression triggers vascular endothelial growth factor (VEGF), which has a positive influence on tumor size, invasion, and angiogenesis. Therefore, there is an urgent need to develop potential MMP-2 inhibitors with no toxicity and better pharmacokinetic properties. In this article, robust validated multi-quantitative structure-activity relationship (QSAR) modeling approaches were attempted on a dataset of 222 MMP-2 inhibitors to explore the important structural and pharmacophoric requirements for higher MMP-2 inhibition. Different validated regression and classification-based QSARs, pharmacophore mapping and 3D-QSAR techniques were performed. These results were challenged and subjected to further validation by using them to explain 24 in-house MMP-2 inhibitors, as a further test of the reliability of these models. All these models were individually validated internally as well as externally and were supported and validated by each other. These results were further justified by molecular docking analysis. The modeling techniques adopted here help not only to explore the necessary structural and pharmacophoric requirements but also to validate and refine approaches for designing potential MMP-2 inhibitors.
Diagnosing soft tissue rheumatic disorders of the upper limb in epidemiological studies of vibration-exposed populations

PubMed Central

Palmer, Keith T

2013-01-01

Objectives To investigate approaches adopted to diagnose soft tissue rheumatic disorders of the upper limb (ULDs) in vibration-exposed populations and in other settings, and to compare their methodological qualities. Methods Systematic searches were made of the Medline, Embase, and CINAHL electronic bibliographic databases, and of various supplementary sources (textbooks, reviews, conference and workshop proceedings, personal files). For vibration-exposed populations, qualifying papers were scored in terms of the provenance of their measuring instruments (adequacy of documentation, standardisation, reliability, criterion-related and content validity). Similar criteria were applied to general proposals for whole diagnostic schemes, and evidence was collated on the test-retest reliability of symptom histories and clinical signs. Results In total, 23 relevant reports were identified concerning vibration-exposed populations - 21 involving symptoms and 9 involving examination/diagnosis. Most of the instruments employed scored poorly in terms of methodological quality. The search also identified, from the wider literature, more than a dozen schemes directed at classifying ULDs, and 18 studies of test-retest reliability of symptoms and physical signs in the upper limb. Findings support the use of the standardised Nordic questionnaire for symptom inquiry and suggest that a range of physical signs can be elicited with reasonable between-observer agreement. Four classification schemes rated well in terms of content validity. One of these had excellent documentation, and one had been tested for repeatability, agreement with an external reference standard, and utility in distinguishing groups that differed in disability, prognosis and associated risk factors. 
Conclusions Hitherto, most studies of ULDs in vibration-exposed populations have used custom-specified diagnostic methods, poorly documented, and non-stringent in terms of standardisation and supporting evidence of reliability and/or validity. The broader literature contains several question sets and procedures that improve upon this, and offer scope in vibration-exposed populations to diagnose ULDs more systematically. PMID:17909839

Ethical Implications of Validity-vs.-Reliability Trade-Offs in Educational Research

ERIC Educational Resources Information Center

Fendler, Lynn

2016-01-01

In educational research that calls itself empirical, the relationship between validity and reliability is that of trade-off: the stronger the bases for validity, the weaker the bases for reliability (and vice versa). Validity and reliability are widely regarded as basic criteria for evaluating research; however, there are ethical implications of…
What to Do With "Moderate" Reliability and Validity Coefficients?

PubMed

Post, Marcel W

2016-07-01

Clinimetric studies may use criteria for test-retest reliability and convergent validity such that correlation coefficients as low as .40 are supportive of reliability and validity. It can be argued that moderate (.40-.60) correlations should not be interpreted in this way and that reliability coefficients <.70 should be considered as indicative of unreliability. Convergent validity coefficients in the .40 to .60 or .40 to .70 range should be considered as indications of validity problems, or as inconclusive at best. Studies on reliability and convergent validity should be designed in such a way that it is realistic to expect high reliability and validity coefficients. Multitrait multimethod approaches are preferred to study construct (convergent-divergent) validity. Copyright © 2016 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.
Reliability, validity, and clinical use of the Dominic Interactive: a DSM-based, self-report screen for school-aged children.

PubMed

Bergeron, Lise; Berthiaume, Claude; St-Georges, Marie; Piché, Geneviève; Smolla, Nicole

2013-08-01

As no single informant can be considered the gold standard of child psychopathology, interviewing of children regarding their own symptoms is necessary. Our study focused on the reliability, validity, and clinical use of the Dominic Interactive (DI), a multimedia self-report screen to assess symptoms for the most frequent Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition, Text Revision, mental disorders in school-aged children. A sample of 585 children aged 6 to 11 years from the community and psychiatric clinics was used to analyze the internal consistency, the test-retest estimate of reliability, and the criterion-related validity of the DI against the referral status. In addition, cross-informant correlation coefficients between this instrument (child report) and the Child Symptom Inventory (parent report) were explored in a subsample of 292 participants. For the total sample, Cronbach alpha coefficients ranged from 0.63 to 0.91. Test-retest kappas varied from 0.42 to 0.62 for categories based on cut-off points, except for specific phobias. Intraclass correlation coefficients ranged from 0.70 to 0.81 for symptom scales. The DI discriminated between referred and non-referred children in psychiatric clinics for all symptom scales. Significant cross-informant correlation coefficients were higher for the externalizing symptoms (0.35 to 0.48) than the internalizing symptoms (0.14 to 0.27). Findings of our study reasonably support adequate psychometric properties of the DI. This instrument offers a developmentally sensitive screening method to obtain unique information from young children about their mental health problems in front-line services, psychiatric clinics, and research settings.
Validation of psychoanalytic theories: towards a conceptualization of references.

PubMed

Zachrisson, Anders; Zachrisson, Henrik Daae

2005-10-01

The authors discuss criteria for the validation of psychoanalytic theories and develop a heuristic and normative model of the references needed for this. Their core question in this paper is: can psychoanalytic theories be validated exclusively from within psychoanalytic theory (internal validation), or are references to sources of knowledge other than psychoanalysis also necessary (external validation)? They discuss aspects of the classic truth criteria correspondence and coherence, both from the point of view of contemporary psychoanalysis and of contemporary philosophy of science. The authors present arguments for both external and internal validation. Internal validation has to deal with the problems of subjectivity of observations and circularity of reasoning, external validation with the problem of relevance. They recommend a critical attitude towards psychoanalytic theories, which, by carefully scrutinizing weak points and invalidating observations in the theories, reduces the risk of wishful thinking. The authors conclude by sketching a heuristic model of validation. This model combines correspondence and coherence with internal and external validation into a four-leaf model for references for the process of validating psychoanalytic theories.
Focusing on the adult attachment style in schizophrenia in community mental health centres: validation of the Psychosis Attachment Measure (PAM) in a German-speaking sample.

PubMed

Kvrgic, Sara; Beck, Eva-Marina; Cavelti, Marialuisa; Kossowsky, Joe; Stieglitz, Rolf-Dieter; Vauth, Roland

2012-07-01

Assessing attachment style in people with schizophrenia may be important to identify a risk factor in building a strong therapeutic relationship and so indirectly to understand the development of mal-compliance as one of the major obstacles in the treatment of schizophrenia. The present study analysed the psychometric properties of the German version of the Psychosis Attachment Measure (PAM), which assesses avoidant and anxious attachment style. A sample of 127 patients suffering from chronic schizophrenia or schizoaffective disorder participated in this study. In testing discriminant validity, we assessed psychopathology, depression, therapeutic relationship and service engagement. Internal consistency, test-retest reliability and factor structure were analysed. The German version of PAM exhibited acceptable to good internal and test-retest reliabilities and the two-factor structure of the English version could be replicated. Avoidant attachment style was related to higher levels of positive symptoms and to a poorer therapeutic relationship. In the context of external validation, a regression analysis revealed that a poor therapeutic relationship correlated with avoidant attachment style, independent of anxious attachment style and depressive symptoms. Anxious attachment was associated with higher treatment adherence. Both insecure attachment styles (avoidant and anxious) were found to be correlated with higher levels of depression, but only attachment anxiety had an independent predictive value for self-reported depression in regression analysis. The German version of PAM displayed satisfactory psychometric properties and seems to be a reliable measure for assessing attachment style in individuals with schizophrenia. Validation of PAM led to the finding that only the avoidant attachment style might be a risk factor when building a strong therapeutic relationship in schizophrenia. 
In future studies, other factors influencing therapeutic relationship should be taken into account. Anxious attachment style may be a risk factor for depression, but it also has an enhancing effect on treatment adherence.
Development and validation of the Learning Disabilities Needs Assessment Tool (LDNAT), a HoNOS-based needs assessment tool for use with people with intellectual disability.

PubMed

Painter, J; Trevithick, L; Hastings, R P; Ingham, B; Roy, A

2016-12-01

In meeting the needs of individuals with intellectual disabilities (ID) who access health services, a brief, holistic assessment of need is useful. This study outlines the development and testing of the Learning Disabilities Needs Assessment Tool (LDNAT), a tool intended for this purpose. An existing mental health (MH) tool was extended by a multidisciplinary group of ID practitioners. Additional scales were drafted to capture needs across six ID treatment domains that the group identified. LDNAT ratings were analysed for the following: item redundancy, relevance, construct validity and internal consistency (n = 1692); test-retest reliability (n = 27); and concurrent validity (n = 160). All LDNAT scales were deemed clinically relevant with little redundancy apparent. Principal component analysis indicated three components (developmental needs, challenging behaviour, MH and well-being). Internal consistency was good (Cronbach alpha 0.80). Individual item test-retest reliability was substantial-near perfect for 20 scales and slight-fair for three scales. Overall reliability was near perfect (intra-class correlation = 0.91). There were significant associations with five of six condition-specific measures, i.e. the Waisman Activities of Daily Living Scale (general ability/disability), Threshold Assessment Grid (risk), Behaviour Problems Inventory for Individuals with Intellectual Disabilities-Short Form (challenging behaviour), Social Communication Questionnaire (autism) and a bespoke physical health questionnaire. Additionally, the statistically significant correlations between these tools and the LDNAT components made sense clinically. There were no statistically significant correlations with the Psychiatric Assessment Schedules for Adults with Developmental Disabilities (a measure of MH symptoms in people with ID). The LDNAT had clinical utility when rating the needs of people with ID prior to condition-specific assessment(s).
Analyses of internal and external validity were promising. Further evaluation of its sensitivity to changes in needs is now required. © 2016 MENCAP and International Association of the Scientific Study of Intellectual and Developmental Disabilities and John Wiley & Sons Ltd.
Developing and testing an instrument to measure the presence of conditions for successful implementation of quality improvement collaboratives.

PubMed

Dückers, Michel L A; Wagner, Cordula; Groenewegen, Peter P

2008-08-11

In quality improvement collaboratives (QICs) teams of practitioners from different health care organizations are brought together to systematically improve an aspect of patient care. Teams take part in a series of meetings to learn about relevant best practices, quality methods and change ideas, and share experiences in making changes in their own local setting. The purpose of this study was to develop an instrument for measuring team organization, external change agent support and support from the team's home institution in a Dutch national improvement and dissemination programme for hospitals based on several QICs. The exploratory methodological design included two phases: a) content development and assessment, resulting in an instrument with 15 items, and b) field testing (N = 165). Internal consistency reliability was tested via Cronbach's alpha coefficient. Principal component analyses were used to identify underlying constructs. Tests of scaling assumptions according to the multi trait/multi-item matrix, were used to confirm the component structure. Three components were revealed, explaining 65% of the variability. The components were labelled 'organizational support', 'team organization' and 'external change agent support'. One item not meeting item-scale criteria was removed. This resulted in a 14 item instrument. Scale reliability ranged from 0.77 to 0.91. Internal item consistency and divergent validity were satisfactory. On the whole, the instrument appears to be a promising tool for assessing team organization and internal and external support during QIC implementation. The psychometric properties were good and warrant application of the instrument for the evaluation of the national programme and similar improvement programmes.
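Several records in this listing, including the one above, report internal consistency as Cronbach's alpha. A minimal sketch of the standard formula, alpha = k/(k-1) × (1 − Σ item variances / variance of total scores), is given here for orientation; the response matrix is hypothetical and is not data from any of the studies.

```python
import statistics


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(scores[0])
    items = list(zip(*scores))  # regroup scores so each tuple holds one item
    item_var_sum = sum(statistics.variance(col) for col in items)
    total_var = statistics.variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Hypothetical responses: 5 respondents answering 3 items on a 1-5 scale.
responses = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [1, 2, 1],
    [3, 3, 3],
]
print(round(cronbach_alpha(responses), 3))
```

The sketch uses sample variances (n − 1 denominator) throughout; the ratio is unchanged if population variances are used consistently instead.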
Reliability of externally fixed dynamometry hamstring strength testing in elite youth football players.

PubMed

Wollin, Martin; Purdam, Craig; Drew, Michael K

2016-01-01

To investigate inter- and intra-tester reliability of an externally fixed dynamometry unilateral hamstring strength test, in the elite sports setting. Reliability study. Sixteen injury-free, elite male youth football players (age=16.81±0.54 years, height=180.22±5.29cm, weight 73.88±6.54kg, BMI=22.57±1.42) gave written informed consent. Unilateral maximum isometric peak hamstring force was evaluated by externally fixed dynamometry for inter-tester, intra-day and intra-tester, inter-week reliability. The test position was standardised to correlate with the terminal swing phase of the running gait cycle. Inter- and intra-tester values demonstrated good to high levels of reliability. The intraclass correlation coefficient (ICC) for inter-tester, intra-day reliability was 0.87 (95% CI=0.75-0.93) with standard error of measure percentage (SEM%) 4.7 and minimal detectable change percentage (MDC%) 12.9. Intra-tester, inter-week reliability results were ICC 0.86 (95% CI, 0.74-0.93), SEM% 5.0 and MDC% 14.0. This study demonstrates good to high inter- and intra-tester reliability of isometric externally fixed dynamometry unilateral hamstring strength testing in the regular elite sport setting involving elite male youth football players. The intraclass correlation coefficient, in association with the low standard error of measure and minimal detectable change percentages, suggests that this procedure is appropriate for clinical and academic use as well as monitoring hamstring strength in the elite sport setting. Crown Copyright © 2015. Published by Elsevier Ltd. All rights reserved.
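The abstract does not state how SEM% and MDC% were derived. A minimal sketch, assuming the conventional formulas SEM = SD × √(1 − ICC) and MDC95 = 1.96 × √2 × SEM, shows how the reported figures relate to each other. The function names are hypothetical.

```python
import math


def sem_from_icc(sd, icc):
    """Standard error of measurement from a between-subject SD and an ICC."""
    return sd * math.sqrt(1.0 - icc)


def mdc95(sem):
    """Minimal detectable change at 95% confidence for a test-retest difference."""
    return 1.96 * math.sqrt(2.0) * sem


if __name__ == "__main__":
    # Applying the assumed MDC formula to the SEM% values quoted in the abstract.
    print(round(mdc95(4.7), 1))  # 13.0 (abstract reports MDC% 12.9)
    print(round(mdc95(5.0), 1))  # 13.9 (abstract reports MDC% 14.0)
```

Under these assumed formulas the results are close to the reported MDC% values; the small differences are consistent with rounding of the published SEM% figures.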
External validity of a hierarchical dimensional model of child and adolescent psychopathology: Tests using confirmatory factor analyses and multivariate behavior genetic analyses.

PubMed

Waldman, Irwin D; Poore, Holly E; van Hulle, Carol; Rathouz, Paul J; Lahey, Benjamin B

2016-11-01

Several recent studies of the hierarchical phenotypic structure of psychopathology have identified a General psychopathology factor in addition to the more expected specific Externalizing and Internalizing dimensions in both youth and adult samples and some have found relevant unique external correlates of this General factor. We used data from 1,568 twin pairs (599 MZ & 969 DZ) age 9 to 17 to test hypotheses for the underlying structure of youth psychopathology and the external validity of the higher-order factors. Psychopathology symptoms were assessed via structured interviews of caretakers and youth. We conducted phenotypic analyses of competing structural models using Confirmatory Factor Analysis and used Structural Equation Modeling and multivariate behavior genetic analyses to understand the etiology of the higher-order factors and their external validity. We found that both a General factor and specific Externalizing and Internalizing dimensions are necessary for characterizing youth psychopathology at both the phenotypic and etiologic levels, and that the 3 higher-order factors differed substantially in the magnitudes of their underlying genetic and environmental influences. Phenotypically, the specific Externalizing and Internalizing dimensions were slightly negatively correlated when a General factor was included, which reflected a significant inverse correlation between the nonshared environmental (but not genetic) influences on Internalizing and Externalizing. We estimated heritability of the general factor of psychopathology for the first time. Its moderate heritability suggests that it is not merely an artifact of measurement error but a valid construct. The General, Externalizing, and Internalizing factors differed in their relations with 3 external validity criteria: mother's smoking during pregnancy, parent's harsh discipline, and the youth's association with delinquent peers. 
Multivariate behavior genetic analyses supported the external validity of the 3 higher-order factors by suggesting that the General, Externalizing, and Internalizing factors were correlated with peer delinquency and parent's harsh discipline for different etiologic reasons. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
The NINDS-Canadian stroke network vascular cognitive impairment neuropsychology protocols in Chinese.

PubMed

Wong, Adrian; Xiong, Yun-yun; Wang, Defeng; Lin, Shi; Chu, Winnie W C; Kwan, Pauline W K; Nyenhuis, David; Black, Sandra E; Wong, Ka Sing Lawrence; Mok, Vincent

2013-05-01

Vascular cognitive impairment (VCI) affects up to half of stroke survivors and predicts poor outcomes. Valid and reliable assessment for VCI is lacking, especially for the Chinese population. In 2005, the National Institute of Neurological Disorders and Stroke and Canadian Stroke Network (NINDS-CSN) Harmonisation workshop proposed a set of three neuropsychology protocols for VCI evaluation. This paper introduces the protocol design and reports the psychometric properties of the Chinese NINDS-CSN VCI protocols. Fifty patients with mild stroke (mean National Institute of Health Stroke Scale 2.2 (SD=3.2)) and 50 controls were recruited. The NINDS-CSN VCI protocols were adapted into Chinese. We assessed the protocols' (1) external validity, defined by how well the protocol summary scores differentiated patients from controls using receiver operating characteristics (ROC) curve analysis; (2) concurrent validity, by correlations with functional measures including Stroke Impact Scale memory score and Chinese Disability Assessment for Dementia; (3) internal consistency; and (4) ease of administration. All three protocols differentiated patients from controls (area under ROC for the three protocols between 0.77 and 0.79, p<0.001), and significantly correlated with the functional measures (Pearson r ranged from 0.37 to 0.51). A cut-off of 19/20 on MMSE identified only one-tenth of patients classified as impaired on the 5-min protocol. Cronbach's α across the four cognitive domains of the 60-min protocol was 0.78 for all subjects and 0.76 for stroke patients. The Chinese NINDS-CSN VCI protocols are valid and reliable for cognitive assessment in Chinese patients with mild stroke.
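The area under the ROC curve used here as the external validity criterion can be read as the probability that a randomly chosen case is ranked above a randomly chosen control. A minimal sketch of that pairwise (Mann-Whitney) definition follows. The function name and toy scores are assumptions, and the sketch takes higher scores to indicate the positive class, so cognitive scores, where patients tend to score lower, would need to be negated first.

```python
def roc_auc(positive_scores, negative_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Runs in O(n * m), which is fine for
    samples of the size described in the abstract (50 vs 50).
    """
    if not positive_scores or not negative_scores:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for p in positive_scores:
        for q in negative_scores:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


if __name__ == "__main__":
    # Hypothetical impairment scores (higher = more impaired).
    patients = [7, 5, 6, 3]
    controls = [2, 4, 3, 1]
    print(roc_auc(patients, controls))
```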
Investigation of the optimum location of external markers for patient setup accuracy enhancement at external beam radiotherapy

PubMed Central

Torshabi, Ahmad Esmaili; Nankali, Saber

2016-01-01

In external beam radiotherapy, one of the most common and reliable methods for patient geometrical setup and/or predicting the tumor location is the use of external markers. The main challenge addressed in this study is increasing the accuracy of patient setup by investigating the location of external markers. Since the location of each external marker may yield different patient setup accuracy, it is important to assess different locations of external markers using appropriate selective algorithms. To do this, two commercially available algorithms, a) canonical correlation analysis (CCA) and b) principal component analysis (PCA), were proposed as input selection algorithms. They work on the basis of maximum correlation coefficient and minimum variance between given datasets. The proposed input selection algorithms work in combination with an adaptive neuro-fuzzy inference system (ANFIS) as a correlation model to give patient positioning information as output. Our proposed algorithms accurately provide the input file of the ANFIS correlation model. The required dataset for this study was prepared by means of a NURBS-based 4D XCAT anthropomorphic phantom that can model the shape and structure of complex organs in the human body along with motion information of dynamic organs. Moreover, a database of four real patients undergoing radiation therapy for lung cancers was utilized in this study for validation of the proposed strategy. Final analyzed results demonstrate that input selection algorithms can reasonably select specific external markers from those areas of the thorax region where the root mean square error (RMSE) of the ANFIS model is at a minimum. It is also found that the selected marker locations lie closely in those areas where surface point motion has a large amplitude and a high correlation. PACS number(s): 87.55.km, 87.55.N PMID:27929479
Neck motion kinematics: an inter-tester reliability study using an interactive neck VR assessment in asymptomatic individuals.

PubMed

Sarig Bahat, Hilla; Sprecher, Elliot; Sela, Itamar; Treleaven, Julia

2016-07-01

Virtual reality (VR) has previously been used for assessment and intervention of neck pain and has been shown to be reliable for cervical range of motion measures. Neck VR enables analysis of task-oriented neck movement by stimulating responsive movements to external stimuli. Therefore, the purpose of this study was to establish inter-tester reliability of neck kinematic measures so that it can be used as a reliable assessment and treatment tool between clinicians. This reliability study included 46 asymptomatic participants, who were assessed using the neck VR system which displayed an interactive VR scenario via a head-mounted device, controlled by neck movements. The objective of the interactive assessment was to hit 16 targets, randomly appearing in four directions, as fast as possible. Each participant was tested twice by two different testers. Good reliability was found for neck motion kinematic measures in flexion, extension, and rotation (0.64-0.93 inter-class correlation). High reliability was shown for peak velocity globally (0.93), in left rotation (0.9), right rotation and extension (0.88), and flexion (0.86). Mean velocity had a good global reliability (0.84), except for left rotation directed movement with moderate reliability (0.68). Minimal detectable change for peak velocity ranged from 41 to 53 °/s, while mean velocity ranged from 20 to 25 °/s. The results suggest high reliability for peak and mean velocity as measured by the interactive Neck VR assessment of neck motion kinematics. VR appears to provide a reliable and more ecologically valid method of cervical motion evaluation than previous conventional methodologies.
Assessing dependency using self-report and indirect measures: examining the significance of discrepancies.

PubMed

Cogswell, Alex; Alloy, Lauren B; Karpinski, Andrew; Grant, David A

2010-07-01

The present study addressed convergence between self-report and indirect approaches to assessing dependency. We were moderately successful in validating an implicit measure, which was found to be reliable, orthogonal to 2 self-report instruments, and predictive of external criteria. This study also examined discrepancies between scores on self-report and implicit measures, and has implications for their significance. The possibility that discrepancies themselves are pathological was not supported, although discrepancies were associated with particular personality profiles. Finally, this study offered additional evidence for the relation between dependency and depressive symptomatology and identified implicit dependency as contributing unique variance in predicting past major depression.
[Assessment of stress in childhood: Children's Daily Stress Inventory (Inventario Infantil de Estresores Cotidianos, IIEC)].

PubMed

Trianes Torres, María Victoria; Blanca Mena, María José; Fernández Baena, Francisco J; Escobar Espejo, Milagros; Maldonado Montero, Enrique F; Muñoz Sánchez, Angela María

2009-11-01

The present study introduces the Children's Daily Stress Inventory (Inventario Infantil de Estresores Cotidianos, IIEC) as a measure that assesses daily stress in primary school children. The inventory was applied to a sample of 1094 primary school students. The final version includes 25 dichotomous items covering the areas of health, school/peers, and family. The score is obtained by summing the positive answers. Analyses of items, reliability and several external pieces of evidence of validity based on relations with other variables are presented. The results show adequate psychometric properties for the assessment of daily stress in children.
Training and Maintaining System-Wide Reliability in Outcome Management.

PubMed

Barwick, Melanie A; Urajnik, Diana J; Moore, Julia E

2014-01-01

The Child and Adolescent Functional Assessment Scale (CAFAS) is widely used for outcome management, for providing real time client and program level data, and the monitoring of evidence-based practices. Methods of reliability training and the assessment of rater drift are critical for service decision-making within organizations and systems of care. We assessed two approaches for CAFAS training: external technical assistance and internal technical assistance. To this end, we sampled 315 practitioners trained by external technical assistance approach from 2,344 Ontario practitioners who had achieved reliability on the CAFAS. To assess the internal technical assistance approach as a reliable alternative training method, 140 practitioners trained internally were selected from the same pool of certified raters. Reliabilities were high for both practitioners trained by external technical assistance and internal technical assistance approaches (.909-.995, .915-.997, respectively). 1 and 3-year estimates showed some drift on several scales. High and consistent reliabilities over time and training method has implications for CAFAS training of behavioral health care practitioners, and the maintenance of CAFAS as a global outcome management tool in systems of care.
Reliability and validity of the revised Gibson Test of Cognitive Skills, a computer-based test battery for assessing cognition across the lifespan.

PubMed

Moore, Amy Lawson; Miller, Terissa M

2018-01-01

The purpose of the current study is to evaluate the validity and reliability of the revised Gibson Test of Cognitive Skills, a computer-based battery of tests measuring short-term memory, long-term memory, processing speed, logic and reasoning, visual processing, as well as auditory processing and word attack skills. This study included 2,737 participants aged 5-85 years. A series of studies was conducted to examine the validity and reliability using the test performance of the entire norming group and several subgroups. The evaluation of the technical properties of the test battery included content validation by subject matter experts, item analysis and coefficient alpha, test-retest reliability, split-half reliability, and analysis of concurrent validity with the Woodcock Johnson III Tests of Cognitive Abilities and Tests of Achievement. Results indicated strong sources of evidence of validity and reliability for the test, including internal consistency reliability coefficients ranging from 0.87 to 0.98, test-retest reliability coefficients ranging from 0.69 to 0.91, split-half reliability coefficients ranging from 0.87 to 0.91, and concurrent validity coefficients ranging from 0.53 to 0.93. The Gibson Test of Cognitive Skills-2 is a reliable and valid tool for assessing cognition in the general population across the lifespan.
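The abstract reports split-half reliability coefficients without describing the computation. A minimal sketch, assuming an odd/even item split followed by the Spearman-Brown correction (a common convention, not confirmed by the source), is given below. All function names and the toy data are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_brown(r_half):
    """Step a half-test correlation up to full-test length."""
    return 2.0 * r_half / (1.0 + r_half)


def split_half_reliability(scores):
    """Odd/even split-half reliability for respondents-by-items scores."""
    odd_totals = [sum(row[0::2]) for row in scores]   # items 1, 3, 5, ...
    even_totals = [sum(row[1::2]) for row in scores]  # items 2, 4, 6, ...
    return spearman_brown(pearson_r(odd_totals, even_totals))


if __name__ == "__main__":
    toy = [[1, 2, 1, 2], [2, 3, 2, 3], [3, 4, 3, 4]]
    print(split_half_reliability(toy))
```

The correction matters because each half contains only half the items, so the raw half-to-half correlation understates the reliability of the full-length test.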
Evaluation of tools used to measure calcium and/or dairy consumption in adults.

PubMed

Magarey, Anthea; Baulderstone, Lauren; Yaxley, Alison; Markow, Kylie; Miller, Michelle

2015-05-01

To identify and critique tools for the assessment of Ca and/or dairy intake in adults, in order to ascertain the most accurate and reliable tools available. A systematic review of the literature was conducted using defined inclusion and exclusion criteria. Articles reporting on originally developed tools or testing the reliability or validity of existing tools that measure Ca and/or dairy intake in adults were included. Author-defined criteria for reporting reliability and validity properties were applied. Studies conducted in Western countries. Adults. Thirty papers, utilising thirty-six tools assessing intake of dairy, Ca or both, were identified. Reliability testing was conducted on only two dairy and five Ca tools, with results indicating that only one dairy and two Ca tools were reliable. Validity testing was conducted for all but four Ca-only tools. There was high reliance in validity testing on lower-order tests such as correlation and failure to differentiate between statistical and clinically meaningful differences. Results of the validity testing suggest one dairy and five Ca tools are valid. Thus one tool was considered both reliable and valid for the assessment of dairy intake and only two tools proved reliable and valid for the assessment of Ca intake. While several tools are reliable and valid, their application across adult populations is limited by the populations in which they were tested. These results indicate a need for tools that assess Ca and/or dairy intake in adults to be rigorously tested for reliability and validity.
The ACTA PORT-score for predicting perioperative risk of blood transfusion for adult cardiac surgery.

PubMed

Klein, A A; Collier, T; Yeates, J; Miles, L F; Fletcher, S N; Evans, C; Richards, T

2017-09-01

A simple and accurate scoring system to predict risk of transfusion for patients undergoing cardiac surgery is lacking. We identified independent risk factors associated with transfusion by performing univariate analysis, followed by logistic regression. We then simplified the score to an integer-based system and tested it using the area under the receiver operator characteristic (AUC) statistic with a Hosmer-Lemeshow goodness-of-fit test. Finally, the scoring system was applied to the external validation dataset and the same statistical methods applied to test the accuracy of the ACTA-PORT score. Several factors were independently associated with risk of transfusion, including age, sex, body surface area, logistic EuroSCORE, preoperative haemoglobin and creatinine, and type of surgery. In our primary dataset, the score accurately predicted risk of perioperative transfusion in cardiac surgery patients with an AUC of 0.76. The external validation confirmed accuracy of the scoring method with an AUC of 0.84 and good agreement across all scores, with a minor tendency to under-estimate transfusion risk in very high-risk patients. The ACTA-PORT score is a reliable, validated tool for predicting risk of transfusion for patients undergoing cardiac surgery. This and other scores can be used in research studies for risk adjustment when assessing outcomes, and might also be incorporated into a Patient Blood Management programme. © The Author 2017. Published by Oxford University Press on behalf of the British Journal of Anaesthesia. All rights reserved. For Permissions, please email: journals.permissions@oup.com
External Validation of Bifactor Model of ADHD: Explaining Heterogeneity in Psychiatric Comorbidity, Cognitive Control, and Personality Trait Profiles within DSM-IV ADHD

ERIC Educational Resources Information Center

Martel, Michelle M.; Roberts, Bethan; Gremillion, Monica; von Eye, Alexander; Nigg, Joel T.

2011-01-01

The current paper provides external validation of the bifactor model of ADHD by examining associations between ADHD latent factor/profile scores and external validation indices. 548 children (321 boys; 302 with ADHD), 6 to 18 years old, recruited from the community participated in a comprehensive diagnostic procedure. Mothers completed the Child…
Validation of the self regulation questionnaire as a measure of health in quality of life research

PubMed Central

2009-01-01

Objectives Several epidemiological studies address psychosomatic 'self regulation' as a measure of quality of life aspects. However, although widely used in studies with a focus on complementary cancer treatment, and recognized to be associated with better survival of cancer patients, it is unclear what the 'self regulation' questionnaire exactly measures. Design and setting In a sample of 444 individuals (27% healthy, 33% cancer, 40% other internal diseases), we performed reliability and exploratory factor analyses, and correlated the 16-item instrument with external measures such as the Hospital Anxiety and Depression Scale, the Herdecke Quality of Life questionnaire, and autonomic regulation questionnaire. Results The 16-item pool had a very good internal consistency (Cronbach's alpha = 0.948) and satisfying/good (rrt = 0.796) test-retest reliability after 3 months. Exploratory factor analysis indicated 2 sub-constructs: (1) Ability to change behaviour in order to reach goals, and (2) Achieve satisfaction and well-being. Both sub-scales correlated well with quality of life aspects, particularly with Initiative Power/Interest, Social Interactions, Mental Balance, and negatively with anxiety and depression. Conclusions The Self Regulation Questionnaire (SRQ) was found to be a valid and reliable tool which measures unique psychosomatic abilities. Self regulation deals with competence and autonomy and can be regarded as a problem solving capacity in terms of an active adaptation to stressful situations to restore wellbeing. The tool is an interesting option to be used particularly in complementary medicine research with a focus on behavioural modification. PMID:19541580

Selecting and Improving Quasi-Experimental Designs in Effectiveness and Implementation Research.

PubMed

Handley, Margaret A; Lyles, Courtney R; McCulloch, Charles; Cattamanchi, Adithya

2018-04-01

Interventional researchers face many design challenges when assessing intervention implementation in real-world settings. Intervention implementation requires holding fast on internal validity needs while incorporating external validity considerations (such as uptake by diverse subpopulations, acceptability, cost, and sustainability). Quasi-experimental designs (QEDs) are increasingly employed to achieve a balance between internal and external validity. Although these designs are often referred to and summarized in terms of logistical benefits, there is still uncertainty about (a) selecting from among various QEDs and (b) developing strategies to strengthen the internal and external validity of QEDs. We focus here on commonly used QEDs (prepost designs with nonequivalent control groups, interrupted time series, and stepped-wedge designs) and discuss several variants that maximize internal and external validity at the design, execution and implementation, and analysis stages.
[Evaluation of Suicide Risk Levels in Hospitals: Validity and Reliability Tests].

PubMed

Macagnino, Sandro; Steinert, Tilman; Uhlmann, Carmen

2018-05-01

Examination of in-hospital suicide risk levels concerning their validity and their reliability. The internal suicide risk levels were evaluated in a cross-sectional study of 163 inpatients. A reliability check was performed by determining the interrater reliability among the senior physician, therapist and responsible nurse. Within the scope of the validity check, we conducted analyses of criterion validity and construct validity. For the total sample, an "acceptable" to "good" interrater reliability (Kendall's W = .77) of suicide risk levels was obtained. Schizophrenic disorders showed the lowest values; for personality disorders we found the highest level of interrater reliability. When examining the criterion validity, Item 9 of the BDI-II was substantially correlated with our suicide risk levels (ρ m = .54, p < .01). Within the scope of the construct validity check, affective disorders showed the highest correlation (ρ = .77), compatible also with "convergent validity". They differed from schizophrenic disorders, which showed the least concordance (ρ = .43). In-hospital suicide risk levels may represent an important contribution to the assessment of suicidal behavior of inpatients experiencing psychiatric treatment due to their overall good validity and reliability. © Georg Thieme Verlag KG Stuttgart · New York.
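For readers unfamiliar with the concordance statistic used for interrater reliability here, the following is a minimal sketch of Kendall's W for m raters ranking n subjects. It assumes each rater supplies a complete ranking and omits the tie correction, which ordinal risk levels with many tied ratings would normally require. The function name and example rankings are hypothetical.

```python
def kendalls_w(rankings):
    """Kendall's coefficient of concordance, without tie correction.

    rankings: list of m lists, each giving one rater's ranks for n subjects.
    W = 12 * S / (m^2 * (n^3 - n)), where S is the sum of squared deviations
    of the per-subject rank sums from their mean.
    """
    m = len(rankings)
    n = len(rankings[0])
    if m < 2 or n < 2:
        raise ValueError("need at least two raters and two subjects")
    rank_sums = [sum(rater[i] for rater in rankings) for i in range(n)]
    mean_sum = sum(rank_sums) / n
    s = sum((total - mean_sum) ** 2 for total in rank_sums)
    return 12.0 * s / (m ** 2 * (n ** 3 - n))


if __name__ == "__main__":
    # Three hypothetical raters ranking four patients by risk.
    raters = [[1, 2, 3, 4], [1, 3, 2, 4], [2, 1, 3, 4]]
    print(round(kendalls_w(raters), 3))
```

W ranges from 0 (no agreement) to 1 (identical rankings across raters).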
Adapting Social Neuroscience Measures for Schizophrenia Clinical Trials, Part 1: Ferrying Paradigms Across Perilous Waters

PubMed Central

Green, Michael F.

2013-01-01

Social cognitive impairment is prominent in schizophrenia, and it is closely related to functional outcome. Partly for these reasons, it has rapidly become a target for both training and psychopharmacological interventions. However, there is a paucity of reliable and valid social cognitive endpoints that can be used to evaluate treatment response in clinical trials. Also, clinical studies in schizophrenia have benefited rather little from the surge of activity and knowledge in nonclinical social neuroscience. The National Institute of Mental Health-sponsored study, “Social Cognition and Functioning in Schizophrenia” (SCAF), attempted to address this translational challenge by selecting paradigms from social neuroscience that could be adapted for use in schizophrenia. The project also evaluated the psychometric properties and external validity of the tasks to determine their suitability for multisite clinical trials. This first article in the theme section presents the goals, conceptual background, and rationale for the SCAF project. PMID:24072811
Observation of early childhood physical aggression: a psychometric study of the system for coding early physical aggression.

PubMed

Mesman, Judi; Alink, Lenneke R A; van Zeijl, Jantien; Stolk, Mirjam N; Bakermans-Kranenburg, Marian J; van Ijzendoorn, Marinus H; Juffer, Femmie; Koot, Hans M

2008-01-01

We investigated the reliability and (convergent and discriminant) validity of an observational measure of physical aggression in toddlers and preschoolers, originally developed by Keenan and Shaw [1994]. The observation instrument is based on a developmental definition of aggression. Physical aggression was observed twice in a laboratory setting, the first time when children were 1-3 years old, and again 1 year later. Observed physical aggression was significantly related to concurrent mother-rated physical aggression for 2- to 4-year-olds, but not to maternal ratings of nonaggressive externalizing problems, indicating the measure's discriminant validity. However, we did not find significant 1-year stability of observed physical aggression in any of the age groups, whereas mother-rated physical aggression was significantly stable for all ages. The observational measure shows promise, but may have assessed state rather than trait aggression in our study. Copyright 2008 Wiley-Liss, Inc.
An empirical assessment of validation practices for molecular classifiers

PubMed Central

Castaldi, Peter J.; Dahabreh, Issa J.

2011-01-01

Proposed molecular classifiers may be overfit to idiosyncrasies of noisy genomic and proteomic data. Cross-validation methods are often used to obtain estimates of classification accuracy, but both simulations and case studies suggest that, when inappropriate methods are used, bias may ensue. Bias can be bypassed and generalizability can be tested by external (independent) validation. We evaluated 35 studies that have reported on external validation of a molecular classifier. We extracted information on study design and methodological features, and compared the performance of molecular classifiers in internal cross-validation versus external validation for 28 studies where both had been performed. We demonstrate that the majority of studies pursued cross-validation practices that are likely to overestimate classifier performance. Most studies were markedly underpowered to detect a 20% decrease in sensitivity or specificity between internal cross-validation and external validation [median power was 36% (IQR, 21–61%) and 29% (IQR, 15–65%), respectively]. The median reported classification performance for sensitivity and specificity was 94% and 98%, respectively, in cross-validation and 88% and 81% for independent validation. The relative diagnostic odds ratio was 3.26 (95% CI 2.04–5.21) for cross-validation versus independent validation. Finally, we reviewed all studies (n = 758) which cited those in our study sample, and identified only one instance of additional subsequent independent validation of these classifiers. In conclusion, these results document that many cross-validation practices employed in the literature are potentially biased and genuine progress in this field will require adoption of routine external validation of molecular classifiers, preferably in much larger studies than in current practice. PMID:21300697
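The diagnostic odds ratio (DOR) mentioned in this abstract combines sensitivity and specificity into a single number. A minimal sketch of the standard definition is shown below; the function name is hypothetical. The relative DOR of 3.26 reported above was presumably estimated across paired studies, so applying this function to the median sensitivities and specificities quoted in the abstract should not be expected to reproduce it.

```python
def diagnostic_odds_ratio(sensitivity, specificity):
    """DOR = (sens / (1 - sens)) / ((1 - spec) / spec).

    Equivalent to the positive likelihood ratio divided by the negative
    likelihood ratio. Undefined when sensitivity or specificity is 0 or 1.
    """
    if not (0.0 < sensitivity < 1.0 and 0.0 < specificity < 1.0):
        raise ValueError("sensitivity and specificity must lie strictly between 0 and 1")
    odds_true_positive = sensitivity / (1.0 - sensitivity)
    odds_false_positive = (1.0 - specificity) / specificity
    return odds_true_positive / odds_false_positive


if __name__ == "__main__":
    # A classifier no better than chance has a DOR of 1.
    print(diagnostic_odds_ratio(0.5, 0.5))
```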
Modeling and Prediction of Solvent Effect on Human Skin Permeability using Support Vector Regression and Random Forest.

PubMed

Baba, Hiromi; Takahara, Jun-ichi; Yamashita, Fumiyoshi; Hashida, Mitsuru

2015-11-01

The solvent effect on skin permeability is important for assessing the effectiveness and toxicological risk of new dermatological formulations in pharmaceuticals and cosmetics development. The solvent effect occurs by diverse mechanisms, which could be elucidated by efficient and reliable prediction models. However, such prediction models have been hampered by the small variety of permeants and mixture components archived in databases and by low predictive performance. Here, we propose a solution to both problems. We first compiled a novel large database of 412 samples from 261 structurally diverse permeants and 31 solvents reported in the literature. The data were carefully screened to ensure their collection under consistent experimental conditions. To construct a high-performance predictive model, we then applied support vector regression (SVR) and random forest (RF) with greedy stepwise descriptor selection to our database. The models were internally and externally validated. The SVR achieved higher performance statistics than RF. The (externally validated) determination coefficient, root mean square error, and mean absolute error of SVR were 0.899, 0.351, and 0.268, respectively. Moreover, because all descriptors are fully computational, our method can predict as-yet unsynthesized compounds. Our high-performance prediction model offers an attractive alternative to permeability experiments for pharmaceutical and cosmetic candidate screening and optimizing skin-permeable topical formulations.
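The three performance statistics quoted for the SVR model can be computed from held-out predictions as sketched below. This is a minimal illustration with hypothetical function names and toy values. It assumes the determination coefficient is defined as 1 − SS_res/SS_tot; the abstract does not say which definition was used, and some studies report the squared correlation instead.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (R^2, RMSE, MAE) for paired observed and predicted values."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n
    ss_res = sum(r ** 2 for r in residuals)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(r) for r in residuals) / n
    return r2, rmse, mae


if __name__ == "__main__":
    # Hypothetical external-test values, not data from the study.
    observed = [1.0, 2.0, 3.0]
    predicted = [1.0, 2.0, 4.0]
    print(regression_metrics(observed, predicted))
```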
Educational testing validity and reliability in pharmacy and medical education literature.

PubMed

Hoover, Matthew J; Jung, Rose; Jacobs, David M; Peeters, Michael J

2013-12-16

To evaluate and compare the reliability and validity of educational testing reported in pharmacy education journals to medical education literature. Descriptions of validity evidence sources (content, construct, criterion, and reliability) were extracted from articles that reported educational testing of learners' knowledge, skills, and/or abilities. Using educational testing, the findings of 108 pharmacy education articles were compared to the findings of 198 medical education articles. For pharmacy educational testing, 14 articles (13%) reported more than 1 validity evidence source while 83 articles (77%) reported 1 validity evidence source and 11 articles (10%) did not have evidence. Among validity evidence sources, content validity was reported most frequently. Compared with pharmacy education literature, more medical education articles reported both validity and reliability (59%; p<0.001). While there were more scholarship of teaching and learning (SoTL) articles in pharmacy education compared to medical education, validity, and reliability reporting were limited in the pharmacy education literature.
Lower urinary tract symptoms that predict microscopic pyuria.

PubMed

Khasriya, Rajvinder; Barcella, William; De Iorio, Maria; Swamy, Sheela; Gill, Kiren; Kupelian, Anthony; Malone-Lee, James

2017-10-02

Urinary dipsticks and culture analyses of a mid-stream urine specimen (MSU) at 10⁵ cfu ml⁻¹ of a known urinary pathogen are considered the gold standard investigations for diagnosing urinary tract infection (UTI). However, the reliability of these tests has been much criticised and they may mislead. It is now widely accepted that pyuria (≥1 WBC μl⁻¹) detected by microscopy of a fresh unspun, unstained specimen of urine is the best biological indicator of UTI available. We aimed to scrutinise the greater potential of symptoms analysis in detecting pyuria and UTI. Lower urinary tract symptom (LUTS) descriptions were collected from patients with chronic lower urinary tract symptoms referred to a tertiary referral unit. The symptoms informed a 39-question inventory, grouped into storage, voiding, stress incontinence and pain symptoms. All questions sought a binary yes or no response. A bespoke software package was developed to collect the data. The study was powered to a sample of at least 1,990 patients, with sufficient power to analyse 39 symptoms in a linear model with an effect size of Cohen's f² = 0.02, type 1 error probability = 0.05, and power (1-β) = 95%, where β is the probability of type 2 error. The inventory was administered to 2,050 female patients between August 2004 and November 2011. The data were collated and the following properties assessed: internal consistency, test-retest reliability, inter-observer reliability, internal responsiveness, external responsiveness, construct validity analysis and a comparison with the International Consultation on Incontinence Modular Questionnaire for female lower urinary tract symptoms (ICIQ-FLUTS). The dependent variable used as a surrogate marker of UTI was microscopic pyuria. An MSU sample was sent for routine culture. The symptoms proved reliable predictors of microscopic pyuria. In particular, voiding symptoms correlated well with microscopic pyuria (χ² = 88, df = 1, p < 0.001). 
The symptom inventory has significant psychometric characteristics, as follows: test-retest reliability, Cronbach's alpha 0.981; inter-observer reliability, Cronbach's alpha 0.995; internal responsiveness, F = 221, p < 0.001; external responsiveness, F = 359, df = 5, p < 0.001. The correlation coefficients for the domains of the ICIQ-FLUTS were around R = 0.5, p < 0.001. This symptoms score performed well on the standard psychometric validation. The score changed in response to treatment and in a direction appropriate to the changes in microscopic pyuria. It correlated with measures of quality of life. It would seem to make a good candidate for monitoring treatment progress in ordinary clinical practice.
Health status in patients with coexistent COPD and heart failure: a validation and comparison between the Clinical COPD Questionnaire and the Minnesota Living with Heart Failure Questionnaire

PubMed Central

Berkhof, Farida F; Metzemaekers, Leola; Uil, Steven M; Kerstjens, Huib AM; van den Berg, Jan WK

2014-01-01

Background Chronic obstructive pulmonary disease (COPD) and heart failure (HF) are both common diseases that coexist frequently. Patients with both diseases have worse stable state health status when compared with patients with one of these diseases. In many outpatient clinics, health status is monitored routinely in COPD patients using the Clinical COPD Questionnaire (CCQ) and in HF patients with the Minnesota Living with Heart Failure Questionnaire (MLHF-Q). This study validated and compared which questionnaire, ie, the CCQ or the MLHF-Q, is suited best for patients with coexistent COPD and HF. Methods Patients with both COPD and HF and aged ≥40 years were included. Construct validity, internal consistency, test–retest reliability, and agreement were determined. The Short-Form 36 was used as the external criterion. All questionnaires were completed at baseline. The CCQ and MLHF-Q were repeated after 2 weeks, together with a global rating of change. Results Fifty-eight patients were included, of whom 50 completed the study. Construct validity was acceptable. Internal consistency was adequate for CCQ and MLHF-Q total and domain scores, with a Cronbach’s alpha ≥0.70. Reliability was adequate for MLHF-Q and CCQ total and domain scores, and intraclass correlation coefficients were 0.70–0.90, except for the CCQ symptom score (intraclass correlation coefficient 0.42). The standard error of measurement on the group level was smaller than the minimal clinical important difference for both questionnaires. However, the standard error of measurement on the individual level was larger than the minimal clinical important difference. Agreement was acceptable on the group level and limited on the individual level. Conclusion CCQ and MLHF-Q were both valid and reliable questionnaires for assessment of health status in patients with coexistent COPD and HF on the group level, and hence for research. 
However, in clinical practice, on the individual level, the characteristics of both questionnaires were not as good. There is room for a questionnaire with good evaluative properties on the individual level, preferably tested in a setting of patients with COPD or HF, or both. PMID:25285000
Psychometric properties of the Norwegian version of the Safety Attitudes Questionnaire (SAQ), Generic version (Short Form 2006).

PubMed

Deilkås, Ellen T; Hofoss, Dag

2008-09-22

How to protect patients from harm is a question of universal interest. Measuring and improving safety culture in care giving units is an important strategy for promoting a safe environment for patients. The Safety Attitudes Questionnaire (SAQ) is the only instrument that measures safety culture in a way which correlates with patient outcome. We have translated the SAQ into Norwegian and validated the translated version. The psychometric properties of the translated questionnaire are presented in this article. The questionnaire was translated with the back translation technique and tested in 47 clinical units in a Norwegian university hospital. SAQs (the Generic version (Short Form 2006), that is, the version with two sets of questions on perceptions of management: on unit management and on hospital management) were distributed to 1911 frontline staff. 762 were distributed during unit meetings and 1149 through the postal system. Cronbach alphas, item-to-own correlations, and test-retest correlations were calculated, and response distribution analysis and confirmatory factor analysis were performed, as well as early validity tests. 1306 staff members completed and returned the questionnaire: a response rate of 68%. Questionnaire acceptability was good. The reliability measures were acceptable. The factor structure of the responses was tested by confirmatory factor analysis. 36 items were ascribed to seven underlying factors: Teamwork Climate, Safety Climate, Stress Recognition, Perceptions of Hospital Management, Perceptions of Unit Management, Working conditions, and Job satisfaction. Goodness-of-Fit Indices showed reasonable, but not indisputable, model fit. External validity indicators - recognizability of results, correlations with "trigger tool"-identified adverse events, with patient satisfaction with hospitalization, patient reports of possible maltreatment, and patient evaluation of organization of hospital work - provided preliminary validation. 
Based on the data from Akershus University Hospital, we conclude that the Norwegian translation of the SAQ showed satisfactory internal psychometric properties. With data from one hospital only, we cannot draw strong conclusions on its external validity. Further validation studies linking the SAQ-scores to patient outcome data should be performed.
Using beta binomials to estimate classification uncertainty for ensemble models.

PubMed

Clark, Robert D; Liang, Wenkel; Lee, Adam C; Lawless, Michael S; Fraczkiewicz, Robert; Waldman, Marvin

2014-01-01

Quantitative structure-activity (QSAR) models have enormous potential for reducing drug discovery and development costs as well as the need for animal testing. Great strides have been made in estimating their overall reliability, but to fully realize that potential, researchers and regulators need to know how confident they can be in individual predictions. Submodels in an ensemble model which have been trained on different subsets of a shared training pool represent multiple samples of the model space, and the degree of agreement among them contains information on the reliability of ensemble predictions. For artificial neural network ensembles (ANNEs) using two different methods for determining ensemble classification - one using vote tallies and the other averaging individual network outputs - we have found that the distribution of predictions across positive vote tallies can be reasonably well-modeled as a beta binomial distribution, as can the distribution of errors. Together, these two distributions can be used to estimate the probability that a given predictive classification will be in error. Large data sets comprised of logP, Ames mutagenicity, and CYP2D6 inhibition data are used to illustrate and validate the method. The distributions of predictions and errors for the training pool accurately predicted the distribution of predictions and errors for large external validation sets, even when the number of positive and negative examples in the training pool were not balanced. Moreover, the likelihood of a given compound being prospectively misclassified as a function of the degree of consensus between networks in the ensemble could in most cases be estimated accurately from the fitted beta binomial distributions for the training pool. 
Confidence in an individual predictive classification by an ensemble model can be accurately assessed by examining the distributions of predictions and errors as a function of the degree of agreement among the constituent submodels. Further, ensemble uncertainty estimation can often be improved by adjusting the voting or classification threshold based on the parameters of the error distribution. Finally, the profiles for models whose predictive uncertainty estimates are not reliable provide clues to that effect without the need for comparison to an external test set.
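The abstract does not spell out how the two fitted distributions are combined, so the following is only a minimal sketch under stated assumptions. It treats the chance that a prediction with k positive votes is wrong as the overall error rate times the ratio of the error-tally probability to the prediction-tally probability. The function names, the shape parameters, the ensemble size of 33 and the 10% overall error rate are all hypothetical.

```python
from math import comb, exp, lgamma


def log_beta(a, b):
    """Natural log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def beta_binomial_pmf(k, n, a, b):
    """P(K = k) for K ~ BetaBinomial(n, a, b), i.e. k positive votes out of n."""
    return comb(n, k) * exp(log_beta(k + a, n - k + b) - log_beta(a, b))


def error_probability(k, n, pred_params, err_params, overall_error_rate):
    """Assumed combination rule (not taken from the abstract):
    P(error | k votes) = overall_error_rate * P_err(k) / P_pred(k).
    Fitted distributions need not be mutually consistent, so the result is capped at 1.
    """
    p_pred = beta_binomial_pmf(k, n, *pred_params)
    p_err = beta_binomial_pmf(k, n, *err_params)
    return min(1.0, overall_error_rate * p_err / p_pred)


# Hypothetical example: 33 networks, predictions piled up near unanimous tallies,
# errors concentrated near split votes, 10% of all predictions wrong.
N_NETWORKS = 33
PRED_PARAMS = (0.4, 0.4)   # hypothetical U-shaped tally distribution
ERR_PARAMS = (2.0, 2.0)    # hypothetical error distribution peaked at split votes
split_vote_risk = error_probability(16, N_NETWORKS, PRED_PARAMS, ERR_PARAMS, 0.10)
unanimous_risk = error_probability(33, N_NETWORKS, PRED_PARAMS, ERR_PARAMS, 0.10)
```

With these illustrative parameters a near-even split carries a much higher estimated error risk than a unanimous vote, which is the qualitative behaviour the abstract describes.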
Validation of a scenario-based assessment of critical thinking using an externally validated tool.

PubMed

Buur, Jennifer L; Schmidt, Peggy; Smylie, Dean; Irizarry, Kris; Crocker, Carlos; Tyler, John; Barr, Margaret

2012-01-01

With medical education transitioning from knowledge-based curricula to competency-based curricula, critical thinking skills have emerged as a major competency. While there are validated external instruments for assessing critical thinking, many educators have created their own custom assessments of critical thinking. However, the face validity of these assessments has not been challenged. The purpose of this study was to compare results from a custom assessment of critical thinking with the results from a validated external instrument of critical thinking. Students from the College of Veterinary Medicine at Western University of Health Sciences were administered a custom assessment of critical thinking (ACT) examination and the externally validated instrument, California Critical Thinking Skills Test (CCTST), in the spring of 2011. Total scores and sub-scores from each exam were analyzed for significant correlations using Pearson correlation coefficients. Significant correlations between ACT Blooms 2 and deductive reasoning and total ACT score and deductive reasoning were demonstrated with correlation coefficients of 0.24 and 0.22, respectively. No other statistically significant correlations were found. The lack of significant correlation between the two examinations illustrates the need in medical education to externally validate internal custom assessments. Ultimately, the development and validation of custom assessments of non-knowledge-based competencies will produce higher quality medical professionals.
Development of Decision Support Formulas for the Prediction of Bladder Outlet Obstruction and Prostatic Surgery in Patients With Lower Urinary Tract Symptom/Benign Prostatic Hyperplasia: Part II, External Validation and Usability Testing of a Smartphone App.

PubMed

Choo, Min Soo; Jeong, Seong Jin; Cho, Sung Yong; Yoo, Changwon; Jeong, Chang Wook; Ku, Ja Hyeon; Oh, Seung-June

2017-04-01

We aimed to externally validate the prediction model we developed for having bladder outlet obstruction (BOO) and requiring prostatic surgery using 2 independent data sets from tertiary referral centers, and also aimed to validate a mobile app for using this model through usability testing. Formulas and nomograms predicting whether a subject has BOO and needs prostatic surgery were validated with an external validation cohort from Seoul National University Bundang Hospital and Seoul Metropolitan Government-Seoul National University Boramae Medical Center between January 2004 and April 2015. A smartphone-based app was developed, and 8 young urologists were enrolled for usability testing to identify any human factor issues of the app. A total of 642 patients were included in the external validation cohort. No significant differences were found in the baseline characteristics of major parameters between the original (n=1,179) and the external validation cohort, except for the maximal flow rate. Predictions of requiring prostatic surgery in the validation cohort showed a sensitivity of 80.6%, a specificity of 73.2%, a positive predictive value of 49.7%, a negative predictive value of 92.0%, and an area under the receiver operating curve of 0.84. The calibration plot indicated that the predictions have good correspondence. The decision curve also showed a high net benefit. Similar evaluation results using the external validation cohort were seen in the predictions of having BOO. Overall results of the usability test demonstrated that the app was user-friendly with no major human factor issues. External validation of this newly developed prediction model demonstrated a moderate level of discrimination, adequate calibration, and high net benefit gains for predicting both having BOO and requiring prostatic surgery. In addition, a smartphone app implementing the prediction model was user-friendly with no major human factor issues.
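Predictive values depend on how common the outcome is in the cohort, which the abstract does not report. As a minimal sketch, the Python below derives the positive and negative predictive values from sensitivity, specificity and prevalence using Bayes' rule. The function name is hypothetical, and the prevalence of 0.247 is an assumption chosen only because it roughly reproduces the reported 49.7% and 92.0%.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from test characteristics and outcome prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Sensitivity and specificity are from the abstract; the prevalence is assumed.
ppv, npv = predictive_values(sensitivity=0.806, specificity=0.732, prevalence=0.247)
```

The same sensitivity and specificity would give a different positive predictive value in a population where surgery is needed more or less often.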
Reliability and validity of non-radiographic methods of thoracic kyphosis measurement: a systematic review.

PubMed

Barrett, Eva; McCreesh, Karen; Lewis, Jeremy

2014-02-01

A wide array of instruments are available for non-invasive thoracic kyphosis measurement. Guidelines for selecting outcome measures for use in clinical and research practice recommend that properties such as validity and reliability are considered. This systematic review reports on the reliability and validity of non-invasive methods for measuring thoracic kyphosis. A systematic search of 11 electronic databases located studies assessing reliability and/or validity of non-invasive thoracic kyphosis measurement techniques. Two independent reviewers used a critical appraisal tool to assess the quality of retrieved studies. Data was extracted by the primary reviewer. The results were synthesized qualitatively using a level of evidence approach. 27 studies satisfied the eligibility criteria and were included in the review. Reliability alone, validity alone, and both reliability and validity were investigated by sixteen, two and nine studies respectively. 17/27 studies were deemed to be of high quality. In total, 15 methods of thoracic kyphosis measurement were evaluated in retrieved studies. All investigated methods showed high (ICC ≥ .7) to very high (ICC ≥ .9) levels of reliability. The validity of the methods ranged from low to very high. The strongest levels of evidence for reliability exist in support of the Debrunner kyphometer, Spinal Mouse and Flexicurve index, and for validity in support of the arcometer and Flexicurve index. Further reliability and validity studies are required to strengthen the level of evidence for the remaining methods of measurement. This should be addressed by future research. Copyright © 2013 Elsevier Ltd. All rights reserved.
Improving the quality of discrete-choice experiments in health: how can we assess validity and reliability?

PubMed

Janssen, Ellen M; Marshall, Deborah A; Hauber, A Brett; Bridges, John F P

2017-12-01

The recent endorsement of discrete-choice experiments (DCEs) and other stated-preference methods by regulatory and health technology assessment (HTA) agencies has placed a greater focus on demonstrating the validity and reliability of preference results. Areas covered: We present a practical overview of tests of validity and reliability that have been applied in the health DCE literature and explore other study qualities of DCEs. From the published literature, we identify a variety of methods to assess the validity and reliability of DCEs. We conceptualize these methods to create a conceptual model with four domains: measurement validity, measurement reliability, choice validity, and choice reliability. Each domain consists of three categories that can be assessed using one to four procedures (for a total of 24 tests). We present how these tests have been applied in the literature and direct readers to applications of these tests in the health DCE literature. Based on a stakeholder engagement exercise, we consider the importance of study characteristics beyond traditional concepts of validity and reliability. Expert commentary: We discuss study design considerations to assess the validity and reliability of a DCE, consider limitations to the current application of tests, and discuss future work to consider the quality of DCEs in healthcare.
77 FR 56650 - Food and Drug Administration/American Glaucoma Society Workshop on the Validity, Reliability, and...

Federal Register 2010, 2011, 2012, 2013, 2014

2012-09-13

...] Food and Drug Administration/American Glaucoma Society Workshop on the Validity, Reliability, and... entitled ``FDA/American Glaucoma Society (AGS) Workshop on the Validity, Reliability, and Usability of... research. The purpose of this public workshop is to provide a forum for discussing the validity...
A Severe Sepsis Mortality Prediction Model and Score for Use with Administrative Data

PubMed Central

Ford, Dee W.; Goodwin, Andrew J.; Simpson, Annie N.; Johnson, Emily; Nadig, Nandita; Simpson, Kit N.

2016-01-01

Objective Administrative data is used for research, quality improvement, and health policy in severe sepsis. However, there is not a sepsis-specific tool applicable to administrative data with which to adjust for illness severity. Our objective was to develop, internally validate, and externally validate a severe sepsis mortality prediction model and associated mortality prediction score. Design Retrospective cohort study using 2012 administrative data from five US states. Three cohorts of patients with severe sepsis were created: 1) ICD-9-CM codes for severe sepsis/septic shock, 2) ‘Martin’ approach, and 3) ‘Angus’ approach. The model was developed and internally validated in ICD-9-CM cohort and externally validated in other cohorts. Integer point values for each predictor variable were generated to create a sepsis severity score. Setting Acute care, non-federal hospitals in NY, MD, FL, MI, and WA Subjects Patients in one of three severe sepsis cohorts: 1) explicitly coded (n=108,448), 2) Martin cohort (n=139,094), and 3) Angus cohort (n=523,637) Interventions None Measurements and Main Results Maximum likelihood estimation logistic regression to develop a predictive model for in-hospital mortality. Model calibration and discrimination assessed via Hosmer-Lemeshow goodness-of-fit (GOF) and C-statistics respectively. Primary cohort subset into risk deciles and observed versus predicted mortality plotted. GOF demonstrated p>0.05 for each cohort demonstrating sound calibration. C-statistic ranged from low of 0.709 (sepsis severity score) to high of 0.838 (Angus cohort) suggesting good to excellent model discrimination. Comparison of observed versus expected mortality was robust although accuracy decreased in highest risk decile. Conclusions Our sepsis severity model and score is a tool that provides reliable risk adjustment for administrative data. PMID:26496452
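For readers unfamiliar with the C-statistic used to judge discrimination here, the sketch below computes it as the share of (death, survivor) pairs in which the patient who died was given the higher predicted risk. This is the standard pairwise definition rather than the authors' code. The function name and the toy inputs are hypothetical.

```python
def c_statistic(predicted_risks, outcomes):
    """Concordance (C) statistic for a binary outcome.

    predicted_risks: predicted probabilities, one per patient
    outcomes: 1 if the event occurred, 0 otherwise
    Ties in predicted risk count as half a concordant pair.
    """
    events = [r for r, y in zip(predicted_risks, outcomes) if y == 1]
    non_events = [r for r, y in zip(predicted_risks, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                concordant += 1.0
            elif e == n:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))


# Toy data (hypothetical): four patients, two of whom died.
toy_c = c_statistic([0.10, 0.40, 0.35, 0.80], [0, 0, 1, 1])
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation. The nested loop is quadratic in cohort size, so cohorts as large as those in this study would call for a rank-based implementation instead.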
Spanish cross-cultural adaptation and psychometric properties of the Schizophrenia Quality of Life short-version questionnaire (SQoL18) in 3 middle-income countries: Bolivia, Chile and Peru.

PubMed

Caqueo-Urízar, Alejandra; Boyer, Laurent; Boucekine, Mohamed; Auquier, Pascal

2014-10-01

The aim of this study was to adapt the Schizophrenia - Quality of Life short-version questionnaire (SQoL18) for use in three middle-income countries in Latin America and to evaluate the factor structure, reliability, and external validity of this questionnaire. The SQoL18 was translated into Spanish using a well-validated forward-backward process. We evaluated the psychometric properties of the SQoL18 in a sample of 253 patients with schizophrenia attending outpatient mental health services in three Latin American countries. For participants in each country (Bolivia, N=83; Chile, N=85; Peru, N=85), psychometric properties were compared to those reported from the reference population (507 patients with schizophrenia) assessed in the validation study. In addition, differential item functioning (DIF) analyses were performed to see whether all items behave in the same way in each country. Factor analysis performed in the 3 countries showed that the questionnaire's structure adequately matched the initial structure of the SQoL18. The unidimensionality of the dimensions was preserved, and the internal/external validity indices were close to those of the reference population. However, one dimension of the SQoL18 (resilience) presented some unsatisfactory properties including low Cronbach's alpha coefficients, one INFIT value higher than 1.2, and one item showing DIF between the 3 countries. These results demonstrate the satisfactory acceptability and psychometric properties of the SQoL18, suggesting the relevance of this questionnaire among patients with schizophrenia in these 3 Latin American countries. Copyright © 2014 Elsevier B.V. All rights reserved.
Different top-down approaches to estimate measurement uncertainty of whole blood tacrolimus mass concentration values.

PubMed

Rigo-Bonnin, Raül; Blanco-Font, Aurora; Canalias, Francesca

2018-05-08

Values of mass concentration of tacrolimus in whole blood are commonly used by clinicians for monitoring the status of a transplant patient and for checking whether the administered dose of tacrolimus is effective. Clinical laboratories must therefore provide results that are as accurate as possible. Measurement uncertainty can help ensure the reliability of these results. The aim of this study was to estimate the measurement uncertainty of whole blood tacrolimus mass concentration values obtained by UHPLC-MS/MS using two top-down approaches: the single laboratory validation approach and the proficiency testing approach. For the single laboratory validation approach, we estimated the uncertainties associated with the intermediate imprecision (using long-term internal quality control data) and the bias (using a certified reference material). Next, we combined them with the uncertainties related to the calibrator-assigned values to obtain a combined uncertainty and, finally, to calculate the expanded uncertainty. For the proficiency testing approach, the uncertainty was estimated in a similar way to the single laboratory validation approach, but using data from internal and external quality control schemes to estimate the uncertainty related to the bias. The estimated expanded uncertainties for single laboratory validation and for proficiency testing using internal and external quality control schemes were 11.8%, 13.2%, and 13.0%, respectively. After performing the two top-down approaches, we observed that their uncertainty results were quite similar. This would confirm that either of the two approaches could be used to estimate the measurement uncertainty of whole blood tacrolimus mass concentration values in clinical laboratories. Copyright © 2018 The Canadian Society of Clinical Chemists. Published by Elsevier Inc. All rights reserved.
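The abstract gives only the final expanded uncertainties, so the sketch below illustrates the general arithmetic rather than the authors' calculation. It assumes the components are independent relative standard uncertainties, combines them in quadrature, and applies a coverage factor k = 2. The function names, the component values and the choice of k are assumptions.

```python
from math import sqrt


def combined_uncertainty(*components):
    """Root-sum-of-squares of independent standard uncertainties (same units)."""
    return sqrt(sum(u * u for u in components))


def expanded_uncertainty(components, k=2.0):
    """Expanded uncertainty U = k * u_c; k = 2 is an assumed coverage factor."""
    return k * combined_uncertainty(*components)


# Hypothetical relative standard uncertainties, in percent:
u_imprecision = 4.0   # intermediate imprecision (assumed value)
u_bias = 3.0          # bias (assumed value)
u_calibrator = 2.0    # calibrator-assigned values (assumed value)
U = expanded_uncertainty([u_imprecision, u_bias, u_calibrator])
```

Because the components add in quadrature, the largest one dominates the result, so reducing a small component changes the expanded uncertainty very little.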
Predicting the Individual Risk of Acute Severe Colitis at Diagnosis

PubMed Central

Cesarini, Monica; Collins, Gary S.; Rönnblom, Anders; Santos, Antonieta; Wang, Lai Mun; Sjöberg, Daniel; Parkes, Miles; Keshav, Satish

2017-01-01

Abstract Background and Aims: Acute severe colitis [ASC] is associated with major morbidity. We aimed to develop and externally validate an index that predicted ASC within 3 years of diagnosis. Methods: The development cohort included patients aged 16–89 years, diagnosed with ulcerative colitis [UC] in Oxford and followed for 3 years. Primary outcome was hospitalization for ASC, excluding patients admitted within 1 month of diagnosis. Multivariable logistic regression examined the adjusted association of seven risk factors with ASC. Backwards elimination produced a parsimonious model that was simplified to create an easy-to-use index. External validation occurred in separate cohorts from Cambridge, UK, and Uppsala, Sweden. Results: The development cohort [Oxford] included 34/111 patients who developed ASC within a median 14 months [range 1–29]. The final model applied the sum of 1 point each for extensive disease, C-reactive protein [CRP] > 10mg/l, or haemoglobin < 12g/dl F or < 14g/dl M at diagnosis, to give a score from 0/3 to 3/3. This predicted a 70% risk of developing ASC within 3 years [score 3/3]. Validation cohorts included different proportions with ASC [Cambridge = 25/96; Uppsala = 18/298]. Of those scoring 3/3 at diagnosis, 18/18 [Cambridge] and 12/13 [Uppsala] subsequently developed ASC. Discriminant ability [c-index, where 1.0 = perfect discrimination] was 0.81 [Oxford], 0.95 [Cambridge], 0.97 [Uppsala]. Internal validation using bootstrapping showed good calibration, with similar predicted risk across all cohorts. A nomogram predicted individual risk. Conclusions: An index applied at diagnosis reliably predicts the risk of ASC within 3 years in different populations. Patients with a score 3/3 at diagnosis may merit early immunomodulator therapy. PMID:27647858
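The scoring rule stated in the abstract can be written as a short function. This is a sketch of the rule as worded there, not the authors' implementation or nomogram. The function name, parameter names and the "F"/"M" sex coding are hypothetical.

```python
def asc_index(extensive_disease, crp_mg_per_l, haemoglobin_g_per_dl, sex):
    """Score from 0 to 3: one point each for extensive disease,
    CRP > 10 mg/l, and haemoglobin < 12 g/dl (F) or < 14 g/dl (M) at diagnosis.
    """
    if sex not in ("F", "M"):
        raise ValueError("sex must be 'F' or 'M'")
    haemoglobin_threshold = 12.0 if sex == "F" else 14.0
    score = 0
    if extensive_disease:
        score += 1
    if crp_mg_per_l > 10.0:
        score += 1
    if haemoglobin_g_per_dl < haemoglobin_threshold:
        score += 1
    return score


# Hypothetical patient: extensive disease, CRP 15 mg/l, haemoglobin 11 g/dl, female.
example_score = asc_index(True, 15.0, 11.0, "F")
```

Only the top score is tied to a stated risk in the abstract (about 70% within 3 years for 3/3). Risks for lower scores would have to come from the full paper.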

Validity and Reliability of Field-Based Measures for Assessing Movement Skill Competency in Lifelong Physical Activities: A Systematic Review.

PubMed

Hulteen, Ryan M; Lander, Natalie J; Morgan, Philip J; Barnett, Lisa M; Robertson, Samuel J; Lubans, David R

2015-10-01

It has been suggested that young people should develop competence in a variety of 'lifelong physical activities' to ensure that they can be active across the lifespan. The primary aim of this systematic review is to report the methodological properties, validity, reliability, and test duration of field-based measures that assess movement skill competency in lifelong physical activities. A secondary aim was to clearly define those characteristics unique to lifelong physical activities. A search of four electronic databases (Scopus, SPORTDiscus, ProQuest, and PubMed) was conducted between June 2014 and April 2015 with no date restrictions. Studies addressing the validity and/or reliability of lifelong physical activity tests were reviewed. Included articles were required to assess lifelong physical activities using process-oriented measures, as well as report either one type of validity or reliability. Assessment criteria for methodological quality were adapted from a checklist used in a previous review of sport skill outcome assessments. Movement skill assessments for eight different lifelong physical activities (badminton, cycling, dance, golf, racquetball, resistance training, swimming, and tennis) in 17 studies were identified for inclusion. Methodological quality, validity, reliability, and test duration (time to assess a single participant), for each article were assessed. Moderate to excellent reliability results were found in 16 of 17 studies, with 71% reporting inter-rater reliability and 41% reporting intra-rater reliability. Only four studies in this review reported test-retest reliability. Ten studies reported validity results; content validity was cited in 41% of these studies. Construct validity was reported in 24% of studies, while criterion validity was only reported in 12% of studies. Numerous assessments for lifelong physical activities may exist, yet only assessments for eight lifelong physical activities were included in this review. 
Generalizability of results may be more applicable if more heterogeneous samples are used in future research. Moderate to excellent levels of inter- and intra-rater reliability were reported in the majority of studies. However, future work should look to establish test-retest reliability. Validity was less commonly reported than reliability, and further types of validity other than content validity need to be established in future research. Specifically, predictive validity of 'lifelong physical activity' movement skill competency is needed to support the assertion that such activities provide the foundation for a lifetime of activity.
Validity and Reliability of Turkish Male Breast Self-Examination Instrument.

PubMed

Erkin, Özüm; Göl, İlknur

2018-04-01

This study aims to measure the validity and reliability of the Turkish male breast self-examination (MBSE) instrument. The methodological study was performed in 2016 at Ege University, Faculty of Nursing, İzmir, Turkey. The MBSE includes ten steps. For the validity studies, face validity, content validity, and construct validity (exploratory factor analysis) were assessed. For the reliability study, the Kuder-Richardson coefficient was calculated. The content validity index was found to be 0.94. Kendall's W coefficient was 0.80 (p=0.551). The total variance explained by the two factors was found to be 63.24%. Kuder-Richardson 21 was calculated for the reliability study and found to be 0.97 for the instrument. The final instrument included 10 steps and two stages. The Turkish version of the MBSE is a valid and reliable instrument for early diagnosis. The MBSE can be used in Turkish-speaking countries and cultures with two stages and 10 steps.
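Kuder-Richardson 21 needs only the number of dichotomous items and the mean and variance of respondents' total scores. The sketch below applies the usual textbook formula. It uses the population variance, which is one of two conventions in use, and the abstract does not say which the authors applied. The function name and the sample scores are hypothetical.

```python
from statistics import mean, pvariance


def kr21(total_scores, n_items):
    """Kuder-Richardson formula 21 for a test of n_items yes/no items.

    KR-21 = k/(k-1) * (1 - M*(k - M) / (k * s^2)),
    where M and s^2 are the mean and (population) variance of total scores.
    """
    if n_items < 2:
        raise ValueError("need at least two items")
    m = mean(total_scores)
    var = pvariance(total_scores)
    if var == 0:
        raise ValueError("total scores have zero variance")
    return (n_items / (n_items - 1)) * (1 - m * (n_items - m) / (n_items * var))


# Hypothetical total scores on a 10-step checklist:
reliability = kr21([2, 4, 6, 8, 10], n_items=10)
```

KR-21 assumes all items are equally difficult, so it generally gives a lower bound on the value KR-20 would produce for the same data.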
Multidimensional Fatigue Inventory: Spanish adaptation and psychometric properties for fibromyalgia patients. The Al-Andalus study.

PubMed

Munguía-Izquierdo, Diego; Segura-Jiménez, Victor; Camiletti-Moirón, Daniel; Pulido-Martos, Manuel; Alvarez-Gallardo, Inmaculada C; Romero, Alejandro; Aparicio, Virginia A; Carbonell-Baeza, Ana; Delgado-Fernández, Manuel

2012-01-01

The aim of this study was to assess the psychometric properties and transcultural adaptation into Spanish of the Multidimensional Fatigue Inventory in fibromyalgia patients. The Spanish version of the Multidimensional Fatigue Inventory (MFI-S) was translated and cognitively pretested following cross-cultural adaptation guidelines. Test-retest reliability, convergent validity, and operational qualities were evaluated in a total of 116 fibromyalgia patients. Convergent validity was assessed comparing MFI-S with a visual analogue scale for global fatigue. The intra-class correlation coefficients varied from moderate to excellent (from 0.64 to 0.91) and the standard errors of the mean ranged from 0.5 to 1.1 points for the five MFI-S domains. The coefficient of repeatability was less than 2 standard deviations and the limits of agreement ranged from 2 to 4 points for the MFI-S domains. A weak to fair significant relationship was found between each MFI-S domain and the visual analogue scale (from 0.21 to 0.32). The mean time required to complete the MFI-S was 3.2±2.0 minutes. None of the patients needed external help to complete the MFI-S, and there were very few missing values. The MFI-S developed in this study presents a good reliability and reasonable construct validity for Spanish fibromyalgia patients unaffected by cognitive dysfunction and severe depression. This questionnaire is quick, easy to administer and interpret.
A medical school's organizational readiness for curriculum change (MORC): development and validation of a questionnaire.

PubMed

Jippes, Mariëlle; Driessen, Erik W; Broers, Nick J; Majoor, Gerard D; Gijselaers, Wim H; van der Vleuten, Cees P M

2013-09-01

Because successful change implementation depends on organizational readiness for change, the authors developed and assessed the validity of a questionnaire, based on a theoretical model of organizational readiness for change, designed to measure, specifically, a medical school's organizational readiness for curriculum change (MORC). In 2012, a panel of medical education experts judged and adapted a preliminary MORC questionnaire through a modified Delphi procedure. The authors administered the resulting questionnaire to medical school faculty involved in curriculum change and tested the psychometric properties using exploratory and confirmatory factor analysis, and generalizability analysis. The mean relevance score of the Delphi panel (n = 19) reached 4.2 on a five-point Likert-type scale (1 = not relevant and 5 = highly relevant) in the second round, meeting predefined criteria for completing the Delphi procedure. Faculty (n = 991) from 131 medical schools in 56 countries completed MORC. Exploratory factor analysis yielded three underlying factors (motivation, capability, and external pressure) in 12 subscales with 53 items. The scale structure suggested by exploratory factor analysis was confirmed by confirmatory factor analysis. Cronbach alpha ranged from 0.67 to 0.92 for the subscales. Generalizability analysis showed that the MORC results of 5 to 16 faculty members can reliably evaluate a school's organizational readiness for change. MORC is a valid, reliable questionnaire for measuring organizational readiness for curriculum change in medical schools. It can identify which elements in a change process require special attention so as to increase the chance of successful implementation.
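Cronbach's alpha, reported here per subscale, compares the summed variance of the individual items with the variance of respondents' total scores. The sketch below uses the standard formula on a respondents-by-items table. The function name and the toy responses are hypothetical and are not taken from the MORC data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    alpha = k/(k-1) * (1 - sum(item variances) / variance(total scores))
    """
    n_items = len(responses[0])
    if n_items < 2:
        raise ValueError("need at least two items")
    item_variances = [
        variance([row[i] for row in responses]) for i in range(n_items)
    ]
    total_variance = variance([sum(row) for row in responses])
    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)


# Toy data (hypothetical): four respondents answering three Likert items identically.
alpha_identical_items = cronbach_alpha([[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4]])
```

When every item ranks respondents identically, as in the toy data, alpha reaches its maximum of 1. Values fall as items agree less with one another.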
Using Student Video Cases to Assess Pre-service Elementary Teachers' Engineering Teaching Responsiveness

NASA Astrophysics Data System (ADS)

Dalvi, Tejaswini; Wendell, Kristen

2017-10-01

Our study addresses the need for new approaches to prepare novice elementary teachers to teach both science and engineering, and for new tools to measure how well those approaches are working. This in particular would inform the teacher educators of the extent to which novice teachers are developing expertise in facilitating their students' engineering design work. One important dimension to measure is novice teachers' abilities to notice the substance of student thinking and to respond in productive ways. This teacher noticing is particularly important in science and engineering education, where students' initial, idiosyncratic ideas and practices influence the likelihood that particular instructional strategies will help them learn. This paper describes evidence of validity and reliability for the Video Case Diagnosis (VCD) task, a new instrument for measuring pre-service elementary teachers' engineering teaching responsiveness. To complete the VCD, participants view a 6-min video episode of children solving an engineering design problem, describe in writing what they notice about the students' science ideas and engineering practices, and propose how a teacher could productively respond to the students. The rubric for scoring VCD responses allowed two independent scorers to achieve inter-rater reliability. Content analysis of the video episode, systematic review of literature on science and engineering practices, and solicitation of external expert educator responses establish content validity for VCD. Field test results with three different participant groups who have different levels of engineering education experience offer evidence of construct validity.
Psychometric instrumentation: reliability and validity of instruments used for clinical practice, evidence-based practice projects and research studies.

PubMed

Mayo, Ann M

2015-01-01

It is important for CNSs and other APNs to consider the reliability and validity of instruments chosen for clinical practice, evidence-based practice projects, or research studies. Psychometric testing uses specific research methods to evaluate the amount of error associated with any particular instrument. Reliability estimates explain more about how well the instrument is designed, whereas validity estimates explain more about scores that are produced by the instrument. An instrument may be architecturally sound overall (reliable), but the same instrument may not be valid. For example, if a specific group does not understand certain well-constructed items, then the instrument does not produce valid scores when used with that group. Many instrument developers may conduct reliability testing only once, yet continue validity testing in different populations over many years. All CNSs should be advocating for the use of reliable instruments that produce valid results. Clinical nurse specialists may find themselves in situations where reliability and validity estimates for some instruments that are being utilized are unknown. In such cases, CNSs should engage key stakeholders to sponsor nursing researchers to pursue this most important work.
Internal Motion Estimation by Internal-external Motion Modeling for Lung Cancer Radiotherapy.

PubMed

Chen, Haibin; Zhong, Zichun; Yang, Yiwei; Chen, Jiawei; Zhou, Linghong; Zhen, Xin; Gu, Xuejun

2018-02-27

The aim of this study is to develop an internal-external correlation model for internal motion estimation for lung cancer radiotherapy. Deformation vector fields that characterize the internal-external motion are obtained by respectively registering the internal organ meshes and external surface meshes from the 4DCT images via a recently developed local topology preserved non-rigid point matching algorithm. A composite matrix is constructed by combining the estimated internal phasic DVFs with external phasic and directional DVFs. Principal component analysis is then applied to the composite matrix to extract principal motion characteristics, and generate model parameters to correlate the internal-external motion. The proposed model is evaluated on a 4D NURBS-based cardiac-torso (NCAT) synthetic phantom and 4DCT images from five lung cancer patients. For tumor tracking, the center of mass errors of the tracked tumor are 0.8(±0.5)mm/0.8(±0.4)mm for synthetic data, and 1.3(±1.0)mm/1.2(±1.2)mm for patient data in the intra-fraction/inter-fraction tracking, respectively. For lung tracking, the percent errors of the tracked contours are 0.06(±0.02)/0.07(±0.03) for synthetic data, and 0.06(±0.02)/0.06(±0.02) for patient data in the intra-fraction/inter-fraction tracking, respectively. The extensive validations have demonstrated the effectiveness and reliability of the proposed model in motion tracking for both the tumor and the lung in lung cancer radiotherapy.
External validation of a Cox prognostic model: principles and methods

PubMed Central

2013-01-01

Background: A prognostic model should not enter clinical practice unless it has been demonstrated that it performs a useful role. External validation denotes evaluation of model performance in a sample independent of that used to develop the model. Unlike for logistic regression models, external validation of Cox models is sparsely treated in the literature. Successful validation of a model means achieving satisfactory discrimination and calibration (prediction accuracy) in the validation sample. Validating Cox models is not straightforward because event probabilities are estimated relative to an unspecified baseline function. Methods: We describe statistical approaches to external validation of a published Cox model according to the level of published information, specifically (1) the prognostic index only, (2) the prognostic index together with Kaplan-Meier curves for risk groups, and (3) the first two plus the baseline survival curve (the estimated survival function at the mean prognostic index across the sample). The most challenging task, requiring level 3 information, is assessing calibration, for which we suggest a method of approximating the baseline survival function. Results: We apply the methods to two comparable datasets in primary breast cancer, treating one as derivation and the other as validation sample. Results are presented for discrimination and calibration. We demonstrate plots of survival probabilities that can assist model evaluation. Conclusions: Our validation methods are applicable to a wide range of prognostic studies and provide researchers with a toolkit for external validation of a published Cox model. PMID:23496923
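To make the three levels of published information more concrete, the sketch below shows how a prognostic index and a baseline survival value combine into a predicted survival probability under the standard Cox relationship S(t | PI) = S0(t)^exp(PI - mean PI). It is not the authors' code. The coefficients, covariate values, baseline survival, and function names are hypothetical.

```python
import math


def prognostic_index(coefficients, covariates):
    """Linear predictor of a Cox model: sum of coefficient * covariate value."""
    if len(coefficients) != len(covariates):
        raise ValueError("coefficients and covariates must have equal length")
    return sum(b * x for b, x in zip(coefficients, covariates))


def predicted_survival(baseline_survival, pi, pi_mean):
    """S(t | PI) = S0(t) ** exp(PI - PI_mean).

    baseline_survival is S0(t), the survival probability at time t for a
    patient whose prognostic index equals the sample mean PI_mean.
    """
    return baseline_survival ** math.exp(pi - pi_mean)


# Hypothetical published model with two covariates (log hazard ratios)
coefs = [0.5, -0.2]
patient = [2.0, 1.0]  # covariate values of one validation-sample patient
pi = prognostic_index(coefs, patient)
surv = predicted_survival(0.80, pi, pi_mean=0.5)
print(f"PI = {pi:.2f}, predicted survival = {surv:.3f}")
```

A patient whose index exceeds the mean gets a predicted survival below the baseline value. Comparing such predictions with observed Kaplan-Meier estimates in the validation sample is one way to inspect calibration.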
Concordance and Reliability of Photogrammetric Protocols for Measuring the Cervical Lordosis Angle: A Systematic Review of the Literature.

PubMed

de Albuquerque, Priscila Maria Nascimento Martins; de Alencar, Geisa Guimarães; de Oliveira, Daniela Araújo; de Siqueira, Gisela Rocha

2018-01-01

The aim of this study was to examine and interpret the concordance, accuracy, and reliability of photogrammetric protocols available in the literature for evaluating cervical lordosis in an adult population aged 18 to 59 years. A systematic search of 6 electronic databases (MEDLINE via PubMed, LILACS, CINAHL, Scopus, ScienceDirect, and Web of Science) located studies that assessed the reliability and/or concordance and/or accuracy of photogrammetric protocols for evaluating cervical lordosis, compared with radiography. Articles published through April 2016 were selected. Two independent reviewers used a critical appraisal tool (QUADAS and QAREL) to assess the quality of the selected studies. Two studies were included in the review and had high levels of reliability (intraclass correlation coefficient: 0.974-0.98). Only 1 study assessed the concordance between the methods, which was calculated using Pearson's correlation coefficient. To date, the accuracy of photogrammetry has not been investigated thoroughly. We encountered no study in the literature that investigated the accuracy of photogrammetry in diagnosing hyperlordosis of cervical spine. However, both current studies report high levels of intra- and interrater reliability. To increase the level of evidence of photogrammetry in the evaluation of cervical lordosis, it is necessary to conduct further studies using a larger sample to increase the external validity of the findings. Copyright © 2018. Published by Elsevier Inc.
Real-time sonoelastography using an external reference material: test-retest reliability of healthy Achilles tendons.

PubMed

Schneebeli, Alessandro; Del Grande, Filippo; Vincenzo, Gabriele; Cescon, Corrado; Clijsen, Ron; Biordi, Fulvio; Barbero, Marco

2016-08-01

To establish the test-retest reliability of sonoelastography (SE) on healthy Achilles tendons in contracted and relaxed states using an external reference system. Forty-eight Achilles tendons from 24 healthy volunteers were assessed using ultrasound and real-time SE with an external reference material. Tendons were analyzed under relaxed and contracted conditions. Strain ratios between the tendons and the reference material were calculated. The intraclass correlation coefficient (ICC2.k) and Bland-Altman plot were used to assess test-retest reliability. The reliability of SE measurements under relaxed conditions ranged from high to very high, with an ICC2.k of 0.84 (95 % CI: 0.64-0.92) for reference material, 0.91 (95 % CI: 0.83-0.95) for Achilles tendons and 0.95 (95 % CI: 0.91-0.97) for Kager fat pads (KFP). The ICC2.k value for skin was 0.30 (95 % CI: -0.26 to 0.61). Reliability for measurements in the contracted state ranged from high to very high, with an ICC2.k of 0.93 (95 % CI: 0.87-0.96) for reference material, 0.72 (95 % CI: 0.50-0.84) for skin, 0.93 (95 % CI: 0.87-0.96) for Achilles tendons, and 0.81 (95 % CI: 0.66-0.89) for KFP. Reliability of the strain ratio (tendon/reference) under relaxed conditions was high with an ICC2.k of 0.87 (95 % CI: 0.75-0.93), and in the contracted state, it was very high with an ICC2.k of 0.94 (95 % CI: 0.90-0.97). Sonoelastography using an external reference material is a reliable and simple technique for the assessment of the elasticity of healthy Achilles tendons. The use of an external material as a reference, along with strain ratios, could provide a quantitative measure of elasticity.
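The two statistics named in this abstract can be sketched in a few lines of Python. The code below is an illustrative implementation, not the authors' analysis: it computes ICC(2,k) from two-way ANOVA mean squares and Bland-Altman limits of agreement (bias ± 1.96 SD of the differences). The strain-ratio values and function names are hypothetical.

```python
from statistics import mean, stdev


def icc_2k(data):
    """ICC(2,k): two-way random effects, absolute agreement, average of k sessions.

    data is a list of subjects, each a list of k measurements.
    """
    n, k = len(data), len(data[0])
    grand = sum(map(sum, data)) / (n * k)
    row_means = [sum(r) / k for r in data]
    col_means = [sum(r[j] for r in data) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for r in data for x in r)
    ss_error = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)               # between-subject mean square
    msc = ss_cols / (k - 1)               # between-session mean square
    mse = ss_error / ((n - 1) * (k - 1))  # residual mean square
    return (msr - mse) / (msr + (msc - mse) / n)


def bland_altman(first, second):
    """Return (bias, lower limit, upper limit) of agreement."""
    diffs = [a - b for a, b in zip(first, second)]
    bias, sd = mean(diffs), stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical strain ratios for five tendons measured in two sessions
session1 = [1.10, 0.95, 1.30, 1.05, 1.20]
session2 = [1.12, 0.93, 1.27, 1.08, 1.18]
icc = icc_2k([list(pair) for pair in zip(session1, session2)])
bias, lower, upper = bland_altman(session1, session2)
print(f"ICC(2,k) = {icc:.3f}; bias = {bias:.3f} [{lower:.3f}, {upper:.3f}]")
```

The sketch does not guard against degenerate input such as identical scores for every subject, where the mean squares are zero.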
Reliability and validity of generalizable skills instruments for students who are deaf, blind, or visually impaired.

PubMed

Loeding, B L; Greenan, J P

1998-12-01

The study examined the validity and reliability of four assessments, with three instruments per domain. Domains included generalizable mathematics, communication, interpersonal relations, and reasoning skills. Participants were deaf, legally blind, or visually impaired students enrolled in vocational classes at residential secondary schools. The researchers estimated the internal consistency reliability, test-retest reliability, and construct validity correlations of three subinstruments: student self-ratings, teacher ratings, and performance assessments. The data suggest that these instruments are highly internally consistent measures of generalizable vocational skills. Four performance assessments have high-to-moderate test-retest reliability estimates, and were generally considered to possess acceptable validity and reliability.
Internal Consistency, Retest Reliability, and their Implications For Personality Scale Validity

PubMed Central

McCrae, Robert R.; Kurtz, John E.; Yamagata, Shinji; Terracciano, Antonio

2010-01-01

We examined data (N = 34,108) on the differential reliability and validity of facet scales from the NEO Inventories. We evaluated the extent to which (a) psychometric properties of facet scales are generalizable across ages, cultures, and methods of measurement; and (b) validity criteria are associated with different forms of reliability. Composite estimates of facet scale stability, heritability, and cross-observer validity were broadly generalizable. Two estimates of retest reliability were independent predictors of the three validity criteria; none of three estimates of internal consistency was. Available evidence suggests the same pattern of results for other personality inventories. Internal consistency of scales can be useful as a check on data quality, but appears to be of limited utility for evaluating the potential validity of developed scales, and it should not be used as a substitute for retest reliability. Further research on the nature and determinants of retest reliability is needed. PMID:20435807
Meal size of high-fat food is reliably greater than high-carbohydrate food across externally-evoked single-meal tests and long-term spontaneous feeding in rat.

PubMed

Synowski, Stephen J; Smart, Andrew B; Warwick, Zoe S

2005-10-01

A series of studies in rat using isoenergetic (kcal/ml) liquid diets differing in fat content has previously found dietary fat to dose-dependently increase daily caloric intake. In single-meal tests in which meal initiation was externally evoked in feeding-associated environments, the behavioral expression of this overeating was found to be larger meal intake. The present studies confirmed the ecological validity of this larger meal size of high-fat diet (HF) relative to high-carbohydrate diet (HC): meal size of HF>HC in home-cage testing (Experiment 1), and during undisturbed, spontaneous feeding in which ingestive behavior was continuously monitored (Experiments 2 and 3). These findings demonstrate that single-meal paradigms yield results consistent with spontaneous feeding of high-fat and high-carbohydrate liquid diets, thus supporting the use of single-meal studies to better understand the physiological bases of elevated caloric intake associated with chronic consumption of a high-fat diet.
Validation of the Eating Pattern Inventory for Children in a General Population Sample of 11- to 12-Year-Old Children.

PubMed

Munkholm, Anja; Bjorner, Jakob B; Petersen, Janne; Micali, Nadia; Olsen, Else Marie; Skovgaard, Anne Mette

2017-09-01

Previous research suggests that the Eating Pattern Inventory for Children (EPI-C) is best conceptualized as comprising four factors: dietary restraint, emotional, external eating and parental pressure to eat. This study aims to examine the psychometric properties of the EPI-C and to test gender and weight group differences. The population-based study sample comprised 1,939 children aged 11 to 12 years from the Copenhagen Child Cohort (CCC2000). Psychometric properties were evaluated using multigroup categorical data in confirmatory factor analysis (CFA) and differential item functioning (DIF) tests. CFA supported the four-factor solution for the EPI-C. Reliability estimates were satisfactory for three of the four scales. DIF with regard to weight was found for an item on weight loss intention. Girls reported higher restrained and emotional eating; overweight children reported higher restrained, emotional and external eating, while underweight children reported higher parental pressure to eat. The results support the use of EPI-C for measuring eating behaviors in preadolescence.
Quantitative Determination of Fusarium proliferatum Concentration in Intact Garlic Cloves Using Near-Infrared Spectroscopy.

PubMed

Tamburini, Elena; Mamolini, Elisabetta; De Bastiani, Morena; Marchetti, Maria Gabriella

2016-07-15

Fusarium proliferatum is considered to be a pathogen of many economically important plants, including garlic. The objective of this research was to apply near-infrared spectroscopy (NIRS) to rapidly determine fungal concentration in intact garlic cloves, avoiding the laborious and time-consuming procedures of traditional assays. Preventive detection of infection before seeding is of great interest for farmers, because it could avoid serious losses of yield during harvesting and storage. Spectra were collected on 95 garlic cloves, divided in five classes of infection (from 1-healthy to 5-very highly infected) in the range of fungal concentration 0.34-7231.15 ppb. Calibration and cross validation models were developed with partial least squares regression (PLSR) on pretreated spectra (standard normal variate, SNV, and derivatives), providing good accuracy in prediction, with a coefficient of determination (R²) of 0.829 and 0.774, respectively, a standard error of calibration (SEC) of 615.17 ppb, and a standard error of cross validation (SECV) of 717.41 ppb. The calibration model was then used to predict fungal concentration in unknown samples, peeled and unpeeled. The results showed that NIRS could be used as a reliable tool to directly detect and quantify F. proliferatum infection in peeled intact garlic cloves, but the presence of the external peel strongly affected the prediction reliability.
An integrated analysis for determining the geographical origin of medicinal herbs using ICP-AES/ICP-MS and (1)H NMR analysis.

PubMed

Kwon, Yong-Kook; Bong, Yeon-Sik; Lee, Kwang-Sik; Hwang, Geum-Sook

2014-10-15

ICP-MS and (1)H NMR are commonly used to determine the geographical origin of food and crops. In this study, data from multielemental analysis performed by ICP-AES/ICP-MS and metabolomic data obtained from (1)H NMR were integrated to improve the reliability of determining the geographical origin of medicinal herbs. Astragalus membranaceus and Paeonia albiflora with different origins in Korea and China were analysed by (1)H NMR and ICP-AES/ICP-MS, and an integrated multivariate analysis was performed to characterise the differences between their origins. Four classification methods were applied: linear discriminant analysis (LDA), k-nearest neighbour classification (KNN), support vector machines (SVM), and partial least squares-discriminant analysis (PLS-DA). Results were compared using leave-one-out cross-validation and external validation. The integration of multielemental and metabolomic data was more suitable for determining geographical origin than the use of each individual data set alone. The integration of the two analytical techniques allowed diverse environmental factors such as climate and geology, to be considered. Our study suggests that an appropriate integration of different types of analytical data is useful for determining the geographical origin of food and crops with a high degree of reliability. Copyright © 2014 Elsevier Ltd. All rights reserved.
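Leave-one-out cross-validation, one of the two evaluation schemes mentioned in this abstract, can be illustrated with a small k-nearest-neighbour classifier. The sketch below is a generic illustration using only the Python standard library. The feature values, labels, and function names are hypothetical and do not reproduce the study's data or software.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Majority vote among the k training samples nearest to query."""
    nearest = sorted(train, key=lambda s: math.dist(s[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def loocv_accuracy(samples, k=3):
    """Leave-one-out: predict each sample from all remaining samples."""
    hits = 0
    for i, (features, label) in enumerate(samples):
        train = samples[:i] + samples[i + 1:]
        if knn_predict(train, features, k) == label:
            hits += 1
    return hits / len(samples)


# Hypothetical two-feature measurements labelled by geographical origin
samples = [
    ((1.0, 1.1), "Korea"), ((0.9, 1.0), "Korea"), ((1.2, 0.8), "Korea"),
    ((3.0, 3.2), "China"), ((3.1, 2.9), "China"), ((2.8, 3.0), "China"),
]
accuracy = loocv_accuracy(samples, k=1)
print(f"LOOCV accuracy = {accuracy:.2f}")
```

External validation differs in that the classifier is trained once on all development samples and then scored on samples that played no part in model building.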
Derivation and external validation of a case mix model for the standardized reporting of 30-day stroke mortality rates.

PubMed

Bray, Benjamin D; Campbell, James; Cloud, Geoffrey C; Hoffman, Alex; James, Martin; Tyrrell, Pippa J; Wolfe, Charles D A; Rudd, Anthony G

2014-11-01

Case mix adjustment is required to allow valid comparison of outcomes across care providers. However, there is a lack of externally validated models suitable for use in unselected stroke admissions. We therefore aimed to develop and externally validate prediction models to enable comparison of 30-day post-stroke mortality outcomes using routine clinical data. Models were derived (n=9000 patients) and internally validated (n=18 169 patients) using data from the Sentinel Stroke National Audit Program, the national register of acute stroke in England and Wales. External validation (n=1470 patients) was performed in the South London Stroke Register, a population-based longitudinal study. Models were fitted using general estimating equations. Discrimination and calibration were assessed using receiver operating characteristic curve analysis and correlation plots. Two final models were derived. Model A included age (<60, 60-69, 70-79, 80-89, and ≥90 years), National Institutes of Health Stroke Severity Score (NIHSS) on admission, presence of atrial fibrillation on admission, and stroke type (ischemic versus primary intracerebral hemorrhage). Model B was similar but included only the consciousness component of the NIHSS in place of the full NIHSS. Both models showed excellent discrimination and calibration in internal and external validation. The c-statistics in external validation were 0.87 (95% confidence interval, 0.84-0.89) and 0.86 (95% confidence interval, 0.83-0.89) for models A and B, respectively. We have derived and externally validated 2 models to predict mortality in unselected patients with acute stroke using commonly collected clinical variables. In settings where the ability to record the full NIHSS on admission is limited, the level of consciousness component of the NIHSS provides a good approximation of the full NIHSS for mortality prediction. © 2014 American Heart Association, Inc.
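The c-statistic used here to summarize discrimination for a binary outcome can be computed directly from its definition: the share of (died, survived) patient pairs in which the patient who died was assigned the higher predicted risk, with ties counted as one half. The following Python sketch illustrates that definition. The outcome and risk values are hypothetical and the function name is an assumption.

```python
def c_statistic(outcomes, predicted_risk):
    """Concordance (area under the ROC curve) for a binary outcome."""
    cases = [p for y, p in zip(outcomes, predicted_risk) if y == 1]
    controls = [p for y, p in zip(outcomes, predicted_risk) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one event and one non-event")
    score = 0.0
    for risk_case in cases:
        for risk_control in controls:
            if risk_case > risk_control:
                score += 1.0      # concordant pair
            elif risk_case == risk_control:
                score += 0.5      # tied prediction
    return score / (len(cases) * len(controls))


# Hypothetical 30-day mortality (1 = died) and model-predicted risks
died = [1, 0, 1, 0, 0, 1]
risk = [0.80, 0.30, 0.60, 0.65, 0.10, 0.90]
auc = c_statistic(died, risk)
print(f"c-statistic = {auc:.3f}")
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of the two outcome groups.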
Study on the Validity and Reliability of Melbourne Decision Making Scale in Turkey

ERIC Educational Resources Information Center

Çolakkadioglu, Oguzhan; Deniz, M. Engin

2015-01-01

This study is to analyze the validity and reliability of Melbourne Decision Making Questionnaire (MDMQ). The sample consisted of 650 university students. The structural validity of the MDMQ, as well as correlations among its sub-scales, measure-bound validity, internal consistency, item total correlations and test-retest reliability coefficients…
A Model for Estimating the Reliability and Validity of Criterion-Referenced Measures.

ERIC Educational Resources Information Center

Edmonston, Leon P.; Randall, Robert S.

A decision model designed to determine the reliability and validity of criterion referenced measures (CRMs) is presented. General procedures which pertain to the model are discussed as to: Measures of relationship, Reliability, Validity (content, criterion-oriented, and construct validation), and Item Analysis. The decision model is presented in…
Spanish validation of the Domain-Specific Risk-Taking (DOSPERT-30) Scale.

PubMed

Lozano, Luis M; Megías, Alberto; Catena, Andrés; Perales, José C; Baltruschat, Sabina; Cándido, Antonio

2017-02-01

The aim of the present study was to develop and validate a Spanish version of the short Domain-Specific Risk-Taking (DOSPERT-30) scale, measuring risk-taking behavior, risk perception, and expected beneficial consequences (from taking risks) in five life domains: ethics, finance, health/security, recreational, and social decisions. The scale was back-translated, and administered online to 826 participants. Validity evidence was tested using correlations with construct-related instruments (UPPS-P and SSS-V), as well as using factor analysis. Internal consistency reliability was calculated with the ordinal Alpha coefficient, and gender differences were considered. Internal consistency was good, and factor analysis confirmed the five factors proposed by the authors. With respect to the external validity, high correlations with the positive urgency and the sensation seeking subscales of the UPPS-P, as well as with the thrill and adventure seeking and disinhibition subscales of the SSS-V were found. Finally, gender differences were found in all subscales and domains, with men tending to take more risks, perceive less risk and expect more beneficial consequences, except for the social domain where an inverse pattern was found. As these findings are in line with the original version, they indicate the scale was successfully adapted.

Development and validation of the Spanish-English Language Proficiency Scale (SELPS).

PubMed

Smyk, Ekaterina; Restrepo, M Adelaida; Gorin, Joanna S; Gray, Shelley

2013-07-01

This study examined the development and validation of a criterion-referenced Spanish-English Language Proficiency Scale (SELPS) that was designed to assess the oral language skills of sequential bilingual children ages 4-8. This article reports results for the English proficiency portion of the scale. The SELPS assesses syntactic complexity, grammatical accuracy, verbal fluency, and lexical diversity based on 2 story retell tasks. In Study 1, 40 children were given 2 story retell tasks to evaluate the reliability of parallel forms. In Study 2, 76 children participated in the validation of the scale against language sample measures and teacher ratings of language proficiency. Study 1 indicated no significant differences between the SELPS scores on the 2 stories. Study 2 indicated that the SELPS scores correlated significantly with their counterpart language sample measures. Correlations between the SELPS and teacher ratings were moderate. The 2 story retells elicited comparable SELPS scores, providing a valuable tool for test-retest conditions in the assessment of language proficiency. Correlations between the SELPS scores and external variables indicated that these measures assessed the same language skills. Results provided empirical evidence regarding the validity of inferences about language proficiency based on the SELPS score.
Adaptation and Validation of the Basque Version of the Emotional Creativity Inventory in Higher Education.

PubMed

Soroa, Goretti; Aritzeta, Aitor; Balluerka, Nekane; Gorostiaga, Arantxa

2016-06-03

Emotional creativity is defined as the ability to feel and express emotions in a new, effective and authentic way. There are currently no Basque-language self-report instruments to provide valid and reliable measures of this construct. Thus, this paper describes the process of adapting and validating the Emotional Creativity Inventory (ECI) for the Basque-speaking population. The sample was comprised of 594 higher education students (388 women and 206 men) aged between 18 and 32 years old (mean age = 20.47; SD = 2.48). The Basque version of the ECI was administered along with the TMMS-23, NEO PI-R, and PANAS. The results of exploratory and confirmatory factor analyses on the Basque ECI corroborated the original scale's three-factor structure (preparedness, novelty, and effectiveness/authenticity). Those dimensions showed acceptable indexes of internal consistency (α = .80, .83, and .83) and temporal stability (r = .70, .69, and .74). The study also provided some evidence of external validity (p < .05) based on the relationships found between emotional creativity and emotional intelligence, personality, affect, and sex. The Basque ECI can be regarded as a useful tool to evaluate perceived emotional creativity during the preparation and verification phases of the creative process.
Analysis of model development strategies: predicting ventral hernia recurrence.

PubMed

Holihan, Julie L; Li, Linda T; Askenasy, Erik P; Greenberg, Jacob A; Keith, Jerrod N; Martindale, Robert G; Roth, J Scott; Liang, Mike K

2016-11-01

There have been many attempts to identify variables associated with ventral hernia recurrence; however, it is unclear which statistical modeling approach results in models with greatest internal and external validity. We aim to assess the predictive accuracy of models developed using five common variable selection strategies to determine variables associated with hernia recurrence. Two multicenter ventral hernia databases were used. Database 1 was randomly split into "development" and "internal validation" cohorts. Database 2 was designated "external validation". The dependent variable for model development was hernia recurrence. Five variable selection strategies were used: (1) "clinical": variables considered clinically relevant; (2) "selective stepwise": all variables with a P value <0.20 were assessed in a step-backward model; (3) "liberal stepwise": all variables were included and step-backward regression was performed; (4) "restrictive internal resampling"; and (5) "liberal internal resampling." Variables were included with P < 0.05 for the Restrictive model and P < 0.10 for the Liberal model. A time-to-event analysis using Cox regression was performed using these strategies. The predictive accuracy of the developed models was tested on the internal and external validation cohorts using Harrell's C-statistic where C > 0.70 was considered "reasonable". The recurrence rate was 32.9% (n = 173/526; median/range follow-up, 20/1-58 mo) for the development cohort, 36.0% (n = 95/264, median/range follow-up 20/1-61 mo) for the internal validation cohort, and 12.7% (n = 155/1224, median/range follow-up 9/1-50 mo) for the external validation cohort. Internal validation demonstrated reasonable predictive accuracy (C-statistics = 0.772, 0.760, 0.767, 0.757, 0.763), while on external validation, predictive accuracy dipped precipitously (C-statistic = 0.561, 0.557, 0.562, 0.553, 0.560).
Predictive accuracy was equally adequate on internal validation among models; however, on external validation, all five models failed to demonstrate utility. Future studies should report multiple variable selection techniques and demonstrate predictive accuracy on external data sets for model validation. Copyright © 2016 Elsevier Inc. All rights reserved.
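Harrell's C-statistic, the accuracy measure used in this abstract, extends concordance to censored time-to-event data. A pair of patients is usable only when the one with the shorter follow-up actually had the event. The Python sketch below is a simplified illustration that skips pairs with tied follow-up times. The follow-up times, recurrence flags, risk scores, and function name are hypothetical.

```python
def harrell_c(times, events, risk_scores):
    """Harrell's concordance index for right-censored data.

    times: follow-up time per patient
    events: 1 if recurrence was observed, 0 if censored
    risk_scores: model output, higher means earlier expected recurrence
    """
    concordant = tied = usable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier event of a pair
        for j in range(n):
            if times[i] < times[j]:  # patient i recurred before j's follow-up ended
                usable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1
                elif risk_scores[i] == risk_scores[j]:
                    tied += 1
    if usable == 0:
        raise ValueError("no usable pairs")
    return (concordant + 0.5 * tied) / usable


# Hypothetical follow-up (months), recurrence flags, and model risk scores
months = [5, 8, 12, 20, 25]
recurred = [1, 1, 0, 1, 0]
risk = [2.1, 1.7, 1.9, 0.8, 0.5]
c_index = harrell_c(months, recurred, risk)
print(f"Harrell's C = {c_index:.3f}")
```

Against the abstract's threshold, a model would be called "reasonable" only if this index exceeds 0.70 on the validation cohort, not merely on the data used to fit it.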
Unavailability of thymidine kinase does not preclude the use of German comprehensive prognostic index: results of an external validation analysis in early chronic lymphocytic leukemia and comparison with MD Anderson Cancer Center model.

PubMed

Molica, Stefano; Giannarelli, Diana; Mirabelli, Rosanna; Levato, Luciano; Russo, Antonio; Linardi, Maria; Gentile, Massimo; Morabito, Fortunato

2016-01-01

A comprehensive prognostic index that includes clinical (i.e., age, sex, ECOG performance status), serum (i.e., ß2-microglobulin, thymidine kinase [TK]), and molecular (i.e., IGVH mutational status, del 17p, del 11q) markers developed by the German CLL Study Group (GCLLSG) was externally validated in a prospective, community-based cohort consisting of 338 patients with early chronic lymphocytic leukemia (CLL) using as endpoint the time to first treatment (TTFT). Because serum TK was not available, a slightly modified version of the model based on seven instead of eight prognostic variables was used. By the German index, 62.9% of patients were scored as having low-risk CLL (score 0-2), whereas 37.1% had intermediate-risk CLL (score 3-5). This stratification translated into a significant difference in the TTFT [HR = 4.21; 95% C.I. (2.71-6.53); P < 0.0001]. The 2007 MD Anderson Cancer Center (MDACC) score, based solely on traditional clinical parameters, also showed comparable reliability [HR = 2.73; 95% C.I. (1.79-4.17); P < 0.0001]. A comparative performance assessment between the two models revealed that prediction of the TTFT was more accurate with the German score. The c-statistic of the MDACC model was 0.65 (range, 0.53-0.78), a level below that of the German index [0.71 (range, 0.60-0.82)] and below the accepted 0.7 threshold necessary to have value at the individual patient level. Results of this external comparative validation analysis strongly support the German score as the benchmark for comparison of any novel prognostic scheme aimed at evaluating the TTFT in patients with early CLL even when a modified version which does not include TK is utilized. © 2015 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Failure of Colorectal Surgical Site Infection Predictive Models Applied to an Independent Dataset: Do They Add Value or Just Confusion?

PubMed

Bergquist, John R; Thiels, Cornelius A; Etzioni, David A; Habermann, Elizabeth B; Cima, Robert R

2016-04-01

Colorectal surgical site infections (C-SSIs) are a major source of postoperative morbidity. Institutional C-SSI rates are modeled and scrutinized, and there is increasing movement in the direction of public reporting. External validation of C-SSI risk prediction models is lacking. Factors governing C-SSI occurrence are complicated and multifactorial. We hypothesized that existing C-SSI prediction models have limited ability to accurately predict C-SSI in independent data. Colorectal resections identified from our institutional ACS-NSQIP dataset (2006 to 2014) were reviewed. The primary outcome was any C-SSI according to the ACS-NSQIP definition. Emergency cases were excluded. Published C-SSI risk scores: the National Nosocomial Infection Surveillance (NNIS), Contamination, Obesity, Laparotomy, and American Society of Anesthesiologists (ASA) class (COLA), Preventie Ziekenhuisinfecties door Surveillance (PREZIES), and NSQIP-based models were compared with receiver operating characteristic (ROC) analysis to evaluate discriminatory quality. There were 2,376 cases included, with an overall C-SSI rate of 9% (213 cases). None of the models produced reliable and high quality C-SSI predictions. For any C-SSI, the NNIS c-index was 0.57 vs 0.61 for COLA, 0.58 for PREZIES, and 0.62 for NSQIP: all well below the minimum "reasonably" predictive c-index of 0.7. Predictions for superficial, deep, and organ space SSI were similarly poor. Published C-SSI risk prediction models do not accurately predict C-SSI in our independent institutional dataset. Application of externally developed prediction models to any individual practice must be validated or modified to account for institution and case-mix specific factors. This questions the validity of using externally or nationally developed models for "expected" outcomes and interhospital comparisons. Copyright © 2016 American College of Surgeons. Published by Elsevier Inc. All rights reserved.
External validation of the Society of Thoracic Surgeons General Thoracic Surgery Database.

PubMed

Magee, Mitchell J; Wright, Cameron D; McDonald, Donna; Fernandez, Felix G; Kozower, Benjamin D

2013-11-01

The Society of Thoracic Surgeons (STS) General Thoracic Surgery Database (GTSD) reports outstanding results for lung and esophageal cancer resection. However, a major weakness of the GTSD has been the lack of validation of this voluntary registry. The purpose of this study was to perform an external, independent audit to assess the accuracy of the data collection process and the quality of the database. An independent firm was contracted to audit 5% of sites randomly selected from the GTSD in 2011. Audits were performed remotely to maximize the number of audits performed and reduce cost. Auditors compared lobectomy cases submitted to the GTSD with the hospital operative logs to evaluate completeness of the data. In addition, 20 lobectomy records from each site were audited in detail. Agreement rates were calculated for 32 individual data elements, 7 data categories pertaining to patient status or care delivery, and an overall agreement rate for each site. Six process variables were also evaluated to assess best practice for data collection and submission. Ten sites were audited from the 222 participants. Comparison of the 559 submitted lobectomy cases with operative logs from each site identified 28 omissions, a 94.6% agreement rate (discrepancies/site range, 2 to 27). Importantly, cases not submitted had no mortality or major morbidity, indicating a lack of purposeful omission. The aggregate agreement rates for all categories were greater than 90%. The overall data accuracy was 94.9%. External audits of the GTSD validate the accuracy and completeness of the data. Careful examination of unreported cases demonstrated no purposeful omission or gaming. Although these preliminary results are quite good, it is imperative that the audit process is refined and continues to expand along with the GTSD to ensure reliability of the database. The audit results are currently being incorporated into educational and quality improvement processes to add further value. Copyright © 2013 The Society of Thoracic Surgeons. Published by Elsevier Inc. All rights reserved.
Patterns and reliability of EEG during error monitoring for internal versus external feedback in schizophrenia.

PubMed

Llerena, Katiah; Wynn, Jonathan K; Hajcak, Greg; Green, Michael F; Horan, William P

2016-07-01

Accurately monitoring one's performance on daily life tasks, and integrating internal and external performance feedback are necessary for guiding productive behavior. Although internal feedback processing, as indexed by the error-related negativity (ERN), is consistently impaired in schizophrenia, initial findings suggest that external performance feedback processing, as indexed by the feedback negativity (FN), may actually be intact. The current study evaluated internal and external feedback processing task performance and test-retest reliability in schizophrenia. 92 schizophrenia outpatients and 63 healthy controls completed a flanker task (ERN) and a time estimation task (FN). Analyses examined the ΔERN and ΔFN defined as difference waves between correct/positive versus error/negative feedback conditions. A temporal principal component analysis was conducted to distinguish the ΔERN and ΔFN from overlapping neural responses. We also assessed test-retest reliability of ΔERN and ΔFN in patients over a 4-week interval. Patients showed reduced ΔERN accompanied by intact ΔFN. In patients, test-retest reliability for both ΔERN and ΔFN over a four-week period was fair to good. Individuals with schizophrenia show a pattern of impaired internal, but intact external, feedback processing. This pattern has implications for understanding the nature and neural correlates of impaired feedback processing in schizophrenia. Published by Elsevier B.V.
Clinical effectiveness of a cognitive behavioral group treatment program for anxiety disorders: a benchmarking study.

PubMed

Oei, Tian P S; Boschen, Mark J

2009-10-01

Previous research has established efficacy of cognitive behavioral therapy (CBT) for anxiety disorders, yet it has not been widely assessed in routine community clinic practices. Efficacy research sacrifices external validity to achieve maximum internal validity. Recently, effectiveness research has been advocated as more ecologically valid for assessing routine clinical work in community clinics. Furthermore, there is a lack of effectiveness research in group CBT. This study aims to extend existing research on the effectiveness of CBT from individual therapy into group therapy delivery. It aimed also to examine outcome using not only symptom measures, but also measures of related symptoms, cognitions, and life quality and satisfaction. Results from a cohort of patients with various anxiety disorders demonstrated that treatment was effective in reducing anxiety symptoms to an extent comparable with other effectiveness studies. Despite this, only 43% of individuals showed reliable change, and 17% were 'recovered' from their anxiety symptoms, and the post-treatment measures were still significantly different from the level of anxiety symptoms observed in the general population.
Validation of the revised Mystical Experience Questionnaire in experimental sessions with psilocybin.

PubMed

Barrett, Frederick S; Johnson, Matthew W; Griffiths, Roland R

2015-11-01

The 30-item revised Mystical Experience Questionnaire (MEQ30) was previously developed within an online survey of mystical-type experiences occasioned by psilocybin-containing mushrooms. The rated experiences occurred on average eight years before completion of the questionnaire. The current paper validates the MEQ30 using data from experimental studies with controlled doses of psilocybin. Data were pooled and analyzed from five laboratory experiments in which participants (n=184) received a moderate to high oral dose of psilocybin (at least 20 mg/70 kg). Results of confirmatory factor analysis demonstrate the reliability and internal validity of the MEQ30. Structural equation models demonstrate the external and convergent validity of the MEQ30 by showing that latent variable scores on the MEQ30 positively predict persisting change in attitudes, behavior, and well-being attributed to experiences with psilocybin while controlling for the contribution of the participant-rated intensity of drug effects. These findings support the use of the MEQ30 as an efficient measure of individual mystical experiences. A method to score a "complete mystical experience" that was used in previous versions of the mystical experience questionnaire is validated in the MEQ30, and a stand-alone version of the MEQ30 is provided for use in future research. © The Author(s) 2015.
Validation of the revised Mystical Experience Questionnaire in experimental sessions with psilocybin

PubMed Central

Barrett, Frederick S; Johnson, Matthew W; Griffiths, Roland R

2016-01-01

The 30-item revised Mystical Experience Questionnaire (MEQ30) was previously developed within an online survey of mystical-type experiences occasioned by psilocybin-containing mushrooms. The rated experiences occurred on average eight years before completion of the questionnaire. The current paper validates the MEQ30 using data from experimental studies with controlled doses of psilocybin. Data were pooled and analyzed from five laboratory experiments in which participants (n=184) received a moderate to high oral dose of psilocybin (at least 20 mg/70 kg). Results of confirmatory factor analysis demonstrate the reliability and internal validity of the MEQ30. Structural equation models demonstrate the external and convergent validity of the MEQ30 by showing that latent variable scores on the MEQ30 positively predict persisting change in attitudes, behavior, and well-being attributed to experiences with psilocybin while controlling for the contribution of the participant-rated intensity of drug effects. These findings support the use of the MEQ30 as an efficient measure of individual mystical experiences. A method to score a “complete mystical experience” that was used in previous versions of the mystical experience questionnaire is validated in the MEQ30, and a stand-alone version of the MEQ30 is provided for use in future research. PMID:26442957
Development and Validation of Participation and Positive Psychologic Function Measures for Stroke Survivors

PubMed Central

Bode, Rita K.; Heinemann, Allen W.; Butt, Zeeshan; Stallings, Jena; Taylor, Caitlin; Rowe, Morgan; Roth, Elliot J.

2013-01-01

Objective: To evaluate the reliability and validity of Neurologic Quality of Life (NeuroQOL) item banks that assess quality-of-life (QOL) domains not typically included in poststroke measures. Design: Secondary analysis of item responses to selected NeuroQOL domains. Setting: Community. Participants: Community-dwelling stroke survivors (n=111) who were at least 12 months poststroke. Interventions: Not applicable. Main Outcome Measures: Five measures developed for 3 NeuroQOL domains: ability to participate in social activities, satisfaction with participation in social activities, and positive psychologic function. Results: A single bank was developed for the positive psychologic function domain, but 2 banks each were developed for the ability-to-participate and satisfaction-with-participation domains. The resulting item banks showed good psychometric properties and external construct validity, with correlations with the legacy instruments ranging from .53 to .71. Using these measures, stroke survivors in this sample reported an overall high level of QOL. Conclusions: The NeuroQOL-derived measures are promising and valid methods for assessing aspects of QOL not typically measured in this population. PMID:20801251
[Validation, adaptation and translation of the MacCAT-T into Spanish: a tool to assess the ability to make health decisions].

PubMed

Hernando Robles, P; Lechuga Pérez, X; Solé Llop, P; Diestre, G; Mariné Torrent, A; Rodríguez Jornet, A; Marquina Parra, D; Colomer Mirabell, O

2012-01-01

Capacity assessment is an essential element of the informed consent process and is the duty of the physician. The MacCAT-T instrument explores four skills needed to consent to a treatment. There is no Spanish version, and the main objective of this work is to validate, adapt and translate the MacCAT-T into Spanish. The MacCAT-T was translated into Spanish and then back-translated into English. It was validated with regard to face and content validity (by 15 experts), construct validity (inter-rater reliability and internal consistency) and criterion validity (comparison of the instrument with an external criterion, in this case the Mini Examen Cognoscitivo de Lobo). Ninety medical and surgical outpatients over 18 years of age were included; none had expressive deficits and/or severe disorders of consciousness that would have prevented them from being interviewed. Results were optimal across the different types of validity. The average application time was between 9 and 13 minutes. Data are consistent with those obtained in other applications of MacCAT-T in the English language and facilitate the provision of a Spanish tool for assessing capacity. Copyright © 2011 SECA. Published by Elsevier Espana. All rights reserved.
Development, reliability, and validity of the My Child's Play (MCP) questionnaire.

PubMed

Schneider, Eleanor; Rosenblum, Sara

2014-01-01

This article describes the development, reliability, and validity of My Child's Play (MCP), a parent questionnaire designed to evaluate the play of children ages 3-9 yr. The first phase of the study determined the questionnaire's content and face validity. Subsequently, the internal reliability consistency and construct and concurrent validity were demonstrated using 334 completed questionnaires. The MCP showed good internal consistency (α = .86). The factor analysis revealed four distinct factors with acceptable levels of internal reliability (Cronbach's αs = .63-.81) and gender- and age-related differences in play characteristics; both findings attest to the tool's construct validity. Significant correlations (r = .33, p < .0001) with the Parent as a Teacher Inventory demonstrate the MCP's concurrent validity. The MCP demonstrated acceptable reliability and validity. It appears to be a promising standardized assessment tool for use in research and practice to promote understanding of a child's play. Copyright © 2014 by the American Occupational Therapy Association, Inc.
[The external evaluation of study quality: the role in maintaining the reliability of laboratory information].

PubMed

Men'shikov, V V

2013-08-01

External quality assessment of clinical laboratory examinations was gradually introduced in USSR medical laboratories from the 1970s. In Russia, in the middle of 1990, a unified national system of external quality assessment was organized, known as the Federal center of external evaluation of quality, based at the laboratory of the state research center of preventive medicine. The main policy positions in this area were clearly formulated in the guidance documents of the Ministry of Health. Today, the center offers 100 or more types of control studies and continually extends their range in response to the interests of different disciplines of clinical medicine. Consistent participation of laboratories in the cycles of external quality assessment improves the trueness and precision of analysis results and increases the reliability of laboratory information. However, a significant percentage of laboratories either do not participate in external quality assessment at all or take part irregularly and in a limited number of tests. The managers of a number of medical organizations disregard the available opportunities to increase the reliability of laboratory information and limit financing of quality control studies. The article proposes adopting a national standard based on ISO 17043 "Conformity assessment: General requirements for proficiency testing".
Soldier Dimensions in Combat Models

DTIC Science & Technology

1990-05-07

and performance. Questionnaires, SQTs, and ARTEPs were often used. Many scales had estimates of reliability but few had validity data. Most studies...pending its validation. Research plans were provided for applications in simulated combat and with simulation devices, for data previously gathered...regarding reliability and validity. Lack of information following an instrument indicates neither reliability nor validity information was provided by the
Validation of the Temps-A in university student population in Serbia.

PubMed

Hinić, Darko; Akiskal, S Hagop; Akiskal, K Kareen; Jović, Jelena; Ignjatović Ristić, Dragana

2013-07-01

The TEMPS-A scale is a self-evaluation measure which assesses five affective temperaments. This study is a comparative analysis of affective temperament types in different educational fields, and the first validation of the Serbian version of the TEMPS-A. The TEMPS-A questionnaire has been adapted following the translation-back translation methodology from English to Serbian. It was then administered to 770 undergraduate students from eight different faculties. Five factors were extracted through Principal Component Analysis (Varimax rotation), each including ten items with loadings above 0.40. The internal consistency of this abbreviated 50-item scale was α=0.77 and the average test-retest coefficient (rho=0.82) indicates a stable reliability. The correlations among the temperaments ranged from weak to moderate, with the highest positive correlations obtained between the depressive and cyclothymic, and, depressive and anxious scales. The highest score was detected among the hyperthymic (0.64) and lowest among the depressive temperament (0.15). The male participants attained significantly higher scores for the hyperthymic temperament, while female scored significantly higher on the depressive and anxious temperaments. The students of physical education showed significantly lower results on the depressive and anxious subscales and higher on the hyperthymic, in comparison to other educational fields. The student sample is not representative of the general population, therefore further investigation in older population would be necessary for the evaluation of norms in additional age categories. The external validation with other personality scales has not been the subject of this research, but will be a part of some future studies. The Serbian 50-item version of the TEMPS-A showed good overall internal consistency and reliability, and the results generally cohere with those from previously validated versions in other languages. Copyright © 2013 Elsevier B.V. All rights reserved.
Methodology Series Module 9: Designing Questionnaires and Clinical Record Forms - Part II.

PubMed

Setia, Maninder Singh

2017-01-01

This article is a continuation of the previous module on designing questionnaires and clinical record form in which we have discussed some basic points about designing the questionnaire and clinical record forms. In this section, we will discuss the reliability and validity of questionnaires. The different types of validity are face validity, content validity, criterion validity, and construct validity. The different types of reliability are test-retest reliability, inter-rater reliability, and intra-rater reliability. Some of these parameters are assessed by subject area experts. However, statistical tests should be used for evaluation of other parameters. Once the questionnaire has been designed, the researcher should pilot test the questionnaire. The items in the questionnaire should be changed based on the feedback from the pilot study participants and the researcher's experience. After the basic structure of the questionnaire has been finalized, the researcher should assess the validity and reliability of the questionnaire or the scale. If an existing standard questionnaire is translated in the local language, the researcher should assess the reliability and validity of the translated questionnaire, and these values should be presented in the manuscript. The decision to use a self- or interviewer-administered, paper- or computer-based questionnaire depends on the nature of the questions, literacy levels of the target population, and resources.
Methodology Series Module 9: Designing Questionnaires and Clinical Record Forms – Part II

PubMed Central

Setia, Maninder Singh

2017-01-01

This article is a continuation of the previous module on designing questionnaires and clinical record form in which we have discussed some basic points about designing the questionnaire and clinical record forms. In this section, we will discuss the reliability and validity of questionnaires. The different types of validity are face validity, content validity, criterion validity, and construct validity. The different types of reliability are test-retest reliability, inter-rater reliability, and intra-rater reliability. Some of these parameters are assessed by subject area experts. However, statistical tests should be used for evaluation of other parameters. Once the questionnaire has been designed, the researcher should pilot test the questionnaire. The items in the questionnaire should be changed based on the feedback from the pilot study participants and the researcher's experience. After the basic structure of the questionnaire has been finalized, the researcher should assess the validity and reliability of the questionnaire or the scale. If an existing standard questionnaire is translated in the local language, the researcher should assess the reliability and validity of the translated questionnaire, and these values should be presented in the manuscript. The decision to use a self- or interviewer-administered, paper- or computer-based questionnaire depends on the nature of the questions, literacy levels of the target population, and resources. PMID:28584367
Reliability and Validity of the Chinese Version of FACIT-AI, a New Tool for Assessing Quality of Life in Patients with Malignant Ascites.

PubMed

Lou, Yanni; Lu, Linghui; Li, Yuan; Liu, Meng; Bredle, Jason M; Jia, Liqun

2015-10-01

The study objective was to determine the reliability and validity of the Chinese version of the Functional Assessment of Chronic Illness Therapy - Ascites Index (FACIT-AI). A forward-backward translation procedure was adopted to develop the Chinese version of the FACIT-AI, which was tested in 69 patients with malignant ascites. Cronbach's α, split-half reliability, and test-retest reliability were used to assess the reliability of the scale. The content validity index was used to assess the content validity, while factor analysis was used for construct validity and correlation analysis was used for criterion validity. The Cronbach's α was 0.772 for the total scale, and the split-half reliability was 0.693. The test-retest correlation was 0.972. The content validity index for the scale was 0.8-1.0. Four factors were extracted by factor analysis, and these contributed 63.51% of the total variance. Item-total correlations ranged from 0.591 to 0.897, and these were correlated with visual analog scale scores (correlation coefficient, 0.889; P<0.01). The Chinese version of the FACIT-AI has good reliability and validity and can be used as a tool to measure quality of life in Chinese patients with malignant ascites.
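The abstract above reports Cronbach's α and a split-half reliability for the translated scale. As a rough illustration of how such statistics are commonly computed, here is a minimal sketch; it is not the study's analysis code. The odd/even item split and the Spearman-Brown correction are assumptions about one common approach (the abstract does not state which split or correction was used), and the function names and response data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def cronbach_alpha(rows):
    """Cronbach's alpha; each row holds one respondent's item scores."""
    k = len(rows[0])
    if k < 2:
        raise ValueError("need at least two items")
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def pearson(x, y):
    """Pearson correlation (assumes neither list is constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def split_half_reliability(rows):
    """Odd/even split-half correlation with the Spearman-Brown correction."""
    first_half = [sum(row[0::2]) for row in rows]
    second_half = [sum(row[1::2]) for row in rows]
    r = pearson(first_half, second_half)
    return 2 * r / (1 + r)


# Hypothetical responses: 5 respondents x 4 items.
responses = [
    [3, 4, 3, 4],
    [2, 2, 3, 2],
    [4, 4, 4, 5],
    [1, 2, 1, 2],
    [3, 3, 4, 3],
]
print(round(cronbach_alpha(responses), 3))
print(round(split_half_reliability(responses), 3))
```

Both statistics approach 1 as items move together across respondents, so they describe internal consistency rather than stability over time, which the abstract assesses separately with the test-retest correlation.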
Validity and reliability of Internet-based physiotherapy assessment for musculoskeletal disorders: a systematic review.

PubMed

Mani, Suresh; Sharma, Shobha; Omar, Baharudin; Paungmali, Aatit; Joseph, Leonard

2017-04-01

Purpose: The purpose of this review is to systematically explore and summarise the validity and reliability of telerehabilitation (TR)-based physiotherapy assessment for musculoskeletal disorders. Method: A comprehensive systematic literature review was conducted using a number of electronic databases: PubMed, EMBASE, PsycINFO, Cochrane Library and CINAHL, published between January 2000 and May 2015. Studies that examined the validity and the inter- and intra-rater reliabilities of TR-based physiotherapy assessment for musculoskeletal conditions were included. Two independent reviewers used the Quality Appraisal Tool for studies of diagnostic Reliability (QAREL) and the Quality Assessment of Diagnostic Accuracy Studies (QUADAS) tool to assess the methodological quality of reliability and validity studies respectively. Results: A total of 898 hits were achieved, of which 11 articles based on inclusion criteria were reviewed. Nine studies explored the concurrent validity, inter- and intra-rater reliabilities, while two studies examined only the concurrent validity. Reviewed studies were moderate to good in methodological quality. The physiotherapy assessments such as pain, swelling, range of motion, muscle strength, balance, gait and functional assessment demonstrated good concurrent validity. However, the reported concurrent validity of lumbar spine posture, special orthopaedic tests, neurodynamic tests and scar assessments ranged from low to moderate. Conclusion: TR-based physiotherapy assessment was technically feasible with overall good concurrent validity and excellent reliability, except for lumbar spine posture, special orthopaedic tests, neurodynamic tests and scar assessment.

Assessing Dependency using Self-report and Indirect Measures: Examining the Significance of Discrepancies

PubMed Central

Cogswell, Alex; Alloy, Lauren B.; Karpinski, Andrew; Grant, David

2011-01-01

The present study addressed convergence between self-report and indirect approaches to assessing dependency. The study was moderately successful in validating an implicit measure, which was found to be reliable, orthogonal to two self-report instruments, and predictive of external criteria. This study also examined discrepancies between scores on self-report and implicit measures, and has implications for their significance. The possibility that discrepancies themselves are pathological was not supported, although discrepancies were associated with particular personality profiles. Finally, this study offered additional evidence for the relation between dependency and depressive symptomatology, and identified implicit dependency as contributing unique variance in predicting past major depression. PMID:20552505
The Reliability and Validity of the Power-Load-Margin Inventory: A Rasch Analysis.

PubMed

Hardigan, Patrick C; Cohen, Stanley R; Hagen, Kathleen P

2015-01-01

Margin is a function of the relationship of stress to strength. The greater the margin, the more likely students are able to successfully navigate academic structures. This study examined the psychometric properties of a newly created instrument designed to measure margin - the Power-Load-Margin Inventory (PLMI). The PLMI was created using eight domains: (A) Student's aptitude and ability, (B) Course structure, (C) External motivation, (D) Student health, (E) Instructor style, (F) Internal motivation, (G) Life opportunities, and (H) University support structure. A three-point response scale was used to measure the domains: (1) stress, (2) neither stress nor strength, and (3) strength. The PLMI was administered to 586 medical, dental, and pharmacy students. A Rasch rating scale model was used to examine the psychometric properties of the PLMI. The PLMI demonstrated acceptable psychometric properties for use with pharmacy, dental, and medical students. The PLMI's primary weakness was with the subscales' reliability. We attribute this to the small number of items per subscale.
Confirmatory Factor Analysis of the Combined Social Phobia Scale and Social Interaction Anxiety Scale: Support for a Bifactor Model.

PubMed

Gomez, Rapson; Watson, Shaun D

2017-01-01

For the Social Phobia Scale (SPS) and the Social Interaction Anxiety Scale (SIAS) together, this study examined support for a bifactor model, and also the internal consistency reliability and external validity of the factors in this model. Participants (N = 526) were adults from the general community who completed the SPS and SIAS. Confirmatory factor analysis (CFA) of their ratings indicated good support for the bifactor model. For this model, the loadings for all but six items were higher on the general factor than the specific factors. The three positively worded items had negligible loadings on the general factor. The general factor explained most of the common variance in the SPS and SIAS, and demonstrated good model-based internal consistency reliability (omega hierarchical) and a strong association with fear of negative evaluation and extraversion. The practical implications of the findings for the utilization of the SPS and SIAS, and the theoretical and clinical implications for social anxiety are discussed.
Confirmatory Factor Analysis of the Combined Social Phobia Scale and Social Interaction Anxiety Scale: Support for a Bifactor Model

PubMed Central

Gomez, Rapson; Watson, Shaun D.

2017-01-01

For the Social Phobia Scale (SPS) and the Social Interaction Anxiety Scale (SIAS) together, this study examined support for a bifactor model, and also the internal consistency reliability and external validity of the factors in this model. Participants (N = 526) were adults from the general community who completed the SPS and SIAS. Confirmatory factor analysis (CFA) of their ratings indicated good support for the bifactor model. For this model, the loadings for all but six items were higher on the general factor than the specific factors. The three positively worded items had negligible loadings on the general factor. The general factor explained most of the common variance in the SPS and SIAS, and demonstrated good model-based internal consistency reliability (omega hierarchical) and a strong association with fear of negative evaluation and extraversion. The practical implications of the findings for the utilization of the SPS and SIAS, and the theoretical and clinical implications for social anxiety are discussed. PMID:28210232
A history of health-related quality of life outcomes in psychiatry.

PubMed

Revicki, Dennis A; Kleinman, Leah; Cella, David

2014-06-01

Health-related quality of life (HRQoL) is a multidimensional concept that includes subjective reports of symptoms, side effects, functioning in multiple life domains, and general perceptions of life satisfaction and quality. Rather than estimating it from external observations, interview, or clinical assessment, it is best measured by direct query. Due to a perception that respondents may not be reliable or credible, there has been some reluctance to use self-report outcomes in psychiatry. More recently, and increasingly, HRQoL assessment through direct patient query has become common when evaluating a range of psychiatric, psychological, and social therapies. With few exceptions, psychiatric patients are credible and reliable reporters of this information. This article summarizes studies that highlight the development, validation, and application of HRQoL measures in psychiatry. Thoughtful application of these tools in psychiatric research can provide a much-needed patient perspective in the future of comparative effectiveness research, patient-centered outcomes research, and clinical care.
A Monte Carlo analysis of breast screening randomized trials.

PubMed

Zamora, Luis I; Forastero, Cristina; Guirado, Damián; Lallena, Antonio M

2016-12-01

To analyze breast screening randomized trials with a Monte Carlo simulation tool. A simulation tool previously developed to simulate breast screening programmes was adapted for that purpose. The history of women participating in the trials was simulated, including a model for survival after local treatment of invasive cancers. Distributions of time gained due to screening detection against symptomatic detection and the overall screening sensitivity were used as inputs. Several randomized controlled trials were simulated. Except for the age range of women involved, all simulations used the same population characteristics, which made it possible to analyze their external validity. The relative risks obtained were compared to those quoted for the trials, whose internal validity was addressed by further investigating the reasons for the disagreements observed. The Monte Carlo simulations produce results that are in good agreement with most of the randomized trials analyzed, thus indicating their methodological quality and external validity. A reduction in breast cancer mortality of around 20% appears to be a reasonable value according to the results of the trials that are methodologically correct. Discrepancies observed with the Canada I and II trials may be attributed to low mammography quality and some methodological problems. The Kopparberg trial appears to show low methodological quality. Monte Carlo simulations are a powerful tool to investigate breast screening randomized controlled trials, helping to establish those whose results are reliable enough to be extrapolated to other populations, to design trial strategies and, eventually, to adapt them during their development. Copyright © 2016 Associazione Italiana di Fisica Medica. Published by Elsevier Ltd. All rights reserved.
Development and validation of a Markov microsimulation model for the economic evaluation of treatments in osteoporosis.

PubMed

Hiligsmann, Mickaël; Ethgen, Olivier; Bruyère, Olivier; Richy, Florent; Gathon, Henry-Jean; Reginster, Jean-Yves

2009-01-01

Markov models are increasingly used in economic evaluations of treatments for osteoporosis. Most of the existing evaluations are cohort-based Markov models missing comprehensive memory management and versatility. In this article, we describe and validate an original Markov microsimulation model to accurately assess the cost-effectiveness of prevention and treatment of osteoporosis. We developed a Markov microsimulation model with a lifetime horizon and a direct health-care cost perspective. The patient history was recorded and was used in calculations of transition probabilities, utilities, and costs. To test the internal consistency of the model, we carried out an example calculation for alendronate therapy. Then, external consistency was investigated by comparing absolute lifetime risk of fracture estimates with epidemiologic data. For women at age 70 years, with a twofold increase in the fracture risk of the average population, the costs per quality-adjusted life-year gained for alendronate therapy versus no treatment were estimated at €9105 and €15,325, respectively, under full and realistic adherence assumptions. All the sensitivity analyses in terms of model parameters and modeling assumptions were coherent with expected conclusions and absolute lifetime risk of fracture estimates were within the range of previous estimates, which confirmed both internal and external consistency of the model. Microsimulation models present some major advantages over cohort-based models, increasing the reliability of the results and being largely compatible with the existing state of the art, evidence-based literature. The developed model appears to be a valid model for use in economic evaluations in osteoporosis.
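What distinguishes a microsimulation from a cohort model is that each simulated patient carries her own history, which can feed back into the transition probabilities. The sketch below illustrates only that mechanism. It is not the authors' model: the states, the annual probabilities, and the risk multiplier after a prior fracture are all hypothetical values chosen for demonstration.

```python
import random

# Hypothetical annual probabilities (illustrative only, not taken from the study).
BASE_FRACTURE_PROB = 0.02
DEATH_PROB = 0.03
PRIOR_FRACTURE_MULTIPLIER = 2.0  # assumed risk increase once a fracture is on record


def simulate_patient(rng, max_years=30):
    """Simulate one patient year by year and return the years in which she fractured."""
    history = []
    for year in range(max_years):
        if rng.random() < DEATH_PROB:
            break
        # "Memory": the transition probability depends on the recorded history.
        multiplier = PRIOR_FRACTURE_MULTIPLIER if history else 1.0
        if rng.random() < BASE_FRACTURE_PROB * multiplier:
            history.append(year)
    return history


def lifetime_fracture_risk(n_patients=10_000, seed=0):
    """Share of simulated patients with at least one fracture before death or horizon."""
    rng = random.Random(seed)
    with_fracture = sum(1 for _ in range(n_patients) if simulate_patient(rng))
    return with_fracture / n_patients
```

An estimate of this kind is what would be compared against epidemiologic lifetime-risk data when checking external consistency; costs and utilities could be accumulated inside the same yearly loop.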
Development and validation of a multifactor mindfulness scale in youth: The Comprehensive Inventory of Mindfulness Experiences-Adolescents (CHIME-A).

PubMed

Johnson, Catherine; Burke, Christine; Brinkman, Sally; Wade, Tracey

2017-03-01

Mindfulness-based interventions show consistent benefits in adults for a range of pathologies, but exploration of these approaches in youth is an emergent field, with limited measures of mindfulness for this population. This study aimed to investigate whether multifactor scales of mindfulness can be used in adolescents. A series of studies are presented assessing the performance of a recently developed adult measure, the Comprehensive Inventory of Mindfulness Experiences (CHIME) in 4 early adolescent samples. Study 1 was an investigation of how well the full adult measure (37 items) was understood by youth (N = 292). Study 2 piloted a revision of items in child friendly language with a small group (N = 48). The refined questionnaire for adolescents (CHIME-A) was then tested in Study 3 in a larger sample (N = 461) and subjected to exploratory factor analysis and a range of external validity measures. Study 4 was a confirmatory factor analysis in a new sample (N = 498) with additional external validity measures. Study 5 tested temporal stability (N = 120). Results supported an 8-factor 25-item measure of mindfulness in adolescents, with excellent model fit indices and sound internal consistency for the 8 subscales. Although the CFA supported an overarching factor, internal reliability of a combined total score was poor. The development of a multifactor measure represents a first step toward testing developmental models of mindfulness in young people. This in turn will aid construction of evidence based interventions that are not simply downward derivations of adult mindfulness programs. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Risk assessment tools to identify women with increased risk of osteoporotic fracture: complexity or simplicity? A systematic review.

PubMed

Rubin, Katrine Hass; Friis-Holmberg, Teresa; Hermann, Anne Pernille; Abrahamsen, Bo; Brixen, Kim

2013-08-01

A huge number of risk assessment tools have been developed. Far from all have been validated in external studies, many lack methodological and transparent evidence, and few are integrated in national guidelines. Therefore, we performed a systematic review to provide an overview of existing valid and reliable risk assessment tools for prediction of osteoporotic fractures. Additionally, we aimed to determine if the performance of each tool was sufficient for practical use, and last, to examine whether the complexity of the tools influenced their discriminative power. We searched PubMed, Embase, and Cochrane databases for papers and evaluated these with respect to methodological quality using the Quality Assessment Tool for Diagnostic Accuracy Studies (QUADAS) checklist. A total of 48 tools were identified; 20 had been externally validated; however, only six tools had been tested more than once in a population-based setting with acceptable methodological quality. None of the tools performed consistently better than the others, and simple tools (i.e., the Osteoporosis Self-assessment Tool [OST], Osteoporosis Risk Assessment Instrument [ORAI], and Garvan Fracture Risk Calculator [Garvan]) often did as well as or better than more complex tools (i.e., Simple Calculated Risk Estimation Score [SCORE], WHO Fracture Risk Assessment Tool [FRAX], and Qfracture). No studies determined the effectiveness of tools in selecting patients for therapy and thus improving fracture outcomes. High-quality studies in randomized design with population-based cohorts with different case mixes are needed. Copyright © 2013 American Society for Bone and Mineral Research.
Design of Novel Chemotherapeutic Agents Targeting Checkpoint Kinase 1 Using 3D-QSAR Modeling and Molecular Docking Methods.

PubMed

Balupuri, Anand; Balasubramanian, Pavithra K; Cho, Seung J

2016-01-01

Checkpoint kinase 1 (Chk1) has emerged as a potential therapeutic target for design and development of novel anticancer drugs. Herein, we have performed three-dimensional quantitative structure-activity relationship (3D-QSAR) and molecular docking analyses on a series of diazacarbazoles to design potent Chk1 inhibitors. 3D-QSAR models were developed using comparative molecular field analysis (CoMFA) and comparative molecular similarity indices analysis (CoMSIA) techniques. Docking studies were performed using AutoDock. The best CoMFA and CoMSIA models exhibited cross-validated correlation coefficient (q2) values of 0.631 and 0.585, and non-cross-validated correlation coefficient (r2) values of 0.933 and 0.900, respectively. CoMFA and CoMSIA models showed reasonable external predictabilities (r2 pred) of 0.672 and 0.513, respectively. A satisfactory performance in the various internal and external validation techniques indicated the reliability and robustness of the best model. Docking studies were performed to explore the binding mode of inhibitors inside the active site of Chk1. Molecular docking revealed that hydrogen bond interactions with Lys38, Glu85 and Cys87 are essential for Chk1 inhibitory activity. The binding interaction patterns observed during docking studies were complementary to 3D-QSAR results. Information obtained from the contour map analysis was utilized to design novel potent Chk1 inhibitors. Their activities and binding affinities were predicted using the derived model and docking studies. Designed inhibitors were proposed as potential candidates for experimental synthesis.
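The three statistics quoted in this abstract (r2, q2 and r2 pred) differ only in which data are used for fitting and for prediction. The sketch below uses the definitions commonly given in the QSAR literature and, as a stated simplification, replaces the partial least squares models of CoMFA/CoMSIA with a one-descriptor least-squares line. All function names are hypothetical.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least-squares line through (x, y); returns (slope, intercept)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return slope, my - slope * mx


def r2(x, y):
    """Non-cross-validated r2: the model is fitted and evaluated on the same data."""
    slope, intercept = fit_line(x, y)
    my = mean(y)
    rss = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    return 1 - rss / sum((yi - my) ** 2 for yi in y)


def q2_loo(x, y):
    """Leave-one-out q2: each compound is predicted by a model fitted without it."""
    my = mean(y)
    press = 0.0
    for i in range(len(y)):
        slope, intercept = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (slope * x[i] + intercept)) ** 2
    return 1 - press / sum((yi - my) ** 2 for yi in y)


def r2_pred(x_train, y_train, x_test, y_test):
    """External predictive r2: the test set is predicted by the training-set model."""
    slope, intercept = fit_line(x_train, y_train)
    my_train = mean(y_train)  # deviations are taken from the training-set mean
    press = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x_test, y_test))
    return 1 - press / sum((yi - my_train) ** 2 for yi in y_test)
```

The inputs are plain Python lists. For a least-squares fit, q2 can never exceed r2, which is consistent with the pattern of values reported above (q2 of 0.631 and 0.585 against r2 of 0.933 and 0.900).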
Independent validation of a new reirradiation risk score (RRRS) for glioma patients predicting post-recurrence survival: A multicenter DKTK/ROG analysis.

PubMed

Niyazi, Maximilian; Adeberg, Sebastian; Kaul, David; Boulesteix, Anne-Laure; Bougatf, Nina; Fleischmann, Daniel F; Grün, Arne; Krämer, Anna; Rödel, Claus; Eckert, Franziska; Paulsen, Frank; Kessel, Kerstin A; Combs, Stephanie E; Oehlke, Oliver; Grosu, Anca-Ligia; Seidlitz, Annekatrin; Lattermann, Annika; Krause, Mechthild; Baumann, Michael; Guberina, Maja; Stuschke, Martin; Budach, Volker; Belka, Claus; Debus, Jürgen

2018-04-01

Reirradiation (reRT) is a valid option with considerable efficacy in patients with recurrent high-grade glioma, but it is still not known which patients might be optimal candidates for a second course of irradiation. This study validated a newly developed prognostic score independently in an external patient cohort. The reRT risk score (RRRS) is based on a linear combination of initial histology, clinical performance status, and age derived from a multivariable model of 353 patients. This score can predict post-recurrence survival (PRS) after reRT. The validation dataset consisted of 212 patients. The RRRS differentiates three prognostic groups. Discrimination and calibration were maintained in the validation group. Median PRS times in the development cohort for the good/intermediate/poor risk categories were 14.2, 9.1, and 5.3 months, respectively. The respective groups within the validation cohort displayed median PRS times of 13.8, 8.8, and 3.8 months, respectively. Uno's C for development data was 0.64 (CI: 0.60-0.69) and for validation data 0.63 (CI: 0.58-0.68). The RRRS has been successfully validated in an independent patient cohort. This linear combination of three easily determined clinicopathological factors allows for a reliable classification of patients and may be used as stratification factor for future trials. Copyright © 2018 Elsevier B.V. All rights reserved.
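The discrimination values reported here are concordance statistics: the probability that, of two patients, the one with the higher risk score is the one who dies earlier. As a simplified illustration, the sketch below computes Harrell's C rather than Uno's C; Uno's estimator additionally weights pairs by the inverse probability of censoring, which is left out here. The function name and argument layout are hypothetical.

```python
def harrell_c(times, events, risk_scores):
    """Harrell's concordance index for right-censored survival data.

    times:       follow-up time for each patient
    events:      1 if death was observed, 0 if the patient was censored
    risk_scores: higher score means higher predicted risk (shorter survival)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable only if patient i died strictly before time j.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied scores count as half concordant
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to chance-level ordering and 1.0 to perfect ordering, which puts the reported 0.64 and 0.63 into context.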
Assessing reliability and validity measures in managed care studies.

PubMed

Montoya, Isaac D

2003-01-01

To review the reliability and validity literature and develop an understanding of these concepts as applied to managed care studies. Reliability is a test of how well an instrument measures the same input at varying times and under varying conditions. Validity is a test of how accurately an instrument measures what one believes is being measured. A review of reliability and validity instructional material was conducted. Studies of managed care practices and programs abound. However, many of these studies utilize measurement instruments that were developed for other purposes or for a population other than the one being sampled. In other cases, instruments have been developed without any testing of the instrument's performance. The lack of reliability and validity information may limit the value of these studies. This is particularly true when data are collected for one purpose and used for another. The usefulness of certain studies without reliability and validity measures is questionable, especially in cases where the literature contradicts itself.
Determination of Minimum Training Sample Size for Microarray-Based Cancer Outcome Prediction–An Empirical Assessment

PubMed Central

Cheng, Ningtao; Wu, Leihong; Cheng, Yiyu

2013-01-01

The promise of microarray technology in providing prediction classifiers for cancer outcome estimation has been confirmed by a number of demonstrable successes. However, the reliability of prediction results relies heavily on the accuracy of statistical parameters involved in classifiers. It cannot be reliably estimated with only a small number of training samples. Therefore, it is of vital importance to determine the minimum number of training samples and to ensure the clinical value of microarrays in cancer outcome prediction. We evaluated the impact of training sample size on model performance extensively based on 3 large-scale cancer microarray datasets provided by the second phase of MicroArray Quality Control project (MAQC-II). An SSNR-based (scale of signal-to-noise ratio) protocol was proposed in this study for minimum training sample size determination. External validation results based on another 3 cancer datasets confirmed that the SSNR-based approach could not only determine the minimum number of training samples efficiently, but also provide a valuable strategy for estimating the underlying performance of classifiers in advance. Once translated into clinical routine applications, the SSNR-based protocol would provide great convenience in microarray-based cancer outcome prediction in improving classifier reliability. PMID:23861920
Measurement of fatigue: Comparison of the reliability and validity of single-item and short measures to a comprehensive measure.

PubMed

Kim, Hee-Ju; Abraham, Ivo

2017-01-01

Evidence is needed on the clinicometric properties of single-item or short measures as alternatives to comprehensive measures. We examined whether two single-item fatigue measures (i.e., Likert scale, numeric rating scale) or a short fatigue measure were comparable to a comprehensive measure in reliability (i.e., internal consistency and test-retest reliability) and validity (i.e., convergent, concurrent, and predictive validity) in Korean young adults. For this quantitative study, we selected the Functional Assessment of Chronic Illness Therapy-Fatigue for the comprehensive measure and the Profile of Mood States-Brief, Fatigue subscale for the short measure; and constructed two single-item measures. A total of 368 students from four nursing colleges in South Korea participated. We used Cronbach's alpha and item-total correlation for internal consistency reliability and intraclass correlation coefficient for test-retest reliability. We assessed Pearson's correlation with a comprehensive measure for convergent validity, with perceived stress level and sleep quality for concurrent validity and the receiver operating characteristic curve for predictive validity. The short measure was comparable to the comprehensive measure in internal consistency reliability (Cronbach's alpha=0.81 vs. 0.88); test-retest reliability (intraclass correlation coefficient=0.66 vs. 0.61); convergent validity (r with comprehensive measure=0.79); concurrent validity (r with perceived stress=0.55, r with sleep quality=0.39) and predictive validity (area under curve=0.88). Single-item measures were not comparable to the comprehensive measure. A short fatigue measure exhibited similar levels of reliability and validity to the comprehensive measure in Korean young adults. Copyright © 2016 Elsevier Ltd. All rights reserved.
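Cronbach's alpha, the internal consistency statistic used in this and several other records, compares the sum of the item variances with the variance of the total score: alpha = k/(k-1) × (1 - sum of item variances / variance of totals), where k is the number of items. A minimal sketch of that standard formula follows; the function name and input layout are assumptions made for illustration.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a multi-item scale.

    item_scores: one list per item, each holding one score per respondent,
                 with respondents in the same order in every list.
    """
    k = len(item_scores)
    if k < 2:
        raise ValueError("need at least two items")
    # Total score of each respondent across all items.
    totals = [sum(responses) for responses in zip(*item_scores)]
    item_variance_sum = sum(variance(item) for item in item_scores)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))
```

This also shows why a single-item measure cannot have an alpha: with k = 1 the formula is undefined, so only test-retest reliability is available for such measures.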
[External post-mortem examination].

PubMed

Hartwig, S

2016-09-01

The external post-mortem examination in Germany is a non-delegable medical duty for determination of death, identity of the deceased, cause of death, manner of death, time of death and notifiable infectious diseases. Within the framework of rescue service missions the physician is limited to ascertaining that death has occurred. The determination of death must be reliable and is automatically followed by a complete external post-mortem examination of the body, if necessary by another physician. The certain signs of death are livor mortis, rigor mortis and putrefaction. Reliable features for the occurrence of death are injuries which are not compatible with life and brain death. The external post-mortem examination is the basis for the decision on whether further criminal investigations are necessary. The external post-mortem examination and the accompanying death certification must always be meticulously carried out.
Interaction of Theory and Practice to Assess External Validity.

PubMed

Leviton, Laura C; Trujillo, Mathew D

2016-01-18

Variations in local context bedevil the assessment of external validity: the ability to generalize about effects of treatments. For evaluation, the challenges of assessing external validity are intimately tied to the translation and spread of evidence-based interventions. This makes external validity a question for decision makers, who need to determine whether to endorse, fund, or adopt interventions that were found to be effective and how to ensure high quality once they spread. To present the rationale for using theory to assess external validity and the value of more systematic interaction of theory and practice. We review advances in external validity, program theory, practitioner expertise, and local adaptation. Examples are provided for program theory, its adaptation to diverse contexts, and generalizing to contexts that have not yet been studied. The often critical role of practitioner experience is illustrated in these examples. Work is described that the Robert Wood Johnson Foundation is supporting to study treatment variation and context more systematically. Researchers and developers generally see a limited range of contexts in which the intervention is implemented. Individual practitioners see a different and often a wider range of contexts, albeit not a systematic sample. Organized and taken together, however, practitioner experiences can inform external validity by challenging the developers and researchers to consider a wider range of contexts. Researchers have developed a variety of ways to adapt interventions in light of such challenges. In systematic programs of inquiry, as opposed to individual studies, the problems of context can be better addressed. Evaluators have advocated an interaction of theory and practice for many years, but the process can be made more systematic and useful. 
Systematic interaction can set priorities for assessment of external validity by examining the prevalence and importance of context features and treatment variations. Practitioner interaction with researchers and developers can assist in sharpening program theory, reducing uncertainty about treatment variations that are consistent or inconsistent with the theory, inductively ruling out the ones that are harmful or irrelevant, and helping set priorities for more rigorous study of context and treatment variation. © The Author(s) 2016.
Dynamic MRI to quantify musculoskeletal motion: A systematic review of concurrent validity and reliability, and perspectives for evaluation of musculoskeletal disorders.

PubMed

Borotikar, Bhushan; Lempereur, Mathieu; Lelievre, Mathieu; Burdin, Valérie; Ben Salem, Douraied; Brochard, Sylvain

2017-01-01

To report evidence for the concurrent validity and reliability of dynamic MRI techniques to evaluate in vivo joint and muscle mechanics, and to propose recommendations for their use in the assessment of normal and impaired musculoskeletal function. The search was conducted on articles published in Web of science, PubMed, Scopus, Academic search Premier, and Cochrane Library between 1990 and August 2017. Studies that reported the concurrent validity and/or reliability of dynamic MRI techniques for in vivo evaluation of joint or muscle mechanics were included after assessment by two independent reviewers. Selected articles were assessed using an adapted quality assessment tool and a data extraction process. Results for concurrent validity and reliability were categorized as poor, moderate, or excellent. Twenty articles fulfilled the inclusion criteria with a mean quality assessment score of 66% (±10.4%). Concurrent validity and/or reliability of eight dynamic MRI techniques were reported, with the knee being the most evaluated joint (seven studies). Moderate to excellent concurrent validity and reliability were reported for seven out of eight dynamic MRI techniques. Cine phase contrast and real-time MRI appeared to be the most valid and reliable techniques to evaluate joint motion, and spin tag for muscle motion. Dynamic MRI techniques are promising for the in vivo evaluation of musculoskeletal mechanics; however results should be evaluated with caution since validity and reliability have not been determined for all joints and muscles, nor for many pathological conditions.
Validity and Reliability of the 8-Item Work Limitations Questionnaire.

PubMed

Walker, Timothy J; Tullar, Jessica M; Diamond, Pamela M; Kohl, Harold W; Amick, Benjamin C

2017-12-01

Purpose To evaluate factorial validity, scale reliability, test-retest reliability, convergent validity, and discriminant validity of the 8-item Work Limitations Questionnaire (WLQ) among employees from a public university system. Methods A secondary analysis using de-identified data from employees who completed an annual Health Assessment between the years 2009-2015 tested research aims. Confirmatory factor analysis (CFA) (n = 10,165) tested the latent structure of the 8-item WLQ. Scale reliability was determined using a CFA-based approach while test-retest reliability was determined using the intraclass correlation coefficient. Convergent/discriminant validity was tested by evaluating relations between the 8-item WLQ with health/performance variables for convergent validity (health-related work performance, number of chronic conditions, and general health) and demographic variables for discriminant validity (gender and institution type). Results A 1-factor model with three correlated residuals demonstrated excellent model fit (CFI = 0.99, TLI = 0.99, RMSEA = 0.03, and SRMR = 0.01). The scale reliability was acceptable (0.69, 95% CI 0.68-0.70) and the test-retest reliability was very good (ICC = 0.78). Low-to-moderate associations were observed between the 8-item WLQ and the health/performance variables while weak associations were observed between the demographic variables. Conclusions The 8-item WLQ demonstrated sufficient reliability and validity among employees from a public university system. Results suggest the 8-item WLQ is a usable alternative for studies when the more comprehensive 25-item WLQ is not available.
A Turkish version of myocardial infarction dimensional assessment scale (TR-MIDAS): reliability-validity assessment.

PubMed

Uysal, Hilal; Ozcan, Şeyda

2011-06-01

In recent years, many new measurement instruments have been developed to allow broader psychometric measurement in coronary artery disease, disease-specific health status measurement, and assessment of broader quality of life. The study was intended to determine whether, and to what extent, MIDAS is a valid and reliable measure for patients suffering from myocardial infarction for the first time in Turkey. The research was conducted with patients hospitalized and treated for myocardial infarction in the cardiology departments of 2 hospitals in Istanbul, Turkey, between 2007 and 2008. For the validity studies of TR-MIDAS, language validity, content validity, and construct validity were examined. For the reliability studies, the tool's internal consistency reliability (Cronbach's alpha reliability coefficient) and test-retest reliability were assessed. The instrument's content validity index was determined to be 0.95. Principal component analysis revealed six factors with an eigenvalue >1.5. Cronbach's alpha was found to be 0.89 for the total scale, which was an acceptable value. The test-retest reliability of the total score was 0.51 (p<0.01). The data obtained support the conclusion that the Turkish Myocardial Infarction Dimensional Assessment Scale is a valid and reliable disease-specific instrument for assessing the quality of life of patients suffering from myocardial infarction in Turkey. Copyright © 2010 European Society of Cardiology. Published by Elsevier B.V. All rights reserved.
Longitudinal Models of Reliability and Validity: A Latent Curve Approach.

ERIC Educational Resources Information Center

Tisak, John; Tisak, Marie S.

1996-01-01

Dynamic generalizations of reliability and validity that will incorporate longitudinal or developmental models, using latent curve analysis, are discussed. A latent curve model formulated to depict change is incorporated into the classical definitions of reliability and validity. The approach is illustrated with sociological and psychological…

Scoring Rubric Development: Validity and Reliability.

ERIC Educational Resources Information Center

Moskal, Barbara M.; Leydens, Jon A.

2000-01-01

Provides clear definitions of the terms "validity" and "reliability" in the context of developing scoring rubrics and illustrates these definitions through examples. Also clarifies how validity and reliability may be addressed in the development of scoring rubrics, defined as descriptive scoring schemes developed to guide the analysis of the…
Assessment of generalizability, applicability and predictability (GAP) for evaluating external validity in studies of universal family-based prevention of alcohol misuse in young people: systematic methodological review of randomized controlled trials.

PubMed

Fernandez-Hermida, Jose Ramon; Calafat, Amador; Becoña, Elisardo; Tsertsvadze, Alexander; Foxcroft, David R

2012-09-01

To assess external validity characteristics of studies from two Cochrane Systematic Reviews of the effectiveness of universal family-based prevention of alcohol misuse in young people. Two reviewers used an a priori developed external validity rating form and independently assessed three external validity dimensions of generalizability, applicability and predictability (GAP) in randomized controlled trials. The majority (69%) of the included 29 studies were rated 'unclear' on the reporting of sufficient information for judging generalizability from sample to study population. Ten studies (35%) were rated 'unclear' on the reporting of sufficient information for judging applicability to other populations and settings. No study provided an assessment of the validity of the trial end-point measures for subsequent mortality, morbidity, quality of life or other economic or social outcomes. Similarly, no study reported on the validity of surrogate measures using established criteria for assessing surrogate end-points. Studies evaluating the benefits of family-based prevention of alcohol misuse in young people are generally inadequate at reporting information relevant to generalizability of the findings or implications for health or social outcomes. Researchers, study authors, peer reviewers, journal editors and scientific societies should take steps to improve the reporting of information relevant to external validity in prevention trials. © 2012 The Authors. Addiction © 2012 Society for the Study of Addiction.
An alternative to the balance error scoring system: using a low-cost balance board to improve the validity/reliability of sports-related concussion balance testing.

PubMed

Chang, Jasper O; Levy, Susan S; Seay, Seth W; Goble, Daniel J

2014-05-01

Recent guidelines advocate sports medicine professionals to use balance tests to assess sensorimotor status in the management of concussions. The present study sought to determine whether a low-cost balance board could provide a valid, reliable, and objective means of performing this balance testing. Criterion validity testing relative to a gold standard and 7 day test-retest reliability. University biomechanics laboratory. Thirty healthy young adults. Balance ability was assessed on 2 days separated by 1 week using (1) a gold standard measure (ie, scientific grade force plate), (2) a low-cost Nintendo Wii Balance Board (WBB), and (3) the Balance Error Scoring System (BESS). Validity of the WBB center of pressure path length and BESS scores were determined relative to the force plate data. Test-retest reliability was established based on intraclass correlation coefficients. Composite scores for the WBB had excellent validity (r = 0.99) and test-retest reliability (R = 0.88). Both the validity (r = 0.10-0.52) and test-retest reliability (r = 0.61-0.78) were lower for the BESS. These findings demonstrate that a low-cost balance board can provide improved balance testing accuracy/reliability compared with the BESS. This approach provides a potentially more valid/reliable, yet affordable, means of assessing sports-related concussion compared with current methods.
Developing a Brief Cross-Culturally Validated Screening Tool for Externalizing Disorders in Children

ERIC Educational Resources Information Center

Zwirs, Barbara W. C.; Burger, Huibert; Schulpen, Tom W. J.; Buitelaar, Jan K.

2008-01-01

The study aims at developing and validating a brief, easy-to-use screening instrument for teachers to predict externalizing disorders in children and recommending them for timely referral. The scores are compared between Dutch and non-Dutch immigrant children and a significant amount of cases for externalizing disorders were identified but sex and…
Inter-rater Reliability of Sustained Aberrant Movement Patterns as a Clinical Assessment of Muscular Fatigue

PubMed Central

Aerts, Frank; Carrier, Kathy; Alwood, Becky

2016-01-01

Background: The assessment of clinical manifestation of muscle fatigue is an effective procedure in establishing therapeutic exercise dose. Few studies have evaluated physical therapist reliability in establishing muscle fatigue through detection of changes in quality of movement patterns in a live setting. Objective: The purpose of this study is to evaluate the inter-rater reliability of physical therapists’ ability to detect altered movement patterns due to muscle fatigue. Design: A reliability study in a live setting with multiple raters. Participants: Forty-four healthy individuals (ages 19-35) were evaluated by six physical therapists in a live setting. Methods: Participants were evaluated by physical therapists for altered movement patterns during resisted shoulder rotation. Each participant completed a total of four tests: right shoulder internal rotation, right shoulder external rotation, left shoulder internal rotation and left shoulder external rotation. Results: For all tests combined, the inter-rater reliability for a single rater scoring, ICC (2,1), was .65 (95% CI: .60, .71). This corresponds to moderate inter-rater reliability between physical therapists. Limitations: The results of this study apply only to healthy participants and therefore cannot be generalized to a symptomatic population. Conclusion: Moderate inter-rater reliability was found between physical therapists in establishing muscle fatigue through the observation of sustained altered movement patterns during dynamic resistive shoulder internal and external rotation. PMID:27347241
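ICC (2,1) is the Shrout and Fleiss two-way random-effects intraclass correlation for a single rater, in which systematic differences between raters count against agreement. The sketch below computes it from the usual two-way ANOVA mean squares. It assumes a complete table in which every rater scores every participant; the function name is hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings: one row per participant, one column per rater, no missing cells.
    """
    n = len(ratings)      # number of participants
    k = len(ratings[0])   # number of raters
    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand_mean) ** 2 for m in col_means)
    ss_total = sum((x - grand_mean) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)                # between-participant mean square
    ms_cols = ss_cols / (k - 1)                # between-rater mean square
    ms_error = ss_error / ((n - 1) * (k - 1))  # residual mean square
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )
```

Because the between-rater mean square appears in the denominator, a rater who scores consistently higher than the others lowers ICC (2,1) even when all raters rank the participants identically.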
The cross-cultural adaptation, reliability, and validity of the Copenhagen Neck Functional Disability Scale in patients with chronic neck pain: Turkish version study.

PubMed

Yapali, Gökmen; Günel, Mintaze Kerem; Karahan, Sevilay

2012-05-15

The study design was cross-cultural adaptation and investigation of reliability and validity of the Copenhagen Neck Functional Disability Scale (CNFDS). The aim of this study was to translate the CNFDS into the Turkish language and assess its reliability and validity among patients with neck pain in the Turkish population. The CNFDS is a reliable and valid evaluation instrument for disability, but there is no published Turkish version of the CNFDS. One hundred one subjects who had chronic neck pain were included in this study. The CNFDS, Neck Pain and Disability Scale, and visual analogue scale were administered to all subjects. To investigate test-retest reliability, the CNFDS was applied twice at a 1-week interval; the intraclass correlation coefficient was 0.86 (95% confidence interval = 0.679-0.935). There was no difference between test-retest scores (P < 0.001). For investigating concurrent validity, correlation between total score of the CNFDS and the mean visual analogue scale was r = 0.73 (P < 0.001). Concurrent validity of the CNFDS was very good. For investigating construct validity, correlation between total score of the CNFDS and the Neck Pain and Disability Scale was r = 0.78 (P < 0.001). Construct validity of the CNFDS was also very good. Our results suggest that the Turkish version of the CNFDS is a reliable and valid instrument for Turkish people.
Development of a Conservative Model Validation Approach for Reliable Analysis

DTIC Science & Technology

2015-01-01

CIE 2015 August 2-5, 2015, Boston, Massachusetts, USA [DRAFT] DETC2015-46982 DEVELOPMENT OF A CONSERVATIVE MODEL VALIDATION APPROACH FOR RELIABLE...obtain a conservative simulation model for reliable design even with limited experimental data. Very little research has taken into account the...3, the proposed conservative model validation is briefly compared to the conventional model validation approach. Section 4 describes how to account
Evaluation of Validity and Reliability for Hierarchical Scales Using Latent Variable Modeling

ERIC Educational Resources Information Center

Raykov, Tenko; Marcoulides, George A.

2012-01-01

A latent variable modeling method is outlined, which accomplishes estimation of criterion validity and reliability for a multicomponent measuring instrument with hierarchical structure. The approach provides point and interval estimates for the scale criterion validity and reliability coefficients, and can also be used for testing composite or…
The Reliability and Validity of a Scale to Measure Teachers' Attitudes toward Integration in an Australian Context.

ERIC Educational Resources Information Center

Roberts, Clare; Pratt, Chris

1988-01-01

The study evaluated the psychometric properties of reliability and construct validity of the Attitude Toward Mainstreaming Scale (ATMS) in an Australian context. It was concluded that the scale is both reliable and factorially valid in an Australian context. (Author/DB)
Self-esteem among nursing assistants: reliability and validity of the Rosenberg Self-Esteem Scale.

PubMed

McMullen, Tara; Resnick, Barbara

2013-01-01

To establish the reliability and validity of the Rosenberg Self-Esteem Scale (RSES) when used with nursing assistants (NAs). Testing the RSES used baseline data from a randomized controlled trial testing the Res-Care Intervention. Female NAs were recruited from nursing homes (n = 508). Validity testing for the positive and negative subscales of the RSES was based on confirmatory factor analysis (CFA) using structural equation modeling and Rasch analysis. Estimates of reliability were based on Rasch analysis and the person separation index. Evidence supports the reliability and validity of the RSES in NAs although we recommend minor revisions to the measure for subsequent use. Establishing reliable and valid measures of self-esteem in NAs will facilitate testing of interventions to strengthen workplace self-esteem, job satisfaction, and retention.
Construct Validity and Reliability of the Questionnaire on the Quality of Physician-Patient Interaction in Adults With Hypertension.

PubMed

Hickman, Ronald L; Clochesy, John M; Hetland, Breanna; Alaamri, Marym

2017-04-01

There are limited reliable and valid measures of the patient-provider interaction among adults with hypertension. Therefore, the purpose of this report is to describe the construct validity and reliability of the Questionnaire on the Quality of Physician-Patient Interaction (QQPPI) in community-dwelling adults with hypertension. A convenience sample of 109 participants with hypertension was recruited and administered the QQPPI at baseline and 8 weeks later. The exploratory factor analysis established that a 12-item, 2-factor structure for the QQPPI was valid in this sample. The modified QQPPI proved to have sufficient internal consistency and test-retest reliability. The modified QQPPI is a valid and reliable measure of the provider-patient interaction, a construct posited to impact self-management, in adults with hypertension.
Psychometrics of the Home Safety Self-Assessment Tool (HSSAT) to prevent falls in community-dwelling older adults.

PubMed

Tomita, Machiko R; Saharan, Sumandeep; Rajendran, Sheela; Nochajski, Susan M; Schweitzer, Jo A

2014-01-01

OBJECTIVE. To identify psychometric properties of the Home Safety Self-Assessment Tool (HSSAT) to prevent falls in community-dwelling older adults. METHOD. We tested content validity, test-retest reliability, interrater reliability, construct validity, convergent and discriminant validity, and responsiveness to change. RESULTS. The content validity index was .98, the intraclass correlation coefficient for test-retest reliability was .97, and the interrater reliability was .89. The difference on identified risk factors between the use and nonuse of the HSSAT was significant (p = .005). Convergent validity with the Centers for Disease Control and Prevention Home Safety Checklist was high (r = .65), and discriminant validity with fear of falling was very low (r = .10). The responsiveness to change was moderate (standardized response mean = 0.57). CONCLUSION. The HSSAT is a reliable and valid instrument to identify fall risks in a home environment, and the HSSAT booklet is effective as educational material leading to improvement in home safety. Copyright © 2014 by the American Occupational Therapy Association, Inc.
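The responsiveness figure above is a standardized response mean (SRM), which is the mean change score divided by the standard deviation of the change scores. A minimal sketch follows; the function name and the before/after scores are hypothetical and are not taken from the study.

```python
from statistics import mean, stdev


def standardized_response_mean(before, after):
    """SRM = mean of the change scores / sample SD of the change scores."""
    changes = [a - b for b, a in zip(before, after)]
    return mean(changes) / stdev(changes)


# Hypothetical paired scores for four participants
before = [1, 2, 3, 4]
after = [2, 4, 4, 6]
print(standardized_response_mean(before, after))
```

The sample standard deviation (n - 1 denominator) is used here, which is the usual convention for SRM; a population SD would give a slightly larger value.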
Cross-Cultural Adaptation, Reliability and Validity Study of the Persian Version of the Clinical COPD Questionnaire.

PubMed

Hasanpour, Neda; Attarbashi Moghadam, Behrouz; Sami, Ramin; Tavakol, Kamran

2016-08-01

The clinical COPD questionnaire (CCQ) has been developed to measure the health status of COPD patients. The aim of this study was to translate the CCQ into the Persian language and assess the validity and reliability of the translated version. We used a forward-backward procedure to translate the questionnaire. In a cross-sectional study, 100 COPD patients and 50 healthy subjects over 40 years old were selected to assess the reliability and construct validity of the instrument. Face and content validity were used to assess the questionnaire's validity. Validity was examined in a population of patients with COPD, using the Persian validated version of the St George's Respiratory Questionnaire (PSGRQ). In order to assess the questionnaire's reliability, the intraclass correlation coefficient (ICC) and Cronbach's alpha were calculated. Test-retest reliability was tested by re-administering the Persian version of the CCQ (PCCQ) after 1 week. The test-retest data demonstrated that the PCCQ has excellent reliability (the ICCs for all 3 domains were higher than 0.9). Internal consistency was found by Cronbach's alpha to be 0.96, 0.94, 0.97, and 0.98 for the symptom, mental state, functional state and total scores, respectively. In addition, the correlation between the components of the PCCQ and PSGRQ showed satisfactory construct validity. Analysis of the data from healthy subjects and patients showed that the PCCQ has acceptable discriminant validity. In general, the PCCQ had satisfactory reliability and validity for assessing the health-related quality of life status of Iranian COPD patients.
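Cronbach's alpha, the internal consistency statistic reported above, can be computed from item-level responses as k/(k-1) multiplied by (1 minus the sum of item variances divided by the variance of the total score). The sketch below is illustrative only; the function name and the response matrices are hypothetical.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a respondents-by-items matrix of scores.

    responses: list of rows, one row per respondent, one column per item.
    """
    k = len(responses[0])  # number of items
    item_vars = [variance([row[j] for row in responses]) for j in range(k)]
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical data: items that move together give alpha close to 1,
# items that disagree give a lower value
consistent = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
mixed = [[1, 2], [2, 1], [3, 3]]
print(cronbach_alpha(consistent), cronbach_alpha(mixed))
```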
Demonstrating Experimenter "Ineptitude" as a Means of Teaching Internal and External Validity

ERIC Educational Resources Information Center

Treadwell, Kimberli R.H.

2008-01-01

Internal and external validity are key concepts in understanding the scientific method and fostering critical thinking. This article describes a class demonstration of a "botched" experiment to teach validity to undergraduates. Psychology students (N = 75) completed assessments at the beginning of the semester, prior to and immediately following…
The internal and external validity of the Major Depression Inventory in measuring severity of depressive states.

PubMed

Olsen, L R; Jensen, D V; Noerholm, V; Martiny, K; Bech, P

2003-02-01

We have developed the Major Depression Inventory (MDI), consisting of 10 items, covering the DSM-IV as well as the ICD-10 symptoms of depressive illness. We aimed to evaluate this as a scale measuring severity of depressive states with reference to both internal and external validity. Patients representing the score range from no depression to marked depression on the Hamilton Depression Scale (HAM-D) completed the MDI. Both classical and modern psychometric methods were applied for the evaluation of validity, including the Rasch analysis. In total, 91 patients were included. The results showed that the MDI had an adequate internal validity in being a unidimensional scale (the total score an appropriate or sufficient statistic). The external validity of the MDI was also confirmed as the total score of the MDI correlated significantly with the HAM-D (Pearson's coefficient 0.86, P ≤ 0.01; Spearman 0.80, P ≤ 0.01). When used in a sample of patients with different states of depression the MDI has an adequate internal and external validity.
[Multisite validation of CDT measurement by the %CDT TIA and the Tina Quant %CDT kits].

PubMed

Boehrer, J L; Cano, Y; Capolaghi, B; Desch, G; Dosbaa, I; Estepa, L; Hennache, B; Schellenberg, F

2007-01-01

The measurement of CDT (Carbohydrate Deficient Transferrin) is an essential biological tool in the diagnosis and follow-up of alcohol abuse. It is also employed as a marker of abstinence for the restitution of driving licences. However, the precision of measurement and the between-laboratory homogeneity of the results are still debated. Ion exchange followed by immunodetermination of CDT is available in two products, the Tina Quant %CDT (Roche, Mannheim, Germany) and the %CDT TIA (Bio-Rad, Hercules, United States). This multicentre study was undertaken: 1) to evaluate the analytical characteristics of these kits and the homogeneity of the results from one laboratory to another, independently of the method used, 2) to validate the differences between the proposed normal values of both kits, 3) to study the possibility of using commercial control sera as external quality control. Four analytical systems were included in the study (Roche Modular/Hitachi 717, Beckman Coulter Immage and LX20, Dade Behring BNII). Determinations were carried out on pools of sera, commercial control sera, kit controls, and sera from 30 patients. The latter were also analyzed by capillary electrophoresis in order to establish correlations between the techniques. The calibrations were stable over a 2-week period. The repeatability of measurements ranged from 3.1% to 24.7%, with a mean value lower than 10%. The commercial control sera provided reliable results, with values suited to routine quality control use. The results of the Bio-Rad applications were lower by approximately 20% than those of the Roche application, which justifies the difference in the normal values (2.6% versus 3%), with an identical classification of the patients in at least 27 of the 30 samples. We conclude that the analytical quality of the compared techniques, even if it could be improved, is sufficient to guarantee good reliability of the results. An external quality control could be proposed by using the control sera that we tested.
Practice change in community pharmacy: quantification of facilitators.

PubMed

Roberts, Alison S; Benrimoj, Shalom I; Chen, Timothy F; Williams, Kylie A; Aslani, Parisa

2008-06-01

There has been an increasing international trend toward the delivery of cognitive pharmaceutical services (CPS) in community pharmacy. CPS have been developed and disseminated individually, without a framework underpinning their implementation and with limited knowledge of factors that might assist practice change. The implementation process is complex, involving a range of internal and external factors. To quantify facilitators of practice change in Australian community pharmacies. We employed a literature review and qualitative study to facilitate the design of a 43-item "facilitators of practice change" scale as part of a quantitative survey instrument, using a framework of organizational theory. The questionnaire was pilot-tested (n = 100), then mailed to a random sample of 2000 community pharmacies, with a copy each for the pharmacy owner, employed pharmacist, and pharmacy assistant. The construct validity and reliability of the scale were established using exploratory factor analysis and Cronbach's alpha, respectively. A total of 735 (37%) pharmacies responded, with 1303 individual questionnaires. Factor analysis of the scale yielded 7 factors, explaining 48.8% of the total variance. The factors were: relationship with physicians (item loading range 0.59-0.85; Cronbach's alpha 0.90), remuneration (0.52-0.74; 0.82), pharmacy layout (0.52-0.79; 0.81), patient expectation (0.52-0.85; 0.82), manpower/staff (0.49-0.66; 0.80), communication and teamwork (0.37-0.65; 0.77), and external support/assistance (0.47-0.69; 0.74). All of the factors demonstrated good reliability and construct validity and explained approximately half of the variance. Implementing CPS requires support not only with the clinical aspects of service delivery, but also for the process of implementation itself, and remuneration models must reflect this. The identified facilitators should be used in a multilevel strategy to integrate professional services into the community pharmacy business, engaging pharmacists and their staff, policy makers, educators, and researchers. Further research is required to determine additional factors impacting the capacity of community pharmacies to implement change.
Commercially available molecular tests for human papillomaviruses (HPV): 2015 update.

PubMed

Poljak, Mario; Kocjan, Boštjan J; Oštrbenk, Anja; Seme, Katja

2016-03-01

Commercial molecular tests for human papillomaviruses (HPV) are invaluable diagnostic tools in cervical carcinoma screening and management of women with cervical precancerous lesions as well as important research tools for epidemiological studies, vaccine development, and implementation and monitoring of vaccination programs. In this third inventory of commercial HPV tests, we identified 193 distinct commercial HPV tests and at least 127 test variants available on the market in 2015, which represents a 54% and 79% increase in the number of distinct HPV tests and variants, respectively, in comparison to our last inventory performed in 2012. Identified HPV tests were provisionally divided into eight main groups and several subgroups. Among the 193 commercial HPV tests, all but two target alpha-HPV types only. Although the number of commercial HPV tests with at least one published study in peer-reviewed literature has increased significantly in the last three years, several published performance evaluations are still not in line with agreed-upon standards in the HPV community. Manufacturers should invest greater effort into evaluating their products and publishing validation/evaluation results in peer-reviewed journals. To achieve this, more clinically oriented external quality-control panels and initiatives are required. For evaluating the analytical performance of the entire range of HPV tests currently on the market, more diverse and reliable external quality-control programs based on international standards for all important HPV types are indispensable. The performance of a wider range of HPV tests must be promptly evaluated on a variety of alternative clinical specimens. In addition, more complete HPV assays containing validated sample-extraction protocols and appropriate internal controls are urgently needed. Provision of a broader range of automated systems allowing large-scale HPV testing as well as the development of reliable, rapid, and affordable molecular point-of-care tests are priorities for the further improvement of HPV tests. Copyright © 2015 Elsevier B.V. All rights reserved.
Validity, Reliability, and the Questionable Role of Psychometrics in Plastic Surgery

PubMed Central

2014-01-01

Summary: This report examines the meaning of validity and reliability and the role of psychometrics in plastic surgery. Study titles increasingly include the word “valid” to support the authors’ claims. Studies by other investigators may be labeled “not validated.” Validity simply refers to the ability of a device to measure what it intends to measure. Validity is not an intrinsic test property. It is a relative term most credibly assigned by the independent user. Similarly, the word “reliable” is subject to interpretation. In psychometrics, its meaning is synonymous with “reproducible.” The definitions of valid and reliable are analogous to accuracy and precision. Reliability (both the reliability of the data and the consistency of measurements) is a prerequisite for validity. Outcome measures in plastic surgery are intended to be surveys, not tests. The role of psychometric modeling in plastic surgery is unclear, and this discipline introduces difficult jargon that can discourage investigators. Standard statistical tests suffice. The unambiguous term “reproducible” is preferred when discussing data consistency. Study design and methodology are essential considerations when assessing a study’s validity. PMID:25289354
The Validation of a Case-Based, Cumulative Assessment and Progressions Examination

PubMed Central

Coker, Adeola O.; Copeland, Jeffrey T.; Gottlieb, Helmut B.; Horlen, Cheryl; Smith, Helen E.; Urteaga, Elizabeth M.; Ramsinghani, Sushma; Zertuche, Alejandra; Maize, David

2016-01-01

Objective. To assess content and criterion validity, as well as reliability of an internally developed, case-based, cumulative, high-stakes third-year Annual Student Assessment and Progression Examination (P3 ASAP Exam). Methods. Content validity was assessed through the writing-reviewing process. Criterion validity was assessed by comparing student scores on the P3 ASAP Exam with the nationally validated Pharmacy Curriculum Outcomes Assessment (PCOA). Reliability was assessed with psychometric analysis comparing student performance over four years. Results. The P3 ASAP Exam showed content validity through representation of didactic courses and professional outcomes. Similar scores on the P3 ASAP Exam and PCOA with Pearson correlation coefficient established criterion validity. Consistent student performance using Kuder-Richardson coefficient (KR-20) since 2012 reflected reliability of the examination. Conclusion. Pharmacy schools can implement internally developed, high-stakes, cumulative progression examinations that are valid and reliable using a robust writing-reviewing process and psychometric analyses. PMID:26941435
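The Kuder-Richardson coefficient (KR-20) mentioned above is the analogue of Cronbach's alpha for items scored right or wrong. A minimal sketch is given below; it assumes items coded 1 (correct) or 0 (incorrect) and uses the population variance of total scores, which is one common convention. The function name and the data are hypothetical and unrelated to the P3 ASAP Exam.

```python
from statistics import pvariance


def kr20(responses):
    """KR-20 for a students-by-items matrix of 0/1 scores.

    Assumes dichotomous items and the population variance of total scores.
    """
    n = len(responses)     # number of students
    k = len(responses[0])  # number of items
    # Proportion answering each item correctly
    p = [sum(row[j] for row in responses) / n for j in range(k)]
    pq_sum = sum(pj * (1 - pj) for pj in p)
    total_var = pvariance([sum(row) for row in responses])
    return k / (k - 1) * (1 - pq_sum / total_var)


# Hypothetical two-item test taken by four students
answers = [[1, 1], [0, 0], [1, 1], [0, 0]]
print(kr20(answers))
```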

Validity and Reliability of the Clinical Competency Evaluation Instrument for Use among Physiotherapy Students: Pilot study.

PubMed

Muhamad, Zailani; Ramli, Ayiesah; Amat, Salleh

2015-05-01

The aim of this study was to determine the content validity, internal consistency, test-retest reliability and inter-rater reliability of the Clinical Competency Evaluation Instrument (CCEVI) in assessing the clinical performance of physiotherapy students. This study was carried out between June and September 2013 at Universiti Kebangsaan Malaysia (UKM), Kuala Lumpur, Malaysia. A panel of 10 experts was identified to establish content validity by evaluating and rating each of the items used in the CCEVI with regard to their relevance in measuring students' clinical competency. A total of 50 UKM undergraduate physiotherapy students were assessed throughout their clinical placement to determine the construct validity of these items. The instrument's reliability was determined through a cross-sectional study involving a clinical performance assessment of 14 final-year undergraduate physiotherapy students. The content validity index of the entire CCEVI was 0.91, while the proportion of agreement on the content validity indices ranged from 0.83-1.00. The CCEVI construct validity was established with factor loading of ≥0.6, while internal consistency (Cronbach's alpha) overall was 0.97. Test-retest reliability of the CCEVI was confirmed with a Pearson's correlation range of 0.91-0.97 and an intraclass correlation coefficient range of 0.95-0.98. Inter-rater reliability of the CCEVI domains ranged from 0.59 to 0.97 on initial and subsequent assessments. This pilot study confirmed the content validity of the CCEVI. It showed high internal consistency, thereby providing evidence that the CCEVI has moderate to excellent inter-rater reliability. However, additional refinement in the wording of the CCEVI items, particularly in the domains of safety and documentation, is recommended to further improve the validity and reliability of the instrument.
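A content validity index of the kind reported above is usually built from expert relevance ratings. The sketch below assumes a 4-point relevance scale on which ratings of 3 or 4 count as "relevant", and averages the item-level indices to get a scale-level value; the abstract does not describe the rating scale or the averaging method, so both are assumptions, and the function names and ratings are hypothetical.

```python
def item_cvi(ratings, relevant_from=3):
    """Item-level CVI: share of experts rating the item as relevant.

    Assumes a 4-point scale where ratings >= relevant_from mean 'relevant'.
    """
    return sum(1 for r in ratings if r >= relevant_from) / len(ratings)


def scale_cvi_average(rating_matrix):
    """Scale-level CVI as the mean of the item-level CVIs."""
    item_values = [item_cvi(item) for item in rating_matrix]
    return sum(item_values) / len(item_values)


# Hypothetical ratings: two items, each rated by four experts
ratings = [[4, 4, 3, 2], [4, 3, 3, 4]]
print([item_cvi(item) for item in ratings], scale_cvi_average(ratings))
```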
External Validity in the Study of Human Development: Theoretical and Methodological Issues

ERIC Educational Resources Information Center

Hultsch, David F.; Hickey, Tom

1978-01-01

An examination of the concept of external validity from two theoretical perspectives: a traditional mechanistic approach and a dialectical organismic approach. Examines the theoretical and methodological implications of these perspectives. (BD)
Validity and Reliability of the Turkish Chronic Pain Acceptance Questionnaire

PubMed

Akmaz, Hazel Ekin; Uyar, Meltem; Kuzeyli Yıldırım, Yasemin; Akın Korhan, Esra

2018-05-29

Pain acceptance is the process of giving up the struggle with pain and learning to live a worthwhile life despite it. In assessing patients with chronic pain in Turkey, making a diagnosis and tracking the effectiveness of treatment is done with scales that have been translated into Turkish. However, there is as yet no valid and reliable scale in Turkish to assess the acceptance of pain. To validate a Turkish version of the Chronic Pain Acceptance Questionnaire developed by McCracken and colleagues. Methodological and cross sectional study. A simple randomized sampling method was used in selecting the study sample. The sample was composed of 201 patients, more than 10 times the number of items examined for validity and reliability in the study, which totaled 20. A patient identification form, the Chronic Pain Acceptance Questionnaire, and the Brief Pain Inventory were used to collect data. Data were collected by face-to-face interviews. In the validity testing, the content validity index was used to evaluate linguistic equivalence, content validity, construct validity, and expert views. In reliability testing of the scale, Cronbach’s α coefficient was calculated, and item analysis and split-test reliability methods were used. Principal component analysis and varimax rotation were used in factor analysis and to examine factor structure for construct concept validity. The item analysis established that the scale, all items, and item-total correlations were satisfactory. The mean total score of the scale was 21.78. The internal consistency coefficient was 0.94, and the correlation between the two halves of the scale was 0.89. The Chronic Pain Acceptance Questionnaire, which is intended to be used in Turkey upon confirmation of its validity and reliability, is an evaluation instrument with sufficient validity and reliability, and it can be reliably used to examine patients’ acceptance of chronic pain.
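The correlation between two halves of a scale, as reported above, is often stepped up to a full-length reliability estimate with the Spearman-Brown formula, 2r / (1 + r). The abstract does not say how the halves were formed or whether the reported 0.89 was corrected, so the odd/even split and the correction step in this sketch are assumptions, and all function names and data are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_brown(r_half):
    """Step a half-test correlation up to a full-length reliability."""
    return 2 * r_half / (1 + r_half)


def split_half_reliability(responses):
    """Odd/even split of the items, then Spearman-Brown correction."""
    odd = [sum(row[0::2]) for row in responses]   # items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in responses]  # items 2, 4, 6, ...
    return spearman_brown(pearson_r(odd, even))


# Hypothetical responses: three respondents, four items
print(split_half_reliability([[1, 1, 2, 2], [2, 2, 3, 3], [4, 4, 5, 5]]))
```

If 0.89 were an uncorrected half-test correlation, `spearman_brown(0.89)` would give roughly 0.94; this is an arithmetic illustration only, not a claim about the authors' analysis.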
Validity and Reliability of the Turkish Chronic Pain Acceptance Questionnaire

PubMed Central

Akmaz, Hazel Ekin; Uyar, Meltem; Kuzeyli Yıldırım, Yasemin; Akın Korhan, Esra

2018-01-01

Background: Pain acceptance is the process of giving up the struggle with pain and learning to live a worthwhile life despite it. In assessing patients with chronic pain in Turkey, making a diagnosis and tracking the effectiveness of treatment is done with scales that have been translated into Turkish. However, there is as yet no valid and reliable scale in Turkish to assess the acceptance of pain. Aims: To validate a Turkish version of the Chronic Pain Acceptance Questionnaire developed by McCracken and colleagues. Study Design: Methodological and cross sectional study. Methods: A simple randomized sampling method was used in selecting the study sample. The sample was composed of 201 patients, more than 10 times the number of items examined for validity and reliability in the study, which totaled 20. A patient identification form, the Chronic Pain Acceptance Questionnaire, and the Brief Pain Inventory were used to collect data. Data were collected by face-to-face interviews. In the validity testing, the content validity index was used to evaluate linguistic equivalence, content validity, construct validity, and expert views. In reliability testing of the scale, Cronbach’s α coefficient was calculated, and item analysis and split-test reliability methods were used. Principal component analysis and varimax rotation were used in factor analysis and to examine factor structure for construct concept validity. Results: The item analysis established that the scale, all items, and item-total correlations were satisfactory. The mean total score of the scale was 21.78. The internal consistency coefficient was 0.94, and the correlation between the two halves of the scale was 0.89. Conclusion: The Chronic Pain Acceptance Questionnaire, which is intended to be used in Turkey upon confirmation of its validity and reliability, is an evaluation instrument with sufficient validity and reliability, and it can be reliably used to examine patients’ acceptance of chronic pain. 
PMID:29843496
Adjustment between work demands and health needs: Development of the Work-Health Balance Questionnaire.

PubMed

Gragnano, Andrea; Miglioretti, Massimo; Frings-Dresen, Monique H W; de Boer, Angela G E M

2017-08-01

This study presented the construct of Work-Health Balance (WHB) and the design and validation of the Work-Health Balance Questionnaire (WHBq). More and more workers have a long-standing health problem or disability (LSHPD). The management of health needs and work demands is crucial for the quality of working life and work retention of these workers. However, no instrument exists measuring this process. The WHBq assesses key factors in the process of adjusting between health needs and work demands. We tested the reliability and validity of 38 items with cross-sectional data from a sample of 321 Italian workers (mean age = 45 ± 11 years) using exploratory factor analysis (EFA), Rasch analyses, and the correlations with other relevant variables. The instrument ultimately consisted of 17 items that reliably measured three factors: work-health incompatibility, health climate, and external support. These dimensions were associated with well-being in the workplace, dysfunctional behaviors at work, and general psychological health. A higher level on the WHB index was associated with lower levels of presenteeism, emotional exhaustion, workaholism, and psychological distress and with higher levels of job satisfaction and work engagement, supporting the construct validity of the instrument. The WHBq shows good psychometric characteristics and strong and theoretically consistent relationships with important and well-known variables. These results make the WHBq a promising tool in the study and management of health of employees, especially for the work continuation of employees returning to work with LSHPD. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
National assessment of validity of coding of acute mastoiditis: a standardised reassessment of 1966 records.

PubMed

Stalfors, J; Enoksson, F; Hermansson, A; Hultcrantz, M; Robinson, Å; Stenfeldt, K; Groth, A

2013-04-01

To investigate the internal validity of the diagnosis code used at discharge after treatment of acute mastoiditis. Retrospective national re-evaluation study of patient records 1993-2007 and comparison with the original ICD codes. All ENT departments at university hospitals and one large county hospital department in Sweden. A total of 1966 records were reviewed for patients with ICD codes for in-patient treatment of acute (529), chronic (44) and unspecified mastoiditis (21) and acute otitis media (1372). ICD codes were reviewed by the authors with a defined protocol for the clinical diagnosis of acute mastoiditis. Those not satisfying the diagnosis were given an alternative diagnosis. Of 529 records with ICD coding for acute mastoiditis, 397 (75%) were found to meet the definition of acute mastoiditis used in this study, while 18% were not diagnosed as having any type of mastoiditis after review. Review of the in-patients treated for acute otitis media identified an additional 60 cases fulfilling the definition of acute mastoiditis. Overdiagnosis was common, and many patients with a diagnostic code indicating acute mastoiditis had been treated for external otitis or otorrhoea with transmyringeal drainage. The internal validity of the diagnosis acute mastoiditis is dependent on the use of standardised, well-defined criteria. Reliability of diagnosis is fundamental for the comparison of results from different studies. Inadequate reliability in the diagnosis of acute mastoiditis also affects calculations of incidence rates and statistical power and may also affect the conclusions drawn from the results. © 2013 Blackwell Publishing Ltd.
Dynamic MRI to quantify musculoskeletal motion: A systematic review of concurrent validity and reliability, and perspectives for evaluation of musculoskeletal disorders

PubMed Central

Lempereur, Mathieu; Lelievre, Mathieu; Burdin, Valérie; Ben Salem, Douraied; Brochard, Sylvain

2017-01-01

Purpose: To report evidence for the concurrent validity and reliability of dynamic MRI techniques to evaluate in vivo joint and muscle mechanics, and to propose recommendations for their use in the assessment of normal and impaired musculoskeletal function. Materials and methods: The search was conducted on articles published in Web of Science, PubMed, Scopus, Academic Search Premier, and Cochrane Library between 1990 and August 2017. Studies that reported the concurrent validity and/or reliability of dynamic MRI techniques for in vivo evaluation of joint or muscle mechanics were included after assessment by two independent reviewers. Selected articles were assessed using an adapted quality assessment tool and a data extraction process. Results for concurrent validity and reliability were categorized as poor, moderate, or excellent. Results: Twenty articles fulfilled the inclusion criteria with a mean quality assessment score of 66% (±10.4%). Concurrent validity and/or reliability of eight dynamic MRI techniques were reported, with the knee being the most evaluated joint (seven studies). Moderate to excellent concurrent validity and reliability were reported for seven out of eight dynamic MRI techniques. Cine phase contrast and real-time MRI appeared to be the most valid and reliable techniques to evaluate joint motion, and spin tag for muscle motion. Conclusion: Dynamic MRI techniques are promising for the in vivo evaluation of musculoskeletal mechanics; however, results should be evaluated with caution since validity and reliability have not been determined for all joints and muscles, nor for many pathological conditions. PMID:29232401
Clinical instruments: reliability and validity critical appraisal.

PubMed

Brink, Yolandi; Louw, Quinette A

2012-12-01

RATIONALE, AIM AND OBJECTIVES: There is a lack of health care practitioners using objective clinical tools with sound psychometric properties. There is also a need for researchers to improve their reporting of the validity and reliability results of these clinical tools. Therefore, to promote the use of valid and reliable tools or tests for clinical evaluation, this paper reports on the development of a critical appraisal tool to assess the psychometric properties of objective clinical tools. A five-step process was followed to develop the new critical appraisal tool: (1) preliminary conceptual decisions; (2) defining key concepts; (3) item generation; (4) assessment of face validity; and (5) formulation of the final tool. The new critical appraisal tool consists of 13 items, of which five items relate to both validity and reliability studies, four items to validity studies only and four items to reliability studies. The 13 items could be scored as 'yes', 'no' or 'not applicable'. This critical appraisal tool will aid both the health care practitioner to critically appraise the relevant literature and researchers to improve the quality of reporting of the validity and reliability of objective clinical tools. © 2011 Blackwell Publishing Ltd.
Assessment of a condition-specific quality-of-life measure for patients with developmentally absent teeth: validity and reliability testing.

PubMed

Akram, A J; Ireland, A J; Postlethwaite, K C; Sandy, J R; Jerreat, A S

2013-11-01

This article describes the process of validity and reliability testing of a condition-specific quality-of-life measure for patients with hypodontia presenting for orthodontic treatment. The development of the instrument is described in a previous article. Royal Devon and Exeter NHS Foundation Trust & Musgrove Park Hospital, Taunton. The child perception questionnaire was used as a standard against which to test criterion validity. The Bland and Altman method was used to check agreement between the two questionnaires. Construct validity was tested using principal component analysis on the four sections of the questionnaire. Test-retest reliability was tested using intraclass correlation coefficient and Bland and Altman method. Cronbach's alpha was used to test internal consistency reliability. Overall the questionnaire showed good reliability, criterion and construct validity. This together with previous evidence of good face and content validity suggests that the instrument may prove useful in clinical practice and further research. This study has demonstrated that the newly developed condition-specific quality-of-life questionnaire is both valid and reliable for use in young patients with hypodontia. © 2013 John Wiley & Sons A/S. Published by Blackwell Publishing Ltd.
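The Bland and Altman method mentioned above summarises agreement between two measurements as the mean difference (bias) and the 95% limits of agreement, bias ± 1.96 × SD of the differences. The following is a minimal sketch with hypothetical scores; the function name is illustrative, and in practice the two questionnaires would need to be on a comparable scale before their scores are differenced.

```python
from statistics import mean, stdev


def bland_altman(a, b):
    """Return (bias, lower limit, upper limit) for paired measurements.

    Limits of agreement are bias +/- 1.96 * sample SD of the differences.
    """
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias, bias - spread, bias + spread


# Hypothetical paired scores from two instruments on four patients
first = [10, 12, 14, 16]
second = [9, 12, 13, 16]
print(bland_altman(first, second))
```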
Validity and reliability of a scale to measure genital body image.

PubMed

Zielinski, Ruth E; Kane-Low, Lisa; Miller, Janis M; Sampselle, Carolyn

2012-01-01

Women's body image dissatisfaction extends to body parts usually hidden from view--their genitals. Ability to measure genital body image is limited by lack of valid and reliable questionnaires. We subjected a previously developed questionnaire, the Genital Self Image Scale (GSIS), to psychometric testing using a variety of methods. Five experts determined the content validity of the scale. Then, using four participant groups, factor analysis was performed to determine construct validity and to identify factors. Further construct validity was established using the contrasting groups approach. Internal consistency and test-retest reliability were determined. Twenty-one of 29 items were considered content valid. Two items were added based on expert suggestions. Factor analysis was undertaken, resulting in four factors, identified as Genital Confidence, Appeal, Function, and Comfort. The revised scale (GSIS-20) included 20 items explaining 59.4% of the variance. Women indicating an interest in genital cosmetic surgery exhibited significantly lower scores on the GSIS-20 than those who did not. The final 20-item scale exhibited internal reliability across all sample groups as well as test-retest reliability. The GSIS-20 provides a measure of genital body image demonstrating reliability and validity across several populations of women.
Psychometric evaluation of the Arabic version of the multidimensional assessment of fatigue scale (MAF) for use in patients with ankylosing spondylitis.

PubMed

Bahouq, Hanane; Rostom, Samira; Bahiri, Rachid; Hakkou, Jinane; Aissaoui, Nawal; Hajjaj-Hassouni, Najia

2012-12-01

Fatigue is a frequent and often underestimated symptom of ankylosing spondylitis (AS), and its intensity needs to be measured properly with appropriate instruments such as the multidimensional assessment of fatigue (MAF). The aims of this study were to translate the MAF questionnaire into classical Arabic and to validate its use for assessing fatigue in Moroccan patients with AS. The MAF contains 16 items with a global fatigue index (IGF). The MAF was translated and back-translated to Arabic, pretested and reviewed by a committee following the Guillemin criteria (J Clin Epidemiol 46:1417-1432, 1993). It was then validated on 110 Moroccan patients with AS. Reliability for the 3-day test-retest was assessed using internal consistency by Cronbach's alpha coefficient and the intra-class correlation coefficient (ICC). External construct validity was assessed by correlation with pain, disease activity and other key variables. The reproducibility of the 15 items was satisfactory, with kappa statistics of agreement above 0.6. The ICC for IGF score reproducibility was good and reached 0.98 (95% CI, 0.96-0.99). The internal consistency was 0.991 by Cronbach's alpha coefficient. The construct validity showed a positive correlation between MAF and the axial (r = 0.34) and peripheral (r = 0.32) visual analogue scales, the Bath ankylosing spondylitis disease activity index (BASDAI) (r = 0.77), the first item of BASDAI (r = 0.85), functional disability by the Bath ankylosing spondylitis functional index (r = 0.64), the erythrocyte sedimentation rate (r = 0.43) and C-reactive protein (r = 0.30) (for all P < 0.001). There was no statistical correlation between MAF and the other variables. The Arabic version of the MAF has good comprehensibility, internal consistency, reliability and validity for the evaluation of Arabic-speaking patients with AS.
A Spanish-language patient safety questionnaire to measure medical and nursing students' attitudes and knowledge.

PubMed

Mira, José J; Navarro, Isabel M; Guilabert, Mercedes; Poblete, Rodrigo; Franco, Astolfo L; Jiménez, Pilar; Aquino, Margarita; Fernández-Trujillo, Francisco J; Lorenzo, Susana; Vitaller, Julián; de Valle, Yohana Díaz; Aibar, Carlos; Aranaz, Jesús M; De Pedro, José A

2015-08-01

To design and validate a questionnaire for assessing attitudes and knowledge about patient safety using a sample of medical and nursing students undergoing clinical training in Spain and four countries in Latin America. In this cross-sectional study, a literature review was carried out and a total of 786 medical and nursing students were surveyed at eight universities from five countries (Chile, Colombia, El Salvador, Guatemala, and Spain) to develop and refine a Spanish-language questionnaire on knowledge and attitudes about patient safety. The scope of the questionnaire was based on five dimensions (factors) presented in studies related to patient safety culture found in PubMed and Scopus. Based on the five factors, 25 reactive items were developed. Composite reliability indexes and Cronbach's alpha statistics were estimated for each factor, and confirmatory factor analysis was conducted to assess validity. After a pilot test, the questionnaire was refined using confirmatory models, maximum-likelihood estimation, and the variance-covariance matrix (as input). Multiple linear regression models were used to confirm external validity, considering variables related to patient safety culture as dependent variables and the five factors as independent variables. The final instrument was a structured five-point Likert self-administered survey (the "Latino Student Patient Safety Questionnaire") consisting of 21 items grouped into five factors. Compound reliability indexes (Cronbach's alpha statistic) calculated for the five factors were about 0.7 or higher. The results of the multiple linear regression analyses indicated good model fit (goodness-of-fit index: 0.9). Item-total correlations were higher than 0.3 in all cases. The convergent-discriminant validity was adequate. The questionnaire designed and validated in this study assesses nursing and medical students' attitudes and knowledge about patient safety. This instrument could be used to indirectly evaluate whether or not students in health disciplines are acquiring and thus likely to put into practice the professional skills currently considered most appropriate for patient safety.
[The appraisal of reliability and validity of subjective workload assessment technique and NASA-task load index].

PubMed

Xiao, Yuan-mei; Wang, Zhi-ming; Wang, Mian-zhen; Lan, Ya-jia

2005-06-01

To test the reliability and validity of two mental workload assessment scales, i.e. subjective workload assessment technique (SWAT) and NASA task load index (NASA-TLX). One thousand two hundred and sixty-eight mental workers were sampled by randomized cluster sampling from various occupations, such as scientific research, education, administration and medicine. The re-test reliability, split-half reliability, Cronbach's alpha coefficient and correlation coefficients between item score and total score were adopted to test the reliability. The test of validity included structural validity. The re-test reliability coefficients of these two scales and their items ranged from 0.516 to 0.753 (P < 0.01), indicating the two scales had good re-test reliability; the split-half reliability of SWAT was 0.645, its Cronbach's alpha coefficient was more than 0.80, and all the correlation coefficients between its item scores and total score were more than 0.70; as for NASA-TLX, both the split-half reliability and Cronbach's alpha coefficient were more than 0.80, and the correlation coefficients between its item scores and total score were all more than 0.60 (P < 0.01) except for the performance item. Both scales had good internal consistency. The Pearson correlation coefficient between the two scales was 0.492 (P < 0.01), implying the results of the two scales had good consistency. Factor analysis showed that the two scales had good structural validity. Both SWAT and NASA-TLX have good reliability and validity and may be used as valid tools to assess mental workload in China after being revised properly.
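Two of the reliability statistics named in the abstract above, Cronbach's alpha and split-half reliability, can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the split is odd versus even items, the half-test correlation is stepped up with the Spearman-Brown formula, and all function names and data are hypothetical.

```python
from statistics import mean, variance


def cronbach_alpha(rows):
    """rows: one list of item scores per respondent (same number of items each)."""
    k = len(rows[0])
    item_vars = [variance([r[i] for r in rows]) for i in range(k)]
    total_var = variance([sum(r) for r in rows])
    # alpha = k/(k-1) * (1 - sum of item variances / variance of total score)
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def split_half(rows):
    """Odd/even split-half reliability with Spearman-Brown correction (assumed split)."""
    odd = [sum(r[0::2]) for r in rows]
    even = [sum(r[1::2]) for r in rows]
    r = pearson(odd, even)
    return 2 * r / (1 + r)


# Hypothetical responses: three respondents, two items (illustration only)
responses = [[1, 2], [2, 1], [3, 3]]
print(round(cronbach_alpha(responses), 3))
```

Other ways of splitting the items (for example first half versus second half) give different split-half values, which is one reason alpha is often reported alongside it.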
Assessing the validity and reliability of the Pool Activity Level (PAL) Checklist for use with older people with dementia.

PubMed

Wenborn, Jennifer; Challis, David; Pool, Jackie; Burgess, Jane; Elliott, Nicola; Orrell, Martin

2008-03-01

Activity is key to maintaining physical and mental health and well-being. However, as dementia affects the ability to engage in activity, care-givers can find it difficult to provide appropriate activities. The Pool Activity Level (PAL) Checklist guides the selection of appropriate, personally meaningful activities. The aim of this study was to assess the reliability and validity of the PAL Checklist when used with older people with dementia. A postal questionnaire sent to activity providers assessed content validity. Validity and reliability were measured in a sample of 60 older people with dementia. The questionnaire response rate was 83% (102/122). Most respondents felt no important items were missing. Seven of the nine activities were ranked as 'very important' or 'essential' by at least 77% of the sample, indicating very good content validity. Correlation with measures of cognition, severity of dementia and activity performance demonstrated strong concurrent validity. Inter-item correlation indicated strong construct validity. Cronbach's alpha coefficient measured internal consistency as excellent (0.95). All items achieved acceptable test-retest reliability, and the majority demonstrated acceptable inter-rater reliability. We conclude that the PAL Checklist demonstrates adequate validity and reliability when used with older people with dementia and appears a useful tool for a variety of care settings.
Validity and test-retest reliability in assessing current body size with figure drawings in Chinese adolescents.

PubMed

Lo, Wing-Sze; Ho, Sai-Yin; Wong, Bonny Yee-Man; Mak, Kwok-Kei; Lam, Tai-Hing

2011-06-01

The reliability and validity of Stunkard's Figure Rating Scale (FRS) as a measure of current body size (CBS) have been established in Western adolescent girls but not in non-Western populations. We examined the validity and test-retest reliability of Stunkard's FRS in assessing CBS among Chinese adolescents. In a school-based survey in Hong Kong, 5666 adolescents (boys: 45.1%; mean age 14.7 years) provided data on self-reported height and weight, CBS, perceived weight status, and health-related quality of life using the Medical Outcomes Study Short-Form version 2 (SF-12v2). Height and weight were also objectively measured. Spearman's correlation was used to assess construct validity, concurrent validity and test-retest reliability. Convergent and discriminant validity were good: CBS correlated strongly with weight and self-reported/measured BMI, but only weakly with SF-12v2. CBS correlated strongly with perceived weight status, showing concurrent validity. Spearman's correlation (r) for CBS was 0.78 for girls and 0.72 for boys, indicating good test-retest reliability. Validity and reliability results did not differ significantly between senior and junior grade adolescents. Our findings support the use of Stunkard's FRS to measure body size among Chinese adolescents.
A systematic review of the reliability and validity of discrete choice experiments in valuing non-market environmental goods.

PubMed

Rakotonarivo, O Sarobidy; Schaafsma, Marije; Hockley, Neal

2016-12-01

While discrete choice experiments (DCEs) are increasingly used in the field of environmental valuation, they remain controversial because of their hypothetical nature and the contested reliability and validity of their results. We systematically reviewed evidence on the validity and reliability of environmental DCEs from the past thirteen years (Jan 2003-February 2016). 107 articles met our inclusion criteria. These studies provide limited and mixed evidence of the reliability and validity of DCE. Valuation results were susceptible to small changes in survey design in 45% of outcomes reporting reliability measures. DCE results were generally consistent with those of other stated preference techniques (convergent validity), but hypothetical bias was common. Evidence supporting theoretical validity (consistency with assumptions of rational choice theory) was limited. In content validity tests, 2-90% of respondents protested against a feature of the survey, and a considerable proportion found DCEs to be incomprehensible or inconsequential (17-40% and 10-62% respectively). DCE remains useful for non-market valuation, but its results should be used with caution. Given the sparse and inconclusive evidence base, we recommend that tests of reliability and validity are more routinely integrated into DCE studies and suggest how this might be achieved. Copyright © 2016 The Authors. Published by Elsevier Ltd. All rights reserved.
Psychometric Validation of the Academic Motivation Scale in a Dental Student Sample.

PubMed

Orsini, Cesar; Binnie, Vivian; Evans, Phillip; Ledezma, Priscilla; Fuentes, Fernando; Villegas, Maria J

2015-08-01

The Academic Motivation Scale is one of the most frequently used instruments to assess academic motivation. It relies on the self-determination theory of human motivation. However, motivation has been understudied in dental education. Therefore, to address the lack of valid instruments to assess academic motivation in dental education and contribute to future research in the field, the aim of this study was to analyze the psychometric properties of this instrument in a sample of dental students. Participants were 989 Chilean undergraduate dental students (86% response rate) who completed a survey containing a Chilean face-valid version of the Spanish Academic Motivation Scale and three other motivation-related instruments to assess the survey's construct and criterion validity. Later, 76 of the students (out of 100 invited) took the survey again to assess its test-retest stability. The instrument's construct validity was supported by the superior goodness of fit of the seven-subscale Academic Motivation Scale over competing models through confirmatory factor analysis and by the expected correlations among its subscales. The concurrent criterion validity was supported by the confirmation of correlations between its subscales and external criteria. Adequate internal consistency and test-retest correlations were also found. The evidence from this study suggests that the Academic Motivation Scale is a preliminarily valid and reliable instrument to assess motivation in the predoctoral dental context. Future research in this area is needed to confirm or refute these results.
An exploratory study into the effect of time-restricted internet access on face-validity, construct validity and reliability of postgraduate knowledge progress testing

PubMed Central

2013-01-01

Background: Yearly formative knowledge testing (also known as progress testing) has been shown to have limited construct-validity and reliability in postgraduate medical education. One way to improve construct-validity and reliability is to improve the authenticity of a test. As easily accessible internet has become inseparably linked to daily clinical practice, we hypothesized that allowing internet access for a limited amount of time during the progress test would improve the perception of authenticity (face-validity) of the test, which would in turn improve the construct-validity and reliability of postgraduate progress testing. Methods: Postgraduate trainees taking the yearly knowledge progress test were asked to participate in a study where they could access the internet for 30 minutes at the end of a traditional pen and paper test. Before and after the test they were asked to complete a short questionnaire regarding the face-validity of the test. Results: Mean test scores increased significantly for all training years. Trainees indicated that the face-validity of the test improved with internet access and that they would like to continue to have internet access during future testing. Internet access did not improve the construct-validity or reliability of the test. Conclusion: Improving the face-validity of postgraduate progress testing, by adding the possibility to search the internet for a limited amount of time, positively influences test performance and face-validity. However, it did not change the reliability or the construct-validity of the test. PMID:24195696
The Reliability and Validity of Zimbardo Time Perspective Inventory Scores in Academically Talented Adolescents

ERIC Educational Resources Information Center

Worrell, Frank C.; Mello, Zena R.

2007-01-01

In this study, the authors examined the reliability, structural validity, and concurrent validity of Zimbardo Time Perspective Inventory (ZTPI) scores in a group of 815 academically talented adolescents. Reliability estimates of the purported factors' scores were in the low to moderate range. Exploratory factor analysis supported a five-factor…
Reliability and Validity of Information about Student Achievement: Comparing Large-Scale and Classroom Testing Contexts

ERIC Educational Resources Information Center

Cizek, Gregory J.

2009-01-01

Reliability and validity are two characteristics that must be considered whenever information about student achievement is collected. However, those characteristics--and the methods for evaluating them--differ in large-scale testing and classroom testing contexts. This article presents the distinctions between reliability and validity in the two…

Investigating Postgraduate College Admission Interviews: Generalizability Theory Reliability and Incremental Predictive Validity

ERIC Educational Resources Information Center

Arce-Ferrer, Alvaro J.; Castillo, Irene Borges

2007-01-01

The use of face-to-face interviews is controversial for college admissions decisions in light of the lack of availability of validity and reliability evidence for most college admission processes. This study investigated reliability and incremental predictive validity of a face-to-face postgraduate college admission interview with a sample of…
An Integrated Approach to Establish Validity and Reliability of Reading Tests

ERIC Educational Resources Information Center

Razi, Salim

2012-01-01

This study presents the processes of developing and establishing reliability and validity of a reading test by administering an integrative approach, as conventional reliability and validity measures superficially reveal the difficulty of a reading test. In this respect, analysing vocabulary frequency of the test is regarded as a more eligible way…
A Validity and Reliability Update on the Informal Reading Inventory with Suggestions for Improvement.

ERIC Educational Resources Information Center

Klesius, Janell P.; Homan, Susan P.

1985-01-01

The article reviews validity and reliability studies on the informal reading inventory, a diagnostic instrument to identify reading grade-level placement and strengths and weaknesses in word recognition and comprehension. Gives suggestions to improve the validity and reliability of existing inventories and to evaluate them in newly published…
Validity and reliability of the Diagnostic Adaptive Behaviour Scale.

PubMed

Tassé, M J; Schalock, R L; Balboni, G; Spreat, S; Navas, P

2016-01-01

The Diagnostic Adaptive Behaviour Scale (DABS) is a new standardised adaptive behaviour measure that provides information for evaluating limitations in adaptive behaviour for the purpose of determining a diagnosis of intellectual disability. This article presents validity evidence and reliability data for the DABS. Validity evidence was based on comparing DABS scores with scores obtained on the Vineland Adaptive Behaviour Scale, second edition. The stability of the test scores was measured using a test and retest, and inter-rater reliability was assessed by computing the inter-respondent concordance. The DABS convergent validity coefficients ranged from 0.70 to 0.84, while the test-retest reliability coefficients ranged from 0.78 to 0.95, and the inter-rater concordance as measured by intraclass correlation coefficients ranged from 0.61 to 0.87. All obtained validity and reliability indicators were strong and comparable with the validity and reliability coefficients of the most commonly used adaptive behaviour instruments. These results and the advantages of the DABS for clinician and researcher use are discussed. © 2015 MENCAP and International Association of the Scientific Study of Intellectual and Developmental Disabilities and John Wiley & Sons Ltd.
[Reliability and validity of Driving Anger Scale in professional drivers in China].

PubMed

Li, Z; Yang, Y M; Zhang, C; Li, Y; Hu, J; Gao, L W; Zhou, Y X; Zhang, X J

2017-11-10

Objective: To assess the reliability and validity of the Chinese version of the Driving Anger Scale (DAS) in professional drivers in China and provide a scientific basis for the application of the scale in drivers in China. Methods: Professional drivers, including taxi drivers, bus drivers, truck drivers and school bus drivers, were selected to complete the questionnaire. Cronbach's α and split-half reliability were calculated to evaluate the reliability of DAS, and content, construct, discriminant and convergent validity were assessed to measure the validity of the scale. Results: The overall Cronbach's α of DAS was 0.934 and the split-half reliability was 0.874. The correlation coefficient of each subscale with the total scale was 0.639-0.922. The simplified version of DAS supported a presupposed six-factor structure, explaining 56.371% of the total variance revealed by exploratory factor analysis. The DAS had good convergent and discriminant validity, with a calibration experiment success rate of 100%. Conclusion: DAS has good reliability and validity in professional drivers in China, and the use of DAS is worth promoting in drivers.
Effect of individual shades on reliability and validity of observers in colour matching.

PubMed

Lagouvardos, P E; Diamanti, H; Polyzois, G

2004-06-01

The effect of individual shades in shade guides on the reliability and validity of measurements in a colour matching process is very important. Observers' agreement on shades and the sensitivity/specificity of shades can give an estimate of each shade's effect on observers' reliability and validity. In the present study, a group of 16 students matched 15 shades of a Kulzer guide and 10 human incisors to Kulzer and/or Vita shade tabs in 4 different tests. The results showed that shades I, B10, C40, A35 and A10 were those with the highest reliability and validity values. In conclusion, a) the matching process with shades of different materials was not accurate enough, b) some shades produce a more reliable and valid match than others and c) teeth are matched with relative difficulty.
The reliability and validity of a sexual functioning questionnaire.

PubMed

Corty, E W; Althof, S E; Kurit, D M

1996-01-01

The present study assessed the reliability and validity of a measure of sexual functioning, the CMSH-SFQ, for male patients and their partners. The CMSH-SFQ measures erectile and orgasmic functioning, sexual drive, frequency of sexual behavior, and sexual satisfaction. Test-retest reliability was assessed with 19 males and 19 females for the baseline CMSH-SFQ. Criterion validity was measured by comparing the answers of 25 male patients to those of their partners at baseline and follow-up. The majority of items had acceptable levels of reliability and validity. The CMSH-SFQ provides a reliable and valid device that can be used to measure global sexual functioning in men and their partners and may be used to evaluate the efficacy of treatments for sexual dysfunctions. Limitations and suggestions for use of the CMSH-SFQ are addressed.
Reliability and validity of the McDonald Play Inventory.

PubMed

McDonald, Ann E; Vigen, Cheryl

2012-01-01

This study examined the ability of a two-part self-report instrument, the McDonald Play Inventory, to reliably and validly measure the play activities and play styles of 7- to 11-yr-old children and to discriminate between the play of neurotypical children and children with known learning and developmental disabilities. A total of 124 children ages 7-11 recruited from a sample of convenience and a subsample of 17 parents participated in this study. Reliability estimates yielded moderate correlations for internal consistency, total test intercorrelations, and test-retest reliability. Validity estimates were established for content and construct validity. The results suggest that a self-report instrument yields reliable and valid measures of a child's perceived play performance and discriminates between the play of children with and without disabilities. Copyright © 2012 by the American Occupational Therapy Association, Inc.
Examining the reliability and validity of an abbreviated Psychopathic Personality Inventory-Revised (PPI-R) in four samples.

PubMed

Ruchensky, Jared R; Edens, John F; Donnellan, M Brent; Witt, Edward A

2017-02-01

A recently developed 40-item short-form of the Psychopathic Personality Inventory-Revised (PPI-R; Lilienfeld & Widows, 2005) has shown considerable promise as an alternative to the long-form of the instrument (Eisenbarth, Lilienfeld, & Yarkoni, 2015). Beyond the initial construction of the short-form, however, Eisenbarth et al. only evaluated a small number of external correlates in a German college student sample. In this study, we evaluate the internal consistency of the short-form scales in 4 samples previously administered the full PPI-R (3 U.S. college student samples and 1 U.S. forensic psychiatric inpatient sample) and examine a wide range of external correlates to compare the nomological nets of the short- and long-forms. Across all 4 samples, correlations between each short-form scale and its corresponding long-form scale were uniformly high (all rs > .75). In terms of external correlates, the pattern of associations was exceedingly similar for the short-form and long-form composites, with a largely trivial reduction in effect size. Collectively, our findings offer considerable support for the utility of this new short-form as a substitute for the full PPI-R. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Assessing Discriminative Performance at External Validation of Clinical Prediction Models

PubMed Central

Nieboer, Daan; van der Ploeg, Tjeerd; Steyerberg, Ewout W.

2016-01-01

Introduction: External validation studies are essential to study the generalizability of prediction models. Recently a permutation test, focusing on discrimination as quantified by the c-statistic, was proposed to judge whether a prediction model is transportable to a new setting. We aimed to evaluate this test and compare it to previously proposed procedures to judge any changes in c-statistic from development to external validation setting. Methods: We compared the use of the permutation test to the use of benchmark values of the c-statistic following from a previously proposed framework to judge transportability of a prediction model. In a simulation study we developed a prediction model with logistic regression on a development set and validated it in the validation set. We concentrated on two scenarios: 1) the case-mix was more heterogeneous and predictor effects were weaker in the validation set compared to the development set, and 2) the case-mix was less heterogeneous in the validation set and predictor effects were identical in the validation and development set. Furthermore we illustrated the methods in a case study using 15 datasets of patients suffering from traumatic brain injury. Results: The permutation test indicated that the validation and development set were homogeneous in scenario 1 (in almost all simulated samples) and heterogeneous in scenario 2 (in 17%-39% of simulated samples). Previously proposed benchmark values of the c-statistic and the standard deviation of the linear predictors correctly pointed at the more heterogeneous case-mix in scenario 1 and the less heterogeneous case-mix in scenario 2. Conclusion: The recently proposed permutation test may provide misleading results when externally validating prediction models in the presence of case-mix differences between the development and validation population. To correctly interpret the c-statistic found at external validation it is crucial to disentangle case-mix differences from incorrect regression coefficients. PMID:26881753
Assessing Discriminative Performance at External Validation of Clinical Prediction Models.

PubMed

Nieboer, Daan; van der Ploeg, Tjeerd; Steyerberg, Ewout W

2016-01-01

External validation studies are essential to study the generalizability of prediction models. Recently a permutation test, focusing on discrimination as quantified by the c-statistic, was proposed to judge whether a prediction model is transportable to a new setting. We aimed to evaluate this test and compare it to previously proposed procedures to judge any changes in c-statistic from development to external validation setting. We compared the use of the permutation test to the use of benchmark values of the c-statistic following from a previously proposed framework to judge transportability of a prediction model. In a simulation study we developed a prediction model with logistic regression on a development set and validated it in the validation set. We concentrated on two scenarios: 1) the case-mix was more heterogeneous and predictor effects were weaker in the validation set compared to the development set, and 2) the case-mix was less heterogeneous in the validation set and predictor effects were identical in the validation and development set. Furthermore we illustrated the methods in a case study using 15 datasets of patients suffering from traumatic brain injury. The permutation test indicated that the validation and development set were homogeneous in scenario 1 (in almost all simulated samples) and heterogeneous in scenario 2 (in 17%-39% of simulated samples). Previously proposed benchmark values of the c-statistic and the standard deviation of the linear predictors correctly pointed at the more heterogeneous case-mix in scenario 1 and the less heterogeneous case-mix in scenario 2. The recently proposed permutation test may provide misleading results when externally validating prediction models in the presence of case-mix differences between the development and validation population. To correctly interpret the c-statistic found at external validation it is crucial to disentangle case-mix differences from incorrect regression coefficients.
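For readers unfamiliar with the c-statistic discussed in the abstract above, the following minimal sketch computes it for a binary outcome as the proportion of event/non-event pairs in which the event received the higher predicted risk, with ties counted as one half. It does not implement the permutation test or the benchmark framework the authors evaluate; the function name and the data are hypothetical.

```python
def c_statistic(outcomes, risks):
    """Concordance (c) statistic for binary outcomes (1 = event, 0 = non-event)."""
    events = [p for y, p in zip(outcomes, risks) if y == 1]
    non_events = [p for y, p in zip(outcomes, risks) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for p_event in events:
        for p_non in non_events:
            if p_event > p_non:
                concordant += 1.0   # pair ranked correctly
            elif p_event == p_non:
                concordant += 0.5   # tie counts as half
    return concordant / (len(events) * len(non_events))


# Hypothetical validation data (illustration only)
observed = [0, 0, 1, 1]
predicted = [0.1, 0.4, 0.35, 0.8]
print(c_statistic(observed, predicted))  # 3 of 4 pairs concordant -> 0.75
```

Because the statistic depends on how spread out the predicted risks are, the same model can show a different value in a population with a different case-mix, which is the interpretive problem the abstract raises.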
[The development and validation of two scales on retribution practices: PRG-13 and PRE-21].

PubMed

Boada-Grau, Joan; Costa-Solé, Jordi; Gil-Ripoll, Carme; Vigil-Colet, Andreu

2012-01-01

The present study outlines the development process of two scales that measure general and specific retribution practices in organisations. Historically, retribution has been the subject of research of other social sciences such as Sociology and Business Administration. In Psychology, and more specifically in Work and Organisational Psychology, there are hardly any studies or inventories designed to evaluate retribution practices. In order to accomplish the objectives, a sample of 237 employees was selected, 42.6% of whom were women and 57.4% were men. We performed an exploratory factor analysis using principal axis factoring as the extraction method and an oblique rotation (oblimin) to analyse the two scales. The former is made up of four factors and the latter is a two-factor scale. The reliability coefficients of the six subscales we obtained ranged between .72 and .89. External validity was analysed using the correlations obtained between the two inventories and the Balanced Scorecard. The two tools were found to be two potentially useful scales to evaluate retribution practices.
Assessing culture via the Internet: methods and techniques for psychological research.

PubMed

Barry, D T

2001-02-01

This study examines the acculturation experiences of Arabic immigrants and assesses the utility of the Internet as a data collection tool. Based on in-depth pilot interview data from 10 male Arabic immigrants and items selected from pre-existing measures, the Male Arabic Ethnic Identity Measure (MAEIM) was developed. Male Arab immigrants (115 males) were solicited through traditional methods in addition to the Internet. Satisfactory reliability and validity were reported for the MAEIM. No significant differences emerged between the Internet and Midwestern samples. The Internet proved to be an effective method for soliciting a relatively large, geographically dispersed sample of Arabic immigrants. The use of the Internet as a research tool is examined in the context of anonymity, networking, low-cost, perceived interactive control, methodological rigor, and external validity. The Internet was an effective vehicle for addressing concerns raised by prospective participants. It is suggested that the Internet may be an important method to assess culture-relevant variables in further research on Arab and other immigrant populations.
Evidence-Based School Behavior Assessment of Externalizing Behavior in Young Children.

PubMed

Bagner, Daniel M; Boggs, Stephen R; Eyberg, Sheila M

2010-02-01

This study examined the psychometric properties of the Revised Edition of the School Observation Coding System (REDSOCS). Participants were 68 children ages 3 to 6 who completed parent-child interaction therapy for Oppositional Defiant Disorder as part of a larger efficacy trial. Interobserver reliability on REDSOCS categories was moderate to high, with percent agreement ranging from 47% to 90% (M = 67%) and Cohen's kappa coefficients ranging from .69 to .95 (M = .82). Convergent validity of the REDSOCS categories was supported by significant correlations with the Intensity Scale of the Sutter-Eyberg Student Behavior Inventory-Revised and related subscales of the Conners' Teacher Rating Scale-Revised: Long Version (CTRS-R: L). Divergent validity was indicated by nonsignificant correlations between REDSOCS categories and scales on the CTRS-R: L expected not to relate to disruptive classroom behavior. Treatment sensitivity was demonstrated for two of the three primary REDSOCS categories by significant pre to posttreatment changes. This study provides psychometric support for the designation of REDSOCS as an evidence-based assessment procedure for young children.
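Percent agreement and Cohen's kappa, the two interobserver statistics named in this abstract, can be sketched in a few lines of Python. This is a generic textbook formulation under the assumption of two observers coding the same intervals; the function names and the example codes are hypothetical and are not REDSOCS data.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Fraction of observations on which the two raters gave the same code."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    return sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Agreement corrected for the agreement expected by chance alone."""
    n = len(rater_a)
    p_observed = percent_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of the raters' marginal proportions per category.
    p_expected = sum(
        counts_a[c] * counts_b[c] for c in set(rater_a) | set(rater_b)
    ) / (n * n)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical codes for four observation intervals (1 = behavior present).
observer_1 = [1, 1, 0, 0]
observer_2 = [1, 0, 0, 0]
print(percent_agreement(observer_1, observer_2), cohens_kappa(observer_1, observer_2))
```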
Reliability and validity of the test of incremental respiratory endurance measures of inspiratory muscle performance in COPD.

PubMed

Formiga, Magno F; Roach, Kathryn E; Vital, Isabel; Urdaneta, Gisel; Balestrini, Kira; Calderon-Candelario, Rafael A; Campos, Michael A; Cahalin, Lawrence P

2018-01-01

The Test of Incremental Respiratory Endurance (TIRE) provides a comprehensive assessment of inspiratory muscle performance by measuring maximal inspiratory pressure (MIP) over time. The integration of MIP over inspiratory duration (ID) provides the sustained maximal inspiratory pressure (SMIP). Evidence on the reliability and validity of these measurements in COPD is not currently available. Therefore, we assessed the reliability, responsiveness and construct validity of the TIRE measures of inspiratory muscle performance in subjects with COPD. Test-retest reliability, known-groups and convergent validity assessments were implemented simultaneously in 81 male subjects with mild to very severe COPD. TIRE measures were obtained using the portable PrO2 device, following standard guidelines. All TIRE measures were found to be highly reliable, with SMIP demonstrating the strongest test-retest reliability with a nearly perfect intraclass correlation coefficient (ICC) of 0.99, while MIP and ID clustered closely together behind SMIP with ICC values of about 0.97. Our findings also demonstrated known-groups validity of all TIRE measures, with SMIP and ID yielding larger effect sizes when compared to MIP in distinguishing between subjects of different COPD status. Finally, our analyses confirmed convergent validity for both SMIP and ID, but not MIP. The TIRE measures of MIP, SMIP and ID have excellent test-retest reliability and demonstrated known-groups validity in subjects with COPD. SMIP and ID also demonstrated evidence of moderate convergent validity and appear to be more stable measures in this patient population than the traditional MIP.
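The abstract describes SMIP as the integration of inspiratory pressure over inspiratory duration. A minimal sketch of that idea, assuming pressure is sampled at known time points and using the trapezoidal rule, is given below. The sampling scheme, units, and function name are assumptions for illustration; how the PrO2 device actually computes SMIP is not stated in the abstract.

```python
def pressure_time_integral(times_s, pressures):
    """Trapezoidal area under a sampled pressure curve.

    `times_s` are sample times in seconds (increasing); `pressures` are the
    matching pressure readings. The result is in pressure units times seconds.
    """
    if len(times_s) != len(pressures) or len(times_s) < 2:
        raise ValueError("need at least two matching samples")
    area = 0.0
    for i in range(1, len(times_s)):
        dt = times_s[i] - times_s[i - 1]
        if dt <= 0:
            raise ValueError("sample times must be strictly increasing")
        area += 0.5 * (pressures[i] + pressures[i - 1]) * dt
    return area


# Hypothetical samples: pressure rises to a peak and falls back over 2 seconds.
print(pressure_time_integral([0.0, 1.0, 2.0], [0.0, 10.0, 0.0]))
```

In this sketch the peak reading plays the role of MIP and the last sample time the role of ID, so a longer sustained effort raises the integral even when the peak is unchanged.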
The reliability and validity of ultrasound to quantify muscles in older adults: a systematic review

PubMed Central

Scafoglieri, Aldo; Jager‐Wittenaar, Harriët; Hobbelen, Johannes S.M.; van der Schans, Cees P.

2017-01-01

This review evaluates the reliability and validity of ultrasound to quantify muscles in older adults. The databases PubMed, Cochrane, and Cumulative Index to Nursing and Allied Health Literature were systematically searched for studies. In 17 studies, the reliability (n = 13) and validity (n = 8) of ultrasound to quantify muscles in community‐dwelling older adults (≥60 years) or a clinical population were evaluated. Four out of 13 reliability studies investigated both intra‐rater and inter‐rater reliability. Intraclass correlation coefficient (ICC) scores for reliability ranged from −0.26 to 1.00. The highest ICC scores were found for the vastus lateralis, rectus femoris, upper arm anterior, and the trunk (ICC = 0.72 to 1.000). All included validity studies found ICC scores ranging from 0.92 to 0.999. Two studies describing the validity of ultrasound to predict lean body mass showed good validity as compared with dual‐energy X‐ray absorptiometry (r² = 0.92 to 0.96). This systematic review shows that ultrasound is a reliable and valid tool for the assessment of muscle size in older adults. More high‐quality research is required to confirm these findings in both clinical and healthy populations. Furthermore, ultrasound assessment of small muscles needs further evaluation. Ultrasound to predict lean body mass is feasible; however, future research is required to validate prediction equations in older adults with varying function and health. PMID:28703496
Reliability and concurrent and construct validity of the Strategies for Weight Management measure for adults.

PubMed

Kolodziejczyk, Julia K; Norman, Gregory J; Rock, Cheryl L; Arredondo, Elva M; Roesch, Scott C; Madanat, Hala; Patrick, Kevin

2016-01-01

This study evaluates the reliability and validity of the strategies for weight management (SWM) measure, a questionnaire that assesses weight management strategies for adults. The SWM includes 20 items that are categorized within the following subscales: (1) energy intake, (2) energy expenditure, (3) self-monitoring, and (4) self-regulation. Baseline and 6-month data were collected from 404 overweight/obese adults (mean age=22±3.8 years, 68% ethnic minority) enrolled in a randomized controlled trial aiming to reduce weight by improving diet and physical activity behaviours. Reliability and validity were assessed for each subscale separately. Cronbach alpha was conducted to assess reliability. Concurrent, construct I (sensitivity to the study treatment condition), and construct II (relationship to the outcomes) validity were assessed using linear regressions with the following outcome measures: weight, self-reported diet, and weekly energy expenditure. All subscales showed strong internal consistency. The strength of the validity evidence depended on subscale and validity type. The strongest validity evidence was concurrent validity of the energy intake and energy expenditure subscales; construct I validity of the energy intake and self-monitoring subscales; and construct II validity of the energy intake, energy expenditure, and self-regulation subscales. Results indicate that the SWM can be used to assess weight management strategies among an ethnically diverse sample of adults as each subscale showed evidence of reliability and select types of validity. As validity is an accumulation of evidence over multiple studies, this study provides initial reliability and validity evidence in one population segment. Copyright © 2015 Asia Oceania Association for the Study of Obesity. Published by Elsevier Ltd. All rights reserved.
[Reliability and validity of the Chinese version on Alcohol Use Disorders Identification Test].

PubMed

Zhang, C; Yang, G P; Li, Z; Li, X N; Li, Y; Hu, J; Zhang, F Y; Zhang, X J

2017-08-10

Objective: To assess the reliability and validity of the Chinese version of the Alcohol Use Disorders Identification Test (AUDIT) among medical students in China and to provide guidance on the correct application of the recommended scales. Methods: An electronic questionnaire was developed and sent to medical students in five different colleges. All students took part voluntarily. Cronbach's α and split-half reliability were calculated to evaluate the reliability of AUDIT, while content, construct, discriminant and convergent validity were assessed to measure the validity of the scales. Results: The overall Cronbach's α of AUDIT was 0.782 and the split-half reliability was 0.711. The domain-level Cronbach's α and split-half reliability were 0.796 and 0.794 for hazardous alcohol use, 0.561 and 0.623 for dependence symptoms, and 0.647 and 0.640 for harmful alcohol use. The item-level content validity indices (I-CVI) ranged from 0.83 to 1.00, the scale-level content validity index (S-CVI/UA) was 0.90, the average scale-level content validity index (S-CVI/Ave) was 0.99, and the content validity ratios (CVR) ranged from 0.80 to 1.00. Exploratory factor analysis showed that the simplified version of AUDIT supported a presupposed three-factor structure, which explained 61.175% of the total variance. AUDIT seemed to have good convergent and discriminant validity, with a 100% success rate in the calibration experiment. Conclusion: AUDIT showed good reliability and validity among medical students in China and is therefore worth promoting for use.
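Cronbach's α, the main reliability statistic in this abstract, follows the standard formula α = k/(k − 1) · (1 − Σ item variances / variance of total score). The Python sketch below is a generic implementation of that formula; the function name and the small response matrix are hypothetical and are not AUDIT data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    if len(responses) < 2:
        raise ValueError("need at least two respondents")
    k = len(responses[0])
    if k < 2 or any(len(row) != k for row in responses):
        raise ValueError("need at least two items and equal-length rows")
    # Transpose so each tuple holds one item's scores across respondents.
    item_columns = list(zip(*responses))
    sum_item_variances = sum(variance(col) for col in item_columns)
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - sum_item_variances / total_variance)


# Hypothetical scores: three respondents answering two items.
print(cronbach_alpha([[1, 2], [2, 1], [3, 3]]))
```

Split-half reliability, also reported above, is a different estimate (correlating two halves of the scale) and is not covered by this sketch.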
The Queensland high risk foot form (QHRFF) – is it a reliable and valid clinical research tool for foot disease?

PubMed Central

2014-01-01

Background: Foot disease complications, such as foot ulcers and infection, contribute to considerable morbidity and mortality. These complications are typically precipitated by “high-risk factors”, such as peripheral neuropathy and peripheral arterial disease. High-risk factors are more prevalent in specific “at risk” populations such as diabetes, kidney disease and cardiovascular disease. To the best of the authors’ knowledge a tool capturing multiple high-risk factors and foot disease complications in multiple at risk populations has yet to be tested. This study aimed to develop and test the validity and reliability of a Queensland High Risk Foot Form (QHRFF) tool. Methods: The study was conducted in two phases. Phase one developed a QHRFF using an existing diabetes foot disease tool, literature searches, stakeholder groups and expert panel. Phase two tested the QHRFF for validity and reliability. Four clinicians, representing different levels of expertise, were recruited to test validity and reliability. Three cohorts of patients were recruited; one tested criterion measure reliability (n = 32), another tested criterion validity and inter-rater reliability (n = 43), and another tested intra-rater reliability (n = 19). Validity was determined using sensitivity, specificity and positive predictive values (PPV). Reliability was determined using Kappa, weighted Kappa and intra-class correlation (ICC) statistics. Results: A QHRFF tool containing 46 items across seven domains was developed. Criterion measure reliability of at least moderate categories of agreement (Kappa > 0.4; ICC > 0.75) was seen in 91% (29 of 32) tested items. Criterion validity of at least moderate categories (PPV > 0.7) was seen in 83% (60 of 72) tested items. Inter- and intra-rater reliability of at least moderate categories (Kappa > 0.4; ICC > 0.75) was seen in 88% (84 of 96) and 87% (20 of 23) tested items respectively. Conclusions: The QHRFF had acceptable validity and reliability across the majority of items; particularly items identifying relevant co-morbidities, high-risk factors and foot disease complications. Recommendations have been made to improve or remove identified weaker items for future QHRFF versions. Overall, the QHRFF possesses suitable practicality, validity and reliability to assess and capture relevant foot disease items across multiple at risk populations. PMID:24468080
Quantitative Determination of Fusarium proliferatum Concentration in Intact Garlic Cloves Using Near-Infrared Spectroscopy

PubMed Central

Tamburini, Elena; Mamolini, Elisabetta; De Bastiani, Morena; Marchetti, Maria Gabriella

2016-01-01

Fusarium proliferatum is considered to be a pathogen of many economically important plants, including garlic. The objective of this research was to apply near-infrared spectroscopy (NIRS) to rapidly determine fungal concentration in intact garlic cloves, avoiding the laborious and time-consuming procedures of traditional assays. Preventive detection of infection before seeding is of great interest for farmers, because it could avoid serious losses of yield during harvesting and storage. Spectra were collected on 95 garlic cloves, divided into five classes of infection (from 1-healthy to 5-very highly infected) in the range of fungal concentration 0.34–7231.15 ppb. Calibration and cross validation models were developed with partial least squares regression (PLSR) on pretreated spectra (standard normal variate, SNV, and derivatives), providing good accuracy in prediction, with a coefficient of determination (R²) of 0.829 and 0.774, respectively, a standard error of calibration (SEC) of 615.17 ppb, and a standard error of cross validation (SECV) of 717.41 ppb. The calibration model was then used to predict fungal concentration in unknown samples, peeled and unpeeled. The results showed that NIRS could be used as a reliable tool to directly detect and quantify F. proliferatum infection in peeled intact garlic cloves, but the presence of the external peel strongly affected the prediction reliability. PMID:27428978
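The standard normal variate (SNV) pretreatment mentioned in this abstract centers and scales each spectrum by its own mean and standard deviation. A minimal Python sketch of that step is shown below, assuming each spectrum is a plain list of absorbance values; the function name and the sample values are hypothetical, and the derivative and PLSR steps from the study are not reproduced here.

```python
from statistics import mean, stdev


def standard_normal_variate(spectrum):
    """Center one spectrum on its own mean and scale by its own sample SD."""
    if len(spectrum) < 2:
        raise ValueError("a spectrum needs at least two points")
    mu = mean(spectrum)
    sigma = stdev(spectrum)
    if sigma == 0:
        raise ValueError("a flat spectrum cannot be scaled")
    return [(value - mu) / sigma for value in spectrum]


# Hypothetical absorbance readings at three wavelengths.
print(standard_normal_variate([1.0, 2.0, 3.0]))
```

Because the correction is applied per spectrum, it removes offset and scaling differences between samples before any regression model is fitted.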

Model-based and Model-free Machine Learning Techniques for Diagnostic Prediction and Classification of Clinical Outcomes in Parkinson's Disease.

PubMed

Gao, Chao; Sun, Hanbo; Wang, Tuo; Tang, Ming; Bohnen, Nicolaas I; Müller, Martijn L T M; Herman, Talia; Giladi, Nir; Kalinin, Alexandr; Spino, Cathie; Dauer, William; Hausdorff, Jeffrey M; Dinov, Ivo D

2018-05-08

In this study, we apply a multidisciplinary approach to investigate falls in PD patients using clinical, demographic and neuroimaging data from two independent initiatives (University of Michigan and Tel Aviv Sourasky Medical Center). Using machine learning techniques, we construct predictive models to discriminate fallers and non-fallers. Through controlled feature selection, we identified the most salient predictors of patient falls including gait speed, Hoehn and Yahr stage, postural instability and gait difficulty-related measurements. The model-based and model-free analytical methods we employed included logistic regression, random forests, support vector machines, and XGBoost. The reliability of the forecasts was assessed by internal statistical (5-fold) cross validation as well as by external out-of-bag validation. Four specific challenges were addressed in the study: Challenge 1, develop a protocol for harmonizing and aggregating complex, multisource, and multi-site Parkinson's disease data; Challenge 2, identify salient predictive features associated with specific clinical traits, e.g., patient falls; Challenge 3, forecast patient falls and evaluate the classification performance; and Challenge 4, predict tremor dominance (TD) vs. posture instability and gait difficulty (PIGD). Our findings suggest that, compared to other approaches, model-free machine learning based techniques provide a more reliable clinical outcome forecasting of falls in Parkinson's patients, for example, with a classification accuracy of about 70-80%.
Construction of Valid and Reliable Test for Assessment of Students

ERIC Educational Resources Information Center

Osadebe, P. U.

2015-01-01

The study was carried out to construct a valid and reliable test in Economics for secondary school students. Two research questions were drawn to guide the establishment of validity and reliability for the Economics Achievement Test (EAT). It is a multiple choice objective test of five options with 100 items. A sample of 1000 students was randomly…
Validity and Reliability of a Medicine Ball Explosive Power Test.

ERIC Educational Resources Information Center

Stockbrugger, Barry A.; Haennel, Robert G.

2001-01-01

Evaluated the validity and reliability of a medicine ball throw test to evaluate explosive power. Data on competitive sand volleyball players who performed a medicine ball throw and a standard countermovement jump indicated that the medicine ball throw test was a valid and reliable way to assess explosive power for an analogous total-body movement…
The Validity and Reliability of the Mobbing Scale (MS)

ERIC Educational Resources Information Center

Yaman, Erkan

2009-01-01

The aim of this research is to develop the Mobbing Scale and examine its validity and reliability. The sample of the study consisted of 515 persons from Sakarya and Bursa. In this study, construct validity, internal consistency, test-retest reliability, and item analysis of the scale were examined. As a result of factor analysis for construct…
Reliability and Validity of the Devereux Early Childhood Assessment (DECA) as a Function of Parent and Teacher Ratings

ERIC Educational Resources Information Center

Barbu, Otilia C.; Levine-Donnerstein, Deborah; Marx, Ronald W.; Yaden, David B., Jr.

2013-01-01

This study examined reliability and validity of the Devereux Early Childhood Assessment (DECA), based on samples of parents and teachers' ratings of 1,145 entering kindergartners in the Southwest. Confirmatory factor analysis showed that DECA presented good reliability and validity for manifest variables, corroborating previous findings. Three…
Convergence among Data Sources, Response Bias, and Reliability and Validity of a Structured Job Analysis Questionnaire.

ERIC Educational Resources Information Center

Smith, Jack E.; Hakel, Milton D.

1979-01-01

Examined are questions pertinent to the use of the Position Analysis Questionnaire: Who can use the PAQ reliably and validly? Must one rely on trained job analysts? Can people having no direct contact with the job use the PAQ reliably and validly? Do response biases influence PAQ responses? (Author/KC)
Validity and Reliability of the Arabic Token Test for Children

ERIC Educational Resources Information Center

Alkhamra, Rana A.; Al-Jazi, Aya B.

2016-01-01

Background: The Token Test for Children (2nd edition) (TTFC) is a measure for assessing receptive language. In this study we describe the translation process, validity and reliability of the Arabic Token Test for Children (A-TTFC). Aims: The aim of this study is to translate, validate and establish the reliability of the Arabic Token Test for…
Construction and Evaluation of Reliability and Validity of Reasoning Ability Test

ERIC Educational Resources Information Center

Bhat, Mehraj A.

2014-01-01

This paper is based on the construction and evaluation of reliability and validity of reasoning ability test at secondary school students. In this paper an attempt was made to evaluate validity, reliability and to determine the appropriate standards to interpret the results of reasoning ability test. The test includes 45 items to measure six types…
Conceptualizing Essay Tests' Reliability and Validity: From Research to Theory

ERIC Educational Resources Information Center

Badjadi, Nour El Imane

2013-01-01

The current paper on writing assessment surveys the literature on the reliability and validity of essay tests. The paper aims to examine the two concepts in relationship with essay testing as well as to provide a snapshot of the current understandings of the reliability and validity of essay tests as drawn in recent research studies. Bearing in…
The Reliability and Validity of Discrete and Continuous Measures of Psychopathology: A Quantitative Review

ERIC Educational Resources Information Center

Markon, Kristian E.; Chmielewski, Michael; Miller, Christopher J.

2011-01-01

In 2 meta-analyses involving 58 studies and 59,575 participants, we quantitatively summarized the relative reliability and validity of continuous (i.e., dimensional) and discrete (i.e., categorical) measures of psychopathology. Overall, results suggest an expected 15% increase in reliability and 37% increase in validity through adoption of a…
Paediatric Automatic Phonological Analysis Tools (APAT).

PubMed

Saraiva, Daniela; Lousada, Marisa; Hall, Andreia; Jesus, Luis M T

2017-12-01

To develop the pediatric Automatic Phonological Analysis Tools (APAT) and to estimate inter and intrajudge reliability, content validity, and concurrent validity. The APAT were constructed using Excel spreadsheets with formulas. The tools were presented to an expert panel for content validation. The corpus used in the Portuguese standardized test Teste Fonético-Fonológico - ALPE produced by 24 children with phonological delay or phonological disorder was recorded, transcribed, and then inserted into the APAT. Reliability and validity of APAT were analyzed. The APAT present strong inter- and intrajudge reliability (>97%). The content validity was also analyzed (ICC = 0.71), and concurrent validity revealed strong correlations between computerized and manual (traditional) methods. The development of these tools contributes to fill existing gaps in clinical practice and research, since previously there were no valid and reliable tools/instruments for automatic phonological analysis, which allowed the analysis of different corpora.
Lateral Load Testing of the Advanced Stirling Convertor (ASC-E2) Heater Head

NASA Technical Reports Server (NTRS)

Cornell, Peggy A.; Krause, David L.; Davis, Glen; Robbie, Malcolm G.; Gubics, David A.

2010-01-01

Free-piston Stirling convertors are fundamental to the development of NASA's next generation of radioisotope power system, the Advanced Stirling Radioisotope Generator (ASRG). The ASRG will use General Purpose Heat Source (GPHS) modules as the energy source and Advanced Stirling Convertors (ASCs) to convert heat into electrical energy, and is being developed by Lockheed Martin under contract to the Department of Energy. Achieving flight status mandates that the ASCs satisfy design as well as flight requirements to ensure reliable operation during launch. To meet these launch requirements, GRC performed a series of quasi-static mechanical tests simulating the pressure, thermal, and external loading conditions that will be experienced by an ASC-E2 heater head assembly. These mechanical tests were collectively referred to as "lateral load tests" since a primary external load lateral to the heater head longitudinal axis was applied in combination with the other loading conditions. The heater head was subjected to the operational pressure, axial mounting force, thermal conditions, and axial and lateral launch vehicle acceleration loadings. To permit reliable prediction of the heater head's structural performance, GRC completed Finite Element Analysis (FEA) computer modeling for the stress, strain, and deformation that will result during launch. The heater head lateral load test directly supported evaluation of the analysis and validation of the design to meet launch requirements. This paper provides an overview of each element within the test and presents assessment of the modeling as well as experimental results of this task.
Lateral Load Testing of the Advanced Stirling Convertor (ASC-E2) Heater Head

NASA Technical Reports Server (NTRS)

Cornell, Peggy A.; Krause, David L.; Davis, Glen; Robbie, Malcolm G.; Gubics, David A.

2011-01-01

Free-piston Stirling convertors are fundamental to the development of NASA's next generation of radioisotope power system, the Advanced Stirling Radioisotope Generator (ASRG). The ASRG will use General Purpose Heat Source (GPHS) modules as the energy source and Advanced Stirling Convertors (ASCs) to convert heat into electrical energy, and is being developed by Lockheed Martin under contract to the Department of Energy. Achieving flight status mandates that the ASCs satisfy design as well as flight requirements to ensure reliable operation during launch. To meet these launch requirements, GRC performed a series of quasi-static mechanical tests simulating the pressure, thermal, and external loading conditions that will be experienced by an ASC-E2 heater head assembly. These mechanical tests were collectively referred to as "lateral load tests" since a primary external load lateral to the heater head longitudinal axis was applied in combination with the other loading conditions. The heater head was subjected to the operational pressure, axial mounting force, thermal conditions, and axial and lateral launch vehicle acceleration loadings. To permit reliable prediction of the heater head's structural performance, GRC completed Finite Element Analysis (FEA) computer modeling for the stress, strain, and deformation that will result during launch. The heater head lateral load test directly supported evaluation of the analysis and validation of the design to meet launch requirements. This paper provides an overview of each element within the test and presents assessment of the modeling as well as experimental results of this task.
Reliability and validity: Part II.

PubMed

Davis, Debora Winders

2004-01-01

Determining measurement reliability and validity involves complex processes. There is usually room for argument about most instruments. It is important that the researcher clearly describes the processes upon which she made the decision to use a particular instrument, and presents the evidence available showing that the instrument is reliable and valid for the current purposes. In some cases, the researcher may need to conduct pilot studies to obtain evidence upon which to decide whether the instrument is valid for a new population or a different setting. In all cases, the researcher must present a clear and complete explanation for the choices she has made regarding reliability and validity. The consumer must then judge the degree to which the researcher has provided adequate and theoretically sound rationale. Although I have tried to touch on most of the important concepts related to measurement reliability and validity, it is beyond the scope of this column to be exhaustive. There are textbooks devoted entirely to specific measurement issues if readers require more in-depth knowledge.
The City MISS: development of a scale to measure stigma of perinatal mental illness.

PubMed

Moore, Donna; Ayers, Susan; Drey, Nicholas

2017-07-01

This study aimed to develop and validate a scale to measure perceived stigma for perinatal mental illness in women. Stigma is one of the most frequently cited barriers to seeking treatment and many women with perinatal mental illness fail to get the treatment they need. However, there is no psychometric scale that measures how women may experience the unique aspects of perinatal mental illness stigma. A draft scale of 30 items was developed from a literature review. Women with perinatal mental illness (n = 279) were recruited to complete the City Mental Illness Stigma Scale. Concurrent validity was measured using the Internalised Stigma of Mental Illness Scale. Factor analysis was used to create the final scale. The final 15-item City Mental Illness Stigma Scale has a three-factor structure: perceived external stigma, internal stigma and disclosure stigma. The scale accounted for 54% of the variance and had good internal reliability and concurrent validity. The City Mental Illness Stigma Scale appears to be a valid measure which provides a potentially useful tool for clinical practice and research in stigma and perinatal mental illness, including assessing the prevalence and characteristics of stigma. This research can be used to inform interventions to reduce or address the stigma experienced by some women with perinatal mental illness.
Validity evidence for the adaptation of the State Mindfulness Scale for Physical Activity (SMS-PA) in Spanish youth.

PubMed

Ullrich-French, Sarah; González Hernández, Juan; Hidalgo Montesinos, María D

2017-02-01

Mindfulness is an increasingly popular construct with promise in enhancing multiple positive health outcomes. Physical activity is an important behavior for enhancing overall health, but no Spanish language scale exists to test how mindfulness during physical activity may facilitate physical activity motivation or behavior. This study examined the validity of a Spanish adaptation of a new scale, the State Mindfulness Scale for Physical Activity, to assess mindfulness during a specific experience of physical activity. Spanish youths (N = 502) completed a cross-sectional survey of state mindfulness during physical activity and physical activity motivation regulations based on Self-Determination Theory. A high-order model fit the data well and supports the use of one general state mindfulness factor or the use of separate subscales of mindfulness of mental (e.g., thoughts, emotions) and body (physical movement, muscles) aspects of the experience. Internal consistency reliability was good for the general scale and both sub-scales. The pattern of correlations with motivation regulations provides further support for construct validity with significant and positive correlations with self-determined forms of motivation and significant and negative correlations with external regulation and amotivation. Initial validity evidence is promising for the use of the adapted measure.
[The Amsterdam wrist rules: the multicenter prospective derivation and external validation of a clinical decision rule for the use of radiography in acute wrist trauma].

PubMed

Walenkamp, Monique M J; Bentohami, Abdelali; Slaar, Annelie; Beerekamp, M S H Suzan; Maas, Mario; Jager, L C Cara; Sosef, Nico L; van Velde, Romuald; Ultee, Jan M; Steyerberg, Ewout W; Goslings, J C Carel; Schep, Niels W L

2016-01-01

Although only 39% of patients with wrist trauma have sustained a fracture, the majority of patients are routinely referred for radiography. The purpose of this study was to derive and externally validate a clinical decision rule that selects patients with acute wrist trauma in the Emergency Department (ED) for radiography. This multicenter prospective study consisted of three components: (1) derivation of a clinical prediction model for detecting wrist fractures in patients following wrist trauma; (2) external validation of this model; and (3) design of a clinical decision rule. The study was conducted in the EDs of five Dutch hospitals: one academic hospital (derivation cohort) and four regional hospitals (external validation cohort). We included all adult patients with acute wrist trauma. The main outcome was fracture of the wrist (distal radius, distal ulna or carpal bones) diagnosed on conventional X-rays. A total of 882 patients were analyzed: 487 in the derivation cohort and 395 in the validation cohort. We derived a clinical prediction model with eight variables: age; sex; swelling of the wrist; swelling of the anatomical snuffbox; visible deformation; distal radius tender to palpation; pain on radial deviation; and painful axial compression of the thumb. The Area Under the Curve at external validation of this model was 0.81 (95% CI: 0.77-0.85). The sensitivity and specificity of the Amsterdam Wrist Rules (AWR) in the external validation cohort were 98% (95% CI: 95-99%) and 21% (95% CI: 15-28%). The negative predictive value was 90% (95% CI: 81-99%). The Amsterdam Wrist Rules is a clinical prediction rule with a high sensitivity and negative predictive value for fractures of the wrist. Although external validation showed low specificity and 100% sensitivity could not be achieved, the Amsterdam Wrist Rules can provide physicians in the Emergency Department with a useful screening tool to select patients with acute wrist trauma for radiography. 
The upcoming implementation study will further reveal the impact of the Amsterdam Wrist Rules on the anticipated reduction of X-rays requested, missed fractures, Emergency Department waiting times and health care costs.
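Illustrative sketch (not part of the record): the sensitivity, specificity and negative predictive value reported above are all ratios taken from a 2x2 table of rule outcome against X-ray result. The function name and the counts below are hypothetical and chosen only to make the arithmetic easy to follow; they are not the study's data.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return standard accuracy measures from the four cells of a 2x2 table.

    tp: rule positive, fracture present    fp: rule positive, no fracture
    fn: rule negative, fracture present    tn: rule negative, no fracture
    """
    return {
        "sensitivity": tp / (tp + fn),  # fractures correctly flagged
        "specificity": tn / (tn + fp),  # non-fractures correctly cleared
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }


# Hypothetical counts for illustration only (NOT the Amsterdam Wrist Rules data).
metrics = diagnostic_metrics(tp=98, fp=79, fn=2, tn=21)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

A rule tuned for screening, as described in the abstract, accepts a low specificity in exchange for a high sensitivity, so that few fractures are missed.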
Reliability and validity of the Safe Routes to school parent and student surveys

PubMed Central

2011-01-01

Background The purpose of this study is to assess the reliability and validity of the U.S. National Center for Safe Routes to School's in-class student travel tallies and written parent surveys. Over 65,000 tallies and 374,000 parent surveys have been completed, but no published studies have examined their measurement properties. Methods Students and parents from two Charlotte, NC (USA) elementary schools participated. Tallies were conducted on two consecutive days using a hand-raising protocol; on day two students were also asked to recall the previous day's travel. The recall from day two was compared with day one to assess 24-hour test-retest reliability. Convergent validity was assessed by comparing parent-reports of students' travel mode with student-reports of travel mode. Two-week test-retest reliability of the parent survey was assessed by comparing within-parent responses. Reliability and validity were assessed using kappa statistics. Results A total of 542 students participated in the in-class student travel tally reliability assessment and 262 parent-student dyads participated in the validity assessment. Reliability was high for travel to and from school (kappa > 0.8); convergent validity was lower but still high (kappa > 0.75). There were no differences by student grade level. Two-week test-retest reliability of the parent survey (n = 112) ranged from moderate to very high for objective questions on travel mode and travel times (kappa range: 0.62 - 0.97) but was substantially lower for subjective assessments of barriers to walking to school (kappa range: 0.31 - 0.76). Conclusions The in-class student travel tally exhibited high reliability and validity at all elementary grades. The parent survey had high reliability on questions related to student travel mode, but lower reliability for attitudinal questions identifying barriers to walking to school. 
Parent survey design should be improved so that responses clearly indicate issues that influence parental decision making with regard to their children's mode of travel to school. PMID:21651794
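Illustrative sketch (not part of the record): the kappa statistic used above corrects raw agreement between two sets of categorical answers for the agreement expected by chance. The version below is the unweighted Cohen's kappa for two raters; the record does not say which kappa variant was used, so this is an assumption, and the travel-mode data are invented for the example.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two equal-length lists of category labels."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both rating lists must be non-empty and equally long")
    n = len(ratings_a)
    # Observed proportion of items on which the two ratings agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement from the marginal category frequencies.
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    categories = freq_a.keys() | freq_b.keys()
    expected = sum(freq_a[c] * freq_b[c] for c in categories) / n ** 2
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical data: travel mode tallied on day one vs. recalled on day two.
day_one = ["walk", "bus", "car", "car", "walk", "bus", "car", "walk"]
recall = ["walk", "bus", "car", "walk", "walk", "bus", "car", "car"]
kappa = cohen_kappa(day_one, recall)
print(f"kappa = {kappa:.3f}")
```

In this toy data the raw agreement is 75%, yet kappa is noticeably lower because a good share of that agreement would be expected by chance alone.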
Test-retest reliability and construct validity of the ENERGY-child questionnaire on energy balance-related behaviours and their potential determinants: the ENERGY-project.

PubMed

Singh, Amika S; Vik, Froydis N; Chinapaw, Mai J M; Uijtdewilligen, Léonie; Verloigne, Maïté; Fernández-Alvira, Juan M; Stomfai, Sarolta; Manios, Yannis; Martens, Marloes; Brug, Johannes

2011-12-09

Insight in children's energy balance-related behaviours (EBRBs) and their determinants is important to inform obesity prevention research. Therefore, reliable and valid tools to measure these variables in large-scale population research are needed. To examine the test-retest reliability and construct validity of the child questionnaire used in the ENERGY-project, measuring EBRBs and their potential determinants among 10-12 year old children. We collected data among 10-12 year old children (n = 730 in the test-retest reliability study; n = 96 in the construct validity study) in six European countries, i.e. Belgium, Greece, Hungary, the Netherlands, Norway, and Spain. Test-retest reliability was assessed using the intra-class correlation coefficient (ICC) and percentage agreement comparing scores from two measurements, administered one week apart. To assess construct validity, the agreement between questionnaire responses and a subsequent face-to-face interview was assessed using ICC and percentage agreement. Of the 150 questionnaire items, 115 (77%) showed good to excellent test-retest reliability as indicated by ICCs > .60 or percentage agreement ≥ 75%. Test-retest reliability was moderate for 34 items (23%) and poor for one item. Construct validity appeared to be good to excellent for 70 (47%) of the 150 items, as indicated by ICCs > .60 or percentage agreement ≥ 75%. From the other 80 items, construct validity was moderate for 39 (26%) and poor for 41 items (27%). Our results demonstrate that the ENERGY-child questionnaire, assessing EBRBs of the child as well as personal, family, and school-environmental determinants related to these EBRBs, has good test-retest reliability and moderate to good construct validity for the large majority of items.
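Illustrative sketch (not part of the record): the two test-retest statistics named above, percentage agreement and the intra-class correlation coefficient, can be computed as follows. The record does not state which ICC form was used; the one-way random-effects, single-measurement form, often written ICC(1,1), is assumed here purely for illustration, and the scores are invented.

```python
from statistics import mean


def percent_agreement(first, second):
    """Percentage of items answered identically at both measurements."""
    matches = sum(a == b for a, b in zip(first, second))
    return 100.0 * matches / len(first)


def icc_oneway(scores):
    """One-way random-effects ICC(1,1).

    scores: one row per subject, each row holding k repeated measurements.
    """
    n, k = len(scores), len(scores[0])
    grand = mean(x for row in scores for x in row)
    row_means = [mean(row) for row in scores]
    # Between-subject and within-subject mean squares from one-way ANOVA.
    ms_between = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(scores, row_means) for x in row
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical answers of five children to one item, one week apart.
test_retest = [[3, 3], [1, 2], [4, 4], [2, 2], [5, 4]]
icc = icc_oneway(test_retest)
agreement = percent_agreement(
    [row[0] for row in test_retest], [row[1] for row in test_retest]
)
print(f"ICC = {icc:.3f}, agreement = {agreement:.0f}%")
```

The two statistics can diverge: here only three of five answers match exactly, while the ICC stays high because the two mismatches differ by a single scale point.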

Test-retest reliability and construct validity of the ENERGY-child questionnaire on energy balance-related behaviours and their potential determinants: the ENERGY-project

PubMed Central

2011-01-01

Background Insight in children's energy balance-related behaviours (EBRBs) and their determinants is important to inform obesity prevention research. Therefore, reliable and valid tools to measure these variables in large-scale population research are needed. Objective To examine the test-retest reliability and construct validity of the child questionnaire used in the ENERGY-project, measuring EBRBs and their potential determinants among 10-12 year old children. Methods We collected data among 10-12 year old children (n = 730 in the test-retest reliability study; n = 96 in the construct validity study) in six European countries, i.e. Belgium, Greece, Hungary, the Netherlands, Norway, and Spain. Test-retest reliability was assessed using the intra-class correlation coefficient (ICC) and percentage agreement comparing scores from two measurements, administered one week apart. To assess construct validity, the agreement between questionnaire responses and a subsequent face-to-face interview was assessed using ICC and percentage agreement. Results Of the 150 questionnaire items, 115 (77%) showed good to excellent test-retest reliability as indicated by ICCs > .60 or percentage agreement ≥ 75%. Test-retest reliability was moderate for 34 items (23%) and poor for one item. Construct validity appeared to be good to excellent for 70 (47%) of the 150 items, as indicated by ICCs > .60 or percentage agreement ≥ 75%. From the other 80 items, construct validity was moderate for 39 (26%) and poor for 41 items (27%). Conclusions Our results demonstrate that the ENERGY-child questionnaire, assessing EBRBs of the child as well as personal, family, and school-environmental determinants related to these EBRBs, has good test-retest reliability and moderate to good construct validity for the large majority of items. PMID:22152048
Validation and reliability of the Turkish Utian Quality-of-Life Scale in postmenopausal women.

PubMed

Abay, Halime; Kaplan, Sena

2016-04-01

There are a limited number of menopause-specific quality-of-life scales for the Turkish population. This study was conducted to evaluate the validity and reliability of the Turkish Utian Quality-of-Life Scale in postmenopausal women. The study group comprised 250 postmenopausal women who applied to a training and research hospital's menopause clinic in Turkey. A survey form and the Turkish Utian Quality-of-Life Scale were used to collect data, and the Turkish version of Short Form-36 was used to evaluate reliability with an equivalent form. Language-validity, content-validity, and construct-validity methods were used to assess the validity of the scale, and Cronbach's α coefficient calculation and the equivalent-form reliability methods were used to assess the reliability of the scale. The Turkish Utian Quality-of-Life Scale was determined to be a valid and reliable instrument for measuring the quality of life of postmenopausal women. Confirmatory factor analysis demonstrates that the instrument fits well with 23 items and a four-factor model. The Cronbach's α coefficients for the quality-of-life domains were as follows: 0.88 overall, 0.79 health, 0.78 emotional, 0.76 sexual, and 0.75 occupational. Reliability of the instrument was confirmed through significant correlations between scores on the Turkish version of the Utian Quality-of-Life Scale and the Turkish version of the Short Form-36 (r = 0.745, P < 0.001). This research emphasizes that the Turkish Utian Quality-of-Life Scale is reliable and valid in postmenopausal women; it is a useful instrument for measuring quality of life during menopause.
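Illustrative sketch (not part of the record): Cronbach's α, reported above for each quality-of-life domain, compares the sum of the individual item variances with the variance of the total score. The function name and the response matrix below are hypothetical.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a respondents-by-items matrix of scores."""
    k = len(responses[0])  # number of items
    # Sample variance of each item (column) across respondents.
    item_variances = [variance(column) for column in zip(*responses)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical answers of five respondents to a three-item subscale.
responses = [
    [4, 5, 4],
    [2, 3, 3],
    [5, 5, 4],
    [1, 2, 2],
    [3, 3, 3],
]
alpha = cronbach_alpha(responses)
print(f"Cronbach's alpha = {alpha:.3f}")
```

Values near 1 indicate that the items rise and fall together across respondents; the 0.75 to 0.88 range reported in the record falls within what is commonly described as acceptable to good.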
Full-Scale Crash Tests and Analyses of Three High-Wing Single

NASA Technical Reports Server (NTRS)

Annett, Martin S.; Littell, Justin D.; Stimson, Chad M.; Jackson, Karen E.; Mason, Brian H.

2015-01-01

The NASA Emergency Locator Transmitter Survivability and Reliability (ELTSAR) project was initiated in 2014 to assess the crash performance standards for the next generation of ELT systems. Three Cessna 172 aircraft have been acquired to conduct crash testing at NASA Langley Research Center's Landing and Impact Research Facility. Testing is scheduled for the summer of 2015 and will simulate three crash conditions: a flare to stall during an emergency landing, and two controlled flight into terrain scenarios. Instrumentation and video coverage, both onboard and external, will also provide valuable data on airframe response. Full-scale finite element analyses will be performed using two separate commercial explicit solvers. Calibration and validation of the models will be based on the airframe response under these varying crash conditions.
Psychometric properties of the revised conscientiousness dimension of Inventário Dimensional Clínico da Personalidade (IDCP).

PubMed

Carvalho, Lucas de Francisco; Souza, Bruna Daniela Balbino de; Primi, Ricardo

2014-01-01

This study investigated the psychometric properties of the revised scale of conscientiousness of a clinical personality inventory (Inventário Dimensional Clínico da Personalidade, IDCP). One hundred and twenty participants (68 women; 56.7%) aged 18 to 53 years (mean = 22.58, standard deviation = 6.19) were recruited by convenience and answered the IDCP and the NEO Personality Inventory - Revised. The analysis of internal structure, association with external variables and reliability of the dimension under review confirmed its validity. The psychometric characteristics of the revised dimension seem to be more adequate than those of the original version and more focused on pathological functioning, which was expected and desirable.
Measuring children's regulation of emotion-expressive behavior.

PubMed

Bar-Haim, Yair; Bar-Av, Gali; Sadeh, Avi

2011-04-01

Emotion regulation has become a pivotal concept in developmental and clinical research. However, the measurement of regulatory processes has proved extremely difficult, particularly in the context of within-subject designs. Here, we describe a formal conceptualization and a new experimental procedure, the Balloons Game, to measure a regulatory component of emotion-expressive behavior. We present the internal consistency and stability of the indices derived from the Balloons Game in a sample of 121 kindergarten children. External validation against measures that have been associated with emotion regulation processes is also provided. The findings suggest that the Balloons Game provides a reliable tool for the study of regulation of emotion expression in young children. PsycINFO Database Record (c) 2011 APA, all rights reserved.
Alberta infant motor scale: reliability and validity when used on preterm infants in Taiwan.

PubMed

Jeng, S F; Yau, K I; Chen, L C; Hsiao, S F

2000-02-01

The goal of this study was to examine the reliability and validity of measurements obtained with the Alberta Infant Motor Scale (AIMS) for evaluation of preterm infants in Taiwan. Two independent groups of preterm infants were used to investigate the reliability (n=45) and validity (n=41) of the AIMS. In the reliability study, the AIMS was administered to the infants by a physical therapist, and infant performance was videotaped. The performance was then rescored by the same therapist and by 2 other therapists to examine the intrarater and interrater reliability. In the validity study, the AIMS and the Bayley Motor Scale were administered to the infants at 6 and 12 months of age to examine criterion-related validity. Intraclass correlation coefficients (ICCs) for intrarater and interrater reliability of measurements obtained with the AIMS were high (ICC=.97-.99). The AIMS scores correlated with the Bayley Motor Scale scores at 6 and 12 months (r=.78 and .90), although the AIMS scores at 6 months were only moderately predictive of the motor function at 12 months (r=.56). The results suggest that measurements obtained with the AIMS have acceptable reliability and concurrent validity but limited predictive value for evaluating preterm Taiwanese infants.
Assessment of the severity of dementia: validity and reliability of the Chinese (Cantonese) version of the Hierarchic Dementia Scale (CV-HDS).

PubMed

Poon, Vickie Wan-kei; Lam, Linda Chiu-wa; Wong, Samuel Yeung-shan

2008-09-01

With the rapid growth of the older population, early detection of cognitive deficits is crucial in slowing down functional deterioration in elderly persons. To examine the validity and reliability of the Chinese (Cantonese) version of the Hierarchic Dementia Scale (CV-HDS) for Chinese older persons in Hong Kong. The HDS was translated into Cantonese Chinese. The content and cultural validity were evaluated by six expert panel members. Sixty-two participants with a diagnosis of dementia were recruited for evaluation. Inter-rater reliability, test-retest reliability, internal consistency and concurrent validity were examined. The CV-HDS demonstrated satisfactory psychometric properties. Inter-rater reliability and test-retest reliability were high (alpha=0.89 and alpha=0.94 respectively). A high value of Cronbach's alpha (alpha=0.94) demonstrated good internal consistency. The concurrent validity of the CV-HDS, assessed by correlating its scores with those of the Chinese version of the Mini Mental Status Examination, was established (ranging from r=0.58 to r=0.78, p<0.01). The CV-HDS is a reliable and valid instrument for assessing severity of cognitive impairment in Cantonese speaking Chinese people with dementia. It facilitates treatment planning to optimize the effects of functional training and rehabilitation.
Rational selection of training and test sets for the development of validated QSAR models

NASA Astrophysics Data System (ADS)

Golbraikh, Alexander; Shen, Min; Xiao, Zhiyan; Xiao, Yun-De; Lee, Kuo-Hsiung; Tropsha, Alexander

2003-02-01

Quantitative Structure-Activity Relationship (QSAR) models are used increasingly to screen chemical databases and/or virtual chemical libraries for potentially bioactive molecules. These developments emphasize the importance of rigorous model validation to ensure that the models have acceptable predictive power. Using the k nearest neighbors (kNN) variable selection QSAR method for the analysis of several datasets, we have demonstrated recently that the widely accepted leave-one-out (LOO) cross-validated R2 (q2) is an inadequate characteristic to assess the predictive ability of the models [Golbraikh, A., Tropsha, A. Beware of q2! J. Mol. Graphics Mod. 20, 269-276, (2002)]. Herein, we provide additional evidence that there exists no correlation between the values of q2 for the training set and accuracy of prediction (R2) for the test set and argue that this observation is a general property of any QSAR model developed with LOO cross-validation. We suggest that external validation using rationally selected training and test sets provides a means to establish a reliable QSAR model. We propose several approaches to the division of experimental datasets into training and test sets and apply them in QSAR studies of 48 functionalized amino acid anticonvulsants and a series of 157 epipodophyllotoxin derivatives with antitumor activity. We formulate a set of general criteria for the evaluation of predictive power of QSAR models.
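Illustrative sketch (not part of the record): the distinction drawn above between the leave-one-out cross-validated q2 of a training set and the predictive R2 on an external test set can be made concrete with a toy model. A one-descriptor least-squares line stands in for the kNN QSAR method of the record, and the external R2 is computed here as 1 - SS_res/SS_tot around the test-set mean, which is one common convention and may differ from the authors' definition. All names and data are hypothetical.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for one descriptor."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def loo_q2(xs, ys):
    """Leave-one-out cross-validated q2 = 1 - PRESS / SS_tot."""
    press = 0.0
    for i in range(len(xs)):
        # Refit without compound i, then predict the held-out compound.
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    my = mean(ys)
    return 1 - press / sum((y - my) ** 2 for y in ys)


def external_r2(train_x, train_y, test_x, test_y):
    """Predictive R2 on a test set never used for fitting."""
    slope, intercept = fit_line(train_x, train_y)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(test_x, test_y))
    mt = mean(test_y)
    return 1 - ss_res / sum((y - mt) ** 2 for y in test_y)


# Hypothetical descriptor/activity pairs lying exactly on y = 2x + 1.
train_x, train_y = [1.0, 2.0, 3.0, 4.0, 5.0], [3.0, 5.0, 7.0, 9.0, 11.0]
test_x, test_y = [6.0, 8.0], [13.0, 17.0]
q2 = loo_q2(train_x, train_y)
r2_ext = external_r2(train_x, train_y, test_x, test_y)
print(f"q2 = {q2:.3f}, external R2 = {r2_ext:.3f}")
```

On noise-free data both statistics equal 1; the record's argument is that on real data a high q2 does not guarantee a high external R2, which is why the two are computed from separate compound sets.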
The relationship between external and internal validity of randomized controlled trials: A sample of hypertension trials from China.

PubMed

Zhang, Xin; Wu, Yuxia; Ren, Pengwei; Liu, Xueting; Kang, Deying

2015-10-30

To explore the relationship between the external validity and the internal validity of hypertension RCTs conducted in China. Comprehensive literature searches were performed in Medline, Embase, Cochrane Central Register of Controlled Trials (CCTR), CBMdisc (Chinese biomedical literature database), CNKI (China National Knowledge Infrastructure/China Academic Journals Full-text Database) and VIP (Chinese scientific journals database), and advanced search strategies were used to locate hypertension RCTs. The risk of bias in RCTs was assessed by a modified scale and the Jadad scale, and studies with grading scores of 3 or more were then included for the evaluation of external validity. A data extraction form comprising 4 domains and 25 items was used to explore the relationship between external validity and internal validity. Statistical analyses were performed using SPSS software, version 21.0 (SPSS, Chicago, IL). A total of 226 hypertension RCTs were included in the final analysis. RCTs conducted in university affiliated hospitals (P < 0.001) or secondary/tertiary hospitals (P < 0.001) scored higher on internal validity. Multi-center studies (median = 4.0, IQR = 2.0) had higher internal validity scores than single-center studies (median = 3.0, IQR = 1.0) (P < 0.001). Funding-supported trials had better methodological quality (P < 0.001). In addition, the reporting of inclusion criteria also led to better internal validity (P = 0.004). Multivariate regression indicated that sample size, industry funding, quality of life (QOL) taken as a measure and a university affiliated hospital as the trial setting were statistically significant (P < 0.001, P < 0.001, P = 0.001, P = 0.006 respectively). Several components related to the external validity of RCTs are associated with internal validity, although the two do not stand in a simple relationship to each other. 
Given the poor reporting, other possible links between the two variables need to be traced in future methodological research.
Reliability and validity of electrothermometers and associated thermocouples.

PubMed

Jutte, Lisa S; Knight, Kenneth L; Long, Blaine C

2008-02-01

Examine thermocouple model uncertainty (reliability+validity). First, a 3x3 repeated measures design with independent variables electrothermometer and thermocouple model. Second, a 1x3 repeated measures design with independent variable subprobe. Three electrothermometers, 3 thermocouple models, a multi-sensor probe and a mercury thermometer measured a stable water bath. Temperature and absolute temperature differences between thermocouples and a mercury thermometer. Thermocouple uncertainty was greater than manufacturers' claims. For all thermocouple models, validity and reliability were better in the Iso-Thermex than the Datalogger, but there were no practical differences between models within an electrothermometer. Validity of multi-sensor probes and thermocouples within a probe were not different but were greater than manufacturers' claims. Reliability of multiprobes and thermocouples within a probe were within manufacturers' claims. Thermocouple models vary in reliability and validity. Scientists should test and report the uncertainty of their equipment rather than depending on manufacturers' claims.
Reliability and concurrent validity of a Smartphone, bubble inclinometer and motion analysis system for measurement of hip joint range of motion.

PubMed

Charlton, Paula C; Mentiplay, Benjamin F; Pua, Yong-Hao; Clark, Ross A

2015-05-01

Traditional methods of assessing joint range of motion (ROM) involve specialized tools that may not be widely available to clinicians. This study assesses the reliability and validity of a custom Smartphone application for assessing hip joint range of motion. Intra-tester reliability with concurrent validity. Passive hip joint range of motion was recorded for seven different movements in 20 males on two separate occasions. Data from a Smartphone, bubble inclinometer and a three dimensional motion analysis (3DMA) system were collected simultaneously. Intraclass correlation coefficients (ICCs), coefficients of variation (CV) and standard error of measurement (SEM) were used to assess reliability. To assess validity of the Smartphone application and the bubble inclinometer against the three dimensional motion analysis system, intraclass correlation coefficients and fixed and proportional biases were used. The Smartphone demonstrated good to excellent reliability (ICCs>0.75) for four out of the seven movements, and moderate to good reliability for the remaining three movements (ICC=0.63-0.68). Additionally, the Smartphone application displayed comparable reliability to the bubble inclinometer. The Smartphone application displayed excellent validity when compared to the three dimensional motion analysis system for all movements (ICCs>0.88) except one, which displayed moderate to good validity (ICC=0.71). Smartphones are portable and widely available tools that are mostly reliable and valid for assessing passive hip range of motion, with potential for large-scale use when a bubble inclinometer is not available. However, caution must be taken in its implementation as some movement axes demonstrated only moderate reliability. Copyright © 2014 Sports Medicine Australia. Published by Elsevier Ltd. All rights reserved.
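Illustrative sketch (not part of the record): two of the reliability statistics named above, the standard error of measurement (SEM) and the coefficient of variation (CV), are simple functions of a standard deviation. The record does not give its formulas; the widely used forms SEM = SD x sqrt(1 - ICC) and CV = 100 x SD / mean are assumed here, and the angles are invented.

```python
import math
from statistics import mean, stdev


def standard_error_of_measurement(sd, icc):
    """SEM in the units of the measurement, assuming SEM = SD * sqrt(1 - ICC)."""
    return sd * math.sqrt(1 - icc)


def coefficient_of_variation(values):
    """CV as a percentage: sample standard deviation relative to the mean."""
    return 100.0 * stdev(values) / mean(values)


# Hypothetical repeated hip-flexion readings (degrees) for one participant.
angles = [28.0, 30.0, 32.0]
cv = coefficient_of_variation(angles)
sem = standard_error_of_measurement(sd=5.0, icc=0.75)
print(f"CV = {cv:.2f}%, SEM = {sem:.2f} degrees")
```

Because the SEM is expressed in degrees, it gives a more direct sense than the ICC of how much two readings on the same person can differ through measurement error alone.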
A new scale for the assessment of performance and capacity of hand function in children with hemiplegic cerebral palsy: reliability and validity studies.

PubMed

Rosa-Rizzotto, M; Visonà Dalla Pozza, L; Corlatti, A; Luparia, A; Marchi, A; Molteni, F; Facchin, P; Pagliano, E; Fedrizzi, E

2014-10-01

In hemiplegic children, recognition of the activity limitation pattern and the possibility of grading its severity are relevant for clinicians when planning interventions, monitoring results and predicting outcomes. The aim of the study is to examine the reliability and validity of the Besta Scale, an instrument used to measure, in hemiplegic children from 18 months to 12 years of age, both grasp on request (capacity) and spontaneous use of the upper limb (performance) in bimanual play activities and in ADL. Psychometric analysis of the reliability and validity of the Besta scale was performed. Outpatient study sample. Reliability study: A sample of 39 patients was enrolled. The administration of the Besta scale was video-recorded in a standardized manner. All videos were scored by 20 independent raters on subsequent viewing. Three raters randomly selected from the 20-rater group rescored the same video two years later for intra-rater reliability. Intra- and inter-rater reliability were calculated using the Intraclass Correlation Coefficient (ICC) and Kendall's coefficient (K), respectively. Internal consistency reliability was assessed using Cronbach's alpha coefficient. Validity study: a sample of 105 children was assessed 5 times (at t0 and 2, 3, 6 and 12 months later) by 20 independent raters. Each patient underwent QUEST and Besta scale administration and assessment at the same time. Criterion validity was calculated using Pearson's correlation coefficient. Reliability study: The inter-rater reliability calculated with Kendall's coefficient was moderate (K=0.47). The intra-rater (or test-retest) reliability for 3 raters was excellent (ICC=0.927). Cronbach's alpha for internal consistency was 0.972. Validity study: The Besta scale showed good criterion validity compared to QUEST, increasing with age and severity of impairment. Pearson's correlation coefficient r was 0.81 (P<0.0001). Limitations. 
In infants, the Besta scale has difficulty distinguishing between mildly and moderately impaired hand function. The Besta scale scoring system is a valid and reliable tool that can be used in a clinical setting to monitor the evolution of unimanual and bimanual manipulation and to distinguish the hand's capacity from its performance.
Validity and reliability of the Utrecht Work Engagement Scale-Student Version in Sri Lanka.

PubMed

Wickramasinghe, Nuwan Darshana; Dissanayake, Devani Sakunthala; Abeywardena, Gihan Sajiwa

2018-05-04

The present study was aimed at assessing the validity and the reliability of the Sinhala version of the Utrecht Work Engagement Scale-Student Version (UWES-S) among collegiate cycle students in Sri Lanka. The 17-item UWES-S was translated to Sinhala and the judgmental validity was assessed by a multi-disciplinary panel of experts. Construct validity of the UWES-S was appraised by using multi-trait scaling analysis and exploratory factor analysis (EFA) on data obtained from a sample of 194 grade thirteen students in the Kurunegala district, Sri Lanka. Reliability of the UWES-S was assessed by using internal consistency and test-retest reliability. Except for item 13, all other items showed good psychometric properties in judgemental validity, item-convergent validity and item-discriminant validity. EFA using principal component analysis with Oblimin rotation, suggested a three-factor solution (including vigor, dedication and absorption subscales) explaining 65.4% of the total variance for the 16-item UWES-S (with item 13 deleted). All three subscales show high internal consistency with Cronbach's α coefficient values of 0.867, 0.819, and 0.903 and test-retest reliability was high (p < 0.001). Hence, the Sinhala version of the 16-item UWES-S is a valid and a reliable instrument to assess work engagement among collegiate cycle students in Sri Lanka.
Validity and reliability of a Malay version of the Lawton instrumental activities of daily living scale among the Malay speaking elderly in Malaysia.

PubMed

Kadar, Masne; Ibrahim, Suhaili; Razaob, Nor Afifi; Chai, Siaw Chui; Harun, Dzalani

2018-02-01

The Lawton Instrumental Activities of Daily Living Scale is a tool often used to assess independence among elderly at home. Its suitability to be used with the elderly population in Malaysia has not been validated. This current study aimed to assess the validity and reliability of the Lawton Instrumental Activities of Daily Living Scale - Malay Version to Malay speaking elderly in Malaysia. This study was divided into three phases: (1) translation and linguistic validity involving both forward and backward translations; (2) establishment of face validity and content validity; and (3) establishment of reliability involving inter-rater, test-retest and internal consistency analyses. Data used for these analyses were obtained by interviewing 65 elderly respondents. Percentages of Content Validity Index for 4 criteria were from 88.89 to 100.0. The Cronbach α coefficient for internal consistency was 0.838. Intra-class Correlation Coefficient of inter-rater reliability and test-retest reliability was 0.957 and 0.950 respectively. The result shows that the Lawton Instrumental Activities of Daily Living Scale - Malay Version has excellent reliability and validity for use with the Malay speaking elderly people in Malaysia. This scale could be used by professionals to assess functional ability of elderly who live independently in community. © 2018 Occupational Therapy Australia.
The bottom-up approach to integrative validity: a new perspective for program evaluation.

PubMed

Chen, Huey T

2010-08-01

The Campbellian validity model and the traditional top-down approach to validity have had a profound influence on research and evaluation. That model includes the concepts of internal and external validity and within that model, the preeminence of internal validity as demonstrated in the top-down approach. Evaluators and researchers have, however, increasingly recognized that in an evaluation, the over-emphasis on internal validity reduces that evaluation's usefulness and contributes to the gulf between academic and practical communities regarding interventions. This article examines the limitations of the Campbellian validity model and the top-down approach and provides a comprehensive, alternative model, known as the integrative validity model for program evaluation. The integrative validity model includes the concept of viable validity, which is predicated on a bottom-up approach to validity. This approach better reflects stakeholders' evaluation views and concerns, makes external validity workable, and becomes therefore a preferable alternative for evaluation of health promotion/social betterment programs. The integrative validity model and the bottom-up approach enable evaluators to meet scientific and practical requirements, facilitate in advancing external validity, and gain a new perspective on methods. The new perspective also furnishes a balanced view of credible evidence, and offers an alternative perspective for funding. Copyright (c) 2009 Elsevier Ltd. All rights reserved.
[External and internal validity of a multidimensional Locus of control scale of eating attitudes for athletes (LOCSCAS)].

PubMed

Paquet, Y; Scoffier, S; d'Arripe-Longueville, F

2016-10-01

In the field of health psychology, control has consistently been considered a protective factor. This protective role has also been highlighted in the domain of eating attitudes. However, current studies use Rotter's one-dimensional scale or the multidimensional health locus of control scale, and no scale specific to eating attitudes in the sport context exists. Moreover, social influence is covered only to a limited extent in previous scales. In line with recent work, the purpose of this study was to test the internal and external validity of a multidimensional locus of control scale of eating attitudes for athletes. One hundred and seventy-nine participants were recruited. A confirmatory factor analysis was conducted in order to test the internal validity of the scale. The external validity of the scale was tested in relation to eating attitudes. The internal validity of the scale was verified, as was the external validity, which confirmed the importance of taking social influences into consideration. Indeed, the 2 subscales "Trainers, friends" and "Parents, family" are related positively and negatively, respectively, to eating disorders. Copyright © 2016 L'Encéphale, Paris. Published by Elsevier Masson SAS. All rights reserved.
Validity and Reliability of the Turkish Version of Needs Based Biopsychosocial Distress Instrument for Cancer Patients (CANDI)

PubMed Central

Beyhun, Nazim Ercument; Can, Gamze; Tiryaki, Ahmet; Karakullukcu, Serdar; Bulut, Bekir; Yesilbas, Sehbal; Kavgaci, Halil; Topbas, Murat

2016-01-01

Background: Needs based biopsychosocial distress instrument for cancer patients (CANDI) is a scale based on needs arising due to the effects of cancer. Objectives: The aim of this research was to determine the reliability and validity of the CANDI scale in the Turkish language. Patients and Methods: The study was performed with the participation of 172 cancer patients aged 18 and over. Factor analysis (principal components analysis) was used to assess construct validity. Criterion validities were tested by computing Spearman correlation between CANDI and hospital anxiety depression scale (HADS), and brief symptom inventory (BSI) (convergent validity) and quality of life scales (FACT-G) (divergent validity). Test-retest reliabilities and internal consistencies were measured with intraclass correlation (ICC) and Cronbach-α. Results: A three-factor solution (emotional, physical and social) was found with factor analysis. Internal reliability (α = 0.94) and test-retest reliability (ICC = 0.87) were significantly high. Correlations between CANDI and HADS (rs = 0.67), and BSI (rs = 0.69) and FACT-G (rs = -0.76) were moderate and significant in the expected direction. Conclusions: CANDI is a valid and reliable scale in cancer patients with a three-factor structure (emotional, physical and social) in the Turkish language. PMID:27621931
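The internal-consistency figure in this abstract (α = 0.94) is a Cronbach's alpha. The following is a minimal sketch of how such a coefficient can be computed, assuming item-level scores are available as one row per respondent. The function name `cronbach_alpha` and the toy data are hypothetical and are not taken from the study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a table of item scores.

    scores: list of respondents, each a list of k item scores
    (hypothetical layout; the study's raw data are not available here).
    """
    n = len(scores)
    k = len(scores[0])
    if n < 2 or k < 2:
        raise ValueError("need at least 2 respondents and 2 items")

    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total scores have zero variance")

    # alpha = k/(k-1) * (1 - sum of item variances / variance of totals)
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


if __name__ == "__main__":
    # Made-up responses: 5 respondents, 4 items.
    toy = [
        [3, 4, 3, 4],
        [1, 2, 1, 1],
        [4, 4, 5, 4],
        [2, 2, 3, 2],
        [5, 4, 4, 5],
    ]
    print(round(cronbach_alpha(toy), 3))
```

Alpha approaches 1 when items rise and fall together across respondents and drops toward 0 when item scores are unrelated.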
Constructing a question bank based on script concordance approach as a novel assessment methodology in surgical education.

PubMed

Aldekhayel, Salah A; Alselaim, Nahar A; Magzoub, Mohi Eldin; Al-Qattan, Mohammad M; Al-Namlah, Abdullah M; Tamim, Hani; Al-Khayal, Abdullah; Al-Habdan, Sultan I; Zamakhshary, Mohammed F

2012-10-24

Script Concordance Test (SCT) is a new assessment tool that reliably assesses clinical reasoning skills. Previous descriptions of developing SCT-question banks were merely subjective. This study addresses two gaps in the literature: 1) conducting the first phase of a multistep validation process of SCT in Plastic Surgery, and 2) providing an objective methodology to construct a question bank based on SCT. After developing a test blueprint, 52 test items were written. Five validation questions were developed and a validation survey was established online. Seven reviewers were asked to answer this survey. They were recruited from two countries, Saudi Arabia and Canada, to improve the test's external validity. Their ratings were transformed into percentages. Analysis was performed to compare reviewers' ratings by looking at correlations, ranges, means, medians, and overall scores. Scores of reviewers' ratings were between 76% and 95% (mean 86% ± 5). We found poor correlations between reviewers (Pearson's: +0.38 to -0.22). Ratings of individual validation questions ranged between 0 and 4 (on a scale 1-5). Means and medians of these ranges were computed for each test item (mean: 0.8 to 2.4; median: 1 to 3). A subset of test items comprising 27 items was generated based on a set of inclusion and exclusion criteria. This study proposes an objective methodology for validation of SCT-question bank. Analysis of validation survey is done from all angles, i.e., reviewers, validation questions, and test items. Finally, a subset of test items is generated based on a set of criteria.
Determining the Appropriateness of the "What If" Situations Test (WIST) with Turkish Pre-Schoolers.

PubMed

Citak Tunc, Gulseren; Gorak, Gulay; Ozyazicioglu, Nurcan; Ak, Bedriye; Isil, Ozlem; Vural, Pinar

2018-04-01

Measurement instruments are needed to assess child sexual abuse prevention programs. The purpose of the study was to determine the reliability and validity of the WIST (What If Situations Test) for Turkish culture. Participants were children of the 3-6 age group attending pre-school education institutions and the sample size was identified by means of a power analysis. Seventy children were identified as the sample with 0.85 power and 0.05 type I error according to the power analysis. Language validity, content validity, internal validity coefficient (Cronbach alpha coefficient), and test-retest analyses were conducted in terms of validity and reliability in the scope of efforts for adaptation to Turkish culture. First, the expert opinions concerning the content validity of the language validity scale yielded Kendall's W = 0.83. It was found that the Cronbach alpha coefficients were between 0.68 and 0.90 for the scale sub-dimensions of appropriate and inappropriate recognition, saying, doing, telling, and reporting. The test-retest reliability of the scale was found to be r = 0.89 and the test-retest reliabilities for the sub-dimensions (appropriate recognition, inappropriate recognition, say skills, do skills, tell skills, and reporting skills) were between r = 0.48 and r = 0.92. The test-retest reliability for the Personal Safety Questionnaire (PSQ), which has items complementary to the WIST, was found to be r = 0.82. The reliability and validity analysis of the 'What If' Situations Test (WIST), used to evaluate pre-schoolers' skills regarding self-protection against sexual abuse, showed that the Test's adaptation to Turkish culture was reliable and valid.
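The expert-agreement statistic reported in this abstract is Kendall's coefficient of concordance (W). The abstract does not say how the expert opinions were coded, so the sketch below rests on a stated assumption: each expert supplies a complete ranking of the same items with no ties. The function name `kendalls_w` and the example rankings are hypothetical.

```python
def kendalls_w(rankings):
    """Kendall's W for m raters who each rank the same n items.

    rankings: list of m lists, each a permutation of 1..n
    (assumption: complete rankings without ties, so no tie
    correction is applied).
    """
    m = len(rankings)
    n = len(rankings[0])
    if m < 2 or n < 2:
        raise ValueError("need at least 2 raters and 2 items")
    for r in rankings:
        if sorted(r) != list(range(1, n + 1)):
            raise ValueError("each ranking must be a permutation of 1..n")

    # Sum of the ranks each item received across all raters.
    totals = [sum(r[i] for r in rankings) for i in range(n)]
    mean_total = m * (n + 1) / 2
    # Squared deviations of the rank sums from their mean.
    s = sum((t - mean_total) ** 2 for t in totals)
    # W = 12 S / (m^2 (n^3 - n)); 1 = full agreement, 0 = none.
    return 12 * s / (m ** 2 * (n ** 3 - n))


if __name__ == "__main__":
    # Three hypothetical experts ranking four items.
    experts = [
        [1, 2, 3, 4],
        [1, 3, 2, 4],
        [2, 1, 3, 4],
    ]
    print(round(kendalls_w(experts), 3))
```

If the experts used rating scales rather than rankings, the ratings would first need to be converted to ranks and a tie correction applied, which this sketch does not do.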
Test-retest reliability and construct validity of the ENERGY-parent questionnaire on parenting practices, energy balance-related behaviours and their potential behavioural determinants: the ENERGY-project.

PubMed

Singh, Amika S; Chinapaw, Mai J M; Uijtdewilligen, Léonie; Vik, Froydis N; van Lippevelde, Wendy; Fernández-Alvira, Juan M; Stomfai, Sarolta; Manios, Yannis; van der Sluijs, Maria; Terwee, Caroline; Brug, Johannes

2012-08-13

Insight into parental energy balance-related behaviours, their determinants and parenting practices is important to inform childhood obesity prevention. Therefore, reliable and valid tools to measure these variables in large-scale population research are needed. The objective of the current study was to examine the test-retest reliability and construct validity of the parent questionnaire used in the ENERGY-project, assessing parental energy balance-related behaviours, their determinants, and parenting practices among parents of 10-12 year old children. We collected data among parents (n = 316 in the test-retest reliability study; n = 109 in the construct validity study) of 10-12 year-old children in six European countries, i.e. Belgium, Greece, Hungary, the Netherlands, Norway, and Spain. Test-retest reliability was assessed using the intra-class correlation coefficient (ICC) and percentage agreement comparing scores from two measurements, administered one week apart. To assess construct validity, the agreement between questionnaire responses and a subsequent interview was assessed using ICC and percentage agreement. All but one item showed good to excellent test-retest reliability as indicated by ICCs > .60 or percentage agreement ≥ 75%. Construct validity appeared to be good to excellent for 92 out of 121 items, as indicated by ICCs > .60 or percentage agreement ≥ 75%. Of the other 29 items, construct validity was moderate for 24 and poor for 5 items. The reliability and construct validity of the items of the ENERGY-parent questionnaire on multiple energy balance-related behaviours, their potential determinants, and parenting practices appear to be good. Based on the results of the validity study, we strongly recommend adapting parts of the ENERGY-parent questionnaire if used in future research.
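This abstract judges each item by an ICC above .60 or a percentage agreement of at least 75%. The sketch below shows one way both statistics can be computed for a single item measured on two or more occasions. The abstract does not state which ICC form was used; the two-way random-effects, absolute-agreement, single-measurement form (ICC(2,1) in Shrout and Fleiss notation) is an assumption made here, and the function names are hypothetical.

```python
def percent_agreement(first, second):
    """Percentage of respondents giving the same answer on both occasions."""
    if len(first) != len(second) or not first:
        raise ValueError("need two non-empty sequences of equal length")
    matches = sum(1 for a, b in zip(first, second) if a == b)
    return 100.0 * matches / len(first)


def icc_2_1(data):
    """ICC(2,1): two-way random effects, absolute agreement, single measure.

    data: list of n subjects, each a list of k measurements
    (e.g. k = 2 for test and retest).
    """
    n, k = len(data), len(data[0])
    if n < 2 or k < 2:
        raise ValueError("need at least 2 subjects and 2 measurements")

    grand = sum(sum(row) for row in data) / (n * k)
    row_means = [sum(row) / k for row in data]
    col_means = [sum(row[j] for row in data) / n for j in range(k)]

    # Two-way ANOVA sums of squares.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    denom = ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    if denom == 0:
        raise ValueError("no variance in the data")
    return (ms_rows - ms_error) / denom


if __name__ == "__main__":
    # Made-up test and retest answers for one questionnaire item.
    test = [3, 4, 2, 5, 1, 4]
    retest = [3, 4, 3, 5, 1, 5]
    print(percent_agreement(test, retest))
    print(round(icc_2_1([list(p) for p in zip(test, retest)]), 3))
```

Percentage agreement suits categorical items, while the ICC suits ordinal or continuous scores, which may be why the study accepts either criterion.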
