McCurdy, M; Bellows, A; Deng, D; Leppert, M; Mahone, E; Pritchard, A
2015-01-01
Reliable and valid screening and assessment tools are necessary to identify children at risk for neurodevelopmental disabilities who may require additional services. This study evaluated the test-retest reliability of the Capute Scales in a high-risk sample, hypothesizing adequate reliability across 6- and 12-month intervals. Capute Scales scores (N = 66) were collected via retrospective chart review from a NICU follow-up clinic within a large urban medical center spanning three age-ranges: 12-18, 19-24, and 25-36 months. On average, participants were classified as very low birth weight and premature. Reliability of the Capute Scales was evaluated with intraclass correlation coefficients across length of test-retest interval, age at testing, and degree of neonatal complications. The Capute Scales demonstrated high reliability, regardless of length of test-retest interval (ranging from 6 to 14 months) or age of participant, for all index scores, including overall Developmental Quotient (DQ), language-based skill index (CLAMS) and nonverbal reasoning index (CAT). Linear regressions revealed that greater neonatal risk was related to poorer test-retest reliability; however, reliability coefficients remained strong. The Capute Scales afford clinicians a reliable and valid means of screening and assessing for neurodevelopmental delay within high-risk infant populations.
Paap, Kenneth R; Sawi, Oliver
2016-12-01
Studies testing for individual or group differences in executive functioning can be compromised by unknown test-retest reliability. Test-retest reliabilities across an interval of about one week were obtained from performance in the antisaccade, flanker, Simon, and color-shape switching tasks. There is a general trade-off between the greater reliability of single mean RT measures, and the greater process purity of measures based on contrasts between mean RTs in two conditions. The individual differences in RT model recently developed by Miller and Ulrich was used to evaluate the trade-off. Test-retest reliability was statistically significant for 11 of the 12 measures, but was of moderate size, at best, for the difference scores. The test-retest reliabilities for the Simon and flanker interference scores were lower than those for switching costs. Standard practice evaluates the reliability of executive-functioning measures using split-half methods based on data obtained in a single day. Our test-retest measures of reliability are lower, especially for difference scores. These reliability measures must also take into account possible day effects that classical test theory assumes do not occur. Measures based on single mean RTs tend to have acceptable levels of reliability and convergent validity, but are "impure" measures of specific executive functions. The individual differences in RT model shows that the impurity problem is worse than typically assumed. However, the "purer" measures based on difference scores have low convergent validity that is partly caused by deficiencies in test-retest reliability. Copyright © 2016 Elsevier B.V. All rights reserved.
The reliability of WorkWell Systems Functional Capacity Evaluation: a systematic review
2014-01-01
Background Functional capacity evaluation (FCE) determines a person’s ability to perform work-related tasks and is a major component of the rehabilitation process. The WorkWell Systems (WWS) FCE (formerly known as Isernhagen Work Systems FCE) is currently the most commonly used FCE tool in German rehabilitation centres. Our systematic review investigated the inter-rater, intra-rater and test-retest reliability of the WWS FCE. Methods We performed a systematic literature search of studies on the reliability of the WWS FCE and extracted item-specific measures of inter-rater, intra-rater and test-retest reliability from the identified studies. Intraclass correlation coefficients ≥ 0.75, percentages of agreement ≥ 80%, and kappa coefficients ≥ 0.60 were categorised as acceptable, otherwise they were considered non-acceptable. The extracted values were summarised for the five performance categories of the WWS FCE, and the results were classified as either consistent or inconsistent. Results From 11 identified studies, 150 item-specific reliability measures were extracted. 89% of the extracted inter-rater reliability measures, all of the intra-rater reliability measures and 96% of the test-retest reliability measures of the weight handling and strength tests had an acceptable level of reliability, compared to only 67% of the test-retest reliability measures of the posture/mobility tests and 56% of the test-retest reliability measures of the locomotion tests. Both of the extracted test-retest reliability measures of the balance test were acceptable. Conclusions Weight handling and strength tests were found to have consistently acceptable reliability. Further research is needed to explore the reliability of the other tests as inconsistent findings or a lack of data prevented definitive conclusions. PMID:24674029
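The acceptability rule in this review can be sketched in a few lines. The cut-offs below are the ones stated in the abstract; the function names and the example values are hypothetical and serve only as illustration.

```python
# Cut-offs as stated in the review abstract:
# ICC >= 0.75, percentage agreement >= 80%, kappa >= 0.60.
THRESHOLDS = {"icc": 0.75, "percent_agreement": 80.0, "kappa": 0.60}


def is_acceptable(statistic: str, value: float) -> bool:
    """Return True when the value meets the cut-off for that statistic."""
    return value >= THRESHOLDS[statistic]


def share_acceptable(measures) -> float:
    """Fraction of (statistic, value) pairs classified as acceptable."""
    flags = [is_acceptable(stat, val) for stat, val in measures]
    return sum(flags) / len(flags)


# Hypothetical extracted values, not taken from the reviewed studies.
example = [
    ("icc", 0.91),
    ("icc", 0.70),
    ("kappa", 0.60),
    ("percent_agreement", 75.0),
]
print(share_acceptable(example))
```

Applying the same function per performance category would give percentages of the kind reported in the Results, although the review's own tabulation procedure is not described in the abstract.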
Test-retest reliability of the Progressive Isoinertial Lifting Evaluation (PILE).
Lygren, Hildegunn; Dragesund, Tove; Joensen, Jón; Ask, Tove; Moe-Nilssen, Rolf
2005-05-01
A repeated measures single group design. To investigate test-retest reliability of Progressive Isoinertial Lifting Evaluation on patients with long lasting musculoskeletal problems related to the lumbar spine. Test-retest reliability has been satisfactory in healthy men. Test-retest reliability for clinical populations has not been reported. A total of 31 patients (17 women and 14 men) with long lasting low back pain participated in the study. The patients were tested twice at an interval of 2 days and at the same time of the day. The heaviest load that the patient could lift 4 times was used as outcome measure. The error of measurement indicates that the true result in 95% of cases will be within +/-4.5 kg from the measured value, while the difference between 2 measurements in 95% of cases will be less than 6.4 kg. Intra-class correlation (1,1) was 0.91. Relative test-retest reliability was high assessed by intra-class correlation, but absolute measurement variability reported as the smallest detectable difference has relevance for the interpretation of clinical test results and should also be considered.
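The two absolute figures in this abstract (±4.5 kg and 6.4 kg) are consistent with the usual normal-theory relations between the standard error of measurement (SEM) and the smallest detectable difference. The sketch below assumes those standard formulas were used; the abstract does not state this explicitly, and the function names are illustrative.

```python
import math

Z_95 = 1.96  # two-sided 95% quantile of the standard normal distribution


def sem_from_half_width(half_width: float) -> float:
    """SEM implied by a 95% half-width, assuming half_width = 1.96 * SEM."""
    return half_width / Z_95


def smallest_detectable_difference(sem: float) -> float:
    """95% limit for the difference of two measurements: 1.96 * sqrt(2) * SEM."""
    return Z_95 * math.sqrt(2) * sem


sem = sem_from_half_width(4.5)             # roughly 2.3 kg
sdd = smallest_detectable_difference(sem)  # roughly 6.4 kg
print(round(sem, 2), round(sdd, 1))
```

Under these assumptions a change between two sessions smaller than about 6.4 kg cannot be distinguished from measurement error, despite the high intraclass correlation.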
Bruininks-Oseretsky Test of Motor Proficiency: Further Verification with 3- to 5-yr.-old Children.
ERIC Educational Resources Information Center
Beitel, Patricia A.; Mead, Barbara J.
1982-01-01
The Bruininks-Oseretsky Test of Motor Proficiency was evaluated to determine test-retest reliability and if there were presensitizing effects at retest for four- to five-year olds. Test reliability was significantly high. No significant test sensitization of the short form to retesting with the short form or subtests was found. (Author/RD)
Merritt, Victoria C; Bradson, Megan L; Meyer, Jessica E; Arnett, Peter A
2018-05-01
The Immediate Post-Concussion Assessment and Cognitive Testing (ImPACT) is a commonly used tool in sports concussion assessment. While test-retest reliabilities have been established for the ImPACT cognitive composites, few studies have evaluated the psychometric properties of the ImPACT's Post-Concussion Symptom Scale (PCSS). The purpose of this study was to establish the test-retest reliability of symptom indices associated with the PCSS. Participants included 38 undergraduate students (50.0% male) who underwent neuropsychological testing as part of their participation in their psychology department's research subject pool. The majority of the participants were Caucasian (94.7%) and had no history of concussion (73.7%). All participants completed the ImPACT at two time points, approximately 6 weeks apart. The PCSS was the main outcome measure, and eight symptom indices were calculated (a total symptom score, three symptom summary indices, and four symptom clusters). Pearson correlations (r) and intraclass correlation coefficients (ICCs) were computed as measures of test-retest reliability. Overall, reliabilities ranged from low to high (r = .44 to .80; ICC = .44 to .77). The cognitive symptom cluster exhibited the highest test-retest reliability (r = .80, ICC = .77), followed by the positive symptom total (PST) index, an indicator of the total number of symptoms endorsed (r = .71, ICC = .69). In contrast, the commonly used total symptom score showed lower test-retest reliability (r = .67, ICC = .62). Paired-samples t tests revealed no significant differences between test and retest for any of the symptom variables (all p > .01). Finally, reliable change indices (RCI) were computed to determine whether differences observed between test and retest represented clinically significant change. RCI values were provided for each symptom index at the 80%, 90%, and 95% confidence intervals. 
These results suggest that evaluating additional symptom indices beyond the total symptom score from the PCSS is beneficial. Findings from this study can be applied to athlete samples to assess reliable change in symptoms following concussion.
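The abstract does not say which reliable change formula was used. A minimal sketch, assuming the common approach in which the standard error of the difference is derived from the baseline standard deviation and the test-retest correlation, is given below. The standard deviation of 10.0 is hypothetical; only r = .67 (total symptom score) comes from the abstract.

```python
import math
from statistics import NormalDist


def se_difference(sd_baseline: float, reliability: float) -> float:
    """Standard error of the difference between two scores.

    SEM = SD * sqrt(1 - r); SE_diff = sqrt(2) * SEM.
    """
    sem = sd_baseline * math.sqrt(1.0 - reliability)
    return math.sqrt(2.0) * sem


def rci_threshold(sd_baseline: float, reliability: float, confidence: float) -> float:
    """Smallest absolute change that exceeds the two-sided confidence band."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return z * se_difference(sd_baseline, reliability)


# sd_baseline=10.0 is a made-up value for illustration.
for level in (0.80, 0.90, 0.95):
    print(level, round(rci_threshold(10.0, 0.67, level), 2))
```

A lower test-retest correlation widens the band, so an index with r = .67 needs a larger observed change than one with r = .80 before the change counts as reliable at the same confidence level.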
Omari, Taher I.; Savilampi, Johanna; Kokkinn, Karmen; Schar, Mistyka; Lamvik, Kristin; Doeltgen, Sebastian; Cock, Charles
2016-01-01
Purpose. We evaluated the intra- and interrater agreement and test-retest reliability of analyst derivation of swallow function variables based on repeated high resolution manometry with impedance measurements. Methods. Five subjects swallowed 10 × 10 mL saline on two occasions one week apart producing a database of 100 swallows. Swallows were repeat-analysed by six observers using software. Swallow variables were indicative of contractility, intrabolus pressure, and flow timing. Results. The average intraclass correlation coefficients (ICC) for intra- and interrater comparisons of all variable means showed substantial to excellent agreement (intrarater ICC 0.85–1.00; mean interrater ICC 0.77–1.00). Test-retest results were less reliable. ICC for test-retest comparisons ranged from slight to excellent depending on the class of variable. Contractility variables differed most in terms of test-retest reliability. Amongst contractility variables, UES basal pressure showed excellent test-retest agreement (mean ICC 0.94), measures of UES postrelaxation contractile pressure showed moderate to substantial test-retest agreement (mean Interrater ICC 0.47–0.67), and test-retest agreement of pharyngeal contractile pressure ranged from slight to substantial (mean Interrater ICC 0.15–0.61). Conclusions. Test-retest reliability of HRIM measures depends on the class of variable. Measures of bolus distension pressure and flow timing appear to be more test-retest reliable than measures of contractility. PMID:27190520
Test-retest and interrater reliability of the functional lower extremity evaluation.
Haitz, Karyn; Shultz, Rebecca; Hodgins, Melissa; Matheson, Gordon O
2014-12-01
Repeated-measures clinical measurement reliability study. To establish the reliability and face validity of the Functional Lower Extremity Evaluation (FLEE). The FLEE is a 45-minute battery of 8 standardized functional performance tests that measures 3 components of lower extremity function: control, power, and endurance. The reliability and normative values for the FLEE in healthy athletes are unknown. A face validity survey for the FLEE was sent to sports medicine personnel to evaluate the level of importance and frequency of clinical usage of each test included in the FLEE. The FLEE was then administered and rated for 40 uninjured athletes. To assess test-retest reliability, each athlete was tested twice, 1 week apart, by the same rater. To assess interrater reliability, 3 raters scored each athlete during 1 of the testing sessions. Intraclass correlation coefficients were used to assess the test-retest and interrater reliability of each of the FLEE tests. In the face validity survey, the FLEE tests were rated as highly important by 58% to 71% of respondents but frequently used by only 26% to 45% of respondents. Interrater reliability intraclass correlation coefficients ranged from 0.83 to 1.00, and test-retest reliability ranged from 0.71 to 0.95. The FLEE tests are considered clinically important for assessing lower extremity function by sports medicine personnel but are underused. The FLEE also is a reliable assessment tool. Future studies are required to determine if use of the FLEE to make return-to-play decisions may reduce reinjury rates.
Assessing fear-avoidance beliefs in patients with cervical radiculopathy.
Dedering, Asa; Börjesson, Tina
2013-12-01
The study sought to evaluate validity and reliability of the Fear Avoidance Beliefs Questionnaire and the Tampa Scale for Kinesiophobia in patients with cervical radiculopathy. A test-retest design was used to test stability over time in 46 patients with cervical radiculopathy. Differences between patients and healthy subjects were also evaluated comparing the patients with 41 physically active and healthy subjects. The patients answered the Fear Avoidance Beliefs Questionnaire and the Tampa Scale for Kinesiophobia twice. To test for differences between the patients and the healthy subjects, the latter answered the same questionnaires once. Questionnaires about activity, personal factors and health were also used. The test-retest reliability assessed with weighted kappa was 0.68 for the Fear Avoidance Beliefs Questionnaire and 0.45 for the Tampa Scale for Kinesiophobia. Only six of the 11 single items of the Fear Avoidance Beliefs Questionnaire and none of the single items of the Tampa Scale of Kinesiophobia showed kappa coefficients exceeding 0.60 (good reliability). Patients with cervical radiculopathy rated significantly worse on the Fear Avoidance Beliefs Questionnaire and the Tampa Scale for Kinesiophobia than the healthy subjects did. The Fear Avoidance Beliefs Questionnaire may be recommended for test-retest evaluations because 'good' reliability was found. The Tampa Scale for Kinesiophobia had only 'moderate' test-retest reliability, and this should be considered when using this scale in test-retest evaluations. Both questionnaires can discriminate between patients with cervical radiculopathy and healthy subjects. Copyright © 2012 John Wiley & Sons, Ltd.
Malinowsky, Camilla; Kassberg, Ann-Charlotte; Larsson-Lund, Maria; Kottorp, Anders
2016-01-01
To evaluate the test-retest reliability of the Management of Everyday Technology Assessment (META) in a sample of people with acquired brain injury (ABI). The META was administered twice within a two-week period to 25 people with ABI. A Rasch measurement model was used to convert the META ordinal raw scores into equal-interval linear measures of each participant's ability to manage everyday technology (ET). Test-retest reliability of the stability of the person ability measures in the META was examined by a standardized difference Z-test and an intra-class correlations analysis (ICC 1). The results showed that the paired person ability measures generated from the META were stable over the test-retest period for 22 of the 25 subjects. The ICC 1 correlation was 0.63, which indicates good overall reliability. The META demonstrated acceptable test-retest reliability in a sample of people with ABI. The results illustrate the importance of using sufficiently challenging ETs (relative to a person's abilities) to generate stable META measurements over time. Implications for Rehabilitation The findings add evidence regarding the test-retest reliability of the person ability measures generated from the observation assessment META in a sample of people with ABI. The META might support professionals in the evaluation of interventions that are designed to improve clients' performance of activities including the ability to manage ET.
Test-retest reliability of the eating disorder examination-questionnaire (EDE-Q) in a college sample
2013-01-01
Background The Eating Disorder Examination-Questionnaire (EDE-Q), a widely used self-report instrument, is often used for measuring change in eating disorder symptoms over the course of treatment. However, limited data exist about test-retest reliability, particularly for men. The current study evaluated EDE-Q 7-day test-retest reliability in male (n = 47) and female (n = 44) undergraduate students together and separately by gender. Results Internal consistency was consistently higher for women and at Time 2, but remained acceptable for both men and women at both time points. Cronbach’s α ranged from .75 (Restraint at Time 1) to .93 (Shape Concern at Time 2) for women and from .73 (Eating Concern at Time 2) to .89 (Shape Concern at Time 2) for men. With the exception of some of the eating disorder behaviors, test-retest reliability was fairly strong for both men and women. Shape Concern and the global EDE-Q score were highest for both men and women (Spearman’s rho > 0.89 with the exception of Shape Concern for women for which Spearman’s rho = .86). Test-retest reliability was lower for the eating disorder behavior measures, particularly for men, for whom Kendall’s tau-b for frequency and phi for occurrence were less than 0.70 for all but objective bulimic episodes. Conclusions Results were consistent with past research for women, indicating strong test-retest reliability in attitudinal features of eating disorders, but lower test-retest reliability in behavioral features. Internal consistency and test-retest reliability were good for the attitudinal features of eating disorders in men, but tended to be lower for men compared to women. The EDE-Q appears to be a reliable instrument for assessing eating disorder attitudes in both male and female undergraduate students, but is less reliable for assessing ED behaviors, particularly in men. PMID:24999420
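Internal consistency in this abstract is reported as Cronbach's α. The sketch below shows the standard formula for a respondents-by-items score table; the function name and the small data set are hypothetical and unrelated to the EDE-Q data.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a table of respondents (rows) by items (columns).

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    Sample variances (n - 1 denominator) are used throughout.
    """
    k = len(scores[0])
    item_vars = [variance([resp[j] for resp in scores]) for j in range(k)]
    total_var = variance([sum(resp) for resp in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Made-up responses of five people to a three-item subscale.
responses = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [5, 5, 4],
    [3, 3, 4],
]
print(round(cronbach_alpha(responses), 3))
```

Alpha describes how consistently the items of one subscale vary together at a single time point, which is why the abstract reports it separately for Time 1 and Time 2 rather than as a test-retest statistic.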
Test-retest reliability of a standardized psychiatric interview (DIS/CIDI).
Semler, G; Wittchen, H U; Joschke, K; Zaudig, M; von Geiso, T; Kaiser, S; von Cranach, M; Pfister, H
1987-01-01
The reliability of DSM-III diagnoses using an expanded version of the Diagnostic Interview Schedule (DIS), called the Composite International Diagnostic Interview (CIDI), was evaluated by examining 60 psychiatric inpatients on a test-retest basis. Acceptable agreement coefficients of (kappa) 0.5 or above were found for all but two disorders: dysthymic disorder and generalized anxiety disorder. The subclassification of DSM-III affective disorders also revealed some discrepancies between the test and the retest interviews. When compared with results from earlier versions of the DIS, diagnostic reliability was found to have improved for the DSM-III anxiety disorders in particular. These improvements can possibly be attributed to some changes in the wording of the respective items of this section. Several reasons for lowered test-retest reliability are discussed.
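Agreement between the test and retest interviews is summarised here with kappa, which corrects raw agreement for agreement expected by chance. A minimal sketch of unweighted Cohen's kappa for one diagnosis follows; the ten diagnosis pairs are invented for illustration and are not study data.

```python
from collections import Counter


def cohen_kappa(first, second):
    """Unweighted Cohen's kappa for two equal-length lists of category labels.

    kappa = (p_observed - p_expected) / (1 - p_expected)
    Undefined (division by zero) when expected agreement equals 1.
    """
    n = len(first)
    observed = sum(a == b for a, b in zip(first, second)) / n
    counts_first, counts_second = Counter(first), Counter(second)
    categories = set(first) | set(second)
    expected = sum(counts_first[c] * counts_second[c] for c in categories) / (n * n)
    return (observed - expected) / (1 - expected)


# 1 = diagnosis present, 0 = absent; hypothetical test and retest results.
test_dx = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
retest_dx = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
print(round(cohen_kappa(test_dx, retest_dx), 3))
```

In this made-up example raw agreement is 80%, yet kappa is noticeably lower because half of that agreement would be expected by chance alone, which illustrates why a kappa of 0.5 can still be treated as acceptable.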
Doig, Emmah; Prescott, Sarah; Fleming, Jennifer; Cornwell, Petrea; Kuipers, Pim
2016-01-01
To examine the internal reliability and test-retest reliability of the Client-Centeredness of Goal Setting (C-COGS) scale. The C-COGS scale was administered to 42 participants with acquired brain injury after completion of multidisciplinary goal planning. Internal reliability of scale items was examined using item-partial total correlations and Cronbach's α coefficient. The scale was readministered within a 1-mo period to a subsample of 12 participants to examine test-retest reliability by calculating exact and close percentage agreement for each item. After examination of item-partial total correlations, test items were revised. The revised items demonstrated stronger internal consistency than the original items. Preliminary evaluation of test-retest reliability was fair, with an average exact percent agreement across all test items of 67%. Findings support the preliminary reliability of the C-COGS scale as a tool to evaluate and promote client-centered goal planning in brain injury rehabilitation. Copyright © 2016 by the American Occupational Therapy Association, Inc.
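Exact and close percentage agreement for a single item can be computed as sketched below. The abstract does not define "close" agreement; the sketch assumes it means ratings within one scale point, and the ratings shown are hypothetical.

```python
def percent_agreement(test, retest, tolerance=0):
    """Percentage of participants whose two ratings differ by at most `tolerance`.

    tolerance=0 gives exact agreement; tolerance=1 is an assumed definition
    of "close" agreement (within one scale point).
    """
    if len(test) != len(retest):
        raise ValueError("test and retest must have the same length")
    hits = sum(abs(a - b) <= tolerance for a, b in zip(test, retest))
    return 100.0 * hits / len(test)


# Made-up ratings of one item by six participants at two administrations.
first_admin = [4, 3, 5, 2, 4, 1]
second_admin = [4, 2, 5, 4, 4, 1]

exact = percent_agreement(first_admin, second_admin)
close = percent_agreement(first_admin, second_admin, tolerance=1)
print(round(exact, 1), round(close, 1))
```

Averaging the exact values over all items would yield a summary figure comparable in form to the 67% reported in the abstract.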
Valle, Susanne Collier; Støen, Ragnhild; Sæther, Rannei; Jensenius, Alexander Refsum; Adde, Lars
2015-10-01
A computer-based video analysis has recently been presented for quantitative assessment of general movements (GMs). This method's test-retest reliability, however, has not yet been evaluated. The aim of the current study was to evaluate the test-retest reliability of computer-based video analysis of GMs, and to explore the association between computer-based video analysis and the temporal organization of fidgety movements (FMs). Test-retest reliability study. 75 healthy, term-born infants were recorded twice the same day during the FMs period using a standardized video set-up. The computer-based movement variables "quantity of motion mean" (Qmean), "quantity of motion standard deviation" (QSD) and "centroid of motion standard deviation" (CSD) were analyzed, reflecting the amount of motion and the variability of the spatial center of motion of the infant, respectively. In addition, the association between the variable CSD and the temporal organization of FMs was explored. Intraclass correlation coefficients (ICC 1.1 and ICC 3.1) were calculated to assess test-retest reliability. The ICC values for the variables CSD, Qmean and QSD were 0.80, 0.80 and 0.86 for ICC (1.1), respectively; and 0.80, 0.86 and 0.90 for ICC (3.1), respectively. There were significantly lower CSD values in the recordings with continual FMs compared to the recordings with intermittent FMs (p<0.05). This study showed high test-retest reliability of computer-based video analysis of GMs, and a significant association between our computer-based video analysis and the temporal organization of FMs. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
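The two coefficients reported here differ in how they treat systematic differences between recordings. The sketch below assumes the conventional one-way random, single-measure form for ICC(1,1) and the two-way mixed, consistency, single-measure form for ICC(3,1), both computed from ANOVA mean squares. The small ratings table is illustrative only and is not infant movement data.

```python
def mean_squares(table):
    """Return (between-subjects MS, within-subjects MS, residual MS)."""
    n, k = len(table), len(table[0])
    grand = sum(sum(row) for row in table) / (n * k)
    row_means = [sum(row) / k for row in table]
    col_means = [sum(row[j] for row in table) / n for j in range(k)]
    ss_total = sum((x - grand) ** 2 for row in table for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    bms = ss_rows / (n - 1)
    wms = (ss_total - ss_rows) / (n * (k - 1))
    ems = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))
    return bms, wms, ems


def icc_1_1(table):
    """One-way random effects, single measurement."""
    bms, wms, _ = mean_squares(table)
    k = len(table[0])
    return (bms - wms) / (bms + (k - 1) * wms)


def icc_3_1(table):
    """Two-way mixed effects, consistency, single measurement."""
    bms, _, ems = mean_squares(table)
    k = len(table[0])
    return (bms - ems) / (bms + (k - 1) * ems)


# Illustrative table: 6 subjects (rows) measured on 4 occasions (columns).
ratings = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
print(round(icc_1_1(ratings), 2), round(icc_3_1(ratings), 2))
```

ICC(3,1) removes the occasion effect from the error term, so it tends to exceed ICC(1,1) when recordings differ systematically. The abstract's pattern (0.80 versus 0.86 for Qmean, 0.86 versus 0.90 for QSD) would be consistent with a modest systematic shift between the two recordings, though the abstract itself does not discuss this.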
Scale for positive aspects of caregiving experience: development, reliability, and factor structure.
Kate, N; Grover, S; Kulhara, P; Nehra, R
2012-06-01
OBJECTIVE. To develop an instrument (Scale for Positive Aspects of Caregiving Experience [SPACE]) that evaluates positive caregiving experience and assess its psychometric properties. METHODS. Available scales which assess some aspects of positive caregiving experience were reviewed and a 50-item questionnaire with a 5-point rating was constructed. In all, 203 primary caregivers of patients with severe mental disorders were asked to complete the questionnaire. Internal consistency, test-retest reliability, cross-language reliability, split-half reliability, and face validity were evaluated. Principal component factor analysis was run to assess the factorial validity of the scale. RESULTS. The scale developed as part of the study was found to have good internal consistency, test-retest reliability, cross-language reliability, split-half reliability, and face validity. Principal component factor analysis yielded a 4-factor structure, which also had good test-retest reliability and cross-language reliability. There was a strong correlation between the 4 factors obtained. CONCLUSION. The SPACE developed as part of this study has good psychometric properties.
Cha, Young Joo; Lee, Jae Jin; Kim, Do Hyun; You, Joshua Sung H
2017-10-23
Core stabilization plays an important role in the regulation of postural stability. To overcome shortcomings associated with pain and severe core instability during conventional core stabilization tests, we recently developed the dynamic neuromuscular stabilization-based heel sliding (DNS-HS) test. The purpose of this study was to establish the criterion validity and test-retest reliability of the novel DNS-HS test. Twenty young adults with core instability completed both the bilateral straight leg lowering test (BSLLT) and DNS-HS test for the criterion validity study and repeated the DNS-HS test for the test-retest reliability study. Criterion validity was determined by comparing hip joint angle data that were obtained from BSLLT and DNS-HS measures. The test-retest reliability was determined by comparing hip joint angle data. Criterion validity was ICC(2,3) = 0.700 (p < 0.05), suggesting a good relationship between the two core stability measures. Test-retest reliability was ICC(3,3) = 0.953 (p < 0.05), indicating excellent consistency between the repeated DNS-HS measurements. Criterion validity data demonstrated a good relationship between the gold standard BSLLT and DNS-HS core stability measures. Test-retest reliability data suggest that the DNS-HS test is a reliable test for core stability. Clinically, the DNS-HS test is useful to objectively quantify core instability and allow early detection and evaluation.
Romli, Muhammad Hibatullah; Mackenzie, Lynette; Lovarini, Meryl; Tan, Maw Pin; Clemson, Lindy
2017-06-01
Falls can be a devastating issue for older people living in the community, including those living in Malaysia. Health professionals and community members have a responsibility to ensure that older people have a safe home environment to reduce the risk of falls. Using a standardised screening tool is beneficial to intervene early with this group. The Home Falls and Accidents Screening Tool (HOME FAST) should be considered for this purpose; however, its use in Malaysia has not been studied. Therefore, the aim of this study was to evaluate the interrater and test-retest reliability of the HOME FAST with multiple professionals in the Malaysian context. A cross-sectional design was used to evaluate interrater reliability where the HOME FAST was used simultaneously in the homes of older people by 2 raters and a prospective design was used to evaluate test-retest reliability with a separate group of older people at different times in their homes. Both studies took place in an urban area of Kuala Lumpur. Professionals from 9 professional backgrounds participated as raters in this study, and a group of 51 community older people were recruited for the interrater reliability study and another group of 30 for the test-retest reliability study. The overall agreement was moderate for interrater reliability and good for test-retest reliability. The HOME FAST was consistently rated by different professionals, and no bias was found among the multiple raters. The HOME FAST can be used with confidence by a variety of professionals across different settings. The HOME FAST can become a universal tool to screen for home hazards related to falls. © 2017 John Wiley & Sons, Ltd.
USDA-ARS's Scientific Manuscript database
Mechanography during the vertical jump may enhance screening and determining mechanistic causes for functional deficits that reduce physical performance. Utility of jump mechanography for evaluation is limited by scant test-retest reliability data on force-time variables. This study examined the tes...
2014-01-01
Background Patient-reported outcome validation needs to achieve validity and reliability standards. Among reliability analysis parameters, test-retest reliability is an important psychometric property. Retested patients must be in a clinically stable condition. This is particularly problematic in palliative care (PC) settings because advanced cancer patients are prone to a faster rate of clinical deterioration. The aim of this study was to evaluate the methods by which multi-symptom and health-related qualities of life (HRQoL) based on patient-reported outcomes (PROs) have been validated in oncological PC settings with regard to test-retest reliability. Methods A systematic search of PubMed (1966 to June 2013), EMBASE (1980 to June 2013), PsychInfo (1806 to June 2013), CINAHL (1980 to June 2013), and SCIELO (1998 to June 2013), and specific PRO databases was performed. Studies were included if they described a set of validation studies for an instrument developed to measure multi-symptom or multidimensional HRQoL in advanced cancer patients under PC. The COSMIN checklist was used to rate the methodological quality of the study designs. Results We identified 89 validation studies from 746 potentially relevant articles. From those 89 articles, 31 measured test-retest reliability and were included in this review. Upon critical analysis of the overall quality of the criteria used to determine the test-retest reliability, 6 (19.4%), 17 (54.8%), and 8 (25.8%) of these articles were rated as good, fair, or poor, respectively, and no article was classified as excellent. Multi-symptom instruments were retested over a shortened interval when compared to the HRQoL instruments (median values 24 hours and 168 hours, respectively; p = 0.001).
Validation studies that included objective confirmation of clinical stability in their design yielded better results for the test-retest analysis with regard to both pain and global HRQoL scores (p < 0.05). The quality of the statistical analysis and its description were of great concern. Conclusion Test-retest reliability has been infrequently and poorly evaluated. The confirmation of clinical stability was an important factor in our analysis, and we suggest that special attention be focused on clinical stability when designing a PRO validation study that includes advanced cancer patients under PC. PMID:24447633
Reliability of two social cognition tests: The combined stories test and the social knowledge test.
Thibaudeau, Élisabeth; Cellard, Caroline; Legendre, Maxime; Villeneuve, Karèle; Achim, Amélie M
2018-04-01
Deficits in social cognition are common in psychiatric disorders. Validated social cognition measures with good psychometric properties are necessary to assess and target social cognitive deficits. Two recent social cognition tests, the Combined Stories Test (COST) and the Social Knowledge Test (SKT), respectively assess theory of mind and social knowledge. Previous studies have shown good psychometric properties for these tests, but the test-retest reliability has never been documented. The aim of this study was to evaluate the test-retest reliability and the inter-rater reliability of the COST and the SKT. The COST and the SKT were administered twice to a group of forty-two healthy adults, with a delay of approximately four weeks between the assessments. Excellent test-retest reliability was observed for the COST, and a good test-retest reliability was observed for the SKT. There was no evidence of practice effect. Furthermore, an excellent inter-rater reliability was observed for both tests. This study shows a good reliability of the COST and the SKT that adds to the good validity previously reported for these two tests. These good psychometrics properties thus support that the COST and the SKT are adequate measures for the assessment of social cognition. Copyright © 2018. Published by Elsevier B.V.
Test-Retest Reliability of Self-Reported Sexual Health Measures among US Hispanic Adolescents
ERIC Educational Resources Information Center
Jerman, Petra; Berglas, Nancy F.; Rohrbach, Louise A.; Constantine, Norman A.
2016-01-01
Objective: Although Hispanic adolescents in the USA are often the focus of sexual health interventions, their response to survey measures has rarely been assessed within evaluation studies. This study documents the test-retest reliability of a wide range of self-reported sexual health values, attitudes, knowledge and behaviours among Hispanic…
ERIC Educational Resources Information Center
Romer, Natalie; Merrell, Kenneth W.
2013-01-01
This study focused on evaluating the temporal stability of self-reported and teacher-reported perceptions of students' social and emotional skills and assets. We used a test-retest reliability procedure over repeated administrations of the child, adolescent, and teacher versions of the "Social-Emotional Assets and Resilience Scales".…
Test-retest reliability of the safe driving behavior measure for community-dwelling elderly drivers.
Song, Chiang-Soon; Lee, Joo-Hyun; Han, Sang-Woo
2016-06-01
[Purpose] The Safe Driving Behavior Measure (SDBM) is a self-report measurement tool that assesses the safe-driving behaviors of the elderly. The purpose of this study was to evaluate the test-retest reliability of the SDBM among community-dwelling elderly drivers. [Subjects and Methods] A total of sixty-one community-dwelling elderly were enrolled to investigate the reliability of the SDBM. The SDBM was assessed in two sessions that were conducted three days apart in a quiet and well-organized assessment room. The test-retest reliability of the overall scores and three domain scores of the SDBM was statistically evaluated using intraclass correlation coefficients [ICC(2,1)]. Pearson correlation coefficients were used to quantify bivariate associations among the three domains of the SDBM. [Results] The SDBM demonstrated excellent test-retest reliability for community-dwelling elderly drivers. The Cronbach alpha coefficients of the three domains of person-vehicle (0.979), person-environment (0.944), and person-vehicle-environment (0.971) of the SDBM indicate high internal consistency. [Conclusion] The results of this study suggest that the SDBM is a reliable measure for evaluating the safe driving of automobiles by community-dwelling elderly, and is adequate for detecting changes in scores in clinical settings.
Nutrition Environment Measures Survey in stores (NEMS-S): development and evaluation.
Glanz, Karen; Sallis, James F; Saelens, Brian E; Frank, Lawrence D
2007-04-01
Eating, or nutrition, environments are believed to contribute to obesity and chronic diseases. There is a need for valid, reliable measures of nutrition environments. This article reports on the development and evaluation of measures of nutrition environments in retail food stores. The Nutrition Environment Measures Study developed observational measures of the nutrition environment within retail food stores (NEMS-S) to assess availability of healthy options, price, and quality. After pretesting, measures were completed by independent raters to evaluate inter-rater reliability and across two occasions to assess test-retest reliability in grocery and convenience stores in four neighborhoods differing on income and community design in the Atlanta metropolitan area. Data were collected and analyzed in 2004 and 2005. Ten food categories (e.g., fruits) or indicator food items (e.g., ground beef) were evaluated in 85 stores. Inter-rater reliability and test-retest reliability of availability were high: inter-rater reliability kappas were 0.84 to 1.00, and test-retest reliabilities were .73 to 1.00. Inter-rater reliability for quality across fresh produce was moderate (kappas, 0.44 to 1.00). Healthier options were higher priced for hot dogs, lean ground beef, and baked chips. More healthful options were available in grocery than convenience stores and in stores in higher income neighborhoods. The NEMS-S tool was found to have a high degree of inter-rater and test-retest reliability, and to reveal significant differences across store types and neighborhoods of high and low socioeconomic status. These observational measures of nutrition environments can be applied in multilevel studies of community nutrition, and can inform new approaches to conducting and evaluating nutrition interventions.
Kim, Min-Beom; Ban, Jae Ho
2012-12-01
To evaluate the test-retest reliability and convenience of simultaneous binaural acoustic-evoked ocular vestibular evoked myogenic potentials (oVEMP). Thirteen healthy subjects with no history of ear diseases participated in this study. All subjects underwent oVEMP test with both separated monaural acoustic stimulation and simultaneous binaural acoustic stimulation. For evaluating test-retest reliability, three repetitive sessions were performed in each ear for calculating the intraclass correlation coefficient (ICC) for both monaural and binaural tests. We analyzed data from the biphasic n1-p1 complex, such as latency of peak, inter-peak amplitude, and asymmetric ratio of amplitude in both ears. Finally, we checked the total time required to complete each test for evaluating test convenience. No significant difference was observed in amplitude and asymmetric ratio in comparison between monaural and binaural oVEMP. However, latency was slightly delayed in binaural oVEMP. In test-retest reliability analysis, binaural oVEMP showed excellent ICC values ranging from 0.68 to 0.98 in latency, asymmetric ratio, and inter-peak amplitude. Additionally, the test time was shorter in binaural than monaural oVEMP. oVEMP elicited from binaural acoustic stimulation yields similar satisfactory results as monaural stimulation. Further, excellent test-retest reliability and shorter test time were achieved in binaural than in monaural oVEMP.
Okochi, Jiro; Utsunomiya, Sakiko; Takahashi, Tai
2005-01-01
Background The International Classification of Functioning, Disability and Health (ICF) was published by the World Health Organization (WHO) to standardize descriptions of health and disability. Little is known about the reliability and clinical relevance of measurements using the ICF and its qualifiers. This study examines the test-retest reliability of ICF codes, and the rate of immeasurability in long-term care settings of the elderly to evaluate the clinical applicability of the ICF and its qualifiers, and the ICF checklist. Methods Reliability of 85 body function (BF) items and 152 activity and participation (AP) items of the ICF was studied using a test-retest procedure with a sample of 742 elderly persons from 59 institutional and at home care service centers. Test-retest reliability was estimated using the weighted kappa statistic. The clinical relevance of the ICF was estimated by calculating immeasurability rate. The effect of the measurement settings and evaluators' experience was analyzed by stratification of these variables. The properties of each item were evaluated using both the kappa statistic and immeasurability rate to assess the clinical applicability of WHO's ICF checklist in the elderly care setting. Results The median of the weighted kappa statistics of 85 BF and 152 AP items were 0.46 and 0.55 respectively. The reproducibility statistics improved when the measurements were performed by experienced evaluators. Some chapters such as genitourinary and reproductive functions in the BF domain and major life area in the AP domain contained more items with lower test-retest reliability measures and rated as immeasurable than in the other chapters. Some items in the ICF checklist were rated as unreliable and immeasurable. Conclusion The reliability of the ICF codes when measured with the current ICF qualifiers is relatively low. 
The increase in reliability with evaluators' experience suggests that proper education would help raise reliability. The ICF checklist contains some items that are difficult to apply in geriatric care settings. Improvements should be achieved by selecting the most relevant items for each measurement and by developing appropriate qualifiers for each code according to the interest of the users. PMID:16050960
Palmer, Clare E; Langbehn, Douglas; Tabrizi, Sarah J; Papoutsi, Marina
2017-01-01
Cognitive impairment is common amongst many neurodegenerative movement disorders such as Huntington's disease (HD) and Parkinson's disease (PD) across multiple domains. There are many tasks available to assess different aspects of this dysfunction, however, it is imperative that these show high test-retest reliability if they are to be used to track disease progression or response to treatment in patient populations. Moreover, in order to ensure effects of practice across testing sessions are not misconstrued as clinical improvement in clinical trials, tasks which are particularly vulnerable to practice effects need to be highlighted. In this study we evaluated test-retest reliability in mean performance across three testing sessions of four tasks that are commonly used to measure cognitive dysfunction associated with striatal impairment: a combined Simon Stop-Signal Task; a modified emotion recognition task; a circle tracing task; and the trail making task. Practice effects were seen between sessions 1 and 2 across all tasks for the majority of dependent variables, particularly reaction time variables; some, but not all, diminished in the third session. Good test-retest reliability across all sessions was seen for the emotion recognition, circle tracing, and trail making test. The Simon interference effect and stop-signal reaction time (SSRT) from the combined-Simon-Stop-Signal task showed moderate test-retest reliability, however, the combined SSRT interference effect showed poor test-retest reliability. Our results emphasize the need to use control groups when tracking clinical progression or use pre-baseline training on tasks susceptible to practice effects.
ERIC Educational Resources Information Center
Mowder, Barbara A.; Shamah, Renee
2011-01-01
This study evaluated the test-retest reliability of two parenting measures: the Parent Behavior Importance Questionnaire-Revised (PBIQ-R) and Parent Behavior Frequency Questionnaire-Revised (PBFQ-R). These self-report parenting behavior assessment measures may be utilized as pre- and post-parent education program measures, with parents as well as…
Wang-Hsu, Elizabeth; Smith, Susan S
2017-01-10
Falls are a common cause of injuries and hospital admissions in older adults. Balance limitation is a potentially modifiable factor contributing to falls. The Balance Evaluation Systems Test (BESTest), a clinical balance measure, categorizes balance into 6 underlying subsystems. Each of the subsystems is scored individually and summed to obtain a total score. The reliability of the BESTest and its individual subsystems has been reported in patients with various neurological disorders and cancer survivors. However, the reliability and minimal detectable change (MDC) of the BESTest with community-dwelling older adults have not been reported. The purposes of our study were to (1) determine the interrater and test-retest reliability of the BESTest total and subsystem scores; and (2) estimate the MDC of the BESTest and its individual subsystem scores with community-dwelling older adults. We used a prospective cohort methodological design. Community-dwelling older adults (N = 70; aged 70-94 years; mean = 85.0 [5.5] years) were recruited from a senior independent living community. Trained testers (N = 3) administered the BESTest. All participants were tested with the BESTest by the same tester initially and then retested 7 to 14 days later. With 32 of the participants, a second tester concurrently scored the retest for interrater reliability. Testers were blinded to each other's scores. Intraclass correlation coefficients [ICC(2,1)] were used to determine the interrater and test-retest reliability. Test-retest reliability was also analyzed using method error and the associated coefficients of variation (CVME). MDC was calculated using standard error of measurement. Interrater reliability (N = 32) of the BESTest total score was ICC(2, 1) = 0.97 (95% confidence interval [CI], 0.94-0.99). The ICCs for the individual subsystem scores ranged from 0.85 to 0.94. Test-retest reliability (N = 70) of the BESTest total score was ICC(2,1) = 0.93 (95% CI, 0.89-0.96). 
ICCs for the individual subsystem scores ranged from 0.72 to 0.89. The CVME (N = 70) of the BESTest total score was 4.1%. The CVME for the subsystem scores ranged from 5.0% to 10.7%. MDC (N = 70) for the BESTest total score at the 95% CI was 7.6%, or 8.2 points. MDC at the 95% CI for subsystem scores ranged from 11.7% to 19.0% (2.1-3.4 points). The BESTest total and individual subsystem scores demonstrated generally good to excellent interrater and test-retest reliability with community-dwelling older adults. A change of 7.6% (8.2 points) or more in the BESTest total score, and a percentage change ranging from 11.7% to 19.0% (2.1-3.4 points) in the subsystem scores, is suggested for clinicians to be 95% confident of true change when evaluating this population.
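Several of the abstracts in this listing report ICC(2,1) without showing how it is obtained. As an illustration only, the following is a minimal sketch of the standard two-way random-effects, absolute-agreement, single-measurement formula computed from ANOVA mean squares. The function name `icc_2_1` and the toy scores are assumptions made for this example; they are not taken from any of the studies, whose own software and settings are not described here.

```python
def icc_2_1(scores):
    """ICC(2,1) for a table with one row per subject and one column
    per session or rater (hypothetical helper, illustration only)."""
    n = len(scores)       # number of subjects
    k = len(scores[0])    # number of sessions or raters
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares: subjects, sessions, residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)              # between-subjects mean square
    msc = ss_cols / (k - 1)              # between-sessions mean square
    mse = ss_err / ((n - 1) * (k - 1))   # residual mean square

    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Toy data: every subject scores exactly one point higher at retest.
shifted = [[1, 2], [2, 3], [3, 4]]
print(icc_2_1(shifted))
```

Because ICC(2,1) measures absolute agreement, a uniform shift between sessions lowers the coefficient even though the rank order of subjects is perfectly preserved; a consistency-type ICC would not penalize that shift.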
Acoustic stapedial reflexes in healthy neonates: normative data and test-retest reliability.
Kei, Joseph
2012-01-01
The acoustic stapedial reflex (ASR) test provides useful information about the function of the auditory system. While it is frequently used with adults and children in a clinical setting, its use with young infants is limited. Presently, there are few data for neonates and inadequate research into the test-retest reliability of the ASR test. This study aimed to establish normative data and evaluate the test-retest reliability of the ASR test in healthy neonates. A cross-sectional experimental design was used to establish ASR normative data and assess the test-retest reliability of ASR thresholds obtained from healthy neonates. Participants were sixty-eight full-term neonates with a mean chronological age of 2.5 days (SD = 1.8 days) who passed the automated auditory brainstem response, transient evoked otoacoustic emission (TEOAE), and high frequency (1 kHz) tympanometry (HFT) tests. One randomly selected ear from each neonate was tested using TEOAE, HFT, and ASR tests using a 1 kHz probe tone. ASR thresholds were elicited by presenting pure tones of 0.5, 2, and 4 kHz and broadband noise (BBN) separately to the test ear in an ipsilateral stimulation mode. The ASR procedure was repeated to acquire retest data within the same testing session. Descriptive statistics, χ2, and analysis of variance with repeated measures tests were used to analyze ASR data. All neonates exhibited ASR when stimulated by tonal stimuli or BBN. The mean ASRTs (acoustic stapedial reflex thresholds) for the 0.5, 2, and 4 kHz tones were 81.6 ± 7.9, 71.3 ± 7.9, and 65.4 ± 8.7 dB HL, respectively. The mean ASRT for the BBN was estimated to be smaller than 57.2 dB HL, given the limitation of the equipment. The 95th percentiles of the ASRT were 95, 85, 80, and 75 dB HL for the 0.5, 2, and 4 kHz and BBN, respectively. The test-retest reliability of the ASR test for all stimuli was high, with no significant difference in mean ASRTs across the test and retest conditions. 
Test-retest differences were within 10 dB for more than 91% of ASRT data across all stimuli. There was a slight trend of ASRTs being more repeatable in the medium ASRT range than in the higher or lower range. This study demonstrated that ASRTs obtained from healthy neonates were highly repeatable across test and retest sessions. Given the availability of normative data and the high test-retest reliability, the ASR test will be useful as a diagnostic tool in a battery of tests to evaluate the auditory function of neonates. American Academy of Audiology.
Park, Myung Sook; Kang, Kyung Ja; Jang, Sun Joo; Lee, Joo Yun; Chang, Sun Ju
2018-03-01
This study aimed to evaluate the components of test-retest reliability including time interval, sample size, and statistical methods used in patient-reported outcome measures in older people and to provide suggestions on the methodology for calculating test-retest reliability for patient-reported outcomes in older people. This was a systematic literature review. MEDLINE, Embase, CINAHL, and PsycINFO were searched from January 1, 2000 to August 10, 2017 by an information specialist. This systematic review was guided by both the Preferred Reporting Items for Systematic Reviews and Meta-Analyses checklist and the guideline for systematic review published by the National Evidence-based Healthcare Collaborating Agency in Korea. The methodological quality was assessed by the Consensus-based Standards for the selection of health Measurement Instruments checklist box B. Ninety-five out of 12,641 studies were selected for the analysis. The median time interval for test-retest reliability was 14 days, and the ratio of sample size for test-retest reliability to the number of items in each measure ranged from 1:1 to 1:4. The most frequently used statistical method for continuous scores was the intraclass correlation coefficient (ICC). Among the 63 studies that used ICCs, 21 studies presented models for ICC calculations and 30 studies reported 95% confidence intervals of the ICCs. Additional analyses using 17 studies that reported a strong ICC (>0.09) showed that the mean time interval was 12.88 days and the mean ratio of the number of items to sample size was 1:5.37. When researchers plan to assess the test-retest reliability of patient-reported outcome measures for older people, they need to consider an adequate time interval of approximately 13 days and a sample size of about 5 times the number of items. 
Particularly, statistical methods should not only be selected based on the types of scores of the patient-reported outcome measures, but should also be described clearly in the studies that report the results of test-retest reliability. Copyright © 2017 Elsevier Ltd. All rights reserved.
Fitzgerald, John S; Johnson, LuAnn; Tomkinson, Grant; Stein, Jesse; Roemmich, James N
2018-05-01
Mechanography during the vertical jump may enhance screening and determining mechanistic causes underlying physical performance changes. Utility of jump mechanography for evaluation is limited by scant test-retest reliability data on force-time variables. This study examined the test-retest reliability of eight jump execution variables assessed from mechanography. Thirty-two women (mean±SD: age 20.8 ± 1.3 yr) and 16 men (age 22.1 ± 1.9 yr) attended a familiarization session and two testing sessions, all one week apart. Participants performed two variations of the squat jump with squat depth self-selected and controlled using a goniometer to 80° knee flexion. Test-retest reliability was quantified as the systematic error (using effect size between jumps), random error (using coefficients of variation), and test-retest correlations (using intra-class correlation coefficients). Overall, jump execution variables demonstrated acceptable reliability, evidenced by small systematic errors (mean±95%CI: 0.2 ± 0.07), moderate random errors (mean±95%CI: 17.8 ± 3.7%), and very strong test-retest correlations (range: 0.73-0.97). Differences in random errors between controlled and self-selected protocols were negligible (mean±95%CI: 1.3 ± 2.3%). Jump execution variables demonstrated acceptable reliability, with no meaningful differences between the controlled and self-selected jump protocols. To simplify testing, a self-selected jump protocol can be used to assess force-time variables with negligible impact on measurement error.
Evaluating the reliability of an injury prevention screening tool: Test-retest study.
Gittelman, Michael A; Kincaid, Madeline; Denny, Sarah; Wervey Arnold, Melissa; FitzGerald, Michael; Carle, Adam C; Mara, Constance A
2016-10-01
A standardized injury prevention (IP) screening tool can identify family risks and allow pediatricians to address behaviors. To assess behavior changes on later screens, the tool must be reliable for an individual and ideally between household members. Little research has examined the reliability of safety screening tool questions. This study assessed the test-retest reliability of parent responses on an existing IP questionnaire and also compared responses between household parents. Investigators recruited parents of children 0 to 1 year of age during admission to a tertiary care children's hospital. When both parents were present, one was chosen as the "primary" respondent. Primary respondents completed the 30-question IP screening tool after consent, and they were re-screened approximately 4 hours later to test individual reliability. The "second" parent, when present, only completed the tool once. All participants received a 10-dollar gift card. Cohen's Kappa was used to estimate test-retest reliability and inter-rater agreement. Standard test-retest criteria consider Kappa values of 0.0 to 0.40 as poor to fair, 0.41 to 0.60 as moderate, 0.61 to 0.80 as substantial, and 0.81 to 1.00 as almost perfect reliability. One hundred five families participated, with five lost to follow-up. Thirty-two (30.5%) parent dyads completed the tool. Primary respondents were generally mothers (88%) and Caucasian (72%). Test-retest reliability of the primary respondents' answers was almost perfect on average: 0.82 (SD = 0.13, range 0.49 to 1.00). Seventeen questions had almost perfect test-retest reliability and 11 had substantial reliability. However, inter-rater agreement between household members for 12 objective questions was low, averaging 0.35 (SD = 0.34, range -0.19 to 1.00). One question had almost perfect inter-rater agreement and two had substantial inter-rater agreement. 
The IP screening tool used by a single individual had excellent test-retest reliability for nearly all questions. However, when a reporter changes from pre- to postintervention, differences may reflect poor reliability or different subjective experiences rather than true change.
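The kappa statistic and the interpretation bands quoted in the abstract above can be sketched in a few lines. This is an illustrative example under stated assumptions: `cohens_kappa` and `kappa_band` are hypothetical helper names, the yes/no answers are invented toy data, and the band cut-offs are simply those listed in the abstract. It is not the authors' analysis code.

```python
from collections import Counter


def cohens_kappa(first, second):
    """Unweighted Cohen's kappa for two equal-length lists of
    categorical answers (e.g. test vs. retest, or parent A vs. parent B).
    Undefined (division by zero) if chance agreement equals 1."""
    n = len(first)
    observed = sum(a == b for a, b in zip(first, second)) / n
    counts_1, counts_2 = Counter(first), Counter(second)
    categories = set(counts_1) | set(counts_2)
    # Chance agreement from the marginal proportions of each category.
    expected = sum(counts_1[c] * counts_2[c] for c in categories) / (n * n)
    return (observed - expected) / (1.0 - expected)


def kappa_band(kappa):
    """Map a kappa value onto the bands quoted in the abstract."""
    if kappa <= 0.40:
        return "poor to fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Invented answers to one yes/no screening question from four respondents.
test_answers = ["y", "y", "n", "n"]
retest_answers = ["y", "y", "n", "y"]
k = cohens_kappa(test_answers, retest_answers)
print(k, kappa_band(k))
```

Kappa corrects raw agreement for the agreement expected by chance, which is why three matching answers out of four here yield a kappa well below 0.75.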
ERIC Educational Resources Information Center
Thompson, Patricia; Beath, Tricia; Bell, Jacqueline; Jacobson, Gabrielle; Phair, Tegan; Salbach, Nancy M.; Wright, F. Virginia
2008-01-01
Short-term test-retest reliability of the 10-metre fast walk test (10mFWT) and 6-minute walk test (6MWT) was evaluated in 31 ambulatory children with cerebral palsy (CP), with subgroup analyses in Gross Motor Function Classification System (GMFCS) Levels I (n=9), II (n=8), and III (n=14). Sixteen females and 15 males participated, mean age 9 years…
Muir-Hunter, Susan W; Graham, Laura; Montero Odasso, Manuel
2015-08-01
To measure test-retest and interrater reliability of the Berg Balance Scale (BBS) in community-dwelling adults with mild to moderate Alzheimer disease (AD). Method: A sample of 15 adults (mean age 80.20 [SD 5.03] years) with AD performed three balance tests: the BBS, timed up-and-go test (TUG), and Functional Reach Test (FRT). Both relative reliability, using the intra-class correlation coefficient (ICC), and absolute reliability, using standard error of measurement (SEM) and minimal detectable change (MDC95) values, were calculated; Bland-Altman plots were constructed to evaluate inter-tester agreement. The test-retest interval was 1 week. Results: For the BBS, relative reliability values were 0.95 (95% CI, 0.85-0.98) for test-retest reliability and 0.72 (95% CI, 0.31-0.91) for interrater reliability; SEM was 6.01 points and MDC95 was 16.66 points; and interrater agreement was 16.62 points. The BBS performed better in test-retest reliability than the TUG and FRT, tests with established reliability in AD. Between 33% and 50% of participants required cueing beyond standardized instructions because they were unable to remember test instructions. Conclusions: The BBS achieved relative reliability values that support its clinical utility, but MDC95 and agreement values indicate the scale has performance limitations in AD. Further research to optimize balance assessment for people with AD is required.
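The relationship between the SEM and MDC95 values reported above can be checked with the conventional formula MDC95 = 1.96 × √2 × SEM. The sketch below assumes that convention (the abstract does not state which formula was used), and the function names are hypothetical; the optional `sem_from_icc` helper uses the common SEM = SD × √(1 − ICC) definition, which is likewise an assumption.

```python
import math


def sem_from_icc(sd, icc):
    """Standard error of measurement from a sample SD and a reliability
    coefficient, using the common SEM = SD * sqrt(1 - ICC) definition."""
    return sd * math.sqrt(1.0 - icc)


def mdc95(sem):
    """Minimal detectable change at 95% confidence. The sqrt(2) term
    reflects measurement error in both the test and the retest."""
    return 1.96 * math.sqrt(2.0) * sem


# SEM reported for the BBS in the abstract above: 6.01 points.
print(round(mdc95(6.01), 2))  # 16.66
```

Under this convention the reported SEM of 6.01 points reproduces the reported MDC95 of 16.66 points, which suggests the study followed the same formula.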
Yung, Marcus; Wells, Richard P
2017-07-01
Fatigue has been linked to deficits in production quality and productivity and, if of long duration, work-related musculoskeletal disorders. It may thus be a useful risk indicator and design and evaluation tool. However, there is limited information on the test-retest reliability, the sensitivity and the effects of diurnal fluctuation on field usable fatigue measures. This study reports on an evaluation of 11 measurement tools and their 14 parameters. Eight measures were found to have test-retest ICC values greater than 0.8. Four measures were particularly responsive during an intermittent fatiguing condition. However, two responsive measures demonstrated rhythmic behaviour, with significant time effects from 08:00 to mid-afternoon and early evening. Action tremor, muscle mechanomyography and perceived fatigue were found to be most reliable and most responsive; but additional analytical considerations might be required when interpreting daylong responses of MMG and action tremor. Practitioner Summary: This paper presents findings from test-retest and daylong reliability and responsiveness evaluations of 11 fatigue measures. This paper suggests that action tremor, muscle mechanomyography and perceived fatigue were most reliable and most responsive. However, mechanomyography and action tremor may be susceptible to diurnal changes.
Vroland-Nordstrand, Kristina; Krumlinde-Sundholm, Lena
2012-11-01
To evaluate the test-retest reliability of children's perceptions of their own competence in performing daily tasks and of their choice of goals for intervention using the Swedish version of the perceived efficacy and goal setting system (PEGS). A second aim was to evaluate agreement between children's and parents' perceptions of the child's competence and choices of intervention goals. Forty-four children with disabilities and their parents completed the Swedish version of the PEGS. Thirty-six of the children completed a retest session allocated into one of two groups: (A) for evaluation of perceived competence and (B) for evaluation of choice of goals. Cohen's kappa, weighted kappa and absolute agreement were calculated. Test-retest reliability for children's perceived competence showed good agreement for the dichotomized scale of competent/non-competent performance; however, using the four-point scale the agreement varied. The children's own goals were relatively stable over time; 78% had an absolute agreement ranging from 50% to 100%. There was poor agreement between the children's and their parents' ratings. Goals identified by the children differed from those identified by their parents, with 48% of the children having no goals identical to those chosen by their parents. These results indicate that the Swedish version of the PEGS produces reliable outcomes comparable to the original version.
Development and reliability testing of the Worksite and Energy Balance Survey.
Hoehner, Christine M; Budd, Elizabeth L; Marx, Christine M; Dodson, Elizabeth A; Brownson, Ross C
2013-01-01
Worksites represent important venues for health promotion. Development of psychometrically sound measures of worksite environments and policy supports for physical activity and healthy eating are needed for use in public health research and practice. Assess the test-retest reliability of the Worksite and Energy Balance Survey (WEBS), a self-report instrument for assessing perceptions of worksite supports for physical activity and healthy eating. The WEBS included items adapted from existing surveys or new items on the basis of a review of the literature and expert review. Cognitive interviews among 12 individuals were used to test the clarity of items and further refine the instrument. A targeted random-digit-dial telephone survey was administered on 2 occasions to assess test-retest reliability (mean days between time periods = 8; minimum = 5; maximum = 14). Five Missouri census tracts that varied by racial-ethnic composition and walkability. Respondents included 104 employed adults (67% white, 64% women, mean age = 48.6 years). Sixty-three percent were employed at worksites with less than 100 employees, approximately one-third supervised other people, and the majority worked a regular daytime shift (75%). Test-retest reliability was assessed using Spearman correlations for continuous variables, Cohen's κ statistics for nonordinal categorical variables, and 1-way random intraclass correlation coefficients for ordinal categorical variables. Test-retest coefficients ranged from 0.41 to 0.97, with 80% of items having reliability coefficients of more than 0.6. Items that assessed participation in or use of worksite programs/facilities tended to have lower reliability. Reliability of some items varied by gender, obesity status, and worksite size. Test-retest reliability and internal consistency for the 5 scales ranged from 0.84 to 0.94 and 0.63 to 0.84, respectively. 
The WEBS items and scales exhibited sound test-retest reliability and may be useful for research and surveillance. Further evaluation is needed to document the validity of the WEBS and associations with energy balance outcomes.
Test-retest reliability of infant event related potentials evoked by faces.
Munsters, N M; van Ravenswaaij, H; van den Boomen, C; Kemner, C
2017-04-05
Reliable measures are required to draw meaningful conclusions regarding developmental changes in longitudinal studies. Little is known, however, about the test-retest reliability of face-sensitive event related potentials (ERPs), a frequently used neural measure in infants. The aim of the current study is to investigate the test-retest reliability of ERPs typically evoked by faces in 9-10 month-old infants. The infants (N=31) were presented with neutral, fearful and happy faces that contained only the lower or higher spatial frequency information. They were tested twice within two weeks. The present results show that the test-retest reliability of the face-sensitive ERP components is moderate (P400 and Nc) to substantial (N290). However, there is low test-retest reliability for the effects of the specific experimental manipulations (i.e. emotion and spatial frequency) on the face-sensitive ERPs. To conclude, in infants the face-sensitive ERP components (i.e. N290, P400 and Nc) show adequate test-retest reliability, but not the effects of emotion and spatial frequency on these ERP components. We propose that further research focuses on investigating elements that might increase the test-retest reliability, as adequate test-retest reliability is necessary to draw meaningful conclusions on individual developmental trajectories of the face-sensitive ERPs in infants. Copyright © 2017 The Authors. Published by Elsevier Ltd. All rights reserved.
Simões, Luan; Teixeira-Salmela, Luci Fuscaldi; Magalhães, Lívia; Stuge, Britt; Laurentino, Glória; Wanderley, Elaine; Barros, Raphaela; Lemos, Andrea
2018-04-24
The purpose of this study was to evaluate test-retest reliability, construct validity, and internal consistency of the Brazilian version of the Pelvic Girdle Questionnaire (PGQ-Brazil). Analysis of the measurement properties was carried out in 4 steps. Step 1 was the pilot study, on which basis 4 hypotheses were formulated. These hypotheses were tested during the next step (construct validity, step 2) by completion of the questionnaire by the 2 groups (in pain [n = 105] and not in pain [n = 52]). For implementation of the PGQ-Brazil in the group with pain, we calculated the internal consistency (step 3) and, 7 days later, test-retest reliability (step 4) by re-application of the instrument in this group. First, the PGQ-Brazil was able to discriminate between these groups (construct validity). Second, test-retest reliability (intraclass correlation coefficients for Activities subscale [0.97 with 95% confidence interval of 0.95-0.98] and Symptoms subscale [0.98 with 95% confidence interval of 0.97-0.98] and κ coefficient between 0.50 and 0.89 for the items) was found to be good; the Bland-Altman test indicated satisfactory agreement. The Rasch analysis indicated good internal consistency, and the instrument's ability to divide the participants into at least 3 levels of skills was confirmed. In contrast, a ceiling effect was observed, as 24% of pregnant women exhibited skills superior to what the PGQ-Brazil could evaluate. The PGQ-Brazil had good internal consistency, test-retest reliability, and construct validity in assessment of limitations in activities and symptoms of pregnant women with pelvic girdle pain. Copyright © 2018. Published by Elsevier Inc.
Haga, Nienke; van der Heijden-Maessen, Hélène C; van Hoorn, Jessika F; Boonstra, Anne M; Hadders-Algra, Mijna
2007-12-01
To investigate the test-retest, inter-, and intraobserver reliability of the Quality of Upper Extremity Skills Test (QUEST) in young children with cerebral palsy (CP). For test-retest reliability, a test-retest design was used; for the intra- and interobserver reliability, the videotaped test was scored on 2 occasions by 1 observer and by various observers. Groups of preschool-age children in 2 general rehabilitation centers. Twenty-one children with CP (12 boys, 9 girls) aged 2 to 4.5 years (mean, 39 mo). Not applicable. Spearman correlation coefficient. The data indicated that test-retest reliability was strong (rho range, .85-.94). Intraobserver agreement (rho range, .63-.95) and agreement between various observers (rho range, .72-.90) were moderate to strong. Test-retest and inter- and intraobserver reliability of the QUEST in preschool-age children with CP is good.
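The Spearman correlation coefficient named in this record can be illustrated with a short sketch. This is only an illustration under stated assumptions: the function names (`average_ranks`, `pearson`, `spearman_rho`) and the example scores are hypothetical and do not come from the study. Ties receive average ranks, and rho is the Pearson correlation of the ranks.

```python
import math

def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for t in range(i, j + 1):
            ranks[order[t]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman_rho(test, retest):
    """Spearman rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(test), average_ranks(retest))

# Hypothetical test and retest scores for four children.
rho = spearman_rho([10, 20, 30, 40], [12, 25, 24, 50])
```

Because rho depends only on rank order, it shows whether children keep their relative positions between sessions, not whether their absolute scores agree.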
Lee, Chin-Pang; Chiu, Yu-Wen; Chu, Chun-Lin; Chen, Yu; Jiang, Kun-Hao; Chen, Jiun-Liang; Chen, Ching-Yen
2016-12-01
The aging males' symptoms (AMS) scale is an instrument used to determine the health-related quality of life in adult and elderly men. The purpose of this study was to synthesize internal consistency (Cronbach's alpha) and test-retest reliability for the AMS scale and its three subscales. Of the 123 studies reviewed, 12 provided alpha coefficients which were then used in the meta-analyses of internal consistency. Seven of the 12 included studies provided test-retest coefficients, and these were used in the meta-analyses of test-retest reliability. The AMS scale had excellent internal consistency [α = 0.89 (95% CI 0.88-0.90)]; the mean alpha estimates across the AMS subscales ranged from 0.79 to 0.82. The AMS scale also had good test-retest reliability [r = 0.85 (95% CI 0.82-0.88)]; the test-retest reliability coefficients of the AMS subscales ranged from 0.76 to 0.83. There was significant heterogeneity among the included studies. The AMS scale and the three subscales had fairly good internal consistency and test-retest reliability. Future psychometric studies of the AMS scale should report important characteristics of the participants, details of item scores, and test-retest reliability.
ERIC Educational Resources Information Center
Balogun, Joseph; Abiona, Titilayo; Lukobo-Durrell, Mainza; Adefuye, Adedeji; Amosun, Seyi; Frantz, Jose; Yakut, Yavuz
2011-01-01
Objective: This comparative study evaluated the readability and test-retest reliability of a questionnaire designed to assess the attitudes, beliefs, behaviours and sources of information about HIV/AIDS among young adults recruited from universities in the United States of America (USA), Turkey and South Africa. Design/Setting: The instrument was…
Lee, Posen; Lu, Wen-Shian; Liu, Chin-Hsuan; Lin, Hung-Yu; Hsieh, Ching-Lin
2017-12-08
The d2 Test of Attention (D2) is a commonly used measure of selective attention for patients with schizophrenia. However, its test-retest reliability and minimal detectable change (MDC) are unknown in patients with schizophrenia, limiting its utility in both clinical and research settings. The aim of the present study was to examine the test-retest reliability and MDC of the D2 in patients with schizophrenia. A rater administered the D2 to 108 patients with schizophrenia twice at a 1-month interval. Test-retest reliability was determined through the calculation of the intra-class correlation coefficient (ICC). We also carried out Bland-Altman analysis, which included a scatter plot of the differences between test and retest against their mean. Systematic biases were evaluated by use of a paired t-test. The ICCs for the D2 ranged from 0.78 to 0.94. The MDCs (MDC%) of the seven subscores were 102.3 (29.7), 19.4 (85.0), 7.2 (94.6), 21.0 (69.0), 104.0 (33.1), 105.0 (35.8), and 7.8 (47.8), which represented limited-to-acceptable random measurement error. Trends in the Bland-Altman plots of the omissions (E1), commissions (E2), and errors (E) were noted, indicating heteroscedasticity in the data. According to the results, the D2 had good test-retest reliability, especially in the scores of TN, TN-E, and CP. For further research, it would be important to find a way to improve the administration procedure so as to reduce random measurement error for the E1, E2, E, and FR subscores. © The Author(s) 2017. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
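The abstract does not state which formula was used for the MDC. The sketch below assumes the widely used definitions SEM = SD × sqrt(1 − ICC) and MDC at 95% confidence = 1.96 × sqrt(2) × SEM, with MDC% expressed relative to the mean score. The function names and the numbers in the usage lines are hypothetical and are not the study's data.

```python
import math

def sem_from_icc(sd, icc):
    """Standard error of measurement, assuming SEM = SD * sqrt(1 - ICC)."""
    return sd * math.sqrt(1.0 - icc)

def mdc95(sd, icc):
    """Minimal detectable change at 95% confidence (assumed formula)."""
    return 1.96 * math.sqrt(2.0) * sem_from_icc(sd, icc)

def mdc_percent(mdc, mean_score):
    """MDC expressed as a percentage of the mean score."""
    return 100.0 * mdc / mean_score

# Hypothetical subscore: SD of 10 points, ICC of 0.91, mean of 50 points.
mdc = mdc95(10.0, 0.91)
pct = mdc_percent(mdc, 50.0)
```

Under these definitions a subscore can have a high ICC and still a large MDC% when its mean is small relative to its spread, which may help explain why error-count subscores tend to look noisier than total scores.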
Costa, Y M; Morita-Neto, O; de Araújo-Júnior, E N S; Sampaio, F A; Conti, P C R; Bonjardim, L R
2017-03-01
Assessing the reliability of medical measurements is a crucial step towards the elaboration of an applicable clinical instrument. There are few studies that evaluate the reliability of somatosensory assessment and pain modulation of masticatory structures. This study estimated the test-retest reliability, that is over time, of the mechanical somatosensory assessment of anterior temporalis, masseter and temporomandibular joint (TMJ) and the conditioned pain modulation (CPM) using the anterior temporalis as the test site. Twenty healthy women were evaluated in two sessions (1 week apart) by the same examiner. Mechanical detection threshold (MDT), mechanical pain threshold (MPT), wind-up ratio (WUR) and pressure pain threshold (PPT) were assessed on the skin overlying the anterior temporalis, masseter and TMJ of the dominant side. CPM was tested by comparing PPT before and during the hand immersion in a hot water bath. ANOVA and intra-class correlation coefficients (ICCs) were applied to the data (α = 5%). The overall ICCs showed acceptable values for the test-retest reliability of mechanical somatosensory assessment of masticatory structures. The ICC values of 75% of all quantitative sensory measurements were considered fair to excellent (fair = 8·4%, good = 33·3% and excellent = 33·3%). However, the CPM paradigm presented poor reliability (ICC = 0·25). The mechanical somatosensory assessment of the masticatory structures, but not the proposed CPM protocol, can be considered sufficiently reliable over time to evaluate the trigeminal sensory function. © 2016 John Wiley & Sons Ltd.
Thaung, Jörgen; Olseke, Kjell; Ahl, Johan; Sjöstrand, Johan
2014-09-01
The purpose of our study was to establish a practical and quick test for assessing reading performance and to statistically analyse interchart and test-retest reliability of a new standardized Swedish reading chart system consisting of three charts constructed according to the principles available in the literature. Twenty-four subjects with healthy eyes, mean age 65 ± 10 years, were tested binocularly and the reading performance evaluated as reading acuity, critical print size and maximum reading speed. The test charts all consist of 12 short text sentences with a print size ranging from 0.9 to -0.2 logMAR in approximate steps of 0.1 logMAR. Two testing sessions, in two different groups (C1 and C2), were under strict control of luminance and lighting environment. Reading performance tests with chart T1, T2 and T3 were used for evaluation of interchart reliability and test data from a second session 1 month or more apart for the test-retest analysis. The testing of reading performance in adult observers with short sentences of continuous text was quick and practical. The agreement between the tests obtained with the three different test charts was high both within the same test session and at retest. This new Swedish variant of a standardized reading system based on short sentences and logarithmic progression of print size provides reliable measurements of reading performance and preliminary norms in an age group around 65 years. The reading test with three independent reading charts can be useful for clinical studies of reading ability before and after treatment. © 2013 Acta Ophthalmologica Scandinavica Foundation. Published by John Wiley & Sons Ltd.
Test-retest reliability of sensor-based sit-to-stand measures in young and older adults.
Regterschot, G Ruben H; Zhang, Wei; Baldus, Heribert; Stevens, Martin; Zijlstra, Wiebren
2014-01-01
This study investigated test-retest reliability of sensor-based sit-to-stand (STS) peak power and other STS measures in young and older adults. In addition, test-retest reliability of the sensor method was compared to test-retest reliability of the Timed Up and Go Test (TUGT) and Five-Times-Sit-to-Stand Test (FTSST) in older adults. Ten healthy young female adults (20-23 years) and 31 older adults (21 females; 73-94 years) participated in two assessment sessions separated by 3-8 days. Vertical peak power was assessed during three (young adults) and five (older adults) normal and fast STS trials with a hybrid motion sensor worn on the hip. Older adults also performed the FTSST and TUGT. The average sensor-based STS peak power of the normal STS trials and the average sensor-based STS peak power of the fast STS trials showed excellent test-retest reliability in young adults (intra-class correlation (ICC)≥0.90; zero in 95% confidence interval of mean difference between test and retest (95%CI of D); standard error of measurement (SEM)≤6.7% of mean peak power) and older adults (ICC≥0.91; zero in 95%CI of D; SEM≤9.9%). Test-retest reliability of sensor-based STS peak power and TUGT (ICC=0.98; zero in 95%CI of D; SEM=8.5%) was comparable in older adults, test-retest reliability of the FTSST was lower (ICC=0.73; zero outside 95%CI of D; SEM=14.4%). Sensor-based STS peak power demonstrated excellent test-retest reliability and may therefore be useful for clinical assessment of functional status and fall risk. Copyright © 2014 Elsevier B.V. All rights reserved.
Reliability and Validity of the TIMPSI for Infants With Spinal Muscular Atrophy Type I
Krosschell, Kristin J.; Maczulski, Jo Anne; Scott, Charles; King, Wendy; Hartman, Jill T.; Case, Laura E.; Viazzo-Trussell, Donata; Wood, Janine; Roman, Carolyn A.; Hecker, Eva; Meffert, Marianne; Léveillé, Maude; Kienitz, Krista; Swoboda, Kathryn J.
2014-01-01
Purpose This study examined the reliability and validity of the Test of Infant Motor Performance Screening Items (TIMPSI) in infants with type I spinal muscular atrophy (SMA). Methods After training, 12 evaluators scored 4 videos of infants with type I SMA to assess interrater reliability. Intrarater and test-retest reliability was further assessed for 9 evaluators during a SMA type I clinical trial, with 9 evaluators testing a total of 38 infants twice. Relatedness of the TIMPSI score to ability to reach and ventilatory support was also examined. Results Excellent interrater video score reliability was noted (intraclass correlation coefficient, 0.97–0.98). Intrarater reliability was excellent (intraclass correlation coefficient, 0.91–0.98) and test-retest reliability ranged from r = 0.82 to r = 0.95. The TIMPSI score was related to the ability to reach (P ≤ .05). Conclusion The TIMPSI can reliably be used to assess motor function in infants with type I SMA. In addition, the TIMPSI scores are related to the ability to reach, an important functional skill in children with type I SMA. PMID:23542189
The analysis of reliability and validity of the IT-MAIS, MAIS and MUSS.
Zhong, Yan; Xu, Tianqiu; Dong, Ruijuan; Lyu, Jing; Liu, Bo; Chen, Xueqing
2017-05-01
The aim of this study was to investigate the reliability and validity of the Infant-toddler Meaningful Auditory Integration Scale (IT-MAIS), Meaningful Auditory Integration Scale (MAIS), and Meaningful Use of Speech Scale (MUSS). IT-MAIS, MAIS and MUSS were divided into 3 sub-dimensions. 300 children with cochlear implants (CI) were included in the investigation. To assess test-retest reliability of these questionnaires, 30 children were selected randomly and evaluated twice at a two-week interval; the results indicated no significant changes between test and retest. Furthermore, a random test analysis by different evaluators was also administered to 30 users. Reliability test: Test-retest reliability of the three scales proved to be satisfactory. All domains had correlation coefficients that exceeded 0.750 (P < 0.01). The Cronbach's α values of the three scales and their three domains were greater than 0.700. Reliability between evaluators of the three scales was considered to be satisfactory. All domains had correlation coefficients that exceeded 0.750 (P < 0.01). Validity test: The evaluation of content validity by expert review showed the questionnaire had good content validity. The correlation coefficients between the overall scores of the three scales and their three domains were 0.699-0.978 (P < 0.01). There were correlations among the three sub-domains, but the strength of the correlations was relatively low. The scales showed a certain degree of construct validity. The IT-MAIS, MAIS and MUSS scales have good reliability and validity, and can be used in hearing and speech evaluation to measure outcomes for children with cochlear implants. Copyright © 2017 Elsevier B.V. All rights reserved.
An, Hyeong Su; Moon, Won-Jin; Ryu, Jae-Kyun; Park, Ju Yeon; Yun, Won Sung; Choi, Jin Woo; Jahng, Geon-Ho; Park, Jang-Yeon
2017-12-01
This prospective multi-center study aimed to evaluate the inter-vendor and test-retest reliabilities of resting-state functional magnetic resonance imaging (RS-fMRI) by assessing the temporal signal-to-noise ratio (tSNR) and functional connectivity. The study included 10 healthy subjects, and each subject was scanned using three 3T MR scanners (GE Signa HDxt, Siemens Skyra, and Philips Achieva) in two sessions. The tSNR was calculated from the time course data. Inter-vendor and test-retest reliabilities were assessed with intra-class correlation coefficients (ICCs) derived from variance component analysis. Independent component analysis was performed to identify the connectivity of the default-mode network (DMN). As a result, the tSNR for the DMN was not significantly different among the GE, Philips, and Siemens scanners (P=0.638). In terms of vendor differences, the inter-vendor reliability was good (ICC=0.774). Regarding the test-retest reliability, the GE scanner showed excellent correlation (ICC=0.961), while the Philips (ICC=0.671) and Siemens (ICC=0.726) scanners showed relatively good correlation. The DMN pattern of the subjects between the two sessions for each scanner and between the three scanners showed identical patterns of functional connectivity. The inter-vendor and test-retest reliabilities of RS-fMRI using different 3T MR scanners are good. Thus, we suggest that RS-fMRI could be used in multicenter imaging studies as a reliable imaging marker. Copyright © 2017 Elsevier Inc. All rights reserved.
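The abstract says only that the tSNR was calculated from the time course data. A common definition, assumed here, is the temporal mean of a voxel or region signal divided by its temporal standard deviation. The function name `tsnr`, the use of the sample standard deviation, and the example values are illustrative assumptions.

```python
import statistics

def tsnr(time_course):
    """Temporal SNR: temporal mean divided by the sample standard deviation."""
    return statistics.mean(time_course) / statistics.stdev(time_course)

# Hypothetical signal intensities of one voxel over four volumes.
value = tsnr([100.0, 102.0, 98.0, 100.0])
```

In practice the ratio is usually computed per voxel after preprocessing and then averaged within a region such as the DMN; that pipeline is outside this sketch.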
Saengsuwan, Jittima; Berger, Lucia; Schuster-Amft, Corina; Nef, Tobias; Hunt, Kenneth J
2016-09-06
Exercise testing devices for evaluating cardiopulmonary fitness in patients with severe disability after stroke are lacking, but we have adapted a robotics-assisted tilt table (RATT) for cardiopulmonary exercise testing (CPET). Using the RATT in a sample of patients after stroke, this study aimed to investigate test-retest reliability and repeatability of CPET and to prospectively investigate changes in cardiopulmonary outcomes over a period of four weeks. Stroke patients with all degrees of disability underwent 3 separate CPET sessions: 2 tests at baseline (TB1 and TB2) and 1 test at follow up (TF). TB1 and TB2 were at least 24 h apart. TB2 and TF were 4 weeks apart. A RATT equipped with force sensors in the thigh cuffs, a work rate estimation algorithm and a real-time visual feedback system was used to guide the patients' exercise work rate during CPET. Test-retest reliability and repeatability of CPET variables were analysed using paired t-tests, the intraclass correlation coefficient (ICC), the coefficient of variation (CoV), and Bland and Altman limits of agreement. Changes in cardiopulmonary fitness during four weeks were analysed using paired t-tests. Seventeen sub-acute and chronic stroke patients (age 62.7 ± 10.4 years [mean ± SD]; 8 females) completed the test sessions. The median time post stroke was 350 days. There were 4 severely disabled, 1 moderately disabled and 12 mildly disabled patients. For test-retest, there were no statistically significant differences between TB1 and TB2 for most CPET variables. Peak oxygen uptake, peak heart rate, peak work rate and oxygen uptake at the ventilatory anaerobic threshold (VAT) and respiratory compensation point (RCP) showed good to excellent test-retest reliability (ICC 0.65-0.94). For all CPET variables, CoV was 4.1-14.5 %. The mean difference was close to zero in most of the CPET variables. There were no significant changes in most cardiopulmonary performance parameters during the 4-week period (TB2 vs TF). 
These findings provide the first evidence of test-retest reliability and repeatability of the principal CPET variables using the novel RATT system and testing methodology, and high success rates in identification of VAT and RCP: good to excellent test-retest reliability and repeatability were found for all submaximal and maximal CPET variables. Reliability and repeatability of the main CPET parameters in stroke patients on the RATT were comparable to previous findings in stroke patients using standard exercise testing devices. The RATT has potential to be used as an alternative exercise testing device in patients who have limitations for use of standard exercise testing devices.
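The Bland and Altman limits of agreement used in this record can be sketched as the mean test-retest difference (bias) plus or minus 1.96 standard deviations of the differences. The function name and the sample values below are hypothetical; the sketch assumes paired measurements and roughly normally distributed differences.

```python
import statistics

def bland_altman(test, retest):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    diffs = [b - a for a, b in zip(test, retest)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample SD of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical peak work rates (W) from two baseline sessions.
bias, lower, upper = bland_altman([10, 12, 14, 16], [11, 12, 15, 18])
```

A bias close to zero with narrow limits corresponds to the "mean difference close to zero" reported above; the paired t-test then asks whether that bias differs significantly from zero.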
Sleeper, Mark D; Kenyon, Lisa K; Elliott, James M; Cheng, M Samuel
2016-12-01
Despite the availability of various field-tests for many competitive sports, a reliable and valid test specifically developed for use in men's gymnastics has not yet been developed. The Men's Gymnastics Functional Measurement Tool (MGFMT) was designed to assess sport-specific physical abilities in male competitive gymnasts. The purpose of this study was to develop the MGFMT by establishing a scoring system for individual test items and to initiate the process of establishing test-retest reliability and construct validity. A total of 83 competitive male gymnasts ages 7-18 underwent testing using the MGFMT. Thirty of these subjects underwent re-testing one week later in order to assess test-retest reliability. Construct validity was assessed using a simple regression analysis between total MGFMT scores and the gymnasts' USA-Gymnastics competitive level to calculate the coefficient of determination (r²). Test-retest reliability was analyzed using Model 1 Intraclass correlation coefficients (ICC). Statistical significance was set at the p<0.05 level. The relationship between total MGFMT scores and subjects' current USA-Gymnastics competitive level was found to be good (r² = 0.63). Reliability testing of the MGFMT composite test score showed excellent test-retest reliability over a one-week period (ICC = 0.97). Test-retest reliability of the individual component tests ranged from good to excellent (ICC = 0.75-0.97). The results of this study provide initial support for the construct validity and test-retest reliability of the MGFMT. Level 3.
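Assuming that "Model 1" refers to the one-way random-effects, single-measure ICC (often written ICC(1,1)), the coefficient can be sketched from a one-way analysis of variance as (MSB − MSW) / (MSB + (k − 1) × MSW), where MSB and MSW are the between-subject and within-subject mean squares and k is the number of sessions. The function name and the data are hypothetical.

```python
def icc_one_way(scores):
    """ICC(1,1) from per-subject lists, each holding k repeated scores."""
    n = len(scores)
    k = len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    # Between-subject and within-subject sums of squares.
    ss_between = k * sum((m - grand) ** 2 for m in row_means)
    ss_within = sum((x - m) ** 2
                    for row, m in zip(scores, row_means) for x in row)
    ms_between = ss_between / (n - 1)
    ms_within = ss_within / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

# Hypothetical composite scores for three gymnasts tested twice.
icc = icc_one_way([[40, 41], [55, 54], [70, 72]])
```

The one-way model treats any systematic shift between sessions as error, so it is generally more conservative than two-way consistency models when a practice effect is present.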
Tepe, Rodger; Tepe, Chabha
2015-03-01
To develop and psychometrically evaluate an information literacy (IL) self-efficacy survey and an IL knowledge test. In this test-retest reliability study, a 25-item IL self-efficacy survey and a 50-item IL knowledge test were developed and administered to a convenience sample of 53 chiropractic students. Item analyses were performed on all questions. The IL self-efficacy survey demonstrated good reliability (test-retest correlation = 0.81) and good/very good internal consistency (mean κ = .56 and Cronbach's α = .92). A total of 25 questions with the best item analysis characteristics were chosen from the 50-item IL knowledge test, resulting in a 25-item IL knowledge test that demonstrated good reliability (test-retest correlation = 0.87), very good internal consistency (mean κ = .69, KR20 = 0.85), and good item discrimination (mean point-biserial = 0.48). This study resulted in the development of three instruments: a 25-item IL self-efficacy survey, a 50-item IL knowledge test, and a 25-item IL knowledge test. The information literacy self-efficacy survey and the 25-item version of the information literacy knowledge test have shown preliminary evidence of adequate reliability and validity to justify continuing study with these instruments.
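The KR20 coefficient reported for the knowledge test can be sketched with the standard Kuder-Richardson formula 20 for dichotomously scored items: KR20 = k/(k − 1) × (1 − Σ p_i q_i / σ²), where p_i is the proportion answering item i correctly, q_i = 1 − p_i, and σ² is the variance of total scores. The use of the population variance, the function name and the response matrix are assumptions made for illustration.

```python
import statistics

def kr20(responses):
    """KR-20 for a list of per-respondent lists of 0/1 item scores."""
    n = len(responses)
    k = len(responses[0])
    # Proportion of respondents answering each item correctly.
    p = [sum(r[i] for r in responses) / n for i in range(k)]
    sum_pq = sum(pi * (1 - pi) for pi in p)
    totals = [sum(r) for r in responses]
    var_total = statistics.pvariance(totals)  # population variance (assumed)
    return (k / (k - 1)) * (1 - sum_pq / var_total)

# Hypothetical answers of four students to three items (1 = correct).
value = kr20([[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]])
```

KR-20 is the special case of Cronbach's α for binary items, which is why a knowledge test is summarized with KR20 while the Likert-type self-efficacy survey is summarized with α.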
Becker, Anne E.; Roberts, Andrea L.; Perloe, Alexandra; Bainivualiku, Asenaca; Richards, Lauren K.; Gilman, Stephen E.; Striegel-Moore, Ruth H.
2010-01-01
Objective The Global School-based Student Health Survey (GSHS) is an assessment for adolescent health risk behaviors and exposures, supported by the World Health Organization. Although already widely implemented—and intended for youth assessment across diverse ethnic and national contexts—no reliability data have yet been reported for GSHS-based assessment in any ethnicity or country-specific population. This study reports test-retest reliability for GSHS content adapted for a female adolescent ethnic Fijian study sample in Fiji. Design We adapted and translated GSHS content to assess health risk behaviors as part of a larger study investigating the impact of social transition on ethnic Fijian secondary schoolgirls in Fiji. In order to evaluate the performance of this measure for our ethnic Fijian study sample (n=523), we examined its test-retest reliability with kappa coefficients, % agreement, and prevalence estimates in a sub-sample (n=81). Reliability among strata defined by topic, age, and language was also examined. Results Average agreement between test and retest was 77%, and average Cohen's kappa was 0.47. Mean kappas for questions from core modules about alcohol use, tobacco use, and sexual behavior were substantial, and higher than those for modules relating to other risk behaviors. Conclusions Although test-retest reliability of responses within this country-specific version of GSHS content was substantial in several topical domains for this ethnic Fijian sample, only fair reliability for the module assessing dietary behaviors and other individual items suggests that population-specific psychometric evaluation is essential to interpreting language and country-specific GSHS data. PMID:20234961
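Percent agreement and Cohen's kappa for one categorical survey item can be sketched as follows. Kappa corrects the observed agreement p_o for the agreement p_e expected by chance from the marginal response frequencies: κ = (p_o − p_e) / (1 − p_e). The function names and the responses are hypothetical; the sketch does not handle the degenerate case p_e = 1, where kappa is undefined.

```python
from collections import Counter

def percent_agreement(test, retest):
    """Percentage of respondents giving the same answer on both occasions."""
    return 100.0 * sum(a == b for a, b in zip(test, retest)) / len(test)

def cohens_kappa(test, retest):
    """Cohen's kappa for two sets of categorical answers from the same people."""
    n = len(test)
    p_o = sum(a == b for a, b in zip(test, retest)) / n
    c1, c2 = Counter(test), Counter(retest)
    # Chance agreement from the product of marginal proportions.
    p_e = sum(c1[c] * c2[c] for c in c1.keys() | c2.keys()) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical yes/no answers of four students at test and retest.
test = ["y", "y", "n", "n"]
retest = ["y", "n", "n", "n"]
agreement = percent_agreement(test, retest)
kappa = cohens_kappa(test, retest)
```

The gap between the two statistics in this record (77% agreement, mean kappa 0.47) is typical when one response category is rare, because chance agreement is then high.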
Validation of hindi translation of DSM-5 level 1 cross-cutting symptom measure.
Goel, Ankit; Kataria, Dinesh
2018-04-01
The DSM-5 Level 1 Cross-Cutting Symptom Measure is a self- or informant-rated measure that assesses mental health domains which are important across psychiatric diagnoses. The absence of this self- or informant-administered instrument in Hindi, which is a major language in India, is an important limitation in using this scale. To translate the English version of the DSM-5 Level 1 Cross-Cutting Symptom Measure to Hindi and evaluate its psychometric properties. The study was conducted at a tertiary care hospital in Delhi. The DSM-5 Level 1 Cross-Cutting Symptom Measure was translated into Hindi using the World Health Organization's translation methodology. Mean and standard deviation were evaluated for continuous variables while for categorical variables frequency and percentages were calculated. The translated version was evaluated for cross-language equivalence, test-retest reliability, internal consistency, and split half reliability. Hindi version was found to have good cross-language equivalence and test-retest reliability at the level of items and domains. Twenty two of the 23 items and all the 23 items had a significant correlation (ρ < 0.001) in cross language concordance and test-retest reliability data, respectively. The Cronbach's alpha was 0.95, and the Spearman-Brown Sphericity value was 0.79 for the Hindi version. The present study shows that cross-language concordance, internal consistency, split-half reliability, and test-retest reliability of the Hindi version of the measure are excellent. Thus, the Hindi version of DSM-5 Level 1 Cross-Cutting Symptom Measure as translated in this study is a valid instrument. Copyright © 2018 Elsevier B.V. All rights reserved.
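The abstract does not say how the split-half value was obtained. Assuming it is a split-half correlation stepped up with the Spearman-Brown formula, the correction can be sketched as r_full = 2r / (1 + r), where r is the correlation between the two half-test scores. The function name and the input value are hypothetical.

```python
def spearman_brown(r_half):
    """Step up a half-test correlation to full-test length (Spearman-Brown)."""
    return 2.0 * r_half / (1.0 + r_half)

# Hypothetical correlation between odd-item and even-item half scores.
full_length = spearman_brown(0.65)
```

The correction is needed because each half contains only half the items, and shorter tests are less reliable than the full instrument.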
Gerhardsson, Lars; Gillström, Lennart; Hagberg, Mats
2014-01-01
Exposure to hand-held vibrating tools may cause the hand-arm vibration syndrome (HAVS). The aim was to study the test-retest reliability of hand and muscle strength tests, and tests for the determination of thermal and vibration perception thresholds, which are used when investigating signs of neuropathy in vibration exposed workers. In this study, 47 vibration exposed workers who had been investigated at the department of Occupational and Environmental Medicine in Gothenburg were compared with a randomized sample of 18 unexposed subjects from the general population of the city of Gothenburg. All participants passed a structured interview, answered several questionnaires and had a physical examination including hand and finger muscle strength tests, determination of vibrotactile (VPT) and thermal perception thresholds (TPT). Two weeks later, 23 workers and referents, selected in a randomized manner, were called back for the same test-procedures for the evaluation of test-retest reliability. The test-retest reliability after a two week interval expressed as limits of agreement (LOA; Bland-Altman), intra-class correlation coefficients (ICC) and Pearson correlation coefficients was excellent for tests with the Baseline hand grip, Pinch-grip and 3-Chuck grip among the exposed workers and referents (N = 23: percentage of differences within LOA 91 - 100%; ICC-values ≥0.93; Pearson r ≥0.93). The test-retest reliability was also excellent (percentage of differences within LOA 96-100 %) for the determination of vibration perception thresholds in digits 2 and 5 bilaterally as well as for temperature perception thresholds in digits 2 and 5, bilaterally (percentage of differences within LOA 91 - 96%). For ICC and Pearson r the results for vibration perception thresholds were good for digit 2, left hand and for digit 5, bilaterally (ICC ≥ 0.84; r ≥0.85), and lower (ICC = 0.59; r = 0.59) for digit 2, right hand. 
For the latter two indices the test-retest reliability for the determination of temperature thresholds was lower and showed more varying results. The strong test-retest reliability for hand and muscle strength tests as well as for the determination of VPTs makes these procedures useful for diagnostic purposes and follow-up studies in vibration exposed workers.
Sureda, Xisca; Espelt, Albert; Villalbí, Joan R; Cebrecos, Alba; Baranda, Lucía; Pearce, Jamie; Franco, Manuel
2017-10-05
To describe the development and test-retest reliability of OHCITIES, an instrument characterising alcohol urban environment in terms of availability, promotion and signs of consumption. This study involved: (1) developing the conceptual framework for alcohol urban environment by means of literature reviewing and previous alcohol environment research experience; (2) pilot testing and redesigning the instrument; (3) instrument digitalisation; (4) instrument evaluation using test-retest reliability. Data for testing the reliability of the instrument were collected in seven census sections in Madrid in 2016 by two observers. We computed per cent agreement and Cohen's kappa coefficients to estimate inter-rater and test-retest reliability for alcohol outlet environment measures. We calculated intraclass correlation coefficients and their 95% CIs to provide a measure of inter-rater reliability for signs of alcohol consumption measures. We collected information on 92 on-premise and 24 off-premise alcohol outlets identified in the studied areas about availability, accessibility and promotion of alcohol. Most per cent-agreement values for alcohol measures in on-premise and off-premise alcohol outlets were greater than 80%, and inter-rater and test-retest reliability values were generally above 0.80. Observers identified 26 streets and 3 public squares with signs of alcohol consumption. Intraclass correlation coefficient between observers for any type of signs of alcohol consumption was 0.50 (95% CI -0.09 to 0.77). Few items promoting alcohol unrelated to alcohol outlets were found on public spaces. The OHCITIES instrument is a reliable instrument to characterise alcohol urban environment. This instrument might be used to understand how alcohol environment associates with alcohol behaviours and its related health outcomes, and can help in the design and evaluation of policies to reduce the harm caused by alcohol. 
© Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2017. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
Bergamin, Marco; Gobbo, Stefano; Bullo, Valentina; Vendramin, Barbara; Duregon, Federica; Frizziero, Antonio; Di Blasio, Andrea; Cugusi, Lucia; Zaccaria, Marco; Ermolao, Andrea
2017-01-01
Lower extremity muscle mass, strength, power, and physical performance are critical determinants of independent functioning in later life. Isokinetic dynamometers are becoming very common in assessing different features of muscle strength, in both research and clinical practice; however, reliability studies are still needed to support the extended use of those devices. The purpose of this study is to assess the test-retest reliability of knee and ankle isokinetic and isometric strength testing protocols in a sample of older healthy subjects, using a new and untested isokinetic multi-joint evaluation system. Sixteen male and fourteen female older adults (mean age 65.2 ± 4.6 years) were assessed in two testing sessions. Each participant performed a randomized testing procedure that included different isometric and isokinetic tests for knee and ankle joints. All participants concluded the trial safely and no subject reported any discomfort throughout the overall assessment. Coefficients of correlation between measures were calculated, showing moderate to strong effects among all test-retest assessments, and a paired-sample t test showed only one significant difference (p<0.05), in the maximal isokinetic bilateral knee flexion torque. The multi-joint evaluation system for the assessment of knee and ankle isokinetic and isometric strength provided reliable test-retest measures in healthy older adults. Ib.
Validity and Reliability of the 8-Item Work Limitations Questionnaire.
Walker, Timothy J; Tullar, Jessica M; Diamond, Pamela M; Kohl, Harold W; Amick, Benjamin C
2017-12-01
Purpose To evaluate factorial validity, scale reliability, test-retest reliability, convergent validity, and discriminant validity of the 8-item Work Limitations Questionnaire (WLQ) among employees from a public university system. Methods A secondary analysis using de-identified data from employees who completed an annual Health Assessment between the years 2009-2015 tested research aims. Confirmatory factor analysis (CFA) (n = 10,165) tested the latent structure of the 8-item WLQ. Scale reliability was determined using a CFA-based approach while test-retest reliability was determined using the intraclass correlation coefficient. Convergent/discriminant validity was tested by evaluating relations between the 8-item WLQ with health/performance variables for convergent validity (health-related work performance, number of chronic conditions, and general health) and demographic variables for discriminant validity (gender and institution type). Results A 1-factor model with three correlated residuals demonstrated excellent model fit (CFI = 0.99, TLI = 0.99, RMSEA = 0.03, and SRMR = 0.01). The scale reliability was acceptable (0.69, 95% CI 0.68-0.70) and the test-retest reliability was very good (ICC = 0.78). Low-to-moderate associations were observed between the 8-item WLQ and the health/performance variables while weak associations were observed between the demographic variables. Conclusions The 8-item WLQ demonstrated sufficient reliability and validity among employees from a public university system. Results suggest the 8-item WLQ is a usable alternative for studies when the more comprehensive 25-item WLQ is not available.
Loureiro, Luiz de França Bahia; de Freitas, Paulo Barbosa
2016-04-01
Badminton requires open and fast actions toward the shuttlecock, but there is no specific agility test for badminton players with specific movements. To develop an agility test that simultaneously assesses perception and motor capacity and examine the test's concurrent and construct validity and its test-retest reliability. The Badcamp agility test consists of running as fast as possible to 6 targets placed on the corners and middle points of a rectangular area (5.6 × 4.2 m) from the start position located in the center of it, following visual stimuli presented in a luminous panel. The authors recruited 43 badminton players (17-32 y old) to evaluate concurrent (with shuttle-run agility test--SRAT) and construct validity and test-retest reliability. Results revealed that Badcamp presents concurrent and construct validity, as its performance is strongly related to SRAT (ρ = 0.83, P < .001), with performance of experts being better than nonexpert players (P < .01). In addition, Badcamp is reliable, as no difference (P = .07) and a high intraclass correlation (ICC = .93) were found in the performance of the players on 2 different occasions. The findings indicate that Badcamp is an effective, valid, and reliable tool to measure agility, allowing coaches and athletic trainers to evaluate players' athletic condition and training effectiveness and possibly detect talented individuals in this sport.
Tepe, Rodger; Tepe, Chabha
2015-01-01
Objective To develop and psychometrically evaluate an information literacy (IL) self-efficacy survey and an IL knowledge test. Methods In this test–retest reliability study, a 25-item IL self-efficacy survey and a 50-item IL knowledge test were developed and administered to a convenience sample of 53 chiropractic students. Item analyses were performed on all questions. Results The IL self-efficacy survey demonstrated good reliability (test–retest correlation = 0.81) and good/very good internal consistency (mean κ = .56 and Cronbach's α = .92). A total of 25 questions with the best item analysis characteristics were chosen from the 50-item IL knowledge test, resulting in a 25-item IL knowledge test that demonstrated good reliability (test–retest correlation = 0.87), very good internal consistency (mean κ = .69, KR20 = 0.85), and good item discrimination (mean point-biserial = 0.48). Conclusions This study resulted in the development of three instruments: a 25-item IL self-efficacy survey, a 50-item IL knowledge test, and a 25-item IL knowledge test. The information literacy self-efficacy survey and the 25-item version of the information literacy knowledge test have shown preliminary evidence of adequate reliability and validity to justify continuing study with these instruments. PMID:25517736
Kenyon, Lisa K.; Elliott, James M; Cheng, M. Samuel
2016-01-01
Purpose/Background Despite the availability of various field-tests for many competitive sports, a reliable and valid test specifically developed for use in men's gymnastics has not yet been developed. The Men's Gymnastics Functional Measurement Tool (MGFMT) was designed to assess sport-specific physical abilities in male competitive gymnasts. The purpose of this study was to develop the MGFMT by establishing a scoring system for individual test items and to initiate the process of establishing test-retest reliability and construct validity. Methods A total of 83 competitive male gymnasts ages 7-18 underwent testing using the MGFMT. Thirty of these subjects underwent re-testing one week later in order to assess test-retest reliability. Construct validity was assessed using a simple regression analysis between total MGFMT scores and the gymnasts’ USA-Gymnastics competitive level to calculate the coefficient of determination (r²). Test-retest reliability was analyzed using Model 1 Intraclass correlation coefficients (ICC). Statistical significance was set at the p<0.05 level. Results The relationship between total MGFMT scores and subjects’ current USA-Gymnastics competitive level was found to be good (r² = 0.63). Reliability testing of the MGFMT composite test score showed excellent test-retest reliability over a one-week period (ICC = 0.97). Test-retest reliability of the individual component tests ranged from good to excellent (ICC = 0.75-0.97). Conclusions The results of this study provide initial support for the construct validity and test-retest reliability of the MGFMT. Level of Evidence Level 3 PMID:27999723
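The abstract above names "Model 1" intraclass correlation coefficients but does not give the computation. As a minimal sketch, assuming "Model 1" means the one-way random-effects, single-measure form ICC(1,1) = (MSB - MSW) / (MSB + (k - 1) * MSW), the coefficient could be computed as follows. The function name `icc_oneway` and the example scores are hypothetical and are not taken from the study.

```python
from statistics import mean


def icc_oneway(scores):
    """ICC(1,1) from a list of per-subject score lists.

    Assumes every subject has the same number k of measurements
    (k = 2 for a simple test-retest design).
    """
    n = len(scores)       # number of subjects
    k = len(scores[0])    # measurements per subject
    grand = mean(x for row in scores for x in row)
    row_means = [mean(row) for row in scores]

    # One-way ANOVA sums of squares: between subjects and within subjects.
    ss_between = k * sum((m - grand) ** 2 for m in row_means)
    ss_within = sum(
        (x - m) ** 2 for row, m in zip(scores, row_means) for x in row
    )

    ms_between = ss_between / (n - 1)
    ms_within = ss_within / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical test and retest totals for three gymnasts.
example = [[40, 41], [55, 54], [62, 63]]
print(round(icc_oneway(example), 3))
```

Other ICC forms (two-way models, average-measure versions) use different mean squares, so a value obtained this way is comparable only to coefficients computed under the same model.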
Test-Retest Reliability of the Short-Form Survivor Unmet Needs Survey.
Taylor, Karen; Bulsara, Max; Monterosso, Leanne
2018-01-01
Reliable and valid needs assessment measures are important assessment tools in cancer survivorship care. A new 30-item short-form version of the Survivor Unmet Needs Survey (SF-SUNS) was developed and validated with cancer survivors, including hematology cancer survivors; however, test-retest reliability has not been established. The objective of this study was to assess the test-retest reliability of the SF-SUNS with a cohort of lymphoma survivors (n = 40). Test-retest reliability of the SF-SUNS was conducted at two time points: baseline (time 1) and 5 days later (time 2). Test-retest data were collected from lymphoma cancer survivors (n = 40) in a large tertiary cancer center in Western Australia. Intraclass correlation analyses compared data at time 1 (baseline) and time 2 (5 days later). Cronbach's alpha analyses were performed to assess the internal consistency at both time points. The majority (23/30, 77%) of items achieved test-retest reliability scores 0.45-0.74 (fair to good). A high degree of overall internal consistency was demonstrated (time 1 = 0.92, time 2 = 0.95), with scores 0.65-0.94 across subscales for both time points. Mixed test-retest reliability of the SF-SUNS was established. Our results indicate the SF-SUNS is responsive to the changing needs of lymphoma cancer survivors. Routine use of cancer survivorship specific needs-based assessments is required in oncology care today. Nurses are well placed to administer these assessments and provide tailored information and resources. Further assessment of test-retest reliability in hematology and other cancer cohorts is warranted.
Sun, Wei; Song, Qipeng; Yu, Bing; Zhang, Cui; Mao, Dewei
2015-01-01
This study aimed to evaluate the test-retest reliability of a new device for assessing ankle joint kinesthesia. This device could measure the passive motion threshold of four ankle joint movements, namely plantarflexion, dorsiflexion, inversion and eversion. A total of 21 healthy adults, including 13 males and 8 females, participated in the study. Each participant completed two sessions on two separate days with 1-week interval. The sessions were administered by the same experimenter in the same laboratory. At least 12 trials (three successful trials in each of the four directions) were performed in each session. The mean values in each direction were calculated and analysed. The ICC values of test-retest reliability ranged from 0.737 (dorsiflexion) to 0.935 (eversion), whereas the SEM values ranged from 0.21° (plantarflexion) to 0.52° (inversion). The Bland-Altman plots showed that the reliability of plantarflexion-dorsiflexion was better than that of inversion-eversion. The results evaluated the reliability of the new device as fair to excellent. The new device for assessing kinesthesia could be used to examine the ankle joint kinesthesia.
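The abstract above reports SEM values and Bland-Altman plots without stating the formulas used. A minimal sketch, assuming the common definitions SEM = SD * sqrt(1 - ICC) and 95% limits of agreement = mean difference ± 1.96 * SD of the differences, is shown below. The function names and the threshold values are hypothetical illustrations, not data from the study.

```python
from math import sqrt
from statistics import mean, stdev


def sem_from_icc(sd, icc):
    """Standard error of measurement from a sample SD and an ICC."""
    return sd * sqrt(1.0 - icc)


def bland_altman(test, retest):
    """Return (bias, lower limit, upper limit) of agreement.

    bias is the mean retest-minus-test difference; the limits are
    bias ± 1.96 sample standard deviations of the differences.
    """
    diffs = [b - a for a, b in zip(test, retest)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias, bias - spread, bias + spread


# Hypothetical passive motion thresholds (degrees) on two days.
day_1 = [1.2, 0.9, 1.5, 1.1, 1.3]
day_2 = [1.1, 1.0, 1.4, 1.2, 1.3]
print(bland_altman(day_1, day_2))
print(sem_from_icc(sd=0.5, icc=0.9))
```

The SEM is expressed in the units of the measurement (degrees here), which is why it complements a unitless ICC when judging whether a change between sessions exceeds measurement noise.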
Test-Retest Reliability of a Survey to Measure Transport-Related Physical Activity in Adults
ERIC Educational Resources Information Center
Badland, Hannah; Schofield, Grant
2006-01-01
The present research details test-retest reliability of a newly developed, telephone-administered TPA survey for adults. This instrument examines barriers, perceptions, and current travel behaviors to place of work/study and local convenience shops. Demonstrated test-retest reliability of the Active Friendly Environments-Transport-Related Physical…
Validity and Reliability of a New Device (WIMU®) for Measuring Hamstring Muscle Extensibility.
Muyor, José M
2017-09-01
The aims of the current study were 1) to evaluate the validity of the WIMU® system for measuring hamstring muscle extensibility in the passive straight leg raise (PSLR) test using an inclinometer for the criterion and 2) to determine the test-retest reliability of the WIMU® system to measure hamstring muscle extensibility during the PSLR test. 55 subjects were evaluated on 2 separate occasions. Data from a Unilever inclinometer and WIMU® system were collected simultaneously. Intraclass correlation coefficients (ICCs) for the validity were very high (0.983-1); a very low systematic bias (-0.21° to -0.42°), random error (0.05°-0.04°) and standard error of the estimate (0.43°-0.34°) were observed (left-right leg, respectively) between the 2 devices (inclinometer and the WIMU® system). The R² between the devices was 0.999 (p<0.001) in both the left and right legs. The test-retest reliability of the WIMU® system was excellent, with ICCs ranging from 0.972-0.995, low coefficients of variation (0.01%), and a low standard error of the estimate (0.19-0.31°). The WIMU® system showed strong concurrent validity and excellent test-retest reliability for the evaluation of hamstring muscle extensibility in the PSLR test. © Georg Thieme Verlag KG Stuttgart · New York.
Willoughby, Michael T; Kuhn, Laura J; Blair, Clancy B; Samek, Anya; List, John A
2017-10-01
This study investigates the test-retest reliability of a battery of executive function (EF) tasks with a specific interest in testing whether the method that is used to create a battery-wide score would result in differences in the apparent test-retest reliability of children's performance. A total of 188 4-year-olds completed a battery of computerized EF tasks twice across a period of approximately two weeks. Two different approaches were used to create a score that indexed children's overall performance on the battery, i.e., (1) the mean score of all completed tasks and (2) a factor score estimate which used confirmatory factor analysis (CFA). Pearson and intra-class correlations were used to investigate the test-retest reliability of individual EF tasks, as well as an overall battery score. Consistent with previous studies, the test-retest reliability of individual tasks was modest (rs ≈ .60). The test-retest reliability of the overall battery scores differed depending on the scoring approach (mean score: r = .72; factor score: r = .99). It is concluded that children's performance on individual EF tasks exhibits modest levels of test-retest reliability. This underscores the importance of administering multiple tasks and aggregating performance across these tasks in order to improve precision of measurement. However, the specific strategy that is used has a large impact on the apparent test-retest reliability of the overall score. These results replicate our earlier findings and provide additional cautionary evidence against the routine use of factor analytic approaches for representing individual performance across a battery of EF tasks.
Broderick, Joan E.; Schneider, Stefan; Junghaenel, Doerte U.; Schwartz, Joseph E.; Stone, Arthur A.
2013-01-01
Objective Evaluation of known group validity, ecological validity, and test-retest reliability of four domain instruments from the Patient Reported Outcomes Measurement System (PROMIS) in osteoarthritis (OA) patients. Methods Recruitment of an osteoarthritis sample and a comparison general population (GP) through an Internet survey panel. Pain intensity, pain interference, physical functioning, and fatigue were assessed for 4 consecutive weeks with PROMIS short forms on a daily basis and compared with same-domain Computer Adaptive Test (CAT) instruments that use a 7-day recall. Known group validity (comparison of OA and GP), ecological validity (comparison of aggregated daily measures with CATs), and test-retest reliability were evaluated. Results The recruited samples matched (age, sex, race, ethnicity) the demographic characteristics of the U.S. sample for arthritis and the 2009 Census for the GP. Compliance with repeated measurements was excellent: > 95%. Known group validity for CATs was demonstrated with large effect sizes (pain intensity: 1.42, pain interference: 1.25, and fatigue: .85). Ecological validity was also established through high correlations between aggregated daily measures and weekly CATs (≥ .86). Test-retest reliability (7-day) was very good (≥ .80). Conclusion PROMIS CAT instruments demonstrated known group and ecological validity in a comparison of osteoarthritis patients with a general population sample. Adequate test-retest reliability was also observed. These data provide encouraging initial data on the utility of these PROMIS instruments for clinical and research outcomes in osteoarthritis patients. PMID:23592494
Reliability and Validity of the Chinese (Mandarin) Tinnitus Handicap Inventory
Meng, Zhaoli; Zheng, Yun; Wang, Kai; Kong, Xiudan; Tao, Yong; Xu, Ke; Liu, Guanjian
2012-01-01
Objectives The Tinnitus Handicap Inventory (THI) is a commonly used self-reporting tinnitus questionnaire. We undertook this study to determine the reliability and validity of the Chinese-Mandarin version of the Tinnitus Handicap Inventory (THI-CM) for measuring tinnitus-related handicaps. Methods We tested the test-retest reliability, internal reliability, and construct validity of the THI-CM. Two-hundred patients seeking treatment for primary or secondary tinnitus in Southwest China were asked to complete THI-CM prior to clinical evaluation. Patients were evaluated by a clinician using standard methods, and 40 patients were asked to complete THI-CM a second time 14±3 days after the initial interview. Results The test-retest reliability of THI-CM was high (Pearson correlation, 0.98), as was the internal reliability (Cronbach's α, 0.93). Factor analysis indicated that THI-CM has a unifactorial structure. Conclusion The THI-CM version is reliable. The total score in THI-CM can be used to measure tinnitus-related handicaps in Mandarin-speaking populations. PMID:22468196
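Internal reliability in the abstract above is summarized with Cronbach's α. As a minimal sketch, assuming the standard formula α = k/(k - 1) * (1 - Σ item variances / variance of total scores), the statistic could be computed as below. The function name `cronbach_alpha` and the item responses are hypothetical and do not come from the THI-CM data.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a questionnaire.

    items is a list with one entry per item; each entry is the list
    of scores that all respondents gave to that item, in the same
    respondent order. Sample variances (n - 1 denominator) are used.
    """
    k = len(items)
    item_variance_sum = sum(variance(scores) for scores in items)
    # Total score per respondent: sum across items.
    totals = [sum(answers) for answers in zip(*items)]
    return k / (k - 1) * (1.0 - item_variance_sum / variance(totals))


# Hypothetical responses of four patients to three items (0, 2 or 4).
responses = [
    [4, 2, 0, 4],
    [4, 2, 0, 2],
    [2, 2, 0, 4],
]
print(round(cronbach_alpha(responses), 3))
```

The function raises a `ZeroDivisionError` if every respondent has the same total score, since the statistic is undefined in that case.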
Validity of trunk extensor and flexor torque measurements using isokinetic dynamometry.
Guilhem, Gaël; Giroux, Caroline; Couturier, Antoine; Maffiuletti, Nicola A
2014-12-01
This study aimed to evaluate the validity and test-retest reliability of trunk muscle strength testing performed with a latest-generation isokinetic dynamometer. Eccentric, isometric, and concentric peak torque of the trunk flexor and extensor muscles was measured in 15 healthy subjects. Muscle cross sectional area (CSA) and surface electromyographic (EMG) activity were respectively correlated to peak torque and submaximal isometric torque for erector spinae and rectus abdominis muscles. Reliability of peak torque measurements was determined during test and retest sessions. Significant correlations were consistently observed between muscle CSA and peak torque for all contraction types (r=0.74-0.85; P<0.001) and between EMG activity and submaximal isometric torque (r ⩾ 0.99; P<0.05), for both extensor and flexor muscles. Intraclass correlation coefficients were comprised between 0.87 and 0.95, and standard errors of measurement were lower than 9% for all contraction modes. The mean difference in peak torque between test and retest ranged from -3.7% to 3.7% with no significant mean directional bias. Overall, our findings establish the validity of torque measurements using the tested trunk module. Also considering the excellent test-retest reliability of peak torque measurements, we conclude that this latest-generation isokinetic dynamometer could be used with confidence to evaluate trunk muscle function for clinical or athletic purposes. Copyright © 2014 Elsevier Ltd. All rights reserved.
Test-retest reliability and practice effects of a rapid screen of mild traumatic brain injury.
De Monte, Veronica Eileen; Geffen, Gina Malke; Kwapil, Karleigh
2005-07-01
Test-retest reliabilities and practice effects of measures from the Rapid Screen of Concussion (RSC), in addition to the Digit Symbol Substitution Test (Digit Symbol), were examined. Twenty-five male participants were tested three times, with testing sessions scheduled a week apart. The test-retest reliability estimates for most measures were reasonably good, ranging from .79 to .97. An exception was the delayed word recall test, which had a reliability estimate of .66 for the first retest, and .59 for the second retest. Practice effects were evident from Times 1 to 2 on the sentence comprehension and delayed recall subtests of the RSC, Digit Symbol and a composite score. There was also a practice effect of the same magnitude found from Time 2 to Time 3 on Digit Symbol, delayed recall and the composite score. Statistics on measures for both the first and second retest intervals, with associated practice effects, are presented to enable the calculation of reliable change indices (RCI). The RCI may be used to assess any improvement in cognitive functioning after mild Traumatic Brain Injury.
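The abstract above mentions reliable change indices but does not say which formula the authors intend. A minimal sketch, assuming the widely used Jacobson-Truax form with an optional subtraction of the mean practice effect, RCI = (x2 - x1 - practice) / (SD * sqrt(2) * sqrt(1 - r)), is given below. The function name, parameter names, and example numbers are hypothetical.

```python
from math import sqrt


def reliable_change_index(score_1, score_2, sd_baseline, reliability,
                          practice_effect=0.0):
    """Practice-adjusted reliable change index.

    sd_baseline is the standard deviation of scores at first testing,
    reliability is the test-retest coefficient for that interval, and
    practice_effect is the mean gain expected from retesting alone.
    """
    sem = sd_baseline * sqrt(1.0 - reliability)   # standard error of measurement
    se_diff = sqrt(2.0) * sem                     # standard error of a difference
    return (score_2 - score_1 - practice_effect) / se_diff


# Hypothetical Digit Symbol scores one week apart.
rci = reliable_change_index(52, 61, sd_baseline=10.0, reliability=0.85,
                            practice_effect=3.0)
print(round(rci, 2))
```

Under this convention an absolute RCI above 1.96 is usually read as change beyond what measurement error and practice would explain at the 95% level. Lower reliability widens the error term, so a test with r = .59 needs a much larger raw change than one with r = .97.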
Trotti, Lynn Marie; Staab, Beth A.; Rye, David B.
2013-01-01
Study Objectives: Differentiation of narcolepsy without cataplexy from idiopathic hypersomnia relies entirely upon the multiple sleep latency test (MSLT). However, the test-retest reliability for these central nervous system hypersomnias has never been determined. Methods: Patients with narcolepsy without cataplexy, idiopathic hypersomnia, and physiologic hypersomnia who underwent two diagnostic multiple sleep latency tests were identified retrospectively. Correlations between the mean sleep latencies on the two studies were evaluated, and we probed for demographic and clinical features associated with reproducibility versus change in diagnosis. Results: Thirty-six patients (58% women, mean age 34 years) were included. Inter-test interval was 4.2 ± 3.8 years (range 2.5 months to 16.9 years). Mean sleep latencies on the first and second tests were 5.5 (± 3.7 SD) and 7.3 (± 3.9) minutes, respectively, with no significant correlation (r = 0.17, p = 0.31). A change in diagnosis occurred in 53% of patients, and was accounted for by a difference in the mean sleep latency (N = 15, 42%) or the number of sleep onset REM periods (N = 11, 31%). The only feature predictive of a diagnosis change was a history of hypnagogic or hypnopompic hallucinations. Conclusions: The multiple sleep latency test demonstrates poor test-retest reliability in a clinical population of patients with central nervous system hypersomnia evaluated in a tertiary referral center. Alternative diagnostic tools are needed. Citation: Trotti LM; Staab BA; Rye DB. Test-retest reliability of the multiple sleep latency test in narcolepsy without cataplexy and idiopathic hypersomnia. J Clin Sleep Med 2013;9(8):789-795. PMID:23946709
Glenn, Jordan M; Galey, Madeline; Edwards, Abigail; Rickert, Bradley; Washington, Tyrone A
2015-07-01
Ability to generate force from the core musculature is a critical factor for sports and general activities with insufficiencies predisposing individuals to injury. This study evaluated isometric force production as a valid and reliable method of assessing abdominal force using the abdominal test and evaluation systems tool (ABTEST). Secondary analysis estimated 1-repetition maximum on commercially available abdominal machine compared to maximum force and average power on ABTEST system. This study utilized test-retest reliability and comparative analysis for validity. Reliability was measured using test-retest design on ABTEST. Validity was measured via comparison to estimated 1-repetition maximum on a commercially available abdominal device. Participants applied isometric, abdominal force against a transducer and muscular activation was evaluated measuring normalized electromyographic activity at the rectus-abdominus, rectus-femoris, and erector-spinae. Test, re-test force production on ABTEST was significantly correlated (r=0.84; p<0.001). Mean electromyographic activity for the rectus-abdominus (72.93% and 75.66%), rectus-femoris (6.59% and 6.51%), and erector-spinae (6.82% and 5.48%) were observed for trial-1 and trial-2, respectively. Significant correlations for the estimated 1-repetition maximum were found for average power (r=0.70, p=0.002) and maximum force (r=0.72, p<0.001). Data indicate the ABTEST can accurately measure rectus-abdominus force isolated from hip-flexor involvement. Negligible activation of erector-spinae substantiates little subjective effort among participants in the lower back. Results suggest ABTEST is a valid and reliable method of evaluating abdominal force. Copyright © 2014 Sports Medicine Australia. Published by Elsevier Ltd. All rights reserved.
Reliability of temporal summation and diffuse noxious inhibitory control
Cathcart, Stuart; Winefield, Anthony H; Rolan, Paul; Lushington, Kurt
2009-01-01
BACKGROUND: The test-retest reliability of temporal summation (TS) and diffuse noxious inhibitory control (DNIC) has not been reported to date. Establishing such reliability would support the possibility of future experimental studies examining factors affecting TS and DNIC. Similarly, the use of manual algometry to induce TS, or an occlusion cuff to induce DNIC of TS to mechanical stimuli, has not been reported to date. Such devices may offer a simpler method than current techniques for inducing TS and DNIC, affording assessment at more anatomical locations and in more varied research settings. METHOD: The present study assessed the test-retest reliability of TS and DNIC using the above techniques. Sex differences on these measures were also investigated. RESULTS: Repeated measures ANOVA indicated successful induction of TS and DNIC, with no significant differences across test-retest occasions. Sex effects were not significant for any measure or interaction. Intraclass correlations indicated high test-retest reliability for all measures; however, there was large interindividual variation between test and retest measurements. CONCLUSION: The present results indicate acceptable within-session test-retest reliability of TS and DNIC. The results support the possibility of future experimental studies examining factors affecting TS and DNIC. PMID:20011713
Bosakova, Lucia; Kolarcik, Peter; Bobakova, Daniela; Sulcova, Martina; Van Dijk, Jitse P; Reijneveld, Sijmen A; Geckova, Andrea Madarasova
2016-04-01
Participation in organized activities is related to a range of positive outcomes, but the way such participation is measured has not been scrutinized. Test-retest reliability, an important indicator of a scale's reliability, has rarely been assessed, and for "The scale of participation in organized activities" it is lacking completely. This test-retest study is based on the Health Behaviour in School-aged Children study and is consistent with its methodology. We obtained data from 353 Czech (51.9 % boys) and 227 Slovak (52.9 % boys) primary school pupils, grades five and nine, who participated in this study in 2013. We used Cohen's kappa statistic and single measures of the intraclass correlation coefficient to estimate the test-retest reliability of all selected items in the sample, stratified by gender, age and country. We mostly observed a large correlation between the test and retest in all of the examined variables (κ ranged from 0.46 to 0.68). Test-retest reliability of the sum score of individual items showed substantial agreement (ICC = 0.64). The scale of participation in organized activities has an acceptable level of agreement, indicating good reliability.
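Item-level agreement in the abstract above is reported as Cohen's kappa. A minimal sketch of the unweighted statistic, κ = (p_o - p_e) / (1 - p_e), where p_o is the observed proportion of identical answers and p_e the agreement expected by chance from the marginal frequencies, is shown below. The function name `cohens_kappa` and the yes/no answers are hypothetical, and the study may have used a weighted or stratified variant.

```python
from collections import Counter


def cohens_kappa(test, retest):
    """Unweighted Cohen's kappa for paired categorical answers."""
    n = len(test)
    # Observed agreement: share of pupils giving the same answer twice.
    observed = sum(a == b for a, b in zip(test, retest)) / n
    # Chance agreement from the marginal category counts.
    counts_1, counts_2 = Counter(test), Counter(retest)
    expected = sum(counts_1[c] * counts_2[c] for c in counts_1) / (n * n)
    return (observed - expected) / (1.0 - expected)


# Hypothetical answers to "Do you take part in a sports club?"
first = ["yes", "yes", "no", "no", "yes", "no"]
second = ["yes", "no", "no", "no", "yes", "no"]
print(round(cohens_kappa(first, second), 3))
```

Kappa is undefined when chance agreement equals 1 (all answers in a single category on both occasions), and the function raises a `ZeroDivisionError` in that case.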
Bergamin, Marco; Gobbo, Stefano; Bullo, Valentina; Vendramin, Barbara; Duregon, Federica; Frizziero, Antonio; Di Blasio, Andrea; Cugusi, Lucia; Zaccaria, Marco; Ermolao, Andrea
2017-01-01
Summary Background Lower extremity muscle mass, strength, power, and physical performance are critical determinants of independent functioning in later life. Isokinetic dynamometers are becoming very common in assessing different features of muscle strength, in both research and clinical practice; however, reliability studies are still needed to support the extended use of those devices. Objective The purpose of this study is to assess the test-retest reliability of knee and ankle isokinetic and isometric strength testing protocols in a sample of older healthy subjects, using a new and untested isokinetic multi-joint evaluation system. Methods Sixteen male and fourteen female older adults (mean age 65.2 ± 4.6 years) were assessed in two testing sessions. Each participant performed a randomized testing procedure that included different isometric and isokinetic tests for knee and ankle joints. Results All participants concluded the trial safely and no subject reported any discomfort throughout the overall assessment. Coefficients of correlation between measures were calculated, showing moderate to strong effects among all test-retest assessments, and a paired-sample t test showed only one significant difference (p<0.05), in the maximal isokinetic bilateral knee flexion torque. Conclusions The multi-joint evaluation system for the assessment of knee and ankle isokinetic and isometric strength provided reliable test-retest measures in healthy older adults. Level of evidence Ib. PMID:29264344
Test-retest reliability of the Mandarin versions of the Hypertension Self-Care Profile instrument.
Ngoh, Soh Heng Agnes; Lim, Hazel Wai Ling; Koh, Yi Ling Eileen; Tan, Ngiap Chuan
2017-11-01
Self-efficacy in essential hypertension can be measured using scales, such as the "Hypertension Self-Care Profile" (HTN-SCP) questionnaire. It assesses "Behavior", "Motivation", and "Self-efficacy" in 3 domains, respectively. This study aimed to validate the Mandarin version of HTN-SCP instrument (HTN-SCP-Mn) targeted at patients of Chinese ethnicity with hypertension. Our study recruited Chinese patients, aged 40 years and older, with essential hypertension from a public primary healthcare clinic in Singapore. The 60-item HTN-SCP-Mn questionnaire was completed online using a tablet or smartphone on enrolment. A retest was conducted 2 weeks after the initial test. Reliability was assessed by internal consistency and test-retest reliability using Cronbach alpha and intraclass correlation coefficients (ICC). Differences between the overall HTN-SCP-Mn scores of the patients and their self-reported self-management activities were also determined using independent t test. Of the 153 patients who completed the HTN-SCP-Mn during the initial test, 79 responded to the test-retest evaluation. Reliability of the 3 domains "Behavior", "Motivation", and "Self-efficacy" obtained high internal consistency (Cronbach alpha = 0.838, 0.929, and 0.927, respectively). The item total correlation ranged from 0.058 to 0.677 for Behavior, 0.374 to 0.798 for Motivation, and 0.326 to 0.767 for Self-efficacy. The ICC indicated fair to good test-retest reliability with scores of 0.643, 0.579, and 0.710 for the respective domains. The results showed face validity of the HTN-SCP-Mn instrument, indicating its potential application in Mandarin-proficient patients. Further study is needed to correlate its scores with objective demonstration of self-efficacy.
A pilot study examining density of suppression measurement in strabismus.
Piano, Marianne; Newsham, David
2015-01-01
Establish whether the Sbisa bar, Bagolini filter (BF) bar, and neutral density filter (NDF) bar, used to measure density of suppression, are equivalent and possess test-retest reliability. Determine whether density of suppression is altered when measurement equipment/testing conditions are changed. Our pilot study had 10 subjects aged ≥18 years with childhood-onset strabismus, no ocular pathologies, and no binocular vision when manifest. Density of suppression upon repeated testing, with clinic lights on/off, and using a full/reduced intensity light source, was investigated. Results were analysed for test-retest reliability, equivalence, and changes with alteration of testing conditions. Test-retest reliability issues were present for the BF bar (median 6 filter change from first to final test, p = 0.021) and NDF bar (median 5 filter change from first to final test, p = 0.002). Density of suppression was unaffected by environmental illumination or fixation light intensity variations. Density of suppression measurements were higher when measured with the NDF bar (e.g. NDF bar = 1.5, medium suppression, vs BF bar = 6.5, light suppression). Test-retest reliability issues may be present for the two filter bars currently still under manufacture. Changes in testing conditions do not significantly affect test results, provided the same filter bar is used consistently for testing. Further studies in children with strabismus having active amblyopia treatment would be of benefit. Despite extensive use of these tests in the UK, this is to our knowledge the first study evaluating filter bar equivalence/reliability.
Translation, Cultural Adaptation and Validation of the Simple Shoulder Test to Spanish
Arcuri, Francisco; Barclay, Fernando; Nacul, Ivan
2015-01-01
Background: The validation of widely used scales facilitates the comparison across international patient samples. Objective: The objective was to translate, culturally adapt and validate the Simple Shoulder Test into Argentinian Spanish. Methods: The Simple Shoulder Test was translated from English into Argentinian Spanish by two independent translators, translated back into English and evaluated for accuracy by an expert committee to correct possible discrepancies. It was then administered to 50 patients with different shoulder conditions. Psychometric properties were analyzed, including internal consistency, measured with Cronbach's alpha, and test-retest reliability at 15 days, measured with the intraclass correlation coefficient. Results: The internal consistency was an alpha of 0.808, evaluated as good. The test-retest reliability index as measured by the intraclass correlation coefficient (ICC) was 0.835, evaluated as excellent. Conclusion: The Simple Shoulder Test translation and its cultural adaptation to Argentinian Spanish demonstrated adequate internal reliability and validity, ultimately allowing for its use in the comparison with international patient samples.
Gilkison, C R; Fenton, M V; Lester, J W
1992-05-01
This study was designed to establish the reliability of a health history questionnaire used as a screening tool for incoming university students. The authors used a test-retest design, with a test interval of 6 months, on a sample of medical and nursing students. The analysis focused on overall reliability of the questionnaire and reproducibility of specific items, based on question format. Questionnaire items of specific interest were those with dichotomous yes/no response options versus open-ended format questions, those using the words frequently or recently, or those that asked multiple questions. Demographic characteristics of the subjects were considered in the evaluation of reliability. Overall reliability of the questionnaire (93.6%) was above the anticipated level of 90%, and subject sex or program of study did not show any significant differences in reproducibility of responses. Although wording of questions did not affect item reliability, dichotomous format questions demonstrated a higher degree of reliability (96.4%) than the overall reliability of the questionnaire. Recommendations for enhancing the reliability of the questionnaire are based on item analysis and information gathered from interviews with subjects.
Determining the Appropriateness of the "What If" Situations Test (WIST) with Turkish Pre-Schoolers.
Citak Tunc, Gulseren; Gorak, Gulay; Ozyazicioglu, Nurcan; Ak, Bedriye; Isil, Ozlem; Vural, Pinar
2018-04-01
Measurement instruments are needed to assess child sexual abuse prevention programs. The purpose of the study was to determine the reliability and validity of the WIST (What If Situations Test) for Turkish culture. Participants were children of the 3-6 age group attending pre-school education institutions and the sample size was identified by means of a power analysis. Seventy children were identified as the sample with 0.85 power and 0.05 type I error according to the power analysis. Language validity, content validity, internal validity coefficient (Cronbach alpha coefficient), and test-retest analyses were conducted in terms of validity and reliability in the scope of efforts for adaptation to Turkish culture. Firstly, Kendall W = 0.83 was the score for the expert opinions concerning the content validity of the language validity scale. It was found that the Cronbach alpha coefficients were between 0.68 and 0.90 for the scale sub-dimensions of appropriate and inappropriate recognition, saying, doing, telling, and reporting. The test-retest reliability of the scale was found to be r = 0.89 and the test-retest reliabilities for the sub-dimensions (appropriate recognition, inappropriate recognition, say skills, do skills, tell skills, and reporting skills) were between r = 0.48 and r = 0.92. The test-retest reliability for the Personal Safety Questionnaire (PSQ), which has items complementary to the WIST, was found to be r = 0.82. The reliability and validity analysis of the 'What If' Situations Test (WIST), used to evaluate pre-schoolers' skills regarding self-protection against sexual abuse, showed that the Test's adaptation to Turkish culture was reliable and valid.
Moore, Amy Lawson; Miller, Terissa M
2018-01-01
The purpose of the current study is to evaluate the validity and reliability of the revised Gibson Test of Cognitive Skills, a computer-based battery of tests measuring short-term memory, long-term memory, processing speed, logic and reasoning, visual processing, as well as auditory processing and word attack skills. This study included 2,737 participants aged 5-85 years. A series of studies was conducted to examine the validity and reliability using the test performance of the entire norming group and several subgroups. The evaluation of the technical properties of the test battery included content validation by subject matter experts, item analysis and coefficient alpha, test-retest reliability, split-half reliability, and analysis of concurrent validity with the Woodcock Johnson III Tests of Cognitive Abilities and Tests of Achievement. Results indicated strong sources of evidence of validity and reliability for the test, including internal consistency reliability coefficients ranging from 0.87 to 0.98, test-retest reliability coefficients ranging from 0.69 to 0.91, split-half reliability coefficients ranging from 0.87 to 0.91, and concurrent validity coefficients ranging from 0.53 to 0.93. The Gibson Test of Cognitive Skills-2 is a reliable and valid tool for assessing cognition in the general population across the lifespan.
Manzi, Luigi; Villafañe, Jorge Hugo; Indino, Cristian; Tamini, Jacopo; Berjano, Pedro; Usuelli, Federico Giuseppe
2017-11-08
The purpose of this study was to investigate the test-retest reliability of the Phi angle, used to assess the rotational alignment of the talar component, in patients undergoing total ankle replacement (TAR) for end-stage ankle osteoarthritis (OA). Retrospective observational cross-sectional study of prospectively collected data. Post-operative anteroposterior radiographs of the foot of 170 patients who underwent TAR for ankle OA were evaluated. Three physicians measured Phi on the 170 randomly sorted and anonymized radiographs on two occasions, one week apart (test and retest conditions); inter- and intra-observer agreement were evaluated. Test-retest reliability of Phi angle measurement was excellent for patients with Hintegra TAR (ICC=0.995; p<0.001) and Zimmer TAR (ICC=0.995; p<0.001) on radiographs of subjects with ankle OA. There were no significant differences in the reliability of the Phi angle measurement between patients with Hintegra vs. Zimmer implants (p>0.05). Measurement of the Phi angle on weight-bearing dorsoplantar radiographs showed excellent reliability among orthopaedic surgeons in determining the position of the talar component in the axial plane. Level II, cross sectional study. Copyright © 2017 European Foot and Ankle Society. Published by Elsevier Ltd. All rights reserved.
Improving the Validity and Reliability of a Health Promotion Survey for Physical Therapists
Stephens, Jaca L.; Lowman, John D.; Graham, Cecilia L.; Morris, David M.; Kohler, Connie L.; Waugh, Jonathan B.
2013-01-01
Purpose Physical therapists (PTs) have a unique opportunity to intervene in the area of health promotion. However, no instrument has been validated to measure PTs’ views on health promotion in physical therapy practice. The purpose of this study was to evaluate the content validity and test-retest reliability of a health promotion survey designed for PTs. Methods An expert panel of PTs assessed the content validity of “The Role of Health Promotion in Physical Therapy Survey” and provided suggestions for revision. Item content validity was assessed using the content validity ratio (CVR) as well as the modified kappa statistic. Therapists then participated in the test-retest reliability assessment of the revised health promotion survey, which was assessed using a weighted kappa statistic. Results Based on feedback from the expert panelists, significant revisions were made to the original survey. The expert panel reached at least a majority consensus agreement for all items in the revised survey and the survey-CVR improved from 0.44 to 0.66. Only one item on the revised survey had substantial test-retest agreement, with 55% of the items having moderate agreement and 43% poor agreement. Conclusions All items on the revised health promotion survey demonstrated at least fair validity, but few items had reasonable test-retest reliability. Further modifications should be made to strengthen the validity and improve the reliability of this survey. PMID:23754935
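The content validity ratio mentioned above is usually computed with Lawshe's formula, CVR = (n_e - N/2) / (N/2), where n_e is the number of panelists rating an item "essential" and N is the panel size. The sketch below assumes that formula and assumes the survey-level CVR is the mean of the item CVRs; the abstract does not confirm either detail, and the panel counts are hypothetical.

```python
def content_validity_ratio(n_essential, n_panelists):
    """Lawshe's CVR for one item: ranges from -1 (no one) to +1 (everyone)."""
    half = n_panelists / 2
    return (n_essential - half) / half


def survey_cvr(essential_counts, n_panelists):
    """Mean of item-level CVRs (assumed definition of a survey-level CVR)."""
    ratios = [content_validity_ratio(n, n_panelists) for n in essential_counts]
    return sum(ratios) / len(ratios)


# Hypothetical panel of 10 experts rating three items.
demo_counts = [9, 8, 7]
print(content_validity_ratio(9, 10))
print(survey_cvr(demo_counts, 10))
```

A CVR of 0 means exactly half the panel rated the item essential, so positive values indicate majority agreement.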
Leifker, Feea R.; Patterson, Thomas L.; Bowie, Christopher R.; Mausbach, Brent T.; Harvey, Philip D.
2010-01-01
Performance-based measures of the ability to perform social and everyday living skills are being more widely used to assess functional capacity in people with serious mental illnesses such as schizophrenia and bipolar disorder. Since they are also being used as outcome measures in pharmacological and cognitive remediation studies aimed at cognitive impairments in schizophrenia, understanding their measurement properties and potential sensitivity to change is important. In this study, the test-retest reliability, practice effects, and reliable change indices of two different performance-based functional capacity measures, the UCSD Performance-based skills assessment (UPSA) and Social skills performance assessment (SSPA) were examined over several different retest intervals in two different samples of people with schizophrenia (n’s=238 and 116) and a healthy comparison sample (n=109). These psychometric properties were compared to those of a neuropsychological assessment battery. Test-retest reliabilities of the long form of the UPSA ranged from r=.63 to r=.80 over follow-up periods up to 36 months in people with schizophrenia, while brief UPSA reliabilities ranged from r=.66 to r=.81. Test-retest reliability of the NP performance scores ranged from r=.77 to r=.79. Test-retest reliabilities of the UPSA were lower in healthy controls, while NP performance was slightly more reliable. SSPA test-retest reliability was lower. Practice effect sizes ranged from .05 to .16 for the UPSA and .07 to .19 for the NP assessment in patients, with HC having more practice effects. Reliable change intervals were consistent across NP and both FC measures, indicating equal potential for detection of change. These performance-based measures of functional capacity appear to have similar potential to be sensitive to change compared to NP performance in people with schizophrenia. PMID:20399613
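A reliable change index of the kind referred to above can be sketched as follows, assuming the common Jacobson-Truax form with an optional practice-effect correction. The abstract does not state which formula the authors used, and the scores, standard deviation, and reliability below are hypothetical values chosen only to show the arithmetic.

```python
from math import sqrt


def reliable_change_index(score_1, score_2, sd_baseline, retest_r,
                          practice_effect=0.0):
    """Change score divided by the standard error of the difference."""
    # Standard error of measurement from the baseline SD and retest reliability.
    sem = sd_baseline * sqrt(1 - retest_r)
    # Standard error of the difference between two measurements.
    se_diff = sqrt(2) * sem
    return (score_2 - score_1 - practice_effect) / se_diff


# Hypothetical case: baseline 50, retest 65, SD 10, test-retest r = 0.80.
demo_rci = reliable_change_index(50, 65, 10, 0.80)
print(f"RCI = {demo_rci:.2f}")
```

Under this convention an absolute value above 1.96 is usually read as change beyond measurement error at the 95% level, so lower test-retest reliability widens the band of change that cannot be distinguished from noise.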
One year test-retest reliability of neurocognitive baseline scores in 10- to 12-year olds.
Moser, Rosemarie Scolaro; Schatz, Philip; Grosner, Emily; Kollias, Kelly
2017-01-01
How often youth athletes 10-12 years of age should undergo neurocognitive baseline testing remains an unanswered question. We sought to examine the test-retest reliability of annual ImPACT data in a sample of middle school athletes. Participants were 30 youth athletes, ages 10-12 years (Mean = 11.6, SD = 0.6) selected from a larger database of 10-18 year old athletes, who completed two consecutive annual baseline evaluations using the online version of ImPACT. Athlete assent and parental consent were obtained for all participants. Assessments were conducted either individually or in small groups of 2 to 3 athletes, under the supervision of a neuropsychologist or post-doctoral fellow. Test-retest coefficients were as follows: Verbal Memory .71, Visual Memory .35, Visual Motor Speed .69, Reaction Time .34. Intra-class Correlation Coefficients (single/average) were as follows: Verbal Memory .70/.83, Visual Memory .35/.52, Visual Motor Speed .69/.82, Reaction Time .34/.50. Regression-based measures to correct for practice effects revealed that only a small percentage of cases fell outside 90 and 95% confidence intervals, reflecting stability across assessments. Findings indicate that Verbal Memory and Visual Motor Speed scores are generally stable in 10-12 year old athletes. Nevertheless, Visual Memory Index, Reaction Time Index, and Symptom Checklist scores appear to be less reliable over time, especially compared to published data on high school athletes, suggesting the utility of re-testing on an annual basis in this younger age group.
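The single/average ICC pairs reported above are linked by the Spearman-Brown step-up relation: for k measurement occasions, the average-measures ICC equals k times the single-measures ICC divided by 1 + (k - 1) times the single-measures ICC. The sketch below applies it with k = 2 (two annual baselines) to the single-measures values in the abstract. The function name is hypothetical, and the abstract does not state which ICC model was used.

```python
def average_measures_icc(single_icc, k=2):
    """Spearman-Brown step-up from a single-measures ICC to a k-measure average."""
    return k * single_icc / (1 + (k - 1) * single_icc)


# (single, average) pairs as reported in the abstract.
reported = {
    "Verbal Memory": (0.70, 0.83),
    "Visual Memory": (0.35, 0.52),
    "Visual Motor Speed": (0.69, 0.82),
    "Reaction Time": (0.34, 0.50),
}

for name, (single, average) in reported.items():
    stepped_up = average_measures_icc(single)
    print(f"{name}: computed {stepped_up:.3f}, reported {average:.2f}")
```

The computed values agree with the reported averages to within about 0.01, which is what one would expect if the published single-measures values were rounded to two decimals.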
A Test-Retest Analysis of the Vanderbilt Assessment for Leadership in Education in the USA
ERIC Educational Resources Information Center
Minor, Elizabeth Covay; Porter, Andrew C.; Murphy, Joseph; Goldring, Ellen; Elliott, Stephen N.
2017-01-01
The Vanderbilt Assessment for Leadership in Education (VAL-ED) is a 360-degree learning-centered behaviors principal evaluation tool that includes ratings from the principal, supervisors, and teachers. The current study assesses the test-retest reliability of the VAL-ED for a sample of seven school districts as part of multiple validity and…
Exercise-Induced Hypoalgesia After Isometric Wall Squat Exercise: A Test-Retest Reliability Study.
Vaegter, Henrik Bjarke; Lyng, Kristian Damgaard; Yttereng, Fredrik Wannebo; Christensen, Mads Holst; Sørensen, Mathias Brandhøj; Graven-Nielsen, Thomas
2018-05-19
Isometric exercises decrease pressure pain sensitivity in exercising and nonexercising muscles, a phenomenon known as exercise-induced hypoalgesia (EIH). No studies have assessed the test-retest reliability of EIH after isometric exercise. This study investigated the EIH on pressure pain thresholds (PPTs) after an isometric wall squat exercise. The relative and absolute test-retest reliability of the PPT as a test stimulus and the EIH response in exercising and nonexercising muscles were calculated. In two identical sessions, PPTs of the thigh and shoulder were assessed before and after three minutes of quiet rest and three minutes of wall squat exercise, respectively, in 35 healthy subjects. The relative test-retest reliability of PPT and EIH was determined using analysis of variance models, Pearson's r, and intraclass correlations (ICCs). The absolute test-retest reliability of EIH was determined based on PPT standard error of measurements and Cohen's kappa for agreement between sessions. Squat increased PPTs of exercising and nonexercising muscles by 16.8% ± 16.9% and 6.7% ± 12.9%, respectively (P < 0.001), with no significant differences between sessions. PPTs within and between sessions showed moderately strong correlations (r ≥ 0.74) and excellent (ICC ≥ 0.84) within-session (rest) and between-session test-retest reliability. EIH responses of exercising and nonexercising muscles showed no systematic errors between sessions; however, the relative test-retest reliability was low (ICCs = 0.03-0.43), and agreement in EIH responders and nonresponders between sessions was not significant (κ < 0.13, P > 0.43). A wall squat exercise increased PPTs compared with quiet rest; however, the relative and absolute reliability of the EIH response was poor. Future research is warranted to investigate the reliability of EIH in clinical pain populations.
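The responder/nonresponder agreement statistic above is Cohen's kappa, which compares the observed agreement between the two sessions with the agreement expected by chance from the marginal totals. The following is a minimal sketch; the 2 x 2 counts are hypothetical (they only sum to 35 to mirror the sample size) and are not the study's data.

```python
def cohens_kappa(table):
    """Cohen's kappa for a square agreement table (rows: session 1, cols: session 2)."""
    n = sum(sum(row) for row in table)
    # Proportion of subjects classified identically in both sessions.
    observed = sum(table[i][i] for i in range(len(table))) / n
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    # Agreement expected by chance given the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical counts: [responder, nonresponder] in session 1 by session 2.
demo_table = [[12, 8], [9, 6]]
demo_kappa = cohens_kappa(demo_table)
print(f"kappa = {demo_kappa:.2f}")
```

In this hypothetical table about half the subjects receive the same label twice, yet kappa is essentially zero because that much agreement is expected by chance alone; this is the sense in which a near-zero kappa indicates that responder status does not carry over between sessions.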
Negahban, Hossein; Mohtasebi, Elham; Goharpey, Shahin
2015-01-01
The aim of this methodological study was to cross-culturally translate the Shoulder Activity Scale (SAS) into Persian and determine its clinimetric properties, including reliability, validity, and responsiveness, in patients with shoulder disorders. The Persian version of the SAS was obtained after standard forward-backward translation. Three questionnaires were completed by the respondents: SAS, shoulder pain and disability index (SPADI), and Short-Form 36 Health Survey (SF-36). The patients completed the SAS again 1 week after the first visit to evaluate the test-retest reliability. Construct validity was evaluated by examining the associations between the scores on the SAS and the scores obtained from the SPADI, SF-36, and age of the patients. To assess responsiveness, data were collected in the first visit and then again after a 4-week physiotherapy intervention. Test-retest reliability and internal consistency were assessed using Intra-class Correlation Coefficient (ICC) and Cronbach's alpha, respectively. To evaluate construct validity, Spearman's rank correlation was used. The ability of the SAS to detect changes was evaluated by the receiver-operating characteristics method. No problems or language difficulties were reported during the translation process. Test-retest reliability of the SAS was excellent with an ICC of 0.98. Also, a marginal Cronbach's alpha level of 0.64 was obtained. The correlation between the SAS and the SPADI was low, proving divergent validity, whereas the correlations between the SAS and the SF-36/age were moderate, proving convergent validity. A marginally acceptable responsiveness was achieved for the Persian SAS. The study provides some evidence to support the test-retest reliability, internal consistency, construct validity, and responsiveness of the Persian version of the SAS in patients with shoulder disorders. Therefore, it seems that this instrument is a useful measure of shoulder activity level in research setting and clinical practice. The shoulder activity scale (SAS) is a reliable, valid, and responsive measure of shoulder activity level in Persian-speaking patients with different shoulder disorders. The results on clinimetric properties of the Persian SAS are comparable with its original, English version. The Persian version of the SAS can be used in "clinical" and "research" settings of patients with shoulder disorders.
Gustafsson, Margareta; Blomberg, Karin; Holmefur, Marie
2015-07-01
The Clinical Learning Environment, Supervision and Nurse Teacher (CLES + T) scale evaluates the student nurses' perception of the learning environment and supervision within the clinical placement. It has never been tested in a replication study. The aim of the present study was to evaluate the test-retest reliability of the CLES + T scale. The CLES + T scale was administered twice to a group of 42 student nurses, with a one-week interval. Test-retest reliability was determined by calculations of Intraclass Correlation Coefficients (ICCs) and weighted Kappa coefficients. Standard Error of Measurements (SEM) and Smallest Detectable Difference (SDD) determined the precision of individual scores. Bland-Altman plots were created for analyses of systematic differences between the test occasions. The results of the study showed that the stability over time was good to excellent (ICC 0.88-0.96) in the sub-dimensions "Supervisory relationship", "Pedagogical atmosphere on the ward" and "Role of the nurse teacher". Measurements of "Premises of nursing on the ward" and "Leadership style of the manager" had lower but still acceptable stability (ICC 0.70-0.75). No systematic differences occurred between the test occasions. This study supports the usefulness of the CLES + T scale as a reliable measure of the student nurses' perception of the learning environment within the clinical placement at a hospital. Copyright © 2015 Elsevier Ltd. All rights reserved.
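The Bland-Altman analysis mentioned above summarises systematic differences between two test occasions as a mean difference (bias) with 95% limits of agreement. The sketch below assumes the conventional limits of bias ± 1.96 standard deviations of the paired differences; the scores are hypothetical and are not CLES + T data.

```python
from statistics import mean, stdev


def bland_altman(test, retest):
    """Return (bias, lower limit, upper limit) for paired test-retest scores."""
    diffs = [second - first for first, second in zip(test, retest)]
    bias = mean(diffs)
    # 95% limits of agreement, assuming roughly normal differences.
    half_width = 1.96 * stdev(diffs)
    return bias, bias - half_width, bias + half_width


# Hypothetical scores for four students on two occasions.
demo_test = [10, 12, 14, 16]
demo_retest = [11, 11, 15, 17]
demo_bias, demo_lower, demo_upper = bland_altman(demo_test, demo_retest)
print(f"bias = {demo_bias:.2f}, limits = ({demo_lower:.2f}, {demo_upper:.2f})")
```

A bias close to zero with limits straddling zero corresponds to the "no systematic differences" finding; the width of the limits indicates how far an individual's retest score may plausibly drift.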
Roaldsen, Kirsti Skavberg; Måøy, Åsa Blad; Jørgensen, Vivien; Stanghelle, Johan Kvalvik
2016-05-01
Translation of the Spinal Cord Injury Falls Concern Scale (SCI-FCS), and investigation of test-retest reliability on item-level and total-score-level. Translation, adaptation and test-retest study. A specialized rehabilitation setting in Norway. Fifty-four wheelchair users with a spinal cord injury. The median age of the cohort was 49 years, and the median number of years after injury was 13. Interventions/measurements: The SCI-FCS was translated and back-translated according to guidelines. Individuals answered the SCI-FCS twice over the course of one week. We investigated item-level test-retest reliability using Svensson's rank-based statistical method for disagreement analysis of paired ordinal data. For relative reliability, we analyzed the total-score-level test-retest reliability with intraclass correlation coefficients (ICC2.1), the standard error of measurement (SEM), and the smallest detectable change (SDC) for absolute reliability/measurement-error assessment and Cronbach's alpha for internal consistency. All items showed satisfactory percentage agreement (≥69%) between test and retest. There were small but non-negligible systematic disagreements among three items; we recovered an 11-13% higher chance for a lower second score. There was no disagreement due to random variance. The test-retest agreement (ICC2.1) was excellent (0.83). The SEM was 2.6 (12%), and the SDC was 7.1 (32%). The Cronbach's alpha was high (0.88). The Norwegian SCI-FCS is highly reliable for wheelchair users with chronic spinal cord injuries.
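The three total-score statistics above are related. A minimal sketch follows, assuming the Shrout-Fleiss two-way random-effects, absolute-agreement, single-measure form for ICC(2,1), and the conventional formulas SEM = SD × sqrt(1 - ICC) and SDC = 1.96 × sqrt(2) × SEM. The abstract does not state which formulas were used, and the demo scores are hypothetical.

```python
from math import sqrt
from statistics import mean


def icc_2_1(scores):
    """ICC(2,1) from a subjects x occasions table, via two-way ANOVA mean squares."""
    n, k = len(scores), len(scores[0])
    grand = mean(v for row in scores for v in row)
    row_means = [mean(row) for row in scores]
    col_means = [mean(col) for col in zip(*scores)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between occasions
    ss_total = sum((v - grand) ** 2 for row in scores for v in row)
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


def sem_from_icc(sd, icc):
    """Standard error of measurement from a score SD and a reliability coefficient."""
    return sd * sqrt(1 - icc)


def sdc_from_sem(sem):
    """Smallest detectable change at the 95% level for a single individual."""
    return 1.96 * sqrt(2) * sem


# Hypothetical test and retest totals for five respondents.
demo_scores = [[20, 22], [31, 29], [25, 27], [40, 38], [18, 19]]
print(f"ICC(2,1) = {icc_2_1(demo_scores):.2f}")
print(f"SDC for SEM 2.6 = {sdc_from_sem(2.6):.1f}")
```

Applied to the reported SEM of 2.6, the conventional formula gives an SDC of roughly 7.2, close to the reported 7.1; the small gap is consistent with the SEM having been rounded before publication.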
Ribeiro, Fernanda; Lépine, Pierre-Alexis; Garceau-Bolduc, Corine; Coats, Valérie; Allard, Étienne; Maltais, François; Saey, Didier
2015-01-01
Background The purpose of this study was to determine and compare the test-retest reliability of quadriceps isokinetic endurance testing at two knee angular velocities in patients with chronic obstructive pulmonary disease (COPD). Methods After one familiarization session, 14 patients with moderate to severe COPD (mean age 65±4 years; forced expiratory volume in 1 second (FEV1) 55%±18% predicted) performed two quadriceps isokinetic endurance tests on two separate occasions within a 5–7-day interval. Quadriceps isokinetic endurance tests consisted of 30 maximal knee extensions at angular velocities of 90° and 180° per second, performed in random order. Test-retest reliability was assessed for peak torque, muscle endurance, work slope, work fatigue index, and changes in FEV1 for dyspnea and leg fatigue from rest to the end of the test. The intraclass correlation coefficient, minimal detectable change, and limits of agreement were calculated. Results High test-retest reliability was identified for peak torque and muscle total work at both velocities. Work fatigue index was considered reliable at 90° per second but not at 180° per second. A lower reliability was identified for dyspnea and leg fatigue scores at both angular velocities. Conclusion Despite a limited sample size, our findings support the use of a 30-maximal repetition isokinetic muscle testing procedure at angular velocities of 90° and 180° per second in patients with moderate to severe COPD. Endurance measurement (total isokinetic work) at 90° per second was highly reliable, with a minimal detectable change at the 95% confidence level of 10%. Peak torque and fatigue index could also be assessed reliably at 90° per second. Evaluation of dyspnea and leg fatigue using the modified Borg scale of perceived exertion was poorly reliable and its clinical usefulness is questionable. These results should be useful in the design and interpretation of future interventions aimed at improving muscle endurance in COPD. 
PMID:26124656
Salavati, M; Waninge, A; Rameckers, E A A; de Blécourt, A C E; Krijnen, W P; Steenbergen, B; van der Schans, C P
2015-02-01
The aims of this study were to adapt the Paediatric Evaluation of Disability Inventory, Dutch version (PEDI-NL) for children with cerebral visual impairment (CVI) and cerebral palsy (CP) and determine test-retest and inter-respondent reliability. The Delphi method was used to gain consensus among twenty-one health experts familiar with CVI. Test-retest and inter-respondent reliability were assessed for parents and caregivers of 75 children (aged 50-144 months) with CP and CVI. The percentages of identical item scores were computed, as well as the intraclass correlation coefficients (ICCs) and Cronbach's alphas of scale scores over the domains self-care, mobility, and social function. All experts agreed on the adaptation of the PEDI-NL for children with CVI. At the item level, mean percentages of identical scores for test-retest reliability ranged from 73 to 79 for the Functional Skills scale and from 73 to 81 for the Caregiver Assistance scale; for inter-respondent reliability they ranged from 21 to 76 and from 40 to 43, respectively. For all scales over all domains ICCs exceeded 0.87. For the domains self-care, mobility, and social function, the Functional Skills scale and the Caregiver Assistance scale have Cronbach's alpha above 0.88. The adapted PEDI-NL for children with CP and CVI is reliable and comparable to the original PEDI-NL. Copyright © 2014 Elsevier Ltd. All rights reserved.
Goetz, Katja; Hasse, Philipp; Szecsenyi, Joachim; Campbell, Stephen M
2016-04-01
The consideration of organisational aspects, such as shared goals and clear communication, within the health care team is important to ensure good quality care. In primary health care, the instrument Survey of Organizational Attributes for Primary Care (SOAPC) is available to measure organisational attributes of care. However, there is no instrument available for dental care. The aim of the present study was to investigate psychometric properties and test-retest reliability of the version of SOAPC adapted for dental care, namely the Survey of Organizational Attributes in Dental Care (SOADC). The SOADC consists of 21 items in the following four subscales: communication; decision making; stress/chaos; and history of change. Convergent construct validity was measured using the job satisfaction scale. A total of 287 dental-care practices were asked to participate in the validation study. Psychometric properties and test-retest reliability were observed. A total of 43 dental-care practices responded to the survey. At baseline, 178 dental-care staff completed the questionnaire, and 4 weeks later 138 did so. Internal consistency, measured by Cronbach's alpha, was 0.718 or higher in the subscales. The test-retest reliability for each subscale and the overall SOADC score demonstrated good correlations over the 4-week test-retest interval, except for 'history of change'. A strong correlation with the aggregated job-satisfaction scale showed high convergent construct validity of SOADC. The consideration of organisational aspects from the perspective of dental-care teams is important for providing good quality of care. The SOADC is a reliable instrument with good psychometric properties and is suitable for the evaluation of organisational attributes in dental-care practices. © 2015 FDI World Dental Federation.
Reliability of cognitive tests of ELSA-Brasil, the Brazilian longitudinal study of adult health
Batista, Juliana Alves; Giatti, Luana; Barreto, Sandhi Maria; Galery, Ana Roscoe Papini; Passos, Valéria Maria de Azeredo
2013-01-01
Cognitive function evaluation entails the use of neuropsychological tests, applied exclusively or in sequence. The results of these tests may be influenced by factors related to the environment, the interviewer or the interviewee. OBJECTIVES We examined the test-retest reliability of some tests of the Brazilian version from the Consortium to Establish a Registry for Alzheimer's disease. METHODS The ELSA-Brasil is a multicentre study of civil servants (35-74 years of age) from public institutions across six Brazilian States. The same tests were applied, in different order of appearance, by the same trained and certified interviewer, with an approximate 20-day interval, to 160 adults (51% men, mean age 52 years). The Intraclass Correlation Coefficient (ICC) was used to assess the reliability of the measures; and a dispersion graph was used to examine the patterns of agreement between them. RESULTS We observed higher retest scores in all tests as well as a shorter test completion time for the Trail Making Test B. ICC values for each test were as follows: Word List Learning Test (0.56), Word Recall (0.50), Word Recognition (0.35), Phonemic Verbal Fluency Test (VFT, 0.61), Semantic VFT (0.53) and Trail B (0.91). The Bland-Altman plot showed better correlation of executive function (VFT and Trail B) than of memory tests. CONCLUSIONS Better performance in retest may reflect a learning effect, and suggests that retest should be repeated using alternate forms or after longer periods. In this sample of adults with high schooling level, reliability was only moderate for memory tests whereas the measurement of executive function proved more reliable. PMID:29213860
Improving the Test-Retest Reliability of Resting State fMRI by Removing the Impact of Sleep.
Wang, Jiahui; Han, Junwei; Nguyen, Vinh T; Guo, Lei; Guo, Christine C
2017-01-01
Resting state functional magnetic resonance imaging (rs-fMRI) provides a powerful tool to examine large-scale neural networks in the human brain and their disturbances in neuropsychiatric disorders. Thanks to its low demand and high tolerance, resting state data can be easily acquired from clinical populations. However, due to its unconstrained nature, the resting state paradigm is associated with excessive head movement and proneness to sleep. Consequently, the test-retest reliability of rs-fMRI measures is moderate at best, falling short of what is needed for widespread use in the clinic. Here, we characterized the effect of sleep on the test-retest reliability of rs-fMRI. Using measures of heart rate variability (HRV) derived from simultaneous electrocardiogram (ECG) recording, we identified portions of fMRI data when subjects were more alert or sleepy, and examined their effects on the test-retest reliability of functional connectivity measures. When volumes acquired during sleep were excluded, the reliability of rs-fMRI was significantly improved, and the improvement appears to be general across brain networks. The amount of improvement is robust with the removal of as much as 60% of volumes associated with sleepiness. Therefore, test-retest reliability of rs-fMRI is affected by sleep and could be improved by excluding volumes of sleepiness as indexed by HRV. Our results suggest a novel and practical method to improve test-retest reliability of rs-fMRI measures.
Singh, Amika S; Vik, Froydis N; Chinapaw, Mai J M; Uijtdewilligen, Léonie; Verloigne, Maïté; Fernández-Alvira, Juan M; Stomfai, Sarolta; Manios, Yannis; Martens, Marloes; Brug, Johannes
2011-12-09
Background Insight in children's energy balance-related behaviours (EBRBs) and their determinants is important to inform obesity prevention research. Therefore, reliable and valid tools to measure these variables in large-scale population research are needed. Objective To examine the test-retest reliability and construct validity of the child questionnaire used in the ENERGY-project, measuring EBRBs and their potential determinants among 10-12 year old children. Methods We collected data among 10-12 year old children (n = 730 in the test-retest reliability study; n = 96 in the construct validity study) in six European countries, i.e. Belgium, Greece, Hungary, the Netherlands, Norway, and Spain. Test-retest reliability was assessed using the intra-class correlation coefficient (ICC) and percentage agreement comparing scores from two measurements, administered one week apart. To assess construct validity, the agreement between questionnaire responses and a subsequent face-to-face interview was assessed using ICC and percentage agreement. Results Of the 150 questionnaire items, 115 (77%) showed good to excellent test-retest reliability as indicated by ICCs > .60 or percentage agreement ≥ 75%. Test-retest reliability was moderate for 34 items (23%) and poor for one item. Construct validity appeared to be good to excellent for 70 (47%) of the 150 items, as indicated by ICCs > .60 or percentage agreement ≥ 75%. From the other 80 items, construct validity was moderate for 39 (26%) and poor for 41 items (27%). Conclusions Our results demonstrate that the ENERGY-child questionnaire, assessing EBRBs of the child as well as personal, family, and school-environmental determinants related to these EBRBs, has good test-retest reliability and moderate to good construct validity for the large majority of items. PMID:22152048
Hajdú, Sara Fredslund; Plaschke, Christina Caroline; Johansen, Christoffer; Dalton, Susanne Oksbjerg; Wessel, Irene
2017-08-01
The objectives were to translate and culturally adapt the M.D. Anderson Dysphagia Inventory (MDADI) into Danish and subsequently test the reliability of the Danish version. The MDADI was translated into Danish and cross-culturally adapted through cognitive interviews. The final version was test-retest evaluated in a group of head and neck cancer (HNC) patients who responded to the questionnaire twice, a mean of eight days apart. Intraclass correlation coefficient, Cronbach's alpha, floor and ceiling effects, standard error of measurement and minimal detectable change were investigated. Fourteen patients were interviewed on the comprehensibility of the Danish MDADI, and all found the questionnaire meaningful, easy to understand, non-offensive and to include relevant aspects of dysphagia related to HNC. Sixty-four patients were included in the test-retest study. In particular, one item in the emotional scale (E7) appeared to be often misinterpreted, and ceiling effects were found in all four subdomains (global, emotional, functional and physical). The four subdomains and the composite score showed acceptable test-retest reliability and internal consistency in a Danish population of HNC patients. The Danish MDADI is reliable in terms of internal consistency and test-retest reproducibility and can be used in assessing the health-related quality of life in head and neck cancer patients with dysphagia.
Huang, Lijie; Huang, Taicheng; Zhen, Zonglei; Liu, Jia
2016-03-15
We present a test-retest dataset for evaluation of long-term reliability of measures from structural and resting-state functional magnetic resonance imaging (sMRI and rfMRI) scans. The repeated scan dataset was collected from 61 healthy adults in two sessions using highly similar imaging parameters at an interval of 103-189 days. However, as the imaging parameters were not completely identical, the reliability estimated from this dataset shall reflect the lower bounds of the true reliability of sMRI/rfMRI measures. Furthermore, in conjunction with other test-retest datasets, our dataset may help explore the impact of different imaging parameters on reliability of sMRI/rfMRI measures, which is especially critical for assessing datasets collected from multiple centers. In addition, intelligence quotient (IQ) was measured for each participant using Raven's Advanced Progressive Matrices. The data can thus be used for purposes other than assessing reliability of sMRI/rfMRI alone. For example, data from each single session could be used to associate structural and functional measures of the brain with the IQ metrics to explore brain-IQ association.
Burns, Ted M.; Conaway, Mark; Sanders, Donald B.
2010-01-01
Objective: To study the concurrent and construct validity and test-retest reliability in the practice setting of an outcome measure for myasthenia gravis (MG). Methods: Eleven centers participated in the validation study of the Myasthenia Gravis Composite (MGC) scale. Patients with MG were evaluated at 2 consecutive visits. Concurrent and construct validities of the MGC were assessed by evaluating MGC scores in the context of other MG-specific outcome measures. We used numerous potential indicators of clinical improvement to assess the sensitivity and specificity of the MGC for detecting clinical improvement. Test-retest reliability was performed on patients at the University of Virginia. Results: A total of 175 patients with MG were enrolled at 11 sites from July 1, 2008, to January 31, 2009. A total of 151 patients were seen in follow-up. Total MGC scores showed excellent concurrent validity with other MG-specific scales. Analyses of sensitivities and specificities of the MGC revealed that a 3-point improvement in total MGC score was optimal for signifying clinical improvement. A 3-point improvement in the MGC also appears to represent a meaningful improvement to most patients, as indicated by improved 15-item myasthenia gravis quality of life scale (MG-QOL15) scores. The psychometric properties were no better for an individualized subscore made up of the 2 functional domains that the patient identified as most important to treat. The test-retest reliability coefficient of the MGC was 98%, with a lower 95% confidence interval of 97%, indicating excellent test-retest reliability. Conclusions: The Myasthenia Gravis Composite is a reliable and valid instrument for measuring clinical status of patients with myasthenia gravis in the practice setting and in clinical trials. PMID:20439845
Koho, P; Aho, S; Kautiainen, H; Pohjolainen, T; Hurri, H
2014-12-01
To estimate the internal consistency, test-retest reliability and comparability of paper and computer versions of the Finnish version of the Tampa Scale of Kinesiophobia (TSK-FIN) among patients with chronic pain. In addition, patients' personal experiences of completing both versions of the TSK-FIN and preferences between these two methods of data collection were studied. Test-retest reliability study. Paper and computer versions of the TSK-FIN were completed twice on two consecutive days. The sample comprised 94 consecutive patients with chronic musculoskeletal pain participating in a pain management or individual rehabilitation programme. The group rehabilitation design consisted of physical and functional exercises, evaluation of the social situation, psychological assessment of pain-related stress factors, and personal pain management training in order to regain overall function and mitigate the inconvenience of pain and fear-avoidance behaviour. The mean TSK-FIN score was 37.1 [standard deviation (SD) 8.1] for the computer version and 35.3 (SD 7.9) for the paper version. The mean difference between the two versions was 1.9 (95% confidence interval 0.8 to 2.9). Test-retest reliability was 0.89 for the paper version and 0.88 for the computer version. Internal consistency was considered to be good for both versions. The intraclass correlation coefficient for comparability was 0.77 (95% confidence interval 0.66 to 0.85), indicating substantial reliability between the two methods. Both versions of the TSK-FIN demonstrated substantial intertest reliability, good test-retest reliability, good internal consistency and acceptable limits of agreement, suggesting their suitability for clinical use. However, subjects tended to score higher when using the computer version. As such, in an ideal situation, data should be collected in a similar manner throughout the course of rehabilitation or clinical research. Copyright © 2014 Chartered Society of Physiotherapy. 
Published by Elsevier Ltd. All rights reserved.
Yun, Young-Ju; Shin, Yong-Beom; Kim, Soo-Yeon; Shin, Myung-Jun; Kim, Ra-Jin; Oh, Tae-Young
2016-07-01
[Purpose] The purpose of this study was to develop the Korean version of the PedsQL(TM) 3.0 Cerebral Palsy Module to evaluate the health-related quality of life of children with cerebral palsy and to test the reliability and validity. [Subjects and Methods] The study included 108 caregivers of children with cerebral palsy aged 2 to 4 years and 72 caregivers of children aged 5 to 7 years, who visited multiple sites between February and August 2015. The Translation Commission performed the first translation with the approval of the Mapi Research Trust Company to create a Korean-version of the PedsQL(TM). Afterwards, back-translation was performed by one translator specializing in health and medical treatment who was a native English-speaker fluent in Korean, and one native Korean-speaker fluent in English. The consistency of each question was confirmed and a translation-integrated version was created. Test components were explained to caregivers during a one-on-one interview; caregivers then completed the PedsQL(TM) questionnaire and a Pediatric Evaluation Disability Inventory (PEDI) questionnaire. Subjects contributing to test-retest measures were asked to repeat the PedsQL questionnaire one week later and return it by mail. To assess data quality for the survey question results, non-response rate, ceiling effect, and floor effect were analyzed. Test-retest reliability and internal consistency reliability were assessed. For test-retest reliability, an intraclass correlation coefficient (ICC) was calculated, and for internal consistency reliability, Cronbach's alpha was used. To test criterion-related validity, Pearson's correlation coefficient was used. [Results] The content validity of the PedsQL 3.0 Cerebral Palsy Module was high for both age groups, and demonstrated significant internal consistency (>0.7) in all areas. For test-retest reliability, both groups demonstrated a significant ICC (>0.61). 
Correlation with the PEDI was statistically significant in all areas except pain and hurt. [Conclusion] The Korean version of the PedsQL(TM) 3.0 Cerebral Palsy Module was found to be reliable and valid, and is expected to contribute greatly to the evaluation of the quality of life of children with cerebral palsy.
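The internal consistency statistic named in this abstract, Cronbach's alpha, can be illustrated with a minimal sketch. The function name `cronbach_alpha` and the toy scores are hypothetical and are not taken from the study; the code only shows the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores).

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    Hypothetical helper for illustration; uses sample variances (n - 1).
    """
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    # Variance of each item across respondents.
    item_vars = [variance([r[j] for r in rows]) for j in range(k)]
    # Variance of each respondent's total score.
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Toy data (made up): three respondents, two items.
print(cronbach_alpha([[1, 2], [2, 1], [3, 3]]))
```

A value above 0.7, the threshold quoted in the abstract, is conventionally read as acceptable internal consistency.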
Duncan, Mitch J; Rashid, Mahbub; Vandelanotte, Corneel; Cutumisu, Nicoleta; Plotnikoff, Ronald C
2013-02-04
Spatial configurations of office environments assessed by Space Syntax methodologies are related to employee movement patterns. These methods require analysis of floor plans, which are not readily available in large population-based studies or are otherwise unavailable. Therefore a self-report instrument to assess spatial configurations of office environments using four scales was developed. The scales are: local connectivity (16 items), overall connectivity (11 items), visibility of co-workers (10 items), and proximity of co-workers (5 items). A panel cohort (N = 1154) completed an online survey; only data from individuals employed in office-based occupations (n = 307) were used to assess scale measurement properties. To assess test-retest reliability, a separate sample of 37 office-based workers completed the survey on two occasions 7.7 (±3.2) days apart. Redundant scale items were eliminated using factor analysis; Cronbach's α was used to evaluate internal consistency, and the intraclass correlation coefficient (retest-ICC) was used to evaluate test-retest reliability. ANOVA was employed to examine differences between office types (Private, Shared, Open) as a measure of construct validity. Generalized Linear Models were used to examine relationships between spatial configuration scales and the duration and frequency of breaks in occupational sitting. The number of items on all scales was reduced; Cronbach's α and ICCs indicated good scale internal consistency and test-retest reliability: local connectivity (5 items; α = 0.70; retest-ICC = 0.84), overall connectivity (6 items; α = 0.86; retest-ICC = 0.87), visibility of co-workers (4 items; α = 0.78; retest-ICC = 0.86), and proximity of co-workers (3 items; α = 0.85; retest-ICC = 0.70). Significant (p ≤ 0.001) differences, in theoretically expected directions, were observed for all scales between office types, except overall connectivity. Significant associations were observed between all scales and occupational sitting behaviour (p ≤ 0.05). 
All scales have good measurement properties indicating the instrument may be a useful alternative to Space Syntax to examine environmental correlates of occupational sitting in population surveys.
2013-01-01
Background: Spatial configurations of office environments assessed by Space Syntax methodologies are related to employee movement patterns. These methods require analysis of floor plans, which are not readily available in large population-based studies or are otherwise unavailable. Therefore a self-report instrument to assess spatial configurations of office environments using four scales was developed. Methods: The scales are: local connectivity (16 items), overall connectivity (11 items), visibility of co-workers (10 items), and proximity of co-workers (5 items). A panel cohort (N = 1154) completed an online survey; only data from individuals employed in office-based occupations (n = 307) were used to assess scale measurement properties. To assess test-retest reliability, a separate sample of 37 office-based workers completed the survey on two occasions 7.7 (±3.2) days apart. Redundant scale items were eliminated using factor analysis; Cronbach’s α was used to evaluate internal consistency, and the intraclass correlation coefficient (retest-ICC) was used to evaluate test-retest reliability. ANOVA was employed to examine differences between office types (Private, Shared, Open) as a measure of construct validity. Generalized Linear Models were used to examine relationships between spatial configuration scales and the duration and frequency of breaks in occupational sitting. Results: The number of items on all scales was reduced; Cronbach’s α and ICCs indicated good scale internal consistency and test-retest reliability: local connectivity (5 items; α = 0.70; retest-ICC = 0.84), overall connectivity (6 items; α = 0.86; retest-ICC = 0.87), visibility of co-workers (4 items; α = 0.78; retest-ICC = 0.86), and proximity of co-workers (3 items; α = 0.85; retest-ICC = 0.70). Significant (p ≤ 0.001) differences, in theoretically expected directions, were observed for all scales between office types, except overall connectivity. Significant associations were observed between all scales and occupational sitting behaviour (p ≤ 0.05). 
Conclusion: All scales have good measurement properties indicating the instrument may be a useful alternative to Space Syntax to examine environmental correlates of occupational sitting in population surveys. PMID:23379485
Pallett, Edward; Rentowl, Patricia; Hanning, Christopher
2009-09-01
An Electronic Portable Information Collection audio device (EPIC-Vox) has been developed to deliver questionnaires in spoken word format via headphones. Patients respond by pressing buttons on the device. The aims of this study were to determine limits of agreement between, and test-retest reliability of audio (A) and paper (P) versions of the Brief Fatigue Inventory (BFI). Two hundred sixty outpatients (204 male, mean age 55.7 years) attending a sleep disorders clinic were allocated to four groups using block randomization. All completed the BFI twice, separated by a one-minute distracter task. Half the patients completed paper and audio versions, then an evaluation questionnaire. The remainder completed either paper or audio versions to compare test-retest reliability. BFI global scores were analyzed using Bland-Altman methodology. Agreement between categorical fatigue severity scores was determined using Cohen's kappa. The mean (SD) difference between paper and audio scores was -0.04 (0.48). The limits of agreement (mean difference+/-2SD) were -0.93 to +1.00. Test-retest reliability of the paper BFI showed a mean (SD) difference of 0.17 (0.32) between first and second presentations (limits -0.46 to +0.81). For audio, the mean (SD) difference was 0.17 (0.48) (limits -0.79 to +1.14). For agreement between categorical scores, Cohen's kappa=0.73 for P and A, 0.67 (P at test and retest) and 0.87 (A at test and retest). Evaluation preferences (n=128): 36.7% audio; 18.0% paper; and 45.3% no preference. A total of 99.2% found EPIC-Vox "easy to use." These data demonstrate that the English audio version of the BFI provides an acceptable alternative to the paper questionnaire.
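The two agreement statistics named in this abstract can be sketched as follows. This is an illustrative sketch only: the function names and the toy data are hypothetical, and the multiplier of 2 SD follows the abstract's description of the limits as "mean difference+/-2SD" (many Bland-Altman analyses use 1.96 instead).

```python
from collections import Counter
from statistics import mean, stdev


def limits_of_agreement(a, b, k=2.0):
    """Bland-Altman summary for paired scores a[i], b[i].

    Returns (mean difference, lower limit, upper limit) using mean +/- k * SD
    of the paired differences.
    """
    diffs = [x - y for x, y in zip(a, b)]
    d, s = mean(diffs), stdev(diffs)
    return d, d - k * s, d + k * s


def cohens_kappa(r1, r2):
    """Cohen's kappa for two equal-length lists of categorical ratings.

    Undefined (division by zero) when chance agreement equals 1.
    """
    n = len(r1)
    observed = sum(x == y for x, y in zip(r1, r2)) / n
    c1, c2 = Counter(r1), Counter(r2)
    # Chance agreement from the marginal category frequencies.
    expected = sum(c1[c] * c2[c] for c in c1.keys() | c2.keys()) / n ** 2
    return (observed - expected) / (1 - expected)


# Toy data (made up), not BFI scores from the study.
print(limits_of_agreement([1, 2, 3, 4], [1, 1, 3, 3]))
print(cohens_kappa(["mild", "mild", "severe", "severe"],
                   ["mild", "mild", "severe", "mild"]))
```

The limits of agreement describe how far two administrations of a continuous score may differ for an individual, while kappa corrects the raw agreement on severity categories for agreement expected by chance.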
VanGeest, Jonathan B; Wynia, Matthew K; Cummins, Deborah S; Wilson, Ira B
2002-06-01
This study examined the test-retest reliability of physicians' self-reported manipulation of reimbursement rules for patients. The test-retest reliability of self-report of three specific tactics were examined: (1) exaggerating the severity of patients' conditions, (2) changing a patient's official (billing) diagnosis, and (3) reporting signs or symptoms that patients did not have. The reliability of a scaled summary measure of physicians' manipulation of reimbursement rules was also assessed. Overall, the authors found high levels of test-retest agreement across all three items and the summary measure. These findings suggest that self-report can be used to produce reliable data on this controversial issue. Specifically, the three items reported here can be used to produce a reliable summary measure of physicians' manipulation of reimbursement rules to help patients obtain care that physicians perceive as necessary.
Moro, Maria Francesca; Colom, Francesc; Floris, Francesca; Pintus, Elisa; Pintus, Mirra; Contini, Francesca; Carta, Mauro Giovanni
2012-01-01
Background: The Functioning Assessment Short Test (FAST) is a brief instrument designed to assess the main functioning problems experienced by psychiatric patients, specifically bipolar patients. It includes 24 items assessing impairment or disability in six domains of functioning: autonomy, occupational functioning, cognitive functioning, financial issues, interpersonal relationships and leisure time. The aim of this study is to measure the validity and reliability of the Italian version of this instrument. Methods: Twenty-four patients with DSM-IV TR bipolar disorder and 20 healthy controls were recruited and evaluated in three private clinics in Cagliari (Sardinia, Italy). The psychometric properties of FAST (feasibility, internal consistency, concurrent validity, discriminant validity [patients vs controls and euthymic patients vs manic and depressed patients], and test-retest reliability) were analyzed. Results: The internal consistency obtained was very high, with a Cronbach's alpha of 0.955. A highly significant negative correlation with GAF was obtained (r = -0.9; p < 0.001), pointing to a reasonable degree of concurrent validity. The FAST showed good test-retest reliability between two independent evaluations one week apart (mean K = 0.73). The total FAST scores were lower in controls than in bipolar patients, and lower in euthymic patients than in depressed or manic patients. Conclusion: The Italian version of the FAST showed psychometric properties similar to those of the original version with regard to internal consistency and discriminant validity, and showed good test-retest reliability as measured by the K statistic. PMID:22905035
Strand, Edythe A; McCauley, Rebecca J; Weigand, Stephen D; Stoeckel, Ruth E; Baas, Becky S
2013-04-01
In this article, the authors report reliability and validity evidence for the Dynamic Evaluation of Motor Speech Skill (DEMSS), a new test that uses dynamic assessment to aid in the differential diagnosis of childhood apraxia of speech (CAS). Participants were 81 children between 36 and 79 months of age who were referred to the Mayo Clinic for diagnosis of speech sound disorders. Children were given the DEMSS and a standard speech and language test battery as part of routine evaluations. Subsequently, intrajudge, interjudge, and test-retest reliability were evaluated for a subset of participants. Construct validity was explored for all 81 participants through the use of agglomerative cluster analysis, sensitivity measures, and likelihood ratios. The mean percentage of agreement for 171 judgments was 89% for test-retest reliability, 89% for intrajudge reliability, and 91% for interjudge reliability. Agglomerative hierarchical cluster analysis showed that total DEMSS scores largely differentiated clusters of children with CAS vs. mild CAS vs. other speech disorders. Positive and negative likelihood ratios and measures of sensitivity and specificity suggested that the DEMSS does not overdiagnose CAS but sometimes fails to identify children with CAS. The value of the DEMSS in differential diagnosis of severe speech impairments was supported on the basis of evidence of reliability and validity.
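The diagnostic accuracy measures mentioned in this abstract (sensitivity, specificity, and positive and negative likelihood ratios) follow directly from a 2x2 table of test result against reference diagnosis. The sketch below uses a hypothetical function name and made-up counts, not data from the DEMSS study.

```python
import math


def diagnostic_summary(tp, fp, fn, tn):
    """Sensitivity, specificity, LR+ and LR- from 2x2 confusion counts.

    tp/fn: cases with the condition that the test flags / misses.
    tn/fp: cases without the condition that the test clears / flags.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # LR+ = sens / (1 - spec); infinite when there are no false positives.
    lr_pos = sensitivity / (1 - specificity) if specificity < 1 else math.inf
    # LR- = (1 - sens) / spec; infinite when specificity is zero.
    lr_neg = (1 - sensitivity) / specificity if specificity > 0 else math.inf
    return {"sensitivity": sensitivity, "specificity": specificity,
            "lr_pos": lr_pos, "lr_neg": lr_neg}


# Made-up counts for illustration only.
print(diagnostic_summary(tp=8, fp=2, fn=2, tn=18))
```

A test that "does not overdiagnose but sometimes fails to identify" cases, as described above, corresponds to few false positives (high specificity, large LR+) alongside some false negatives (lower sensitivity, LR- not close to zero).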
Cole, Jason C; Ito, Diane; Chen, Yaozhu J; Cheng, Rebecca; Bolognese, Jennifer; Li-McLeod, Josephine
2014-09-04
There is a lack of validated instruments to measure the level of burden of Alzheimer's disease (AD) on caregivers. The Impact of Alzheimer's Disease on Caregiver Questionnaire (IADCQ) is a 12-item instrument with a seven-day recall period that measures AD caregiver's burden across emotional, physical, social, financial, sleep, and time aspects. Primary objectives of this study were to evaluate psychometric properties of IADCQ administered on the Web and to determine most appropriate scoring algorithm. A national sample of 200 unpaid AD caregivers participated in this study by completing the Web-based version of IADCQ and Short Form-12 Health Survey Version 2 (SF-12v2™). The SF-12v2 was used to measure convergent validity of IADCQ scores and to provide an understanding of the overall health-related quality of life of sampled AD caregivers. The IADCQ survey was also completed four weeks later by a randomly selected subgroup of 50 participants to assess test-retest reliability. Confirmatory factor analysis (CFA) was implemented to test the dimensionality of the IADCQ items. Classical item-level and scale-level psychometric analyses were conducted to estimate psychometric characteristics of the instrument. Test-retest reliability was performed to evaluate the instrument's stability and consistency over time. Virtually none (2%) of the respondents had either floor or ceiling effects, indicating the IADCQ covers an ideal range of burden. A single-factor model obtained appropriate goodness of fit and provided evidence that a simple sum score of the 12 items of IADCQ can be used to measure AD caregiver's burden. Scales-level reliability was supported with a coefficient alpha of 0.93 and an intra-class correlation coefficient (for test-retest reliability) of 0.68 (95% CI: 0.50-0.80). Low-moderate negative correlations were observed between the IADCQ and scales of the SF-12v2. 
The study findings suggest the IADCQ has appropriate psychometric characteristics as a unidimensional, Web-based measure of AD caregiver burden and is supported by strong model fit statistics from CFA, high degree of item-level reliability, good internal consistency, moderate test-retest reliability, and moderate convergent validity. Additional validation of the IADCQ is warranted to ensure invariance between the paper-based and Web-based administration and to determine an appropriate responder definition.
Clinical applications of correlational vestibular autorotation test.
Hsieh, Li-Chun; Lin, Te-Ming; Chang, Yu-Min; Kuo, Terry B J; Lee, Gho-She
2015-06-01
The correlational vestibular autorotation test (VAT) system has the advantages of good test-retest reliability, and calibrations of absolute degrees of eye movement are unnecessary when acquiring a cross correlation coefficient (CCC). The approach is able to efficiently detect peripheral vestibulopathies. A VAT has some drawbacks, including poor test-retest reliability and sensor slippage. This study aimed to develop a correlational VAT system and to evaluate the reliability and applicability of this system. Twenty healthy participants and 10 vertiginous patients were enrolled. Vertical and horizontal autorotations from 0 to 3 Hz with either closed or open eyes were performed. A small sensor and a wireless transmission technique were used to acquire the electro-ocular graph and head velocity signals. The two signals were analyzed using CCCs to assess the functioning of the vestibular ocular reflex (VOR). The results showed a significantly greater CCC for open-eye versus closed-eye head autorotations. The CCCs also increased significantly with head rotational frequencies. Moreover, the CCCs significantly correlated with the VOR gains at autorotation frequencies ≥1.0 Hz. The test-retest reliability was good (intraclass correlation coefficients ≥0.85). The vertiginous participants had significantly lower individual CCCs and overall average CCC than age- and gender-matched controls.
Validity and test-retest reliability of the six-spot step test in persons after stroke.
Arvidsson Lindvall, Mialinn; Anderzén-Carlsson, Agneta; Appelros, Peter; Forsberg, Anette
2018-06-06
After stroke, asymmetric weight distribution is common with decreased balance control in standing and walking. The six-spot step test (SSST) includes a 5-m walk during which one leg shoves wooden blocks out of circles marked on the floor, thus assessing the ability to take load on each leg. The aim of the present study was to investigate the convergent and discriminant validity and test-retest reliability of the SSST in persons with stroke. Eighty-one participants were included. A cross-sectional study was performed, in which the SSST was conducted twice, 3-7 days apart. Validity was investigated using measures of dynamic balance and walking. Reliability was assessed using intraclass correlation coefficient, standard error of the measurement (SEM), and smallest real difference (SRD). The convergent validity was strong to moderate, and the test-retest reliability was good. The SEM% was 14.7%, and the SRD% was 40.8% based on the mean of four walks shoving twice with the paretic and twice with the non-paretic leg. Random measurement error was high, which limits the use of the SSST for follow-up evaluations, but the SSST can be a complementary measure of gait and balance.
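The relation between the SEM and the SRD reported here can be sketched under a commonly used convention, which the abstract itself does not spell out and which is therefore an assumption: SEM = SD * sqrt(1 - ICC) and SRD = 1.96 * sqrt(2) * SEM. The function names are hypothetical.

```python
import math


def sem_from_icc(sd, icc):
    """Standard error of measurement from the between-subject SD and the ICC
    (assumed convention: SEM = SD * sqrt(1 - ICC))."""
    return sd * math.sqrt(1 - icc)


def srd_from_sem(sem, z=1.96):
    """Smallest real difference at roughly 95% confidence
    (assumed convention: SRD = z * sqrt(2) * SEM)."""
    return z * math.sqrt(2) * sem


# Applying the assumed SRD formula to the SEM% quoted in the abstract.
print(round(srd_from_sem(14.7), 1))
```

Under this convention, an SEM% of 14.7 gives an SRD% of about 40.7, close to the 40.8% in the abstract; the small gap is consistent with rounding of the reported SEM%.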
Comparison of Medical and Consumer Wireless EEG Systems for Use in Clinical Trials.
Ratti, Elena; Waninger, Shani; Berka, Chris; Ruffini, Giulio; Verma, Ajay
2017-01-01
Objectives: To compare quantitative EEG signal and test-retest reliability of medical grade and consumer EEG systems. Methods: Resting state EEG was acquired by two medical grade (B-Alert, Enobio) and two consumer (Muse, Mindwave) EEG systems in five healthy subjects during two study visits. EEG patterns, power spectral densities (PSDs) and test/retest reliability in eyes closed and eyes open conditions were compared across the four systems, focusing on Fp1, the only common electrode. Fp1 PSDs were obtained using Welch's modified periodogram method and averaged for the five subjects for each visit. The test/retest results were calculated as a ratio of Visit 1/Visit 2 Fp1 channel PSD at each 1 s epoch. Results: B-Alert, Enobio, and Mindwave Fp1 power spectra were similar. Muse showed a broadband increase in power spectra and the highest relative variation across test-retest acquisitions. Consumer systems were more prone to artifact due to eye blinks and muscle movement in the frontal region. Conclusions: EEG data can be successfully collected from all four systems tested. Although there was slightly more time required for application, medical systems offer clear advantages in data quality, reliability, and depth of analysis over the consumer systems. Significance: This evaluation provides evidence for informed selection of EEG systems appropriate for clinical trials.
Timed activity performance in persons with upper limb amputation: A preliminary study.
Resnik, Linda; Borgia, Mathew; Acluche, Frantzy
55 subjects with upper limb amputation were administered the T-MAP twice within one week. To develop a timed measure of activity performance for persons with upper limb amputation (T-MAP); examine the measure's internal consistency, test-retest reliability and validity; and compare scores by prosthesis use. Measures of activity performance for persons with upper limb amputation are needed. The time required to perform daily activities is a meaningful metric that has implications for participation in life roles. Internal consistency and test-retest reliability were evaluated. Construct validity was examined by comparing scores by amputation level. Exploratory analyses compared sub-group scores, and examined correlations with other measures. Scale alpha was 0.77, ICC was 0.93. Timed scores differed by amputation level. Subjects using a prosthesis took longer to perform all tasks. T-MAP was not correlated with other measures of dexterity or activity, but was correlated with pain for non-prosthesis users. The timed scale had adequate internal consistency and excellent test-retest reliability. Analyses support reliability and construct validity of the T-MAP. 2c "outcomes" research. Published by Elsevier Inc.
Spathis, Jemima Grace; Connick, Mark James; Beckman, Emma Maree; Newcombe, Peter Anthony; Tweedy, Sean Michael
2015-01-01
Paralympic throwing events for athletes with physical impairments comprise seated and standing javelin, shot put, discus and seated club throwing. Identification of talented throwers would enable prediction of future success and promote participation; however, a valid and reliable talent identification battery for Paralympic throwing has not been reported. This study evaluates the reliability and validity of a talent identification battery for Paralympic throws. Participants were non-disabled so that impairment would not confound analyses, and results would provide an indication of normative performance. Twenty-eight non-disabled participants (13 M; 15 F) aged 23.6 years (±5.44) performed five kinematically distinct criterion throws (three seated, two standing) and nine talent identification tests (three anthropometric, six motor); 23 were tested a second time to evaluate test-retest reliability. Talent identification test-retest reliability was evaluated using Intra-class Correlation Coefficient (ICC) and Bland-Altman plots (Limits of Agreement). Spearman's correlation assessed strength of association between criterion throws and talent identification tests. Reliability was generally acceptable (mean ICC = 0.89), but two seated talent identification tests require more extensive familiarisation. Correlation strength (mean rs = 0.76) indicated that the talent identification tests can be used to validly identify individuals with competitively advantageous attributes for each of the five kinematically distinct throwing activities. Results facilitate further research in this understudied area.
Test-retest reliability of subliminal facial affective priming.
Dannlowski, Udo; Suslow, Thomas
2006-02-01
Since the seminal 1993 demonstrations of Murphy and Zajonc, researchers have replicated and extended findings concerning subliminal affective priming. So far, however, no data on test-retest reliability of affective priming effects are available. A subliminal facial affective priming task was administered to 22 healthy individuals (15 women and 7 men) twice about 7 wk. apart. Happy and sad facial expressions were used as affective primes and neutral Chinese ideographs served as target masks, which had to be evaluated. Neutral facial primes and a no-face condition served as baselines. All participants reported not having seen any of the prime faces at either testing session. Priming scores for affective faces compared to the baselines were computed. Acceptable test-retest correlations (rs) of up to .74 were found for the affective priming scores. Although measured almost 2 mo. apart, subliminal affective priming seems to be a temporally stable effect.
Test-retest reliability of resting-state magnetoencephalography power in sensor and source space.
Martín-Buro, María Carmen; Garcés, Pilar; Maestú, Fernando
2016-01-01
Several studies have reported changes in spontaneous brain rhythms that could be used as clinical biomarkers or in the evaluation of neuropsychological and drug treatments in longitudinal studies using magnetoencephalography (MEG). There is an increasing necessity to use these measures in early diagnosis and pathology progression; however, there is a lack of studies addressing how reliable they are. Here, we provide the first test-retest reliability estimate of MEG power in resting-state at sensor and source space. In this study, we recorded 3 sessions of resting-state MEG activity from 24 healthy subjects with an interval of a week between each session. Power values were estimated at sensor and source space with beamforming for classical frequency bands: delta (2-4 Hz), theta (4-8 Hz), alpha (8-13 Hz), low beta (13-20 Hz), high beta (20-30 Hz), and gamma (30-45 Hz). Then, test-retest reliability was evaluated using the intraclass correlation coefficient (ICC). We also evaluated the relation between source power and the within-subject variability. In general, ICC of theta, alpha, and low beta power was fairly high (ICC > 0.6) while in delta and gamma power was lower. In source space, fronto-posterior alpha, frontal beta, and medial temporal theta showed the most reliable profiles. Signal-to-noise ratio could be partially responsible for reliability as low signal intensity resulted in high within-subject variability, but also the inherent nature of some brain rhythms in resting-state might be driving these reliability patterns. In conclusion, our results described the reliability of MEG power estimates in each frequency band, which could be considered in disease characterization or clinical trials. © 2015 Wiley Periodicals, Inc.
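For a design like the one described here (each subject measured in several sessions), an intraclass correlation coefficient can be computed from a two-way ANOVA decomposition. The abstract does not state which ICC form was used, so the choice below of ICC(2,1) (two-way random effects, absolute agreement, single measurement) is an assumption, and the function name and toy data are hypothetical.

```python
def icc_2_1(scores):
    """ICC(2,1) for scores[i][j] = measurement of subject i in session j.

    Assumed form: two-way random effects, absolute agreement, single measure.
    Requires at least 2 subjects and 2 sessions, with no missing values.
    """
    n, k = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Sums of squares for subjects (rows), sessions (columns) and residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


# Toy data (made up): 4 subjects, 3 sessions of band power in arbitrary units.
power = [[10.1, 10.4, 9.8], [14.9, 15.3, 15.0], [7.2, 7.0, 7.5], [12.0, 11.6, 12.3]]
print(icc_2_1(power))
```

When between-subject differences are large relative to session-to-session fluctuation, as in the toy data, the ICC approaches 1; low signal intensity inflates within-subject variability and pulls it down, which matches the signal-to-noise explanation offered in the abstract.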
Developing an oropharyngeal cancer (OPC) knowledge and behaviors survey.
Dodd, Virginia J; Riley Iii, Joseph L; Logan, Henrietta L
2012-09-01
To use the community participation research model to (1) develop a survey assessing knowledge about mouth and throat cancer and (2) field test and establish test-retest reliability with the newly developed instrument. Cognitive interviews were conducted with primarily rural African American adults to assess their perception and interpretation of survey items. Test-retest reliability was established with a racially diverse rural population. Test-retest reliabilities ranged from .79 to .40 for screening awareness and .74 to .19 for knowledge. Coefficients increased for composite scores. Community participation methodology provided a culturally appropriate survey instrument that demonstrated acceptable levels of reliability.
Abma, Femke I; van der Klink, Jac J L; Bültmann, Ute
2013-03-01
The promotion of a sustainable, healthy and productive working life attracts more and more attention. Recently the Work Role Functioning Questionnaire (WRFQ) has been cross-culturally translated and adapted to Dutch. This questionnaire aims to measure the health-related work functioning of workers with health problems. The aim of this study is to evaluate the reliability, validity (including five new items) and responsiveness of the WRFQ 2.0 in the working population. A longitudinal study was conducted among workers. The reliability (internal consistency, test-retest reliability, measurement error), validity (structural validity-factor analysis, construct validity by means of hypotheses testing) and responsiveness of the WRFQ 2.0 were evaluated. A total of N = 553 workers completed the survey. The final WRFQ 2.0 has four subscales and showed very good internal consistency, moderate test-retest reliability, good construct validity and moderate responsiveness in the working population. The WRFQ was able to distinguish between groups with different levels of mental health, physical health, fatigue and need for recovery. A moderate correlation was found between WRFQ and related constructs respectively work ability and work productivity. A weak relationship was found with general self-rated health, work engagement and work involvement. The WRFQ 2.0 is a reliable and valid instrument to measure health-related work functioning in the working population. Further validation in larger samples is recommended, especially for test-retest reliability, responsiveness and the questionnaire's ability to predict the future course of health-related work functioning.
Translation, reliability, and clinical utility of the Melbourne Assessment 2.
Gerber, Corinna N; Plebani, Anael; Labruyère, Rob
2017-10-12
The aims were to (i) provide a German translation of the Melbourne Assessment 2 (MA2), a quantitative test to measure unilateral upper limb function in children with neurological disabilities and (ii) to evaluate its reliability and aspects of clinical utility. After its translation into German and approval of the back translation by the original authors, the MA2 was performed and videotaped twice with 30 children with neuromotor disorders. For each participant, two raters scored the video of the first test for inter-rater reliability. To determine test-retest reliability, one rater additionally scored the video of the second test while the other rater repeated the scoring of the first video to evaluate intra-rater reliability. Time needed for rater training, test administration, and scoring was recorded. The four subscale scores showed excellent intra-rater, inter-rater, and test-retest reliability with intraclass correlation coefficients of 0.90-1.00 (95% confidence intervals 0.78-1.00). Score items revealed substantial to almost perfect intra-rater reliability (weighted kappa κw = 0.66-1.00) for the more affected side. Score item inter-rater and test-retest reliability of the same extremity were, with one exception, moderate to almost perfect (κw = 0.42-0.97; κw = 0.40-0.89). Furthermore, the MA2 was feasible and acceptable for patients and clinicians. The MA2 showed excellent subscale and moderate to almost perfect score item reliability. Implications for Rehabilitation: There is a lack of high-quality studies about psychometric properties of upper limb measurement tools in the neuropediatric population. The Melbourne Assessment 2 is a promising tool for reliable measurement of unilateral upper limb movement quality in the neuropediatric population. The Melbourne Assessment 2 is acceptable and practicable to therapists and patients for routine use in clinical care.
Trotti, Lynn Marie; Staab, Beth A; Rye, David B
2013-08-15
Differentiation of narcolepsy without cataplexy from idiopathic hypersomnia relies entirely upon the multiple sleep latency test (MSLT). However, the test-retest reliability for these central nervous system hypersomnias has never been determined. Patients with narcolepsy without cataplexy, idiopathic hypersomnia, and physiologic hypersomnia who underwent two diagnostic multiple sleep latency tests were identified retrospectively. Correlations between the mean sleep latencies on the two studies were evaluated, and we probed for demographic and clinical features associated with reproducibility versus change in diagnosis. Thirty-six patients (58% women, mean age 34 years) were included. Inter-test interval was 4.2 ± 3.8 years (range 2.5 months to 16.9 years). Mean sleep latencies on the first and second tests were 5.5 (± 3.7 SD) and 7.3 (± 3.9) minutes, respectively, with no significant correlation (r = 0.17, p = 0.31). A change in diagnosis occurred in 53% of patients, and was accounted for by a difference in the mean sleep latency (N = 15, 42%) or the number of sleep onset REM periods (N = 11, 31%). The only feature predictive of a diagnosis change was a history of hypnagogic or hypnopompic hallucinations. The multiple sleep latency test demonstrates poor test-retest reliability in a clinical population of patients with central nervous system hypersomnia evaluated in a tertiary referral center. Alternative diagnostic tools are needed.
Fatigue after stroke: the development and evaluation of a case definition.
Lynch, Joanna; Mead, Gillian; Greig, Carolyn; Young, Archie; Lewis, Susan; Sharpe, Michael
2007-11-01
While fatigue after stroke is a common problem, it has no generally accepted definition. Our aim was to develop a case definition for post-stroke fatigue and to test its psychometric properties. A case definition with face validity and an associated structured interview was constructed. After initial piloting, the feasibility, reliability (test-retest and inter-rater) and concurrent validity (in relation to four fatigue severity scales) were determined in 55 patients with stroke. All participating patients provided satisfactory answers to all the case definition probe questions, demonstrating its feasibility. For test-retest reliability, kappa was 0.78 (95% CI, 0.57-0.94, P<.01) and for inter-rater reliability kappa was 0.80 (95% CI, 0.62-0.99, P<.01). Patients fulfilling the case definition also had substantially higher fatigue scores on four fatigue severity scales (P<.001), indicating concurrent validity. The proposed case definition is feasible to administer and reliable in practice, and there is evidence of concurrent validity. It requires further evaluation in different settings.
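The kappa values reported above measure agreement corrected for chance. As a minimal sketch of how Cohen's kappa is obtained from a cross-tabulation of two sets of case/non-case classifications, the following uses a purely hypothetical 2x2 table (the abstract does not give the underlying counts) and an illustrative function name, `cohens_kappa`:

```python
def cohens_kappa(table):
    """Cohen's kappa for a square table of counts.

    table[i][j] = number of subjects placed in category i on the first
    rating and category j on the second rating.
    """
    n = sum(sum(row) for row in table)
    size = len(table)
    # Observed agreement: proportion of subjects on the diagonal.
    p_observed = sum(table[i][i] for i in range(size)) / n
    # Chance agreement: product of the marginal proportions per category.
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(size)) for j in range(size)]
    p_chance = sum(row_totals[i] * col_totals[i] for i in range(size)) / (n * n)
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical counts, NOT taken from the study:
# rows = first interview (case, non-case), columns = second interview.
example_table = [[20, 5],
                 [10, 15]]
kappa = cohens_kappa(example_table)
print(round(kappa, 2))
```

Here the observed agreement is 0.70 and the chance agreement is 0.50, so kappa is 0.40; perfect agreement would give 1.0.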
Reliability of laboratory measurement of human food intake.
Laessle, R; Geiermann, L
2012-02-01
The universal eating monitor (UEM) of Kissileff for laboratory measurement of food intake was modified and used with a newly developed special software to compute cumulative intake data. To explore the measurement precision of the UEM an investigation of test-retest-reliability of food intake parameters was conducted. The intake characteristics of 125 males and females were measured repeatedly in the laboratory with a measurement interval of 1 week. Pudding of preferred flavour served as test meal. Test-retest-reliability of intake characteristics ranged from .49 (change of eating rate) to .89 (initial eating rate). All test-retest correlations were highly significant. Sex, BMI and eating habits according to TFEQ-factors had no significant effects on reliability of intake characteristics. The test-retest-reliability of the laboratory intake measures is as good as those of personality questionnaires, where it should be better than .80. Reliability coefficients are valid independent of sex, BMI or trait characteristics of eating behaviour. Copyright © 2011 Elsevier Ltd. All rights reserved.
Larson, Tomas; Kerekes, Nóra; Selinus, Eva Norén; Lichtenstein, Paul; Gumpert, Clara Hellner; Anckarsäter, Henrik; Nilsson, Thomas; Lundström, Sebastian
2014-02-01
The Autism-Tics, AD/HD, and other Comorbidities (A-TAC) inventory is used in epidemiological research to assess neurodevelopmental problems and coexisting conditions. Although the A-TAC has been applied in various populations, data on retest reliability are limited. The objective of the present study was to present additional reliability data. The A-TAC was administered by lay assessors and was completed on two occasions by parents of 400 individual twins, with an average interval of 70 days between test sessions. Intra- and inter-rater reliability were analysed with intraclass correlations and Cohen's kappa. A-TAC showed excellent test-retest intraclass correlations for both autism spectrum disorder and attention deficit hyperactivity disorder (each at .84). Most modules in the A-TAC had intra- and inter-rater reliability intraclass correlation coefficients of ≥ .60. Cohen's kappa indicated acceptable reliability. The current study provides statistical evidence that the A-TAC yields good test-retest reliability in a population-based cohort of children.
Zambelli, Roberto; Pinto, Rafael Z; Magalhães, João Murilo Brandão; Lopes, Fernando Araujo Silva; Castilho, Rodrigo Simões; Baumfeld, Daniel; Dos Santos, Thiago Ribeiro Teles; Maffulli, Nicola
2016-01-01
There is a need for a patient-relevant instrument to evaluate outcome after treatment in patients with a total Achilles tendon rupture. The purpose of this study was to undertake a cross-cultural adaptation of the Achilles Tendon Total Rupture Score (ATRS) into Brazilian Portuguese, determining the test-retest reliability and construct validity of the instrument. A five-step approach was used in the cross-cultural adaptation process: initial translation (two bilingual Brazilian translators), synthesis of translation, back-translation (two native English language translators), consensus version and evaluation (expert committee), and testing phase. A total of 46 patients were recruited to evaluate the test-retest reproducibility and construct validity of the Brazilian Portuguese version of the ATRS. Test-retest reproducibility was performed by assessing each participant on two separate occasions. The construct validity was determined by the correlation index between the ATRS and the Orthopedic American Foot and Ankle Society (AOFAS) questionnaires. The final version of the Brazilian Portuguese ATRS had the same number of questions as the original ATRS. For the reliability analysis, an ICC(2,1) of 0.93 (95 % CI: 0.88 to 0.96) with SEM of 1.56 points and MDC of 4.32 was observed, indicating excellent reliability. The construct validity showed excellent correlation with R = 0.76 (95 % CI: 0.52 to 0.89, P < 0.001). The ATRS was successfully cross-culturally validated into Brazilian Portuguese. This version was a reliable and valid measure of function in patients who suffered complete rupture of the Achilles Tendon.
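The reported SEM and MDC are linked by a standard formula. As a quick check, assuming the MDC was computed at the 95% confidence level (the abstract does not state the level), MDC95 = 1.96 × √2 × SEM. The function name `mdc95` is illustrative only:

```python
import math


def mdc95(sem: float) -> float:
    """Minimal detectable change at 95% confidence: 1.96 * sqrt(2) * SEM."""
    return 1.96 * math.sqrt(2) * sem


reported_sem = 1.56  # SEM in points, as reported in the abstract
mdc = mdc95(reported_sem)
print(round(mdc, 2))
```

Under that assumption the result rounds to 4.32, matching the MDC given in the abstract; a change smaller than this between two ATRS administrations could not be distinguished from measurement error.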
Chiwaridzo, Matthew; Chikasha, Tafadzwa Nicole; Naidoo, Nirmala; Dambi, Jermaine Matewu; Tadyanemhandu, Cathrine; Munambah, Nyaradzai; Chizanga, Precious Trish
2017-01-01
In Zimbabwe, a recent increase in the volume of research on recurrent non-specific low back pain (NSLBP) has revealed that adolescents are commonly affected. This is alarming to health professionals and parents and calls for serious primary preventative strategies to be developed and implemented forthwith. Early identification initiatives should be prioritised in order to curtail the condition and its progression. In an attempt to be proactive in minimising the prevalence of recurrent NSLBP, this study was conducted to evaluate the content validity and test-retest reliability of a survey questionnaire with the aim of proffering a valid and reliable questionnaire which can be used in non-clinical settings to identify adolescents with recurrent NSLBP in Harare, Zimbabwe and determine the possible factors associated with the condition. The study was conducted in two parts. The first part assessed content validity of the questionnaire using four experts derived from academia and clinical practice. The second part evaluated the reliability of the questionnaire among 125 high school children aged between 13 and 19 years in a test-retest study. Twenty-six (26) out of thirty questions in the questionnaire had an Item Content Validity Index of 1.00, demonstrating complete agreement among content experts. Overall, the Scale Content Validity Index for the questionnaire was 0.97. Item completion for the reliability study was satisfactory. The questionnaire items had kappa values ranging from 0.17 (slight agreement) to 1 (perfect agreement). High levels of reliability were found for the questions on school bag use (k = 0.94), sports participation (k = 0.97), and lifetime prevalence (k = 0.89). Excellent content validity and slight to perfect test-retest reliability were found for the Low Back Pain (LBP) questionnaire. These results are comparable to findings of other studies evaluating the psychometric properties of LBP questionnaires. 
Cognisant of the limitations of the study, the results of this study suggest that the LBP questionnaire could be used in local studies investigating LBP among adolescents although questions enquiring on functional limitations and sciatica may need further consideration.
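The content validity figures above can be illustrated with a short calculation. This is a sketch under two assumptions not stated in the abstract: that the Scale Content Validity Index was computed by the averaging method (mean of the item-level indices), and that each of the four remaining items was rated relevant by three of the four experts. The helper names are hypothetical.

```python
def item_cvi(n_rating_relevant: int, n_experts: int) -> float:
    """Item-level CVI: share of experts who rate the item as relevant."""
    return n_rating_relevant / n_experts


def scale_cvi_average(item_cvis) -> float:
    """Scale-level CVI (averaging method): mean of the item-level CVIs."""
    return sum(item_cvis) / len(item_cvis)


N_EXPERTS = 4  # from the abstract

# 26 items with full agreement (reported), plus 4 items where we ASSUME
# three of the four experts agreed (not reported in the abstract).
item_cvis = [item_cvi(4, N_EXPERTS)] * 26 + [item_cvi(3, N_EXPERTS)] * 4
s_cvi = scale_cvi_average(item_cvis)
print(round(s_cvi, 2))
```

With these assumed ratings the scale-level index is 29/30, which rounds to the reported 0.97; other rating patterns for the four items would give a different value.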
Test-Retest Reliability and Predictive Validity of the Implicit Association Test in Children
ERIC Educational Resources Information Center
Rae, James R.; Olson, Kristina R.
2018-01-01
The Implicit Association Test (IAT) is increasingly used in developmental research despite minimal evidence of whether children's IAT scores are reliable across time or predictive of behavior. When test-retest reliability and predictive validity have been assessed, the results have been mixed, and because these studies have differed on many…
Solah, Vicky A.; Meng, Xingqiong; Wood, Simon; Gahler, Roland J.; Kerr, Deborah A.; James, Anthony P.; Pal, Sebely; Fenton, Haelee K.; Johnson, Stuart K.
2015-01-01
Background: The assessment of satiety effects of foods is commonly performed by untrained volunteers marking their perceived hunger or fullness on line scales, marked with pre-set descriptors. The lack of reproducibility of satiety measurement using this approach however results in the tool being unable to distinguish between foods that have small, but possibly important, differences in their satiety effects. An alternate approach is used in sensory evaluation; panellists can be trained in the correct use of the assessment line-scale and brought to consensus on the meanings of descriptors used for food quality attributes to improve the panel reliability. The effect of training on the reliability of a satiety panel has not previously been reported. Method: In a randomised controlled parallel intervention, the effect of training in the correct use of a satiety labelled magnitude scale (LMS) was assessed versus no-training. The test-retest precision and reliability of two hour postprandial satiety evaluation after consumption of a standard breakfast was compared. The trained panel then compared the satiety effect of two breakfast meals containing either a viscous or a non-viscous dietary fibre in a crossover trial. Results: A subgroup of the 23 panellists (n = 5) improved their test-retest precision after training. Panel satiety area under the curve, “after the training” intervention was significantly different to “before training” (p < 0.001). Reliability of the panel determined by intraclass correlation (ICC) of test and retest showed improved strength of the correlation from 0.70 pre-intervention to 0.95 post intervention. The trained “satiety expert panel” determined that a standard breakfast with 5g of viscous fibre gave significantly higher satiety than with 5g non-viscous fibre (area under curve (AUC) of 478.2, 334.4 respectively) (p ≤ 0.002). Conclusion: Training reduced between-panellist variability. 
The improved strength of test-retest ICC as a result of the training intervention suggests that training satiety panellists can improve the discriminating power of satiety evaluation. PMID:25978321
Jensen, Christian Gaden; Niclasen, Janni; Vangkilde, Signe Allerup; Petersen, Anders; Hasselbalch, Steen Gregers
2016-05-01
The Mindful Attention Awareness Scale (MAAS) measures perceived degree of inattentiveness in different contexts and is often used as a reversed indicator of mindfulness. MAAS is hypothesized to reflect a psychological trait or disposition when used outside attentional training contexts, but the long-term test-retest reliability of MAAS scores is virtually untested. It is unknown whether MAAS predicts psychological health after controlling for standardized socioeconomic status classifications. First, MAAS translated to Danish was validated psychometrically within a randomly invited healthy adult community sample (N = 490). Factor analysis confirmed that MAAS scores quantified a unifactorial construct of excellent composite reliability and consistent convergent validity. Structural equation modeling revealed that MAAS scores contributed independently to predicting psychological distress and mental health, after controlling for age, gender, income, socioeconomic occupational class, stressful life events, and social desirability (β = 0.32-.42, ps < .001). Second, MAAS scores showed satisfactory short-term test-retest reliability in 100 retested healthy university students. Finally, MAAS sample mean scores as well as individuals' scores demonstrated satisfactory test-retest reliability across a 6 months interval in the adult community (retested N = 407), intraclass correlations ≥ .74. MAAS scores displayed significantly stronger long-term test-retest reliability than scores measuring psychological distress (z = 2.78, p = .005). Test-retest reliability estimates did not differ within demographic and socioeconomic strata. Scores on the Danish MAAS were psychometrically validated in healthy adults. MAAS's inattentiveness scores reflected a unidimensional construct, long-term reliable disposition, and a factor of independent significance for predicting psychological health. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
Reliability of the Cooking Task in adults with acquired brain injury.
Poncet, Frédérique; Swaine, Bonnie; Taillefer, Chantal; Lamoureux, Julie; Pradat-Diehl, Pascale; Chevignard, Mathilde
2015-01-01
Acquired brain injury (ABI) often leads to deficits in executive functioning (EF) responsible for severe and long-standing disabilities in daily life activities. The Cooking Task is an ecological and valid test of EF involving multi-tasking in a real environment. Given its complex scoring system, it is important to establish the tool's reliability. The objective of the study was to examine the reliability of the Cooking Task (internal consistency, inter-rater and test-retest reliability). A total of 160 patients with ABI (113 men, mean age 37 years, SD = 14.3) were tested using the Cooking Task. For test-retest reliability, patients were assessed by the same rater on two occasions (mean interval 11 days) while two raters independently and simultaneously observed and scored patients' performances to estimate inter-rater reliability. Internal consistency was high for the global scale (Cronbach α = .74). Inter-rater reliability (n = 66) for total errors was also high (ICC = .93), however the test-retest reliability (n = 11) was poor (ICC = .36). In general the Cooking Task appears to be a reliable tool. The low test-retest results were expected given the importance of EF in the performance of novel tasks.
Tonga, Eda; Atasavun Uysal, Songul; Karayazgan, Sedef; Hayran, Mutlu; Düger, Tülin
2016-01-01
Clinical measurement. To adapt the original JPBA-S to a Turkish version (TUR-JPBA-S) and to investigate its reliability in assessing patients with rheumatoid arthritis (RA). Twenty-two participants with RA and 21 healthy people were videotaped while performing tasks listed in the TUR-JPBA-S. Two raters scored the video recordings to evaluate inter-rater reliability. One rater re-analyzed the recordings at a different time point for intra-rater reliability. Participants with RA were asked to perform the same tasks after three to four weeks, which was also recorded to evaluate test-retest reliability. Internal consistency (Cronbach's α value) was found to be high (0.89) for participants with RA. Our results demonstrate excellent intra-rater (ICC: 0.99, SEM 1.2) and inter-rater (ICC: 0.99, SEM 1.7) reliability, as well as excellent test-retest reliability (ICC: 0.96). The TUR-JPBA-S is a valid and reliable instrument for assessing JP behavior in patients with RA in Turkey. Level 2. Copyright © 2016 Hanley & Belfus. Published by Elsevier Inc. All rights reserved.
Siahaan, Laura A; Syam, Ari F; Simadibrata, Marcellus; Setiati, Siti
2017-01-01
To obtain a valid and reliable GERD-QOL questionnaire for Indonesian application. At the initial stage, the GERD-QOL questionnaire was first translated into the Indonesian language and the translated questionnaire was subsequently translated back into the original language (back-to-back translation). The results were evaluated by the research team, and an Indonesian version of the GERD-QOL questionnaire was thereby developed. Ninety-one patients who had been clinically diagnosed with GERD based on the Montreal criteria were interviewed using the Indonesian version of the GERD-QOL questionnaire and the SF-36 questionnaire. Validity was evaluated in terms of construct validity and external validity, and reliability was tested by internal consistency and test-retest methods. The Indonesian version of the GERD-QOL questionnaire had good internal consistency reliability with a Cronbach's alpha of 0.687-0.842 and good test-retest reliability with an intra-class correlation coefficient of 0.756-0.936 (p<0.05). The questionnaire was also demonstrated to have good validity, with a high correlation to each question of the SF-36 (p<0.05). The Indonesian version of the GERD-QOL questionnaire has been proven valid and reliable for evaluating the quality of life of GERD patients.
Baad-Hansen, L; Pigg, M; Yang, G; List, T; Svensson, P; Drangsholt, M
2015-02-01
The reliability of comprehensive intra-oral quantitative sensory testing (QST) protocol has not been examined systematically in patients with chronic oro-facial pain. The aim of the present multicentre study was to examine test-retest and interexaminer reliability of intra-oral QST measures in terms of absolute values and z-scores as well as within-session coefficients of variation (CV) values in patients with atypical odontalgia (AO) and healthy pain-free controls. Forty-five patients with AO and 68 healthy controls were subjected to bilateral intra-oral gingival QST and unilateral extratrigeminal QST (thenar) on three occasions (twice on 1 day by two different examiners and once approximately 1 week later by one of the examiners). Intra-class correlation coefficients and kappa values for interexaminer and test-retest reliability were computed. Most of the standardised intra-oral QST measures showed fair to excellent interexaminer (9-12 of 13 measures) and test-retest (7-11 of 13 measures) reliability. Furthermore, no robust differences in reliability measures or within-session variability (CV) were detected between patients with AO and the healthy reference group. These reliability results in chronic orofacial pain patients support earlier suggestions based on data from healthy subjects that intra-oral QST is sufficiently reliable for use as a part of a comprehensive evaluation of patients with somatosensory disturbances or neuropathic pain in the trigeminal region. © 2014 John Wiley & Sons Ltd.
A Pilot Study of the Snap & Sniff Threshold Test.
Jiang, Rong-San; Liang, Kai-Li
2018-05-01
The Snap & Sniff® Threshold Test (S&S) has been recently developed to determine the olfactory threshold. The aim of this study was to further evaluate the validity and test-retest reliability of the S&S. The olfactory thresholds of 120 participants were determined using both the Smell Threshold Test (STT) and the S&S. The participants included 30 normosmic volunteers and 90 patients (60 hyposmic, 30 anosmic). The normosmic participants were retested using the STT and S&S at an intertest interval of at least 1 day. The mean olfactory threshold determined with the S&S was -6.76 for the normosmic participants, -3.79 for the hyposmic patients, and -2 for the anosmic patients. The olfactory thresholds were significantly different across the 3 groups (P < .001). Snap & Sniff-based and STT-based olfactory thresholds were correlated weakly in the normosmic group (correlation coefficient = 0.162, P = .391) but more strongly correlated in the patient groups (hyposmic: correlation coefficient = 0.376, P = .003; anosmic: correlation coefficient = 1.0). The test-retest correlation for the S&S-based olfactory thresholds was 0.384 (P = .036). Based on validity and test-retest reliability, we concluded that the S&S is a proper test for olfactory thresholds.
González-Gil, E M; Mouratidou, T; Cardon, G; Androutsos, O; De Bourdeaudhuij, I; Góźdź, M; Usheva, N; Birnbaum, J; Manios, Y; Moreno, L A
2014-08-01
Reliable assessments of health-related behaviours are necessary for accurate evaluation on the efficiency of public health interventions. The aim of the current study was to examine the reliability of a self-administered primary caregivers questionnaire (PCQ) used in the ToyBox-intervention. The questionnaire consisted of six sections addressing sociodemographic and perinatal factors, water and beverages consumption, physical activity, snacking and sedentary behaviours. Parents/caregivers from six countries (Belgium, Bulgaria, Germany, Greece, Poland and Spain) were asked to complete the questionnaire twice within a 2-week interval. A total of 93 questionnaires were collected. Test-retest reliability was assessed using intra-class correlation coefficient (ICC). Reliability of the six questionnaire sections was assessed. A stronger agreement was observed in the questions addressing sociodemographic and perinatal factors as opposed to questions addressing behaviours. Findings showed that 92% of the ToyBox PCQ had a moderate-to-excellent test-retest reliability (defined as ICC values from 0.41 to 1) and less than 8% poor test-retest reliability (ICC < 0.40). Out of the total ICC values, 67% showed good-to-excellent reliability (ICC from 0.61 to 1). We conclude that the PCQ is a reliable tool to assess sociodemographic characteristics, perinatal factors and lifestyle behaviours of pre-school children and their families participating in the ToyBox-intervention. © 2014 World Obesity.
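The ICC bands used in the abstract above can be written as a small lookup. This is only a sketch of the cut-offs as quoted (poor below 0.40, moderate from 0.41, good-to-excellent from 0.61); how values falling exactly between the quoted bounds are assigned is an assumption here, and `classify_icc` is a hypothetical helper name.

```python
def classify_icc(icc: float) -> str:
    """Map an ICC to the reliability bands quoted in the abstract.

    Assumption: values between the quoted bounds (e.g. exactly 0.40)
    are assigned to the lower band.
    """
    if icc < 0.41:
        return "poor"
    if icc < 0.61:
        return "moderate"
    return "good-to-excellent"


for value in (0.36, 0.50, 0.93):
    print(value, classify_icc(value))
```

Under these cut-offs an ICC of 0.36 is classed as poor, 0.50 as moderate, and 0.93 as good-to-excellent.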
Test-retest reliability of cognitive EEG
NASA Technical Reports Server (NTRS)
McEvoy, L. K.; Smith, M. E.; Gevins, A.
2000-01-01
OBJECTIVE: Task-related EEG is sensitive to changes in cognitive state produced by increased task difficulty and by transient impairment. If task-related EEG has high test-retest reliability, it could be used as part of a clinical test to assess changes in cognitive function. The aim of this study was to determine the reliability of the EEG recorded during the performance of a working memory (WM) task and a psychomotor vigilance task (PVT). METHODS: EEG was recorded while subjects rested quietly and while they performed the tasks. Within session (test-retest interval of approximately 1 h) and between session (test-retest interval of approximately 7 days) reliability was calculated for four EEG components: frontal midline theta at Fz, posterior theta at Pz, and slow and fast alpha at Pz. RESULTS: Task-related EEG was highly reliable within and between sessions (r > 0.9 for all components in WM task, and r > 0.8 for all components in the PVT). Resting EEG also showed high reliability, although the magnitude of the correlation was somewhat smaller than that of the task-related EEG (r > 0.7 for all 4 components). CONCLUSIONS: These results suggest that under appropriate conditions, task-related EEG has sufficient retest reliability for use in assessing clinical changes in cognitive status.
Reliability of the detailed assessment of speed of handwriting on Flemish children.
Simons, Johan; Probst, Michel
2014-01-01
This study evaluates the reliability of the Detailed Assessment of Speed of Handwriting (DASH) in a Dutch-speaking sample of children. The sample included 650 boys and 513 girls (age range = 9-16 years). Handwriting speed measurements were obtained using the DASH. Interrater agreement, test-retest reliability, and internal consistency were calculated; gender and age effects were analyzed. Interrater agreement shows excellent reliability with intraclass correlation coefficients of at least 0.94. Test-retest correlations ranged from r = 0.65 to r = 0.81. The internal consistency measures, calculated with Cronbach's alpha, were between 0.88 and 0.94. Both gender and age have a significant effect on handwriting speed, with F(7, 1144) = 17.43 (P < .001) for gender and F(7, 1144) = 21.8 (P < .001) for age. The DASH is a reliable assessment tool to evaluate handwriting speed of Dutch-speaking children. There is a tendency of girls to write faster than boys.
Hoyer, Erik H; Young, Daniel L; Klein, Lisa M; Kreif, Julie; Shumock, Kara; Hiser, Stephanie; Friedman, Michael; Lavezza, Annette; Jette, Alan; Chan, Kitty S; Needham, Dale M
2018-02-01
The lack of common language among interprofessional inpatient clinical teams is an important barrier to achieving inpatient mobilization. In The Johns Hopkins Hospital, the Activity Measure for Post-Acute Care (AM-PAC) Inpatient Mobility Short Form (IMSF), also called "6-Clicks," and the Johns Hopkins Highest Level of Mobility (JH-HLM) are part of routine clinical practice. The measurement characteristics of these tools when used by both nurses and physical therapists for interprofessional communication or assessment are unknown. The purposes of this study were to evaluate the reliability and minimal detectable change of AM-PAC IMSF and JH-HLM when completed by nurses and physical therapists and to evaluate the construct validity of both measures when used by nurses. A prospective evaluation of a convenience sample was used. The test-retest reliability and the interrater reliability of AM-PAC IMSF and JH-HLM for inpatients in the neuroscience department (n = 118) of an academic medical center were evaluated. Each participant was independently scored twice by a team of 2 nurses and 1 physical therapist; a total of 4 physical therapists and 8 nurses participated in reliability testing. In a separate inpatient study protocol (n = 69), construct validity was evaluated via an assessment of convergent validity with other measures of function (grip strength, Katz Activities of Daily Living Scale, 2-minute walk test, 5-times sit-to-stand test) used by 5 nurses. The test-retest reliability values (intraclass correlation coefficients) for physical therapists and nurses were 0.91 and 0.97, respectively, for AM-PAC IMSF and 0.94 and 0.95, respectively, for JH-HLM. The interrater reliability values (intraclass correlation coefficients) between physical therapists and nurses were 0.96 for AM-PAC IMSF and 0.99 for JH-HLM. 
Construct validity (Spearman correlations) ranged from 0.25 between JH-HLM and right-hand grip strength to 0.80 between AM-PAC IMSF and the Katz Activities of Daily Living Scale. The results were obtained from inpatients in the neuroscience department of a single hospital. The AM-PAC IMSF and JH-HLM had excellent interrater reliability and test-retest reliability for both physical therapists and nurses. The evaluation of convergent validity suggested that AM-PAC IMSF and JH-HLM measured constructs of patient mobility and physical functioning. © 2017 American Physical Therapy Association
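The abstract above reports intraclass correlation coefficients without naming the ICC form. As a rough sketch of how such a coefficient can be computed, the following assumes the two-way random-effects, absolute-agreement, single-measure form, ICC(2,1) in the Shrout and Fleiss notation, and uses small hypothetical score tables rather than study data. The function name `icc_2_1` is illustrative.

```python
def icc_2_1(scores):
    """ICC(2,1) from a subjects-by-raters (or subjects-by-occasions) table.

    scores: list of rows, one row per subject, one column per rater.
    """
    n = len(scores)          # number of subjects
    k = len(scores[0])       # number of raters or occasions
    grand_mean = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares.
    ss_rows = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand_mean) ** 2 for m in col_means)
    ss_total = sum((x - grand_mean) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)                # between-subjects mean square
    ms_cols = ss_cols / (k - 1)                # between-raters mean square
    ms_error = ss_error / ((n - 1) * (k - 1))  # residual mean square

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical data: identical ratings, and ratings with a constant offset.
identical = [[1, 1], [2, 2], [3, 3]]
shifted = [[1, 2], [2, 3], [3, 4]]
print(round(icc_2_1(identical), 2), round(icc_2_1(shifted), 2))
```

The second table shows why the choice of form matters: the two raters rank the subjects identically, yet a constant one-point offset pulls the absolute-agreement ICC down to about 0.67, whereas a consistency-type ICC would ignore the offset.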
Reliability and Validity of the Evidence-Based Practice Confidence (EPIC) Scale
ERIC Educational Resources Information Center
Salbach, Nancy M.; Jaglal, Susan B.; Williams, Jack I.
2013-01-01
Introduction: The reliability, minimal detectable change (MDC), and construct validity of the evidence-based practice confidence (EPIC) scale were evaluated among physical therapists (PTs) in clinical practice. Methods: A longitudinal mail survey was conducted. Internal consistency and test-retest reliability were estimated using Cronbach's alpha…
Vijaya, Gopalan; Cartwright, Rufus; Bhide, Alka; Derpapas, Alexandros; Fernando, Ruwan; Khullar, Vik
2016-11-01
The validity and reliability of measurement of urinary NGF as a diagnostic biomarker in women with lower urinary tract dysfunction (LUTD) are uncertain. We aimed to evaluate both the diagnostic and discriminant validity, and the test-retest reliability of urinary NGF measurement in women with LUTD. Urinary NGF was measured in women with LUTD (n = 205) and asymptomatic subjects (n = 31). Urinary NGF was assayed using an ELISA method and normalized against urinary creatinine. NGF/creatinine ratios were compared between symptom subgroups using the Mann-Whitney U test, and between different urodynamic diagnoses using the Kruskal-Wallis test. Receiver Operator Characteristic (ROC) analysis was employed to evaluate the diagnostic performance of urinary NGF. Test-retest reliability of NGF measurement was assessed using intra-class correlation (ICC). Urinary NGF was significantly but non-specifically increased in symptomatic patients when compared to controls (13.33 vs. 2.05 ng NGF/g Cr, P < 0.001). On multivariate logistic regression, NGF was a good predictor of whether patients had OAB; however, the adjusted odds ratio was only 1.006. ROC analysis demonstrated poor discriminant ability between different symptomatic groups and urodynamic groups. Using a cut-off of 13.0 ng NGF/g creatinine, the test provides a sensitivity of 81%, but a specificity of only 39% for overactive bladder. The assays demonstrated good test-retest reliability with an ICC of 0.889. Although urinary NGF can be reliably assayed, and is increased in various LUTDs, it discriminates poorly between these disorders and therefore has very limited potential as a biomarker. Neurourol. Urodynam. 35:944-948, 2016. © 2015 Wiley Periodicals, Inc.
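As a rough illustration of why a sensitivity of 81% paired with a specificity of 39% indicates weak discrimination, the reported pair can be converted into likelihood ratios. The abstract does not report likelihood ratios; the function name below is hypothetical and the calculation is only the standard textbook definition applied to the two published figures.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a dichotomous test.

    LR+ = sensitivity / (1 - specificity)
    LR- = (1 - sensitivity) / specificity
    """
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg


# Sensitivity and specificity as reported for the 13.0 ng NGF/g creatinine cut-off.
lr_pos, lr_neg = likelihood_ratios(0.81, 0.39)
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")
```

Both ratios lie close to 1, which is in line with the authors' conclusion that the assay separates the diagnostic groups poorly.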
Reliability and Validity of the Turkish Version of the Voice-Related Quality of Life Measure.
Tezcaner, Zahide Çiler; Aksoy, Songül
2017-03-01
This study aims to test the validity and reliability of the Turkish version of the Voice-Related Quality of Life (V-RQOL) questionnaire. This is a nonrandomized, prospective study with a control group. The questionnaire was administered to 249 individuals (130 with vocal complaint and 119 without) with a mean age of 37.8 ± 12.3 years. The Turkish version of the Voice Handicap Index (VHI) and perceptual voice evaluation measures were also administered at 2-14 days for retest reliability. The instrument was submitted to validity and reliability evaluation. The V-RQOL measure showed strong internal consistency and test-retest reliability; the Cronbach's alpha coefficient was 0.969 for the overall V-RQOL, 0.949 for the physical functioning domain, and 0.940 for the social-emotional domain. In the test-retest reliability test, the overall V-RQOL was found to be 0.989. The construct validity of the V-RQOL was determined based on the strength and direction of its relation to the VHI and the perceptual voice evaluation measure. The higher the VHI level, the lower the physical functioning, social-emotional, and overall score levels of the V-RQOL (r = -0.927, r = -0.912, r = -0.944, respectively; P < 0.001). Following the perceptual voice self-assessment, a statistically significant difference was found between the V-RQOL scores of individuals who defined their voices as good, very good, and perfect, and those who defined their voices as bad and very bad (P < 0.001). The results suggest that the Turkish version of the V-RQOL measure has reliability and validity and may play a crucial role in evaluating Turkish-speaking patients with voice disorders. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
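For readers unfamiliar with the internal-consistency statistic quoted above, the following is a minimal sketch of how Cronbach's alpha is conventionally computed from item-level scores. The item scores are hypothetical illustration data, not V-RQOL responses, and the function name is an assumption made for this example.

```python
import statistics


def cronbach_alpha(items):
    """Cronbach's alpha for a list of k items.

    `items` is a list of k lists; each inner list holds one score per respondent.
    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # total score per respondent
    item_variance_sum = sum(statistics.variance(item) for item in items)
    return k / (k - 1) * (1.0 - item_variance_sum / statistics.variance(totals))


# Hypothetical scores for 3 items answered by 5 respondents.
hypothetical_items = [
    [2, 3, 4, 4, 5],
    [1, 3, 3, 4, 5],
    [2, 2, 4, 5, 5],
]
alpha = cronbach_alpha(hypothetical_items)
print(f"Cronbach's alpha = {alpha:.3f}")
```

Alpha approaches 1 when the items vary together across respondents, which is the pattern that values such as 0.969 reflect.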
Hamre, Charlotta; Botolfsen, Pernille; Tangen, Gro Gujord; Helbostad, Jorunn L
2017-04-20
The Balance Evaluation Systems Test (BESTest) was developed to assess underlying systems for balance control in order to be able to individually tailor rehabilitation interventions to people with balance disorders. A short form, the Mini-BESTest, was developed as a screening test. The study aimed to assess interrater and test-retest reliability of the Norwegian version of the BESTest and the Mini-BESTest in community-dwelling people with increased risk of falling and to assess concurrent validity with the Fall Efficacy Scale-International (FES-I). It was an observational study with a cross-sectional design. Forty-two persons with increased risk of falling (elderly over 65 years of age, persons with a history of stroke or Multiple Sclerosis) were assessed twice by two raters. Relative reliability was analysed with the Intraclass Correlation Coefficient (ICC), and absolute reliability with the standard error of measurement (SEM) and smallest detectable change (SDC). Concurrent validity was assessed against the FES-I using Spearman's rho. The BESTest showed very good interrater reliability (ICC = 0.98, SEM = 1.79, SDC95 = 5.0) and test-retest reliability (rater A/rater B: ICC = 0.89/0.89, SEM = 3.9/4.3, SDC95 = 10.8/11.8). The Mini-BESTest also showed very good interrater reliability (ICC = 0.95, SEM = 1.19, SDC95 = 3.3) and test-retest reliability (rater A/rater B: ICC = 0.85/0.84, SEM = 1.8/1.9, SDC95 = 4.9/5.2). The correlations were moderate between the FES-I and both the BESTest and the Mini-BESTest (Spearman's rho -0.51 and -0.50, p < 0.01). The BESTest and its short form, the Mini-BESTest, showed very good interrater and test-retest reliability when assessed in a heterogeneous sample of people with increased risk of falling. The concurrent validity measured against the FES-I showed moderate correlation. The results are comparable with earlier studies and indicate that the Norwegian versions can be used in daily clinic and in research.
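The abstract does not state how the smallest detectable change was derived from the SEM. A minimal sketch, assuming the conventional relation SDC95 = 1.96 × √2 × SEM, reproduces the reported values to within rounding of the published SEMs. The function name and the dictionary labels are hypothetical.

```python
import math


def sdc95(sem):
    """Smallest detectable change at the 95% level, assuming SDC95 = 1.96 * sqrt(2) * SEM."""
    return 1.96 * math.sqrt(2.0) * sem


# SEM values as reported in the abstract; labels are for display only.
reported_sem = {
    "BESTest interrater": 1.79,
    "BESTest test-retest, rater A": 3.9,
    "BESTest test-retest, rater B": 4.3,
    "Mini-BESTest interrater": 1.19,
    "Mini-BESTest test-retest, rater A": 1.8,
    "Mini-BESTest test-retest, rater B": 1.9,
}

for label, sem in reported_sem.items():
    print(f"{label}: SEM = {sem}, SDC95 = {sdc95(sem):.2f}")
```

Small discrepancies against the published figures (for example 11.9 here versus 11.8 reported for rater B) are expected, because the abstract gives the SEM rounded to one decimal place.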
van der Ploeg, Hidde P; Streppel, Kitty R M; van der Beek, Allard J; van der Woude, Luc H V; Vollenbroek-Hutten, Miriam; van Mechelen, Willem
2007-01-01
The objective was to determine the test-retest reliability and criterion validity of the Physical Activity Scale for Individuals with Physical Disabilities (PASIPD). Forty-five non-wheelchair dependent subjects were recruited from three Dutch rehabilitation centers. Subjects' diagnoses were: stroke, spinal cord injury, whiplash, and neurological-, orthopedic- or back disorders. The PASIPD is a 7-d recall physical activity questionnaire that was completed twice, 1 wk apart. During this week, physical activity was also measured with an Actigraph accelerometer. The test-retest reliability Spearman correlation of the PASIPD was 0.77. The criterion validity Spearman correlation was 0.30 when compared to the accelerometer. The PASIPD had test-retest reliability and criterion validity that is comparable to well established self-report physical activity questionnaires from the general population.
Test-retest reliability of the multifocal photopic negative response.
Van Alstine, Anthony W; Viswanathan, Suresh
2017-02-01
To assess the test-retest reliability of the multifocal photopic negative response (mfPhNR) of normal human subjects. Multifocal electroretinograms were recorded from one eye of 61 healthy adult subjects on two separate days using a Visual Evoked Response Imaging System software version 4.3 (EDI, San Mateo, California). The visual stimulus delivered on a 75-Hz monitor consisted of seven equal-sized hexagons each subtending 12° of visual angle. The m-step exponent was 9, and the m-sequence was slowed to include at least 30 blank frames after each flash. Only the first slice of the first-order kernel was analyzed. The mfPhNR amplitude was measured at a fixed time in the trough from baseline (BT) as well as at the same fixed time in the trough from the preceding b-wave peak (PT). Additionally, we also analyzed BT normalized either to PT (BT/PT) or to the b-wave amplitude (BT/b-wave). The relative reliability of test-retest differences for each test location was estimated by the Wilcoxon matched-pair signed-rank test and intraclass correlation coefficients (ICC). Absolute test-retest reliability was estimated by Bland-Altman analysis. The test-retest amplitude differences for neither of the two measurement techniques were statistically significant as determined by Wilcoxon matched-pair signed-rank test. PT measurements showed greater ICC values than BT amplitude measurements for all test locations. For each measurement technique, the ICC value of the macular response was greater than that of the surrounding locations. The mean test-retest difference was close to zero for both techniques at each of the test locations, and while the coefficient of reliability (COR: 1.96 times the standard deviation of the test-retest difference) was comparable for the two techniques at each test location when expressed in nanovolts, the %COR (COR normalized to the mean test and retest amplitudes) was superior for PT than BT measurements. 
The ICC and COR were comparable for the BT/PT and BT/b-wave ratios and were better than the ICC and COR for BT but worse than PT. mfPhNR amplitude measured at a fixed time in the trough from the preceding b-wave peak (PT) shows greater test-retest reliability when compared to amplitude measurement from baseline (BT) or BT amplitude normalized to either the PT or b-wave amplitudes.
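The COR and %COR definitions given in the abstract can be sketched as follows. The amplitudes are hypothetical values in nanovolts, not data from the study, and the function name is an assumption made for this illustration.

```python
import statistics


def coefficient_of_repeatability(test, retest):
    """Return (mean difference, COR, %COR) for paired test and retest values.

    COR  = 1.96 * SD of the test-retest differences
    %COR = COR as a percentage of the mean of all test and retest values
    """
    diffs = [second - first for first, second in zip(test, retest)]
    mean_diff = statistics.mean(diffs)
    cor = 1.96 * statistics.stdev(diffs)
    pct_cor = 100.0 * cor / statistics.mean(list(test) + list(retest))
    return mean_diff, cor, pct_cor


# Hypothetical amplitudes (nV) for four subjects on two recording days.
day1 = [100, 120, 90, 110]
day2 = [104, 116, 94, 106]
mean_diff, cor, pct_cor = coefficient_of_repeatability(day1, day2)
print(f"mean difference = {mean_diff:.2f} nV, COR = {cor:.2f} nV, %COR = {pct_cor:.1f}%")
```

Normalizing by the mean amplitude is what allows two techniques with similar COR in nanovolts to differ in %COR: the technique with the larger mean amplitude has the smaller relative variability.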
The Universal Design for Play Tool: Establishing Validity and Reliability
ERIC Educational Resources Information Center
Ruffino, Amy Goetz; Mistrett, Susan G.; Tomita, Machiko; Hajare, Poonam
2006-01-01
The Universal Design for Play (UDP) Tool is an instrument designed to evaluate the presence of universal design (UD) features in toys. This study evaluated its psychometric properties, including content validity, construct validity, and test-retest reliability. The UDP tool was designed to assist in selecting toys most appropriate for children…
Sureda, Xisca; Espelt, Albert; Villalbí, Joan R; Cebrecos, Alba; Baranda, Lucía; Pearce, Jamie; Franco, Manuel
2017-01-01
Objectives: To describe the development and test–retest reliability of OHCITIES, an instrument characterising alcohol urban environment in terms of availability, promotion and signs of consumption. Design: This study involved: (1) developing the conceptual framework for alcohol urban environment by means of literature review and previous alcohol environment research experience; (2) pilot testing and redesigning the instrument; (3) instrument digitalisation; (4) instrument evaluation using test–retest reliability. Setting: Data for testing the reliability of the instrument were collected in seven census sections in Madrid in 2016 by two observers. Primary and secondary outcome measures: We computed per cent agreement and Cohen’s kappa coefficients to estimate inter-rater and test–retest reliability for alcohol outlet environment measures. We calculated intraclass correlation coefficients and their 95% CIs to provide a measure of inter-rater reliability for signs of alcohol consumption measures. Results: We collected information on 92 on-premise and 24 off-premise alcohol outlets identified in the studied areas about availability, accessibility and promotion of alcohol. Most per cent-agreement values for alcohol measures in on-premise and off-premise alcohol outlets were greater than 80%, and inter-rater and test–retest reliability values were generally above 0.80. Observers identified 26 streets and 3 public squares with signs of alcohol consumption. Intraclass correlation coefficient between observers for any type of signs of alcohol consumption was 0.50 (95% CI −0.09 to 0.77). Few items promoting alcohol unrelated to alcohol outlets were found on public spaces. Conclusions: The OHCITIES instrument is a reliable instrument to characterise alcohol urban environment. 
This instrument might be used to understand how alcohol environment associates with alcohol behaviours and its related health outcomes, and can help in the design and evaluation of policies to reduce the harm caused by alcohol. PMID:28982829
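The two agreement statistics named in the abstract, per cent agreement and Cohen's kappa, can be sketched as below. The observer codings are hypothetical (1 = feature present at an outlet, 0 = absent), and the function names are assumptions made for this example.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Percentage of items on which the two raters gave the same code."""
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100.0 * matches / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: (p_observed - p_expected) / (1 - p_expected)."""
    n = len(rater_a)
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    categories = set(rater_a) | set(rater_b)
    # Chance agreement from the product of each rater's marginal proportions.
    p_expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical codings of ten outlets by two observers.
observer_1 = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
observer_2 = [1, 1, 0, 1, 0, 1, 0, 0, 1, 1]
agreement = percent_agreement(observer_1, observer_2)
kappa = cohens_kappa(observer_1, observer_2)
print(f"agreement = {agreement:.0f}%, kappa = {kappa:.2f}")
```

Kappa is lower than raw agreement because it discounts the matches that would be expected by chance given each observer's marginal frequencies.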
Pfau, Maximilian; Lindner, Moritz; Müller, Philipp L; Birtel, Johannes; Finger, Robert P; Harmening, Wolf M; Fleckenstein, Monika; Holz, Frank G; Schmitz-Valckenberg, Steffen
2017-05-01
To determine the effective dynamic range (EDR), retest reliability, and number of discriminable steps (DS) for mesopic and dark-adapted two-color fundus-controlled perimetry (FCP) using the S-MAIA (Scotopic-Macular Integrity Assessment) "micro-perimeter." In this prospective cross-sectional study, each of the 52 eyes of 52 subjects with various macular diseases (mean age 62.0 ± 16.9 years; range, 19.1-90.1 years) underwent duplicate mesopic (achromatic stimuli, 400-800 nm), dark-adapted cyan (505 nm), and dark-adapted red (627 nm) FCP using a grid of 61 stimuli covering 18° of the central retina. The EDR, the number of DS, and the retest reliability for point-wise sensitivity (PWS) were analyzed. The effects of fixation stability, sensitivity, and age on retest reliability were examined using mixed-effects models. The EDR was 10 to 30 dB with five DS for mesopic and 4 to 17 dB with four DS for dark-adapted cyan and red testing. PWS retest reliability was good among all three types of retinal sensitivity assessments (coefficient of repeatability ±5.79, ±4.72, and ±4.77 dB, respectively) and did not depend on fixation stability or age. PWS had no effect on retest variability in dark-adapted cyan and dark-adapted red testing but had a minor effect in mesopic testing. Combined mesopic and dark-adapted two-color FCP allows for reliable topographic testing of cone and rod function in patients with various macular diseases with and without foveal fixation. Retest reliability is homogeneous across eccentricities and various degrees of scotoma depth, including zones at risk for disease progression. These reliability estimates can serve for the design of future clinical trials.
Test-retest reliability and practice effects of the Wechsler Memory Scale-III.
Lo, Ada H Y; Humphreys, Michael; Byrne, Gerard J; Pachana, Nancy A
2012-09-01
Although serial administration of cognitive tests is increasingly common, there is a paucity of research on test-retest reliabilities and practice effects, both of which are important for evaluating changes in functioning. Reliability is generally conceptualized as involving short-lasting changes in performance. However, when repeated testing occurs over a period of years, there will be some longer lasting effects. The implications of these longer lasting effects and practice effects on reliability were examined in the context of repeated administrations of the Wechsler Memory Scale-III in 339 community-dwelling women aged 40-79 years over 2 to 7 years. The results showed that Logical Memory and Verbal Paired Associates subtests were consistently the most reliable subtests across the age cohorts. The magnitude of practice effects varied as a function of subtests and age. The largest practice effects were found in the youngest age cohort, especially on the Faces, Logical Memory, and Verbal Paired Associates subtests. ©2012 The British Psychological Society.
Zwerver, Johannes; Kramer, Tamara; van den Akker-Scheek, Inge
2009-08-11
The VISA-P questionnaire evaluates severity of symptoms, knee function and ability to play sports in athletes with patellar tendinopathy. This English-language self-administered brief patient outcome score was developed in Australia to monitor rehabilitation and to evaluate outcome of clinical studies. The aim of this study was to translate the questionnaire into Dutch and to study the reliability and validity of the Dutch version of the VISA-P. The questionnaire was translated into Dutch according to internationally recommended guidelines. Test-retest reliability was determined in 99 students with a time interval of 2.5 weeks. To determine discriminative validity of the Dutch VISA-P, 18 healthy students, 15 competitive volleyball players (at-risk population), 14 patients with patellar tendinopathy, 6 patients who had surgery for patellar tendinopathy, 17 patients with knee injuries other than patellar tendinopathy, and 9 patients with symptoms unrelated to their knees completed the Dutch VISA-P. The Dutch VISA-P questionnaire showed satisfactory test-retest reliability (ICC = 0.74). The mean (± SD) VISA-P scores were 95 (± 9) for the healthy students, 89 (± 11) for the volleyball players, 58 (± 19) for patients with patellar tendinopathy, and 56 (± 21) for athletes who had surgery for patellar tendinopathy. Patients with other knee injuries or symptoms unrelated to the knee scored 62 (± 24) and 77 (± 24), respectively. The translated Dutch version of the VISA-P questionnaire is equivalent to its original version, has satisfactory test-retest reliability and is a valid score to evaluate symptoms, knee function and ability to play sports of Dutch athletes with patellar tendinopathy.
Løchting, Ida; Grotle, Margreth; Storheim, Kjersti; Werner, Erik L; Garratt, Andrew M
2014-09-01
To evaluate the reliability and validity of the improved version of the Patient Generated Index (PGI) in patients with low back pain. The PGI was administered to 90 patients attending care in 1 of 6 institutions in Norway and evaluated for reliability and validity. The questionnaire was given out to 61 patients for re-test purposes. The PGI was completed correctly by 80 (88.9%) patients and, of the 61 patients responding to the re-test, 50 (82.0%) completed both surveys correctly. PGI scores were approximately normally distributed, with a median of 40 (range 80), where 100 is the best possible quality of life. There were no floor or ceiling effects. The 5 most frequently listed areas affecting quality of life were pain, sleep, stiffness, socializing and housework. The test-retest intraclass correlation coefficient was 0.73. The smallest detectable changes for individual and group purposes were 32.8 and 4.6, respectively. The correlations between PGI scores and other instrument scores followed a priori hypotheses of low to moderate correlations. The PGI has evidence for reliability and validity in Norwegian patients with low back pain at the group level and may be considered for application in intervention studies when a comprehensive evaluation of quality of life is important. However, the smallest detectable change, of approximately 30 points, may be considered too large for individual purposes in clinical applications.
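The abstract does not explain how the group-level smallest detectable change was obtained from the individual-level value. A minimal sketch, assuming the common convention SDC(group) = SDC(individual) / √n and assuming n is the 50 patients who completed both surveys correctly, is consistent with the two reported figures. The function name is hypothetical.

```python
import math


def sdc_group(sdc_individual, n):
    """Group-level smallest detectable change, assuming SDC_group = SDC_individual / sqrt(n)."""
    return sdc_individual / math.sqrt(n)


# 32.8 is the reported individual-level SDC; n = 50 is an assumption (patients
# who completed both the test and the re-test correctly).
group_value = sdc_group(32.8, 50)
print(f"SDC (group) = {group_value:.1f}")
```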
JCQ scale reliability and responsiveness to changes in manufacturing process.
d'Errico, Angelo; Punnett, Laura; Gold, Judith E; Gore, Rebecca
2008-02-01
The job content questionnaire (JCQ) was administered to automobile manufacturing workers in two interviews, 5 years apart. Between the two interviews, the company introduced substantial changes in production technology in some production areas. The aims were: (1) to describe the impact of these changes on self-reported psychosocial exposures, and (2) to examine test-retest reliability of the JCQ scales, taking into account changes in job assignment and, for a subset of workers, physical ergonomic exposures as assessed through field observations. The study population included 790 subjects at the first and 519 at the second interview, of whom 387 were present in both. Differences in demand and control scores between interviews were analyzed by Wilcoxon matched-pairs signed-rank test. Test-retest reliability of these scales was evaluated by the intraclass correlation coefficient (ICC) and the Spearman's rho coefficient. The introduction of more automated technology produced an overall increase in job control but did not decrease psychological demand. The reliability of the control scale was low overall but increased to an acceptable level among workers who had not changed job. The demand scale had high reliability only among workers whose physical ergonomic exposures were similar on both survey occasions. These results show that 5-year test-retest reliability of self-reported psychosocial exposures is adequate among workers whose job assignment and ergonomic exposures have remained stable over time.
Inter-Rater and Test-Retest Reliability of the Beery VMI in Schoolchildren
Harvey, Erin M.; Leonard-Green, Tina K.; Mohan, Kathleen M.; Kulp, Marjean Taylor; Davis, Amy L.; Miller, Joseph M.; Twelker, J. Daniel; Campus, Irene; Dennis, Leslie K.
2017-01-01
Purpose: To assess inter-rater and test-retest reliability of the 6th Edition Beery-Buktenica Developmental Test of Visual-Motor Integration (VMI) and test-retest reliability of the VMI Visual Perception Supplemental Test (VMIp) in school-age children. Methods: Subjects were 163 Native American 3rd – 8th grade students with no significant refractive error (astigmatism < 1.00 D, myopia < 0.75 D, hyperopia < 2.50 D, anisometropia < 1.50 D) or ocular abnormalities. The VMI and VMIp were administered twice, on separate days. All VMI tests were scored by two trained scorers and a subset of 50 tests were also scored by an experienced scorer. Scorers strictly applied objective scoring criteria. Analyses included inter-rater and test-retest assessments of bias, 95% limits of agreement (LOA), and intraclass correlation analysis. Results: Trained scorers had no significant scoring bias compared to the experienced scorer. One of the two trained scorers tended to provide higher scores than the other (mean difference in standardized scores = 1.54). Inter-rater correlations were strong (0.75 to 0.88). VMI and VMIp test-retest comparisons indicated no significant bias (subjects did not tend to score better on retest). Test-retest correlations were moderate (0.54 to 0.58). The 95% LOAs for the VMI were −24.14 to 24.67 (scorer 1) and −26.06 to 26.58 (scorer 2) and the 95% LOAs for the VMIp were −27.11 to 27.34. Conclusions: The 95% LOA for test-retest differences will be useful for determining if the VMI and VMIp have sufficient sensitivity for detecting change with treatment in both clinical and research settings. Further research on test-retest reliability reporting 95% LOAs for children across different age ranges is recommended, particularly if the test is to be used to detect changes due to intervention or treatment. PMID:28422801
Isokinetic Strength and Endurance Tests used Pre- and Post-Spaceflight: Test-Retest Reliability
NASA Technical Reports Server (NTRS)
Laughlin, Mitzi S.; Lee, Stuart M. C.; Loehr, James A.; Amonette, William E.
2009-01-01
To assess changes in muscular strength and endurance after microgravity exposure, NASA measures isokinetic strength and endurance across multiple sessions before and after long-duration space flight. Accurate interpretation of pre- and post-flight measures depends upon the reliability of each measure. The purpose of this study was to evaluate the test-retest reliability of the NASA International Space Station (ISS) isokinetic protocol. Twenty-four healthy subjects (12 M/12 F, 32.0 ± 5.6 years) volunteered to participate. Isokinetic knee, ankle, and trunk flexion and extension strength as well as endurance of the knee flexors and extensors were measured using a Cybex NORM isokinetic dynamometer. The first weekly session was considered a familiarization session. Data were collected and analyzed for weeks 2-4. Repeated measures analysis of variance (alpha = 0.05) was used to identify weekly differences in isokinetic measures. Test-retest reliability was evaluated by intraclass correlation coefficients, ICC(3,1). No significant differences were found between weeks in any of the strength measures, and the reliability of the strength measures was considered excellent in every case (ICC > 0.9) except concentric ankle dorsi-flexion (ICC = 0.67). Although a significant difference was noted in weekly endurance measures of knee extension (p < 0.01), the reliability of the endurance measures by week was considered excellent for knee flexion (ICC = 0.97) and knee extension (ICC = 0.96). Except for concentric ankle dorsi-flexion, the isokinetic strength and endurance measures are highly reliable when following the NASA ISS protocol. This protocol should allow accurate interpretation of isokinetic data even with a small number of crew members.
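The ICC(3,1) named in the abstract is the two-way mixed, single-measurement, consistency form. A minimal sketch of the usual mean-squares calculation follows; the score matrix is a small hypothetical data set (rows are subjects, columns are sessions), not data from the NASA protocol, and the function name is an assumption.

```python
def icc_3_1(scores):
    """ICC(3,1) from a subjects-by-sessions table.

    ICC(3,1) = (MS_rows - MS_error) / (MS_rows + (k - 1) * MS_error),
    where MS_rows is the between-subjects mean square and MS_error the
    residual mean square of a two-way ANOVA without replication.
    """
    n = len(scores)      # number of subjects
    k = len(scores[0])   # number of sessions per subject
    grand_mean = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    ss_rows = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand_mean) ** 2 for m in col_means)
    ss_total = sum((x - grand_mean) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)


# Hypothetical scores: 6 subjects measured in 4 sessions.
hypothetical_scores = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
icc = icc_3_1(hypothetical_scores)
print(f"ICC(3,1) = {icc:.2f}")
```

Because session effects are removed from the error term, this form can remain high even when session means differ, which may be why knee-extension endurance showed a significant weekly difference alongside an ICC of 0.96.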
Oyeyemi, Adewale L; Sallis, James F; Oyeyemi, Adetoyeje Y; Amin, Mariam M; De Bourdeaudhuij, Ilse; Deforche, Benedicte
2013-11-01
This study adapted the Physical Activity Neighborhood Environment Scale (PANES) to the Nigerian context and assessed the test-retest reliability and construct validity of the Nigerian version (PANES-N). A multidisciplinary panel of experts adapted the original PANES to reflect the built and social environment of Nigeria. The adapted PANES was subjected to cognitive testing and test-retest reliability assessment in a diverse sample of Nigerian adults (N = 132) from different neighborhood types. Intraclass Correlation Coefficients (ICC) were used to assess test-retest reliability, and construct validity was investigated with Analysis of Covariance for differences in environmental attributes between neighborhoods. Four of the 17 items on the original PANES were significantly modified, 3 were removed and 2 new items were incorporated into the final version of the adapted PANES-N. Test-retest reliability was substantial to almost perfect (ICC = 0.62-1.00) for all items on the PANES-N, and residents of neighborhoods in the inner city reported higher residential density, land use mix and safety, but lower pedestrian facilities and aesthetics than did residents of government reserved area/new layout neighborhoods. The PANES-N appears promising for assessing environmental perceptions related to physical activity in Nigeria, but further testing is required to assess its applicability across Africa.
Hallgren, Kevin A.; Greenfield, Brenna L.; Ladd, Benjamin O.
2016-01-01
Background: Behavioral economic theories of drinking posit that the reinforcing value of engaging in activities with versus without alcohol influences drinking behavior. Measures of the reinforcement value of drugs and alcohol have been used in previous research, but little work has examined the psychometric properties of these measures. Objectives: The present study aims to evaluate the factor structure, test-retest reliability, and concurrent validity of an alcohol-only version of the Adolescent Reinforcement Survey Schedule (ARSS-AUV). Methods: A sample of 157 college student drinkers completed the ARSS-AUV at two time points 2–3 days apart. Test-retest reliability, hierarchical factor analysis, and correlations with other drinking measures were examined. Results: Single, unidimensional general factors accounted for a majority of the variance in alcohol and alcohol-free reinforcement items. Residual factors emerged that typically represented alcohol or alcohol-free reinforcement while doing activities with friends, romantic or sexual partners, and family members. Individual ARSS-AUV items had fair-to-good test-retest reliability, while general and residual factors had excellent test-retest reliability. General alcohol reinforcement and alcohol reinforcement from friends and romantic partners were positively correlated with past-year alcohol consumption, heaviest drinking episode, and alcohol-related negative consequences. Alcohol-free reinforcement indices were unrelated to alcohol use or consequences. Conclusions/Importance: The ARSS-AUV appears to demonstrate good reliability and mixed concurrent validity among college student drinkers. The instrument may provide useful information about alcohol reinforcement from various activities and people and could provide clinically-relevant information for prevention and treatment programs. PMID:27096713
Hallgren, Kevin A; Greenfield, Brenna L; Ladd, Benjamin O
2016-06-06
Behavioral economic theories of drinking posit that the reinforcing value of engaging in activities with versus without alcohol influences drinking behavior. Measures of the reinforcement value of drugs and alcohol have been used in previous research, but little work has examined the psychometric properties of these measures. The present study aims to evaluate the factor structure, test-retest reliability, and concurrent validity of an alcohol-only version of the Adolescent Reinforcement Survey Schedule (ARSS-AUV). A sample of 157 college student drinkers completed the ARSS-AUV at two time points 2-3 days apart. Test-retest reliability, hierarchical factor analysis, and correlations with other drinking measures were examined. Single, unidimensional general factors accounted for a majority of the variance in alcohol and alcohol-free reinforcement items. Residual factors emerged that typically represented alcohol or alcohol-free reinforcement while doing activities with friends, romantic or sexual partners, and family members. Individual ARSS-AUV items had fair-to-good test-retest reliability, while general and residual factors had excellent test-retest reliability. General alcohol reinforcement and alcohol reinforcement from friends and romantic partners were positively correlated with past-year alcohol consumption, heaviest drinking episode, and alcohol-related negative consequences. Alcohol-free reinforcement indices were unrelated to alcohol use or consequences. The ARSS-AUV appears to demonstrate good reliability and mixed concurrent validity among college student drinkers. The instrument may provide useful information about alcohol reinforcement from various activities and people and could provide clinically-relevant information for prevention and treatment programs.
Test-Retest Reliability of Computerized, Everyday Memory Measures and Traditional Memory Tests.
ERIC Educational Resources Information Center
Youngjohn, James R.; And Others
Test-retest reliabilities and practice effect magnitudes were considered for nine computer-simulated tasks of everyday cognition and five traditional neuropsychological tests. The nine simulated everyday memory tests were from the Memory Assessment Clinic battery as follows: (1) simple reaction time while driving; (2) divided attention (driving…
Guo, Jing; Lau, Ajax Hong Yin; Chau, Jack; Ng, Bobby Kin Wah; Lee, Kwong Man; Qiu, Yong; Cheng, Jack Chun Yiu; Lam, Tsz Ping
2016-10-01
"Simplified Chinese" version of Spinal Appearance Questionnaire (SC-SAQ) for patients with adolescent idiopathic scoliosis (AIS) was available but did not fit for communities using "Traditional Chinese" as their primary language. We developed a traditional Chinese version of SAQ (TC-SAQ) and evaluated its reliability and validity. TC-SAQ was administered to 112 AIS patients, of which 101 bilingual (English and Chinese) patients completed E-SAQ and the traditional Chinese version of Scoliosis Research Society-22 questionnaire (TC-SRS-22). Internal consistency and test-retest reliability were evaluated. Concurrent validity was evaluated by comparing TC-SAQ score with E-SAQ score, and convergent validity by comparing TC-SAQ score with TC-SRS-22 self-image domain score, and discriminant validity by analyzing the relationship between TC-SAQ score and patients' characteristics. Internal consistency of individual TC-SAQ domain was high (Cronbach's α = 0.785 to 0.940), except for general (Cronbach's α = 0.665) and shoulders (Cronbach's α = 0.421) domain. Test-retest reliability of TC-SAQ was good (ICCs of each domain from 0.798 to 0.865). Concurrent validity demonstrated an excellent correlation between TC-SAQ and E-SAQ scores (r = 0.820 to 0.954, P < 0.0001 for all domains). Correlation between TC-SAQ domains and TC-SRS-22 self-image domain was weak to moderate. TC-SAQ total score and individual domain scores (except waist and chest domains) were positively correlated to major curve magnitude. TC-SAQ had good internal consistency and test-retest reliability. Concurrent validity evaluated against the original English version was excellent. TC-SAQ was both reliable and valid for clinical use for AIS patients using traditional Chinese as their primary language.
Validity and Reliability of General Nutrition Knowledge Questionnaire for Adults in Uganda
Bukenya, Richard; Ahmed, Abhiya; Andrade, Jeanette M.; Grigsby-Toussaint, Diana S.; Muyonga, John; Andrade, Juan E.
2017-01-01
This study sought to develop and validate a general nutrition knowledge questionnaire (GNKQ) for Ugandan adults. The initial draft consisted of 133 items on five constructs associated with nutrition knowledge: expert recommendations (16 items), food groups (70 items), selecting food (10 items), nutrition and disease relationship (23 items), and food fortification in Uganda (14 items). The questionnaire validity was evaluated in three studies. For the content validity (study 1), a panel of five content matter nutrition experts reviewed the GNKQ draft before and after face validity. For the face validity (study 2), head teachers and health workers (n = 27) completed the questionnaire before attending one of three focus groups to review the clarity of the items. For the construct validity and test-retest reliability (study 3), head teachers (n = 40) from private and public primary schools and nutrition (n = 52) and engineering (n = 49) students from Makerere University took the questionnaire twice (two weeks apart). Experts agreed (content validity index, CVI > 0.9; reliability, Gwet’s AC1 > 0.85) that all constructs were relevant to evaluate nutrition knowledge. After the focus groups, 29 items were identified as unclear, requiring major (n = 5) and minor (n = 24) reviews. The final questionnaire had acceptable internal consistency (Cronbach α > 0.95), test-retest reliability (r = 0.89), and differentiated (p < 0.001) nutrition knowledge scores between nutrition (67 ± 5) and engineering (39 ± 11) students. Only the construct on nutrition recommendations was unreliable (Cronbach α = 0.51, test-retest r = 0.55), which requires further optimization. The final questionnaire included topics on food groups (41 items), selecting food (2 items), nutrition and disease relationship (14 items), and food fortification in Uganda (22 items) and had good content, construct, and test-retest reliability to evaluate nutrition knowledge among Ugandan adults. PMID:28230779
Reliability of the xipho-pubic angle in patients with sagittal imbalance of the spine.
Langella, Francesco; Villafañe, Jorge H; Ismael, Maryem; Buric, Josip; Piazzola, Andrea; Lamartina, Claudio; Berjano, Pedro
2018-04-01
Proximal junctional kyphosis (PJK) is a frequent complication that compromises the outcomes of spinal surgery, especially for adult deformity. To date, no single risk factor or cause has been identified that explains its occurrence. The purpose of this study was to investigate the test-retest reliability of the radiologic measurements using xipho-pubic angle (XPA) for subjects undergoing surgery for sagittal misalignment of the spine. Retrospective observational cross-sectional study of prospectively collected data. Full-spine standing lateral radiographs of 50 patients who underwent surgery for fixed sagittal imbalance (preoperative and postoperative) were evaluated. Two physicians measured XPA on the 100 randomly sorted and anonymized radiographs on two occasions, one week apart (test and retest conditions). Internal consistency, reproducibility, concurrent validity, and discriminative ability of the XPA were calculated, along with inter- and intraobserver agreement. Test-retest reliability of XPA measurement was excellent for pre- (ICC=0.98; P=0.001) and post-surgical (ICC=0.86; P=0.001) radiographs of subjects with sagittal imbalance of the spine. XPA was able to discriminate between preoperative and postoperative radiographs (F=17.924, P<0.001) in patients undergoing surgery for fixed sagittal imbalance for both raters. There were significant differences between pre- vs. postoperative XPA, pelvic tilt, lumbar lordosis and sagittal vertical axis values (all P<0.001). Xipho-pubic angle had fair to excellent test-retest reliability, and it did possess validity to discriminate between preoperative and postoperative radiographs in patients undergoing surgery for fixed sagittal imbalance.
Salamon, Sarah; Santelmann, Hanno; Franklin, Jeremy; Baethge, Christopher
2018-04-01
Reliability of schizoaffective disorder (SAD) diagnoses is low in adults but unclear in children and adolescents (CAD). We estimate the test-retest reliability of SAD and its key differential diagnoses (schizophrenia, bipolar disorder, and unipolar depression). Systematic literature search of Medline, Embase, and PsycInfo for studies on test-retest reliability of SAD in CAD. Cohen's kappa was extracted from studies. We performed meta-analysis for kappa, including subgroup and sensitivity analysis (PROSPERO protocol: CRD42013006713). Out of > 4000 records screened, seven studies were included. We estimated kappa values of 0.27 [95%-CI: 0.07; 0.47] for SAD, 0.56 [0.29; 0.83] for schizophrenia, 0.64 [0.55; 0.74] for bipolar disorder, and 0.66 [0.52; 0.81] for unipolar depression. In 5/7 studies kappa of SAD was lower than that of schizophrenia; similar trends emerged for bipolar disorder (4/5) and unipolar depression (2/3). Estimates of positive agreement of SAD diagnoses supported these results. The number of studies and patients included is low. The point-estimate of the test-retest reliability of schizoaffective disorder is only fair, and lower than that of its main differential diagnoses. All kappa values under study were lower in child and adolescent samples than those reported for adults. Clinically, schizoaffective disorder should be diagnosed in strict adherence to the operationalized criteria and ought to be re-evaluated regularly. Should larger studies confirm the insufficient reliability of schizoaffective disorder in children and adolescents, the clinical value of the diagnosis is highly doubtful. Copyright © 2017. Published by Elsevier B.V.
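As an illustration of the agreement statistic pooled in the abstract above, the following minimal sketch computes Cohen's kappa for one test-retest pair of diagnostic assessments. The function name `cohen_kappa` and the ten diagnoses are hypothetical and are not taken from any of the included studies; the meta-analytic pooling itself is not shown.

```python
from collections import Counter


def cohen_kappa(first, second):
    """Cohen's kappa for two equal-length lists of categorical ratings."""
    if len(first) != len(second) or not first:
        raise ValueError("need two non-empty lists of equal length")
    n = len(first)
    # Observed agreement: share of cases given the same label both times.
    observed = sum(a == b for a, b in zip(first, second)) / n
    # Chance agreement: product of the marginal label proportions.
    c1, c2 = Counter(first), Counter(second)
    expected = sum(c1[label] * c2[label] for label in c1) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical diagnoses at test and retest (illustration only).
test = ["SAD", "SAD", "SCZ", "BD", "BD", "MDD", "SCZ", "SAD", "BD", "MDD"]
retest = ["SAD", "SCZ", "SCZ", "BD", "BD", "MDD", "SCZ", "BD", "BD", "MDD"]
kappa = cohen_kappa(test, retest)
print(f"kappa = {kappa:.2f}")
```

Kappa corrects raw agreement for the agreement expected by chance, which is why a diagnosis can show high raw agreement yet only fair kappa when one category dominates.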
Buntragulpoontawee, Montana; Phutrit, Suphatha; Tongprasert, Siam; Wongpakaran, Tinakon; Khunachiva, Jeeranan
2018-03-27
This study evaluated additional psychometric properties of the Thai version of the Disabilities of the Arm, Shoulder and Hand questionnaire (DASH-TH), including test-retest reliability, construct validity, and internal consistency, in patients with carpal tunnel syndrome. To determine construct validity, the Thai EuroQOL questionnaire (EQ-5D-5L) was also administered in order to examine convergent and divergent validity. Fifty patients completed both questionnaires. The DASH-TH showed excellent test-retest reliability (intraclass correlation coefficient = 0.811) and internal consistency (Cronbach's alpha = 0.911). The exploratory factor analysis yielded a six-factor solution while the confirmatory factor analysis denoted that the hypothesized model adequately fit the data with a comparative fit index of 0.967 and a Tucker-Lewis index of 0.964. The related subscales between the DASH-TH and the Thai EQ-5D-5L were significantly correlated, indicating the DASH-TH's convergent and discriminant validity. The DASH-TH demonstrated good reliability, internal consistency, construct validity, and multidimensionality in assessing upper extremity function in carpal tunnel syndrome patients.
Tung, Li-Chen; Yu, Wan-Hui; Lin, Gong-Hong; Yu, Tzu-Ying; Wu, Chien-Te; Tsai, Chia-Yin; Chou, Willy; Chen, Mei-Hsiang; Hsieh, Ching-Lin
2016-09-01
To develop a Tablet-based Symbol Digit Modalities Test (T-SDMT) and to examine the test-retest reliability and concurrent validity of the T-SDMT in patients with stroke. The study had two phases. In the first phase, six experts, nine college students and five outpatients participated in the development and testing of the T-SDMT. In the second phase, 52 outpatients were evaluated twice (2 weeks apart) with the T-SDMT and SDMT to examine the test-retest reliability and concurrent validity of the T-SDMT. The T-SDMT was developed via expert input and college student/patient feedback. Regarding test-retest reliability, the practise effects of the T-SDMT and SDMT were both trivial (d=0.12) but significant (p≦0.015). The improvement in the T-SDMT (4.7%) was smaller than that in the SDMT (5.6%). The minimal detectable changes (MDC%) of the T-SDMT and SDMT were 6.7 (22.8%) and 10.3 (32.8%), respectively. The T-SDMT and SDMT were highly correlated with each other at the two time points (Pearson's r=0.90-0.91). The T-SDMT demonstrated good concurrent validity with the SDMT. Because the T-SDMT had a smaller practise effect and less random measurement error (superior test-retest reliability), it is recommended over the SDMT for assessing information processing speed in patients with stroke. Implications for Rehabilitation The Symbol Digit Modalities Test (SDMT), a common measure of information processing speed, showed a substantial practise effect and considerable random measurement error in patients with stroke. The Tablet-based SDMT (T-SDMT) has been developed to reduce the practise effect and random measurement error of the SDMT in patients with stroke. The T-SDMT had smaller practise effect and random measurement error than the SDMT, which can provide more reliable assessments of information processing speed.
Reliability and validity of a questionnaire for self-assessment of complete dentures.
Komagamine, Yuriko; Kanazawa, Manabu; Kaiba, Yoshinori; Sato, Yusuke; Minakuchi, Shunsuke
2014-05-02
Demand for complete denture treatment is expected to rise over several decades. However, to date, no questionnaire on complete dentures, as evaluated by edentulous patients, has been shown to be reliable and valid. This study sought to assess the reliability and validity of Patient's Denture Assessment (PDA), which provides a multidimensional evaluation of dentures among edentulous patients. Patients who had new complete dentures fabricated at the University Hospital of Dentistry, Tokyo Medical and Dental University from 2009 to 2010 were enrolled. The reliability of the PDA was determined by examining internal consistency and test-retest reliability. Internal consistency for all of the question items and the six subscales was measured using Cronbach's α and average inter-item correlation coefficients among 93 participants. For 33 of these participants, test-retest reliability was determined at a 2-month interval using the intraclass correlation coefficients (ICCs) and 95% confidence interval for the summary scores and the six subscale scores. The PDA was validated in 93 participants by examining the difference in the summary score and the six subscale scores of the PDA before and after replacement with new dentures by the paired t-test. Ability to detect change was also tested in 93 patients using effect size. The Cronbach's α for the PDA ranged from 0.56 to 0.93. The average inter-item correlation coefficients ranged from 0.28 to 0.83. ICCs for the PDA ranged from 0.37 to 0.83. The paired t-test showed a significant difference between the summary score and the six subscale scores before and after replacement with new dentures (p < 0.05) and the effect size was 0.97. The PDA demonstrated good reliability by assessing internal consistency and test-retest reliability. In addition, the PDA demonstrated good validity by assessing discriminant validity. Thus, the PDA could help dentists obtain a detailed understanding of the patients' perceptions in using their dentures.
Reliability of a Market Basket Assessment Tool (MBAT) for Use in SNAP-Ed Healthy Retail Initiatives.
Misyak, Sarah A; Hedrick, Valisa E; Pudney, Ellen; Serrano, Elena L; Farris, Alisha R
2018-05-01
To evaluate the reliability of the Market Basket Assessment Tool (MBAT) for assessing the availability of fruits and vegetables, low-fat or nonfat dairy and eggs, lean meats, whole-grain products, and seeds, beans, and nuts in Supplemental Nutrition Assistance Program-authorized retail environments. Different trained raters used the MBAT simultaneously at 14 retail environments to measure interrater reliability. Raters returned to 12 retail environments (85.7%) 1 week later to measure test-retest reliability. Data were analyzed using paired-sample t tests and correlations. No significant differences were found for interrater reliability or test-retest reliability for individual categories (mean differences, 0.0 to 0.3 ± 0.2 points) or total score (mean difference, 0.5 ± 0.4 points) and individual categories (mean differences, 0.0 to 0.3 ± 0.3 points) or total score (mean difference, 0.8 ± 0.4 points), respectively. Future steps include validation of the MBAT. A low-burden tool can facilitate evaluation of efforts to promote healthful foods in retail environments. Copyright © 2018 Society for Nutrition Education and Behavior. Published by Elsevier Inc. All rights reserved.
Petrova, Tatjana; Kavookjian, Jan; Madson, Michael B; Dagley, John; Shannon, David; McDonough, Sharon K
2015-01-01
Motivational interviewing (MI) has demonstrated a significant impact as an intervention strategy for addiction management, change in lifestyle behaviors, and adherence to prescribed medication and other treatments. Key elements to studying MI include training in MI of professionals who will use it, assessment of skills acquisition in trainees, and the use of a validated skills assessment tool. The purpose of this research project was to develop a psychometrically valid and reliable tool that has been designed to assess MI skills competence in health care provider trainees. The goal was to develop an assessment tool that would evaluate the acquisition and use of specific MI skills and principles, as well as the quality of the patient-provider therapeutic alliance in brief health care encounters. To address this purpose, specific steps were followed, beginning with a literature review. This review contributed to the development of relevant conceptual and operational definitions, selecting a scaling technique and response format, and methods for analyzing validity and reliability. Internal consistency reliability was established on 88 video recorded interactions. The inter-rater and test-retest reliability were established using 18 interactions randomly selected from the 88. The assessment tool Motivational Interviewing Skills for Health Care Encounters (MISHCE) and a manual for use of the tool were developed. Validity and reliability of MISHCE were examined. Face and content validity were supported with well-defined conceptual and operational definitions and feedback from an expert panel. Reliability was established through internal consistency, inter-rater reliability, and test-retest reliability. The overall internal consistency reliability (Cronbach's alpha) for all fifteen items was 0.75. MISHCE demonstrated good inter-rater reliability and good to excellent test-retest reliability. MISHCE assesses the health provider's level of knowledge and skills in brief disease management encounters. MISHCE also evaluates quality of the patient-provider therapeutic alliance, i.e., the "flow" of the interaction. Copyright © 2015 Elsevier Inc. All rights reserved.
Vestibular Assessments in Children With Global Developmental Delay: An Exploratory Study.
Dannenbaum, Elizabeth; Horne, Victoria; Malik, Farwa; Villeneuve, Myriam; Salvo, Lora; Chilingaryan, Gevorg; Lamontagne, Anouk
2016-01-01
To compare results of 3 clinical vestibular tests between children with global developmental delay (GDD) and children with typical development (TD) and investigate the test-retest reliability. Twenty children with GDD (aged 4.1-12.1 years) and 11 age-matched controls with TD participated. Participants with GDD underwent 2 sessions of testing. Each session consisted of the Clinical Test of Sensory Interaction and Balance (CTSIB), Dynamic Visual Acuity (DVA) test, and the modified Emory Clinical Vestibular Chair Test (m-ECVCT). Up to 33% of the children with GDD had abnormal DVA scores. m-ECVCT results of children with GDD demonstrated larger variance than those of children with TD. The CTSIB score was significantly reduced in the group with GDD. The test-retest reliability varied, with good reliability for the m-ECVCT and CTSIB, and fair reliability for the DVA. Findings suggest vestibular involvement in children with GDD. The clinical tests demonstrated moderate test-retest reliability.
Mieritz, Rune M; Bronfort, Gert; Jakobsen, Markus D; Aagaard, Per; Hartvigsen, Jan
2014-09-01
A basic premise for any instrument measuring spinal motion is that reliable outcomes can be obtained on a relevant sample under standardized conditions. The purpose of this study was to assess the overall reliability and measurement error of regional spinal sagittal plane motion in patients with chronic low back pain (LBP), and then to evaluate the influence of body mass index, examiner, gender, stability of pain, and pain distribution on reliability and measurement error. This study comprises a test-retest design separated by 7 to 14 days. The patient cohort consisted of 220 individuals with chronic LBP. Kinematics of the lumbar spine were sampled during standardized spinal extension-flexion testing using a 6-df instrumented spatial linkage system. Test-retest reliability and measurement error were evaluated using intraclass correlation coefficients (ICC(1,1)) and Bland-Altman limits of agreement (LOAs). The overall test-retest reliability (ICC(1,1)) for various motion parameters ranged from 0.51 to 0.70, and relatively wide LOAs were observed for all parameters. Reliability measures in patient subgroups (ICC(1,1)) ranged between 0.34 and 0.77. In general, greater ICC(1,1) coefficients and smaller LOAs were found in subgroups with patients examined by the same examiner, patients with a stable pain level, patients with a body mass index below 30 kg/m², patients who were men, and patients in the Quebec Task Force classification Group 1. This study shows that sagittal plane kinematic data from patients with chronic LBP may be sufficiently reliable in measurements of groups of patients. However, because of the large LOAs, this test procedure appears unusable at the individual patient level. Furthermore, reliability and measurement error vary substantially among subgroups of patients. Copyright © 2014 Elsevier Inc. All rights reserved.
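The two statistics named in the abstract above, ICC(1,1) and Bland-Altman limits of agreement, can be sketched as follows. This is a minimal illustration that assumes the conventional one-way random-effects definition of ICC(1,1) and limits of agreement of bias ± 1.96 SD of the differences; the function names and the five paired range-of-motion values are hypothetical and are not the study's data or code.

```python
from statistics import mean, stdev


def icc_1_1(scores):
    """One-way random-effects ICC(1,1).

    scores: one list per subject, each holding k repeated measurements.
    """
    n, k = len(scores), len(scores[0])
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need >= 2 subjects with the same k >= 2 measurements")
    grand = mean(x for row in scores for x in row)
    row_means = [mean(row) for row in scores]
    # Between-subject and within-subject mean squares from one-way ANOVA.
    msb = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    msw = sum(
        (x - m) ** 2 for row, m in zip(scores, row_means) for x in row
    ) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)


def limits_of_agreement(test, retest):
    """Return (lower limit, bias, upper limit) for retest minus test."""
    diffs = [b - a for a, b in zip(test, retest)]
    bias, sd = mean(diffs), stdev(diffs)
    return bias - 1.96 * sd, bias, bias + 1.96 * sd


# Hypothetical flexion range-of-motion values in degrees (illustration only).
test = [50, 55, 60, 45, 65]
retest = [52, 53, 63, 44, 66]
icc = icc_1_1([list(pair) for pair in zip(test, retest)])
lower, bias, upper = limits_of_agreement(test, retest)
print(f"ICC(1,1) = {icc:.3f}; LoA = {lower:.2f} to {upper:.2f} (bias {bias:.2f})")
```

The ICC summarizes how well subjects are distinguished from one another relative to measurement noise, whereas the limits of agreement express that noise in the original units, which is why a measure can look acceptable for groups and still be too imprecise for an individual patient.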
Dreessen, L; Arntz, A
1998-01-01
The short-interval test-retest interrater reliability of the Structured Clinical Interview for DSM-III-R personality disorders (SCID-II) was studied in a psychotherapy outpatient group whose main complaint was mostly an Axis I anxiety disorder. Using a test-retest approach to assess interrater reliability, three sources of variance were taken into account (rater variance in the elicitation and interpretation of information and patient variance across interviews). Base rate requirements were established before calculating reliability coefficients. On the whole, interrater agreement on the SCID-II was found to be satisfactory, except for the histrionic personality traits. This is the first study that has estimated short-interval test-retest interrater reliability of the SCID-II in outpatients, and also the first that has studied single SCID-II traits and dimensional diagnoses. The results found support the use of the SCID-II as a diagnostic instrument for clinical and research purposes.
Dijkhuizen, Annemarie; Douma, Rob K; Krijnen, Wim P; van der Schans, Cees P; Waninge, Aly
2018-05-30
A feasible and reliable instrument to measure strength in persons with severe intellectual and visual disabilities (SIVD) is lacking. The aim of our study was to determine feasibility, learning period and reliability of three strength tests. Twenty-nine participants with SIVD performed the Minimum Sit-to-Stand Height test (MSST), the Leg Extension test (LE) and the 30 seconds Chair-Stand test (30sCS), once per week for 5 weeks. Feasibility was determined by the percentage of successful measurements; learning effect by using paired t tests between two consecutive measurements; test-retest reliability by intraclass correlation coefficient and Limits of Agreement; and correlations by Pearson correlations. A sufficient feasibility and learning period of the tests was shown. The methods had sufficient test-retest reliability and moderate-to-sufficient correlations. The MSST, the LE, and the 30sCS are feasible tests for measuring muscle strength in persons with SIVD, having sufficient test-retest reliability. © 2018 John Wiley & Sons Ltd.
Williams, Janet B W; Kobak, Kenneth A
2008-01-01
The Montgomery-Asberg Depression Rating Scale (MADRS) is often used in clinical trials to select patients and to assess treatment efficacy. The scale was originally published without suggested questions for clinicians to use in gathering the information necessary to rate the items. Structured and semi-structured interview guides have been found to improve reliability with other scales. To describe the development and test-retest reliability of a structured interview guide for the MADRS (SIGMA). A total of 162 test-retest interviews were conducted by 81 rater pairs. Each patient was interviewed twice, once by each rater conducting an independent interview. The intraclass correlation for total score between raters using the SIGMA was r=0.93, P<0.0001. All ten items had good to excellent interrater reliability. Use of the SIGMA can result in high reliability of MADRS scores in evaluating patients with depression.
Chang, Jasper O; Levy, Susan S; Seay, Seth W; Goble, Daniel J
2014-05-01
Recent guidelines advocate sports medicine professionals to use balance tests to assess sensorimotor status in the management of concussions. The present study sought to determine whether a low-cost balance board could provide a valid, reliable, and objective means of performing this balance testing. Criterion validity testing relative to a gold standard and 7 day test-retest reliability. University biomechanics laboratory. Thirty healthy young adults. Balance ability was assessed on 2 days separated by 1 week using (1) a gold standard measure (ie, scientific grade force plate), (2) a low-cost Nintendo Wii Balance Board (WBB), and (3) the Balance Error Scoring System (BESS). Validity of the WBB center of pressure path length and BESS scores were determined relative to the force plate data. Test-retest reliability was established based on intraclass correlation coefficients. Composite scores for the WBB had excellent validity (r = 0.99) and test-retest reliability (R = 0.88). Both the validity (r = 0.10-0.52) and test-retest reliability (r = 0.61-0.78) were lower for the BESS. These findings demonstrate that a low-cost balance board can provide improved balance testing accuracy/reliability compared with the BESS. This approach provides a potentially more valid/reliable, yet affordable, means of assessing sports-related concussion compared with current methods.
de Oliveira, Valéria M A; Pitangui, Ana C R; Nascimento, Vinícius Y S; da Silva, Hítalo A; Dos Passos, Muana H P; de Araújo, Rodrigo C
2017-02-01
The Closed Kinetic Chain Upper Extremity Stability Test (CKCUEST) has been proposed as an option to assess upper limb function and stability; however, there are few studies that support the use of this test in adolescents. The purpose of the present study was to investigate the intersession reliability and agreement of three CKCUEST scores in adolescents and establish clinimetric values for this test. Test-retest reliability. Twenty-five healthy adolescents of both sexes were evaluated. The subjects performed the CKCUEST twice, with an interval of one week between the tests. An intraclass correlation coefficient (ICC(3,3)) two-way mixed model with a 95% confidence interval was utilized to determine intersession reliability. A Bland-Altman graph was plotted to analyze the agreement between assessments. The presence of systematic error was evaluated by a one-sample t test. The difference between the evaluation and reevaluation was observed using a paired-sample t test. The level of significance was set at 0.05. Standard errors of measurement and minimum detectable changes were calculated. The intersession reliability of the average touches score, normalized score, and power score was 0.68, 0.68 and 0.87, the standard error of measurement was 2.17, 1.35 and 6.49, and the minimal detectable change was 6.01, 3.74 and 17.98, respectively. The presence of systematic error (p < 0.014), the significant difference between the measurements (p < 0.05), and the analysis of the Bland-Altman graph indicate that CKCUEST is a discordant test with moderate to excellent reliability when used with adolescents. The CKCUEST is a measurement with moderate to excellent reliability for adolescents. 2b.
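The relationship between the standard error of measurement (SEM) and the minimal detectable change (MDC) reported above can be checked with a short calculation. The abstract does not state which formula was used; the sketch below assumes the conventional MDC95 = 1.96 × √2 × SEM, and the function names are illustrative. Under that assumption the reported SEM values give MDC values that match the abstract to within rounding.

```python
import math


def sem_from_icc(sd, icc):
    """Conventional SEM from a sample SD and a reliability coefficient."""
    return sd * math.sqrt(1.0 - icc)


def mdc95(sem):
    """Minimal detectable change at 95% confidence for a test-retest design."""
    return 1.96 * math.sqrt(2.0) * sem


# SEM and MDC values as reported in the abstract above.
reported = {
    "average touches": (2.17, 6.01),
    "normalized score": (1.35, 3.74),
    "power score": (6.49, 17.98),
}
for name, (sem, reported_mdc) in reported.items():
    print(f"{name}: computed MDC95 = {mdc95(sem):.2f}, reported = {reported_mdc}")
```

The factor √2 reflects that a change score combines the measurement error of two sessions, so a change must exceed roughly 2.77 SEM before it can be distinguished from noise at the 95% level.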
Koh, Yi Ling Eileen; Lua, Yi Hui Adela; Hong, Liyue; Bong, Huey Shin Shirley; Yeo, Ling Sui Jocelyn; Tsang, Li Ping Marianne; Ong, Kai Zhi; Wong, Sook Wai Samantha; Tan, Ngiap Chuan
2016-03-01
Essential hypertension often requires affected patients to self-manage their condition most of the time. Besides seeking regular medical review of their life-long condition to detect vascular complications, patients have to maintain healthy lifestyles in between physician consultations via diet and physical activity, and to take their medications according to their prescriptions. Their self-management ability is influenced by their self-efficacy capacity, which can be assessed using questionnaire-based tools. The "Hypertension Self-Care Profile" (HTN-SCP) is 1 such questionnaire assessing self-efficacy in the domains of "behavior," "motivation," and "self-efficacy." This study aims to determine the test-retest reliability of HTN-SCP in an English-literate Asian population using a web-based approach. Multiethnic Asian patients, aged 40 years and older, with essential hypertension were recruited from a typical public primary care clinic in Singapore. The investigators guided the patients to fill up the web-based 60-item HTN-SCP in English using a tablet or smartphone on the first visit and refilled the instrument 2 weeks later in the retest. Internal consistency and test-retest reliability were evaluated using Cronbach's Alpha and intraclass correlation coefficients (ICC), respectively. The t test was used to determine the relationship between the overall HTN-SCP scores of the patients and their self-reported self-management activities. A total of 160 patients completed the HTN-SCP during the initial test, from which 71 test-retest responses were completed. No floor or ceiling effect was found for the scores for the 3 subscales. Cronbach's Alpha coefficients were 0.857, 0.948, and 0.931 for "behavior," "motivation," and "self-efficacy" domains respectively, indicating high internal consistency. The item-total correlation ranges for the 3 scales were from 0.105 to 0.656 for Behavior, 0.401 to 0.808 for Motivation, 0.349 to 0.789 for Self-efficacy. 
The corresponding ICC scores of 0.671, 0.762, and 0.720 for these respective domains showed good test-retest reliability. The correlation of the HTN-SCP scores and patients' reported self-management measures were significant, except for keeping their food diary. HTN-SCP showed satisfactory internal consistency and test-retest reliability in an English literate Asian population. A web-based approach is feasible if similar studies are needed to validate its translated versions of the tool for wider application in the local multilingual population.
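Cronbach's alpha, used above as the internal-consistency measure for each HTN-SCP domain, can be sketched as follows. This is a minimal illustration of the standard formula, alpha = k/(k−1) × (1 − Σ item variances / variance of total scores); the function name and the five hypothetical respondents answering three items are assumptions for the example, not study data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha.

    responses: one list per respondent, each holding a score for every item.
    """
    n, k = len(responses), len(responses[0])
    if n < 2 or k < 2 or any(len(row) != k for row in responses):
        raise ValueError("need >= 2 respondents with the same k >= 2 items")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in responses]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in responses])
    if total_var == 0:
        raise ValueError("alpha is undefined when total scores do not vary")
    return k / (k - 1) * (1.0 - sum(item_vars) / total_var)


# Hypothetical 5 respondents x 3 items on a 1-5 scale (illustration only).
answers = [
    [4, 5, 4],
    [2, 3, 3],
    [5, 5, 4],
    [1, 2, 2],
    [3, 3, 4],
]
alpha = cronbach_alpha(answers)
print(f"Cronbach's alpha = {alpha:.3f}")
```

Alpha rises with both the average inter-item correlation and the number of items, so a long scale can reach a high alpha even when some individual items correlate weakly with the total.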
Measuring leprosy-related stigma - a pilot study to validate a toolkit of instruments.
Rensen, Carin; Bandyopadhyay, Sudhakar; Gopal, Pala K; Van Brakel, Wim H
2011-01-01
Stigma negatively affects the quality of life of leprosy-affected people. Instruments are needed to assess levels of stigma and to monitor and evaluate stigma reduction interventions. We conducted a validation study of such instruments in Tamil Nadu and West Bengal, India. Four instruments were tested in a 'Community Based Rehabilitation' (CBR) setting: the Participation Scale, the Internalised Stigma of Mental Illness (ISMI) scale adapted for leprosy-affected persons, the Explanatory Model Interview Catalogue (EMIC) for leprosy-affected and non-affected persons, and the General Self-Efficacy (GSE) Scale. We evaluated the following components of validity: construct validity, internal consistency, test-retest reproducibility and reliability to distinguish between groups. Construct validity was tested by correlating instrument scores and by triangulating quantitative and qualitative findings. Reliability was evaluated by comparing levels of stigma among people affected by leprosy and community controls, and among affected people living in CBR project areas and those in non-CBR areas. For the Participation, ISMI and EMIC scores, significant differences were observed between those affected by leprosy and those not affected (p = 0.0001), and between affected persons in the CBR and Control group (p < 0.05). The internal consistency of the instruments measured with Cronbach's α ranged from 0.83 to 0.96 and was very good for all instruments. Test-retest reproducibility coefficients were 0.80 for the Participation score, 0.70 for the EMIC score, 0.62 for the ISMI score and 0.50 for the GSE score. The construct validity of all instruments was confirmed. The Participation and EMIC Scales met all validity criteria, but test-retest reproducibility of the ISMI and GSE Scales needs further evaluation with a shorter test-retest interval and longer training and additional adaptations for the latter.
Gerards, Sanne M P L; Hummel, Karin; Dagnelie, Pieter C; de Vries, Nanne K; Kremers, Stef P J
2013-01-18
Evaluating whether parental challenges and self-efficacy toward managing children's lifestyle behaviors are successfully addressed by interventions requires valid instruments. The Lifestyle Behavior Checklist (LBC) has recently been developed in the Australian context. It consists of two subscales: the Problem scale, which measures parental perceptions of children's behavioral problems related to overweight and obesity, and the Confidence scale, measuring parental self-efficacy in dealing with these problems. The aim of the current study was to systematically translate the questionnaire into Dutch and to evaluate its internal consistency, construct validity and test-retest reliability. The LBC was systematically translated by four experts at Maastricht University. In total, 392 parents of 3- to 13-year-old children were invited to fill out two successive online questionnaires with a two-week interval. Of these, 273 parents responded to the first questionnaire (test, response rate = 69.6%), and of the 202 who could be invited for the second questionnaire (retest), 100 responded (response rate = 49.5%). We assessed the questionnaire's internal consistency (Cronbach's α), construct validity (Spearman's Rho correlation tests, using the criterion measures: restrictiveness, nurturance, and psychological control), and test-retest reliability (Spearman's Rho correlation tests). Both scales had high internal consistency (Cronbach's α ≥ 0.90). Spearman correlation coefficients indicated acceptable test-retest reliability for both the Problem scale (rs = 0.74) and the Confidence scale (rs = 0.70). The LBC Problem scale was significantly correlated to all criterion scales (nurturance, restrictiveness, psychological control) in the hypothesized direction, and the LBC Confidence scale was significantly correlated with nurturance and psychological control in the hypothesized direction, but not with restrictiveness. The Dutch translation of the LBC was found to be a reliable and reasonably valid questionnaire to measure parental perceptions of children's weight-related problem behavior and the extent to which parents feel confident to manage these problems.
Powell, T; Brooker, D J; Papadopolous, A
1993-05-01
Relative and absolute test-retest reliability of the MEAMS was examined in 12 subjects with probable dementia and 12 matched controls. Relative reliability was good. Measures of absolute reliability showed scores changing by up to 3 points over an interval of a week. A version effect was found to be in evidence.
Skinner, Ian W; Hübscher, Markus; Moseley, G Lorimer; Lee, Hopin; Wand, Benedict M; Traeger, Adrian C; Gustin, Sylvia M; McAuley, James H
2017-08-15
Eyetracking is commonly used to investigate attentional bias. Although some studies have investigated the internal consistency of eyetracking, data are scarce on the test-retest reliability and agreement of eyetracking to investigate attentional bias. This study reports the test-retest reliability, measurement error, and internal consistency of 12 commonly used outcome measures thought to reflect the different components of attentional bias: overall attention, early attention, and late attention. Healthy participants completed a preferential-looking eyetracking task that involved the presentation of threatening (sensory words, general threat words, and affective words) and nonthreatening words. We used intraclass correlation coefficients (ICCs) to measure test-retest reliability (ICC > .70 indicates adequate reliability). The ICCs(2, 1) ranged from -.31 to .71. Reliability varied according to the outcome measure and threat word category. Sensory words had a lower mean ICC (.08) than either affective words (.32) or general threat words (.29). A longer exposure time was associated with higher test-retest reliability. All of the outcome measures, except second-run dwell time, demonstrated low measurement error (<6%). Most of the outcome measures reported high internal consistency (α > .93). Recommendations are discussed for improving the reliability of eyetracking tasks in future research.
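The ICC(2, 1) named in this abstract is conventionally the two-way random-effects, absolute-agreement, single-measurement coefficient of Shrout and Fleiss. The sketch below computes it from the two-way ANOVA mean squares; the function name `icc_2_1` and the session values are assumptions for illustration, and the abstract does not say which software the authors used.

```python
import numpy as np

def icc_2_1(ratings):
    """ICC(2,1): rows = subjects, columns = sessions (or raters)."""
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape
    grand = x.mean()
    # Sums of squares for subjects (rows), sessions (columns), and residual.
    ss_rows = k * ((x.mean(axis=1) - grand) ** 2).sum()
    ss_cols = n * ((x.mean(axis=0) - grand) ** 2).sum()
    ss_err = ((x - grand) ** 2).sum() - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )

# Hypothetical dwell-time proportions for 5 participants at test and retest.
sessions = [[0.40, 0.44],
            [0.55, 0.50],
            [0.62, 0.66],
            [0.48, 0.41],
            [0.70, 0.73]]
print(f"ICC(2,1) = {icc_2_1(sessions):.2f}")
```

Because this form penalizes systematic shifts between sessions as well as random error, it can be negative when between-subject variance is small relative to error, which is consistent with the negative lower bound reported above.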
Robbins, Shawn M; Caplan, Ryan M; Aponte, Daniel I; St-Onge, Nancy
2017-10-01
External perturbations are utilized to challenge balance and mimic realistic balance threats in patient populations. The reliability of such protocols has not been established. The purpose was to examine test-retest reliability of balance testing with external perturbations. Healthy adults (n=34; mean age 23 years) underwent balance testing over two visits. Participants completed ten balance conditions in which the following parameters were combined: perturbation or non-perturbation, single or double leg, and eyes open or closed. Three trials were collected for each condition. Data were collected on a force plate and external perturbations were applied by translating the plate. Force plate center of pressure (CoP) data were summarized using 13 different CoP measures. Test-retest reliability was examined using intraclass correlation coefficients (ICC) and Bland-Altman plots. CoP measures of total speed and excursion in both anterior-posterior and medial-lateral directions generally had acceptable ICC values for perturbation conditions (ICC=0.46 to 0.87); however, many other CoP measures (e.g. range, area of ellipse) had unacceptable test-retest reliability (ICC<0.70). Improved CoP measures were present on the second visit, indicating a potential learning effect. Non-perturbation conditions generally produced more reliable CoP measures than perturbation conditions during double leg standing, but not single leg standing. Therefore, balance testing protocols that include external perturbations should be changed to improve test-retest reliability and diminish learning effects, for example through more extensive participant training and a larger number of trials. CoP measures that consider all data points (e.g. total speed) are more reliable than those that only consider a few data points. Copyright © 2017 Elsevier B.V. All rights reserved.
Ahlström, Isabell; Hellström, Karin; Emtner, Margareta; Anens, Elisabeth
2015-03-01
To examine the test-retest reliability of the Swedish translated version of the Exercise Self-Efficacy Scale (S-ESES) in people with neurological disease and to examine internal consistency. Test-retest study. A total of 30 adults with neurological diseases including: Parkinson's disease; Multiple Sclerosis; Cervical Dystonia; and Charcot-Marie-Tooth disease. The S-ESES was sent twice by surface mail. The mean interval between completions was 16 days. Weighted kappa, intraclass correlation coefficient 2,1 [ICC (2,1)], standard error of measurement (SEM), also expressed as a percentage value (SEM%), and Cronbach's alpha were calculated. The relative reliability of the test-retest results showed substantial agreement measured using weighted kappa (MD = 0.62) and very high reliability measured using ICC (2,1) (0.92). Absolute reliability measured using SEM was 5.3 and SEM% was 20.7. Excellent internal consistency was shown, with an alpha coefficient of 0.91 (test 1) and 0.93 (test 2). The S-ESES is recommended for use in research and in clinical work for people with neurological diseases. The low absolute reliability, however, indicates a limited ability to measure changes on an individual level.
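The abstract reports SEM and SEM% without giving the formula. A minimal sketch follows, assuming the common definitions SEM = SD × √(1 − ICC) and SEM% = 100 × SEM / mean; the abstract does not confirm that these exact definitions were used. The function names, the SD, and the mean below are hypothetical, since neither statistic is reported in the abstract.

```python
import math

def sem_from_icc(sd, icc):
    """Standard error of measurement from the score SD and a reliability coefficient."""
    return sd * math.sqrt(1.0 - icc)

def sem_percent(sem, mean):
    """SEM expressed as a percentage of the mean score."""
    return 100.0 * sem / mean

# Hypothetical inputs: the abstract gives ICC = 0.92 but not the SD or the mean.
sd_scores = 18.0     # assumed pooled SD of scale scores
mean_score = 25.0    # assumed mean scale score
sem = sem_from_icc(sd_scores, 0.92)
print(f"SEM = {sem:.1f}, SEM% = {sem_percent(sem, mean_score):.1f}")
```

This illustrates why a scale can show a very high ICC and still have a large SEM%: the ICC is relative to between-person spread, whereas SEM% is relative to the mean score.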
Inter- and intra-observer reliability of clinical movement-control tests for marines
2012-01-01
Background: Musculoskeletal disorders, particularly in the back and lower extremities, are common among marines. Here, movement-control tests are considered clinically useful for screening and follow-up evaluation. However, few studies have addressed the reliability of clinical tests, and no such published data exist for marines. The present aim was therefore to determine the inter- and intra-observer reliability of clinically convenient tests emphasizing movement control of the back and hip among marines. A secondary aim was to investigate the sensitivity and specificity of these clinical tests for discriminating musculoskeletal pain disorders in this group of military personnel. Methods: This inter- and intra-observer reliability study used a test-retest approach with six standardized clinical tests focusing on movement control for back and hip. Thirty-three marines (age 28.7 yrs, SD 5.9) on active duty volunteered and were recruited. They followed an in-vivo observation test procedure that covered both low- and high-load (threshold) tasks relevant for marines on operational duty. Two independent observers simultaneously rated performance as “correct” or “incorrect” following a standardized assessment protocol. Re-testing followed 7–10 days thereafter. Reliability was analysed using kappa (κ) coefficients, while discriminative power of the best-fitting tests for back- and lower-extremity pain was assessed using a multiple-variable regression model. Results: Inter-observer reliability for the six tests was moderate to almost perfect with κ-coefficients ranging between 0.56-0.95. Three tests reached almost perfect inter-observer reliability with mean κ-coefficients > 0.81. However, intra-observer reliability was fair-to-moderate with mean κ-coefficients between 0.22-0.58. Three tests achieved moderate intra-observer reliability with κ-coefficients > 0.41. Combinations of one low- and one high-threshold test best discriminated prior back pain, but results were inconsistent for lower-extremity pain. Conclusions: Our results suggest that clinical tests of movement control of back and hip are reliable for use in screening protocols using several observers with marines. However, test-retest reproducibility was less accurate, which should be considered in follow-up evaluations. The results also indicate that combinations of low- and high-threshold tests have discriminative validity for prior back pain, but were inconclusive for lower-extremity pain. PMID:23273285
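Since the reliability results in this abstract rest on κ coefficients for dichotomous ("correct"/"incorrect") ratings, a short sketch of unweighted Cohen's kappa may help. The function name `cohen_kappa` and the ratings are hypothetical; they are not the observers' data, and the abstract does not state which kappa variant was used.

```python
from collections import Counter

def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equal-length lists of category labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed proportion of agreement.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    # Undefined (division by zero) if both raters use a single identical category.
    return (observed - expected) / (1.0 - expected)

# Hypothetical ratings of 8 marines by two observers, NOT data from the study.
obs_1 = ["correct", "correct", "incorrect", "correct",
         "incorrect", "correct", "incorrect", "correct"]
obs_2 = ["correct", "incorrect", "incorrect", "correct",
         "incorrect", "correct", "correct", "correct"]
print(f"kappa = {cohen_kappa(obs_1, obs_2):.2f}")
```

The same function applies to intra-observer reliability by passing one observer's test and retest ratings as the two lists.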
Brett, Benjamin L; Solomon, Gary S
2017-04-01
Research findings to date on the stability of Immediate Post-Concussion Assessment and Cognitive Testing (ImPACT) Composite scores have been inconsistent, requiring further investigation. The use of test validity criteria across these studies also has been inconsistent. Using multiple measures of stability, we examined test-retest reliability of repeated ImPACT baseline assessments in high school athletes across various validity criteria reported in previous studies. A total of 1146 high school athletes completed baseline cognitive testing using the online ImPACT test battery on two occasions approximately two years apart. No participant sustained a concussion between assessments. Five forms of validity criteria used in previous test-retest studies were applied to the data, and differences in reliability were compared. Intraclass correlation coefficients (ICCs) ranged in composite scores from .47 (95% confidence interval, CI [.38, .54]) to .83 (95% CI [.81, .85]) and showed little change across a two-year interval for all five sets of validity criteria. Regression-based methods (RBMs) examining the test-retest stability demonstrated a lack of significant change in composite scores across the two-year interval for all forms of validity criteria, with no cases falling outside the expected range of 90% confidence intervals. The application of more stringent validity criteria does not alter test-retest reliability, nor does it account for some of the variation observed across previously performed studies. As such, the ImPACT manual validity criteria should be used in the determination of test validity and in the individualized approach to concussion management. Potential future efforts to improve test-retest reliability are discussed.
Internal Consistency, Retest Reliability, and their Implications For Personality Scale Validity
McCrae, Robert R.; Kurtz, John E.; Yamagata, Shinji; Terracciano, Antonio
2010-01-01
We examined data (N = 34,108) on the differential reliability and validity of facet scales from the NEO Inventories. We evaluated the extent to which (a) psychometric properties of facet scales are generalizable across ages, cultures, and methods of measurement; and (b) validity criteria are associated with different forms of reliability. Composite estimates of facet scale stability, heritability, and cross-observer validity were broadly generalizable. Two estimates of retest reliability were independent predictors of the three validity criteria; none of three estimates of internal consistency was. Available evidence suggests the same pattern of results for other personality inventories. Internal consistency of scales can be useful as a check on data quality, but appears to be of limited utility for evaluating the potential validity of developed scales, and it should not be used as a substitute for retest reliability. Further research on the nature and determinants of retest reliability is needed. PMID:20435807
Validation of the Brazilian Portuguese Version of Geriatric Anxiety Inventory--GAI-BR.
Massena, Patrícia Nitschke; de Araújo, Narahyana Bom; Pachana, Nancy; Laks, Jerson; de Pádua, Analuiza Camozzato
2015-07-01
The Geriatric Anxiety Inventory (GAI) is a recently developed scale aiming to evaluate symptoms of anxiety in later life. This 20-item scale uses dichotomous answers highlighting non-somatic anxiety complaints of elderly people. The present study aimed to evaluate the psychometric properties of the Brazilian Portuguese version of the GAI (GAI-BR) in a sample from the community and an outpatient psychogeriatric clinic. A mixed convenience sample of 72 subjects was recruited to complete the research protocol. The interview procedures were structured with questionnaires about sociodemographic data and clinical health status, previously validated anxiety and depression instruments, the Mini-Mental State Examination, the Mini International Neuropsychiatric Interview, and the GAI-BR. Twenty-two percent of the sample were interviewed twice for test-retest reliability. For internal consistency analyses, Cronbach's α was applied. The Spearman correlation test was applied to evaluate the test-retest reliability of the GAI-BR. A ROC (receiver operating characteristic) curve analysis was performed to estimate the GAI-BR area under the curve, cut-off points, sensitivity, and specificity for the Generalized Anxiety Disorder diagnosis. The GAI-BR version showed high internal consistency (Cronbach's α = 0.91) and strong and significant test-retest reliability (ρ = 0.85, p < 0.001). It also showed moderate and significant correlation with the Beck Anxiety Inventory (ρ = 0.68, p < 0.001) and the State-Trait Anxiety Inventory (ρ = 0.61, p < 0.001), showing evidence of concurrent validity. The cut-off point of 13 estimated by ROC curve analyses showed sensitivity of 83.3% and specificity of 84.6% to detect Generalized Anxiety Disorder (DSM-IV). GAI-BR has demonstrated very good psychometric properties and can be a reliable instrument to measure anxiety in Brazilian elderly people.
Long-term stability of the Wechsler Intelligence Scale for Children--Fourth Edition.
Watkins, Marley W; Smith, Lourdes G
2013-06-01
Long-term stability of the Wechsler Intelligence Scale for Children-Fourth Edition (WISC-IV; Wechsler, 2003) was investigated with a sample of 344 students from 2 school districts twice evaluated for special education eligibility at an average interval of 2.84 years. Test-retest reliability coefficients for the Verbal Comprehension Index (VCI), Perceptual Reasoning Index (PRI), Working Memory Index (WMI), Processing Speed Index (PSI), and the Full Scale IQ (FSIQ) were .72, .76, .66, .65, and .82, respectively. As predicted, the test-retest reliability coefficients for the subtests (Mdn = .56) were generally lower than the index scores (Mdn = .69) and the FSIQ (.82). On average, subtest scores did not differ by more than 1 point, and index scores did not differ by more than 2 points across the test-retest interval. However, 25% of the students earned FSIQ scores that differed by 10 or more points, and 29%, 39%, 37%, and 44% of the students earned VCI, PRI, WMI, and PSI scores, respectively, that varied by 10 or more points. Given this variability, it cannot be assumed that WISC-IV scores will be consistent across long test-retest intervals for individual students. PsycINFO Database Record (c) 2013 APA, all rights reserved.
Menezes, Josiane Roberta de; Luvisaro, Bianca Maria Oliveira; Rodrigues, Claudia Fernandes; Muzi, Camila Drumond; Guimarães, Raphael Mendonça
2017-01-01
To assess the test-retest reliability of the Memorial Symptom Assessment Scale translated and culturally adapted into Brazilian Portuguese. The scale was applied in an interview format to 190 patients with various types of cancer hospitalized in clinical and surgical sectors of the Instituto Nacional de Câncer José de Alencar Gomes da Silva and reapplied in 58 patients. Test-retest data were entered by independent double entry into a Microsoft Excel spreadsheet and analyzed by the weighted Kappa. The reliability of the scale was satisfactory in test-retest. The weighted Kappa values obtained for each scale item were adequate; the highest was 0.96 and the lowest was 0.69. Kappa was also evaluated for the subscales, and values were 0.84 for high frequency physical symptoms, 0.81 for low frequency physical symptoms, 0.81 for psychological symptoms, and 0.78 for Global Distress Index. The high estimated levels of reliability suggest that the process of measuring the Memorial Symptom Assessment Scale items was adequate.
Stefanatou, Pentagiotissa; Giannouli, Eleni; Konstantakopoulos, George; Vitoratou, Silia; Mavreas, Venetsanos
2014-11-01
Evaluation of mental health services based on patients' needs assessments has never taken place in Greece, although it is a crucial factor for the efficient use of their limited resources. To examine the inter-rater and test-retest reliability and the concurrent/convergent validity of the Greek version of the Camberwell Assessment of Need-Research (CAN-R). A total of 53 pairs of patients with schizophrenia and staff members were interviewed twice to test the inter-rater and test-retest reliability of the Greek version of the CAN-R. The World Health Organization Quality of Life-Brief Form (WHOQOL-BREF) and World Health Organization Disability Assessment Schedule-2.0 (WHODAS-2.0) were administered to the patients to examine concurrent validity. The inter-rater and test-retest reliability of patient and staff interviews for the 22 individual items and the eight summary scores of the instrument's four sections were good to excellent. Significant correlations emerged between CAN scores and the WHOQOL-BREF and WHODAS-2.0 domains for both patient and staff ratings, indicating good concurrent validity. Our results suggest that the Greek version of the CAN-R is a reliable instrument for assessing mental health patients' needs. Moreover, it is the first CAN-R validity study with satisfactory results using WHOQOL-BREF and WHODAS-2.0 as criterion variables. © The Author(s) 2013.
Vatan, Sevginar; Ertaş, Sedar; Lester, David
2011-04-01
In a sample of 100 Turkish psychiatric patients with diagnoses of anxiety disorders, Lester's Helplessness, Hopelessness, and Haplessness inventory had moderate estimates of internal consistency, test-retest reliability, and construct validity.
Shisslak, C M; Renger, R; Sharpe, T; Crago, M; McKnight, K M; Gray, N; Bryson, S; Estes, L S; Parnaby, O G; Killen, J; Taylor, C B
1999-03-01
To describe the development, test-retest reliability, internal consistency, and convergent validity of the McKnight Risk Factor Survey-III (MRFS-III). The MRFS-III was designed to assess a number of potential risk and protective factors for the development of disordered eating in preadolescent and adolescent girls. Several versions of the MRFS were pilot tested before the MRFS-III was administered to a sample of 651 4th- through 12th-grade girls to establish its psychometric properties. Most of the test-retest reliability coefficients of individual items on the MRFS-III were r > .40. Alpha coefficients for each risk and protective factor domain on the MRFS-III were also computed. The majority of these coefficients were > .60. High convergent validity coefficients were obtained for specific items on the MRFS-III and measures of self-esteem (Rosenberg Self-Esteem Scale) and weight concerns (Weight Concerns Scale). The test-retest reliability, internal consistency, and convergent validity of the MRFS-III suggest that it is a useful new instrument to assess potential risk and protective factors for the development of disordered eating in preadolescent and adolescent girls.
Buchowski, Maciej S; Matthews, Charles E; Cohen, Sarah S; Signorello, Lisa B; Fowke, Jay H; Hargreaves, Margaret K; Schlundt, David G; Blot, William J
2012-08-01
Low physical activity (PA) is linked to cancer and other diseases prevalent in racial/ethnic minorities and low-income populations. This study evaluated the PA questionnaire (PAQ) used in the Southern Community Cohort Study (SCCS), a prospective investigation of health disparities between African-American and white adults. The PAQ was administered upon entry into the cohort (PAQ1) and after 12-15 months (PAQ2) in 118 participants (40-60 years old, 48% male, 74% African-American). Test-retest reliability (PAQ1 versus PAQ2) was assessed using Spearman correlations and the Wilcoxon signed rank test. Criterion validity of the PAQ was assessed via comparison with a PA monitor and a last-month PA survey (LMPAS), administered up to 4 times in the study period. The PAQ test-retest reliability ranged from 0.25-0.54 for sedentary behaviors and 0.22-0.47 for active behaviors. The criterion validity for the PAQ compared with the PA monitor ranged from 0.21-0.24 for sedentary behaviors and from 0.17-0.31 for active behaviors. The magnitude of correlations between the PAQ and the PA monitor was generally consistent between African-Americans and whites. The SCCS-PAQ has fair to moderate test-retest reliability and demonstrated some evidence of criterion validity for ranking participants by their level of sedentary and active behaviors.
Noble, Stephanie; Spann, Marisa N; Tokoglu, Fuyuze; Shen, Xilin; Constable, R Todd; Scheinost, Dustin
2017-11-01
Best practices are currently being developed for the acquisition and processing of resting-state magnetic resonance imaging data used to estimate brain functional organization, or "functional connectivity." Standards have been proposed based on test-retest reliability, but open questions remain. These include how amount of data per subject influences whole-brain reliability, the influence of increasing runs versus sessions, the spatial distribution of reliability, the reliability of multivariate methods, and, crucially, how reliability maps onto prediction of behavior. We collected a dataset of 12 extensively sampled individuals (144 min data each across 2 identically configured scanners) to assess test-retest reliability of whole-brain connectivity within the generalizability theory framework. We used Human Connectome Project data to replicate these analyses and relate reliability to behavioral prediction. Overall, the historical 5-min scan produced poor reliability averaged across connections. Increasing the number of sessions was more beneficial than increasing runs. Reliability was lowest for subcortical connections and highest for within-network cortical connections. Multivariate reliability was greater than univariate. Finally, reliability could not be used to improve prediction; these findings are among the first to underscore this distinction for functional connectivity. A comprehensive understanding of test-retest reliability, including its limitations, supports the development of best practices in the field. © The Author 2017. Published by Oxford University Press.
Test-Retest Reliability of the Salutogenic Wellness Promotion Scale (SWPS)
ERIC Educational Resources Information Center
Anderson, L. M.; Moore, J. B.; Hayden, B. M.; Becker, C. M.
2014-01-01
Objective: This study examined the temporal stability (i.e. test-retest reliability) of the Salutogenic Wellness Promotion Scale (SWPS) using intraclass correlation coefficients (ICC). Current intraclass results were also compared to previously published interclass correlations to support the use of the intraclass method for test-retest…
Shungu, Dikoma C.; Mao, Xiangling; Gonzales, Robyn; Soones, Tacara N.; Dyke, Jonathan P.; van der Veen, Jan Willem; Kegeles, Lawrence S.
2016-01-01
Abnormalities in brain γ-aminobutyric acid (GABA) have been implicated in various neuropsychiatric and neurological disorders. However, in vivo GABA detection by proton magnetic resonance spectroscopy (1H MRS) presents significant challenges arising from low brain concentration, overlap by much stronger resonances, and contamination by mobile macromolecule (MM) signals. This study addresses these impediments to reliable brain GABA detection with the J-editing difference technique on a 3T MR system in healthy human subjects by (a) assessing the sensitivity gains attainable with an 8-channel phased-array head coil, (b) determining the magnitude and anatomic variation of the contamination of GABA by MM, and (c) estimating the test-retest reliability of measuring GABA with this method. Sensitivity gains and test-retest reliability were examined in the dorsolateral prefrontal cortex (DLPFC), while MM levels were compared across three cortical regions: the DLPFC, the medial prefrontal cortex (MPFC) and the occipital cortex (OCC). A 3-fold higher GABA detection sensitivity was attained with the 8-channel head coil compared to the standard single-channel head coil in DLPFC. Despite significant anatomic variation in GABA+MM and MM across the three brain regions (p < 0.05), the contribution of MM to GABA+MM was relatively stable across the three voxels, ranging from 41% to 49%, a non-significant regional variation (p = 0.58). The test-retest reliability of GABA measurement, expressed either as ratios to voxel tissue water (W) or total creatine, was found to be very high for both the single-channel coil and the 8-channel phased-array coil. For the 8-channel coil, for example, Pearson’s correlation coefficient of test vs. retest for GABA/W was 0.98 (R2 = 0.96, p = 0.0007), the percent coefficient of variation (CV) was 1.25%, and the intraclass correlation coefficient (ICC) was 0.98. 
Similar reliability was also found for the co-edited resonance of combined glutamate and glutamine (Glx) for both coils. PMID:27173449
RELIABILITY CONCERNS IN THE REPEATED COMPUTERIZED ASSESSMENT OF ATTENTION IN CHILDREN
Zabel, T. Andrew; von Thomsen, Christian; Cole, Carolyn; Martin, Rebecca; Mahone, E. Mark
2010-01-01
Assessment of attentional processes via computerized assessment is frequently used to quantify intra-individual cognitive improvement or decline in response to treatment. However, assessment of intra-individual change is highly dependent on sufficient test reliability. We examined the test–retest reliability of selected variables from one popular computerized continuous performance test (CPT)—i.e., the Conners’ CPT – Second Edition (CPT-II). Participants were 39 healthy children (20 girls) ages 6–18 without intellectual impairment (mean PPVT-III SS = 102.6), LD, or psychiatric disorders (DICA-IV). Test–retest reliability over the 3–8 month interval (mean = 6 months) was acceptable (Intraclass Correlations [ICC] = .82 to .92) on comparison measures (Beery Test of Visual Perception, WISC-IV Block Design, PPVT-III). In contrast, test–retest reliability was only modest for CPT-II raw scores (ICCs ranging from .62 to .82) and T-scores (ICCs ranging from .33 to .65) for variables of interest (Omissions, Commissions, Variability, Hit Reaction Time, and Attentiveness). Using test–retest reliability information published in the CPT-II manual, 90% confidence intervals based on reliable change index (RCI) methodology were constructed to examine the significance of test–retest difference/change scores. Of the participants in this sample of typically developing youth, 30% generated intra-individual changes in T-scores on the Omissions and Attentiveness variables that exceeded the 90% confidence intervals and qualified as “statistically rare” changes in score. These results suggest a considerable degree of normal variability in CPT-II test scores over extended test–retest intervals, and suggest a need for caution when interpreting test score changes in neurologically unstable clinical populations. PMID:19452302
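The reliable change index (RCI) intervals described above can be sketched as follows, assuming the classical formulation in which the standard error of the difference is √2 × SEM and SEM = SD × √(1 − r). The function names, the reliability value, and the scores are hypothetical; the abstract takes its reliabilities from the CPT-II manual and does not list them.

```python
import math

Z_90 = 1.645  # two-sided 90% critical value of the standard normal distribution

def rci_half_width(sd, reliability, z=Z_90):
    """Half-width of the confidence interval around a zero difference score."""
    sem = sd * math.sqrt(1.0 - reliability)   # standard error of measurement
    se_diff = math.sqrt(2.0) * sem            # standard error of a difference
    return z * se_diff

def is_reliable_change(score_1, score_2, sd, reliability, z=Z_90):
    """True when the retest change falls outside the RCI confidence interval."""
    return abs(score_2 - score_1) > rci_half_width(sd, reliability, z)

# T-scores have SD = 10 by construction; r = 0.65 is a hypothetical reliability.
width = rci_half_width(10.0, 0.65)
print(f"90% RCI half-width: {width:.1f} T-score points")
print(is_reliable_change(50, 62, 10.0, 0.65))
```

The lower the reliability, the wider the interval, so a given score change is less likely to count as exceeding measurement error.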
Cross-cultural Adaptation of the "Functional Activities Questionnaire - FAQ" for use in Brazil
Sanchez, Maria Angélica dos Santos; Correa, Pricila Cristina Ribeiro; Lourenço, Roberto Alves
2011-01-01
Objective: The aim of this paper was to present the results of the first stage of cross-cultural adaptation of the Functional Activities Questionnaire (FAQ). Methods: The tool was subjected to translation and re-translation, and the test-retest reliability of a proposed version for use in Brazil was analyzed. Results: Of the 548 questionnaire respondents, a convenience sample of 68 informants was selected for retesting. Internal consistency was measured by Cronbach's alpha (0.95) while test-retest reliability was assessed using intra-class correlation (0.97). The findings show that the FAQ is brief (averaging seven minutes to apply), is easily understood, and has good intra-rater test-retest reliability. Conclusion: Our results suggest this adapted version of the FAQ is a reliable and stable tool which may be useful for assessing function in Brazilian elderly. Notwithstanding, the version should be subjected to further analysis with the aim of reaching functional equivalence. PMID:29213759
Test-retest reliability of the trauma and life events self-report inventory.
Hovens, J E; Bramsen, I; van der Ploeg, H M; Reuling, I E
2000-12-01
Three groups of first-year male and female medical students (total N = 90) completed the Trauma and Life Events Self-report Inventory twice. Test-retest reliability for the three different time periods was .82, .89, and .75, respectively.
Bajaj, Jasmohan S; Heuman, Douglas M; Sterling, Richard K; Sanyal, Arun J; Siddiqui, Muhammad; Matherly, Scott; Luketic, Velimir; Stravitz, R Todd; Fuchs, Michael; Thacker, Leroy R; Gilles, HoChong; White, Melanie B; Unser, Ariel; Hovermale, James; Gavis, Edith; Noble, Nicole A; Wade, James B
2015-10-01
Detection of covert hepatic encephalopathy (CHE) is difficult, but point-of-care testing could increase rates of diagnosis. We aimed to validate the ability of the smartphone app EncephalApp, a streamlined version of Stroop App, to detect CHE. We evaluated face validity, test-retest reliability, and external validity. Patients with cirrhosis (n = 167; 38% with overt HE [OHE]; mean age, 55 years; mean Model for End-Stage Liver Disease score, 12) and controls (n = 114) were each given a paper and pencil cognitive battery (standard) along with EncephalApp. EncephalApp has Off and On states; results measured were OffTime, OnTime, OffTime+OnTime, and number of runs required to complete 5 off and on runs. Thirty-six patients with cirrhosis underwent driving simulation tests, and EncephalApp results were correlated with results. Test-retest reliability was analyzed in a subgroup of patients. The test was performed before and after transjugular intrahepatic portosystemic shunt placement, and before and after correction for hyponatremia, to determine external validity. All patients with cirrhosis performed worse on paper and pencil and EncephalApp tests than controls. Patients with cirrhosis and OHE performed worse than those without OHE. Age-dependent EncephalApp cutoffs (younger or older than 45 years) were set. An OffTime+OnTime value of >190 seconds identified all patients with CHE with an area under the receiver operator characteristic value of 0.91; the area under the receiver operator characteristic value was 0.88 for diagnosis of CHE in those without OHE. EncephalApp times correlated with crashes and illegal turns in driving simulation tests. Test-retest reliability was high (intraclass coefficient, 0.83) among 30 patients retested 1-3 months apart. OffTime+OnTime increased significantly (206 vs 255 seconds, P = .007) among 10 patients retested 33 ± 7 days after transjugular intrahepatic portosystemic shunt placement. 
OffTime+OnTime decreased significantly (242 vs 225 seconds, P = .03) in 7 patients tested before and after correction for hyponatremia (126 ± 3 to 132 ± 4 meq/L, P = .01) 10 ± 5 days apart. A smartphone app called EncephalApp has good face validity, test-retest reliability, and external validity for the diagnosis of CHE. Copyright © 2015 AGA Institute. Published by Elsevier Inc. All rights reserved.
Reneman, M F; Roelofs, M; Schiphorst Preuper, H R
2017-07-01
To analyze test-retest reliability and agreement, and to explore the safety of neck functional capacity evaluation (Neck-FCE) tests in patients with chronic multifactorial neck pain. Test-retest; 2 FCE sessions were held with a 2-week interval. University-based outpatient rehabilitation center. Individuals (N=18; 14 women) with a mean age of 34 years. Not applicable. The Neck-FCE protocol consists of 6 tests: lifting waist to overhead (kg), 2-handed carrying (kg), overhead working (s), bending and overhead reaching (s), and repetitive side reaching (left and right) (s). Intraclass correlation coefficients (ICCs) and limits of agreement (LoA) were calculated. ICC point estimates between .75 and .90 were considered as good, and >.90 were considered as excellent reliability. ICC point estimates ranged between .39 and .96. Ratios of the LoA ranged between 32.0% and 56.5%. Mean ± SD numeric rating scale pain scores in the neck and shoulder 24 hours after the test were 6.7±2.6 and 6.3±3.0, respectively. Based on ICC point estimates and 95% confidence intervals, 3 tests had excellent reliability and 3 had poor reliability. LoA were substantial in all 6 tests. Safety was confirmed. Copyright © 2016 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.
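The limits of agreement (LoA) mentioned above are usually the Bland-Altman 95% limits: the mean test-retest difference ± 1.96 × the SD of the differences. The sketch below assumes that definition; the function name and the lifting values are hypothetical, and the abstract's "ratios of the LoA" are not reproduced here because the abstract does not define how they were computed.

```python
import numpy as np

def limits_of_agreement(test, retest):
    """Return (lower limit, mean difference, upper limit) for paired scores."""
    diffs = np.asarray(retest, dtype=float) - np.asarray(test, dtype=float)
    bias = diffs.mean()                   # systematic change between sessions
    spread = 1.96 * diffs.std(ddof=1)     # 95% band, assuming normal differences
    return bias - spread, bias, bias + spread

# Hypothetical lifting capacities (kg) at two sessions, NOT data from the study.
session_1 = [10.0, 12.5, 8.0, 15.0, 11.0]
session_2 = [11.0, 12.0, 9.5, 14.0, 12.5]
lower, bias, upper = limits_of_agreement(session_1, session_2)
print(f"bias = {bias:.2f} kg, LoA = [{lower:.2f}, {upper:.2f}] kg")
```

Unlike the ICC, the limits are in the units of the test itself, which is why wide limits can coexist with a high ICC.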
Development of An Assessment Test for An Anesthetic Machine.
Tiviraj, Supinya; Yokubol, Bencharatana; Amornyotin, Somchai
2016-05-01
This study aimed to develop and assess the quality of an evaluation form used to evaluate nurse anesthetic trainees' skills in undertaking a pre-use check of an anesthetic machine. An evaluation form comprising 25 items was developed, informed by the guidelines published by national anesthesiologist societies and refined to reflect the anesthetic machine used in our institution. The items checked included the cylinder supplies and medical gas pipelines, vaporizer back bar, ventilator, anesthetic breathing system, scavenging system and emergency back-up equipment. The authors sought the opinions of five experienced anesthetic trainers to judge the validity of the content. The authors measured its inter-rater reliability when used by two independent assessors scoring the performance of 36 nurse anesthetic trainees undertaking 15-minute anesthetic machine checks, and its test-retest reliability as the correlation between scores from two performances seven days apart. The five experienced anesthesiologists agreed that the evaluation form accurately reflected the objectives of anesthetic machine checking, equating to an index of congruency of 1.00. The inter-rater reliability of the independent assessors' scoring was 0.977 (p = 0.01) and the test-retest reliability was 0.883 (p = 0.01). The evaluation form proved to be a reliable and effective tool for assessing anesthetic nurse trainees' checking of an anesthetic machine before use. The form was brief, clear, and practical to use, and should help to improve anesthetic nurse education and patient safety.
Behavioral and cognitive outcomes for clinical trials in children with neurofibromatosis type 1.
van der Vaart, Thijs; Rietman, André B; Plasschaert, Ellen; Legius, Eric; Elgersma, Ype; Moll, Henriëtte A
2016-01-12
To evaluate the appropriateness of cognitive and behavioral outcome measures in clinical trials in neurofibromatosis type 1 (NF1) by analyzing the degree of deficits compared to reference groups, test-retest reliability, and how scores correlate between outcome measures. Data were analyzed from the Simvastatin for cognitive deficits and behavioral problems in patients with neurofibromatosis type 1 (NF1-SIMCODA) trial, a randomized placebo-controlled trial of simvastatin for cognitive deficits and behavioral problems in children with NF1. Outcome measures were compared with age-specific reference groups to identify domains of dysfunction. Pearson r was computed for before and after measurements within the placebo group to assess test-retest reliability. Principal component analysis was used to identify the internal structure in the outcome data. Strongest mean score deviations from the reference groups were observed for full-scale intelligence (-1.1 SD), Rey Complex Figure Test delayed recall (-2.0 SD), attention problems (-1.2 SD), and social problems (-1.1 SD). Long-term test-retest reliability was excellent for Wechsler scales (r > 0.88), but poor to moderate for other neuropsychological tests (r range 0.52-0.81) and Child Behavioral Checklist subscales (r range 0.40-0.79). The correlation structure revealed 2 strong components in the outcome measures, behavior and cognition, with no correlation between these components. Scores on psychosocial quality of life correlate strongly with behavioral problems and less with cognitive deficits. Children with NF1 show distinct deficits in multiple domains. Many outcome measures showed weak test-retest correlations over the 1-year trial period. Cognitive and behavioral outcomes are complementary. This analysis demonstrates the need to include reliable outcome measures on a variety of cognitive and behavioral domains in clinical trials for NF1. © 2015 American Academy of Neurology.
Gunaydin, Gurkan; Citaker, Seyit; Meray, Jale; Cobanoglu, Gamze; Gunaydin, Ozge Ece; Hazar Kanik, Zeynep
2016-11-01
Validation of a self-report questionnaire. The purpose of this study was to investigate adaptation, validity, and reliability of the Turkish version of the Bournemouth Questionnaire. Low back pain is one of the most frequent disorders leading to activity limitation. This pain affects most people at some point in their lives. To evaluate a patient's functional abilities and decide on a successful therapy procedure, the most important point is to administer assessment questionnaires precisely. One hundred ten patients with chronic low back pain were included in the present study. To assess reliability, test-retest and internal consistency analyses were applied. The results of the test-retest analysis were assessed using the intraclass correlation coefficient method (95% confidence interval). For internal consistency, the Cronbach alpha value was calculated. Validity of the questionnaire was assessed in terms of construct validity. For construct validity, factor analysis and convergent validity were tested. For convergent validity, total points of the Bournemouth Questionnaire were compared with the total points of the Quebec Back Pain Disability Scale and the Roland Morris Disability Questionnaire using Pearson correlation coefficient analysis. The Cronbach alpha value was 0.914, showing that this questionnaire has high internal consistency. The results of the test-retest analysis varied between 0.851 and 0.927, which shows that test-retest results are highly correlated. Factor analysis indicated that this questionnaire had one factor. The Pearson correlation coefficient of the Bournemouth Questionnaire with the Roland Morris Disability Questionnaire was 0.703, and with the Quebec Back Pain Disability Scale it was 0.659. These results showed that the Bournemouth Questionnaire correlates very well with the Roland Morris Disability Questionnaire and the Quebec Back Pain Disability Scale. The Turkish version of the Bournemouth Questionnaire is valid and reliable. 3.
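As an illustration of the internal-consistency statistic used above, here is a minimal sketch of Cronbach's alpha from item-level scores, using the standard formula alpha = k/(k-1) × (1 − Σ item variances / variance of total scores). The item scores are hypothetical and are not data from the study.

```python
from statistics import variance

def cronbach_alpha(items):
    """items: list of per-item score lists, all over the same respondents."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]   # total score per respondent
    item_var_sum = sum(variance(item) for item in items)
    return (k / (k - 1)) * (1 - item_var_sum / variance(totals))

# Hypothetical responses: 3 items answered by 4 respondents.
item_scores = [
    [1, 2, 3, 4],
    [2, 3, 4, 5],
    [1, 3, 3, 5],
]
alpha = cronbach_alpha(item_scores)
print(f"Cronbach's alpha = {alpha:.3f}")
```

Alpha approaches 1 when items rise and fall together across respondents, which is why it is read as internal consistency rather than stability over time.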
Test-retest reliability of evoked heat stimulation BOLD fMRI.
Upadhyay, Jaymin; Lemme, Jordan; Anderson, Julie; Bleakman, David; Large, Thomas; Evelhoch, Jeffrey L; Hargreaves, Richard; Borsook, David; Becerra, Lino
2015-09-30
To date, the blood oxygenation level-dependent (BOLD) functional magnetic resonance imaging (fMRI) technique has enabled an objective and deeper understanding of pain processing mechanisms embedded within the human central nervous system (CNS). In order to further comprehend the benefits and limitations of BOLD fMRI in the context of pain as well as the corresponding subjective pain ratings, we evaluated the univariate response, test-retest reliability and confidence intervals (CIs) at the 95% level of both data types collected during evoked stimulation of 40°C (non-noxious), 44°C (mildly noxious) and a subject-specific temperature eliciting a 7/10 pain rating. The test-retest reliability between two scanning sessions was determined by calculating group-level intraclass correlation coefficients (ICCs) and was also assessed at the single-subject level. Across the three stimuli, we initially observed a graded response of increasing magnitude for both VAS (visual analog score) pain ratings and fMRI data. Test-retest reliability was observed to be highest for VAS pain ratings obtained during the 7/10 pain stimulation (ICC=0.938), while ICC values of pain fMRI data for a distribution of CNS structures ranged from 0.5 to 0.859 (p<0.05). Importantly, the upper and lower CI bounds reported herein could be utilized in subsequent trials involving healthy volunteers to hypothesize the magnitude of effect required to overcome inherent variability of either VAS pain ratings or BOLD responses evoked during innocuous or noxious thermal stimulation. Copyright © 2015 Elsevier B.V. All rights reserved.
Leddy, Abigail L; Crowner, Beth E; Earhart, Gammon M
2011-01-01
Gait impairments, balance impairments, and falls are prevalent in individuals with Parkinson disease (PD). Although the Berg Balance Scale (BBS) can be considered the reference standard for the determination of fall risk, it has a noted ceiling effect. Development of ceiling-free measures that can assess balance and are good at discriminating "fallers" from "nonfallers" is needed. The purpose of this study was to compare the Functional Gait Assessment (FGA) and the Balance Evaluation Systems Test (BESTest) with the BBS among individuals with PD and evaluate the tests' reliability, validity, and discriminatory sensitivity and specificity for fallers versus nonfallers. This was an observational study of community-dwelling individuals with idiopathic PD. The BBS, FGA, and BESTest were administered to 80 individuals with PD. Interrater reliability (n=15) was assessed by 3 raters. Test-retest reliability was based on 2 tests of participants (n=24), 2 weeks apart. Intraclass correlation coefficients (2,1) were used to calculate reliability, and Spearman correlation coefficients were used to assess validity. Cutoff points, sensitivity, and specificity were based on receiver operating characteristic plots. Test-retest reliability was .80 for the BBS, .91 for the FGA, and .88 for the BESTest. Interrater reliability was greater than .93 for all 3 tests. The FGA and BESTest were correlated with the BBS (r=.78 and r=.87, respectively). Cutoff scores to identify fallers were 47/56 for the BBS, 15/30 for the FGA, and 69% for the BESTest. The overall accuracy (area under the curve) for the BBS, FGA, and BESTest was .79, .80, and .85, respectively. Fall reports were retrospective. Both the FGA and the BESTest have reliability and validity for assessing balance in individuals with PD. The BESTest is most sensitive for identifying fallers.
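The "intraclass correlation coefficients (2,1)" named above follow the Shrout and Fleiss notation: a two-way random-effects model, absolute agreement, single measurement. The sketch below computes it from the two-way ANOVA mean squares. The function name and the scores are hypothetical, and this is not the authors' analysis code.

```python
def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single measurement.

    scores: one row per participant, one column per session or rater.
    """
    n = len(scores)          # participants
    k = len(scores[0])       # sessions or raters
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    bms = ss_rows / (n - 1)                # between-participants mean square
    jms = ss_cols / (k - 1)                # between-sessions mean square
    ems = ss_error / ((n - 1) * (k - 1))   # residual mean square
    return (bms - ems) / (bms + (k - 1) * ems + k * (jms - ems) / n)

# Hypothetical balance scores for five participants tested 2 weeks apart.
test_retest = [[22, 24], [15, 14], [28, 27], [19, 21], [25, 25]]
print(f"ICC(2,1) = {icc_2_1(test_retest):.2f}")
```

Because the between-sessions mean square appears in the denominator, a systematic shift between sessions lowers ICC(2,1), unlike consistency-type coefficients.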
The Comprehensive Snack Parenting Questionnaire (CSPQ): Development and Test-Retest Reliability.
Gevers, Dorus W M; Kremers, Stef P J; de Vries, Nanne K; van Assema, Patricia
2018-04-26
The narrow focus of existing food parenting instruments led us to develop a food parenting practices instrument measuring the full range of food practices constructs with a focus on snacking behavior. We present the development of the questionnaire and our research on the test-retest reliability. The developed Comprehensive Snack Parenting Questionnaire (CSPQ) covers 21 constructs. Test-retest reliability was assessed by calculating intraclass correlation coefficients and percentage agreement after two administrations of the CSPQ among a sample of 66 Dutch parents. Test-retest reliability analysis revealed acceptable intraclass correlation coefficients (≥0.41) or agreement scores (≥0.60) for all items. These results, together with earlier work, suggest sufficient psychometric characteristics. The comprehensive but brief CSPQ opens up opportunities to address essential but unstudied research questions aimed at understanding and predicting children’s snack intake. Example applications include studying the interactional nature of food parenting practices or interactions of food parenting with general parenting or child characteristics.
Yapali, Gökmen; Günel, Mintaze Kerem; Karahan, Sevilay
2012-05-15
The study design was cross-cultural adaptation and investigation of reliability and validity of the Copenhagen Neck Functional Disability Scale (CNFDS). The aim of this study was to translate the CNFDS into Turkish and assess its reliability and validity among patients with neck pain in the Turkish population. The CNFDS is a reliable and valid evaluation instrument for disability, but there is no published Turkish version of the CNFDS. One hundred one subjects who had chronic neck pain were included in this study. The CNFDS, Neck Pain and Disability Scale, and visual analogue scale were administered to all subjects. To investigate test-retest reliability, the CNFDS was administered twice at a 1-week interval; the intraclass correlation coefficient was 0.86 (95% confidence interval = 0.679-0.935). There was no difference between test-retest scores (P < 0.001). To investigate concurrent validity, the total score of the CNFDS was correlated with the mean visual analogue scale score (r = 0.73, P < 0.001). Concurrent validity of the CNFDS was very good. To investigate construct validity, the total score of the CNFDS was correlated with the Neck Pain and Disability Scale (r = 0.78, P < 0.001). Construct validity of the CNFDS was also very good. Our results suggest that the Turkish version of the CNFDS is a reliable and valid instrument for Turkish people.
Lee, Ya-Chen; Yu, Wan-Hui; Hsueh, I-Ping; Chen, Sheng-Shiung; Hsieh, Ching-Lin
2017-10-01
A lack of evidence on the test-retest reliability and responsiveness limits the utility of the BI-based Supplementary Scales (BI-SS) in both clinical and research settings. To examine the test-retest reliability and responsiveness of the BI-based Supplementary Scales (BI-SS) in patients with stroke. A repeated-assessments design (1 week apart) was used to examine the test-retest reliability of the BI-SS. For the responsiveness study, the participants were assessed with the BI-SS and BI (treated as an external criterion) at admission to and discharge from rehabilitation wards. Seven outpatient rehabilitation units and one inpatient rehabilitation unit. Outpatients with chronic stroke. Eighty-four outpatients with chronic stroke participated in the test-retest reliability study. Fifty-seven inpatients completed baseline and follow-up assessments in the responsiveness study. For the test-retest reliability study, the values of the intra-class correlation coefficient and the overall percentage of minimal detectable change for the Ability Scale and Self-perceived Difficulty Scale were 0.97, 12.8%, and 0.78, 35.8%, respectively. For the responsiveness study, the standardized effect size and standardized response mean (representing internal responsiveness) of the Ability Scale and Self-perceived Difficulty Scale were 1.17 and 1.56, and 0.78 and 0.89, respectively. Regarding external responsiveness, the change in score of the Ability Scale had significant and moderate association with that of the BI (r=0.61, P<0.001). The change in score of the Self-perceived Difficulty Scale had non-significant and weak association with that of the BI (r=0.23, P=0.080). The Ability Scale of the BI-SS has satisfactory test-retest reliability and sufficient responsiveness for patients with stroke. However, the Self-perceived Difficulty Scale of the BI-SS has substantial random measurement error and insufficient external responsiveness, which may affect its utility in clinical settings. 
The findings of this study provide empirical evidence of psychometric properties of the BI-SS for assessing ability and self-perceived difficulty of ADL in patients with stroke.
Scarponi, Letizia; de Felicio, Claudia Maria; Sforza, Chiarella; Pimenta Ferreira, Claudia Lucia; Ginocchio, Daniela; Pizzorni, Nicole; Barozzi, Stefania; Mozzanica, Francesco; Schindler, Antonio
2018-05-30
To evaluate the reliability, validity, and responsiveness of the Italian OMES (I-OMES). The study consisted of 3 phases: (1) internal consistency and reliability, (2) validity, and (3) responsiveness analysis. The recruited population included 27 patients with orofacial myofunctional disorders (OMD) and 174 healthy volunteers. Forty-seven subjects, 18 healthy and all recruited patients with OMD, were assessed for inter-rater and test-retest reliability analysis. I-OMES and Nordic Orofacial Test - Screening (NOT-S) scores of the patients were correlated for concurrent validity analysis. I-OMES scores from 27 patients with OMD and 27 age- and gender-matched healthy subjects were compared to investigate construct validity. I-OMES scores before and after successful swallowing rehabilitation in patients were compared for responsiveness analysis. Adequate internal consistency (Cronbach α = 0.71) and strong inter-rater and test-retest reliability (intraclass correlation coefficient = 0.97 and 0.98, respectively) were found. I-OMES and NOT-S scores were significantly and inversely correlated (r = -0.38). A statistically significant difference (p < 0.001) was found between the pathological group and the control group for the total I-OMES score. The mean I-OMES score improved from 90 (78-102) to 99 (89-103) after myofunctional rehabilitation (p < 0.001). The I-OMES is a reliable and valid tool to evaluate OMD. © 2018 S. Karger AG, Basel.
Kim, Jin Goo; Lee, Joong Yub; Seo, Seung Suk; Choi, Choong Hyeok; Lee, Myung Chul
2013-01-01
Purpose To perform a cross-cultural adaptation and to test the measurement properties of the Korean version of International Knee Documentation Committee (K-IKDC) Subjective Knee Form. Materials and Methods According to the guidelines for cross-cultural adaptation, translation and backward translation of the English version of the IKDC Subjective Knee Form were performed. After translation into the Korean version, 150 patients who had knee-related problems were asked to complete the K-IKDC, Lysholm score, and Short Form-36 (SF-36). Of these patients, 126 were retested 2 weeks later to evaluate test-retest reliability, and 104 were recruited 3 months later to evaluate responsiveness. Construct validity was analyzed by investigating the correlation with Lysholm score and SF-36; content validity was also evaluated. Standardized mean response was calculated for evaluating responsiveness. Results The test-retest reliability proved excellent with a high value for the intraclass correlation coefficient (r=0.94). The internal consistency was strong (Cronbach's α=0.91). Good content validity with absence of floor or ceiling effects and good convergent and divergent validity were observed. Moderate responsiveness was shown (standardized mean response=0.689). Conclusions The K-IKDC demonstrated good measurement properties. We suggest that this instrument is an excellent evaluation instrument that can be used for Korean patients with knee-related injuries. PMID:24032098
Indrebø, Kirsten Lerum; Andersen, John Roger; Natvig, Gerd Karin
2014-01-01
The purpose of this study was to adapt the Ostomy Adjustment Scale to a Norwegian version and to assess its construct validity and 2 components of its reliability (internal consistency and test-retest reliability). One hundred fifty-eight of 217 patients (73%) with a colostomy, ileostomy, or urostomy participated in the study. Slightly more than half (56%) were men. Their mean age was 64 years (range, 26-91 years). All respondents had undergone ostomy surgery at least 3 months before participation in the study. The Ostomy Adjustment Scale was translated into Norwegian according to standard procedures for forward and backward translation. The questionnaire was sent to the participants via regular post. The Cronbach alpha and test-retest were computed to assess reliability. Construct validity was evaluated via correlations between each item and score sums; correlations were used to analyze relationships between the Ostomy Adjustment Scale and the 36-item Short Form Health Survey, the Quality of Life Scale, the Hospital Anxiety & Depression Scale, and the General Self-Efficacy Scale. The Cronbach alpha was 0.93, and test-retest reliability r was 0.69. The average correlation quotient item to sum score was 0.49 (range, 0.31-0.73). Results showed moderate negative correlations between the Ostomy Adjustment Scale and the Hospital Anxiety and Depression Scale (-0.37 and -0.40), and moderate positive correlations between the Ostomy Adjustment Scale and the 36-item Short Form Health Survey, the Quality of Life Scale, and the General Self-Efficacy Scale (0.30-0.45) with the exception of the pain domain in the Short Form 36 (0.28). Regression analysis showed linear associations between the Ostomy Adjustment Scale and sociodemographic and clinical variables with the exception of education. The Norwegian language version of the Ostomy Adjustment Scale was found to possess construct validity, along with internal consistency and test-retest reliability. 
The instrument is sensitive to sociodemographic and clinical variables pertinent to persons with urostomies, colostomies, and ileostomies.
Validity and reliability of the Diagnostic Adaptive Behaviour Scale.
Tassé, M J; Schalock, R L; Balboni, G; Spreat, S; Navas, P
2016-01-01
The Diagnostic Adaptive Behaviour Scale (DABS) is a new standardised adaptive behaviour measure that provides information for evaluating limitations in adaptive behaviour for the purpose of determining a diagnosis of intellectual disability. This article presents validity evidence and reliability data for the DABS. Validity evidence was based on comparing DABS scores with scores obtained on the Vineland Adaptive Behaviour Scale, second edition. The stability of the test scores was measured using a test and retest, and inter-rater reliability was assessed by computing the inter-respondent concordance. The DABS convergent validity coefficients ranged from 0.70 to 0.84, while the test-retest reliability coefficients ranged from 0.78 to 0.95, and the inter-rater concordance as measured by intraclass correlation coefficients ranged from 0.61 to 0.87. All obtained validity and reliability indicators were strong and comparable with the validity and reliability coefficients of the most commonly used adaptive behaviour instruments. These results and the advantages of the DABS for clinician and researcher use are discussed. © 2015 MENCAP and International Association of the Scientific Study of Intellectual and Developmental Disabilities and John Wiley & Sons Ltd.
The Assertiveness Scale for Children.
ERIC Educational Resources Information Center
Peeler, Elizabeth; Rimmer, Susan M.
1981-01-01
Described an assertiveness scale for children developed to assess four dimensions of assertiveness across three categories of interpersonal situations. The scale was administered to elementary and middle school children (N=609) and readministered to students (N=164) to assess test-retest reliability. Test-retest reliability was low while internal…
Measuring cognitive change with ImPACT: the aggregate baseline approach.
Bruce, Jared M; Echemendia, Ruben J; Meeuwisse, Willem; Hutchison, Michael G; Aubry, Mark; Comper, Paul
2017-11-01
The Immediate Post-Concussion Assessment and Cognitive Test (ImPACT) is commonly used to assess baseline and post-injury cognition among athletes in North America. Despite this, several studies have questioned the reliability of ImPACT when given at intervals employed in clinical practice. Poor test-retest reliability reduces test sensitivity to cognitive decline, increasing the likelihood that concussed athletes will be returned to play prematurely. We recently showed that the reliability of ImPACT can be increased when using a new composite structure and the aggregate of two baselines to predict subsequent performance. The purpose of the present study was to confirm our previous findings and determine whether the addition of a third baseline would further increase the test-retest reliability of ImPACT. Data from 97 English speaking professional hockey players who had received at least 4 ImPACT baseline evaluations were extracted from a National Hockey League Concussion Program database. Linear regression was used to determine whether each of the first three testing sessions accounted for unique variance in the fourth testing session. Results confirmed that the aggregate baseline approach improves the psychometric properties of ImPACT, with most indices demonstrating adequate or better test-retest reliability for clinical use. The aggregate baseline approach provides a modest clinical benefit when recent baselines are available - and a more substantial benefit when compared to approaches that obtain baseline measures only once during the course of a multi-year playing career. Pending confirmation in diverse samples, neuropsychologists are encouraged to use the aggregate baseline approach to best quantify cognitive change following sports concussion.
Lin, Shike; Chaiear, Naesinee; Khiewyoo, Jiraporn; Wu, Bin; Johns, Nutjaree Pratheepawanit
2013-03-01
As quality of work-life (QWL) among nurses affects both patient care and institutional standards, assessment regarding QWL for the profession is important. Work-related Quality of Life Scale (WRQOLS) is a reliable QWL assessment tool for the nursing profession. To develop a Chinese version of the WRQOLS-2 and to examine its psychometric properties as an instrument to assess QWL for the nursing profession in China. Forward and back translating procedures were used to develop the Chinese version of WRQOLS-2. Six nursing experts participated in content validity evaluation and 352 registered nurses (RNs) participated in the tests. After a two-week interval, 70 of the RNs were retested. Structural validity was examined by principal components analysis, and Cronbach's alphas were calculated. An independent-samples t-test and the intraclass correlation coefficient were used to analyze known-group validity and test-retest reliability, respectively. One item was rephrased for adaptation to Chinese organizational cultures. The content validity index of the scale was 0.98. Principal components analysis resulted in a seven-factor model, accounting for 62% of total variance, with Cronbach's alphas for subscales ranging from 0.71 to 0.88. Known-group validity was established in the assessment results of the participants in permanent employment vs. contract employment (t = 2.895, p < 0.01). Good test-retest reliability was observed (r = 0.88, p < 0.01). The translated Chinese version of the WRQOLS-2 has sufficient validity and reliability so that it can be used to evaluate the QWL among nurses in mainland China.
Reedman, Sarah Elizabeth; Beagley, Simon; Sakzewski, Leanne; Boyd, Roslyn N
2016-08-01
The aim of this pilot study was to evaluate reproducibility of the Jebsen Taylor Test of Hand Function (JTTHF) in children. Eighty-seven typically developing children 5 to 10 years old were included from five Outside School Hours Care centers in the Greater Brisbane Region, Australia. Hand function was assessed on two occasions with a modified JTTHF, then reproducibility was assessed using Intraclass Correlation Coefficient (ICC [3,1]) and the Standard Error of Measurement (SEM). Total scores for male and female children were not significantly different. Five-year-old children were significantly different to all other age groups and were excluded from further analysis. Results for 71 children, 6 to 10 years old were analyzed (mean age 8.31 years (SD 1.32); 33 males). Test-retest reliability for total scores on the dominant and nondominant hands were ICC 0.74 (95% CI 0.61, 0.83) and ICC 0.72 (95% CI 0.59, 0.82), respectively. 'Writing' and 'Simulated Feeding' subtests demonstrated poor reproducibility. The Smallest Real Difference was 5.09 seconds for total score on the dominant hand. Findings indicate good test-retest reliability for the JTTHF total score to measure hand function in typically developing children aged 6 to 10 years.
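The Standard Error of Measurement and Smallest Real Difference mentioned above are often derived from the ICC and the sample standard deviation. The sketch below assumes the commonly used formulas SEM = SD × sqrt(1 − ICC) and SRD = 1.96 × sqrt(2) × SEM. The abstract does not state which formulas the authors applied, and the standard deviation below is hypothetical, so the output is not expected to reproduce the reported 5.09 seconds.

```python
import math

def standard_error_of_measurement(sd, icc):
    """SEM from the between-subject SD and a reliability coefficient."""
    return sd * math.sqrt(1.0 - icc)

def smallest_real_difference(sem, z=1.96):
    """Smallest change exceeding measurement error at the 95% level."""
    return z * math.sqrt(2.0) * sem

# Hypothetical inputs: SD of 10 s for total time, ICC of 0.74.
sem = standard_error_of_measurement(sd=10.0, icc=0.74)
srd = smallest_real_difference(sem)
print(f"SEM = {sem:.2f} s, SRD = {srd:.2f} s")
```

The factor sqrt(2) reflects that a change score combines the measurement error of two separate assessments.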
Dimensional indicators of generalized anxiety disorder severity for DSM-V.
Niles, Andrea N; Lebeau, Richard T; Liao, Betty; Glenn, Daniel E; Craske, Michelle G
2012-03-01
For DSM-V, simple dimensional measures of disorder severity will accompany diagnostic criteria. The current studies examine convergent validity and test-retest reliability of two potential dimensional indicators of worry severity for generalized anxiety disorder (GAD): percent of the day worried and number of worry domains. In study 1, archival data from diagnostic interviews from a community sample of individuals diagnosed with one or more anxiety disorders (n = 233) were used to assess correlations between percent of the day worried and number of worry domains with other measures of worry severity (clinical severity rating (CSR), age of onset, number of comorbid disorders, Penn state worry questionnaire (PSWQ)) and DSM-IV criteria (excessiveness, uncontrollability and number of physical symptoms). Both measures were significantly correlated with CSR and number of comorbid disorders, and with all three DSM-IV criteria. In study 2, test-retest reliability of percent of the day worried and number of worry domains were compared to test-retest reliability of DSM-IV diagnostic criteria in a non-clinical sample of undergraduate students (n = 97) at a large west coast university. All measures had low test-retest reliability except percent of the day worried, which had moderate test-retest reliability. Findings suggest that these two indicators capture worry severity, and percent of the day worried may be the most reliable existing indicator. These measures may be useful as dimensional measures for DSM-V. Copyright © 2012 Elsevier Ltd. All rights reserved.
2011-01-01
Background Although measures of knowledge translation and exchange (KTE) effectiveness based on the theory of planned behavior (TPB) have been used among patients and providers, no measure has been developed for use among health system policymakers and stakeholders. A tool that measures the intention to use research evidence in policymaking could assist researchers in evaluating the effectiveness of KTE strategies that aim to support evidence-informed health system decision-making. Therefore, we developed a 15-item tool to measure four TPB constructs (intention, attitude, subjective norm and perceived control) and assessed its face validity through key informant interviews. Methods We carried out a reliability study to assess the tool's internal consistency and test-retest reliability. Our study sample consisted of 62 policymakers and stakeholders who participated in deliberative dialogues. We assessed internal consistency using Cronbach's alpha and generalizability (G) coefficients, and we assessed test-retest reliability by calculating Pearson correlation coefficients (r) and G coefficients for each construct and the tool overall. Results The internal consistency of items within each construct was good, with alpha ranging from 0.68 to 0.89. G-coefficients were lower for a single administration (G = 0.34 to G = 0.73) than for the average of two administrations (G = 0.79 to G = 0.89). Test-retest reliability coefficients for the constructs ranged from r = 0.26 to r = 0.77 and from G = 0.31 to G = 0.62 for a single administration, and from G = 0.47 to G = 0.86 for the average of two administrations. Test-retest reliability of the tool using G theory was moderate (G = 0.5) when we generalized across a single observation, but became strong (G = 0.9) when we averaged across both administrations. 
Conclusion This study provides preliminary evidence for the reliability of a tool that can be used to measure TPB constructs in relation to research use in policymaking. Our findings suggest that the tool should be administered on more than one occasion when the intervention promotes an initial 'spike' in enthusiasm for using research evidence (as it seemed to do in this case with deliberative dialogues). The findings from this study will be used to modify the tool and inform further psychometric testing following different KTE interventions. PMID:21702956
Green, Dido; Meroz, Anat; Margalit, Adi Edit; Ratzon, Navah Z
2012-11-01
This study examines a potential instrument for measurement of typing postures of children. This paper describes inter-rater, test-retest reliability and concurrent validity of the Keyboard Personal Computer Style instrument (K-PeCS), an observational measurement of postures and movements during keyboarding, for use with children. Two trained raters independently rated videos of 24 children (aged 7-10 years). Six children returned one week later to assess test-retest reliability. Concurrent validity was assessed by comparing ratings obtained using the K-PeCS to scores from a 3D motion analysis system. Inter-rater reliability was moderate to high for 12 out of 16 items (Kappa: 0.46 to 1.00; correlation coefficients: 0.77-0.95) and test-retest reliability varied across items (Kappa: 0.25 to 0.67; correlation coefficients: r = 0.20 to r = 0.95). Concurrent validity compared favourably across arm pathlength, wrist extension and ulnar deviation. In light of the limitations of other tools, the K-PeCS offers a fairly affordable, reliable and valid instrument to address the gap for measurement of typing styles of children, despite the shortcomings of some items. However, further research is required to refine the instrument for use in evaluating typing among children. Copyright © 2012 Elsevier Ltd and The Ergonomics Society. All rights reserved.
Test-retest reliability of the proposed DSM-5 eating disorder diagnostic criteria
Sysko, Robyn; Roberto, Christina A.; Barnes, Rachel D.; Grilo, Carlos M.; Attia, Evelyn; Walsh, B. Timothy
2012-01-01
The proposed DSM-5 classification scheme for eating disorders includes both major and minor changes to the existing DSM-IV diagnostic criteria. It is not known what effect these modifications will have on the ability to make reliable diagnoses. Two studies were conducted to evaluate the short-term test-retest reliability of the proposed DSM-5 eating disorder diagnoses: anorexia nervosa, bulimia nervosa, binge eating disorder, and feeding and eating conditions not elsewhere classified. Participants completed two independent telephone interviews with research assessors (n=70 Study 1; n=55 Study 2). Fair to substantial agreements (κ= 0.80 and 0.54) were observed across eating disorder diagnoses in Study 1 and Study 2, respectively. Acceptable rates of agreement were identified for the individual eating disorder diagnoses, including DSM-5 anorexia nervosa (κ’s of 0.81 to 0.97), bulimia nervosa (κ=0.84), binge eating disorder (κ’s of 0.75 and 0.61), and feeding and eating disorders not elsewhere classified (κ’s of 0.70 and 0.46). Further, improved short-term test-retest reliability was noted when using the DSM-5, in comparison to DSM-IV, criteria for binge eating disorder. Thus, these studies found that trained interviewers can reliably diagnose eating disorders using the proposed DSM-5 criteria; however, additional data from general practice settings and community samples are needed. PMID:22401974
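The κ values in this abstract are chance-corrected agreement statistics between two independent interviews. As a minimal sketch of how unweighted Cohen's kappa is computed, assuming one categorical diagnosis per participant per interview, the Python below uses a hypothetical function name and made-up diagnosis labels that do not come from the study.

```python
from collections import Counter


def cohens_kappa(first, second):
    """Unweighted Cohen's kappa for two equal-length lists of category labels."""
    n = len(first)
    # Proportion of participants given the same label on both occasions.
    observed = sum(a == b for a, b in zip(first, second)) / n
    # Agreement expected by chance from the marginal label frequencies.
    c1, c2 = Counter(first), Counter(second)
    expected = sum(c1[label] * c2[label] for label in c1) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical diagnoses from two interviews of the same four participants.
interview_1 = ["AN", "AN", "BN", "BN"]
interview_2 = ["AN", "AN", "BN", "AN"]
kappa = cohens_kappa(interview_1, interview_2)
print(f"kappa = {kappa:.2f}")
```

Here three of four diagnoses match (observed agreement 0.75) while chance agreement is 0.50, so kappa is 0.50.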
Validity and reliability of the Turkish Migraine Disability Assessment (MIDAS) questionnaire.
Ertaş, Mustafa; Siva, Aksel; Dalkara, Turgay; Uzuner, Nevzat; Dora, Babür; Inan, Levent; Idiman, Fethi; Sarica, Yakup; Selçuki, Deniz; Sirin, Hadiye; Oğuzhanoğlu, Atilla; Irkeç, Ceyla; Ozmenoğlu, Mehmet; Ozbenli, Taner; Oztürk, Musa; Saip, Sabahattin; Neyal, Münife; Zarifoğlu, Mehmet
2004-09-01
The aim of this study is to assess the comprehensibility, internal consistency, patient-physician reliability, test-retest reliability, and validity of the Turkish version of the Migraine Disability Assessment (MIDAS) questionnaire in patients with headache. The MIDAS questionnaire was developed by Stewart et al and has been shown to be reliable and valid to determine the degree of disability caused by migraine. This study was designed as a national multicenter study to demonstrate the reliability and validity of the Turkish version of the MIDAS questionnaire. Patients applying to 17 Neurology Clinics in Turkey were evaluated at the baseline (visit 1), week 4 (visit 2), and week 12 (visit 3) visits in terms of disease severity and comprehensibility, internal consistency, test-retest reliability, and validity of MIDAS. Since the severity of the disease was found to change significantly at visit 2 compared to visit 1, test-retest reliability was assessed using the MIDAS scores of a subgroup of patients whose disease severity remained unchanged (up to +/-3 days difference in the number of days with headache between visits 1 and 2). A total of 306 patients (86.2% female, mean age: 35.0 +/- 9.8 years) were enrolled into the study. A total of 65.7%, 77.5%, 82.0% of patients reported that "they had fully understood the MIDAS questionnaire" in visits 1, 2, and 3, respectively. A strong positive correlation was found between the physician-applied and patient-applied total MIDAS scores at all three visits (Spearman correlation coefficients were R= 0.87, 0.83, and 0.90, respectively, P <.001). Internal consistency of MIDAS was assessed using Cronbach's alpha and was found to be at acceptable (>0.7) or excellent (>0.8) levels in patient- and physician-applied MIDAS scores, respectively. The total MIDAS score showed good test-retest reliability (R= 0.68).
Both the number of days with headache and the total MIDAS scores were positively correlated at all visits with correlation coefficients between 0.47 and 0.63. There was also a moderate degree of correlation (R= 0.54) between the total MIDAS score at week 12 and the number of days with headache at visit 2 + visit 3, which quantify headache-related disability over a 3-month period similar to MIDAS questionnaire. These findings demonstrated that the Turkish translation is equivalent to the English version of MIDAS in terms of internal consistency, test-retest reliability, and validity. Physicians can reliably use the Turkish translation of the MIDAS questionnaire in defining the severity of illness and its treatment strategy when applied as a self-administered report by migraine patients themselves.
Duncan, Laura; Georgiades, Kathy; Wang, Li; Van Lieshout, Ryan J; MacMillan, Harriet L; Ferro, Mark A; Lipman, Ellen L; Szatmari, Peter; Bennett, Kathryn; Kata, Anna; Janus, Magdalena; Boyle, Michael H
2017-12-04
The goals of the study were to examine test-retest reliability, informant agreement and convergent and discriminant validity of nine DSM-IV-TR psychiatric disorders classified by parent and youth versions of the Mini International Neuropsychiatric Interview for Children and Adolescents (MINI-KID). Using samples drawn from the general population and child mental health outpatient clinics, 283 youth aged 9 to 18 years and their parents separately completed the MINI-KID with trained lay interviewers on two occasions 7 to 14 days apart. Test-retest reliability estimates based on kappa (κ) ranged from 0.33 to 0.79 across disorders, samples and informants. Parent-youth agreement on disorders was low (average κ = 0.20). Confirmatory factor analysis provided evidence supporting convergent and discriminant validity. The MINI-KID disorder classifications yielded estimates of test-retest reliability and validity comparable to other standardized diagnostic interviews in both general population and clinic samples. These findings, in addition to the brevity and low administration cost, make the MINI-KID a good candidate for use in epidemiological research and clinical practice. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
ERIC Educational Resources Information Center
Power, Allan; Faught, Brent E.; Przysucha, Eryk; McPherson, Moira; Montelpare, William
2012-01-01
In this study the authors examine the test-retest reliability and concurrent validity of the Repeat Ice Skating Test (RIST). This was an on-ice field anaerobic test that measured average peak power and was validated with 3 anaerobic lab tests: (a) vertical jump, (b) the Margaria-Kalamen stair test, and (c) the Wingate Anaerobic Test. The…
Steenson, Sharalyn; Özcebe, Hilal; Arslan, Umut; Konşuk Ünlü, Hande; Araz, Özgür M; Yardim, Mahmut; Üner, Sarp; Bilir, Nazmi; Huang, Terry T-K
2018-01-01
Childhood obesity rates have been rising rapidly in developing countries. A better understanding of the risk factors and social context is necessary to inform public health interventions and policies. This paper describes the validation of several measurement scales for use in Turkey, which relate to child and parent perceptions of physical activity (PA) and enablers and barriers of physical activity in the home environment. The aim of this study was to assess the validity and reliability of several measurement scales in Turkey using a population sample across three socio-economic strata in the Turkish capital, Ankara. Surveys were conducted in Grade 4 children (mean age = 9.7 years for boys; 9.9 years for girls), and their parents, across 6 randomly selected schools, stratified by SES (n = 641 students, 483 parents). Construct validity of the scales was evaluated through exploratory and confirmatory factor analysis. Internal consistency of scales and test-retest reliability were assessed by Cronbach's alpha and intra-class correlation. The scales as a whole were found to have acceptable-to-good model fit statistics (PA Barriers: RMSEA = 0.076, SRMR = 0.0577, AGFI = 0.901; PA Outcome Expectancies: RMSEA = 0.054, SRMR = 0.0545, AGFI = 0.916, and PA Home Environment: RMSEA = 0.038, SRMR = 0.0233, AGFI = 0.976). The PA Barriers subscales showed good internal consistency and poor to fair test-retest reliability (personal α = 0.79, ICC = 0.29, environmental α = 0.73, ICC = 0.59). The PA Outcome Expectancies subscales showed good internal consistency and test-retest reliability (negative α = 0.77, ICC = 0.56; positive α = 0.74, ICC = 0.49). Only the PA Home Environment subscale on support for PA was validated in the final confirmatory model; it showed moderate internal consistency and test-retest reliability (α = 0.61, ICC = 0.48). This study is the first to validate measures of perceptions of physical activity and the physical activity home environment in Turkey. 
Our results support the originally hypothesized two-factor structures for Physical Activity Barriers and Physical Activity Outcome Expectancies. However, we found the one-factor rather than two-factor structure for Physical Activity Home Environment had the best model fit. This study provides general support for the use of these scales in Turkey in terms of validity, but test-retest reliability warrants further research.
Development of an International Odor Identification Test for Children: The Universal Sniff Test.
Schriever, Valentin A; Agosin, Eduardo; Altundag, Aytug; Avni, Hadas; Cao Van, Helene; Cornejo, Carlos; de Los Santos, Gonzalo; Fishman, Gad; Fragola, Claudio; Guarneros, Marco; Gupta, Neelima; Hudson, Robyn; Kamel, Reda; Knaapila, Antti; Konstantinidis, Iordanis; Landis, Basile N; Larsson, Maria; Lundström, Johan N; Macchi, Alberto; Mariño-Sánchez, Franklin; Martinec Nováková, Lenka; Mori, Eri; Mullol, Joaquim; Nord, Marie; Parma, Valentina; Philpott, Carl; Propst, Evan J; Rawan, Ahmed; Sandell, Mari; Sorokowska, Agnieszka; Sorokowski, Piotr; Sparing-Paschke, Lisa-Marie; Stetzler, Carolin; Valder, Claudia; Vodicka, Jan; Hummel, Thomas
2018-07-01
To assess olfactory function in children and to create and validate an odor identification test to diagnose olfactory dysfunction in children, which we called the Universal Sniff (U-Sniff) test. This is a multicenter study involving 19 countries. The U-Sniff test was developed in 3 phases including 1760 children age 5-7 years. Phase 1: identification of potentially recognizable odors; phase 2: selection of odorants for the odor identification test; and phase 3: evaluation of the test and acquisition of normative data. Test-retest reliability was evaluated in a subgroup of children (n = 27), and the test was validated using children with congenital anosmia (n = 14). Twelve odors were familiar to children and, therefore, included in the U-Sniff test. Children scored a mean ± SD of 9.88 ± 1.80 points out of 12. Normative data were obtained and reported for each country. The U-Sniff test demonstrated a high test-retest reliability (r₂₇ = 0.83, P < .001) and enabled discrimination between normosmia and children with congenital anosmia with a sensitivity of 100% and specificity of 86%. The U-Sniff is a valid and reliable method of testing olfaction in children and can be used internationally. Copyright © 2018 Elsevier Inc. All rights reserved.
Reliability of the ecSatter Inventory as a tool to measure eating competence.
Stotts, Jodi L; Lohse, Barbara
2007-01-01
To examine the reliability of the ecSatter Inventory (ecSI), a measure of eating competence. Self-report questionnaires were administered in person or by mail. Retesting occurred 2 to 6 weeks after completion of the first questionnaire. Both administrations of the questionnaire were completed by 259 participants who were mostly food secure, white females with some college education; mean age was 26.9 +/- 10.4 years. Test-retest reliability and internal consistency. Spearman's rank correlation coefficients to estimate test-retest reliability and Cronbach alpha coefficients to estimate internal consistency. Spearman's rank correlation coefficient for ecSI total score was 0.68; subscale coefficients were 0.70 for eating attitudes, 0.70 for contextual skills, 0.65 for food acceptance, and 0.52 for internal regulation. Cronbach alpha coefficient for ecSI total score was 0.77. Subscale alphas coefficients were 0.80 for eating attitudes, 0.69 for contextual skills, 0.68 for food acceptance, and 0.66 for internal regulation. This study provides psychometric evidence about the reliability of ecSI as a measure of eating competence in this sample. Although some ecSI items may require revision, results suggest that the instrument may be used to evaluate nutrition education designed to improve eating competence.
Duncan, Laura; Comeau, Jinette; Wang, Li; Vitoroulis, Irene; Boyle, Michael H; Bennett, Kathryn
2018-02-19
A better understanding of factors contributing to the observed variability in estimates of test-retest reliability in published studies on standardized diagnostic interviews (SDI) is needed. The objectives of this systematic review and meta-analysis were to estimate the pooled test-retest reliability for parent and youth assessments of seven common disorders, and to examine sources of between-study heterogeneity in reliability. Following a systematic review of the literature, multilevel random effects meta-analyses were used to analyse 202 reliability estimates (Cohen's kappa = κ) from 31 eligible studies and 5,369 assessments of 3,344 children and youth. Pooled reliability was moderate at κ = .58 (95% CI 0.53-0.63) and between-study heterogeneity was substantial (Q = 2,063 (df = 201), p < .001 and I² = 79%). In subgroup analysis, reliability varied across informants for specific types of psychiatric disorder (κ = .53-.69 for parent vs. κ = .39-.68 for youth) with estimates significantly higher for parents on attention deficit hyperactivity disorder, oppositional defiant disorder and the broad groupings of externalizing and any disorder. Reliability was also significantly higher in studies with indicators of poor or fair study methodology quality (sample size <50, retest interval <7 days). Our findings raise important questions about the meaningfulness of published evidence on the test-retest reliability of SDIs and the usefulness of these tools in both clinical and research contexts. Potential remedies include the introduction of standardized study and reporting requirements for reliability studies, and exploration of other approaches to assessing and classifying child and adolescent psychiatric disorder. © 2018 Association for Child and Adolescent Mental Health.
Test-Retest Reliability and Practice Effects of the Stability Evaluation Test.
Williams, Richelle M; Corvo, Matthew A; Lam, Kenneth C; Williams, Travis A; Gilmer, Lesley K; McLeod, Tamara C Valovich
2017-01-17
Postural control plays an essential role in concussion evaluation. The Stability Evaluation Test (SET) aims to objectively analyze postural control by measuring sway velocity on the NeuroCom's VSR portable force platform (Natus, San Carlos, CA). To assess the test-retest reliability and practice effects of the SET protocol. Cohort. Research Laboratory. Fifty healthy adults (males=20, females=30, age=25.30±3.60 years, height=166.60±12.80 cm, mass=68.80±13.90 kg). All participants completed four trials of the SET. Each trial consisted of six 20-second balance tests with eyes closed, under the following conditions: double-leg firm (DFi), single-leg firm (SFi), tandem firm (TFi), double-leg foam (DFo), single-leg foam (SFo), and tandem foam (TFo). Each trial was separated by a 5-minute seated rest period. The dependent variable was sway velocity (deg/sec), with lower values indicating better balance. Sway velocity was recorded for each of the six conditions as well as a composite score for each trial. Test-retest reliability was analyzed across four trials with Intraclass Correlation Coefficients. Practice effects were analyzed with repeated measures analysis of variance, followed by Tukey post-hoc comparisons for any significant main effects (p<.05). Sway velocity reliability values were good to excellent: DFi (ICC=0.88;95%CI:0.81,0.92), SFi (ICC=0.75;95%CI:0.61,0.85), TFi (ICC=0.84;95%CI:0.75,0.90), DFo (ICC=0.83;95%CI:0.74,0.90), SFo (ICC=0.82;95%CI:0.72,0.89), TFo (ICC=0.81;95%CI:0.69,0.88), and composite score (ICC=0.93;95%CI:0.88,0.95). Significant practice effects (p<.05) were noted on the SFi, DFo, SFo, TFo conditions, and composite scores. Our results suggest the SET has good to excellent reliability for the assessment of postural control in healthy adults. Due to the practice effects noted, a familiarization session is recommended (i.e., all 6 conditions) prior to recording the data.
Future studies should evaluate injured patients to determine meaningful change scores during various injuries.
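The abstract reports intraclass correlation coefficients across four repeated trials but does not say which ICC form was used. As a minimal sketch, assuming the two-way random-effects, absolute-agreement, single-measure form often written ICC(2,1), the Python below computes it from a subjects-by-trials matrix. The function name `icc_2_1` and the score matrix are illustrative and are not data from the study.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single measure.

    ratings: list of subjects, each a list of k repeated scores (no missing data).
    """
    n = len(ratings)             # number of subjects
    k = len(ratings[0])          # repeated trials per subject
    grand = sum(map(sum, ratings)) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(col) / n for col in zip(*ratings)]

    # Two-way ANOVA sums of squares (subjects, trials, residual).
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Illustrative scores: six subjects measured on four occasions.
example = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
icc = icc_2_1(example)
print(f"ICC(2,1) = {icc:.2f}")
```

Because this form penalizes systematic shifts between trials (the trial mean square enters the denominator), practice effects such as those noted in the abstract would tend to lower it relative to a consistency-type ICC.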
Translation and validation of the Dutch new Knee Society Scoring System ©.
Van Der Straeten, Catherine; Witvrouw, Erik; Willems, Tine; Bellemans, Johan; Victor, Jan
2013-11-01
A new version of The Knee Society Knee Scoring System(©) (KSS) has recently been developed. Before this scale can be used in non-English-speaking populations, it has to be translated and validated for a particular population. We evaluated the construct and content validity, the test-retest reliability, and the internal consistency of the Dutch version of the New Knee Society KSS. A Dutch translation was performed using a forward-backward translation protocol. We tested the construct validity of the Dutch New KSS by comparing it with the Dutch versions of the WOMAC, Knee Injury and Osteoarthritis Outcome Score (KOOS), and SF-12 scores in 137 patients undergoing total knee arthroplasty (TKA). Content validity was assessed by comparing pre- and postoperative scores and by checking floor and ceiling effects. To evaluate test-retest reliability and consistency, 47 patients completed the questionnaire a second time with a mean of 8 days interval (range, 2-20 days) between tests. Construct validity was demonstrated because the Dutch New KSS correlated well with the Dutch WOMAC (r = -0.751; p < 0.001), Dutch KOOS (r = -0.723; p < 0.001), and Dutch SF-12 (r = 0.569; p < 0.001). There was a significant difference between pre- and postoperative scores (p < 0.001) in line with the other scores. Test-retest reliability proved excellent with an intraclass correlation coefficient between 0.73 and 0.92 depending on the domain tested. Consistency as indicated by Cronbach's alpha ranging from 0.84 to 0.96 was good to excellent. As demonstrated by the validation procedure, the Dutch New KSS is an excellent instrument to evaluate TKA outcome in Dutch-speaking patients.
Environmental education curriculum evaluation questionnaire: A reliability and validity study
NASA Astrophysics Data System (ADS)
Minner, Daphne Diane
The intention of this research project was to bridge the gap between social science research and application to the environmental domain through the development of a theoretically derived instrument designed to give educators a template by which to evaluate environmental education curricula. The theoretical base for instrument development was provided by several developmental theories such as Piaget's theory of cognitive development, Developmental Systems Theory, Life-span Perspective, as well as curriculum research within the area of environmental education. This theoretical base fueled the generation of a list of components which were then translated into a questionnaire with specific questions relevant to the environmental education domain. The specific research question for this project is: Can a valid assessment instrument based largely on human development and education theory be developed that reliably discriminates high, moderate, and low quality in environmental education curricula? The types of analyses conducted to answer this question were interrater reliability (percent agreement, Cohen's Kappa coefficient, Pearson's Product-Moment correlation coefficient), test-retest reliability (percent agreement, correlation), and criterion-related validity (correlation). Face validity and content validity were also assessed through thorough reviews. Overall results indicate that 29% of the questions on the questionnaire demonstrated a high level of interrater reliability and 43% of the questions demonstrated a moderate level of interrater reliability. Seventy-one percent of the questions demonstrated a high test-retest reliability and 5% a moderate level. Fifty-five percent of the questions on the questionnaire were reliable (high or moderate) both across time and raters. Only eight questions (8%) did not show either interrater or test-retest reliability. 
The global overall rating of high, medium, or low quality was reliable across both coders and time, indicating that the questionnaire can discriminate differences in quality of environmental education curricula. Of the 35 curricula evaluated, 6 were high quality, 14 were medium quality and 15 were low quality. The criterion-related validity of the instrument cannot currently be established because of the lack of comparable measures or a concretely usable set of multidisciplinary standards. Face and content validity were sufficiently demonstrated.
Hanson, Lisa C; Taylor, Nicholas F; McBurney, Helen
2016-09-01
To determine the retest reliability of the 10m incremental shuttle walk test (ISWT) in a mixed cardiac rehabilitation population. Participants completed two 10m ISWTs in a single session in a repeated measures study. Ten participants completed a third 10m ISWT as part of a pilot study. Hospital physiotherapy department. 62 adults with a mean age of 68 years (SD 10) referred to a cardiac rehabilitation program. Retest reliability of the 10m ISWT expressed as relative reliability and measurement error. Relative reliability was expressed as an intraclass correlation coefficient (ICC) and measurement error as the standard error of measurement (SEM) and 95% confidence intervals for the group and individual. There was a high level of relative reliability over the two walks with an ICC of .99. The SEM(agreement) was 17m, and a change of at least 23m for the group and 54m for the individual would be required to be 95% confident of exceeding measurement error. The 10m ISWT demonstrated good retest reliability and is sufficiently reliable to be applied in practice in this population without the use of a practice test. Copyright © 2015 Chartered Society of Physiotherapy. Published by Elsevier Ltd. All rights reserved.
Varikuti, Deepthi P; Hoffstaedter, Felix; Genon, Sarah; Schwender, Holger; Reid, Andrew T; Eickhoff, Simon B
2017-04-01
Resting-state functional connectivity analysis has become a widely used method for the investigation of human brain connectivity and pathology. The measurement of neuronal activity by functional MRI, however, is impeded by various nuisance signals that reduce the stability of functional connectivity. Several methods exist to address this predicament, but little consensus has yet been reached on the most appropriate approach. Given the crucial importance of reliability for the development of clinical applications, we here investigated the effect of various confound removal approaches on the test-retest reliability of functional-connectivity estimates in two previously defined functional brain networks. Our results showed that gray matter masking improved the reliability of connectivity estimates, whereas denoising based on principal components analysis reduced it. We additionally observed that refraining from using any correction for global signals provided the best test-retest reliability, but failed to reproduce anti-correlations between what have been previously described as antagonistic networks. This suggests that improved reliability can come at the expense of potentially poorer biological validity. Consistent with this, we observed that reliability was proportional to the retained variance, which presumably included structured noise, such as reliable nuisance signals (for instance, noise induced by cardiac processes). We conclude that compromises are necessary between maximizing test-retest reliability and removing variance that may be attributable to non-neuronal sources.
Varikuti, Deepthi P.; Hoffstaedter, Felix; Genon, Sarah; Schwender, Holger; Reid, Andrew T.; Eickhoff, Simon B.
2016-01-01
Resting-state functional connectivity analysis has become a widely used method for the investigation of human brain connectivity and pathology. The measurement of neuronal activity by functional MRI, however, is impeded by various nuisance signals that reduce the stability of functional connectivity. Several methods exist to address this predicament, but little consensus has yet been reached on the most appropriate approach. Given the crucial importance of reliability for the development of clinical applications, we here investigated the effect of various confound removal approaches on the test-retest reliability of functional-connectivity estimates in two previously defined functional brain networks. Our results showed that grey matter masking improved the reliability of connectivity estimates, whereas de-noising based on principal components analysis reduced it. We additionally observed that refraining from using any correction for global signals provided the best test-retest reliability, but failed to reproduce anti-correlations between what have been previously described as antagonistic networks. This suggests that improved reliability can come at the expense of potentially poorer biological validity. Consistent with this, we observed that reliability was proportional to the retained variance, which presumably included structured noise, such as reliable nuisance signals (for instance, noise induced by cardiac processes). We conclude that compromises are necessary between maximizing test-retest reliability and removing variance that may be attributable to non-neuronal sources. PMID:27550015
The Reliability and Validity of Measures of Gait Variability in Community-Dwelling Older Adults
Brach, Jennifer S.; Perera, Subashan; Studenski, Stephanie; Newman, Anne B.
2009-01-01
Objective To examine the test-retest reliability and concurrent validity of variability of gait characteristics. Design Cross-sectional study. Setting Research laboratory. Participants Older adults (N=558) from the Cardiovascular Health Study. Interventions Not applicable. Main Outcome Measures Gait characteristics were measured using a 4-m computerized walkway. SDs determined from the steps recorded were used as the measures of variability. Intraclass correlation coefficients (ICC) were calculated to examine test-retest reliability of a 4-m walk and two 4-m walks. To establish concurrent validity, the measures of gait variability were compared across levels of health, functional status, and physical activity using independent t tests and analyses of variance. Results Gait variability measures from the two 4-m walks demonstrated greater test-retest reliability than those from the single 4-m walk (ICC=.40–.63 and ICC=.22–.48, respectively). Greater step length and stance time variability were associated with poorer health, functional status and physical activity (P<.05). Conclusions Gait variability calculated from a limited number of steps has fair to good test-retest reliability and concurrent validity. Reliability of gait variability calculated from a greater number of steps should be assessed to determine if the consistency can be improved. PMID:19061741
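The improvement from one walk to two is roughly what the Spearman-Brown prophecy formula would predict when a measurement is lengthened. The sketch below is only an illustrative check, not the authors' analysis: it takes the lower reported range (ICC=.22–.48) as the single-walk values and projects the reliability of a measure based on twice as many steps. The function name is hypothetical.

```python
def spearman_brown(r_single, k):
    """Projected reliability when a measure is lengthened k-fold (Spearman-Brown)."""
    return k * r_single / (1 + (k - 1) * r_single)


# Lower and upper single-walk ICCs taken from the abstract.
low = spearman_brown(0.22, 2)
high = spearman_brown(0.48, 2)
print(f"projected two-walk range: {low:.2f} to {high:.2f}")
```

The projected range, about .36 to .65, is close to the .40–.63 reported for two walks, which fits the authors' suggestion that more steps may improve consistency.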
Test-retest reliability of the Military Pre-training Questionnaire.
Robinson, M; Stokes, K; Bilzon, J; Standage, M; Brown, P; Thompson, D
2010-09-01
Musculoskeletal injuries are a significant cause of morbidity during military training. A brief, inexpensive and user-friendly tool that demonstrates reliability and validity is warranted to effectively monitor the relationship between multiple predictor variables and injury incidence in military populations. To examine the test-retest reliability of the Military Pre-training Questionnaire (MPQ), designed specifically to assess risk factors for injury among military trainees across five domains (physical activity, injury history, diet, alcohol and smoking). Analyses were based on a convenience sample of 58 male British Army trainees. Kappa (κ), weighted kappa (κw) and intraclass correlation coefficients (ICC) were used to evaluate the 2-week test-retest reliability of the MPQ. For index measures constituting the assessment of a given construct, internal consistency was assessed by Cronbach's alpha (α) coefficients. Reliability of individual items ranged from poor to almost perfect (κ range = 0.45-0.86; κw range = 0.11-0.91; ICC range = 0.34-0.86) with most items demonstrating moderate reliability. Overall scores related to physical activity, diet, alcohol and smoking constructs were reliable between both administrations (ICC = 0.63-0.85). Support for the internal consistency of the incorporated alcohol (α = 0.78) and cigarette (α = 0.75) scales was also provided. The MPQ is a reliable self-report instrument for assessing multiple injury-related risk factors during initial military training. Further assessment of the psychometric properties of the MPQ (e.g. different types of validity) with military populations/samples will support its interpretation and use in future surveillance and epidemiological studies.
Retest Reliability of the Rosenzweig Picture-Frustration Study and Similar Semiprojective Techniques
ERIC Educational Resources Information Center
Rosenzweig, Saul; And Others
1975-01-01
The research dealing with the reliability of the Rosenzweig Picture-Frustration Study is surveyed. Analysis of various split-half, and retest procedures are reviewed and their relative effectiveness evaluated. Reliability measures as applied to projective techniques in general are discussed. (Author/DEP)
Arunakul, Marut; Arunakul, Preeyaphan; Suesiritumrong, Chakhrist; Angthong, Chayanin; Chernchujit, Bancha
2015-06-01
Self-administered questionnaires have become an important part of clinical outcome assessment for foot and ankle-related problems. The Foot and Ankle Ability Measure (FAAM) subjective form is a widely used region-specific questionnaire that has shown sufficient validity and reliability in previous studies. The aim was to translate the original English version of the FAAM into Thai and to evaluate the validity and reliability of the Thai FAAM in patients with foot and ankle-related problems. The FAAM subjective form was translated into Thai using a forward-backward translation protocol. Afterward, reliability and validity were tested using responses from 60 consecutive patients on two questionnaires, the Thai FAAM subjective form and the Short Form (SF)-36. Validity was tested by correlating the scores from the two questionnaires. Reliability was assessed by measuring test-retest reliability and internal consistency. The Thai FAAM scores, including the activities of daily living (ADL) and Sport subscales, demonstrated sufficient correlations with the physical functioning (PF) and physical composite score (PCS) domains of the SF-36 (statistically significant at the p < 0.001 level, with values ≥ 0.5). Test-retest reliability was high, with intra-class correlation coefficients of 0.8 and 0.77, respectively. The internal consistency was strong (Cronbach alpha = 0.94 and 0.88, respectively). The Thai version of the FAAM subjective form retained the characteristics of the original version and proved to be a reliable evaluation instrument for patients with foot and ankle-related problems.
Alghadir, Ahmad; Anwer, Shahnawaz; Iqbal, Zaheen Ahmed; Alsanawi, Hisham Abdulaziz
2016-01-01
We adapted the reduced Western Ontario and McMaster Universities Osteoarthritis (WOMAC) index for the Arabic language and tested its metric properties in patients with knee osteoarthritis (OA). One hundred and twenty-one consecutive patients who were referred for physiotherapy to the outpatient department were asked to answer the Arabic version of the reduced WOMAC index (ArWOMAC). After the completion of the ArWOMAC, the intensity of knee pain and general health status were assessed using the visual analog scale (VAS) and the 12-item short form health survey (SF-12), respectively. A second assessment was performed at least 48 h after the first session to assess test-retest reliability. The test-retest reliability was quantified using the intra-class correlation coefficient (ICC), and Cronbach's alpha was calculated to assess the internal consistency of the Arabic questionnaire. The construct validity was assessed using Spearman rank correlation coefficients. The total ArWOMAC scale and pain and function subscales were internally consistent with Cronbach's coefficient alpha of 0.91, 0.89 and 0.90, respectively. Test-retest reliability was good to excellent with ICC of 0.91, 0.89 and 0.90, respectively. SF-12 and VAS scores correlated significantly with the ArWOMAC index (p < 0.01), which supports the construct validity. The standard error of measurement (SEM) of the total scale was 2.94, based on repeated measurements for test-retest. The minimum detectable change based on the SEM for test-retest was 8.15. The ArWOMAC index is a reliable and valid instrument for evaluating the severity of knee OA, with metric properties in agreement with the original version. Although the reduced WOMAC index has been clinically utilized within the Saudi population, the Arabic version of this instrument is not validated for an Arab population to measure lower limb functional disability caused by OA.
The Arabic version of reduced WOMAC (ArWOMAC) index is a reliable and valid scale to measure lower limb functional disability in patients with knee OA. The ArWOMAC index could be suitable in Saudi Arabia and other Arab countries where the language, culture and the life style are similar.
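The reported minimum detectable change can be reproduced from the reported SEM. The sketch below assumes the conventional 95% formula, MDC = 1.96 × √2 × SEM; the abstract does not state which formula the authors used, and the function name is illustrative only.

```python
import math


def minimal_detectable_change(sem: float, z: float = 1.96) -> float:
    """MDC = z * sqrt(2) * SEM.

    The sqrt(2) term reflects measurement error in both the test and the retest.
    z = 1.96 gives the 95% level (an assumption; use 1.645 for a 90% level).
    """
    return z * math.sqrt(2) * sem


# SEM of the total ArWOMAC scale as reported in the abstract
print(round(minimal_detectable_change(2.94), 2))  # 8.15
```

Under this assumption the result matches the 8.15 given in the abstract.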
Palmer, S; Manns, S; Cramp, F; Lewis, R; Clark, E M
2017-12-01
The Bristol Impact of Hypermobility (BIoH) questionnaire is a patient-reported outcome measure developed in conjunction with adults with Joint Hypermobility Syndrome (JHS). It has demonstrated strong concurrent validity with the Short Form-36 (SF-36) physical component score but other psychometric properties have yet to be established. This study aimed to determine its test-retest reliability and smallest detectable change (SDC). A test-retest reliability study. Participants were recruited from the Hypermobility Syndromes Association, a patient organisation in the United Kingdom. Recruitment packs were sent to 1080 adults who had given permission to be contacted about research. BIoH and SF-36 questionnaires were administered at baseline and repeated two weeks later. An 11-point global rating of change scale (-5 to +5) was also administered at two weeks. Test-retest analysis and calculation of the SDC was conducted on 'stable' patients (defined as global rating of change -1 to +1). 462 responses were received. 233 patients reported a 'stable' condition and were included in analysis (95% women; mean (SD) age 44.5 (13.9) years; BIoH score 223.6 (54.0)). The BIoH questionnaire demonstrated excellent test-retest reliability (ICC 0.923, 95% CI 0.900-0.940). The SDC was 42 points (equivalent to 19% of the mean baseline score). The SF-36 physical and mental component scores demonstrated poorer test-retest reliability and larger SDCs (as a proportion of the mean baseline scores). The results provide further evidence of the potential of the BIoH questionnaire to underpin research and clinical practice for people with JHS. Copyright © 2017 Elsevier Ltd. All rights reserved.
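As a rough plausibility check, the SDC can be approximated from the reported ICC and baseline SD. This is a minimal sketch assuming SEM = SD × √(1 − ICC) and SDC = 1.96 × √2 × SEM; the abstract does not say how the authors derived their SDC, so the formulas and function names here are assumptions.

```python
import math


def sem_from_icc(sd: float, icc: float) -> float:
    """Standard error of measurement estimated from a sample SD and an ICC."""
    return sd * math.sqrt(1.0 - icc)


def smallest_detectable_change(sem: float, z: float = 1.96) -> float:
    """SDC at the individual level: z * sqrt(2) * SEM."""
    return z * math.sqrt(2) * sem


# Values reported in the abstract: SD 54.0, ICC 0.923, mean baseline score 223.6
sem = sem_from_icc(54.0, 0.923)
sdc = smallest_detectable_change(sem)
print(round(sdc), round(100 * sdc / 223.6))  # 42 19
```

Under these assumptions the estimate agrees with the reported 42 points and 19% of the mean baseline score.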
ERIC Educational Resources Information Center
van Kernebeek, Willem G.; de Schipper, Antoine W.; Savelsbergh, Geert J. P.; Toussaint, Huub M.
2018-01-01
In The Netherlands, the 4-Skills Scan is an instrument for physical education teachers to assess gross motor skills of elementary school children. Little is known about its reliability. Therefore, in this study the test-retest and inter-rater reliability was determined. Respectively, 624 and 557 Dutch 6- to 12-year-old children were analyzed for…
Examination of the Test-Retest Reliability of a Computerized Neurocognitive Test Battery.
Nakayama, Yusuke; Covassin, Tracey; Schatz, Philip; Nogle, Sally; Kovan, Jeff
2014-08-01
Test-retest reliability is a critical issue in the utility of computer-based neurocognitive assessment paradigms employing baseline and postconcussion tests. Researchers have reported low test-retest reliability for the Immediate Post Concussion Assessment and Cognitive Testing (ImPACT) across an interval of 45 and 50 days. To re-examine the test-retest reliability of the ImPACT between baseline, 45 days, and 50 days. Descriptive laboratory study. Eighty-five physically active college students (51 male, 34 female) volunteered for this study. Participants completed the ImPACT as well as a 15-item memory test at baseline, 45 days, and 50 days. Intraclass correlation coefficients (ICCs) were calculated for ImPACT composite scores, and change scores were calculated using reliable change indices (RCIs) and regression-based methods (RBMs) at 80% and 95% confidence intervals (CIs). The respective ICCs for baseline to day 45, day 45 to day 50, baseline to day 50, and overall were as follows: verbal memory (0.76, 0.69, 0.65, and 0.78), visual memory (0.72, 0.66, 0.60, and 0.74), visual motor (processing) speed (0.87, 0.88, 0.85, and 0.91), and reaction time (0.67, 0.81, 0.71, and 0.80). All ICCs exceeded the threshold value of 0.60 for acceptable test-retest reliability. All cases fell well within the 80% CI for both the RCI and RBM, while 1% to 5% of cases fell outside the 95% CI for the RCI and 1% for the RBM. Results suggest that the ImPACT is a reliable neurocognitive test battery at 45 and 50 days after the baseline assessment. The current findings agree with those of other reliability studies that have reported acceptable ICCs across 30-day to 1-year testing intervals, and they support the utility of the ImPACT for the multidisciplinary approach to concussion management. This study suggests that the computerized neurocognitive test battery, ImPACT, is a reliable test for postconcussion serial assessments. 
However, when managing concussed athletes, the ImPACT should not be used as a stand-alone measure. © 2014 The Author(s).
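A reliable change index of the kind mentioned above can be sketched as follows. This assumes the Jacobson-Truax form, with the standard error of the difference computed as SD × √2 × √(1 − r); the study may have used a different variant, and the scores and SD in the example are hypothetical.

```python
import math


def reliable_change_index(baseline: float, retest: float, sd: float, r: float) -> float:
    """RCI = (retest - baseline) / SE_diff, with SE_diff = sd * sqrt(2) * sqrt(1 - r)."""
    se_diff = sd * math.sqrt(2) * math.sqrt(1.0 - r)
    return (retest - baseline) / se_diff


# Hypothetical athlete: composite score drops from 90 to 80.
# sd = 10 is made up; r = 0.76 is the verbal memory ICC (baseline to day 45) from the abstract.
rci = reliable_change_index(baseline=90, retest=80, sd=10, r=0.76)

# Two-sided critical values for 80% and 95% confidence intervals
for label, z in (("80%", 1.282), ("95%", 1.960)):
    print(label, "change flagged" if abs(rci) > z else "within expected variation")
```

With these made-up numbers the change is flagged at the 80% level but not at the 95% level, which illustrates why the choice of confidence interval matters when classifying individual cases.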
Test-retest reliability of posture measurements in adolescents with idiopathic scoliosis.
Heitz, Pierre-Henri; Aubin-Fournier, Jean-François; Parent, Éric; Fortin, Carole
2018-05-07
Posture changes are a major consequence of idiopathic scoliosis (IS). Posture changes can lead to psychosocial and physical impairments in adolescents with IS. Therefore, it is important to assess posture but the test-retest reliability of posture measurements still remains unknown in this population. The primary objective was to determine the test-retest reliability of 25 head and trunk posture indices using the Clinical Photographic Postural Assessment Tool (CPPAT) in adolescents with IS. The secondary objective was to determine the standard error of measurement and the minimal detectable change. This is a prospective test-retest reliability study carried out at two tertiary university hospital centers. Forty-one adolescents with IS, aged 10 to 16 years old with curves 10 to 45° and treated non-operatively were recruited. Two posture assessments were done using the CPPAT five to 10 days apart following a standardized procedure. Photographs were analyzed with the CPPAT software by digitizing reference landmarks placed on the participant by a physiotherapist evaluator. Generalizability theory was used to obtain a coefficient of dependability, standard error of measurement and the minimal detectable change at the 90% confidence interval. This project was supported by the Canadian Pediatric Spine Society (CPSS: $10,000). There are no study-specific conflict of interest-associated biases. Fourteen of 25 posture indices had a good reliability (ϕ ≥ 0.78), ten of 25 had moderate reliability (ϕ = 0.55 to 0.74) and one had poor reliability (ϕ = 0.45). The most reliable posture indices were waist angles asymmetry (ϕ = 0.93), right waist angle (ϕ = 0.91) and frontal trunk list (ϕ = 0.92). Right sagittal trunk list was the least reliable posture index (ϕ = 0.45). The MDC90 values ranged from 2.6 to 10.3° for angular measurements and from 8.4 to 35.1 mm for linear measurements. 
This study demonstrates that most posture indices, especially the trunk posture indices, are reproducible in time among adolescents with IS and provides reference values. Clinicians and researchers can use these reference values in order to assess change in posture over time attributable to treatment effectiveness. Copyright © 2018. Published by Elsevier Inc.
USDA-ARS's Scientific Manuscript database
Mechanography during the vertical jump test allows for evaluation of force-time variables reflecting jump execution, which may enhance screening for functional deficits that reduce physical performance and determining mechanistic causes underlying performance changes. However, utility of jump mechan...
Buchowski, Maciej S.; Matthews, Charles E.; Cohen, Sarah S.; Signorello, Lisa B.; Fowke, Jay H.; Hargreaves, Margaret K.; Schlundt, David G.; Blot, William J.
2012-01-01
Background Low physical activity (PA) is linked to cancer and other diseases prevalent in racial/ethnic minorities and low-income populations. This study evaluated the PA questionnaire (PAQ) used in the Southern Community Cohort Study (SCCS), a prospective investigation of health disparities between African-American and white adults. Methods The PAQ was administered upon entry into the cohort (PAQ1) and after 12–15 months (PAQ2) in 118 participants (40–60 years old, 48% male, 74% African-American). Test-retest reliability (PAQ1 versus PAQ2) was assessed using Spearman correlations and the Wilcoxon signed rank test. Criterion validity of the PAQ was assessed via comparison with a PA monitor and a last-month PA survey (LMPAS), administered up to 4 times in the study period. Results The PAQ test-retest reliability ranged from 0.25–0.54 for sedentary behaviors and 0.22–0.47 for active behaviors. The criterion validity for the PAQ compared with PA monitor ranged from 0.21–0.24 for sedentary behaviors and from 0.17–0.31 for active behaviors. There was general consistency in the magnitude of correlations between the PAQ and PA-monitor between African-Americans and whites. Conclusions The SCCS-PAQ has fair to moderate test-retest reliability and demonstrated some evidence of criterion validity for ranking participants by their level of sedentary and active behaviors. PMID:21952413
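Test-retest reliability by Spearman correlation, as used above, amounts to a Pearson correlation computed on ranks. The following is a minimal standard-library sketch; the function names and the questionnaire values are hypothetical and are not taken from the study.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical hours/day of sitting reported at entry (PAQ1) and at follow-up (PAQ2)
paq1 = [4.0, 6.5, 8.0, 9.5, 11.0]
paq2 = [5.0, 4.5, 9.0, 8.5, 12.0]
print(round(spearman(paq1, paq2), 2))  # 0.8
```

Because only the rank order enters the calculation, the coefficient describes how consistently participants are ranked at the two administrations, not whether their absolute values agree.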
Validation study of a Chinese version of Partners in Health in Hong Kong (C-PIH HK).
Chiu, Teresa Mei Lee; Tam, Katharine Tai Wo; Siu, Choi Fong; Chau, Phyllis Wai Ping; Battersby, Malcolm
2017-01-01
The Partners in Health (PIH) scale is a measure designed to assess the generic knowledge, attitudes, behaviors, and impacts of self-management. A cross-cultural adaptation of the PIH for use in Hong Kong was evaluated in this study. This paper reports the validity and reliability of the Chinese version of PIH (C-PIH[HK]). A 12-item PIH was translated using forward-backward translation technique and reviewed by individuals with chronic diseases and health professionals. A total of 209 individuals with chronic diseases completed the scale. The construct validity, internal consistency, and test-retest reliability were evaluated in two waves. The findings in Wave 1 (n = 73) provided acceptable psychometric properties of the C-PIH(HK) but supported the adaptation of question 5 to improve the cultural relevance, validity, and reliability of the scale. An adapted version of C-PIH(HK) was evaluated in Wave 2. The findings in Wave 2 (n = 136) demonstrated good construct validity and internal consistency of C-PIH(HK). A principal component analysis with Oblimin rotation yielded a 3-factor solution, and the Cronbach's alphas of the subscales ranged from 0.773 to 0.845. Participants were asked whether they perceived the self-management workshops they attended and education provided by health professionals as useful or not. The results showed that the C-PIH(HK) was able to discriminate those who agreed and those who disagreed related to the usefulness of individual health education (p < 0.0001 in all subscales) and workshops (p < 0.001 in the knowledge subscale) as hypothesized. The test-retest reliability was high (ICC = 0.818). A culturally adapted version of PIH for use in Hong Kong was evaluated. The study supported good construct validity, discriminate validity, internal consistency, and test-retest reliability of the C-PIH(HK).
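Cronbach's alpha, the internal-consistency statistic reported above, can be computed from item-level responses. This is a minimal sketch using the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of total score); the response data are hypothetical.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of rows (one row per respondent, one column per item)."""
    k = len(responses[0])
    item_vars = [variance([row[j] for row in responses]) for j in range(k)]
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1.0 - sum(item_vars) / total_var)


# Hypothetical responses: 3 respondents x 2 items
data = [
    [1, 2],
    [2, 2],
    [3, 5],
]
print(round(cronbach_alpha(data), 3))  # 0.857
```

Sample variances (n − 1 denominator) are used for both items and totals; using population variances throughout would give the same ratio and therefore the same alpha.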
Bajaj, Jasmohan S; Heuman, Douglas M; Sterling, Richard K; Sanyal, Arun J; Siddiqui, Muhammad; Matherly, Scott; Luketic, Velimir; Stravitz, R Todd; Fuchs, Michael; Thacker, Leroy R; Gilles, HoChong; White, Melanie B; Unser, Ariel; Hovermale, James; Gavis, Edith; Noble, Nicole A; Wade, James B
2014-01-01
Background & Aims Detection of covert hepatic encephalopathy (CHE) is difficult but point of care testing could increase rates of diagnosis. We aimed to validate the ability of the smartphone app EncephalApp, a streamlined version of Stroop App, to detect CHE. We evaluated face validity, test–retest reliability, and external validity. Methods Patients with cirrhosis (n=167; 38% with overt HE [OHE]; mean age, 55 years; mean model for end-stage liver disease score, 12) and controls (n=114) were each given a paper and pencil cognitive battery (standard) along with EncephalApp. EncephalApp has Off and On states; results measured were: OffTime, OnTime, OffTime+OnTime, and number of runs required to complete 5 off and on runs. Thirty-six patients with cirrhosis underwent driving simulation tests, and EncephalApp results were correlated with results. Test–retest reliability was analyzed in a subgroup of patients. The test was performed before and after transjugular intra-hepatic portosystemic shunt placement, before and after correction for hyponatremia, to determine external validity. Results All patients with cirrhosis performed worse on paper and pencil and EncephalApp tests than controls. Patients with cirrhosis and OHE performed worse than those without OHE. Age-dependent EncephalApp cut-offs (younger or older than 45 years) were set. An OffTime+OnTime value of >190 seconds identified all patients with CHE with an area under the receiver operator characteristic (AUROC) value of 0.91; the AUROC value was 0.88 for diagnosis of CHE in those without OHE. EncephalApp times correlated with crashes and illegal turns in driving simulation tests. Test–retest reliability was high (intra-class coefficient, 0.83) among 30 patients retested 1–3 months apart. OffTime+OnTime increased significantly (206 vs 255, P=.007) among 10 patients retested 33±7 days after transjugular intra-hepatic portosystemic shunt placement. 
OffTime+OnTime decreased significantly (242 vs 225, P=.03) in 7 patients tested before and after correction for hyponatremia (126±3 to 132±4 meq/L, P=.01), 10±5 days apart. Conclusions A smartphone app called EncephalApp has good face validity, test–retest reliability, and external validity for the diagnosis of CHE. PMID:24846278
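The AUROC values quoted above summarize how well a continuous score separates two groups. A minimal sketch of the pairwise (Mann-Whitney) definition follows; the completion times are hypothetical and are not EncephalApp data.

```python
def auroc(positives, negatives):
    """Probability that a randomly chosen positive case scores higher than a
    randomly chosen negative case, counting ties as one half."""
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical OffTime+OnTime values in seconds (higher = worse performance)
with_che = [200, 250, 180]
without_che = [150, 190, 170]
print(round(auroc(with_che, without_che), 3))  # 0.889
```

An AUROC of 0.5 means the score does not discriminate between the groups, and 1.0 means every case in one group scores higher than every case in the other.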
Munguía-Izquierdo, Diego; Legaz-Arrese, Alejandro
2012-11-01
To evaluate the reliability, standard error of the mean (SEM), clinical significant change, and known group validity of 2 assessments of endurance strength to low loads in patients with fibromyalgia syndrome (FS). Cross-sectional reliability and comparative study. University Pablo de Olavide, Seville, Spain. Middle-aged women with FS (n=95) and healthy women (n=64) matched for age, weight, and body mass index (BMI) were recruited for the study. Not applicable. The endurance strength to low loads tests of the upper and lower extremities and anthropometric measures (BMI) were used for the evaluations. The differences between the readings (tests 1 and 2) and the SDs of the differences, intraclass correlation coefficient (ICC) model (2,1), 95% confidence interval for the ICC, coefficient of repeatability, intrapatient SD, SEM, Wilcoxon signed-rank test, and Bland-Altman plots were used to examine reliability. A Mann-Whitney U test was used to analyze the differences in test values between the patient group and the control group. We hypothesized that patients with FS would have an endurance strength to low loads performance in lower and upper extremities at least twice as low as that of the healthy controls. Satisfactory test-retest reliability and SEMs were found for the lower extremity, dominant arm, and nondominant arm tests (ICC=.973-.979; P<.001; SEMs=1.44-1.66 repetitions). The differences in the mean between the test and retest were lower than the SEM for all performed tests, varying from -.10 to .29 repetitions. No significant differences were found between the test and retest (P>.05 for all). The Bland-Altman plots showed 95% limits of agreement for the lower extremity (4.7 to -4.5), dominant arm (3.8 to -4.4), and nondominant arm (3.9 to -4.1) tests. The endurance strength to low loads test scores for the patients with FS were 4-fold lower than for the controls in all performed tests (P<.001 for all). 
The endurance strength to low loads tests showed good reliability and known group validity and can be recommended for evaluating endurance strength to low loads in patients with FS. For individual evaluation, however, an improved score of at least 4 and 5 repetitions for the upper and lower extremities, respectively, was required for the differences to be considered as substantial clinical change. Patients with FS showed impaired endurance strength to low loads performance when compared with the general population. Copyright © 2012 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.
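The Bland-Altman limits of agreement reported above are conventionally the mean test-retest difference ± 1.96 standard deviations of the differences. The sketch below assumes that convention; the repetition counts are hypothetical.

```python
from statistics import mean, stdev


def bland_altman_limits(test, retest, z=1.96):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    diffs = [a - b for a, b in zip(test, retest)]
    bias = mean(diffs)
    spread = z * stdev(diffs)
    return bias, bias - spread, bias + spread


# Hypothetical repetitions completed on test 1 and test 2
test1 = [10, 12, 14, 16]
test2 = [11, 11, 15, 15]
bias, lower, upper = bland_altman_limits(test1, test2)
print(bias, round(lower, 2), round(upper, 2))
```

A bias near zero with narrow limits indicates that repeated tests agree closely for individual patients, which a high ICC alone does not guarantee.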
Temporal Stability of the Dutch Version of the Wechsler Memory Scale-Fourth Edition (WMS-IV-NL).
Bouman, Zita; Hendriks, Marc P H; Aldenkamp, Albert P; Kessels, Roy P C
2015-01-01
The Wechsler Memory Scale-Fourth Edition (WMS-IV) is one of the most widely used memory batteries. We examined the test-retest reliability, practice effects, and standardized regression-based (SRB) change norms for the Dutch version of the WMS-IV (WMS-IV-NL) after both short and long retest intervals. The WMS-IV-NL was administered twice after either a short (M = 8.48 weeks, SD = 3.40 weeks, range = 3-16) or a long (M = 17.87 months, SD = 3.48, range = 12-24) retest interval in a sample of 234 healthy participants (M = 59.55 years, range = 16-90; 118 completed the Adult Battery; and 116 completed the Older Adult Battery). The test-retest reliability estimates varied across indexes. They were adequate to good after a short retest interval (ranging from .74 to .86), with the exception of the Visual Working Memory Index (r = .59), yet generally lower after a long retest interval (ranging from .56 to .77). Practice effects were only observed after a short retest interval (overall group mean gains up to 11 points), whereas no significant change in performance was found after a long retest interval. Furthermore, practice effect-adjusted SRB change norms were calculated for all WMS-IV-NL index scores. Overall, this study shows that the test-retest reliability of the WMS-IV-NL varied across indexes. Practice effects were observed after a short retest interval, but no evidence was found for practice effects after a long retest interval from one to two years. Finally, the SRB change norms were provided for the WMS-IV-NL.
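Standardized regression-based change norms generally work by predicting the retest score from the baseline score and expressing the observed deviation in units of the standard error of the estimate. The sketch below assumes a single-predictor linear model; the published WMS-IV-NL norms may include further predictors, and all scores here are hypothetical.

```python
import math


def fit_srb(baseline, retest):
    """Least-squares fit of retest on baseline; returns (intercept, slope, see)."""
    n = len(baseline)
    mx, my = sum(baseline) / n, sum(retest) / n
    sxx = sum((x - mx) ** 2 for x in baseline)
    sxy = sum((x - mx) * (y - my) for x, y in zip(baseline, retest))
    slope = sxy / sxx
    intercept = my - slope * mx
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(baseline, retest))
    see = math.sqrt(sse / (n - 2))  # standard error of the estimate
    return intercept, slope, see


def srb_z(baseline_score, retest_score, intercept, slope, see):
    """z = (observed retest - predicted retest) / SEE."""
    return (retest_score - (intercept + slope * baseline_score)) / see


# Hypothetical normative index scores at test and retest
norm_test = [90, 100, 110, 120]
norm_retest = [96, 104, 118, 122]
a, b, see = fit_srb(norm_test, norm_retest)

# Hypothetical patient scoring 100 at baseline and 95 at retest
print(srb_z(100, 95, a, b, see) < -1.645)  # True: decline beyond the expected range
```

Because the regression absorbs the average practice effect, a patient whose score merely stays flat over a short interval can still register as a relative decline.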
Normative Data for an Instrumental Assessment of the Upper-Limb Functionality
Caimmi, Marco; Guanziroli, Eleonora; Malosio, Matteo; Pedrocchi, Nicola; Vicentini, Federico; Molinari Tosatti, Lorenzo; Molteni, Franco
2015-01-01
Upper-limb movement analysis is important to monitor objectively rehabilitation interventions, contributing to improving the overall treatments outcomes. Simple, fast, easy-to-use, and applicable methods are required to allow routinely functional evaluation of patients with different pathologies and clinical conditions. This paper describes the Reaching and Hand-to-Mouth Evaluation Method, a fast procedure to assess the upper-limb motor control and functional ability, providing a set of normative data from 42 healthy subjects of different ages, evaluated for both the dominant and the nondominant limb motor performance. Sixteen of them were reevaluated after two weeks to perform test-retest reliability analysis. Data were clustered into three subgroups of different ages to test the method sensitivity to motor control differences. Experimental data show notable test-retest reliability in all tasks. Data from older and younger subjects show significant differences in the measures related to the ability for coordination thus showing the high sensitivity of the method to motor control differences. The presented method, provided with control data from healthy subjects, appears to be a suitable and reliable tool for the upper-limb functional assessment in the clinical environment. PMID:26539500
Vincent, Joshua Israel; Macdermid, Joy Christine; Grewal, Ruby; Sekar, Vincent Prabhakaran; Balachandran, Dinesh
2014-01-01
Prospective longitudinal validation study. To translate and cross-culturally adapt the Oswestry Disability Index (ODI) to the Tamil language (ODI-T), and to evaluate its reliability and construct validity. ODI is widely used as a disease specific questionnaire in back pain patients to evaluate pain and disability. A thorough literature search revealed that the Tamil version of the ODI has not been previously published. The ODI was translated and cross-culturally adapted to the Tamil language according to established guidelines. Thirty subjects (16 women and 14 men) with a mean age of 42.7 years (S.D. 13.6; Range 22 - 69) with low back pain were recruited to assess the psychometric properties of the ODI-T Questionnaire. Patients completed the ODI-T, Roland-Morris disability questionnaire (RMDQ), VAS-pain and VAS-disability at baseline and 24-72 hours from the baseline visit. The ODI-T displayed a high degree of internal consistency, with a Cronbach's alpha of 0.92. The test-retest reliability was high (n=30) with an ICC of 0.92 (95% CI, 0.84 to 0.96) and a mean re-test difference of 2.6 points lower on re-test. The ODI-T scores exhibited a strong correlation with the RMDQ scores (r = 0.82, p<0.01), VAS-P (r = 0.78, p<0.01) and VAS-D (r = 0.81, p<0.01). Moderate to low correlations were observed between the ODI-T and lumbar ROM (r = -0.27 to -0.53). All the hypotheses that were constructed a priori were supported. The Tamil version of the ODI Questionnaire is a valid and reliable tool that can be used to measure subjective outcomes of pain and disability in Tamil speaking patients with low back pain.
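An intraclass correlation coefficient for test-retest data can be computed from a two-way ANOVA decomposition. The abstract does not state which ICC form was used; the sketch below assumes ICC(2,1) (two-way random effects, absolute agreement, single measurement) in the Shrout and Fleiss notation, and the scores are hypothetical.

```python
def icc_2_1(scores):
    """ICC(2,1) for a list of rows: one row per subject, one column per session."""
    n = len(scores)       # subjects
    k = len(scores[0])    # sessions (or raters)
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)                  # between-subjects mean square
    ms_cols = ss_cols / (k - 1)                  # between-sessions mean square
    ms_error = ss_error / ((n - 1) * (k - 1))    # residual mean square
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical disability scores (%) at baseline and at re-test
odi = [
    [40, 38],
    [22, 20],
    [56, 50],
    [30, 31],
    [64, 60],
]
print(round(icc_2_1(odi), 2))
```

Because this form penalizes systematic shifts between sessions, a consistent drop on re-test lowers ICC(2,1) even when subjects keep the same rank order.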
A reliability analysis of the revised competitiveness index.
Harris, Paul B; Houston, John M
2010-06-01
This study examined the reliability of the Revised Competitiveness Index by investigating the test-retest reliability, interitem reliability, and factor structure of the measure based on a sample of 280 undergraduates (200 women, 80 men) ranging in age from 18 to 28 years (M = 20.1, SD = 2.1). The findings indicate that the Revised Competitiveness Index has high test-retest reliability, high inter-item reliability, and a stable factor structure. The results support the assertion that the Revised Competitiveness Index assesses competitiveness as a stable trait rather than a dynamic state.
The Trunk Impairment Scale - modified to ordinal scales in the Norwegian version.
Gjelsvik, Bente; Breivik, Kyrre; Verheyden, Geert; Smedal, Tori; Hofstad, Håkon; Strand, Liv Inger
2012-01-01
To translate the Trunk Impairment Scale (TIS), a measure of trunk control in patients after stroke, into Norwegian (TIS-NV), and to explore its construct validity, internal consistency, intertester and test-retest reliability. TIS was translated according to international guidelines. The validity study was performed on data from 201 patients with acute stroke. Fifty patients with stroke and acquired brain injury were recruited to examine intertester and test-retest reliability. Construct validity was analyzed with exploratory and confirmatory factor analysis and item response theory, internal consistency with Cronbach's alpha test, and intertester and test-retest reliability with kappa and intraclass correlation coefficient tests. The back-translated version of TIS-NV was validated by the original developer. The subscale Static sitting balance was removed. By combining items from the subscales Dynamic sitting balance and Coordination, six ordinal superitems (testlets) were constructed. The TIS-NV was renamed the modified TIS-NV (TIS-modNV). After modifications the TIS-modNV fitted well to a locally dependent unidimensional item response theory model. It demonstrated good construct validity, excellent internal consistency, and high intertester and test-retest reliability for the total score. This study supports that the TIS-modNV is a valid and reliable scale for use in clinical practice and research.
Geber, Christian; Klein, Thomas; Azad, Shahnaz; Birklein, Frank; Gierthmühlen, Janne; Huge, Volker; Lauchart, Meike; Nitzsche, Dorothee; Stengel, Maike; Valet, Michael; Baron, Ralf; Maier, Christoph; Tölle, Thomas; Treede, Rolf-Detlef
2011-03-01
Quantitative sensory testing (QST) is an instrument to assess positive and negative sensory signs, helping to identify mechanisms underlying pathologic pain conditions. In this study, we evaluated the test-retest reliability (TR-R) and the interobserver reliability (IO-R) of QST in patients with sensory disturbances of different etiologies. In 4 centres, 60 patients (37 male and 23 female, 56.4±1.9 years) with lesions or diseases of the somatosensory system were included. QST comprised 13 parameters including detection and pain thresholds for thermal and mechanical stimuli. QST was performed in the clinically most affected test area and a less or unaffected control area in a morning and an afternoon session on 2 consecutive days by examiner pairs (4 QSTs/patient). For both TR-R and IO-R, there were high correlations (r=0.80-0.93) at the affected test area, except for wind-up ratio (TR-R: r=0.67; IO-R: r=0.56) and paradoxical heat sensations (TR-R: r=0.35; IO-R: r=0.44). Mean IO-R (r=0.83, 31% unexplained variance) was slightly lower than TR-R (r=0.86, 26% unexplained variance, P<.05); the difference in variance amounted to 5%. There were no differences between study centres. In a subgroup with an unaffected control area (n=43), reliabilities were significantly better in the test area (TR-R: r=0.86; IO-R: r=0.83) than in the control area (TR-R: r=0.79; IO-R: r=0.71, each P<.01), suggesting that disease-related systematic variance enhances reliability of QST. We conclude that standardized QST performed by trained examiners is a valuable diagnostic instrument with good test-retest and interobserver reliability within 2 days. With standardized training, observer bias is much lower than random variance. 
Quantitative sensory testing performed by trained examiners is a valuable diagnostic instrument with good interobserver and test-retest reliability for use in patients with sensory disturbances of different etiologies to help identify mechanisms of neuropathic and non-neuropathic pain. Copyright © 2010 International Association for the Study of Pain. Published by Elsevier B.V. All rights reserved.
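The unexplained-variance percentages quoted alongside the mean correlations follow from 1 − r². A short check, using only the r values given in the abstract:

```python
def unexplained_variance_pct(r: float) -> float:
    """Percentage of variance not shared between two measurements: 100 * (1 - r^2)."""
    return 100.0 * (1.0 - r ** 2)


io_r = unexplained_variance_pct(0.83)  # interobserver reliability
tr_r = unexplained_variance_pct(0.86)  # test-retest reliability
print(round(io_r), round(tr_r), round(io_r - tr_r))  # 31 26 5
```

The 5-point gap between the two figures is the portion of variance the abstract attributes to the change of observer.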
Park, Hee-Won; Baek, Sora; Kim, Hong Young; Park, Jung-Gyoo; Kang, Eun Kyoung
2017-10-01
To investigate the reliability and validity of a new method for isometric back extensor strength measurement using a portable dynamometer. A chair equipped with a small portable dynamometer was designed (Power Track II Commander Muscle Tester). A total of 15 men (mean age, 34.8±7.5 years) and 15 women (mean age, 33.1±5.5 years) with no current back problems or previous history of back surgery were recruited. Subjects were asked to push the back of the chair while seated, and their isometric back extensor strength was measured by the portable dynamometer. Test-retest reliability was assessed with intraclass correlation coefficient (ICC). For the validity assessment, isometric back extensor strength of all subjects was measured by a widely used physical performance evaluation instrument, BTE PrimusRS system. The limit of agreement (LoA) from the Bland-Altman plot was evaluated between two methods. The test-retest reliability was excellent (ICC=0.82; 95% confidence interval, 0.65-0.91). The Bland-Altman plots demonstrated acceptable agreement between the two methods: the lower 95% LoA was -63.1 N and the upper 95% LoA was 61.1 N. This study shows that isometric back extensor strength measurement using a portable dynamometer has good reliability and validity.
Wang, W; Lopez, V; Thompson, D R
2006-09-01
To evaluate the validity, reliability, and cultural relevance of the Chinese Mandarin version of Myocardial Infarction Dimensional Assessment Scale (MIDAS) as a disease-specific quality of life measure. The cultural relevance and content validity of the Chinese Mandarin version of the MIDAS (CM-MIDAS) was evaluated by an expert panel. Measurement performance was tested on 180 randomly selected Chinese MI patients. Thirty participants from the primary group completed the CM-MIDAS for test-retest reliability after 2 weeks. Reliability, validity and discriminatory power of the CM-MIDAS were calculated. Two items were modified as suggested by the expert panel. The overall CM-MIDAS had acceptable internal consistency with Cronbach's alpha coefficient 0.93 for the scale and 0.71-0.94 for the seven domains. Test-retest reliability by intraclass correlations was 0.85 for the overall scale and 0.74-0.94 for the seven domains. There was acceptable concurrent validity with significant (p < 0.05) correlations between the CM-MIDAS and the Chinese Version of the Short Form 36. The principal components analysis extracted seven factors that explained 67.18% of the variance with high factor loading indicating good construct validity. Empirical data support CM-MIDAS as a valid and reliable disease-specific quality of life measure for Chinese Mandarin speaking patients with myocardial infarction.
Huang, Wenhao; Chapman-Novakofski, Karen M
2017-01-01
Background The extensive availability and increasing use of mobile apps for nutrition-based health interventions makes evaluation of the quality of these apps crucial for integration of apps into nutritional counseling. Objective The goal of this research was the development, validation, and reliability testing of the app quality evaluation (AQEL) tool, an instrument for evaluating apps’ educational quality and technical functionality. Methods Items for evaluating app quality were adapted from website evaluations, with additional items added to evaluate the specific characteristics of apps, resulting in 79 initial items. Expert panels of nutrition and technology professionals and app users reviewed items for face and content validation. After recommended revisions, nutrition experts completed a second AQEL review to ensure clarity. On the basis of 150 sets of responses using the revised AQEL, principal component analysis was completed, reducing AQEL into 5 factors that underwent reliability testing, including internal consistency, split-half reliability, test-retest reliability, and interrater reliability (IRR). Two additional modifiable constructs for evaluating apps based on the age and needs of the target audience as selected by the evaluator were also tested for construct reliability. IRR testing using intraclass correlations (ICC) with all 7 constructs was conducted, with 15 dietitians evaluating one app. Results Development and validation resulted in the 51-item AQEL. These were reduced to 25 items in 5 factors after principal component analysis, plus 9 modifiable items in two constructs that were not included in principal component analysis. Internal consistency and split-half reliability of the following constructs derived from principal components analysis was good (Cronbach alpha >.80, Spearman-Brown coefficient >.80): behavior change potential, support of knowledge acquisition, app function, and skill development. 
Split-half reliability for app purpose was .65. Test-retest reliability showed no significant change over time (P>.05) for all but skill development (P=.001). Construct reliability was good for items assessing age appropriateness of apps for children, teens, and a general audience. In addition, construct reliability was acceptable for assessing app appropriateness for various target audiences (Cronbach alpha >.70). For the 5 main factors, ICC (1,k) was >.80, with a P value of <.05. When 15 nutrition professionals evaluated one app, ICC (2,15) was .98, with a P value of <.001 for all 7 constructs when the modifiable items were specified for adults seeking weight loss support. Conclusions Our preliminary effort shows that AQEL is a valid, reliable instrument for evaluating nutrition apps’ qualities for clinical interventions by nutrition clinicians, educators, and researchers. Further efforts in validating AQEL in various contexts are needed. PMID:29079554
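The two internal-consistency statistics reported above (Cronbach alpha and the Spearman-Brown coefficient) can be sketched as follows. This is a minimal illustration using the standard textbook formulas; the function names and the toy data are assumptions made here, and the abstract does not say which software or settings the authors used.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of items.

    `items` is a list of score lists, one per item, each holding one
    score per respondent (hypothetical layout chosen for this sketch).
    """
    k = len(items)
    # Total score of each respondent across all items.
    totals = [sum(scores) for scores in zip(*items)]
    # Sum of the sample variances of the individual items.
    item_variance_sum = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


def spearman_brown(r_half):
    """Step a half-test correlation up to a full-length reliability."""
    return 2 * r_half / (1 + r_half)


# Toy data: two items answered identically by four respondents.
toy_items = [[1, 2, 3, 4], [1, 2, 3, 4]]
alpha = cronbach_alpha(toy_items)
stepped_up = spearman_brown(0.5)
```

With perfectly parallel items the sketch returns an alpha of 1, which is a quick sanity check on the formula rather than a realistic result.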
Poon, Vickie Wan-kei; Lam, Linda Chiu-wa; Wong, Samuel Yeung-shan
2008-09-01
With the rapid growth of the older population, early detection of cognitive deficits is crucial in slowing functional deterioration in elderly persons. To examine the validity and reliability of the Chinese (Cantonese) version of the Hierarchic Dementia Scale (CV-HDS) for Chinese older persons in Hong Kong. The HDS was translated into Cantonese Chinese. The content and cultural validity were evaluated by six expert panel members. Sixty-two participants with a diagnosis of dementia were recruited for evaluation. Inter-rater reliability, test-retest reliability, internal consistency and concurrent validity were examined. The CV-HDS demonstrated satisfactory psychometric properties. Inter-rater reliability and test-retest reliability were high (alpha=0.89 and alpha=0.94 respectively). A high value of Cronbach's alpha (alpha=0.94) demonstrated good internal consistency. The concurrent validity of the CV-HDS, through correlation of its scores with those of the Chinese version of the Mini Mental Status Examination, was established (ranged from r=0.58 to r=0.78, p<0.01). The CV-HDS is a reliable and valid instrument for assessing severity of cognitive impairment in Cantonese speaking Chinese people with dementia. It facilitates treatment planning to optimize the effects of functional training and rehabilitation.
Mak, M K Y; Lau, E T L; Tam, V W K; Woo, C W Y; Yuen, S K Y
2015-01-01
To investigate the test-retest reliability of the Jebsen Taylor Hand Function Test (JTT) in older patients with Parkinson's disease (PD), and to compare JTT scores between PD and healthy subjects. Cross-sectional comparative study. Fifteen PD and fifteen healthy subjects performed the JTT and the time taken to complete the JTT was recorded. Test-retest reliabilities of JTT subtests and total score of both dominant and non-dominant hand were good to excellent (ICCs = 0.77-0.97) except J5 checkers, which had moderate reliability. PD subjects required significantly longer time to finish subtests and the whole JTT (p < 0.05), except the subtest J1 writing of dominant hand, which showed marginal significance (p = 0.059). JTT is a reliable and easily available assessment tool for assessing the hand function of PD subjects. PD subjects took a longer time to complete the JTT, suggesting that they have deficits in gross and fine functional dexterity. Copyright © 2015 Hanley & Belfus. Published by Elsevier Inc. All rights reserved.
Pennathur, Arunkumar; Magham, Rohini; Contreras, Luis Rene; Dowling, Winifred
2004-01-01
The objective of the work reported in this paper is to assess test-retest reliability of Yale Physical Activity Survey Total Time, Estimated Energy Expenditure, Activity Dimension Indices, and Activities Check-list in older Mexican American men and women. A convenience-based healthy sample of 49 (42 women and 7 men) older Mexican American adults recruited from senior recreation centers aged 68 to 80 years volunteered to participate in this pilot study. Forty-nine older Mexican American adults filled out the Yale Physical Activity Survey for this study. Fifteen (12 women and 3 men) of the 49 volunteers responded twice to the Yale Physical Activity Survey after a 2-week period, and helped assess the test-retest reliability of the Yale Physical Activity Survey. Results indicate that based on a 2-week test-retest administration, the Yale Physical Activity Survey was found to have moderate (rhoI= .424, p < .05) to good reliability (rs = .789, p < .01) for physical activity assessment in older Mexican American adults who responded.
Reliability of a questionnaire on substance use among adolescent students, Brazil.
Machado Neto, Adelmo de Souza; Andrade, Tarcisio Matos; Fernandes, Gilênio Borges; Zacharias, Helder Paulo; Carvalho, Fernando Martins; Machado, Ana Paula Souza; Dias, Ana Carmen Costa; Garcia, Ana Carolina Rocha; Santana, Lauro Reis; Rolin, Carlos Eduardo; Sampaio, Cyntia; Ghiraldi, Gisele; Bastos, Francisco Inácio
2010-10-01
To analyze reliability of a self-applied questionnaire on substance use and misuse among adolescent students. Two cross-sectional studies were carried out for the instrument test-retest. The sample comprised male and female students aged 11-19 years from public and private schools (elementary, middle, and high school students) in the city of Salvador, Northeastern Brazil, in 2006. A total of 591 questionnaires were applied in the test and 467 in the retest. Descriptive statistics, the Kappa index, Cronbach's alpha and intraclass correlation were estimated. The prevalence of substance use/misuse was similar in both test and retest. Sociodemographic variables showed a "moderate" to "almost perfect" agreement for the Kappa index, and a "satisfactory" (>0.75) consistency for Cronbach's alpha and intraclass correlation. The age at which psychoactive substances (tobacco, alcohol, and cannabis) were first used and chronological age were similar in both studies. Test-retest reliability was found to be a good indicator of students' age of initiation and their patterns of substance use. The questionnaire reliability was found to be satisfactory in the population studied.
Reliability of resting-state microstate features in electroencephalography.
Khanna, Arjun; Pascual-Leone, Alvaro; Farzan, Faranak
2014-01-01
Electroencephalographic (EEG) microstate analysis is a method of identifying quasi-stable functional brain states ("microstates") that are altered in a number of neuropsychiatric disorders, suggesting their potential use as biomarkers of neurophysiological health and disease. However, use of EEG microstates as neurophysiological biomarkers requires assessment of the test-retest reliability of microstate analysis. We analyzed resting-state, eyes-closed, 30-channel EEG from 10 healthy subjects over 3 sessions spaced approximately 48 hours apart. We identified four microstate classes and calculated the average duration, frequency, and coverage fraction of these microstates. Using Cronbach's α and the standard error of measurement (SEM) as indicators of reliability, we examined: (1) the test-retest reliability of microstate features using a variety of different approaches; (2) the consistency between TAAHC and k-means clustering algorithms; and (3) whether microstate analysis can be reliably conducted with 19 and 8 electrodes. The approach of identifying a single set of "global" microstate maps showed the highest reliability (mean Cronbach's α > 0.8, SEM ≈ 10% of mean values) compared to microstates derived by each session or each recording. There was notably low reliability in features calculated from maps extracted individually for each recording, suggesting that the analysis is most reliable when maps are held constant. Features were highly consistent across clustering methods (Cronbach's α > 0.9). All features had high test-retest reliability with 19 and 8 electrodes. High test-retest reliability and cross-method consistency of microstate features suggests their potential as biomarkers for assessment of the brain's neurophysiological health.
Metsavaht, Leonardo; Leporace, Gustavo; Riberto, Marcelo; Sposito, Maria Matilde M; Del Castillo, Letícia N C; Oliveira, Liszt P; Batista, Luiz Alberto
2012-11-01
Clinical measurement. To translate and culturally adapt the Lower Extremity Functional Scale (LEFS) into a Brazilian Portuguese version, and to test the construct and content validity and reliability of this version in patients with knee injuries. There is no Brazilian Portuguese version of an instrument to assess the function of the lower extremity after orthopaedic injury. The translation of the original English version of the LEFS into a Brazilian Portuguese version was accomplished using standard guidelines and tested in 31 patients with knee injuries. Subsequently, 87 patients with a variety of knee disorders completed the Brazilian Portuguese LEFS, the Medical Outcomes Study 36-Item Short-Form Health Survey, the Western Ontario and McMaster Universities Osteoarthritis Index, and the International Knee Documentation Committee Subjective Knee Evaluation Form and a visual analog scale for pain. All patients were retested within 2 days to determine reliability of these measures. Validation was assessed by determining the level of association between the Brazilian Portuguese LEFS and the other outcome measures. Reliability was documented by calculating internal consistency, test-retest reliability, and standard error of measurement. The Brazilian Portuguese LEFS had a high level of association with the physical component of the Medical Outcomes Study 36-Item Short-Form Health Survey (r = 0.82), the Western Ontario and McMaster Universities Osteoarthritis Index (r = 0.87), the International Knee Documentation Committee Subjective Knee Evaluation Form (r = 0.82), and the pain visual analog scale (r = -0.60) (all, P<.05). The Brazilian Portuguese LEFS had a low level of association with the mental component of the Medical Outcomes Study 36-Item Short-Form Health Survey (r = 0.38, P<.05). The internal consistency (Cronbach α = .952) and test-retest reliability (intraclass correlation coefficient = 0.957) of the Brazilian Portuguese version of the LEFS were high. 
The standard error of measurement was low (3.6) and the agreement was considered high, demonstrated by the small differences between test and retest and the narrow limit of agreement, as observed in Bland-Altman and survival-agreement plots. The translation of the LEFS into a Brazilian Portuguese version was successful in preserving the semantic and measurement properties of the original version and was shown to be valid and reliable in a Brazilian population with knee injuries.
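The Bland-Altman agreement analysis mentioned above rests on a simple calculation: the mean test-retest difference (bias) and the 95% limits of agreement around it. The sketch below assumes the conventional 1.96 × SD limits; the function name and the scores are hypothetical and are not data from the study.

```python
from statistics import mean, stdev


def bland_altman_limits(test, retest):
    """Return (bias, lower limit, upper limit) for paired scores."""
    # Per-subject difference between the two administrations.
    diffs = [a - b for a, b in zip(test, retest)]
    bias = mean(diffs)
    # Conventional 95% limits of agreement: bias +/- 1.96 * SD of differences.
    half_width = 1.96 * stdev(diffs)
    return bias, bias - half_width, bias + half_width


# Hypothetical scores for four patients on two occasions.
bias, lower, upper = bland_altman_limits([10, 12, 14, 16], [9, 12, 15, 16])
```

Narrow limits centred near zero correspond to the "small differences between test and retest" the abstract describes.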
Methodology for Developing a New EFNEP Food and Physical Activity Behaviors Questionnaire.
Murray, Erin K; Auld, Garry; Baker, Susan S; Barale, Karen; Franck, Karen; Khan, Tarana; Palmer-Keenan, Debra; Walsh, Jennifer
2017-10-01
Research methods are described for developing a food and physical activity behaviors questionnaire for the Expanded Food and Nutrition Education Program (EFNEP), a US Department of Agriculture nutrition education program serving low-income families. Mixed-methods observational study. The questionnaire will include 5 domains: (1) diet quality, (2) physical activity, (3) food safety, (4) food security, and (5) food resource management. A 5-stage process will be used to assess the questionnaire's test-retest reliability and content, face, and construct validity. Research teams across the US will coordinate questionnaire development and testing nationally. Convenience samples of low-income EFNEP, or EFNEP-eligible, adult participants across the US. A 5-stage process: (1) prioritize domain concepts to evaluate (2) question generation and content analysis panel, (3) question pretesting using cognitive interviews, (4) test-retest reliability assessment, and (5) construct validity testing. A nationally tested valid and reliable food and physical activity behaviors questionnaire for low-income adults to evaluate EFNEP's effectiveness. Cognitive interviews will be summarized to identify themes and dominant trends. Paired t tests (P ≤ .05) and Spearman and intra-class correlation coefficients (r > .5) will be conducted to assess reliability. Construct validity will be assessed using Wilcoxon t test (P ≤ .05), Spearman correlations, and Bland-Altman plots. Copyright © 2017 Society for Nutrition Education and Behavior. Published by Elsevier Inc. All rights reserved.
A Brazilian-Portuguese version of the Kinesthetic and Visual Motor Imagery Questionnaire.
Demanboro, Alan; Sterr, Annette; Anjos, Sarah Monteiro Dos; Conforto, Adriana Bastos
2018-01-01
Motor imagery has emerged as a potential rehabilitation tool in stroke. The goals of this study were: 1) to develop a translated and culturally-adapted Brazilian-Portuguese version of the Kinesthetic and Visual Motor Imagery Questionnaire (KVIQ20-P); 2) to evaluate the psychometric characteristics of the scale in a group of patients with stroke and in an age-matched control group; 3) to compare the KVIQ20 performance between the two groups. Test-retest, inter-rater reliabilities, and internal consistencies were evaluated in 40 patients with stroke and 31 healthy participants. In the stroke group, ICC confidence intervals showed excellent test-retest and inter-rater reliabilities. Cronbach's alpha also indicated excellent internal consistency. Results for controls were comparable to those obtained in persons with stroke. The excellent psychometric properties of the KVIQ20-P should be considered during the design of studies of motor imagery interventions for stroke rehabilitation.
Maddali Bongi, S; Del Rosso, A; Miniati, I; Galluccio, F; Landi, G; Tai, G; Matucci-Cerinic, M
2012-09-01
In systemic sclerosis (SSc), mouth and face involvement leads to problems in oral health-related quality of life (OHRQoL). The Mouth Handicap in Systemic Sclerosis scale (MHISS) is a 12-item questionnaire specifically quantifying mouth disability in SSc, organized in 3 subscales. Our aim was to validate the Italian version of the MHISS by assessing its test-retest reliability and internal and external consistency in Italian SSc patients. Forty SSc patients (7 dSSc, 33 lSSc; age and disease duration: 57.27 ± 11.41, 9.4 ± 4.4 years; 22 with sicca syndrome) were evaluated with MHISS. MHISS was translated following a forward-backward translation procedure, with independent translations and counter-translation. Test-retest reliability was evaluated by comparing the results of two administrations with the intraclass correlation coefficient (ICC). Internal consistency was assessed by Cronbach's α and external consistency by comparison with mouth opening. MHISS has good test-retest reliability (ICC: 0.93) and internal consistency (Cronbach's α: 0.99). Good external consistency was confirmed by correlation with mouth opening (rho: -0.3869, p: 0.0137). Total MHISS score was 17.65 ± 5.20, with scores of subscale 1 (reduced mouth opening) of 6.60 ± 2.85 and scores of subscales 2 (sicca syndrome) and 3 (aesthetic concerns) of 7.82 ± 2.59 and 3.22 ± 1.14. Total and subscale 2 scores are higher in dSSc than in lSSc. This result may be due to the higher presence of sicca syndrome in dSSc than in lSSc (p = 0.0109). Our results support validity and reliability in Italian SSc patients of MHISS, specifically measuring SSc OHRQoL.
Validity and Reliability of the Persian Version of the Dysphagia Handicap Index (DHI).
Asadollahpour, Faezeh; Baghban, Kowsar; Asadi, Mozhgan
2015-05-01
The Dysphagia Handicap Index (DHI) is one of the instruments used for measuring a dysphagic patient's self-assessment. In some ways, it reflects the patient's quality of life. Although it has been recognized and widely applied in English speaking populations, it has not been used in its present forms in Persian speaking countries. The purpose of this study was to adapt a Persian version of the DHI and to evaluate its validity, consistency, and reliability in the Persian population with oropharyngeal dysphagia. Several stages of cross-cultural adaptation were performed, consisting of translation, synthesis, back translation, review by an expert committee, and final proof reading. The generated Persian DHI was administered to 85 patients with oropharyngeal dysphagia and 89 control subjects at Zahedan city between May 2013 and August 2013. The patients and control subjects answered the same questionnaire 2 weeks later to verify the test-retest reliability. Internal consistency and test-retest reliability were evaluated. The results of the patients and the control group were compared. The Persian DHI showed good internal consistency (Cronbach's alpha coefficients range from 0.82 to 0.94). Also, good test-retest reliability was found for the total scores of the Persian DHI (r=0.89). There was a significant difference between the DHI scores of the control group and those of the oropharyngeal dysphagia group (P<0.001). The Persian version of the DHI achieved face and translation validity. This study demonstrated that the Persian DHI is a valid tool for self-assessment of the handicapping effects of dysphagia on the physical, functional, and emotional aspects of patient life and can be a useful tool for screening and treatment planning for the Persian-speaking dysphagic patients, regardless of the cause or the severity of the dysphagia.
Reliability Measure of a Clinical Test: Appreciation of Music in Cochlear Implantees (AMICI)
Cheng, Min-Yu; Spitzer, Jaclyn B.; Shafiro, Valeriy; Sheft, Stanley; Mancuso, Dean
2014-01-01
Purpose The goals of this study were (1) to investigate the reliability of a clinical music perception test, Appreciation of Music in Cochlear Implantees (AMICI), and (2) examine associations between the perception of music and speech. AMICI was developed as a clinical instrument for assessing music perception in persons with cochlear implants (CIs). The test consists of four subtests: (1) music versus environmental noise discrimination, (2) musical instrument identification (closed-set), (3) musical style identification (closed-set), and (4) identification of musical pieces (open-set). To be clinically useful, it is crucial for AMICI to demonstrate high test-retest reliability, so that CI users can be assessed and retested after changes in maps or programming strategies. Research Design Thirteen CI subjects were tested with AMICI for the initial visit and retested again 10–14 days later. Two speech perception tests (consonant-nucleus-consonant [CNC] and Bamford-Kowal-Bench Speech-in-Noise [BKB-SIN]) were also administered. Data Analysis Test-retest reliability and equivalence of the test’s three forms were analyzed using paired t-tests and correlation coefficients, respectively. Correlation analysis was also conducted between results from the music and speech perception tests. Results Results showed no significant difference between test and retest (p > 0.05) with adequate power (0.9) as well as high correlations between the three forms (Forms A and B, r = 0.91; Forms A and C, r = 0.91; Forms B and C, r = 0.95). Correlation analysis showed high correlation between AMICI and BKB-SIN (r = −0.71), and moderate correlation between AMICI and CNC (r = 0.4). Conclusions The study showed AMICI is highly reliable for assessing musical perception in CI users. PMID:24384082
Stoller, Oliver; de Bruin, Eling D; Schindelholz, Matthias; Schuster-Amft, Corina; de Bie, Rob A; Hunt, Kenneth J
2014-10-11
Exercise capacity is seriously reduced after stroke. While cardiopulmonary assessment and intervention strategies have been validated for the mildly and moderately impaired populations post-stroke, there is a lack of effective concepts for stroke survivors suffering from severe motor limitations. This study investigated the test-retest reliability and repeatability of cardiopulmonary exercise testing (CPET) using feedback-controlled robotics-assisted treadmill exercise (FC-RATE) in severely motor impaired individuals early after stroke. 20 subjects (age 44-84 years, <6 month post-stroke) with severe motor limitations (Functional Ambulatory Classification 0-2) were selected for consecutive constant load testing (CLT) and incremental exercise testing (IET) within a powered exoskeleton, synchronised with a treadmill and a body weight support system. A manual human-in-the-loop feedback system was used to guide individual work rate levels. Outcome variables focussed on standard cardiopulmonary performance parameters. Relative and absolute test-retest reliability were assessed by intraclass correlation coefficients (ICC), standard error of the measurement (SEM), and minimal detectable change (MDC). Mean difference, limits of agreement, and coefficient of variation (CoV) were estimated to assess repeatability. Peak performance parameters during IET yielded good to excellent relative reliability: absolute peak oxygen uptake (ICC =0.82), relative peak oxygen uptake (ICC =0.72), peak work rate (ICC =0.91), peak heart rate (ICC =0.80), absolute gas exchange threshold (ICC =0.91), relative gas exchange threshold (ICC =0.88), oxygen cost of work (ICC =0.87), oxygen pulse at peak oxygen uptake (ICC =0.92), ventilation rate versus carbon dioxide output slope (ICC =0.78). For these variables, SEM was 4-13%, MDC 12-36%, and CoV 0.10-0.36. CLT revealed high mean differences and insufficient test-retest reliability for all variables studied. 
This study presents first evidence on reliability and repeatability for CPET in severely motor impaired individuals early after stroke using a feedback-controlled robotics-assisted treadmill. The results demonstrate good to excellent test-retest reliability and appropriate repeatability for the most important peak cardiopulmonary performance parameters. These findings have important implications for the design and implementation of cardiovascular exercise interventions in severely impaired populations. Future research needs to develop advanced control strategies to enable the true limit of functional exercise capacity to be reached and to further assess test-retest reliability and repeatability in larger samples.
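The absolute reliability indices named above (SEM and MDC) are commonly derived from the ICC and the between-subject standard deviation. The sketch below uses the widely cited formulas SEM = SD × sqrt(1 − ICC) and MDC95 = 1.96 × sqrt(2) × SEM; whether the authors used exactly these variants is an assumption, and the input values are illustrative only.

```python
import math


def sem_from_icc(sd, icc):
    """Standard error of measurement from a between-subject SD and an ICC."""
    return sd * math.sqrt(1 - icc)


def mdc95(sem):
    """Minimal detectable change at the 95% confidence level."""
    # sqrt(2) accounts for measurement error in both test and retest.
    return 1.96 * math.sqrt(2) * sem


# Illustrative values only: SD of 10 units and an ICC of 0.91.
sem = sem_from_icc(10.0, 0.91)
mdc = mdc95(sem)
```

A change smaller than the MDC cannot be distinguished from measurement noise, which is why the abstract reports it alongside the ICC.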
Lo, Wing-Sze; Ho, Sai-Yin; Wong, Bonny Yee-Man; Mak, Kwok-Kei; Lam, Tai-Hing
2011-06-01
The reliability and validity of Stunkard's Figure Rating Scale (FRS) as a measure of current body size (CBS) was established in Western adolescent girls but not in non-Western population. We examined the validity and test-retest reliability of Stunkard's FRS in assessing CBS among Chinese adolescents. Methods. In a school-based survey in Hong Kong, 5666 adolescents (boys: 45.1%; mean age 14.7 years) provided data on self-reported height and weight, CBS, perceived weight status, and health-related quality of life using the Medical Outcomes Study Short-Form version 2 (SF-12v2). Height and weight were also objectively measured. Spearman's correlation was used to assess construct validity, concurrent validity and test-retest reliability. Convergent and discriminant validity were good: CBS correlated strongly with weight and self-reported/measured BMI, but only weakly with SF-12v2. CBS correlated strongly with perceived weight status, showing concurrent validity. Spearman's correlation (r) for CBS was 0.78 for girls and 0.72 for boys indicating good test-retest reliability. Validity and reliability results did not differ significantly between senior and junior grade adolescents. Our findings support the use of Stunkard's FRS to measure body size among Chinese adolescents.
Singh, Amika S; Chinapaw, Mai J M; Uijtdewilligen, Léonie; Vik, Froydis N; van Lippevelde, Wendy; Fernández-Alvira, Juan M; Stomfai, Sarolta; Manios, Yannis; van der Sluijs, Maria; Terwee, Caroline; Brug, Johannes
2012-08-13
Insight into parental energy balance-related behaviours, their determinants and parenting practices is important to inform childhood obesity prevention. Therefore, reliable and valid tools to measure these variables in large-scale population research are needed. The objective of the current study was to examine the test-retest reliability and construct validity of the parent questionnaire used in the ENERGY-project, assessing parental energy balance-related behaviours, their determinants, and parenting practices among parents of 10-12 year-old children. We collected data among parents (n = 316 in the test-retest reliability study; n = 109 in the construct validity study) of 10-12 year-old children in six European countries, i.e. Belgium, Greece, Hungary, the Netherlands, Norway, and Spain. Test-retest reliability was assessed using the intra-class correlation coefficient (ICC) and percentage agreement comparing scores from two measurements, administered one week apart. To assess construct validity, the agreement between questionnaire responses and a subsequent interview was assessed using ICC and percentage agreement. All but one item showed good to excellent test-retest reliability as indicated by ICCs > .60 or percentage agreement ≥ 75%. Construct validity appeared to be good to excellent for 92 out of 121 items, as indicated by ICCs > .60 or percentage agreement ≥ 75%. Of the other 29 items, construct validity was moderate for 24 and poor for 5 items. The reliability and construct validity of the items of the ENERGY-parent questionnaire on multiple energy balance-related behaviours, their potential determinants, and parenting practices appear to be good. Based on the results of the validity study, we strongly recommend adapting parts of the ENERGY-parent questionnaire if used in future research.
Bove, Allyn M; Lynch, Andrew D; DePaul, Samantha M; Terhorst, Lauren; Irrgang, James J; Fitzgerald, G Kelley
2016-09-01
Study Design Clinical measurement. Background It has been suggested that rating of perceived exertion (RPE) may be a useful alternative to 1-repetition maximum (1RM) to determine proper resistance exercise dosage. However, the test-retest reliability of RPE for resistance exercise has not been determined. Additionally, prior research regarding the relationship between 1RM and RPE is conflicting. Objectives The purpose of this study was to (1) determine test-retest reliability of RPE related to resistance exercise and (2) assess agreement between percentages of 1RM and RPE during quadriceps resistance exercise. Methods A sample of participants with and without knee pathology completed a series of knee extension exercises and rated the perceived difficulty of each exercise on a 0-to-10 RPE scale, then repeated the procedure 1 to 2 weeks later for test-retest reliability. To determine agreement between RPE and 1RM, participants completed knee extension exercises at various percentages of their 1RM (10% to 130% of predicted 1RM) and rated the perceived difficulty of each exercise on a 0-to-10 RPE scale. Percent agreement was calculated between the 1RM and RPE at each resistance interval. Results The intraclass correlation coefficient indicated excellent test-retest reliability of RPE for quadriceps resistance exercises (intraclass correlation coefficient = 0.895; 95% confidence interval: 0.866, 0.918). Overall percent agreement between RPE and 1RM was 60%, but agreement was poor within the ranges that would typically be used for training (50% 1RM for muscle endurance, 70% 1RM and greater for strength). Conclusion Test-retest reliability of perceived exertion during quadriceps resistance exercise was excellent. However, agreement between the RPE and 1RM was poor, especially in common training zones for knee extensor strengthening. J Orthop Sports Phys Ther 2016;46(9):768-774. Epub 5 Aug 2016. doi:10.2519/jospt.2016.6498.
Validation of the Dementia Care Assessment Packet-Instrumental Activities of Daily Living
Lee, Seok Bum; Park, Jeong Ran; Yoo, Jeong-Hwa; Park, Joon Hyuk; Lee, Jung Jae; Yoon, Jong Chul; Jhoo, Jin Hyeong; Lee, Dong Young; Woo, Jong Inn; Han, Ji Won; Huh, Yoonseok; Kim, Tae Hui
2013-01-01
Objective We aimed to evaluate the psychometric properties of the IADL measure included in the Dementia Care Assessment Packet (DCAP-IADL) in dementia patients. Methods The study involved 112 dementia patients and 546 controls. The DCAP-IADL was scored in two ways: observed score (OS) and predicted score (PS). The reliability of the DCAP-IADL was evaluated by testing its internal consistency, inter-rater reliability and test-retest reliability. Discriminant validity was evaluated by comparing the mean OS and PS between dementia patients and controls by ANCOVA. Pearson or Spearman correlation analysis was performed with other instruments to assess concurrent validity. Receiver operating characteristics curve analysis was performed to examine diagnostic accuracy. Results Cronbach's α coefficients of the DCAP-IADL were above 0.7. The values in dementia patients were much higher (OS=0.917, PS=0.927), indicating excellent degrees of internal consistency. Inter-rater reliabilities and test-retest reliabilities were statistically significant (p<0.05). PS exhibited higher reliabilities than OS. The mean OS and PS of dementia patients were significantly higher than those of the non-demented group after controlling for age, sex and education level. The DCAP-IADL was significantly correlated with other IADL instruments and MMSE-KC (p<0.001). Areas under the curves of the DCAP-IADL were above 0.9. Conclusion The DCAP-IADL is a reliable and valid instrument for evaluating instrumental ability of daily living for the elderly, and may also be useful for screening dementia. Moreover, administering PS may enable the DCAP-IADL to overcome the differences in gender, culture and life style that hinder accurate evaluation of the elderly in previous IADL instruments. PMID:24302946
Test-Retest Reliability of fMRI Brain Activity during Memory Encoding
Brandt, David J.; Sommer, Jens; Krach, Sören; Bedenbender, Johannes; Kircher, Tilo; Paulus, Frieder M.; Jansen, Andreas
2013-01-01
The mechanisms underlying hemispheric specialization of memory are not completely understood. Functional magnetic resonance imaging (fMRI) can be used to develop and test models of hemispheric specialization. In particular for memory tasks however, the interpretation of fMRI results is often hampered by the low reliability of the data. In the present study we therefore analyzed the test-retest reliability of fMRI brain activation related to an implicit memory encoding task, with a particular focus on brain activity of the medial temporal lobe (MTL). Fifteen healthy subjects were scanned with fMRI on two sessions (average retest interval 35 days) using a commonly applied novelty encoding paradigm contrasting known and unknown stimuli. To assess brain lateralization, we used three different stimuli classes that differed in their verbalizability (words, scenes, fractals). Test-retest reliability of fMRI brain activation was assessed by an intraclass-correlation coefficient (ICC), describing the stability of inter-individual differences in the brain activation magnitude over time. We found as expected a left-lateralized brain activation network for the words paradigm, a bilateral network for the scenes paradigm, and predominantly right-hemispheric brain activation for the fractals paradigm. Although these networks were consistently activated in both sessions on the group level, across-subject reliabilities were only poor to fair (ICCs ≤ 0.45). Overall, the highest ICC values were obtained for the scenes paradigm, but only in strongly activated brain regions. In particular the reliability of brain activity of the MTL was poor for all paradigms. In conclusion, for novelty encoding paradigms the interpretation of fMRI results on a single subject level is hampered by its low reliability. More studies are needed to optimize the retest reliability of fMRI activation for memory tasks. PMID:24367338
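The ICC described above, as a measure of how stable between-subject differences are across sessions, can be computed from a two-way ANOVA decomposition. The sketch below implements the consistency form often written ICC(3,1); the abstract does not state which ICC form the authors used, so this choice, the function name, and the toy values are all assumptions.

```python
def icc_3_1(scores):
    """Consistency ICC(3,1) for `scores`, a list of per-subject rows,
    each row holding one value per session."""
    n = len(scores)        # number of subjects
    k = len(scores[0])     # number of sessions
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]
    # Sums of squares for subjects, sessions, and the residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_err = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err)


# Toy activation values: session 2 equals session 1 plus a constant shift,
# so the subject ranking is perfectly preserved.
stable = icc_3_1([[1, 2], [2, 3], [3, 4]])
```

Because the consistency form ignores a uniform shift between sessions, the toy data give an ICC of 1 even though every value changed; an absolute-agreement form would penalize that shift.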
ERIC Educational Resources Information Center
Lourenco Jorge, Liliana; Garcia Marchi, Flavia Helena; Portela Hara, Ana Clara; Battistella, Linamara R.
2011-01-01
The objective of this prospective study was to perform a cross-cultural adaptation of the Functional Assessment Measure (FAM) into Brazilian Portuguese, and to assess the test-retest reliability. The instrument was translated, back-translated, pretested, and reviewed by a committee. The Brazilian version was assessed in 61 brain-injury patients.…
Cultural adaptation of an instrument to assess physical fitness in cardiac patients.
Domingues, Gabriela de Barros Leite; Gallani, Maria Cecília; Gobatto, Claudio Alexandre; Miura, Cinthya Tamie Passos; Rodrigues, Roberta Cunha Matheus; Myers, Jonathan
2011-04-01
To validate the content and to evaluate the reliability of the Veterans Specific Activity Questionnaire instrument, culturally adapted for use in the Brazilian population of cardiac patients. The instrument was translated and back-translated and subsequently analyzed by a committee of judges to evaluate its semantic-idiomatic and cultural equivalences. Physical activities that appeared in the instrument but were uncommon in the daily life of the target population were replaced. Another committee of specialists analyzed the metabolic equivalence of replaced activities. The proportion of agreement among the judges' evaluations was quantified by the Content Validity Index. The pre-test was performed in two stages (n1 and n2=15). Reliability was assessed using the test-retest method (interval of 7-15 days, n = 50). In the evaluation of semantic-idiomatic and cultural equivalences, items with a Content Validity Index < 1 were reviewed until consensus among the judges was obtained. The second committee found 100% agreement in the analysis of metabolic equivalence between original and replaced activities. Test-retest analysis indicated a Kappa coefficient of agreement of k = 0.86 (p<0.001), suggesting temporal stability of the instrument. The Brazilian version of the Veterans Specific Activity Questionnaire showed evidence of reliability, according to the temporal stability criterion, and adequate cultural content.
Validity and reliability of the South African health promoting schools monitoring questionnaire
Struthers, Patricia; de Koker, Petra; Lerebo, Wondwossen; Blignaut, Renette J.
2017-01-01
Summary Health promoting schools, as conceptualised by the World Health Organisation, have been developed in many countries to facilitate the health-education link. In 1994, the concept of health promoting schools was introduced in South Africa. In the process of becoming a health promoting school, it is important for schools to monitor and evaluate changes and developments taking place. The Health Promoting Schools (HPS) Monitoring Questionnaire was developed to obtain opinions of students about their school as a health promoting school. It comprises 138 questions in seven sections: socio-demographic information; General health promotion programmes; health related Skills and knowledge; Policies; Environment; Community-school links; and support Services. This paper reports on the reliability and face validity of the HPS Monitoring Questionnaire. Seven experts reviewed the questionnaire and agreed that it has satisfactory face validity. A test-retest reliability study was conducted with 83 students in three high schools in Cape Town, South Africa. The kappa-coefficients demonstrate mostly fair (κ-scores between 0.21 and 0.4) to moderate (κ-scores between 0.41 and 0.6) agreement between test-retest General and Environment items; poor (κ-scores up to 0.2) agreement between Skills and Community test-retest items, fair agreement between Policies items, and for most of the questions focussing on Services a fair agreement was found. The study is a first effort at providing a tool that may be used to monitor and evaluate students’ opinions about changes in health promoting schools. Although the HPS Monitoring Questionnaire has face validity, the results of the reliability testing were inconclusive. Further research is warranted. PMID:27694227
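The kappa bands quoted in the abstract above (poor up to 0.2, fair 0.21 to 0.4, moderate 0.41 to 0.6) can be illustrated with a minimal sketch of unweighted Cohen's kappa for one questionnaire item answered at test and retest. The abstract does not say whether unweighted or weighted kappa was used, so the unweighted form, the function names, and the label returned above 0.6 are assumptions for this example.

```python
from collections import Counter


def cohen_kappa(test, retest):
    """Unweighted Cohen's kappa for paired categorical answers.

    Assumes both lists have equal length and that chance agreement is below 1
    (otherwise the denominator is zero).
    """
    n = len(test)
    observed = sum(a == b for a, b in zip(test, retest)) / n
    count_test = Counter(test)
    count_retest = Counter(retest)
    # Chance agreement from the marginal category frequencies
    expected = sum(count_test[c] * count_retest[c] for c in count_test) / (n * n)
    return (observed - expected) / (1 - expected)


def agreement_band(kappa):
    """Map kappa to the bands quoted in the abstract."""
    if kappa <= 0.20:
        return "poor"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    # The abstract gives no label above 0.6; this label is a placeholder.
    return "above moderate"
```

For example, `agreement_band(cohen_kappa(test_answers, retest_answers))` would classify a single item, where `test_answers` and `retest_answers` are hypothetical lists of the same students' responses on the two occasions.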
Castro-Díaz, D M; Esteban-Fuertes, M; Salinas-Casado, J; Bustamante-Alarma, S; Gago-Ramos, J L; Galacho-Bech, A; García-Matres, M J; Rodríguez-Toves, L A; Zubiaur-Líbano, C; Collado-Serra, A; Batista-Miranda, J E; Ortiz-Gámiz, A
2014-03-01
To evaluate the psychometric properties of the Spanish version of the ICIQ-Male Lower Urinary Tract Symptoms Questionnaire (ICIQ-MLUTS): feasibility (% of completion and ceiling/floor effects), reliability (test-retest), convergent validity (vs Bladder Control Self-Assessment Questionnaire [BSAQ] and vs International Prostate Symptom Score [I-PSS]) and criterion validity (according to presence or absence of symptoms). This was an observational, non-interventional, multicenter study. A total of 223 male patients aged 18-65 with lower urinary tract symptoms (LUTS), predominantly storage symptoms, took part in the study. Patients completed the ICIQ-MLUTS (test), I-PSS and BSAQ questionnaires and reported their urinary symptoms in a single visit, with the exception of a subgroup of 49 patients who completed the questionnaire again 15 days after the initial visit to evaluate test-retest reliability. The questionnaire includes 13 items divided into 2 sub-scales: Voiding symptoms (V), scored from 0-20, and Incontinence symptoms (I), scored from 0-24. Percentage of patients who completed all items: 98.84%. The floor effect was 0 and the ceiling effect was under 6% in both sub-scales. Test-retest reliability: the intraclass correlation coefficient (ICC) ranged from 0.68 to 0.88, except for Delay. Kappa showed good agreement, between 0.60 and 0.81, except for Nocturia. Convergent validity: the correlation (Spearman) between the questionnaire sub-scale scores and the rest of the measures was statistically significant (P < .01 and P < .05). Criterion validity: there were statistically significant differences (P < .05) in ICIQ-MLUTS scores between patients who reported experiencing symptoms and those who did not. The Spanish version of the ICIQ-MLUTS questionnaire shows adequate feasibility, reliability and validity. Copyright © 2013 AEU. Published by Elsevier Espana. All rights reserved.
Validity and reliability of the South African health promoting schools monitoring questionnaire.
Struthers, Patricia; Wegner, Lisa; de Koker, Petra; Lerebo, Wondwossen; Blignaut, Renette J
2017-04-01
Health promoting schools, as conceptualised by the World Health Organisation, have been developed in many countries to facilitate the health-education link. In 1994, the concept of health promoting schools was introduced in South Africa. In the process of becoming a health promoting school, it is important for schools to monitor and evaluate changes and developments taking place. The Health Promoting Schools (HPS) Monitoring Questionnaire was developed to obtain opinions of students about their school as a health promoting school. It comprises 138 questions in seven sections: socio-demographic information; General health promotion programmes; health related Skills and knowledge; Policies; Environment; Community-school links; and support Services. This paper reports on the reliability and face validity of the HPS Monitoring Questionnaire. Seven experts reviewed the questionnaire and agreed that it has satisfactory face validity. A test-retest reliability study was conducted with 83 students in three high schools in Cape Town, South Africa. The kappa-coefficients demonstrate mostly fair (κ-scores between 0.21 and 0.4) to moderate (κ-scores between 0.41 and 0.6) agreement between test-retest General and Environment items; poor (κ-scores up to 0.2) agreement between Skills and Community test-retest items, fair agreement between Policies items, and for most of the questions focussing on Services a fair agreement was found. The study is a first effort at providing a tool that may be used to monitor and evaluate students' opinions about changes in health promoting schools. Although the HPS Monitoring Questionnaire has face validity, the results of the reliability testing were inconclusive. Further research is warranted. © The Author 2016. Published by Oxford University Press.
Psychometric Evaluation of the Life Orientation Test-Revised in Treated Opiate Dependent Individuals
ERIC Educational Resources Information Center
Hirsch, Jameson K.; Britton, Peter C.; Conner, Kenneth R.
2010-01-01
We examined internal consistency and test-retest reliability of a measure of dispositional optimism, the Life Orientation Test-Revised, in 121 opiate-dependent patients seeking methadone treatment. Internal consistency was adequate at baseline (alpha = 0.69) and follow-up (alpha = 0.72). Low socioeconomic status and being on disability were…
Empirical methods for assessing meaningful neuropsychological change following epilepsy surgery.
Sawrie, S M; Chelune, G J; Naugle, R I; Lüders, H O
1996-11-01
Traditional methods for assessing the neurocognitive effects of epilepsy surgery are confounded by practice effects, test-retest reliability issues, and regression to the mean. This study employs 2 methods for assessing individual change that allow direct comparison of changes across both individuals and test measures. Fifty-one medically intractable epilepsy patients completed a comprehensive neuropsychological battery twice, approximately 8 months apart, prior to any invasive monitoring or surgical intervention. First, a Reliable Change (RC) index score was computed for each test score to take into account the reliability of that measure, and a cutoff score was empirically derived to establish the limits of statistically reliable change. These indices were subsequently adjusted for expected practice effects. The second approach used a regression technique to establish "change norms" along a common metric that models both expected practice effects and regression to the mean. The RC index scores provide the clinician with a statistical means of determining whether a patient's retest performance is "significantly" changed from baseline. The regression norms for change allow the clinician to evaluate the magnitude of a given patient's change on 1 or more variables along a common metric that takes into account the reliability and stability of each test measure. Case data illustrate how these methods provide an empirically grounded means for evaluating neurocognitive outcomes following medical interventions such as epilepsy surgery.
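The practice-adjusted Reliable Change index described in the abstract above can be sketched as follows. This is a minimal sketch assuming the widely used Jacobson-Truax form, in which the standard error of the difference is derived from the baseline standard deviation and the test-retest reliability; the abstract does not give the authors' exact formula or cutoffs, so the function name, parameter names, and formula are assumptions.

```python
import math


def reliable_change(baseline, retest, sd_baseline, r_xx, practice_effect=0.0):
    """Practice-adjusted Reliable Change index (Jacobson-Truax form, assumed).

    baseline, retest : one patient's scores on the two occasions
    sd_baseline      : standard deviation of baseline scores in the reference sample
    r_xx             : test-retest reliability of the measure
    practice_effect  : mean retest gain expected without any intervention
    """
    sem = sd_baseline * math.sqrt(1.0 - r_xx)   # standard error of measurement
    se_diff = math.sqrt(2.0) * sem              # standard error of a difference score
    return (retest - baseline - practice_effect) / se_diff
```

The result is read like a z score. Any fixed cutoff (for example an absolute value above 1.96) would be a convention chosen by the user; the abstract states only that cutoffs were empirically derived.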
Rand, Stacey; Malley, Juliette; Towers, Ann-Marie; Netten, Ann; Forder, Julien
2017-08-18
The Adult Social Care Outcomes Toolkit (ASCOT-SCT4) is a multi-attribute utility index designed for the evaluation of long-term social care services. The measure comprises eight attributes that capture aspects of social care-related quality of life. The instrument has previously been validated with a sample of older adults who used home care services in England. This paper aims to demonstrate the instrument's test-retest reliability and provide evidence for its validity in a diverse sample of adults who use publicly-funded, community-based social care in England. A survey of 770 social care service users was conducted in England. A subsample of 100 service users participated in a follow-up interview between 7 and 21 days after baseline. Spearman rank correlation coefficients between the ASCOT-SCT4 index score and the EQ-5D-3L, the ICECAP-A or ICECAP-O and overall quality of life were used to assess convergent validity. Data on variables hypothesised to be related to the ASCOT-SCT4 index score, as well as rating of individual attributes, were also collected. Hypothesised relationships were tested using one-way ANOVA or Fisher's exact test. Test-retest reliability was assessed using the intra-class correlation coefficient for the ASCOT-SCT4 index score at baseline and follow-up. There were moderate to strong correlations between the ASCOT-SCT4 index and EQ-5D-3L, the ICECAP-A or ICECAP-O, and overall quality of life (all correlations ≥ 0.3). The construct validity was further supported by statistically significant hypothesised relationships between the ASCOT-SCT4 index and individual characteristics in univariate and multivariate analysis. There was also further evidence for the construct validity for the revised Food and drink and Dignity items. The test-retest reliability was considered to be good (ICC = 0.783; 95% CI: 0.678-0.857).
The ASCOT-SCT4 index has good test-retest reliability for adults with physical or sensory disabilities who use social care services. The index score and the attributes appear to be valid for adults receiving social care for support reasons connected to underlying mental health problems, and physical or sensory disabilities. Further reliability testing with a wider sample of social care users is warranted, as is further exploration of the relationship between the ASCOT-SCT4, ICECAP-A/O and EQ-5D-3L indices.
Llerena, Katiah; Wynn, Jonathan K; Hajcak, Greg; Green, Michael F; Horan, William P
2016-07-01
Accurately monitoring one's performance on daily life tasks, and integrating internal and external performance feedback are necessary for guiding productive behavior. Although internal feedback processing, as indexed by the error-related negativity (ERN), is consistently impaired in schizophrenia, initial findings suggest that external performance feedback processing, as indexed by the feedback negativity (FN), may actually be intact. The current study evaluated internal and external feedback processing task performance and test-retest reliability in schizophrenia. 92 schizophrenia outpatients and 63 healthy controls completed a flanker task (ERN) and a time estimation task (FN). Analyses examined the ΔERN and ΔFN defined as difference waves between correct/positive versus error/negative feedback conditions. A temporal principal component analysis was conducted to distinguish the ΔERN and ΔFN from overlapping neural responses. We also assessed test-retest reliability of ΔERN and ΔFN in patients over a 4-week interval. Patients showed reduced ΔERN accompanied by intact ΔFN. In patients, test-retest reliability for both ΔERN and ΔFN over a four-week period was fair to good. Individuals with schizophrenia show a pattern of impaired internal, but intact external, feedback processing. This pattern has implications for understanding the nature and neural correlates of impaired feedback processing in schizophrenia. Published by Elsevier B.V.
Arrow, P; Klobas, E
2015-09-01
Early childhood caries has significant impacts on children and their families. The Early Childhood Oral Health Impact Scale (ECOHIS) is an instrument for capturing the complex dimensions of preschool children's oral health. This study aimed to evaluate the reliability and validity of the instrument among Australian preschool children. Parents/children dyads (n = 286) participating in a treatment trial on early childhood caries completed the scale at baseline, and 33 parents repeated the questionnaire 2-3 weeks later. The validity and reliability of the ECOHIS was determined using tests for convergent and discriminant validity, internal reliability of the instrument and test-retest reliability. Scale impacts were strongly correlated with global oral health ratings (Spearman's correlations; r = 0.51, total score; r = 0.43, child impact; and r = 0.49, family impact; p < 0.001). The scale was significantly associated with children's caries experience, p < 0.001. Cronbach's alpha values were 0.87, 0.89 and 0.74 for the total, the child and the family domains, respectively. Test-retest reliability was 0.92, 0.89 and 0.78 for the total, child and family domains, respectively. The scale demonstrated acceptable validity and reliability for assessing the impact of early childhood caries among Australian preschool children. © 2015 Australian Dental Association.
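The internal-reliability figures in the abstract above are Cronbach's alpha values. A minimal sketch of the standard alpha formula is given below; the function name and the data layout (one row per respondent, one column per item) are assumptions for illustration, not the authors' code.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha from a respondents-by-items score table.

    responses: list of rows, one per respondent, each holding one score per item.
    Uses sample variances throughout; needs at least two items and two respondents.
    """
    k = len(responses[0])  # number of items
    item_variances = [variance([row[j] for row in responses]) for j in range(k)]
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)
```

Alpha rises when items covary strongly relative to their individual variances, which is why a domain made up of fewer or more heterogeneous items can show a lower value than the total scale.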
Nazary-Moghadam, Salman; Zeinalzadeh, Afsaneh; Salavati, Mahyar; Almasi, Simin; Negahban, Hossein
2017-01-01
The aim of the present study was to culturally adapt and evaluate the reliability and validity of the Health Assessment Questionnaire-Disability Index (HAQ-DI) in Iranian patients with rheumatoid arthritis (RA). A total of 234 patients with RA took part in the validation study and 86 participants in the reliability study. Test-retest relative reliability and internal consistency of the Persian version of the HAQ-DI were examined by intraclass correlation coefficient (ICC) and Cronbach's alpha, respectively. Additionally, HAQ-DI construct validity (Spearman's correlation) was examined using the Persian version of the Short-Form 36 Health Survey (SF-36) and activity and severity parameters. The Persian version of the HAQ-DI total score showed excellent test-retest reliability (ICC = 0.98) and internal consistency (Cronbach's alpha = 0.95). Spearman's correlations between the total PHAQ-DI score and activity and severity parameters were above 0.55. Correlations between PHAQ-DI and SF-36 Physical Health were higher than those with SF-36 Mental Health. The Persian version of the HAQ-DI is a reliable and valid culturally adapted instrument for measuring functional limitations in Iranian people with RA. Copyright © 2016 Elsevier Ltd. All rights reserved.
Rychlik, Michał; Samborski, Włodzimierz
2015-01-01
The aim of this study was to assess the validity and test-retest reliability of Thermovision Technique of Dry Needling (TTDN) for the gluteus minimus muscle. TTDN is a new thermography approach used to support trigger point (TrP) diagnostic criteria by the presence of short-term vasomotor reactions occurring in the area where TrPs refer pain. Method. Thirty chronic sciatica patients (n=15 TrP-positive and n=15 TrP-negative) and 15 healthy volunteers were evaluated by TTDN three times during two consecutive days based on TrPs of the gluteus minimus muscle confirmed additionally by referred pain presence. TTDN employs average temperature (Tavr), maximum temperature (Tmax), low/high isothermal-area, and autonomic referred pain phenomenon (AURP) that reflects vasodilatation/vasoconstriction. Validity and test-retest reliability were assessed concurrently. Results. Two components of TTDN validity and reliability, Tavr and AURP, had almost perfect agreement according to κ (e.g., thigh: 0.880 and 0.938; calf: 0.902 and 0.956, resp.). Sensitivity was 100% for each of Tavr, Tmax, AURP, and high isothermal-area, but specificity was 100% for Tavr and AURP only. Conclusion. TTDN is a valid and reliable method for Tavr and AURP measurement to support TrPs diagnostic criteria for the gluteus minimus muscle when digitally evoked referred pain pattern is present. PMID:26137486
Pruitt, Sandi L; Jeffe, Donna B; Yan, Yan; Schootman, Mario
2012-04-01
Limited psychometric research has examined the reliability of self-reported measures of neighbourhood conditions, the effect of measurement error on associations between neighbourhood conditions and health, and potential differences in the reliabilities between neighbourhood strata (urban vs rural and low vs high poverty). We assessed overall and stratified reliability of self-reported perceived neighbourhood conditions using five scales (social and physical disorder, social control, social cohesion, fear) and four single items (multidimensional neighbouring). We also assessed measurement error-corrected associations of these conditions with self-rated health. Using random-digit dialling, 367 women without breast cancer (matched controls from a larger study) were interviewed twice, 2-3 weeks apart. Test-retest (intraclass correlation coefficients (ICC)/weighted κ) and internal consistency reliability (Cronbach's α) were assessed. Differences in reliability across neighbourhood strata were tested using bootstrap methods. Regression calibration corrected estimates for measurement error. All measures demonstrated satisfactory internal consistency (α ≥ 0.70) and either moderate (ICC/κ=0.41-0.60) or substantial (ICC/κ=0.61-0.80) test-retest reliability in the full sample. Internal consistency did not differ by neighbourhood strata. Test-retest reliability was significantly lower among rural (vs urban) residents for two scales (social control, physical disorder) and two multidimensional neighbouring items; test-retest reliability was higher for physical disorder and lower for one multidimensional neighbouring item among the high (vs low) poverty strata. After measurement error correction, the magnitude of associations between neighbourhood conditions and self-rated health were larger, particularly in the rural population. Research is needed to develop and test reliable measures of perceived neighbourhood conditions relevant to the health of rural populations.
Miller, William C; Deathe, A Barry; Speechley, Mark
2003-05-01
To evaluate the internal consistency, test-retest reliability, and construct validity of the Activities-specific Balance Confidence (ABC) Scale among people who have a lower-limb amputation. Retest design. A university-affiliated outpatient amputee clinic in Ontario. Two samples of individuals who have unilateral transtibial and transfemoral amputation. Sample 1 (n=54) was a consecutive and sample 2 (n=329) a convenience sample of all members of the clinic population. Not applicable. Repeated application of the ABC Scale, a 16-item questionnaire that assesses confidence in performing various mobility-related tasks. Correlation to test hypothesized relationships between the ABC Scale and the 2-minute walk (2MWT) and the timed up-and-go (TUG) tests; and assessment of the ability of the ABC Scale to discriminate among groups based on amputation cause, amputation level, mobility device use, automatic stepping ability, wearing time, stair climbing ability, and walking distance. Test-retest reliability (intraclass correlation coefficient) of the ABC Scale was .91 (95% confidence interval [CI], .84-.95) with individual item test-retest coefficients ranging from .53 to .87. Internal consistency, measured by Cronbach alpha, was .95. Hypothesized associations with the 2MWT and TUG test were observed with correlations of .72 (95% CI, .56-.84) and -.70 (95% CI, -.82 to -.53), respectively. The ABC Scale discriminated between all groups except those based on amputation level. Balance confidence, as measured by the ABC Scale, is a construct that provides unique information potentially useful to clinicians who provide amputee rehabilitation. The ABC Scale is reliable, with strong support for validity. Study of the scale's responsiveness is recommended.
Johnson, Matthew W; Bruner, Natalie R
2013-08-01
The Sexual Discounting Task uses the delay discounting framework to examine sexual HIV risk behavior. Previous research showed task performance to be significantly correlated with self-reported HIV risk behavior in cocaine dependence. Test-retest reliability and gender differences had remained unexamined. The present study examined the test-retest reliability of the Sexual Discounting Task. Cocaine-dependent individuals (18 men, 13 women) completed the task in two laboratory visits ∼7 days apart. Participants selected photographs of individuals with whom they were willing to have casual sex. Among these, participants identified the individual most (and least) likely to have a sexually transmitted infection (STI), and the individual with whom he or she most (and least) wanted to have sex. In reference to these individuals, participants rated their likelihood of having unprotected sex versus waiting to have sex with a condom, at various delays. A money delay discounting task was also completed at the first visit. Significant differences in discounting among partner conditions were shown. Differential stability was demonstrated by significant, positive correlations between test and retest for all four partner conditions. Absolute stability was demonstrated by statistical equivalence tests between test and retest, and also supported by a lack of significant differences between test and retest. Men generally discounted significantly more than women for sexual outcomes but not money. Results suggest the Sexual Discounting Task to be a reliable measure in cocaine-dependent individuals, which supports its use as a repeated measure in clinical research, for example, studies examining acute drug effects on sexual risk and the effects of addiction treatment and HIV prevention interventions on sexual risk. PsycINFO Database Record (c) 2013 APA, all rights reserved
Richman, Jesse; Zangalli, Camila; Lu, Lan; Wizov, Sheryl S; Spaeth, Eric; Spaeth, George L
2015-01-01
(1) To determine the ability of a novel, internet-based contrast sensitivity test titled the Spaeth/Richman Contrast Sensitivity Test (SPARCS) to identify patients with glaucoma. (2) To determine the test-retest reliability of SPARCS. A prospective, cross-sectional study of patients with glaucoma and controls was performed. Subjects were assessed by SPARCS and the Pelli-Robson chart. Reliability of each test was assessed by the intraclass correlation coefficient and the coefficient of repeatability. Sensitivity and specificity for identifying glaucoma was also evaluated. The intraclass correlation coefficient for SPARCS was 0.97 and 0.98 for Pelli-Robson. The coefficient of repeatability for SPARCS was ±6.7% and ±6.4% for Pelli-Robson. SPARCS identified patients with glaucoma with 79% sensitivity and 93% specificity. SPARCS has high test-retest reliability. It is easily accessible via the internet and identifies patients with glaucoma well. NCT01300949. Published by the BMJ Publishing Group Limited.
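The coefficient of repeatability reported in the abstract above can be sketched in its common Bland-Altman form, as 1.96 times the standard deviation of the test-retest differences. The abstract does not state which definition was used or how the values were converted to percentages, so both the formula and the function name below are assumptions.

```python
from statistics import stdev


def coefficient_of_repeatability(test, retest, z=1.96):
    """Bland-Altman style coefficient of repeatability (assumed definition).

    test, retest: paired scores for the same subjects on two occasions.
    Returns the half-width within which about 95% of test-retest
    differences are expected to fall, in the units of the scores.
    """
    differences = [b - a for a, b in zip(test, retest)]
    return z * stdev(differences)
```

Dividing the result by the score range of the instrument is one way to express it as a percentage, although whether the authors did so is not stated.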
TCOPPE school environmental audit tool: assessing safety and walkability of school environments.
Lee, Chanam; Kim, Hyung Jin; Dowdy, Diane M; Hoelscher, Deanna M; Ory, Marcia G
2013-09-01
Several environmental audit instruments have been developed for assessing streets, parks and trails, but none for schools. This paper introduces a school audit tool that includes 3 subcomponents: 1) street audit, 2) school site audit, and 3) map audit. It presents the conceptual basis and the development process of this instrument, and the methods and results of the reliability assessments. Reliability tests were conducted by 2 trained auditors on 12 study schools (high-low income and urban-suburban-rural settings). Kappa statistics (categorical, factual items) and ICC (Likert-scale, perceptual items) were used to assess a) interrater, b) test-retest, and c) peak vs. off-peak hour reliability tests. For the interrater reliability test, the average Kappa was 0.839 and the ICC was 0.602. For the test-retest reliability, the average Kappa was 0.903 and the ICC was 0.774. The peak-off peak reliability was 0.801. Rural schools showed the most consistent results in the peak-off peak and test-retest assessments. For interrater tests, urban schools showed the highest ICC, and rural schools showed the highest Kappa. Most items achieved moderate to high levels of reliabilities in all study schools. With proper training, this audit can be used to assess school environments reliably for research, outreach, and policy-support purposes.
NASA Astrophysics Data System (ADS)
Huang, Yuxia; Mao, Mengchai; Zhang, Zong; Zhou, Hui; Zhao, Yang; Duan, Lian; Kreplin, Ute; Xiao, Xiang; Zhu, Chaozhe
2017-01-01
Functional near-infrared spectroscopy (fNIRS) is being increasingly applied to affective and social neuroscience research; however, the reliability of this method is still unclear. This study aimed to evaluate the test-retest reliability of the fNIRS-based prefrontal response to emotional stimuli. Twenty-six participants viewed unpleasant and neutral pictures, and were simultaneously scanned by fNIRS in two sessions three weeks apart. The reproducibility of the prefrontal activation map was evaluated at three spatial scales (mapwise, clusterwise, and channelwise) at both the group and individual levels. The influence of the time interval was also explored and comparisons were made between longer (intersession) and shorter (intrasession) time intervals. The reliabilities of the activation map at the group level for the mapwise (up to 0.88, the highest value appeared in the intersession assessment) and clusterwise scales (up to 0.91, the highest appeared in the intrasession assessment) were acceptable, indicating that fNIRS may be a reliable tool for emotion studies, especially for a group analysis and under larger spatial scales. However, it should be noted that the individual-level and the channelwise fNIRS prefrontal responses were not sufficiently stable. Future studies should investigate which factors influence reliability, as well as the validity of fNIRS used in emotion studies.
RELIABILITY OF ANKLE-FOOT MORPHOLOGY, MOBILITY, STRENGTH, AND MOTOR PERFORMANCE MEASURES.
Fraser, John J; Koldenhoven, Rachel M; Saliba, Susan A; Hertel, Jay
2017-12-01
Assessment of foot posture, morphology, intersegmental mobility, strength and motor control of the ankle-foot complex are commonly used clinically, but measurement properties of many assessments are unclear. To determine test-retest and inter-rater reliability, standard error of measurement, and minimal detectable change of morphology, joint excursion and play, strength, and motor control of the ankle-foot complex. Reliability study. 24 healthy, recreationally-active young adults without history of ankle-foot injury were assessed by two clinicians on two occasions, three to ten days apart. Measurement properties were assessed for foot morphology (foot posture index, total and truncated length, width, arch height), joint excursion (weight-bearing dorsiflexion, rearfoot and hallux goniometry, forefoot inclinometry, 1st metatarsal displacement) and joint play, strength (handheld dynamometry), and motor control rating during intrinsic foot muscle (IFM) exercises. Clinician order was randomized using a Latin Square. The clinicians performed independent examinations and did not confer on the findings for the duration of the study. Test-retest and inter-tester reliability and agreement were assessed using intraclass correlation coefficients (ICC2,k) and weighted kappa (Kw). Test-retest reliability ICC were as follows: morphology: .80-1.00, joint excursion: .58-.97, joint play: -.67 to .84, strength: .67-.92, IFM motor rating: Kw -.01 to .71. Inter-rater reliability ICC were as follows: morphology: .81-1.00, joint excursion: .32-.97, joint play: -1.06 to 1.00, strength: .53-.90, and IFM motor rating: Kw .02-.56. Measures of ankle-foot posture, morphology, joint excursion, and strength demonstrated fair to excellent test-retest and inter-rater reliability. Test-retest reliability for rating of perceived difficulty and motor performance was good to excellent for short-foot, toe-spread-out, and hallux exercises and poor to fair for lesser toe extension.
Joint play measures had poor to fair reliability overall. The findings of this study should be considered when choosing methods of clinical assessment and outcome measures in practice and research. Level of evidence: 3.
Khoddami, Seyyedeh Maryam; Talebian, Saeed; Izadi, Farzad; Ansari, Noureddin Nakhostin
2017-05-01
The study aims to evaluate the reliability and the discriminative validity of surface electromyography (sEMG) in the assessment of patients with primary muscle tension dysphonia (MTD). The study design is cross-sectional. Fifteen patients with primary MTD (mean age: 34.07 ± 10.99 years) and 15 healthy volunteers (mean age: 34.53 ± 10.63 years) were included. All participants underwent evaluation of sEMG to record the electrical activity of the thyrohyoid and cricothyroid muscles. The outcome measures were the root mean square (RMS), activity peak, duration, and time to the peak activity, which were obtained during /a/ and /i/ prolongation for test-retest reliability. The test-retest reliability was good to excellent for the RMS and peak activity measures (intraclass correlation coefficient [agreement] [ICCagreement] = 0.49-0.98). The reliability for the activity duration was poor to excellent (ICCagreement = 0.19-0.9). Poor test-retest reliability was found for the time to peak measure (ICCagreement = 0.15-0.37). The standard error of measurement for all sEMG measures was between 0.41 and 2.05. The smallest detectable change (SDC) was calculated between 1.13 and 5.66. The highest SDC values were obtained for the peak and the lowest SDCs were documented for the duration (5.66 and 1.13, respectively). None of the sEMG measures was able to discriminate between the MTD patients and healthy subjects (P > 0.05). The sEMG is a reliable tool to measure the RMS, the peak activity, and the activity duration in primary MTD. However, it is not able to discriminate the patients with primary MTD from healthy subjects. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
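The relationship between the standard error of measurement (SEM) and the smallest detectable change (SDC) reported in the abstract above can be illustrated with the conventional 95% formula, SDC = 1.96 × √2 × SEM. The abstract does not state the formula the authors used, so this is an assumption, as are the function names.

```python
import math


def sem_from_icc(sd, icc):
    """Standard error of measurement from a sample SD and a reliability coefficient."""
    return sd * math.sqrt(1.0 - icc)


def smallest_detectable_change(sem, z=1.96):
    """Smallest detectable change at the confidence level implied by z.

    The sqrt(2) factor reflects that a change score combines the
    measurement error of two occasions.
    """
    return z * math.sqrt(2.0) * sem
```

Applied to the reported SEM limits of 0.41 and 2.05, this formula gives about 1.14 and 5.68, close to the reported SDC limits of 1.13 and 5.66; the small gap is consistent with rounding of the published SEM values.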
The Trojan Lifetime Champions Health Survey: development, validity, and reliability.
Sorenson, Shawn C; Romano, Russell; Scholefield, Robin M; Schroeder, E Todd; Azen, Stanley P; Salem, George J
2015-04-01
Self-report questionnaires are an important method of evaluating lifespan health, exercise, and health-related quality of life (HRQL) outcomes among elite, competitive athletes. Few instruments, however, have undergone formal characterization of their psychometric properties within this population. To evaluate the validity and reliability of a novel health and exercise questionnaire, the Trojan Lifetime Champions (TLC) Health Survey. Descriptive laboratory study. A large National Collegiate Athletic Association Division I university. A total of 63 university alumni (age range, 24 to 84 years), including former varsity collegiate athletes and a control group of nonathletes. Participants completed the TLC Health Survey twice at a mean interval of 23 days with randomization to the paper or electronic version of the instrument. Content validity, feasibility of administration, test-retest reliability, parallel-form reliability between paper and electronic forms, and estimates of systematic and typical error versus differences of clinical interest were assessed across a broad range of health, exercise, and HRQL measures. Correlation coefficients, including intraclass correlation coefficients (ICCs) for continuous variables and κ agreement statistics for ordinal variables, for test-retest reliability averaged 0.86, 0.90, 0.80, and 0.74 for HRQL, lifetime health, recent health, and exercise variables, respectively. Correlation coefficients, again ICCs and κ, for parallel-form reliability (ie, equivalence) between paper and electronic versions averaged 0.90, 0.85, 0.85, and 0.81 for HRQL, lifetime health, recent health, and exercise variables, respectively. Typical measurement error was less than the a priori thresholds of clinical interest, and we found minimal evidence of systematic test-retest error. 
We found strong evidence of content validity, convergent construct validity with the Short-Form 12 Version 2 HRQL instrument, and feasibility of administration in an elite, competitive athletic population. These data suggest that the TLC Health Survey is a valid and reliable instrument for assessing lifetime and recent health, exercise, and HRQL, among elite competitive athletes. Generalizability of the instrument may be enhanced by additional, larger-scale studies in diverse populations.
Reference values for the muscle power sprint test in 6- to 12-year-old children.
Douma-van Riet, Danielle; Verschuren, Olaf; Jelsma, Dorothee; Kruitwagen, Cas; Smits-Engelsman, Bouwien; Takken, Tim
2012-01-01
The aims of this study were (1) to develop centile reference values for anaerobic performance of Dutch children tested using the Muscle Power Sprint Test (MPST) and (2) to examine the test-retest reliability of the MPST. Children who were developing typically (178 boys and 201 girls) and aged 6 to 12 years (mean = 8.9 years) were recruited. The MPST was administered to 379 children, and test-retest reliability was examined in 47 children. MPST scores were transformed into centile curves, which were created using generalized additive models for location, scale, and shape. Height-related reference curves were created for both genders. Excellent (intraclass correlation coefficient = 0.98) test-retest reliability was demonstrated. The reference values for the MPST of children who are developing typically and aged 6 to 12 years can serve as a clinical standard in pediatric physical therapy practice. The MPST is a reliable and practical method for determining anaerobic performance in children.
Kern, Robert S.
2013-01-01
The psychometric properties of 4 paradigms adapted from the social neuroscience literature were evaluated to determine their suitability for use in clinical trials of schizophrenia. This 2-site study (University of California, Los Angeles and University of North Carolina) included 173 clinically stable schizophrenia outpatients and 88 healthy controls. The social cognition battery was administered twice to the schizophrenia group (baseline, 4-week retest) and once to the control group. The 4 paradigms included 2 that assess perception of nonverbal social and action cues (basic biological motion and emotion in biological motion) and 2 that involve higher level inferences about self and others’ mental states (self-referential memory and empathic accuracy). Each paradigm was evaluated on (1) patient vs healthy control group differences, (2) test-retest reliability, (3) utility as a repeated measure, and (4) tolerability. Of the 4 paradigms, empathic accuracy demonstrated the strongest characteristics, including large between-group differences, adequate test-retest reliability (.72), negligible practice effects, and good tolerability ratings. The other paradigms showed weaker psychometric characteristics in their current forms. These findings highlight challenges in adapting social neuroscience paradigms for use in clinical trials. PMID:24072805
Keppler, Hannah; Dhooge, Ingeborg; Maes, Leen; D'haenens, Wendy; Bockstael, Annelies; Philips, Birgit; Swinnen, Freya; Vinck, Bart
2010-02-01
Knowledge regarding the variability of transient-evoked otoacoustic emissions (TEOAEs) and distortion product otoacoustic emissions (DPOAEs) is essential in clinical settings and improves their utility in monitoring hearing status over time. In the current study, TEOAEs and DPOAEs were measured with commercially available OAE equipment in 56 normal-hearing ears during three sessions. Reliability was analysed for the retest measurement without probe-refitting, the immediate retest measurement with probe-refitting, and retest measurements after one hour and one week. The highest reliability was obtained in the retest measurement without probe-refitting, and reliability decreased with increasing time interval between measurements. For TEOAEs, the lowest reliability was seen at half-octave frequency bands 1.0 and 1.4 kHz, whereas for DPOAEs the half-octave frequency band 8.0 kHz also had poor reliability. A higher primary tone level combination for DPOAEs yielded better reliability of DPOAE amplitudes. External environmental noise seemed to be the dominating noise source in normal-hearing subjects, decreasing the reliability of emission amplitudes especially in the low-frequency region.
Busch, Robyn M; Lineweaver, Tara T; Ferguson, Lisa; Haut, Jennifer S
2015-06-01
Reliable change indices (RCIs) and standardized regression-based (SRB) change score norms permit evaluation of meaningful changes in test scores following treatment interventions, like epilepsy surgery, while accounting for test-retest reliability, practice effects, score fluctuations due to error, and relevant clinical and demographic factors. Although these methods are frequently used to assess cognitive change after epilepsy surgery in adults, they have not been widely applied to examine cognitive change in children with epilepsy. The goal of the current study was to develop RCIs and SRB change score norms for use in children with epilepsy. Sixty-three children with epilepsy (age range: 6-16; M=10.19, SD=2.58) underwent comprehensive neuropsychological evaluations at two time points an average of 12 months apart. Practice effect-adjusted RCIs and SRB change score norms were calculated for all cognitive measures in the battery. Practice effects were quite variable across the neuropsychological measures, with the greatest differences observed among older children, particularly on the Children's Memory Scale and Wisconsin Card Sorting Test. There was also notable variability in test-retest reliabilities across measures in the battery, with coefficients ranging from 0.14 to 0.92. Reliable change indices and SRB change score norms for use in assessing meaningful cognitive change in children following epilepsy surgery are provided for measures with reliability coefficients above 0.50. This is the first study to provide RCIs and SRB change score norms for a comprehensive neuropsychological battery based on a large sample of children with epilepsy. Tables to aid in evaluating cognitive changes in children who have undergone epilepsy surgery are provided for clinical use. An Excel sheet to perform all relevant calculations is also available to interested clinicians or researchers. Copyright © 2015 Elsevier Inc. All rights reserved.
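The abstract above describes practice effect-adjusted reliable change indices (RCIs) but does not give the formula. The sketch below uses one common formulation, in which the observed change minus the mean practice effect is divided by the standard error of the difference, with SE_diff = √2 × SD × √(1 − r). This formulation, the 90% cutoff (z = 1.645), the function name, and all example numbers are assumptions for illustration and are not taken from the study.

```python
import math


def practice_adjusted_rci(baseline, retest, mean_practice_effect,
                          sd_baseline, reliability, z=1.645):
    """Return (rci, label) for one patient's change score.

    Assumed formulation:
        SEM     = SD * sqrt(1 - r)
        SE_diff = sqrt(2) * SEM
        RCI     = ((retest - baseline) - mean_practice_effect) / SE_diff
    """
    if not 0.0 <= reliability < 1.0:
        raise ValueError("reliability must lie in [0, 1)")
    sem = sd_baseline * math.sqrt(1.0 - reliability)
    se_diff = math.sqrt(2.0) * sem
    rci = ((retest - baseline) - mean_practice_effect) / se_diff
    if rci > z:
        label = "improved"
    elif rci < -z:
        label = "declined"
    else:
        label = "stable"
    return rci, label


# Hypothetical standard scores: an 8-point drop on a test where the
# reference group gained 3 points on retest.
rci, label = practice_adjusted_rci(
    baseline=100, retest=92, mean_practice_effect=3,
    sd_baseline=15, reliability=0.80,
)
print(f"RCI = {rci:.2f} ({label})")
```

Because the denominator grows as reliability falls, measures with low test-retest coefficients need very large score changes before any change counts as reliable, which fits the study's choice to provide norms only for measures with coefficients above 0.50.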
Reliability and validity of migraine disability assessment questionnaire-Thai version (Thai-MIDAS).
Seethong, Piman; Nimmannit, Akarin; Chaisewikul, Rungsan; Prayoonwiwat, Naraporn; Chotinaiwattarakul, Wattanachai
2013-02-01
To assess the validity and test-retest reliability of a Thai translation of the Migraine Disability Assessment (MIDAS) Questionnaire in Thai patients with migraine. Migraineurs from the Headache Clinic in Siriraj Hospital were recruited and asked to complete a 13-week diary and to answer the Thai-MIDAS once. Some participants were asked to complete a second Thai-MIDAS within the next 2 weeks for test-retest reliability. Ninety-three patients completed the 13-week diaries. The age range was 18-58 years, with a mean of 37.69 ± 9.60 years. All 5 items and the total score of the Thai-MIDAS were moderately correlated with data from the 13-week diary (Spearman's correlation coefficient = 0.32-0.62). The test-retest reliability of the total Thai-MIDAS score in 30 patients was high (intraclass correlation coefficient [ICC] = 0.76, 95% CI 0.49-0.88). The present study shows that the Thai-MIDAS has satisfactory validity and reliability in comparison with the original English MIDAS version.
Benjamin, Sara E; Neelon, Brian; Ball, Sarah C; Bangdiwala, Shrikant I; Ammerman, Alice S; Ward, Dianne S
2007-01-01
Background: Few assessment instruments have examined the nutrition and physical activity environments in child care, and none are self-administered. Given the emerging focus on child care settings as a target for intervention, a valid and reliable measure of the nutrition and physical activity environment is needed. Methods: To measure inter-rater reliability, 59 child care center directors and 109 staff completed the self-assessment concurrently, but independently. Three weeks later, a repeat self-assessment was completed by a sub-sample of 38 directors to assess test-retest reliability. To assess criterion validity, a researcher-administered environmental assessment was conducted at 69 centers and was compared to a self-assessment completed by the director. A weighted kappa test statistic and percent agreement were calculated to assess agreement for each question on the self-assessment. Results: For inter-rater reliability, kappa statistics ranged from 0.20 to 1.00 across all questions. Test-retest reliability of the self-assessment yielded kappa statistics that ranged from 0.07 to 1.00. The inter-quartile kappa statistic ranges for inter-rater and test-retest reliability were 0.45 to 0.63 and 0.27 to 0.45, respectively. When percent agreement was calculated, questions ranged from 52.6% to 100% for inter-rater reliability and 34.3% to 100% for test-retest reliability. Kappa statistics for validity ranged from -0.01 to 0.79, with an inter-quartile range of 0.08 to 0.34. Percent agreement for validity ranged from 12.9% to 93.7%. Conclusion: This study provides estimates of criterion validity, inter-rater reliability and test-retest reliability for an environmental nutrition and physical activity self-assessment instrument for child care. Results indicate that the self-assessment is a stable and reasonably accurate instrument for use with child care interventions.
We therefore recommend the Nutrition and Physical Activity Self-Assessment for Child Care (NAP SACC) instrument to researchers and practitioners interested in conducting healthy weight intervention in child care. However, a more robust, less subjective measure would be more appropriate for researchers seeking an outcome measure to assess intervention impact. PMID:17615078
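The abstract above reports both a weighted kappa and percent agreement for each question, and the two can diverge sharply because kappa corrects for chance agreement. The sketch below is a minimal pure-Python illustration of both statistics for ordinal responses coded 0 to k − 1. The choice of linear weights, the function names, and the example responses are assumptions; the abstract does not say which weighting scheme was used.

```python
def percent_agreement(first, second):
    """Percentage of paired responses that are identical."""
    if not first or len(first) != len(second):
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(first, second))
    return 100.0 * matches / len(first)


def weighted_kappa(first, second, n_categories, weights="linear"):
    """Cohen's weighted kappa for ordinal categories coded 0..n_categories-1."""
    n = len(first)
    k = n_categories
    if n == 0 or n != len(second) or k < 2:
        raise ValueError("need equal-length, non-empty inputs and k >= 2")

    # Observed joint proportions and the two marginal distributions.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(first, second):
        observed[a][b] += 1.0 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def disagreement(i, j):
        d = abs(i - j) / (k - 1)
        return d if weights == "linear" else d * d  # otherwise quadratic

    obs = sum(disagreement(i, j) * observed[i][j]
              for i in range(k) for j in range(k))
    exp = sum(disagreement(i, j) * row[i] * col[j]
              for i in range(k) for j in range(k))
    if exp == 0.0:
        raise ValueError("kappa is undefined when all ratings are identical")
    return 1.0 - obs / exp


# Hypothetical test and retest answers to one 3-option question.
test_answers = [0, 1, 2, 2, 1, 0]
retest_answers = [0, 1, 2, 1, 1, 0]
kappa = weighted_kappa(test_answers, retest_answers, n_categories=3)
agreement = percent_agreement(test_answers, retest_answers)
print(f"weighted kappa = {kappa:.2f}, percent agreement = {agreement:.1f}%")
```

When nearly all respondents choose the same option, expected chance agreement is high, so a question can show high percent agreement yet a kappa near zero. That pattern would explain a question-level range as wide as the one reported here.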
Van de Velde, Dominique; Coorevits, Pascal; Sabbe, Lode; De Baets, Stijn; Bracke, Piet; Van Hove, Geert; Josephsson, Staffan; Ilsbroukx, Stephan; Vanderstraeten, Guy
2017-03-01
To examine the internal consistency, test-retest reliability, construct validity, discriminant validity and responsiveness of the Ghent Participation Scale. Cross-sectional study with a test-retest sample. Six outpatient rehabilitation centres in Belgium. A total of 365 outpatients from eight diagnostic groups. The Ghent Participation Scale, the Impact on Participation and Autonomy, the Utrecht Scale for Evaluation of Rehabilitation-Participation and the Medical outcome study Short Form SF-36. The Ghent Participation Scale was found to have good internal consistency (Cronbach's α between 0.75 and 0.83). At item level, the test-retest reliability was good; weighted kappas ranged between 0.57 and 0.88. On the dimension level intraclass correlation coefficients ranged between 0.80 and 0.90. Evidence for construct validity came from high correlations between the subscales of the Ghent Participation Scale and four subscales of the Impact on Participation and Autonomy (range, r = -0.71 to -0.87) and two subscales of the Utrecht Scale for Evaluation of Rehabilitation-Participation (range, r = 0.54 to 0.72). Standardized response mean ranged between 0.23 and 0.68 and the area under the curve ranged between 68% and 88%. The Ghent Participation Scale appears to be a valid and reliable method of assessing participation irrespective of the respondent's health condition. The Ghent Participation Scale is responsive and is able to detect changes over time.
Martínez-Gómez, David; Martínez-de-Haro, Vicente; Pozo, Tamara; Welk, Gregory J; Villagra, Ariel; Calle, Marisa E; Marcos, Ascensión; Veiga, Oscar L
2009-01-01
Questionnaires are feasible instruments to assess physical activity (PA) in large samples. The aim of the current study was to evaluate the reliability and validity of the PAQ-A questionnaire in Spanish adolescents using the measurement of PA by accelerometer as criterion. In a sample of 82 adolescents, aged 12 to 17 years, the PAQ-A was administered twice, 1 week apart. Reliability was analyzed by the intraclass correlation coefficient (ICC) and internal consistency by Cronbach's alpha coefficient. Two hundred thirty-two adolescents, aged 13-17 years, completed the PAQ-A and wore the ActiGraph GT1M accelerometer for 7 days. The PAQ-A was compared against total PA and moderate to vigorous PA (MVPA) obtained by the accelerometer. Test-retest reliability showed ICC = 0.71 for the final score of PAQ-A. Internal consistency was alpha = 0.65 in the first self-report and alpha = 0.67 in the retest in the sample of 82 adolescents, and alpha = 0.74 in the sample of 232 adolescents. The PAQ-A was moderately correlated with total PA (rho = 0.39) and MVPA (rho = 0.34) assessed by the accelerometer. Against the accelerometer, the PAQ-A showed significant moderate correlations in boys but not in girls. The PAQ-A questionnaire shows adequate reliability and reasonable validity for assessing PA in Spanish adolescents.
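Internal consistency in the abstract above is summarized with Cronbach's alpha. As a minimal sketch, alpha can be computed from a respondents-by-items table as k/(k − 1) × (1 − Σ item variances / variance of total scores). The function name and the small response table are hypothetical and do not come from the PAQ-A data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    n_items = len(responses[0])
    if n_items < 2 or len(responses) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    if any(len(r) != n_items for r in responses):
        raise ValueError("every respondent must answer every item")

    items = list(zip(*responses))                       # one tuple per item
    sum_item_variances = sum(variance(item) for item in items)
    total_variance = variance([sum(r) for r in responses])
    return (n_items / (n_items - 1)) * (1.0 - sum_item_variances / total_variance)


# Hypothetical answers from 5 respondents to 3 items on a 1-5 scale.
answers = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [3, 3, 4],
    [5, 4, 5],
]
alpha = cronbach_alpha(answers)
print(f"Cronbach's alpha = {alpha:.2f}")
```

Alpha rises with the number of items as well as with inter-item correlation, so values from instruments of different lengths are not directly comparable.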
Vannebo, Katrine Tranaas; Iversen, Vegard Moe; Fimland, Marius Steiro; Mork, Paul Jarle
2018-03-02
There is a lack of test-retest reliability studies of measurements of cervical muscle strength, taking into account gender and possible learning effects. To investigate test-retest reliability of measurement of maximal isometric cervical muscle strength by handheld dynamometry. Thirty women (age 20-58 years) and 28 men (age 20-60 years) participated in the study. Maximal isometric strength (neck flexion, neck extension, and right/left lateral flexion) was measured on three separate days at least five days apart by one evaluator. Intra-rater consistency tended to improve from day 1-2 measurements to day 2-3 measurements in both women and men. In women, the intra-class correlation coefficients (ICC) for day 2 to day 3 measurements were 0.91 (95% confidence interval [CI], 0.82-0.95) for neck flexion, 0.88 (95% CI, 0.76-0.94) for neck extension, 0.84 (95% CI, 0.68-0.92) for right lateral flexion, and 0.89 (95% CI, 0.78-0.95) for left lateral flexion. The corresponding ICCs among men were 0.86 (95% CI, 0.72-0.93) for neck flexion, 0.93 (95% CI, 0.85-0.97) for neck extension, 0.82 (95% CI, 0.65-0.91) for right lateral flexion and 0.73 (95% CI, 0.50-0.87) for left lateral flexion. This study describes a reliable and easy-to-administer test for assessing maximal isometric cervical muscle strength.
Lim, Chun Yi; Law, Mary; Khetani, Mary; Rosenbaum, Peter; Pollock, Nancy
2018-08-01
To estimate the psychometric properties of a culturally adapted version of the Young Children's Participation and Environment Measure (YC-PEM) for use among Singaporean families. This is a prospective cohort study. Caregivers of 151 Singaporean children with (n = 83) and without (n = 68) developmental disabilities, between 0 and 7 years, completed the YC-PEM (Singapore) questionnaire with 3 participation scales (frequency, involvement, and change desired) and 1 environment scale for three settings: home, childcare/preschool, and community. Setting-specific estimates of internal consistency, test-retest reliability, and construct validity were obtained. Internal consistency estimates varied from .59 to .92 for the participation scales and .73 to .79 for the environment scale. Test-retest reliability estimates from the YC-PEM conducted on two occasions, 2-3 weeks apart, varied from .39 to .89 for the participation scales and from .65 to .80 for the environment scale. Moderate to large differences were found in participation and perceived environmental support between children with and without a disability. YC-PEM (Singapore) scales have adequate psychometric properties except for low internal consistency for the childcare/preschool participation frequency scale and low test-retest reliability for home participation frequency scale. The YC-PEM (Singapore) may be used for population-level studies involving young children with and without developmental disabilities.
Measuring the Characteristic Topography of Brain Stiffness with Magnetic Resonance Elastography
Murphy, Matthew C.; Huston, John; Jack, Clifford R.; Glaser, Kevin J.; Senjem, Matthew L.; Chen, Jun; Manduca, Armando; Felmlee, Joel P.; Ehman, Richard L.
2013-01-01
Purpose: To develop a reliable magnetic resonance elastography (MRE)-based method for measuring regional brain stiffness. Methods: First, simulation studies were used to demonstrate how stiffness measurements can be biased by changes in brain morphometry, such as those due to atrophy. Adaptive postprocessing methods were created that significantly reduce the spatial extent of edge artifacts and eliminate atrophy-related bias. Second, a pipeline for regional brain stiffness measurement was developed and evaluated for test-retest reliability in 10 healthy control subjects. Results: This technique indicates high test-retest repeatability with a typical coefficient of variation of less than 1% for global brain stiffness and less than 2% for the lobes of the brain and the cerebellum. Furthermore, this study reveals that the brain possesses a characteristic topography of mechanical properties, and also that lobar stiffness measurements tend to correlate with one another within an individual. Conclusion: The methods presented in this work are resistant to noise- and edge-related biases that are common in the field of brain MRE, demonstrate high test-retest reliability, and provide independent regional stiffness measurements. This pipeline will allow future investigations to measure changes to the brain’s mechanical properties and how they relate to the characteristic topographies that are typical of many neurologic diseases. PMID:24312570
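Repeatability in the abstract above is expressed as a coefficient of variation (CV). A minimal sketch of one way to obtain a "typical" CV is to compute the within-subject CV (SD of the repeated measurements divided by their mean) for each person and take the median across people. The abstract does not say how the typical value was summarized, so the median is an assumption, and the function names and stiffness values (in kPa) are hypothetical.

```python
from statistics import mean, median, stdev


def within_subject_cv_percent(repeats):
    """CV (%) of one subject's repeated measurements."""
    if len(repeats) < 2:
        raise ValueError("need at least two repeated measurements")
    return 100.0 * stdev(repeats) / mean(repeats)


def typical_cv_percent(subjects):
    """Median within-subject CV (%) across subjects (assumed summary)."""
    return median([within_subject_cv_percent(r) for r in subjects])


# Hypothetical global stiffness (kPa) from two scans of three volunteers.
scans = [
    [2.80, 2.82],
    [3.00, 2.97],
    [2.65, 2.66],
]
typical_cv = typical_cv_percent(scans)
print(f"typical CV = {typical_cv:.2f}%")
```

A CV expresses measurement noise relative to the size of the quantity itself, which makes it convenient for judging whether an expected disease-related change in stiffness would exceed scan-to-scan variation.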
Cuesta-Barriuso, Rubén; Torres-Ortuño, Ana; Galindo-Piñana, Pilar; Nieto-Munuera, Joaquín; Duncan, Natalie; López-Pina, José Antonio
2017-01-01
We aimed to conduct a validation in Spanish of the Validated Hemophilia Regimen Treatment Adherence Scale - Prophylaxis (VERITAS-Pro) questionnaire for use in patients with hemophilia under prophylactic treatment. The VERITAS-Pro scale was adapted through a process of back translation from English to Spanish. A bilingual native Spanish translator translated the scale from English to Spanish. Subsequently, a bilingual native English translator translated the scale from Spanish to English. The disagreements were resolved by agreement between the research team and translators. Seventy-three patients with hemophilia, aged 13-62 years, were enrolled in the study. The scale was applied twice (2 months apart) to evaluate the test-retest reliability. Internal consistency reliability was lower on the Spanish VERITAS-Pro than on the English version. Test-retest reliability was high, ranging from 0.83 to 0.92. No significant differences (P > 0.05) were found between test and retest scores in subscales of VERITAS-Pro. In general, Spanish patients showed higher rates of nonadherence than American patients in all subscales. The Spanish version of the VERITAS-Pro has high levels of consistency and empirical validity. This scale can be administered to assess the degree of adherence of prophylactic treatment in patients with hemophilia.
Urdu translation of the Hamilton Rating Scale for Depression: Results of a validation study
Hashmi, Ali M.; Naz, Shahana; Asif, Aftab; Khawaja, Imran S.
2016-01-01
Objective: To develop a standardized validated version of the Hamilton Rating Scale for Depression (HAM-D) in Urdu. Methods: After translation of the HAM-D into the Urdu language following standard guidelines, the final Urdu version (HAM-D-U) was administered to 160 depressed outpatients. Inter-item correlation was assessed by calculating Cronbach's alpha. Correlation between HAM-D-U scores at baseline and after a 2-week interval was evaluated for test-retest reliability. Moreover, scores of two clinicians on HAM-D-U were compared for inter-rater reliability. For establishing concurrent validity, scores of HAM-D-U and BDI-U were compared by using the Spearman correlation coefficient. The study was conducted at Mayo Hospital, Lahore, from May to December 2014. Results: The Cronbach's alpha for HAM-D-U was 0.71. Composite scores for HAM-D-U at baseline and after a 2-week interval were also highly correlated with each other (Spearman correlation coefficient 0.83, p < 0.01), indicating good test-retest reliability. Composite scores for HAM-D-U and BDI-U were positively correlated with each other (Spearman correlation coefficient 0.85, p < 0.01), indicating good concurrent validity. Scores of two clinicians for HAM-D-U were also positively correlated (Spearman correlation coefficient 0.82, p < 0.01), indicating good inter-rater reliability. Conclusion: The HAM-D-U is a valid and reliable instrument for the assessment of depression. It shows good inter-rater and test-retest reliability. The HAM-D-U can be a tool either for clinical management or research. PMID:28083049
Validity and Reliability of a General Nutrition Knowledge Questionnaire for Japanese Adults.
Matsumoto, Mai; Tanaka, Rie; Ikemoto, Shinji
2017-01-01
Nutrition knowledge is necessary for individuals to adopt appropriate dietary habits, and needs to be evaluated before nutrition education is provided. However, there is no tool to assess general nutrition knowledge of adults in Japan. Our aim was to determine the validity and reliability of a general nutrition knowledge questionnaire for Japanese adults. We developed a pilot version of the Japanese general nutrition knowledge questionnaire (JGNKQ) and administered it to 1,182 Japanese adults aged 18-64 y to assess content validity and internal reliability. The JGNKQ was further modified based on the pilot study, and the final version consisted of 5 sections and 147 items. The JGNKQ was administered twice in 2015 to female undergraduate Japanese students in their senior year to assess construct validity and test-retest reliability. Ninety-six students majoring in nutrition and 44 students in other majors who studied at the same university completed the first questionnaire. Seventy-five students completed the questionnaire twice. The responses from the first questionnaire and both questionnaires were used to assess construct validity and test-retest reliability, respectively. The students majoring in nutrition had significantly higher scores than the students in other majors on all sections of the questionnaire (p<0.001); therefore, the questionnaire had good construct validity. The test-retest correlation coefficients for the overall questionnaire and for each section except "The use of dietary information to make dietary choices" were 0.75, 0.67, 0.67, 0.68 and 0.61, respectively. We suggest that the JGNKQ is an effective tool to assess the nutrition knowledge level of Japanese adults.
Evensen, Natalie M; Kvåle, Alice; Braekken, Ingeborg H
2015-09-01
There is a lack of functional objective tests available to measure functional status in women with pelvic girdle pain (PGP). The purpose of this study was to establish test-retest and intertester reliability of the Timed Up and Go (TUG) test and Ten-metre Timed Walk Test (10mTWT) in pregnant women with PGP. A convenience sample of women was recruited over a 4-month period and tested on two occasions, 1 week apart, to determine test-retest reliability. Intertester reliability was established between two assessors at the first testing session. Subjects were instructed to undertake the TUG and 10mTWT at maximum speed. One practise trial and two timed trials for each walking test were undertaken on Day 1, and one practise trial and one timed trial on Day 2. Seventeen women with PGP aged 31.1 years (SD [standard deviation] = 2.3) and 28.7 weeks pregnant (SD = 7.4) completed gait testing. Test-retest reliability using the intraclass correlation coefficient (ICC) was excellent for the TUG (0.88) and good for the 10mTWT (0.74). Intertester reliability was determined in the first 13 participants, with excellent ICC values being found for both walking tests (TUG: 0.95; 10mTWT: 0.94). This study demonstrated that the TUG and 10mTWT undertaken at fast pace are reliable, objective functional tests in pregnant women with PGP. While both tests are suitable for use in clinical and research settings, we would recommend the TUG given its higher test-retest reliability and because it requires less space and time to set up and score. Future studies with a larger sample are warranted to confirm the results of this study. Copyright © 2015 John Wiley & Sons, Ltd.
Huang, Min H; Miller, Kara; Smith, Kristin; Fredrickson, Kayle; Shilling, Tracy
2016-01-01
Cancer is primarily a disease of older adults. About 77% of all cancers are diagnosed in persons aged 55 years and older. Cancer and its treatment can cause diverse sequelae impacting body systems underlying balance control. No study has examined the psychometric properties of balance assessment tools in older cancer survivors, presenting a significant challenge in the selection of outcome measures for clinicians treating this fast-growing population. This study aimed to determine the reliability, validity, and minimal detectable change (MDC) of the Balance Evaluation System Test (BESTest), Mini-Balance Evaluation Systems Test (Mini-BESTest), and Brief-Balance Evaluation Systems Test (Brief-BESTest) in community-dwelling older cancer survivors. This study used a cross-sectional design. Twenty breast and 8 prostate cancer survivors participated [age (SD) = 68.4 (8.13) years]. The BESTest and Activity-specific Balance Confidence (ABC) Scale were administered during the first session. Scores of the Mini-BESTest and Brief-BESTest were extracted from the BESTest scores. The BESTest was repeated within 1 to 2 weeks by the same rater to determine the test-retest reliability. For the analysis of the inter-rater reliability, 21 participants were randomly selected to be evaluated by 2 raters. A primary rater administered the test. The 2 raters independently and concurrently scored the performance of the participants. Each rater recorded the ratings separately on the scoring sheet. No discussion among the raters was allowed throughout the testing. Intraclass correlation coefficients (ICCs), standard error of measurement, MDC, and Bland-Altman plots were calculated. Concurrent validity of these balance tests with the ABC Scale was examined using the Spearman correlation.
The BESTest, Mini-BESTest, and Brief-BESTest had high test-retest (ICC = 0.90-0.94) and interrater reliability (ICC = 0.86-0.96), small standard error of measurement (0.86-2.47 points), and small MDC (2.39-6.86 points). The Bland-Altman plot revealed no systematic errors. The scores of the BESTest, Mini-BESTest, and Brief-BESTest were correlated significantly with those of the ABC Scale (P < .01), supporting their concurrent validity. The BESTest, Mini-BESTest, and Brief-BESTest showed high interrater and test-retest reliability, and excellent concurrent validity with the ABC Scale for community-dwelling cancer survivors aged 55 years and older who had completed cancer treatments for at least 3 months. Future studies are necessary to determine the predictive values for determining fall risks using balance assessment tools in older cancer survivors. Clinicians can utilize the BESTest and its short versions to evaluate balance problems in community-dwelling older cancer survivors and apply the established MDC to assess the intervention outcomes.
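The abstract above uses Bland-Altman plots to check for systematic error between sessions. The numbers behind such a plot are the mean test-retest difference (the bias) and the 95% limits of agreement, bias ± 1.96 × SD of the differences. The sketch below computes those values only and draws no plot. The function name and the scores are hypothetical.

```python
from statistics import mean, stdev


def bland_altman_limits(test, retest):
    """Return (bias, lower_limit, upper_limit) for paired scores.

    bias is the mean of (retest - test); the limits of agreement are
    bias +/- 1.96 * SD of the paired differences.
    """
    if len(test) != len(retest) or len(test) < 2:
        raise ValueError("need at least two paired observations")
    differences = [b - a for a, b in zip(test, retest)]
    bias = mean(differences)
    spread = 1.96 * stdev(differences)
    return bias, bias - spread, bias + spread


# Hypothetical balance scores from two sessions for five participants.
session_1 = [20, 25, 30, 22, 28]
session_2 = [21, 24, 31, 23, 27]
bias, lower, upper = bland_altman_limits(session_1, session_2)
print(f"bias = {bias:.2f}, limits of agreement = [{lower:.2f}, {upper:.2f}]")
```

A bias close to zero, with zero well inside the limits of agreement, is the usual numerical counterpart of the statement that the plot showed no systematic error.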
Li, Hong-Yan; Bi, Rui-Xue; Zhong, Qing-Ling
2017-12-01
Disaster nurse education has received increasing attention in China. Knowing undergraduate nursing students' disaster response abilities helps to promote teaching and learning. However, there are few valid and reliable tools that measure the abilities of disaster response in undergraduate nursing students. To develop a self-report scale of self-efficacy in disaster response for Chinese undergraduate nursing students and test its psychometric properties. Nursing students (N=318) from two medical colleges were chosen by purposive sampling. The Disaster Response Self-Efficacy Scale (DRSES) was developed and psychometrically tested. Reliability and content validity were studied. Construct validity was tested by exploratory and confirmatory factor analysis. Reliability was tested by internal consistency and test-retest reliability. The DRSES consisted of 3 factors and 19 items with a 5-point rating. The content validity was 0.91, Cronbach's alpha coefficient was 0.912, and the intraclass correlation coefficient for test-retest reliability was 0.953. The construct validity was good (χ²/df=2.440, RMSEA=0.068, NFI=0.907, CFI=0.942, IFI=0.430, p<0.001). The newly developed DRSES showed good reliability and validity. It could therefore be used as an assessment tool to evaluate self-efficacy in disaster response for Chinese undergraduate nursing students. Copyright © 2017. Published by Elsevier Ltd.
Ryland, Margaret E; Grisbrook, Tiffany L; Wood, Fiona M; Phillips, Michael; Edgar, Dale W
2016-01-01
Lower limb burns can significantly delay recovery of function. Measuring lower limb functional outcomes is challenging in the unique burn patient population and necessitates the use of reliable and valid tools. The aims of this study were to examine the test-retest reliability, sensitivity, and internal consistency of Sections 1 and 3 of the Lower Limb Functional Index-10 (LLFI-10) questionnaire for measuring functional ability in patients with lower limb burns over time. Twenty-nine adult patients who had sustained a lower limb burn injury in the previous 12 months completed the test-retest procedure of the study. In addition, the minimal detectable change (MDC) was calculated for Section 1 and 3 of the LLFI-10. Section 1 is focused on the activity limitations experienced by patients with a lower limb disorder whereas Section 3 involves patients indicating their current percentage of pre-injury duties. Section 1 of the LLFI-10 demonstrated excellent test-retest reliability (intra-class correlation coefficient (ICC) 0.98, 95% CI 0.96-0.99) whilst Section 3 demonstrated high test-retest reliability (ICC 0.88, 95% CI 0.79-0.94). MDC scores for Sections 1 and 3 were 1.27 points and 30.22%, respectively. Internal consistency was demonstrated with a significant negative association (r_s = -0.83) between Sections 1 and 3 of the LLFI-10 (p < 0.001). This study demonstrates that Section 1 and 3 of the LLFI-10 are reliable for measuring functional ability in patients who have sustained lower limb burns in the previous 12 months, and furthermore, Section 1 is sensitive to changes in patient function over time.
Beemster, Timo T; van Velzen, Judith M; van Bennekom, Coen A M; Reneman, Michiel F; Frings-Dresen, Monique H W
2018-03-16
The purpose of this study was to assess test-retest reliability, agreement, and responsiveness of questionnaires on productivity loss (iPCQ-VR) and healthcare utilization (TiCP-VR) for sick-listed workers with chronic musculoskeletal pain who were referred to vocational rehabilitation. Methods: Test-retest reliability and agreement were assessed with a 2-week interval. Responsiveness was assessed at discharge after a 15-week vocational rehabilitation (VR) program. Data were obtained from six Dutch VR centers. Test-retest reliability was determined with the intraclass correlation coefficient (ICC) and Cohen's kappa. Agreement was determined by the Standard Error of Measurement (SEM), smallest detectable changes (on group and individual level), and percentage observed, positive and negative agreement. Responsiveness was determined with the area under the curve (AUC) obtained from the receiver operating characteristic (ROC) curve. Results: A sample of 52 participants on test-retest reliability and agreement, and a sample of 223 on responsiveness were included in the analysis. Productivity loss (iPCQ-VR): ICCs ranged from 0.52 to 0.90, kappa ranged from 0.42 to 0.96, and AUC ranged from 0.55 to 0.86. Healthcare utilization (TiCP-VR): ICC was 0.81, and kappa values of the single healthcare utilization items ranged from 0.11 to 1.00. Conclusions: The iPCQ-VR showed good measurement properties on working status, number of hours working per week and long-term sick leave, and low measurement properties on short-term sick leave and presenteeism. The TiCP-VR showed adequate reliability on all healthcare utilization items together and medication use, but showed low measurement properties on the single healthcare utilization items.
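For the categorical items, the abstract reports Cohen's kappa. The sketch below illustrates the usual definition, kappa = (p_observed − p_expected) / (1 − p_expected), applied to test and retest answers on a single yes/no item. The answers are hypothetical and the function name is illustrative only.

```python
from collections import Counter

def cohens_kappa(test, retest):
    """Chance-corrected agreement between two administrations of a categorical item."""
    n = len(test)
    # Proportion of participants giving the same answer on both occasions.
    p_obs = sum(a == b for a, b in zip(test, retest)) / n
    # Agreement expected by chance from the marginal answer frequencies.
    c1, c2 = Counter(test), Counter(retest)
    p_exp = sum(c1[c] * c2[c] for c in set(c1) | set(c2)) / n ** 2
    return (p_obs - p_exp) / (1.0 - p_exp)

# Hypothetical answers (1 = used the healthcare service, 0 = did not) for 10 participants.
test = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
retest = [1, 1, 0, 0, 0, 0, 1, 0, 1, 1]
print(f"kappa = {cohens_kappa(test, retest):.2f}")
```

Here 8 of 10 answers agree (p_observed = 0.8) while chance agreement is 0.5, giving kappa = 0.6. Kappa is undefined when expected agreement equals 1, for example when every participant gives the same answer on both occasions.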
Low-Frequency Fluctuations of the Resting Brain: High Magnitude Does Not Equal High Reliability
Jia, Wenbin; Liao, Wei; Li, Xun; Huang, Huiyuan; Yuan, Jianhua; Zang, Yu-Feng; Zhang, Han
2015-01-01
The amplitude of low-frequency fluctuation (ALFF) measures low-frequency oscillations of the blood-oxygen-level-dependent signal, characterizing local spontaneous activity during the resting state. ALFF is a commonly used measure for resting-state functional magnetic resonance imaging (rs-fMRI) in numerous basic and clinical neuroscience studies. Using a test-retest rs-fMRI dataset consisting of 21 healthy subjects and three repetitive scans, we found that several key brain regions with high ALFF intensities (or magnitude) had poor reliability. Such regions included the posterior cingulate cortex, the medial prefrontal cortex in the default mode network, parts of the right and left thalami, and the primary visual and motor cortices. The above finding was robust with regard to different sample sizes (number of subjects), different scanning parameters (repetition time) and variations of test-retest intervals (i.e., intra-scan, intra-session, and inter-session reliability), as well as with different scanners. Moreover, the qualitative, map-wise results were validated further with a region-of-interest-based quantitative analysis using “canonical” coordinates as reported previously. Therefore, we suggest that the reliability assessments be incorporated in future ALFF studies, especially for the brain regions with a large ALFF magnitude as listed in our paper. Splitting single data into several segments and assessing within-scan “test-retest” reliability is an acceptable alternative if no “real” test-retest datasets are available. Such evaluations might become more necessary if the data are collected with clinical scanners whose performance is not as good as those that are used for scientific research purposes and are better maintained because the lower signal-to-noise ratio may further dampen ALFF reliability. PMID:26053265
Reliability of Computerized Neurocognitive Tests for Concussion Assessment: A Meta-Analysis.
Farnsworth, James L; Dargo, Lucas; Ragan, Brian G; Kang, Minsoo
2017-09-01
Although widely used, computerized neurocognitive tests (CNTs) have been criticized because of low reliability and poor sensitivity. A systematic review was published summarizing the reliability of Immediate Post-Concussion Assessment and Cognitive Testing (ImPACT) scores; however, this was limited to a single CNT. Expansion of the previous review to include additional CNTs and a meta-analysis is needed. Therefore, our purpose was to analyze reliability data for CNTs using meta-analysis and examine moderating factors that may influence reliability. A systematic literature search (key terms: reliability, computerized neurocognitive test, concussion) of electronic databases (MEDLINE, PubMed, Google Scholar, and SPORTDiscus) was conducted to identify relevant studies. Studies were included if they met all of the following criteria: used a test-retest design, involved at least 1 CNT, provided sufficient statistical data to allow for effect-size calculation, and were published in English. Two independent reviewers investigated each article to assess inclusion criteria. Eighteen studies involving 2674 participants were retained. Intraclass correlation coefficients were extracted to calculate effect sizes and determine overall reliability. The Fisher Z transformation adjusted for sampling error associated with averaging correlations. Moderator analyses were conducted to evaluate the effects of the length of the test-retest interval, intraclass correlation coefficient model selection, participant demographics, and study design on reliability. Heterogeneity was evaluated using the Cochran Q statistic. The proportion of acceptable outcomes was greatest for the Axon Sports CogState Test (75%) and lowest for the ImPACT (25%). Moderator analyses indicated that the type of intraclass correlation coefficient model used significantly influenced effect-size estimates, accounting for 17% of the variation in reliability. 
The Axon Sports CogState Test, which has a higher proportion of acceptable outcomes and shorter test duration relative to other CNTs, may be a reliable option; however, future studies are needed to compare the diagnostic accuracy of these instruments.
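The meta-analysis above averages reliability coefficients after a Fisher Z transformation. The following is a minimal sketch of that general technique: transform each coefficient with z = atanh(r), take a weighted mean, and back-transform with tanh. The n − 3 weights are the convention for Pearson correlations and are an assumption here; the abstract does not say how the authors weighted their intraclass correlation coefficients.

```python
import math

def pooled_correlation(coefficients, sample_sizes):
    """Weighted mean of correlations computed on the Fisher Z scale."""
    weights = [n - 3 for n in sample_sizes]  # assumed inverse-variance weights
    z_values = [math.atanh(r) for r in coefficients]
    z_bar = sum(w * z for w, z in zip(weights, z_values)) / sum(weights)
    return math.tanh(z_bar)

# Hypothetical reliability coefficients from three studies.
rs = [0.60, 0.80, 0.70]
ns = [40, 40, 120]
print(f"pooled r = {pooled_correlation(rs, ns):.3f}")
```

Averaging on the Z scale rather than on the raw scale avoids the bias that comes from the skewed sampling distribution of correlations near 1.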
Bolster, Eline A M; Dallmeijer, Annet J; de Wolf, G Sander; Versteegt, Marieke; Schie, Petra E M van
2017-05-01
To determine the test-retest reliability and construct validity of a novel 6-Minute Racerunner Test (6MRT) in children and youth with cerebral palsy (CP) classified as Gross Motor Function Classification System (GMFCS) levels III and IV. The racerunner is a step-propelled tricycle. The participants were 38 children and youth with CP (mean age 11 y 2 m, SD 3 y 7 m; GMFCS III, n = 19; IV, n = 19). Racerunner capability was determined as the distance covered during the 6MRT on three occasions. The intraclass correlation coefficient (ICC), standard error of measurement (SEM), and smallest detectable differences (SDD) were calculated to assess test-retest reliability. The ICC for tests 2 and 3 were 0.89 (SDD 37%; 147 m) for children in level III and 0.91 for children in level IV (SDD 52%; 118 m). When the average of two separate test occasions was used, the SDDs were reduced to 26% (104 m; level III) and 37% (118 m; level IV). For tests 1 to 3, the mean distance covered increased from 345 m (SD 148 m) to 413 m (SD 137 m) for children in level III, and from 193 m (SD 100 m) to 239 m (SD 148 m) for children in level IV. Results suggest high test-retest reliability. However, large SDDs indicate that a single 6MRT measurement is only useful for individual evaluation when large improvements are expected, or when taking the average of two tests. The 6MRT discriminated the distance covered between children and youth in levels III and IV, supporting construct validity.
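The reduction in SDD when two test occasions are averaged can be checked with simple arithmetic. Assuming independent measurement errors, the error of a mean of k tests shrinks by a factor of sqrt(k), so the SDD of the average is the single-test SDD divided by sqrt(k). This is a sketch of that relation and not necessarily the authors' exact calculation.

```python
import math

def sdd_of_mean(sdd_single, k):
    """SDD for the mean of k tests, assuming independent measurement errors."""
    return sdd_single / math.sqrt(k)

# Single-test SDDs as reported in the abstract.
print(round(sdd_of_mean(37, 2)))   # level III, percent
print(round(sdd_of_mean(52, 2)))   # level IV, percent
print(round(sdd_of_mean(147, 2)))  # level III, metres
```

The three values round to 26%, 37%, and 104 m, matching the figures the abstract gives for the average of two test occasions.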
Assessment Instrument for Problem-focused Coping. Reliability test of APC. Part 1.
Tollén, A; Ahlström, G
1998-01-01
A new self-report instrument, the Assessment Instrument of Problem-focused Coping (APC) developed from qualitative interviews, is described. This instrument provides knowledge of the patients' own competence in coping with activities of daily living (ADL), the patients' own assessment of what they experience as problems, and the extent to which they are satisfied with their ADL. The purpose of the study was to test the reliability of the instrument with regard to intra-rater reliability and internal consistency. The study group comprised 40 patients with muscular weakness and other symptoms relating to the postpolio syndrome. The result showed an acceptable internal consistency (alpha 0.70), which confirms the construct validity of the instrument. The test-retest showed that the stability over a period of time varied from low to high for a total of 28 items. At the same time, it is evident that the instrument does not achieve the aim of being a good evaluation instrument, because the stability over a period of time was unsatisfactory. The test-retest should be repeated with a larger test group in future research projects.
Bryant, Elizabeth; Murtagh, Shemane; Finucane, Laura; McCrum, Carol; Mercer, Christopher; Smith, Toby; Canby, Guy; Rowe, David A; Moore, Ann P
2018-05-11
In response to the need for a freely available, stand-alone, validated outcome measure for use within musculoskeletal (MSK) physiotherapy practice, sensitive enough to measure clinical effectiveness, we developed an MSK patient-reported outcome measure. This study examined the validity and reliability of the newly developed Brighton musculoskeletal Patient-Reported Outcome Measure (BmPROM) within physiotherapy outpatient settings. Two hundred twenty-four patients attending physiotherapy outpatient departments in South East England with an MSK condition participated in this study. The BmPROM was assessed for user friendliness (rated feedback, N = 224), reliability (internal consistency and test-retest reliability, n = 42), validity (internal and external construct validity, N = 224), and responsiveness (internal, n = 25). Exploratory factor analysis indicated that a two-factor model provides a good fit to the data. Factors were representative of "Functionality" and "Wellbeing". Correlations observed between the BmPROM and SF-36 domains provided evidence of convergent validity. Reliability results indicated that both subscales were internally consistent with alphas above the acceptable limits for both "Functionality" (α = .85, 95% CI [.81, .88]) and "Wellbeing" (α = .80, 95% CI [.75, .84]). Test-retest analyses (n = 42) demonstrated a high degree of reliability for both "Functionality" (ICC = .84; 95% CI [.72, .91]) and "Wellbeing" scores (ICC = .84; 95% CI [.72, .91]). Further examination of test-retest reliability through the Bland-Altman analysis demonstrated that the difference between test and retest scores for "Functionality" and "Wellbeing" did not vary as a function of absolute test score. Large treatment effect sizes were found for both subscales (Functionality d = 1.10; Wellbeing d = 1.03). The BmPROM is a reliable and valid outcome measure for use in evaluating physiotherapy treatment of MSK conditions. Copyright © 2018 John Wiley & Sons, Ltd.
Crockford, Christopher; Newton, Judith; Lonergan, Katie; Madden, Caoifa; Mays, Iain; O'Sullivan, Meabhdh; Costello, Emmet; Pinto-Grau, Marta; Vajda, Alice; Heverin, Mark; Pender, Niall; Al-Chalabi, Ammar; Hardiman, Orla; Abrahams, Sharon
2018-02-01
Cognitive impairment affects approximately 50% of people with amyotrophic lateral sclerosis (ALS). Research has indicated that impairment may worsen with disease progression. The Edinburgh Cognitive and Behavioural ALS Screen (ECAS) was designed to measure neuropsychological functioning in ALS, with its alternate forms (ECAS-A, B, and C) allowing for serial assessment over time. The aim of the present study was to establish reliable change scores for the alternate forms of the ECAS, and to explore practice effects and test-retest reliability of the ECAS's alternate forms. Eighty healthy participants were recruited, with 57 completing two and 51 completing three assessments. Participants were administered alternate versions of the ECAS serially (A-B-C) at four-month intervals. Intra-class correlation analysis was employed to explore test-retest reliability, while analysis of variance was used to examine the presence of practice effects. Reliable change indices (RCI) and regression-based methods were utilized to establish change scores for the ECAS alternate forms. Test-retest reliability was excellent for ALS Specific, ALS Non-Specific, and ECAS Total scores of the combined ECAS A, B, and C (all > .90). No significant practice effects were observed over the three testing sessions. RCI and regression-based methods produced similar change scores. The alternate forms of the ECAS possess excellent test-retest reliability in a healthy control sample, with no significant practice effects. The use of conservative RCI scores is recommended. Therefore, a change of ≥8, ≥4, and ≥9 for ALS Specific, ALS Non-Specific, and ECAS Total score is required for reliable change.
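The reliable change index mentioned above can be illustrated with the classic Jacobson-Truax form, RCI = (retest − baseline) / (sqrt(2) × SEM) with SEM = SD × sqrt(1 − r). This is a sketch of that general formula only; the abstract does not specify the exact RCI variant or the regression-based models used, and the SD and reliability below are hypothetical rather than ECAS values.

```python
import math

def reliable_change_index(baseline, retest, sd, reliability):
    """Jacobson-Truax reliable change index (no practice-effect correction)."""
    sem = sd * math.sqrt(1.0 - reliability)
    s_diff = math.sqrt(2.0) * sem  # standard error of the difference score
    return (retest - baseline) / s_diff

def is_reliable_change(rci, critical=1.96):
    """True when the change exceeds what measurement error alone would explain."""
    return abs(rci) > critical

# Hypothetical scale: SD = 10 points, test-retest reliability = 0.91.
for change in (-8, -9):
    rci = reliable_change_index(100, 100 + change, sd=10.0, reliability=0.91)
    print(change, round(rci, 2), is_reliable_change(rci))
```

With these hypothetical inputs a drop of 9 points crosses the 1.96 threshold while a drop of 8 points does not, which shows how a cut-off score for reliable change follows from the SD and the reliability coefficient.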
Test-Retest Reliability of the Preschool Age Psychiatric Assessment (PAPA)
ERIC Educational Resources Information Center
Egger, Helen Link; Erkanli, Alaattin; Keeler, Gordon; Potts, Edward; Walter, Barbara Keith; Angold, Adrian
2006-01-01
Objective: To examine the test-retest reliability of a new interviewer-based psychiatric diagnostic measure (the Preschool Age Psychiatric Assessment) for use with parents of preschoolers 2 to 5 years old. Method: A total of 1,073 parents of children attending a large pediatric clinic completed the Child Behavior Checklist 1 1/2-5. For 18 months,…
ERIC Educational Resources Information Center
Huang, Francis L.; Cornell, Dewey G.
2016-01-01
Although school climate has long been recognized as an important factor in the school improvement process, there are few psychometrically supported measures based on teacher perspectives. The current study replicated and extended the factor structure, concurrent validity, and test-retest reliability of the teacher version of the Authoritative…
One-Year Test-Retest Reliability of the Inventory of Statements about Self-Injury (ISAS)
ERIC Educational Resources Information Center
Glenn, Catherine R.; Klonsky, E. David
2011-01-01
Nonsuicidal self-injury (NSSI) is a growing public health problem among adolescents and young adults. The Inventory of Statements About Self-Injury (ISAS) is a self-report measure designed to assess NSSI behaviors and functions. The current study examines the one-year test-retest reliability of the ISAS in a sample of young adult self-injurers.…
Bonekamp, S; Ghosh, P; Crawford, S; Solga, S F; Horska, A; Brancati, F L; Diehl, A M; Smith, S; Clark, J M
2008-01-01
To examine five available software packages for the assessment of abdominal adipose tissue with magnetic resonance imaging, compare their features and assess the reliability of measurement results. Feature evaluation and test-retest reliability of softwares (NIHImage, SliceOmatic, Analyze, HippoFat and EasyVision) used in manual, semi-automated or automated segmentation of abdominal adipose tissue. A random sample of 15 obese adults with type 2 diabetes. Axial T1-weighted spin echo images centered at vertebral bodies of L2-L3 were acquired at 1.5 T. Five software packages were evaluated (NIHImage, SliceOmatic, Analyze, HippoFat and EasyVision), comparing manual, semi-automated and automated segmentation approaches. Images were segmented into cross-sectional area (CSA), and the areas of visceral (VAT) and subcutaneous adipose tissue (SAT). Ease of learning and use and the design of the graphical user interface (GUI) were rated. Intra-observer accuracy and agreement between the software packages were calculated using intra-class correlation. Intra-class correlation coefficient was used to obtain test-retest reliability. Three of the five evaluated programs offered a semi-automated technique to segment the images based on histogram values or a user-defined threshold. One software package allowed manual delineation only. One fully automated program demonstrated the drawbacks of uncritical automated processing. The semi-automated approaches reduced variability and measurement error, and improved reproducibility. There was no significant difference in the intra-observer agreement in SAT and CSA. The VAT measurements showed significantly lower test-retest reliability. There were some differences between the software packages in qualitative aspects, such as user friendliness. Four out of five packages provided essentially the same results with respect to the inter- and intra-rater reproducibility. 
Our results using SliceOmatic, Analyze or NIHImage were comparable and could be used interchangeably. Newly developed fully automated approaches should be compared to one of the examined software packages.
Rushton, Paula W; Smith, Emma M; Miller, William C; Kirby, R Lee; Daoust, Geneviève
2018-01-31
The aim of this study was to evaluate the internal consistency, test-retest reliability and responsiveness of the Self-Efficacy in Assessing, Training and Spotting manual wheelchair skills (SEATS-M) and Self-Efficacy in Assessing, Training and Spotting power wheelchair skills (SEATS-P). A 2-week test-retest design was used with a convenience sample of occupational and physical therapists who worked at a provincial rehabilitation centre (inpatient and outpatient services). Sixteen participants completed the SEATS-M and 18 participants completed the SEATS-P. For the SEATS-M assessment, training, spotting and documentation sections, Cronbach's alpha coefficients ranged from 0.90 to 0.97, the 2-week intraclass correlation coefficients (ICC 1,1 ) ranged from 0.81 to 0.95, the standard error of measurements (SEM) ranged from 5.06 to 8.70 and the smallest real differences (SRD) ranged from 6.24 to 8.18. For the SEATS-P assessment, training, spotting and documentation sections, Cronbach's alpha coefficients ranged from 0.83 to 0.92, the ICCs ranged from 0.72 to 0.86, the SEMs ranged from 4.54 to 8.91 and the SRDs ranged from 5.90 to 8.27. There is preliminary evidence that both the SEATS-M and the SEATS-P have high internal consistency, good test-retest reliability and support for responsiveness. These tools can be used in evaluating clinician self-efficacy with assessing, training, spotting and documenting wheelchair skills included on the Wheelchair Skills Test. Implications for Rehabilitation There is preliminary evidence that the SEATS-M and SEATS-P are reliable and responsive outcome measures that can be used to evaluate the self-efficacy of clinicians to administer the Wheelchair Skills Program. Measurement of clinicians' self-efficacy in this area of practice may enable an enhanced understanding of the areas in which clinicians lack self-efficacy, thereby informing the development of improved knowledge translation interventions.
ERIC Educational Resources Information Center
Blagov, Pavel S.; Bi, Wu; Shedler, Jonathan; Westen, Drew
2012-01-01
The Shedler-Westen Assessment Procedure (SWAP) is a personality assessment instrument designed for use by expert clinical assessors. Critics have raised questions about its psychometrics, most notably its validity across observers and situations, the impact of its fixed score distribution on research findings, and its test-retest reliability. We…
ERIC Educational Resources Information Center
Karapolat, Hale; Eyigor, Sibel; Kirazli, Yesim; Celebisoy, Nese; Bilgen, Cem; Kirazli, Tayfun
2010-01-01
The aim of this study is to evaluate the internal consistency, test-retest reliability, construct validity, and sensitivity to change of the Activities-specific Balance Confidence Scale (ABC) in people with peripheral vestibular disorder. Thirty-three patients with unilateral peripheral vestibular disease were included in the study. Patients were…
Brogårdh, Christina; Lexell, Jan
2016-05-01
A new 13-item rating scale, the Self-Reported Impairments in Persons with Late Effects of Polio (SIPP), has been developed. The SIPP has been analyzed using the Rasch method and has shown good construct validity and internal consistency. To establish its clinical utility, further evaluation of its psychometric properties is needed. To evaluate the test-retest reliability of the SIPP and to define limits for the smallest change that indicates a real change, both for a group of persons and a single individual. A postal survey. University Hospital. Fifty-one persons (31 men and 20 women; mean age, 72 years) with clinically verified late effects of polio. Not applicable. The participants completed the SIPP twice, 2 weeks apart. The response frequencies at test occasion 1 (T1) and test occasion 2 (T2) were calculated. Test-retest reliability was analyzed using the percentage agreement of each item, the intraclass correlation coefficient, and the mean difference between the test occasions (d̄), together with the 95% confidence intervals for d̄, the standard error of measurement, the smallest real difference, and a Bland-Altman plot. The percentage agreement (ie, the same scoring at both test occasions) was >70% for 10 of 13 items. The mean score (standard deviation) was 27.9 (5.7) points at T1 and 28.2 (6.0) points at T2, with no systematic difference between the test occasions. The intraclass correlation coefficient was 0.88, the standard error of measurement (the smallest change for a group of persons) was 2.0 points, and the smallest real difference (the smallest change for a single individual) was 5.6 points. The SIPP is a reliable rating scale in persons with late effects of polio and can be used to evaluate effects of rehabilitation interventions and changes of perceived impairments over time both for a group of persons and for a single individual. Copyright © 2016 American Academy of Physical Medicine and Rehabilitation. Published by Elsevier Inc. 
All rights reserved.
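The mean difference between test occasions and the Bland-Altman plot mentioned above rest on a short calculation: the bias is the mean of the paired differences, and the 95% limits of agreement are the bias plus or minus 1.96 times the standard deviation of those differences. The sketch below uses hypothetical scores, not SIPP data.

```python
from statistics import mean, stdev

def bland_altman(t1, t2):
    """Return (bias, lower limit, upper limit) for paired test-retest scores."""
    diffs = [b - a for a, b in zip(t1, t2)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)  # half-width of the 95% limits of agreement
    return bias, bias - spread, bias + spread

# Hypothetical total scores for five participants at T1 and T2.
t1 = [28, 30, 25, 27, 32]
t2 = [29, 29, 26, 27, 33]
bias, lower, upper = bland_altman(t1, t2)
print(f"bias = {bias:.2f}, limits of agreement = [{lower:.2f}, {upper:.2f}]")
```

A bias close to zero with limits of agreement that straddle zero is what "no systematic difference between the test occasions" looks like numerically. The plot itself places each pair's difference against the pair's mean.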
Validity and reliability of a new tool to evaluate handwriting difficulties in Parkinson's disease.
Nackaerts, Evelien; Heremans, Elke; Smits-Engelsman, Bouwien C M; Broeder, Sanne; Vandenberghe, Wim; Bergmans, Bruno; Nieuwboer, Alice
2017-01-01
Handwriting in Parkinson's disease (PD) features specific abnormalities which are difficult to assess in clinical practice since no specific tool for evaluation of spontaneous movement is currently available. This study aims to validate the 'Systematic Screening of Handwriting Difficulties' (SOS-test) in patients with PD. Handwriting performance of 87 patients and 26 healthy age-matched controls was examined using the SOS-test. Sixty-seven patients were tested a second time within a period of one month. Participants were asked to copy as much as possible of a text within 5 minutes with the instruction to write as neatly and quickly as in daily life. Writing speed (letters in 5 minutes), size (mm) and quality of handwriting were compared. Correlation analysis was performed between SOS outcomes and other fine motor skill measurements and disease characteristics. Intrarater, interrater and test-retest reliability were assessed using the intraclass correlation coefficient (ICC) and Spearman correlation coefficient. Patients with PD had a smaller (p = 0.043) and slower (p<0.001) handwriting and showed worse writing quality (p = 0.031) compared to controls. The outcomes of the SOS-test significantly correlated with fine motor skill performance and disease duration and severity. Furthermore, the test showed excellent intrarater, interrater and test-retest reliability (ICC > 0.769 for both groups). The SOS-test is a short and effective tool to detect handwriting problems in PD with excellent reliability. It can therefore be recommended as a clinical instrument for standardized screening of handwriting deficits in PD.
Richards, Rickelle; Brown, Lora Beth; Williams, D Pauline; Eggett, Dennis L
2017-02-01
Develop a questionnaire to measure students' knowledge, attitude, behavior, self-efficacy, and environmental factors related to the use of canned foods. The Knowledge-Attitude-Behavior Model, Social Cognitive Theory, and Canned Foods Alliance survey were used as frameworks for questionnaire development. Cognitive interviews were conducted with college students (n = 8). Nutrition and survey experts assessed content validity. Reliability was measured via Cronbach α and 2 rounds (1, n = 81; 2, n = 65) of test-retest statistics. Means and frequencies were used. The 65-item questionnaire had a test-retest reliability of .69. Cronbach α scores were .87 for knowledge (9 items), .86 for attitude (30 items), .80 for self-efficacy (12 items), .68 for canned foods use (8 items), and .30 for environment (6 items). A reliable questionnaire was developed to measure perceptions and use of canned foods. Nutrition educators may find this questionnaire useful to evaluate pretest-posttest changes from canned foods-based interventions among college students. Copyright © 2016 Society for Nutrition Education and Behavior. Published by Elsevier Inc. All rights reserved.
Martinez, Esteve; Castro, Josefina; Bigorra, Aitana; Morer, Astrid; Calvo, Rosa; Vila, Montserrat; Toro, Josep; Rieger, Elisabeth
2007-01-01
To assess motivation to change in adolescent patients with bulimia nervosa through the Bulimia Nervosa Stages of Change Questionnaire (BNSOCQ), an instrument adapted from the Anorexia Nervosa Stages of Change Questionnaire (ANSOCQ) already validated in anorexic patients. Subjects were 30 bulimia nervosa patients (mean age = 16.3 years) who were receiving treatment at an eating disorders unit. The evaluation instruments were: the BNSOCQ, the Eating Disorders Inventory (EDI-2) and the Beck Depression Inventory (BDI). The BNSOCQ was re-administered 1 week later to evaluate test-retest reliability. The BNSOCQ demonstrated good internal consistency (Cronbach's alpha = 0.94) and one week test-retest reliability (Pearson's r = 0.93). Negative significant correlations were found between the BNSOCQ and several EDI-2 scales (Pearson's r between -0.51 and -0.84) and the BDI (r = -0.74). The study provides initial support for the reliability and validity of the BNSOCQ as a self-report instrument for assessing motivation to change in adolescents with bulimia nervosa. 2006 John Wiley & Sons, Ltd and Eating Disorders Association
Psychometric evaluation of the Dutch version of the Subjective Opiate Withdrawal Scale (SOWS).
Dijkstra, Boukje A G; Krabbe, Paul F M; Riezebos, Truus G M; van der Staak, Cees P F; De Jong, Cor A J
2007-01-01
To evaluate the psychometric properties of the Dutch version of the 16-item Subjective Opiate Withdrawal Scale (SOWS). The SOWS measures withdrawal symptoms at the time of assessment. The Dutch SOWS was repeatedly administered to a sample of 272 opioid-dependent inpatients of four addiction treatment centers during rapid detoxification with or without general anesthesia. Examination of the psychometric properties of the SOWS included exploratory factor analysis, internal consistency, test-retest reliability, and criterion validity. Exploratory factor analysis of the SOWS revealed a general pattern of four factors with three items not always clustered in the same factors at different points of measurement. After excluding these items from factor analysis four factors were identified during detoxification (temperature dysregulation, tractus locomotorius, tractus gastro-intestinalis and facial disinhibition). The 13-item SOWS shows high internal consistency and test-retest reliability and good validity at different stages of withdrawal. The 13-item SOWS is a reliable and valid instrument to assess opioid withdrawal during rapid detoxification. Three items were deleted because their content does not correspond directly with opioid withdrawal symptoms. Copyright (c) 2007 S. Karger AG, Basel.
Kaufman, Denise R; Puckett, Mallory J; Smith, Mitchell J; Wilson, Kyle S; Cheema, Rebecca; Landers, Merrill R
2014-08-01
The purpose of this study was to establish reliability and responsiveness of the dynamic visual acuity test (DVAT) at head speeds of 150-200 degrees per second (deg/s) and the gaze stabilization test (GST) in high school and college football players. Reliability design. Fifty high school and college football athletes completed the DVAT and GST in both the yaw (horizontal) and pitch (vertical) planes twice within two weeks. Test-retest reliability for the DVAT was good in yaw, Intraclass Correlation Coefficient (ICC) = 0.770, and moderate/good in pitch, ICC = 0.725. Minimal detectable change (MDC) was 0.16 logMAR for yaw and 0.21 logMAR for pitch. GST reliability was moderate in yaw, ICC = 0.634, and poor in pitch, ICC = 0.411. MDCs were 73.4 deg/s (yaw) and 81.2 deg/s (pitch). The DVAT is reliable at high head speeds in high school and college football athletes in both yaw and pitch. GST head speeds were higher than previously reported in the literature, but reliability of this tool for this population was poor to moderate. From a clinical perspective, DVAT may be reliably used in the assessment of high school and college football athletes; however, GST requires further evaluation. Copyright © 2013 Elsevier Ltd. All rights reserved.
Reliability of a new test battery for fitness assessment of the European Astronaut corps.
Petersen, Nora; Thieschäfer, Lutz; Ploutz-Snyder, Lori; Damann, Volker; Mester, Joachim
2015-01-01
To optimise health for space missions, European astronauts follow specific conditioning programs before, during and after their flights. To evaluate the effectiveness of these programs, the European Space Agency conducts an Astronaut Fitness Assessment (AFA), but the test-retest reliability of elements within it remains unexamined. The reliability study described here presents a scientific basis for implementing the AFA, but also highlights challenges faced by operational teams supporting humans in such unique environments, especially with respect to health and fitness monitoring of crew members travelling not only into space, but also across the world. The AFA tests assessed parameters known to be affected by prolonged exposure to microgravity: aerobic capacity (VO2max), muscular strength (one repetition max, 1 RM) and power (vertical jumps), core stability, flexibility and balance. Intraclass correlation coefficients (ICC3.1), standard error of measurement and coefficient of variation were used to assess relative and absolute test-retest reliability. Squat and bench 1 RM (ICC3.1 = 0.94-0.99), hip flexion (ICC3.1 = 0.99) and left and right handgrip strength (ICC3.1 = 0.95 and 0.97) showed the highest test-retest reliability, followed by VO2max (ICC3.1 = 0.91), core strength (ICC3.1 = 0.78-0.89), hip extension (ICC3.1 = 0.63), the countermovement (ICC3.1 = 0.76) and squat (ICC3.1 = 0.63) jumps, and single right- and left-leg jump height (ICC3.1 = 0.51 and 0.14). For balance, relative reliability ranged from ICC3.1 = 0.78 for path length (two legs, head tilted back, eyes open) to ICC3.1 = 0.04 for average rotation velocity (one leg, eyes closed). In a small sample (n = 8) of young, healthy individuals, the AFA battery of tests demonstrated acceptable test-retest reliability for most parameters except some balance and single-leg jump tasks. 
These findings suggest that, for the application with astronauts, most AFA tests appear appropriate to be maintained in the test battery, but that some elements may be unreliable, and require either modification (duration, selection of task) or removal (single-leg jump, balance test on sphere) from the battery. The test battery is mobile and universally applicable for occupational and general fitness assessment by its comprehensive composition of tests covering many systems involved in whole body movement.
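The statistics named in the abstract above can be illustrated with a minimal sketch. It assumes a complete subjects-by-sessions score matrix, takes ICC3.1 to mean the two-way mixed, consistency, single-measurement form, and uses one common convention for the standard error of measurement (pooled SD times the square root of 1 minus ICC). The function names and the demo values are hypothetical; the study's own computations are not described in the abstract.

```python
import numpy as np


def icc_3_1(scores):
    """ICC(3,1): two-way mixed effects, consistency, single measurement."""
    x = np.asarray(scores, dtype=float)          # shape (n_subjects, k_sessions)
    n, k = x.shape
    grand = x.mean()
    ss_subjects = k * ((x.mean(axis=1) - grand) ** 2).sum()
    ss_sessions = n * ((x.mean(axis=0) - grand) ** 2).sum()
    ss_error = ((x - grand) ** 2).sum() - ss_subjects - ss_sessions
    bms = ss_subjects / (n - 1)                  # between-subjects mean square
    ems = ss_error / ((n - 1) * (k - 1))         # residual mean square
    return (bms - ems) / (bms + (k - 1) * ems)


def sem_from_icc(scores, icc):
    """SEM = SD * sqrt(1 - ICC), with SD taken over all pooled scores (an assumed convention)."""
    sd = np.asarray(scores, dtype=float).std(ddof=1)
    return sd * np.sqrt(1.0 - icc)


def within_subject_cv(scores):
    """Mean of each subject's SD/mean across sessions, in percent."""
    x = np.asarray(scores, dtype=float)
    return float(np.mean(x.std(axis=1, ddof=1) / x.mean(axis=1)) * 100.0)


# Hypothetical test and retest values for four participants (not study data).
demo = [[50.0, 52.0], [61.0, 60.0], [45.0, 47.0], [70.0, 71.0]]
icc = icc_3_1(demo)
print(round(icc, 3), round(sem_from_icc(demo, icc), 2), round(within_subject_cv(demo), 2))
```

Because the consistency form removes the session effect, a uniform shift between test and retest does not lower ICC(3,1); an absolute-agreement form would penalise it.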
Seguí, María del Mar; Cabrero-García, Julio; Crespo, Ana; Verdú, José; Ronda, Elena
2015-06-01
To design and validate a questionnaire to measure visual symptoms related to exposure to computers in the workplace. Our computer vision syndrome questionnaire (CVS-Q) was based on a literature review and validated through discussion with experts and performance of a pretest, pilot test, and retest. Content validity was evaluated by occupational health, optometry, and ophthalmology experts. Rasch analysis was used in the psychometric evaluation of the questionnaire. Criterion validity was determined by calculating the sensitivity and specificity, receiver operating characteristic curve, and cutoff point. Test-retest repeatability was tested using the intraclass correlation coefficient (ICC) and concordance by Cohen's kappa (κ). The CVS-Q was developed with wide consensus among experts and was well accepted by the target group. It assesses the frequency and intensity of 16 symptoms using a single rating scale (symptom severity) that fits the Rasch rating scale model well. The questionnaire has sensitivity and specificity over 70% and achieved good test-retest repeatability both for the scores obtained [ICC = 0.802; 95% confidence interval (CI): 0.673, 0.884] and CVS classification (κ = 0.612; 95% CI: 0.384, 0.839). The CVS-Q has acceptable psychometric properties, making it a valid and reliable tool to monitor the visual health of computer workers, and can potentially be used in clinical trials and outcome research. Copyright © 2015 Elsevier Inc. All rights reserved.
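The concordance statistic used above for the test and retest classification, Cohen's kappa, corrects the observed agreement for the agreement expected by chance. A minimal unweighted sketch follows; the function name and the example labels are hypothetical and are not the study's data.

```python
from collections import Counter


def cohens_kappa(first, second):
    """Unweighted Cohen's kappa for two equally long sequences of category labels."""
    n = len(first)
    # Observed proportion of identical classifications.
    p_observed = sum(a == b for a, b in zip(first, second)) / n
    # Agreement expected by chance from the two marginal distributions.
    c1, c2 = Counter(first), Counter(second)
    p_expected = sum(c1[c] * c2[c] for c in set(first) | set(second)) / n ** 2
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical test and retest classifications (1 = case, 0 = non-case).
test_labels = [1, 1, 0, 0]
retest_labels = [1, 0, 0, 0]
print(cohens_kappa(test_labels, retest_labels))
```

The function is undefined when chance agreement equals 1, for example when both administrations put every respondent in the same single category.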
Development and positioning reliability of a TMS coil holder for headache research.
Chronicle, Edward P; Pearson, A Jane; Matthews, Cheryl
2005-01-01
Accurate and reproducible coil positioning is important for headache research using transcranial magnetic stimulation protocols. We aimed to design a transcranial magnetic stimulation coil holder and demonstrate reliability of test-retest coil positioning. A coil holder was developed and manufactured according to three principles of stability, durability, and three-dimensional positional accuracy. Reliability of coil positioning was assessed by stimulating over the motor cortex of four neurologically normal subjects and recording finger muscle responses, both at a test phase and a retest phase several hours later. In all four subjects, repositioning of the transcranial magnetic stimulation coil solely on the basis of coil holder coordinates was accurate to within 2 mm. The coil holder demonstrated good test-retest reliability of coil positioning, and is thus a promising tool for transcranial magnetic stimulation-based headache research, particularly studies of prophylactic drug effect where several laboratory visits with identical coil positioning are necessary.
Tenke, Craig E.; Kayser, Jürgen; Pechtel, Pia; Webb, Christian A.; Dillon, Daniel G.; Goer, Franziska; Murray, Laura; Deldin, Patricia; Kurian, Benji T.; McGrath, Patrick J.; Parsey, Ramin; Trivedi, Madhukar; Fava, Maurizio; Weissman, Myrna M.; McInnis, Melvin; Abraham, Karen; Alvarenga, Jorge; Alschuler, Daniel M.; Cooper, Crystal; Pizzagalli, Diego A.; Bruder, Gerard E.
2016-01-01
Growing evidence suggests that loudness dependency of auditory evoked potentials (LDAEP) and resting EEG alpha and theta may be biological markers for predicting response to antidepressants. In spite of this promise, little is known about the joint reliability of these markers, and thus their clinical applicability. New, standardized procedures were developed to improve the compatibility of data acquired with different EEG platforms, and used to examine test-retest reliability for the three electrophysiological measures selected for a multisite project, Establishing Moderators and Biosignatures of Antidepressant Response for Clinical Care (EMBARC). Thirty-nine healthy controls across four clinical research sites were tested in two sessions separated by about one week. Resting EEG (eyes-open and eyes-closed conditions) was recorded and LDAEP measured using binaural tones (1000 Hz, 40 ms) at five intensities (60–100 dB SPL). Principal components analysis (PCA) of current source density (CSD) waveforms reduced volume conduction and provided reference-free measures of resting EEG alpha and N1 dipole activity to tones from auditory cortex. Low Resolution Electromagnetic Tomography (LORETA) extracted resting theta current density measures corresponding to rostral anterior cingulate (rACC), which has been implicated in treatment response. There were no significant differences in posterior alpha, N1 dipole or rACC theta across sessions. Test-retest reliability was .84 for alpha, .87 for N1 dipole, and .70 for theta rACC current density. The demonstration of good-to-excellent reliability for these measures provides a template for future EEG/ERP studies from multiple testing sites, and an important step for evaluating them as biomarkers for predicting treatment response. PMID:28000259
Tenke, Craig E; Kayser, Jürgen; Pechtel, Pia; Webb, Christian A; Dillon, Daniel G; Goer, Franziska; Murray, Laura; Deldin, Patricia; Kurian, Benji T; McGrath, Patrick J; Parsey, Ramin; Trivedi, Madhukar; Fava, Maurizio; Weissman, Myrna M; McInnis, Melvin; Abraham, Karen; E Alvarenga, Jorge; Alschuler, Daniel M; Cooper, Crystal; Pizzagalli, Diego A; Bruder, Gerard E
2017-01-01
Growing evidence suggests that loudness dependency of auditory evoked potentials (LDAEP) and resting EEG alpha and theta may be biological markers for predicting response to antidepressants. In spite of this promise, little is known about the joint reliability of these markers, and thus their clinical applicability. New standardized procedures were developed to improve the compatibility of data acquired with different EEG platforms, and used to examine test-retest reliability for the three electrophysiological measures selected for a multisite project-Establishing Moderators and Biosignatures of Antidepressant Response for Clinical Care (EMBARC). Thirty-nine healthy controls across four clinical research sites were tested in two sessions separated by about 1 week. Resting EEG (eyes-open and eyes-closed conditions) was recorded and LDAEP measured using binaural tones (1000 Hz, 40 ms) at five intensities (60-100 dB SPL). Principal components analysis of current source density waveforms reduced volume conduction and provided reference-free measures of resting EEG alpha and N1 dipole activity to tones from auditory cortex. Low-resolution electromagnetic tomography (LORETA) extracted resting theta current density measures corresponding to rostral anterior cingulate (rACC), which has been implicated in treatment response. There were no significant differences in posterior alpha, N1 dipole, or rACC theta across sessions. Test-retest reliability was .84 for alpha, .87 for N1 dipole, and .70 for theta rACC current density. The demonstration of good-to-excellent reliability for these measures provides a template for future EEG/ERP studies from multiple testing sites, and an important step for evaluating them as biomarkers for predicting treatment response. © 2016 Society for Psychophysiological Research.
de Vasconcellos, Ilmeire Ramos Rosembach; Griep, Rosane Härter; Portela, Luciana; Alves, Márcia Guimarães de Mello; Rotenberg, Lúcia
2016-01-01
OBJECTIVE To describe the steps in the transcultural adaptation to the Brazilian context of the Effort-reward imbalance scale for household and family work. METHODS We performed the translation, back-translation, and initial psychometric evaluation of the questionnaire, which comprised three dimensions: (i) effort (eight items, emphasizing quantitative workload), (ii) reward (11 items that seek to capture the intrinsic value of family and household work, societal esteem, recognition from the spouse/partner, and affection from the children), and (iii) overcommitment (four items related to intrinsic effort). The scale was included in a cross-sectional study conducted with 1,045 nursing workers. A subsample of 222 subjects answered the questionnaire a second time, seven to 15 days later. The data were collected between October 2012 and May 2013. Internal consistency was evaluated using Cronbach's alpha, and test-retest reliability using quadratic weighted kappa, prevalence- and bias-adjusted kappa, and the intraclass correlation coefficient. RESULTS Prevalence- and bias-adjusted kappa (ka) of the scale dimensions ranged from 0.80 to 0.83 for overcommitment, 0.78 to 0.90 for effort, and 0.76 to 0.93 for reward. In most dimensions, the minimum and maximum scores, average, standard deviation, and Cronbach's alpha were similar in test and retest. Only the societal esteem subdimension (reward) showed a slight difference in standard deviation (test score of 2.24 and retest score of 3.36) and in Cronbach's alpha coefficient (test score of 0.38 and retest score of 0.59). CONCLUSIONS The Brazilian version of the scale was found to have adequate reliability indices for temporal stability, which supports its use in populations with characteristics similar to those of this study. PMID:27355466
Papadakaki, Maria; Prokopiadou, Dimitra; Petridou, Eleni; Kogevinas, Manolis; Lionis, Christos
2012-06-01
The current article aims to translate the PREMIS (Physician Readiness to Manage Intimate Partner Violence) survey into the Greek language and test its validity and reliability in a sample of primary care physicians. The validation study was conducted in 2010 and involved all the general practitioners serving two adjacent prefectures of Greece (n = 80). Maximum-likelihood factor analysis (MLF) was used to extract key survey factors. The instrument was further assessed for the following psychometric properties: (a) scale reliability, (b) item-specific reliability, (c) test-retest reliability, (d) scale construct validity, and (e) internal predictive validity. The MLF analysis of 23 opinion items revealed a seven-factor solution (preparation, constraint, workplace issues, screening, self-efficacy, alcohol/drugs, victim understanding), which was statistically sound (p = .293). Most of the newly derived scales displayed satisfactory internal consistency (α ≥ .60), high item-specific reliability, strong construct, and internal predictive validity (F = 2.82; p = .004), and high repeatability when retested with 20 individuals (intraclass correlation coefficient [ICC] > .70). The tool was found appropriate to facilitate the identification of competence deficits and the evaluation of training initiatives.
Kahraman, Turhan; Özdoğar, Asiye Tuba; Honan, Cynthia Alison; Ertekin, Özge; Özakbaş, Serkan
2018-05-09
To linguistically and culturally adapt the Multiple Sclerosis Work Difficulties Questionnaire-23 (MSWDQ-23) for use in Turkey, and to examine its reliability and validity. Following standard forward-back translation of the MSWDQ-23, it was administered to 124 people with multiple sclerosis (MS). Validity was evaluated using related outcome measures including those related to employment status and expectations, disability level, fatigue, walking, and quality of life. Randomly selected participants were asked to complete the MSWDQ-23 again to assess test-retest reliability. Confirmatory factor analysis on the MSWDQ-23 demonstrated a good fit for the data, and the internal consistency of each subscale was excellent. The test-retest reliability for the total score and for the psychological/cognitive barriers, physical barriers, and external barriers subscales was high. The MSWDQ-23 and its subscales were positively correlated with the employment, disability level, walking, and fatigue outcome measures. This study suggests that the Turkish version of the MSWDQ-23 has high reliability and adequate validity, and that it can be used to determine the difficulties faced by people with multiple sclerosis in the workplace. Moreover, the study provides evidence about the test-retest reliability of the questionnaire. Implications for rehabilitation Multiple sclerosis affects young people of working age. Understanding work-related problems is crucial to improving the likelihood that people with multiple sclerosis keep their jobs. The Multiple Sclerosis Work Difficulties Questionnaire-23 (MSWDQ-23) is a valid and reliable measure of perceived workplace difficulties in people with multiple sclerosis; this study presents its validation in Turkish. Professionals working in the field of vocational rehabilitation may benefit from using the MSWDQ-23 to predict current work outcomes and future employment expectations.
Intraobserver reliability of contact pachymetry in children.
Weise, Katherine K; Kaminski, Brett; Melia, Michele; Repka, Michael X; Bradfield, Yasmin S; Davitt, Bradley V; Johnson, David A; Kraker, Raymond T; Manny, Ruth E; Matta, Noelle S; Schloff, Susan
2013-04-01
Central corneal thickness (CCT) is an important measurement in the treatment and management of pediatric glaucoma and potentially of refractive error, but data regarding reliability of CCT measurement in children are limited. The purpose of this study was to evaluate the reliability of CCT measurement with the use of handheld contact pachymetry in children. We conducted a multicenter intraobserver test-retest reliability study of more than 3,400 healthy eyes in children aged from newborn to 17 years by using a handheld contact pachymeter (Pachmate DGH55; DGH Technology Inc, Exton, PA) in 2 clinical settings--with the use of topical anesthesia in the office and with the patient under general anesthesia in a surgical facility. The overall standard error of measurement, including only measurements with standard deviation ≤5 μm, was 8 μm; the corresponding coefficient of repeatability, or limits within which 95% of test-retest differences fell, was ±22.3 μm. However, standard error of measurement increased as CCT increased, from 6.8 μm for CCT less than 525 μm, to 12.9 μm for CCT 625 μm and greater. The standard error of measurement including measurements with standard deviation >5 μm was 10.5 μm. Age, sex, race/ethnicity group, and examination setting did not influence the magnitude of test-retest differences. CCT measurement reliability in children via the Pachmate DGH55 handheld contact pachymeter is similar to that reported for adults. Because thicker CCT measurements are less reliable than thinner measurements, a second measure may be helpful when the first exceeds 575 μm. Reliability is also improved by disregarding measurements with instrument-reported standard deviations >5 μm. Copyright © 2013 American Association for Pediatric Ophthalmology and Strabismus. Published by Mosby, Inc. All rights reserved.
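The link between the standard error of measurement and the coefficient of repeatability quoted above can be checked with a short calculation. This sketch assumes the conventional definition, in which 95% of test-retest differences fall within 1.96 times the square root of 2 times the SEM; the abstract does not state the exact formula used, and the function name is hypothetical.

```python
import math


def coefficient_of_repeatability(sem, z=1.96):
    """Half-width of the interval expected to contain 95% of test-retest differences."""
    # The difference of two measurements has SD sqrt(2) * SEM when errors are independent.
    return z * math.sqrt(2.0) * sem


# SEM values as reported in the abstract, in micrometres.
for sem in (8.0, 6.8, 12.9):
    print(sem, round(coefficient_of_repeatability(sem), 1))
```

With the rounded SEM of 8 μm this gives roughly ±22.2 μm, close to the reported ±22.3 μm; the small gap is presumably due to rounding of the SEM, although the abstract does not say so.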
Novel Strength Test Battery to Permit Evidence-Based Paralympic Classification
Beckman, Emma M.; Newcombe, Peter; Vanlandewijck, Yves; Connick, Mark J.; Tweedy, Sean M.
2014-01-01
Ordinal-scale strength assessment methods currently used in Paralympic athletics classification prevent the development of evidence-based classification systems. This study evaluated a battery of 7 ratio-scale isometric tests with the aim of facilitating the development of evidence-based methods of classification. This study aimed to report sex-specific normal performance ranges, evaluate test–retest reliability, and evaluate the relationship between the measures and body mass. Body mass and strength measures were obtained from 118 participants (63 males and 55 females) aged 23.2 ± 3.7 years (mean ± SD). Seventeen participants completed the battery twice to evaluate test–retest reliability. The body mass–strength relationship was evaluated using Pearson correlations and allometric exponents. Conventional patterns of force production were observed. Reliability was acceptable (mean intraclass correlation = 0.85). Eight measures had moderate significant correlations with body size (r = 0.30–0.61). Allometric exponents were higher in males than in females (mean 0.99 vs 0.30). Results indicate that this comprehensive and parsimonious battery is an important methodological advance because it has psychometric properties critical for the development of evidence-based classification. Measures were interrelated with body size, indicating further research is required to determine whether raw measures require normalization in order to be validly applied in classification. PMID:25068950
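An allometric exponent of the kind reported above is usually estimated by fitting strength = a × mass^b as a straight line on log-log axes. The sketch below assumes that approach; the abstract does not describe the fitting procedure, and the function name, the coefficient 12.0, and the exponent 0.67 are hypothetical demonstration values.

```python
import numpy as np


def allometric_exponent(body_mass, strength):
    """Least-squares fit of log(strength) = log(a) + b * log(mass); returns (a, b)."""
    slope, intercept = np.polyfit(np.log(body_mass), np.log(strength), 1)
    return float(np.exp(intercept)), float(slope)


# Hypothetical noise-free data generated with a = 12.0 and b = 0.67 (not study data).
mass_kg = np.array([55.0, 62.0, 70.0, 81.0, 95.0])
force_n = 12.0 * mass_kg ** 0.67
a, b = allometric_exponent(mass_kg, force_n)
print(round(a, 3), round(b, 3))
```

Under this model, dividing each strength score by mass raised to the fitted exponent b gives a size-adjusted value, which is one way the normalization question raised in the abstract could be explored.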
Validation and cross cultural adaptation of the Italian version of the Harris Hip Score.
Dettoni, Federico; Pellegrino, Pietro; La Russa, Massimo R; Bonasia, Davide E; Blonna, Davide; Bruzzone, Matteo; Castoldi, Filippo; Rossi, Roberto
2015-01-01
The Harris Hip Score (HHS) is one of the most widely used health-related quality of life (HRQOL) measures for the assessment of hip pathology; despite this, neither a validation study nor an official Italian version has been available. The aim of this study was to create a valid and reliable Italian version of the HHS. The score was translated and adapted into Italian; then 103 patients with different hip pathologies were evaluated using this HHS version and also with the WOMAC and the SF-12 questionnaires. Content, construct and criterion validities were tested, as well as interobserver reliability, test-retest reliability and internal consistency. Cross-cultural adaptation was easy, and only minor adaptation was required in the translation process. Construct and criterion validity of the HHS Italian Version were confirmed by satisfactory values of Spearman's Rho for correlation between specific domains of HHS and WOMAC and SF-12 scores. Interobserver and test-retest reliabilities obtained values of 0.996 and 0.975 respectively; Cronbach's alpha for internal consistency was 0.816. Statistical and clinical analysis showed that HHS is highly valid and reliable in this new Italian version.
Hypertension Knowledge-Level Scale (HK-LS): a study on development, validity and reliability.
Erkoc, Sultan Baliz; Isikli, Burhanettin; Metintas, Selma; Kalyoncu, Cemalettin
2012-03-01
This study was conducted to develop a scale to measure knowledge about hypertension among Turkish adults. The Hypertension Knowledge-Level Scale (HK-LS) was generated based on content, face, and construct validity, internal consistency, test-retest reliability, and discriminative validity procedures. The final scale had 22 items with six sub-dimensions. The scale was applied to 457 individuals aged ≥ 18 years, and 414 of them were re-evaluated for test-retest reliability. The six sub-dimensions encompassed 60.3% of the total variance. Cronbach alpha coefficients were 0.82 for the entire scale and 0.92, 0.59, 0.67, 0.77, 0.72, and 0.76 for the sub-dimensions of definition, medical treatment, drug compliance, lifestyle, diet, and complications, respectively. The scale showed internal consistency and construct validity, as well as stability over time. Significant relationships were found between knowledge score and age, gender, educational level, and history of hypertension of the participants. No correlation was found between knowledge score and working at an income-generating job. The present scale, developed to measure the knowledge level of hypertension among Turkish adults, was found to be valid and reliable.
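The Cronbach alpha coefficients reported above summarise how consistently the items of a scale or sub-dimension vary together. A minimal sketch of the standard formula follows, assuming a complete respondents-by-items score matrix; the function name and the example responses are hypothetical and are not HK-LS data.

```python
import numpy as np


def cronbach_alpha(items):
    """Cronbach's alpha for a (n_respondents, k_items) matrix of item scores."""
    x = np.asarray(items, dtype=float)
    k = x.shape[1]
    item_variances = x.var(axis=0, ddof=1).sum()   # sum of the k item variances
    total_variance = x.sum(axis=1).var(ddof=1)     # variance of the summed score
    return k / (k - 1) * (1.0 - item_variances / total_variance)


# Hypothetical responses of three people to a two-item sub-dimension.
responses = [[1, 2], [2, 1], [3, 3]]
print(round(cronbach_alpha(responses), 3))
```

Alpha depends on the number of items as well as on their intercorrelation, which is one reason short sub-dimensions often show lower values than a full scale.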
Validation of a clinical assessment of spectral-ripple resolution for cochlear implant users.
Drennan, Ward R; Anderson, Elizabeth S; Won, Jong Ho; Rubinstein, Jay T
2014-01-01
Nonspeech psychophysical tests of spectral resolution, such as the spectral-ripple discrimination task, have been shown to correlate with speech-recognition performance in cochlear implant (CI) users. However, these tests are best suited for use in the research laboratory setting and are impractical for clinical use. A test of spectral resolution that is quicker and could more easily be implemented in the clinical setting has been developed. The objectives of this study were (1) To determine whether this new clinical ripple test would yield individual results equivalent to the longer, adaptive version of the ripple-discrimination test; (2) To evaluate test-retest reliability for the clinical ripple measure; and (3) To examine the relationship between clinical ripple performance and monosyllabic word recognition in quiet for a group of CI listeners. Twenty-eight CI recipients participated in the study. Each subject was tested on both the adaptive and the clinical versions of spectral ripple discrimination, as well as consonant-nucleus-consonant word recognition in quiet. The adaptive version of spectral ripple used a two-up, one-down procedure for determining spectral ripple discrimination threshold. The clinical ripple test used a method of constant stimuli, with trials for each of 12 fixed ripple densities occurring six times in random order. Results from the clinical ripple test (proportion correct) were then compared with ripple-discrimination thresholds (in ripples per octave) from the adaptive test. The clinical ripple test showed strong concurrent validity, evidenced by a good correlation between clinical ripple and adaptive ripple results (r = 0.79), as well as a correlation with word recognition (r = 0.7). Excellent test-retest reliability was also demonstrated with a high test-retest correlation (r = 0.9). The clinical ripple test is a reliable nonlinguistic measure of spectral resolution, optimized for use with CI users in a clinical setting. 
The test might be useful as a diagnostic tool or as a possible surrogate outcome measure for evaluating treatment effects in hearing.
The Arthroscopic Surgical Skill Evaluation Tool (ASSET).
Koehler, Ryan J; Amsdell, Simon; Arendt, Elizabeth A; Bisson, Leslie J; Braman, Jonathan P; Bramen, Jonathan P; Butler, Aaron; Cosgarea, Andrew J; Harner, Christopher D; Garrett, William E; Olson, Tyson; Warme, Winston J; Nicandri, Gregg T
2013-06-01
Surgeries employing arthroscopic techniques are among the most commonly performed in orthopaedic clinical practice; however, valid and reliable methods of assessing the arthroscopic skill of orthopaedic surgeons are lacking. The Arthroscopic Surgery Skill Evaluation Tool (ASSET) will demonstrate content validity, concurrent criterion-oriented validity, and reliability when used to assess the technical ability of surgeons performing diagnostic knee arthroscopic surgery on cadaveric specimens. Cross-sectional study; Level of evidence, 3. Content validity was determined by a group of 7 experts using the Delphi method. Intra-articular performance of a right and left diagnostic knee arthroscopic procedure was recorded for 28 residents and 2 sports medicine fellowship-trained attending surgeons. Surgeon performance was assessed by 2 blinded raters using the ASSET. Concurrent criterion-oriented validity, interrater reliability, and test-retest reliability were evaluated. Content validity: The content development group identified 8 arthroscopic skill domains to evaluate using the ASSET. Concurrent criterion-oriented validity: Significant differences in the total ASSET score (P < .05) between novice, intermediate, and advanced experience groups were identified. Interrater reliability: The ASSET scores assigned by each rater were strongly correlated (r = 0.91, P < .01), and the intraclass correlation coefficient between raters for the total ASSET score was 0.90. Test-retest reliability: There was a significant correlation between ASSET scores for both procedures attempted by each surgeon (r = 0.79, P < .01). The ASSET appears to be a useful, valid, and reliable method for assessing surgeon performance of diagnostic knee arthroscopic surgery in cadaveric specimens. Studies are ongoing to determine its generalizability to other procedures as well as to the live operating room and other simulated environments.
Park, Juhyun; Kang, Minyong; Jeong, Chang Wook; Oh, Sohee; Lee, Jeong Woo; Lee, Seung Bae; Son, Hwancheol; Jeong, Hyeon; Cho, Sung Yong
2015-08-01
The modified Seoul National University Renal Stone Complexity scoring system (S-ReSC-R) for retrograde intrarenal surgery (RIRS) was developed as a tool to predict stone-free rate (SFR) after RIRS. We externally validated the S-ReSC-R. We retrospectively reviewed 159 patients who underwent RIRS. The S-ReSC-R was assigned from 1 to 12 according to the location and number of sites involved. The stone-free status was defined as no evidence of a stone or with clinically insignificant residual fragment stones less than 2 mm. Interobserver and test-retest reliabilities were evaluated. Statistical performance of the prediction model was assessed by its predictive accuracy, predictive probability, and clinical usefulness. Overall SFR was 73.0%. The SFRs were 86.7%, 70.2%, and 48.6% in low-score (1-2), intermediate-score (3-4), and high-score (5-12) groups, respectively (p<0.001). External validation of S-ReSC-R revealed an area under the curve (AUC) of 0.731 (95% CI 0.650-0.813). The AUC of the three-tiered S-ReSC-R was 0.701 (95% CI 0.609-0.794). The calibration plot showed that the predicted probability of SFR had a concordance comparable to that of observed frequency. The Hosmer-Lemeshow goodness of fit test revealed a p-value of 0.01 for the S-ReSC-R and 0.90 for the three-tiered S-ReSC-R. Interobserver and test-retest reliabilities revealed an almost perfect level of agreement. The present study proved the predictive value of S-ReSC-R to predict SFR following RIRS in an independent cohort. Interobserver and test-retest reliabilities confirmed that S-ReSC-R was reliable and valid.
2017-01-01
Objective To investigate the reliability and validity of a new method for isometric back extensor strength measurement using a portable dynamometer. Methods A chair equipped with a small portable dynamometer was designed (Power Track II Commander Muscle Tester). A total of 15 men (mean age, 34.8±7.5 years) and 15 women (mean age, 33.1±5.5 years) with no current back problems or previous history of back surgery were recruited. Subjects were asked to push the back of the chair while seated, and their isometric back extensor strength was measured by the portable dynamometer. Test-retest reliability was assessed with intraclass correlation coefficient (ICC). For the validity assessment, isometric back extensor strength of all subjects was measured by a widely used physical performance evaluation instrument, BTE PrimusRS system. The limit of agreement (LoA) from the Bland-Altman plot was evaluated between two methods. Results The test-retest reliability was excellent (ICC=0.82; 95% confidence interval, 0.65–0.91). The Bland-Altman plots demonstrated acceptable agreement between the two methods: the lower 95% LoA was −63.1 N and the upper 95% LoA was 61.1 N. Conclusion This study shows that isometric back extensor strength measurement using a portable dynamometer has good reliability and validity. PMID:29201818
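The Bland-Altman limits of agreement used above for the validity assessment are the mean of the paired differences plus or minus 1.96 standard deviations of those differences. The sketch below assumes that usual definition; the function name and the force values are hypothetical and are not the study's measurements.

```python
import numpy as np


def bland_altman_limits(method_a, method_b, z=1.96):
    """Return (bias, lower limit, upper limit) for paired measurements from two methods."""
    diff = np.asarray(method_a, dtype=float) - np.asarray(method_b, dtype=float)
    bias = diff.mean()              # mean difference between the two methods
    sd = diff.std(ddof=1)           # sample SD of the paired differences
    return float(bias), float(bias - z * sd), float(bias + z * sd)


# Hypothetical paired strength readings in newtons (not study data).
portable = [410.0, 388.0, 502.0, 455.0]
reference = [409.0, 389.0, 501.0, 456.0]
bias, lower, upper = bland_altman_limits(portable, reference)
print(round(bias, 2), round(lower, 2), round(upper, 2))
```

Read backwards under the same 1.96 assumption, the reported limits of −63.1 N and 61.1 N would correspond to a mean difference of about −1.0 N and a standard deviation of the differences of roughly 31.7 N.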
Şahin, Sedef; Huri, Meral; Aran, Orkun Tahir; Uyanık, Mine
2018-02-23
Background/aim: The Cancer Fatigue Scale (CFS) was developed to evaluate the severity of fatigue in patients with breast cancer. The aim of this study is to translate and culturally adapt a Turkish version and investigate the validity and reliability of the CFS in Turkish patients with fatigue symptoms. Materials and methods: Eighty participants completed the Turkish version of the CFS for breast cancer and the European Organization for Research and Treatment of Cancer Quality of Life Core Questionnaire "Core 30" (EORTC QLQ-C30). Test-retest reliability was evaluated by repeating the CFS with a 7-day interval. Results: The CFS demonstrated high test-retest reliability (ICC = 0.95) and good internal consistency (Cronbach's alpha = 0.74) for all domains. The Kaiser-Meyer-Olkin measure of sampling adequacy was found to be 0.819, which is considered to be satisfactory (>0.5). Correlations between domains of CFS physical and EORTC physical (r: 0.77), CFS cognitive and EORTC cognitive (r: 0.70), and CFS physical and EORTC fatigue (r: 0.80) were found to be significant. Conclusion: The Turkish version of the CFS is a reliable and valid instrument to assess physical, affective, and cognitive dimensions of fatigue. The CFS may be used to evaluate the severity of fatigue in Turkish-speaking breast cancer patients.
Koo, Terry K; Cohen, Jeffrey H; Zheng, Yongping
2011-11-01
Soft tissue exhibits nonlinear stress-strain behavior under compression. Characterizing its nonlinear elasticity may aid detection, diagnosis, and treatment of soft tissue abnormality. The purposes of this study were to develop a rate-controlled Mechano-Acoustic Indentor System and a corresponding finite element optimization method to extract nonlinear elastic parameters of soft tissue and evaluate its test-retest reliability. An indentor system using a linear actuator to drive a force-sensitive probe with a tip-mounted ultrasound transducer was developed. Twenty independent sites at the upper lateral quadrant of the buttock from 11 asymptomatic subjects (7 men and 4 women from a chiropractic college) were indented at 6% per second for 3 sessions, each consisting of 5 trials. Tissue thickness, force at 25% deformation, and area under the load-deformation curve from 0% to 25% deformation were calculated. Optimized hyperelastic parameters of the soft tissue were calculated with a finite element model using a first-order Ogden material model. Load-deformation response on a standardized block was then simulated, and the corresponding area and force parameters were calculated. Between-trials repeatability and test-retest reliability of each parameter were evaluated using coefficients of variation and intraclass correlation coefficients, respectively. Load-deformation responses were highly reproducible under repeated measurements. Coefficients of variation of tissue thickness, area under the load-deformation curve from 0% to 25% deformation, and force at 25% deformation averaged 0.51%, 2.31%, and 2.23%, respectively. Intraclass correlation coefficients ranged between 0.959 and 0.999, indicating excellent test-retest reliability. The automated Mechano-Acoustic Indentor System and its corresponding optimization technique offers a viable technology to make in vivo measurement of the nonlinear elastic properties of soft tissue. 
This technology showed excellent between-trials repeatability and test-retest reliability with potential to quantify the effects of a wide variety of manual therapy techniques on the soft tissue elastic properties. Copyright © 2011 National University of Health Sciences. Published by Mosby, Inc. All rights reserved.
ASSOCIATIONS BETWEEN THREE CLINICAL ASSESSMENT TOOLS FOR POSTURAL STABILITY
Saxion, Casie E.; Cameron, Kenneth L.; Gerber, J. Parry
2010-01-01
Study Design: Clinical Measurement, Correlation, Reliability Objectives: To assess the relationship between the Single Leg Balance (SLB), modified Balance Error Scoring System (mBESS), and modified Star Excursion Balance (mSEBT) tests and secondarily to assess inter-rater and test-retest reliability of these tests. Background: Ankle sprains often result in chronic instability and dysfunction. Several clinical tests assess postural deficits as a potential cause of this dysfunction; however, limited information exists pertaining to the relationship that these tests have with one another. Methods: Two independent examiners measured the performance of 34 healthy participants completing the SLB Test, mBESS test, and mSEBT at two different time periods. The relationship between tests was assessed using the Pearson Correlation and Fisher's Exact Tests. Inter-rater and test-retest reliability were assessed using the intraclass correlation coefficient (ICC) and Kappa statistics. Results: A significant correlation (r = -0.35) was observed between the mSEBT and the mBESS. Fisher's Exact Test showed a significant association between the SLB Test and mBESS (P = .048), but no association between the SLB and mSEBT (P = 1.000). Inter-rater reliability was excellent for the mSEBT and fair for the mBESS (ICCs of .91 and .61 respectively). Excellent agreement was observed between raters for the SLB test (k = 1.00). Test-retest reliability was excellent for the mSEBT (ICC = 0.98) and fair for the mBESS (ICC = 0.74). There was poor test-retest agreement for the SLB test (k = .211). Conclusion: There was a significant relationship observed between the SLB Test, mBESS test, and mSEBT; however, strength-of-association measures showed limited overlap between these tests. This suggests that these tests are interrelated but may not assess the same components of postural stability. PMID:21589668
Apivatgaroon, Adinun; Angthong, Chayanin; Sanguanjit, Prakasit; Chernchujit, Bancha
2016-10-01
To develop a Thai version of the Kujala score and evaluate its validity and reliability. The Thai version of the Kujala score was developed using the forward-backward translation protocol. Forty-nine PFPS patients answered the Thai versions of questionnaires including the Kujala score, Short Form-36 (SF-36) and International Knee Documentation Committee (IKDC) Subjective Knee Form. The validity between the scores was tested. The reliability was assessed using test-retest reliability and internal consistency. The Thai version of the Kujala score showed a good correlation with the Thai IKDC Subjective Knee Form (Pearson's correlation coefficient; r = 0.74; p < 0.01) and moderate correlation with the Thai SF-36 subscales of physical component summary, total score and role physical (r = 0.586, 0.571 and 0.524, respectively; p < 0.01). The test-retest reliability was excellent with an intra-class correlation coefficient of 0.908 (p < 0.001; 95% CI [0.842-0.947]). The internal consistency was strong with Cronbach's alpha of 0.952 (p < 0.001). No floor and ceiling effects were observed. The Thai version of the Kujala score has shown good validity and reliability. This score can be effectively used for evaluating Thai patients with patellofemoral pain syndrome. Implications for Rehabilitation The Kujala score is a self-administered questionnaire for patients with patellofemoral pain syndrome (PFPS). The validity and reliability of the Thai version of Kujala are compatible with other versions (Turkish, Chinese and Persian versions). The Thai version of Kujala has been shown to have validity and reliability in Thai PFPS patients and can be used for clinical evaluation and also in research work.
DiFilippo, Kristen Nicole; Huang, Wenhao; Chapman-Novakofski, Karen M
2017-10-27
The extensive availability and increasing use of mobile apps for nutrition-based health interventions makes evaluation of the quality of these apps crucial for integration of apps into nutritional counseling. The goal of this research was the development, validation, and reliability testing of the app quality evaluation (AQEL) tool, an instrument for evaluating apps' educational quality and technical functionality. Items for evaluating app quality were adapted from website evaluations, with additional items added to evaluate the specific characteristics of apps, resulting in 79 initial items. Expert panels of nutrition and technology professionals and app users reviewed items for face and content validation. After recommended revisions, nutrition experts completed a second AQEL review to ensure clarity. On the basis of 150 sets of responses using the revised AQEL, principal component analysis was completed, reducing AQEL into 5 factors that underwent reliability testing, including internal consistency, split-half reliability, test-retest reliability, and interrater reliability (IRR). Two additional modifiable constructs for evaluating apps based on the age and needs of the target audience as selected by the evaluator were also tested for construct reliability. IRR testing using intraclass correlations (ICC) with all 7 constructs was conducted, with 15 dietitians evaluating one app. Development and validation resulted in the 51-item AQEL. These were reduced to 25 items in 5 factors after principal component analysis, plus 9 modifiable items in two constructs that were not included in principal component analysis. Internal consistency and split-half reliability of the following constructs derived from principal component analysis were good (Cronbach alpha >.80, Spearman-Brown coefficient >.80): behavior change potential, support of knowledge acquisition, app function, and skill development. App purpose split-half reliability was .65. 
Test-retest reliability showed no significant change over time (P>.05) for all but skill development (P=.001). Construct reliability was good for items assessing age appropriateness of apps for children, teens, and a general audience. In addition, construct reliability was acceptable for assessing app appropriateness for various target audiences (Cronbach alpha >.70). For the 5 main factors, ICC (1,k) was >.80, with a P value of <.05. When 15 nutrition professionals evaluated one app, ICC (2,15) was .98, with a P value of <.001 for all 7 constructs when the modifiable items were specified for adults seeking weight loss support. Our preliminary effort shows that AQEL is a valid, reliable instrument for evaluating nutrition apps' qualities for clinical interventions by nutrition clinicians, educators, and researchers. Further efforts in validating AQEL in various contexts are needed. ©Kristen Nicole DiFilippo, Wenhao Huang, Karen M. Chapman-Novakofski. Originally published in JMIR Mhealth and Uhealth (http://mhealth.jmir.org), 27.10.2017.
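The internal-consistency and split-half statistics named in the abstract above can be illustrated with a minimal sketch. It assumes the textbook definitions of Cronbach's alpha and the Spearman-Brown correction; the abstract does not say which software or exact procedure the authors used, and the function names and ratings below are hypothetical.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    Textbook formula: alpha = k/(k-1) * (1 - sum(item variances) / variance(total score)).
    """
    k = len(rows[0])
    if k < 2:
        raise ValueError("at least two items are required")
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def spearman_brown(half_correlation):
    """Step a split-half correlation up to an estimate of full-length reliability."""
    return 2 * half_correlation / (1 + half_correlation)


# Hypothetical ratings: 5 evaluators x 4 items (not data from the study).
ratings = [
    [4, 5, 4, 5],
    [3, 3, 4, 3],
    [5, 5, 5, 4],
    [2, 3, 2, 3],
    [4, 4, 5, 4],
]
print(f"alpha = {cronbach_alpha(ratings):.2f}")
print(f"Spearman-Brown from r_half = 0.70: {spearman_brown(0.70):.2f}")
```

By the thresholds quoted in the abstract, values above .80 on either statistic would be read as good.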
Y-balance test: a reliability study involving multiple raters.
Shaffer, Scott W; Teyhen, Deydre S; Lorenson, Chelsea L; Warren, Rick L; Koreerat, Christina M; Straseske, Crystal A; Childs, John D
2013-11-01
The Y-balance test (YBT) is one of the few field expedient tests that have shown predictive validity for injury risk in an athletic population. However, analysis of the YBT in a heterogeneous population of active adults (e.g., military, specific occupations) involving multiple raters with limited experience in a mass screening setting is lacking. The primary purpose of this study was to determine interrater test-retest reliability of the YBT in a military setting using multiple raters. Sixty-four service members (53 males, 11 females) actively conducting military training volunteered to participate. Interrater test-retest reliability of the maximal reach had intraclass correlation coefficients (2,1) of 0.80 to 0.85 with a standard error of measurement ranging from 3.1 to 4.2 cm for the 3 reach directions (anterior, posteromedial, and posterolateral). Interrater test-retest reliability of the average reach of 3 trials had an intraclass correlation coefficient (2,3) range of 0.85 to 0.93 with an associated standard error of measurement ranging from 2.0 to 3.5 cm. The YBT showed good interrater test-retest reliability with an acceptable level of measurement error among multiple raters screening active duty service members. In addition, 31.3% (n = 20 of 64) of participants exhibited an anterior reach asymmetry of >4 cm, suggesting impaired balance symmetry and potentially increased risk for injury. Reprint & Copyright © 2013 Association of Military Surgeons of the U.S.
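The abstract above pairs each ICC with a standard error of measurement (SEM). As a rough sketch of how the two relate, the snippet below assumes the conventional formula SEM = SD × sqrt(1 − ICC) and the usual minimal detectable change MDC95 = 1.96 × sqrt(2) × SEM. The abstract does not state how its SEM was derived, and the 8.0 cm standard deviation is a hypothetical input, not a figure from the study.

```python
import math


def standard_error_of_measurement(sd, icc):
    """Conventional SEM: between-subject SD scaled by sqrt(1 - ICC)."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie between 0 and 1")
    return sd * math.sqrt(1.0 - icc)


def minimal_detectable_change(sem, z=1.96):
    """Smallest difference between two measurements that exceeds error at level z."""
    return z * math.sqrt(2.0) * sem


# Hypothetical SD of 8.0 cm with an ICC inside the range reported above.
sem = standard_error_of_measurement(sd=8.0, icc=0.84)
mdc95 = minimal_detectable_change(sem)
print(f"SEM = {sem:.2f} cm, MDC95 = {mdc95:.2f} cm")
```

Under these assumptions, a change in reach smaller than the MDC95 could not be distinguished from measurement error.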
[The reliability of a questionnaire regarding Colombian children's physical activity].
Herazo-Beltrán, Aliz Y; Domínguez-Anaya, Regina
2012-10-01
To report the test-retest reliability and internal consistency of the Physical Activity Questionnaire for school children (PAQ-C). This was a descriptive study of 100 schoolchildren aged 9 to 11 years attending a school in Cartagena, Colombia. The sample was randomly selected. The PAQ-C was given twice, one week apart, after the informed consent forms had been signed by the children's parents and school officials. Cronbach's alpha coefficient of reliability was used for assessing internal consistency and an intra-class correlation coefficient for test-retest reliability. SPSS (version 17.0) was used for statistical analysis. Internal consistency was 0.73 at the first measurement and 0.78 at the second; the intra-class correlation coefficient was 0.60. There were differences between boys and girls regarding both measurements. The PAQ-C had acceptable internal consistency and test-retest reliability, thereby making it useful for measuring children's self-reported physical activity and a valuable tool for population studies in Colombia.
Arbab, Dariusch; Kuhlmann, Katharina; Ringendahl, Hubert; Bouillon, Bertil; Eysel, Peer; König, Dietmar
2017-06-13
Patient-reported outcome measures are a critical tool in evaluating the efficacy of orthopaedic procedures. The intention of this study was to develop and culturally adapt a German version of the Manchester-Oxford Foot Questionnaire (MOXFQ) and to evaluate reliability, validity and responsiveness. Forward and backward translation was performed according to guidelines. The German MOXFQ was investigated in 177 consecutive patients before and 6 months after foot or ankle surgery. All patients completed MOXFQ, Foot and Ankle Outcome Score (FAOS), Short Form 36 and numeric scales for pain and disability (NRS). Test-retest reliability, internal consistency, floor and ceiling effects, construct validity and minimal important change were analyzed. The German MOXFQ demonstrated excellent test-retest reliability with ICC values >0.9. Cronbach's alpha (α) values demonstrated strong internal consistency. No floor or ceiling effects were observed. As hypothesized, MOXFQ subscales correlated strongly with corresponding FAOS and SF-36 domains. All subscales showed excellent (ES/SRM >0.8) responsiveness between preoperative assessment and postoperative follow-up. The German version of the MOXFQ demonstrated good psychometric properties. It proved to be a valid and reliable instrument for use in foot and ankle patients. Copyright © 2017 European Foot and Ankle Society. Published by Elsevier Ltd. All rights reserved.
Lam, Simon C
2014-05-01
To perform detailed psychometric testing of the compliance with standard precautions scale (CSPS) in measuring compliance with standard precautions of clinical nurses and to conduct cross-cultural pilot testing and assess the relevance of the CSPS on an international platform. A cross-sectional and correlational design with repeated measures. Nursing students from a local registered nurse training university, nurses from different hospitals in Hong Kong, and experts in an international conference. The psychometric properties of the CSPS were evaluated via internal consistency, 2-week and 3-month test-retest reliability, concurrent validation, and construct validation. The cross-cultural pilot testing and relevance check was examined by experts on infection control from various developed and developing regions. Among 453 participants, 193 were nursing students, 165 were enrolled nurses, and 95 were registered nurses. The results showed that the CSPS had satisfactory reliability (Cronbach α = 0.73; intraclass correlation coefficient, 0.79 for 2-week test-retest and 0.74 for 3-month test-retest) and validity (optimum correlation with criterion measure; r = 0.76, P < .001; satisfactory results on known-group method and hypothesis testing). A total of 19 experts from 16 countries assured that most of the CSPS findings were relevant and globally applicable. The CSPS demonstrated satisfactory results on the basis of the standard international criteria on psychometric testing, which ascertained the reliability and validity of this instrument in measuring the compliance of clinical nurses with standard precautions. The cross-cultural pilot testing further reinforced the instrument's relevance and applicability in most developed and developing regions.
Loeding, B L; Greenan, J P
1998-12-01
The study examined the validity and reliability of four assessments, with three instruments per domain. Domains included generalizable mathematics, communication, interpersonal relations, and reasoning skills. Participants were deaf, legally blind, or visually impaired students enrolled in vocational classes at residential secondary schools. The researchers estimated the internal consistency reliability, test-retest reliability, and construct validity correlations of three subinstruments: student self-ratings, teacher ratings, and performance assessments. The data suggest that these instruments are highly internally consistent measures of generalizable vocational skills. Four performance assessments have high-to-moderate test-retest reliability estimates, and were generally considered to possess acceptable validity and reliability.
Peterson, Jennifer R.; Hill, Catherine C.; Kirkpatrick, Kimberly
2016-01-01
Impulsive choice is typically measured by presenting smaller-sooner (SS) versus larger-later (LL) rewards, with biases towards the SS indicating impulsivity. The current study tested rats on different impulsive choice procedures with LL delay manipulations to assess same-form and alternate-form test-retest reliability. In the systematic-GE procedure (Green & Estle, 2003), the LL delay increased after several sessions of training; in the systematic-ER procedure (Evenden & Ryan, 1996), the delay increased within each session; and in the adjusting-M procedure (Mazur, 1987), the delay changed after each block of trials within a session based on each rat’s choices in the previous block. In addition to measuring choice behavior, we also assessed temporal tracking of the LL delays using the median times of responding during LL trials. The two systematic procedures yielded similar results in both choice and temporal tracking measures following extensive training, whereas the adjusting procedure resulted in relatively more impulsive choices and poorer temporal tracking. Overall, the three procedures produced acceptable same form test-retest reliability over time, but the adjusting procedure did not show significant alternate form test-retest reliability with the other two procedures. The results suggest that systematic procedures may supply better measurements of impulsive choice in rats. PMID:25490901
One-year test-retest reliability of intrinsic connectivity network fMRI in older adults
Guo, Cong C.; Kurth, Florian; Zhou, Juan; Mayer, Emeran A.; Eickhoff, Simon B; Kramer, Joel H.; Seeley, William W.
2014-01-01
“Resting-state” or task-free fMRI can assess intrinsic connectivity network (ICN) integrity in health and disease, suggesting a potential for use of these methods as disease-monitoring biomarkers. Numerous analytical options are available, including model-driven ROI-based correlation analysis and model-free, independent component analysis (ICA). High test-retest reliability will be a necessary feature of a successful ICN biomarker, yet available reliability data remains limited. Here, we examined ICN fMRI test-retest reliability in 24 healthy older subjects scanned roughly one year apart. We focused on the salience network, a disease-relevant ICN not previously subjected to reliability analysis. Most ICN analytical methods proved reliable (intraclass coefficients > 0.4) and could be further improved by wavelet analysis. Seed-based ROI correlation analysis showed high map-wise reliability, whereas graph theoretical measures and temporal concatenation group ICA produced the most reliable individual unit-wise outcomes. Including global signal regression in ROI-based correlation analyses reduced reliability. Our study provides a direct comparison between the most commonly used ICN fMRI methods and potential guidelines for measuring intrinsic connectivity in aging control and patient populations over time. PMID:22446491
de Vreede, Paul L; Samson, Monique M; van Meeteren, Nico L; Duursma, Sijmen A; Verhaar, Harald J
2006-08-01
The Assessment of Daily Activity Performance (ADAP) test was developed, and modeled after the Continuous-scale Physical Functional Performance (CS-PFP) test, to provide a quantitative assessment of older adults' physical functional performance. The aim of this study was to determine the intra-examiner reliability and construct validity of the ADAP in a community-living older population, and to identify the importance of tester experience. Forty-three community-dwelling, older women (mean age 75 yr +/-4.3) were randomized to the test-retest reliability study (n=19) or validation study (n=24). The intra-examiner reliability of an experienced (tester 1) and an inexperienced tester (tester 2) was assessed by comparing test and retest scores of 19 participants. Construct validity was assessed by comparing the ADAP scores of 24 participants with self-perceived function by the SF-36 Health Survey, muscle function tests, and the Timed Up and Go test (TUG). Tester 1 had good consistency and reliability scores (mean difference between test and retest scores (DIF), -1.05+/-1.99; 95% confidence interval (CI), -2.58 to 0.48; Cronbach's alpha (alpha) range, 0.83 to 0.98; intraclass correlation (ICC) range, 0.75 to 0.96; Limits of Agreement (LoA), -2.58 to 4.95). Tester 2 had lower reliability scores (DIF, -2.45+/-4.36; 95% CI, -5.56 to 0.67; alpha range, 0.53 to 0.94; ICC range, 0.36 to 0.90; LoA, -6.09 to 10.99), with a systematic difference between test and retest scores for the ADAP domain lower-body strength (-3.81; 95% CI, -6.09 to -1.54). ADAP correlated with the SF-36 Physical Functioning scale (r=0.67), the TUG test (r=-0.91) and isometric knee extensor strength (r=0.80). The ADAP test is a reliable and valid instrument. Our results suggest that testers should practise using the test, to improve reliability, before applying it to clinical settings.
Jin, X F; Wang, J; Li, Y J; Liu, J F; Ni, D F
2016-09-20
Objective: To cross-culturally translate the Questionnaire of Olfactory Disorders (QOD) into a simplified Chinese version, and evaluate its reliability and validity in clinical practice. Method: A simplified Chinese version of the QOD was evaluated for test-retest reliability, split-half reliability and internal consistency. It was then evaluated for validity, including content validity, criterion-related validity, responsibility. Criterion-related validity was assessed using the Medical Outcomes Study 36-item Short Form health survey (SF-36) and the World Health Organization Quality of Life-Brief (WHOQOL-BREF) for comparison. Result: A total of 239 patients with olfactory dysfunction were enrolled and tested, of whom 195 completed all three surveys (QOD, SF-36, WHOQOL-BREF). The test-retest reliabilities of the QOD-parosmia statements (QOD-P), QOD-quality of life (QOD-QoL), and QOD-visual simulation (QOD-VAS) sections were 0.799 (P < 0.01), 0.781 (P < 0.01), and 0.488 (P < 0.01), respectively, and the Cronbach's α reliability coefficients were 0.477, 0.812, and 0.889, respectively. The split-half reliability of QOD-QoL was 0.89. There was no correlation between the QOD-P section and the SF-36, but there were statistically significant correlations between the QOD-QoL and QOD-VAS sections with the SF-36. There was no correlation between the QOD-P section and the WHOQOL-BREF, but there were statistically significant correlations between the QOD-QoL and QOD-VAS sections with the SF-36 in most sections. Conclusion: The simplified Chinese version of the QOD was shown to be a reliable and valid questionnaire for evaluating patients with olfactory dysfunction living in mainland China. The QOD-P section needs further modification to suit patients with a Chinese cultural and knowledge background. Copyright© by the Editorial Department of Journal of Clinical Otorhinolaryngology Head and Neck Surgery.
Age-Related Differences in Test-Retest Reliability in Resting-State Brain Functional Connectivity
Song, Jie; Desphande, Alok S.; Meier, Timothy B.; Tudorascu, Dana L.; Vergun, Svyatoslav; Nair, Veena A.; Biswal, Bharat B.; Meyerand, Mary E.; Birn, Rasmus M.; Bellec, Pierre; Prabhakaran, Vivek
2012-01-01
Resting-state functional MRI (rs-fMRI) has emerged as a powerful tool for investigating brain functional connectivity (FC). Research in recent years has focused on assessing the reliability of FC across younger subjects within and between scan-sessions. Test-retest reliability in resting-state functional connectivity (RSFC) has not yet been examined in older adults. In this study, we investigated age-related differences in reliability and stability of RSFC across scans. In addition, we examined how global signal regression (GSR) affects RSFC reliability and stability. Three separate resting-state scans from 29 younger adults (18–35 yrs) and 26 older adults (55–85 yrs) were obtained from the International Consortium for Brain Mapping (ICBM) dataset made publicly available as part of the 1000 Functional Connectomes project (www.nitrc.org/projects/fcon_1000). 92 regions of interest (ROIs) with 5 cubic mm radius, derived from the default, cingulo-opercular, fronto-parietal and sensorimotor networks, were previously defined based on a recent study. Mean time series were extracted from each of the 92 ROIs from each scan and three matrices of z-transformed correlation coefficients were created for each subject, which were then used for evaluation of multi-scan reliability and stability. The young group showed higher reliability of RSFC than the old group with GSR (p-value = 0.028) and without GSR (p-value <0.001). Both groups showed a high degree of multi-scan stability of RSFC and no significant differences were found between groups. By comparing the test-retest reliability of RSFC with and without GSR across scans, we found a significantly higher proportion of reliable connections in both groups without GSR, but decreased stability. 
Our results suggest that aging is associated with reduced reliability of RSFC which itself is highly stable within-subject across scans for both groups, and that GSR reduces the overall reliability but increases the stability in both age groups and could potentially alter group differences of RSFC. PMID:23227153
Tong, W W; Wang, W; Xu, W D
2016-08-15
The Western Ontario Meniscal Evaluation Tool (WOMET) is a questionnaire designed to evaluate the health-related quality of life (HRQOL) of patients with meniscal pathology. Our study aims to culturally adapt and validate the WOMET into a Chinese version. We translated the WOMET into Chinese. Then, a total of 121 patients with meniscal pathology were invited to participate in this study. To assess the test-retest reliability, the Chinese version WOMET was completed twice at 7-day intervals by the participants. The construct validity was assessed using Pearson's correlation coefficient or Spearman's correlation to test for correlations among the Chinese version WOMET and the eight domains of Short Form-36 (SF-36), the Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC), and the International Knee Documentation Committee (IKDC) score. Responsiveness was tested by comparison of the preoperative and postoperative scores of the Chinese version WOMET. The test-retest reliability of the overall scale and different domains were all found to be excellent. The Cronbach's α was 0.90. The Chinese version WOMET correlated well with other questionnaires, which suggested good construct validity. We observed no ceiling and floor effects of the Chinese version WOMET. We also found good responsiveness: the effect size and the standardized response mean values were 0.86 and 1.11. The Chinese version of the WOMET appears to be reliable and valid in evaluating patients with meniscal pathology.
São-João, Thaís Moreira; Rodrigues, Roberta Cunha Matheus; Gallani, Maria Cecilia Bueno Jayme; Miura, Cinthya Tamie de Passos; Domingues, Gabriela de Barros Leite; Godin, Gaston
2013-06-01
To conduct the cultural adaptation of the Brazilian version of the Godin-Shephard Leisure-Time Physical Activity Questionnaire (GSLTPAQ) and to assess its content validity, practicability, acceptability and reliability. The stages of translation, synthesis, back translation, expert committee review and pre-test were carried out, followed by the evaluation of the practicability, acceptability and reliability (test-retest). The judges assessed its semantic, idiomatic, conceptual, cultural and metabolic equivalences. The adapted version was submitted to the pre-test (n = 20), and test-retest (n = 80), in healthy individuals and in those suffering from cardiovascular disease in Limeira, SP, Southeastern Brazil, between 2010 and 2011. The proportion of agreement of the committee of judges was assessed using the Content Validity Index. Reliability was assessed by the criterion of stability, with 15 days between applications. Practicability was evaluated by the time spent interviewing and acceptability was estimated as the percentage of unanswered items and the proportion of patients who responded to all items. The translated version of the questionnaire showed evidence of appropriate semantic-idiomatic, conceptual, cultural and metabolic equivalence, with substitutions of several physical activities more appropriate to the Brazilian population. The practicability analysis showed short time needed for the application of the instrument (mean 3.0 minutes). As for acceptability, all patients answered 100% of the items. The test-retest analysis suggested that stability was good (Intraclass Correlation Coefficient value of 0.84). The Brazilian version of the questionnaire showed satisfactory measures of the qualities in question. Its application to diverse populations in future studies is recommended in order to provide robust measures of these qualities.
Alhaiti, Ali Hassan; Alotaibi, Alanod Raffa; Jones, Linda Katherine; DaCosta, Cliff
2016-01-01
Objective. To translate the revised Michigan Diabetes Knowledge Test into the Arabic language and examine its psychometric properties. Setting. Of the 139 participants recruited through King Fahad Medical City in Riyadh, Saudi Arabia, 34 agreed to join the second-round sample for retesting purposes. Methods. The translation process followed the World Health Organization's guidelines for the translation and adaptation of instruments. All translations were examined for their validity and reliability. Results. The translation process revealed excellent results throughout all stages. The Arabic version received 0.75 for internal consistency via Cronbach's alpha test and excellent outcomes in terms of the test-retest reliability of the instrument, with a mean intraclass correlation coefficient of 0.90. It also received positive content validity index scores. The item-level content validity index for all instrument scales fell between 0.83 and 1 with a mean scale-level index of 0.96. Conclusion. The Arabic version is proven to be a reliable and valid measure of patients' knowledge that is ready to be used in clinical practice. PMID:27995149
Stucki, G; Meier, D; Stucki, S; Michel, B A; Tyndall, A G; Dick, W; Theiler, R
1996-01-01
The WOMAC (Western Ontario and McMaster Universities) Osteoarthritis Index is a tested questionnaire to assess symptoms and physical functional disability. We adapted the WOMAC for the German language and tested its metric properties, test-retest reliability and validity in 51 patients with knee and hip OA. All WOMAC scales (pain, stiffness, function) were internally consistent with Cronbach's coefficient alpha ranging from 0.80 to 0.96. Test-retest reliability was satisfactory with intraclass correlation coefficients ranging from 0.55 to 0.74. All scales and the global index calculated as the mean of scale scores had a bimodal distribution and a slight ceiling effect. As hypothesized the WOMAC scales were associated with radiological OA-severity and limitations of range-of-motion. Patients with more severe symptoms and functional disability perceived more limitations in their roles at home and at work. The presented German version of the WOMAC is a reliable and valid instrument for the assessment of symptoms and physical functional disability in patients with knee and hip OA.
Salido-Vallejo, R; Ruano, J; Garnacho-Saucedo, G; Godoy-Gijón, E; Llorca, D; Gómez-Fernández, C; Moreno-Giménez, J C
2014-12-01
Tuberous sclerosis complex (TSC) is an autosomal dominant neurocutaneous disorder characterized by the development of multisystem hamartomatous tumours. Topical sirolimus has recently been suggested as a potential treatment for TSC-associated facial angiofibroma (FA). To validate a reproducible scale created for the assessment of clinical severity and treatment response in these patients. We developed a new tool, the Facial Angiofibroma Severity Index (FASI) to evaluate the grade of erythema and the size and extent of FAs. In total, 30 different photographs of patients with TSC were shown to 56 dermatologists at each evaluation. Three evaluations using the same photographs but in a different random order were performed 1 week apart. Test and retest reliability and interobserver reproducibility were determined. There was good agreement between the investigators. Inter-rater reliability showed strong correlations (> 0.98; range 0.97-0.99) with inter-rater correlation coefficients (ICCs) for the FASI. The global estimated kappa coefficient for the degree of intra-rater agreement (test-retest) was 0.94 (range 0.91-0.97). The FASI is a valid and reliable tool for measuring the clinical severity of TSC-associated FAs, which can be applied in clinical practice to evaluate the response to treatment in these patients. © 2014 British Association of Dermatologists.
Busch, Robyn M.; Lineweaver, Tara T.; Ferguson, Lisa; Haut, Jennifer S.
2015-01-01
Reliable change index scores (RCIs) and standardized regression-based change score norms (SRBs) permit evaluation of meaningful changes in test scores following treatment interventions, like epilepsy surgery, while accounting for test-retest reliability, practice effects, score fluctuations due to error, and relevant clinical and demographic factors. Although these methods are frequently used to assess cognitive change after epilepsy surgery in adults, they have not been widely applied to examine cognitive change in children with epilepsy. The goal of the current study was to develop RCIs and SRBs for use in children with epilepsy. Sixty-three children with epilepsy (age range 6–16; M=10.19, SD=2.58) underwent comprehensive neuropsychological evaluations at two time points an average of 12 months apart. Practice-adjusted RCIs and SRBs were calculated for all cognitive measures in the battery. Practice effects were quite variable across the neuropsychological measures, with the greatest differences observed among older children, particularly on the Children’s Memory Scale and Wisconsin Card Sorting Test. There was also notable variability in test-retest reliabilities across measures in the battery, with coefficients ranging from 0.14 to 0.92. RCIs and SRBs for use in assessing meaningful cognitive change in children following epilepsy surgery are provided for measures with reliability coefficients above 0.50. This is the first study to provide RCIs and SRBs for a comprehensive neuropsychological battery based on a large sample of children with epilepsy. Tables to aid in evaluating cognitive changes in children who have undergone epilepsy surgery are provided for clinical use. An Excel sheet to perform all relevant calculations is also available to interested clinicians or researchers. PMID:26043163
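To make the idea of a practice-adjusted reliable change index concrete, here is a minimal sketch. It assumes one common formulation: the observed change minus the mean practice effect, divided by the standard error of the difference, where SE_diff = sqrt(2) × SD × sqrt(1 − r). The abstract does not give the authors' exact equations, so the formula, the 90% cutoff (z = 1.645), the function names, and all input values are assumptions for illustration only.

```python
import math


def practice_adjusted_rci(baseline, retest, mean_practice_effect, sd_baseline, reliability):
    """Practice-adjusted RCI under the assumed formulation described above."""
    if not 0.0 <= reliability < 1.0:
        raise ValueError("reliability must be in [0, 1)")
    sem = sd_baseline * math.sqrt(1.0 - reliability)   # standard error of measurement
    se_diff = math.sqrt(2.0) * sem                     # standard error of a difference score
    return ((retest - baseline) - mean_practice_effect) / se_diff


def classify_change(rci, critical_z=1.645):
    """Label a change as reliable when it falls outside +/- critical_z (assumed 90% band)."""
    if rci > critical_z:
        return "reliable improvement"
    if rci < -critical_z:
        return "reliable decline"
    return "no reliable change"


# Hypothetical child: score drops from 100 to 92 on a test with SD 15,
# test-retest r = 0.75 and a mean practice gain of 3 points in the reference group.
rci = practice_adjusted_rci(100, 92, 3, 15, 0.75)
print(f"RCI = {rci:.2f} -> {classify_change(rci)}")
```

The sketch also shows why the abstract restricts its tables to measures with reliability above 0.50: as r falls, SE_diff grows and almost no observed change can be classified as reliable.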
Schache, Margaret B; McClelland, Jodie A; Webster, Kate E
2016-01-01
To investigate the test-retest reliability of measuring hip abductor strength in patients with total knee arthroplasty (TKA) using a hand-held dynamometer (HHD) with two different types of resistance: belt and manual resistance. Test-retest reliability of 30 subjects (17 female, 13 male, 71.9 ± 7.4 years old), 9.2 ± 2.7 days post TKA was measured using belt and therapist resistance. Retest reliability was calculated with intra-class coefficients (ICC3,1) and 95% confidence intervals (CI) for both the group average and the individual scores. A paired t-test assessed whether a difference existed between the belt and therapist methods of resistance. ICCs were 0.82 and 0.80 for the belt and therapist resisted methods, respectively. Hip abductor strength increases of 8 N (14%) for belt resisted and 14 N (17%) for therapist resisted measurements of the group average exceeded the 95% CI and may represent real change. For individuals, hip abductor strength increases of 33 N (72%) (belt resisted) and 57 N (79%) (therapist resisted) could be interpreted as real change. Hip abductor strength can be reliably measured using HHD in the clinical setting with the described protocol. Belt resistance demonstrated slightly higher test-retest reliability. Reliable measurement of hip abductor muscle strength in patients with TKA is important to ensure deficiencies are addressed in rehabilitation programs and function is maximized. Hip abductor strength can be reliably measured with a hand-held dynamometer in the clinical setting using manual or belt resistance.
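The ICC(3,1) statistic named in the abstract above can be sketched from first principles. The snippet assumes the Shrout and Fleiss two-way mixed, consistency, single-measurement form, ICC(3,1) = (MS_rows − MS_error) / (MS_rows + (k − 1) × MS_error). The abstract does not describe the authors' software or computation, and the strength values below are hypothetical.

```python
def icc_3_1(scores):
    """ICC(3,1) for a table of subjects (rows) by sessions or raters (columns)."""
    n = len(scores)        # number of subjects
    k = len(scores[0])     # number of sessions per subject
    grand_mean = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares without replication.
    ss_rows = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand_mean) ** 2 for m in col_means)
    ss_total = sum((x - grand_mean) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)


# Hypothetical hip abductor strength (N) for 6 patients, test and retest.
strength = [
    [52.0, 58.0],
    [75.0, 71.0],
    [40.0, 47.0],
    [88.0, 95.0],
    [63.0, 60.0],
    [70.0, 82.0],
]
print(f"ICC(3,1) = {icc_3_1(strength):.2f}")
```

Because this form removes the session (column) effect, a uniform gain between test and retest does not lower the coefficient; only inconsistent changes across patients do.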
Assessing the psychometric properties of two food addiction scales.
Lemeshow, Adina R; Gearhardt, Ashley N; Genkinger, Jeanine M; Corbin, William R
2016-12-01
While food addiction is well accepted in popular culture and mainstream media, its scientific validity as an addictive behavior is still under investigation. This study evaluated the reliability and validity of the Yale Food Addiction Scale and Modified Yale Food Addiction Scale using data from two community-based convenience samples. We assessed the internal and test-retest reliability of the Yale Food Addiction Scale and Modified Yale Food Addiction Scale, and estimated the sensitivity and negative predictive value of the Modified Yale Food Addiction Scale using the Yale Food Addiction Scale as the benchmark. We calculated Cronbach's alphas and 95% confidence intervals (CIs) for internal reliability and Cohen's Kappa coefficients and 95% CIs for test-retest reliability. Internal consistency (n=232) was marginal to good, ranging from α=0.63 to 0.84. The test-retest reliability (n=45) for food addiction diagnosis was substantial, with Kappa=0.73 (95% CI, 0.48-0.88) (Yale Food Addiction Scale) and 0.79 (95% CI, 0.66-1.00) (Modified Yale Food Addiction Scale). Sensitivity and negative predictive value for classifying food addiction status were excellent: compared to the Yale Food Addiction Scale, the Modified Yale Food Addiction Scale's sensitivity was 92.3% (95% CI, 64%-99.8%), and the negative predictive value was 99.5% (95% CI, 97.5%-100%). Our analyses suggest that the Modified Yale Food Addiction Scale may be an appropriate substitute for the Yale Food Addiction Scale when a brief measure is needed, and support the continued use of both scales to investigate food addiction. Copyright © 2016 Elsevier Ltd. All rights reserved.
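The agreement and accuracy statistics named in this abstract (Cohen's kappa, sensitivity, negative predictive value) can all be computed from 2×2 tables. The sketch below shows the standard formulas; the function names are illustrative and the cell counts are hypothetical, because the abstract does not report them.

```python
def cohen_kappa_2x2(a, b, c, d):
    """Hypothetical helper: Cohen's kappa for a 2x2 test/retest table.

    a = positive at both times, b = positive then negative,
    c = negative then positive, d = negative at both times.
    """
    n = a + b + c + d
    observed = (a + d) / n
    # Chance agreement from the marginal totals.
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / (n * n)
    return (observed - expected) / (1 - expected)

def sensitivity_and_npv(tp, fn, tn):
    """Sensitivity and negative predictive value against a benchmark."""
    return tp / (tp + fn), tn / (tn + fn)

# Hypothetical retest table with 45 respondents (counts are made up).
kappa = cohen_kappa_2x2(a=10, b=2, c=3, d=30)

# Hypothetical classification counts for a short form versus a benchmark.
sens, npv = sensitivity_and_npv(tp=12, fn=1, tn=199)
print(round(kappa, 2), round(sens, 3), round(npv, 3))
```

Kappa corrects raw agreement for the agreement expected by chance, which matters when one category (here, no diagnosis) dominates the sample.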
Test-retest reliability and stability of N400 effects in a word-pair semantic priming paradigm.
Kiang, Michael; Patriciu, Iulia; Roy, Carolyn; Christensen, Bruce K; Zipursky, Robert B
2013-04-01
Elicited by any meaningful stimulus, the N400 event-related potential (ERP) component is reduced when the stimulus is related to a preceding one. This N400 semantic priming effect has been used to probe abnormal semantic relationship processing in clinical disorders, and suggested as a possible biomarker for treatment studies. Validating N400 semantic priming effects as a clinical biomarker requires characterizing their test-retest reliability. We assessed test-retest reliability of N400 semantic priming in 16 healthy adults who viewed the same related and unrelated prime-target word pairs in two sessions one week apart. As expected, N400 amplitudes were smaller for related versus unrelated targets across sessions. N400 priming effects (amplitude differences between unrelated and related targets) were highly correlated across sessions (r=0.85, P<0.0001), but smaller in the second session due to larger N400s to related targets. N400 priming effects have high reliability over a one-week interval. They may decrease with repeat testing, possibly because of motivational changes. Use of N400 priming effects in treatment studies should account for possible magnitude decreases with repeat testing. Further research is needed to delineate N400 priming effects' test-retest reliability and stability in different age and clinical groups, and with different stimulus types. Copyright © 2012 International Federation of Clinical Neurophysiology. Published by Elsevier Ireland Ltd. All rights reserved.
Purba, Fredrick Dermawan; Hunfeld, Joke A M; Iskandarsyah, Aulia; Fitriana, Titi Sahidah; Sadarjoen, Sawitri S; Passchier, Jan; Busschbach, Jan J V
2018-01-01
The objective of this study is to obtain population norms and to assess test-retest reliability of EQ-5D-5L and WHOQOL-BREF for the Indonesian population. A representative sample of 1056 people aged 17-75 years was recruited from the Indonesian general population. We used a multistage stratified quota sampling method with respect to residence, gender, age, education level, religion and ethnicity. Respondents completed EQ-5D-5L and WHOQOL-BREF with help from an interviewer. Norms data for both instruments were reported. For the test-retest evaluations, a sub-sample of 206 respondents completed both instruments twice. The total sample and test-retest sub-sample were representative of the Indonesian general population. The EQ-5D-5L shows almost perfect agreement between the two tests (Gwet's AC: 0.85-0.99 and percentage agreement: 90-99%) regarding the five dimensions. However, the agreement of EQ-VAS and index scores can be considered as poor (ICC: 0.45 and 0.37 respectively). For the WHOQOL-BREF, ICCs of the four domains were between 0.70 and 0.79, which indicates moderate to good agreement. For EQ-5D-5L, it was shown that female and older respondents had lower EQ-index scores, whilst rural, younger and higher-educated respondents had higher EQ-VAS scores. For WHOQOL-BREF: male, younger, higher-educated, high-income respondents had the highest scores in most of the domains, overall quality of life, and health satisfaction. This study provides representative estimates of self-reported health status and quality of life for the general Indonesian population as assessed by the EQ-5D-5L and WHOQOL-BREF instruments. The descriptive system of the EQ-5D-5L and the WHOQOL-BREF have high test-retest reliability while the EQ-VAS and the index score of EQ-5D-5L show poor agreement between the two tests. Our results can be useful to researchers and clinicians who can compare their findings with respect to these concepts with those of the Indonesian general population.
Johansson, Jarkko; Alakurtti, Kati; Joutsa, Juho; Tohka, Jussi; Ruotsalainen, Ulla; Rinne, Juha O
2016-10-01
The striatum is the primary target in regional [11C]raclopride-PET studies, and despite its small volume, it contains several functional and anatomical subregions. The outcome of the quantitative dopamine receptor study using [11C]raclopride-PET depends heavily on the quality of the region-of-interest (ROI) definition of these subregions. The aim of this study was to evaluate subregional analysis techniques because new approaches have emerged, but have not yet been compared directly. In this paper, we compared manual ROI delineation with several automatic methods. The automatic methods used either direct clustering of the PET image or individualization of chosen brain atlases on the basis of MRI or PET image normalization. State-of-the-art normalization methods and atlases were applied, including those provided in the FreeSurfer, Statistical Parametric Mapping 8, and FSL software packages. Evaluation of the automatic methods was based on voxel-wise congruity with the manual delineations and the test-retest variability and reliability of the outcome measures using data from seven healthy male participants who were scanned twice with [11C]raclopride-PET on the same day. The results show that both manual and automatic methods can be used to define striatal subregions. Although most of the methods performed well with respect to the test-retest variability and reliability of binding potential, the smallest average test-retest variability and SEM were obtained using a connectivity-based atlas and PET normalization (test-retest variability=4.5%, SEM=0.17). The current state-of-the-art automatic ROI methods can be considered good alternatives for subjective and laborious manual segmentation in [11C]raclopride-PET studies.
[Turkish validity and reliability study of fear of pain questionnaire-III].
Ünver, Seher; Turan, Fatma Nesrin
2018-01-01
This study aimed to develop a Turkish version of the Fear of Pain Questionnaire-III developed by McNeil and Rainwater (1998) and examine its validity and reliability indicators. The study was conducted with 459 university students studying in the nursing department. The Turkish translation of the scale was conducted by language experts and the original scale owner. Expert opinions were taken for language validity, and Lawshe's content validity ratio formula was used to calculate the content validity. Exploratory factor analysis was used to assess the construct validity. The factors were rotated using the Varimax (orthogonal) rotation method. Reliability was assessed with the internal consistency coefficient and test-retest reliability. Exploratory factor analysis using the three-factor model (explaining 50.5% of the total variance) revealed that the item factor loadings were above the limit value of 0.30, which indicated that the questionnaire had good construct validity. The Cronbach's alpha value for the total questionnaire was 0.938, and the test-retest value was 0.846 for the total scale. The Turkish version of the Fear of Pain Questionnaire-III had sufficiently high reliability and validity to be used as a tool in evaluating the fear of pain among the young Turkish population.
Roberts, Tawna L; Kester, Kristi N; Hertle, Richard W
2018-04-01
This study presents test-retest reliability of optotype visual acuity (OVA) across 60° of horizontal gaze position in patients with infantile nystagmus syndrome (INS). Also, the validity of the metric gaze-dependent functional vision space (GDFVS) is shown in patients with INS. In experiment 1, OVA was measured twice in seven horizontal gaze positions from 30° left to 30° right in 10° steps in 20 subjects with INS and 14 without INS. Test-retest reliability was assessed using the intraclass correlation coefficient (ICC) in each gaze position. OVA area under the curve (AUC) was calculated with horizontal eye position on the x-axis and logMAR visual acuity on the y-axis, and then converted to GDFVS. In experiment 2, validity of GDFVS was determined over 40° horizontal gaze by applying the 95% limits of agreement from experiment 1 to pre- and post-treatment GDFVS values from 85 patients with INS. In experiment 1, test-retest reliability for OVA was high (ICC ≥ 0.88) as the difference in test-retest was on average less than 0.1 logMAR in each gaze position. In experiment 2, as a group, INS subjects had a significant increase (P < 0.001) in the size of their GDFVS that exceeded the 95% limits of agreement found during test-retest. OVA is a reliable measure in INS patients across 60° of horizontal gaze position. GDFVS is a valid clinical method to be used to quantify OVA as a function of eye position in INS patients. This method captures the dynamic nature of OVA in INS patients and may be a valuable measure to quantify visual function in patients with INS, particularly in quantifying change as part of clinical studies.
Huang, Sheau-Ling; Hsieh, Ching-Lin; Wu, Ruey-Meei
2017-01-01
Background: The Beck Depression Inventory II (BDI-II) and the Taiwan Geriatric Depression Scale (TGDS) are self-report scales used for assessing depression in patients with Parkinson’s disease (PD) and geriatric people. The minimal detectable change (MDC) represents the least amount of change that indicates real difference (i.e., beyond random measurement error) for a single subject. Our aim was to investigate the test-retest reliability and MDC of the BDI-II and the TGDS in people with PD. Methods: Seventy patients were recruited from special clinics for movement disorders at a medical center. The patients’ mean age was 67.7 years, and 63.0% of the patients were male. All patients were assessed with the BDI-II and the TGDS twice, 2 weeks apart. We used the intraclass correlation coefficient (ICC) to determine the reliability between test and retest. We calculated the MDC based on standard error of measurement. The MDC% was calculated (i.e., by dividing the MDC by the possible maximal score of the measure). Results: The test-retest reliabilities of the BDI-II/TGDS were high (ICC = 0.86/0.89). The MDCs (MDC%s) of the BDI-II and TGDS were 8.7 (13.8%) and 5.4 points (18.0%), respectively. Both measures had acceptable to nearly excellent random measurement errors. Conclusions: The test-retest reliabilities of the BDI-II and the TGDS are high. The MDCs of both measures are acceptable to nearly excellent in people with PD. These findings imply that the BDI-II and the TGDS are suitable for use in a research context and in clinical settings to detect real change in a single subject. PMID:28945776
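The MDC calculation described in this abstract can be sketched as follows. The MDC% step follows the abstract directly (MDC divided by the maximum possible score). The SEM and MDC formulas (SEM = SD × sqrt(1 − ICC), MDC95 = 1.96 × sqrt(2) × SEM) are the commonly used ones and are assumed here rather than confirmed by the source. The maximum scores of 63 (BDI-II) and 30 (TGDS) are also assumptions, although they reproduce the reported percentages; the SD in the last example is hypothetical.

```python
import math

def standard_error_of_measurement(sd, icc):
    """SEM from the between-subject SD and the test-retest ICC (assumed formula)."""
    return sd * math.sqrt(1.0 - icc)

def mdc95(sem):
    """Minimal detectable change at the 95% confidence level (assumed formula)."""
    return 1.96 * math.sqrt(2.0) * sem

def mdc_percent(mdc, max_score):
    """MDC expressed as a percentage of the maximum possible score."""
    return 100.0 * mdc / max_score

# Reported MDCs with assumed maximum scores of 63 and 30.
bdi_pct = mdc_percent(8.7, 63)
tgds_pct = mdc_percent(5.4, 30)
print(round(bdi_pct, 1), round(tgds_pct, 1))

# Hypothetical scale with SD = 10 and ICC = 0.84.
hypothetical_mdc = mdc95(standard_error_of_measurement(sd=10.0, icc=0.84))
print(round(hypothetical_mdc, 1))
```

Expressing the MDC as a percentage of the score range makes measurement error comparable across instruments with different scales.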
Reliability and Validity of the Korean Version of the Internet Addiction Test among College Students
Lee, Kounseok; Lee, Hye-Kyung; Gyeong, Hyunsu; Yu, Byeongkwan; Song, Yul-Mai
2013-01-01
We developed a Korean translation of the Internet Addiction Test (KIAT), a widely used self-report measure of internet addiction, and tested its reliability and validity in a sample of college students. Two hundred seventy-nine college students at a national university completed the KIAT. Internal consistency and two-week test-retest reliability were calculated from the data, and principal component factor analysis was conducted. Participants also completed the Internet Addiction Diagnostic Questionnaire (IADQ), the Korea Internet addiction scale (K-scale), and the Patient Health Questionnaire-9 to assess criterion validity. Cronbach's alpha of the whole scale was 0.91, and test-retest reliability was also good (r = 0.73). The IADQ, the K-scale, and depressive symptoms were significantly correlated with the KIAT scores, demonstrating concurrent and convergent validity. The factor analysis extracted four factors (Excessive use, Dependence, Withdrawal, and Avoidance of reality) that accounted for 59% of total variance. The KIAT has outstanding internal consistency and high test-retest reliability. Also, the factor structure and validity data show that the KIAT is comparable to the original version. Thus, the KIAT is a psychometrically sound tool for assessing internet addiction in the Korean-speaking population. PMID:23678270
Development and evaluation of oral Cancer quality-of-life questionnaire (QOL-OC).
Nie, Min; Liu, Chang; Pan, Yi-Chen; Jiang, Chen-Xi; Li, Bao-Ru; Yu, Xi-Jie; Wu, Xin-Yu; Zheng, Shu-Ning
2018-05-03
In this study, scales and items for the Oral Cancer Quality-of-life Questionnaire (QOL-OC) were designed and the instrument was evaluated. The QOL-OC was developed and modified using the international definition of quality of life (QOL) promulgated by the European Organization for Research and Treatment of Cancer (EORTC) and analysis of the precedent measuring instruments. The contents of each item were determined in the context of the specific characteristics of oral cancer. Two hundred thirteen oral cancer patients were asked to complete both the EORTC core quality of life questionnaire (EORTC QLQ-C30) and the QOL-OC. The data collected were used for factor analysis and to assess test-retest reliability, internal consistency, and construct validity. Questionnaire compliance was relatively high. Fourteen of the 213 subjects completed the same tests again after 24 to 48 h, demonstrating high test-retest reliability for all five scales. Overall internal consistency surpassed 0.8. The outcome of the factor analysis coincides substantially with our theoretical conception. Each item shows a higher correlation coefficient within its own scale than with the others, which indicates high construct validity. The QOL-OC demonstrates fairly good statistical reliability, validity, and feasibility. However, further tests and modification are needed to ensure its applicability to the quality-of-life assessment of Chinese oral cancer patients.
Reliability of the Dutch translation of the Kujala Patellofemoral Score Questionnaire.
Ummels, P E J; Lenssen, A F; Barendrecht, M; Beurskens, A J H M
2017-01-01
There are no Dutch-language disease-specific questionnaires available for patients with patellofemoral pain syndrome that could help Dutch physiotherapists to assess and monitor these symptoms and functional limitations. The aim of this study was to translate the original disease-specific Kujala Patellofemoral Score into Dutch and evaluate its reliability. The questionnaire was translated from English into Dutch in accordance with internationally recommended guidelines. Reliability was determined in 50 stable subjects with an interval of 1 week. The patient inclusion criteria were age between 14 and 60 years; knowledge of the Dutch language; and the presence of at least three of the following symptoms: pain while taking the stairs, pain when squatting, pain when running, pain when cycling, pain when sitting with knees flexed for a prolonged period, grinding of the patella and a positive clinical patella test. The internal consistency, test-retest reliability, measurement error and limits of agreement were calculated. Internal consistency was 0.78 for the first assessment and 0.80 for the second assessment. The intraclass correlation coefficient (ICCagreement) between the first and second assessments was 0.98. The mean difference between the first and second measurements was 0.64, and the standard deviation was 5.51. The standard error of measurement was 3.9, and the smallest detectable change was 11. The Bland and Altman plot shows that the limits of agreement are -10.37 and 11.65. The results of the present study indicate that the test-retest reliability of the translated Dutch version of the Kujala Patellofemoral Score questionnaire is equivalent to that of the original English-language version, and that the Dutch version has good internal consistency. Trial registration NTR (TC = 3258). Copyright © 2015 John Wiley & Sons, Ltd.
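The Bland and Altman limits of agreement reported in this abstract are conventionally computed as the mean difference ± 1.96 × the standard deviation of the differences. The sketch below applies that conventional formula, which is assumed rather than confirmed by the source, to the rounded summary figures from the abstract; the function name is illustrative.

```python
def limits_of_agreement(mean_diff, sd_diff, z=1.96):
    """Hypothetical helper: Bland-Altman 95% limits of agreement.

    z = 1.96 is the conventional multiplier; some authors use 2.
    """
    half_width = z * sd_diff
    return mean_diff - half_width, mean_diff + half_width

# Rounded summary values from the abstract: mean difference 0.64, SD 5.51.
lower, upper = limits_of_agreement(0.64, 5.51)
print(round(lower, 2), round(upper, 2))
```

Computed this way from the rounded figures, the limits come out at roughly -10.16 and 11.44, slightly narrower than the reported -10.37 and 11.65. The gap may reflect rounding of the inputs or a slightly different multiplier; the abstract does not say which.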
Cross-cultural adaptation of VISA-P score for patellar tendinopathy in Turkish population.
Çelebi, Mehmet Mesut; Köse, Serdal Kenan; Akkaya, Zehra; Zergeroglu, Ali Murat
2016-01-01
The VISA-P questionnaire assesses the severity of symptoms and treatment effects in athletes with patellar tendinopathy. The purpose of this study was to translate the VISA-P questionnaire into Turkish and to determine its validity and reliability. The English version of the VISA-P questionnaire was translated into Turkish according to internationally recommended guidelines. Test-retest reliability was determined in 89 participants with a time interval of 24 h. To determine the validity of the Turkish VISA-P, 31 (17 male, 14 female) healthy students, 34 (20 male, 14 female) patients with patellar tendinopathy (diagnosed by physical examination and ultrasonography) and 24 (16 male, 8 female) volleyball players (an at-risk population) completed the VISA-P-Tr. Internal consistency was determined with Cronbach's alpha. Intraclass correlation coefficients (ICCs) were calculated to analyse test-retest reliability. To assess discrimination, VISA-P-Tr scores were compared across all groups using the Mann-Whitney U test. The VISA-P-Tr questionnaire showed good internal consistency and test-retest reliability (Cronbach's alpha was 0.79 and 0.78, respectively, and the ICC was 0.96). The VISA-P-Tr scores (mean ± SD) were 93.7 ± 8.9 and 94.0 ± 8.1 for healthy students, 81.1 ± 13.7 and 80.7 ± 13.4 for volleyball players, and 58.8 ± 12.1 and 58.5 ± 11.0 for athletes with patellar tendinopathy. The translated Turkish version of the VISA-P has good internal consistency, reliability and validity. Therefore, the VISA-P-Tr is useful for evaluating symptoms and following the treatment effect in athletes with patellar tendinopathy.
Gamage, Prasanna J; Fortington, Lauren V; Finch, Caroline F
2018-01-01
Cricket is a very popular sport in Sri Lanka. In this setting there has been limited research; specifically, there is little knowledge of cricket injuries. To support future research possibilities, the aim of this study was to cross-culturally adapt, translate and test the reliability of an Australian-developed questionnaire for the Sri Lankan context. The Australian 'Juniors Enjoying Cricket Safely' (JECS-Aus) injury risk perception questionnaire was cross-culturally adapted to suit the Sri Lankan context and subsequently translated into the two main languages (Sinhala and Tamil) based on standard forward-back translation. The translated questionnaires were examined for content validity by two language schoolteachers. The questionnaires were completed twice, 2 weeks apart, by two groups of school cricketers (males) aged 11-15 years (Sinhala (n=24), Tamil (n=30)) to assess reliability. Test-retest scores were evaluated for agreement. Where responses showed <100% agreement, Cohen's kappa (κ) statistics were calculated. Questions with moderate-to-poor test-retest reliability (κ<0.6) were reconsidered for modification. Both the Sinhala and Tamil questionnaires had 100% agreement for questions on demographic data, and 88%-100% agreement for questions on participation in cricket and injury history. Of the injury risk perception questions, 72% (Sinhala) and 90% (Tamil) showed substantial (κ=0.61-0.8) or almost perfect (κ=0.81-1.0) test-retest agreement. The adapted and translated JECS-SL questionnaire demonstrated strong reliability. This is the first study to adapt the JECS-Aus questionnaire for use in a different population, providing an outcome measure for assessing injury risk perceptions in Sri Lankan junior cricketers.
The Pareidolia Test: A Simple Neuropsychological Test Measuring Visual Hallucination-Like Illusions.
Mamiya, Yasuyuki; Nishio, Yoshiyuki; Watanabe, Hiroyuki; Yokoi, Kayoko; Uchiyama, Makoto; Baba, Toru; Iizuka, Osamu; Kanno, Shigenori; Kamimura, Naoto; Kazui, Hiroaki; Hashimoto, Mamoru; Ikeda, Manabu; Takeshita, Chieko; Shimomura, Tatsuo; Mori, Etsuro
2016-01-01
Visual hallucinations are a core clinical feature of dementia with Lewy bodies (DLB), and this symptom is important in the differential diagnosis and prediction of treatment response. The pareidolia test is a tool that evokes visual hallucination-like illusions, and these illusions may be a surrogate marker of visual hallucinations in DLB. We created a simplified version of the pareidolia test and examined its validity and reliability to establish the clinical utility of this test. The pareidolia test was administered to 52 patients with DLB, 52 patients with Alzheimer's disease (AD) and 20 healthy controls (HCs). We assessed the test-retest/inter-rater reliability using the intra-class correlation coefficient (ICC) and the concurrent validity using the Neuropsychiatric Inventory (NPI) hallucinations score as a reference. A receiver operating characteristic (ROC) analysis was used to evaluate the sensitivity and specificity of the pareidolia test to differentiate DLB from AD and HCs. The pareidolia test required approximately 15 minutes to administer, exhibited good test-retest/inter-rater reliability (ICC of 0.82), and moderately correlated with the NPI hallucinations score (rs = 0.42). Using an optimal cut-off score set according to the ROC analysis, the pareidolia test differentiated DLB from AD with a sensitivity of 81% and a specificity of 92%. Our study suggests that the simplified version of the pareidolia test is a valid and reliable surrogate marker of visual hallucinations in DLB.
Hoffman, Hal M; Wolfe, Frederick; Belomestnov, Pavel; Mellis, Scott J
2008-09-01
Development of an instrument for characterization of symptom patterns and severity in patients with cryopyrin-associated periodic syndromes (CAPS). Two generations of daily health assessment forms (DHAFs) were evaluated in this study. The first-generation DHAF queried 11 symptoms. Analyses of results obtained with that instrument identified five symptoms included in a revised second-generation DHAF that was tested for internal consistency and test-retest reliability. This DHAF was also assessed during the initial portion of a phase 3 clinical study of CAPS treatment. Forty-eight CAPS patients provided data for the first-generation DHAFs. Five symptoms (rash, fever, joint pain, eye redness/pain, and fatigue) were included in the revised second-generation DHAF. Symptom severity was highly variable during all study phases with as many as 89% of patients reporting at least one symptom flare, and percentages of days with flares reaching 58% during evaluation of the second-generation instrument. Mean composite key symptom scores (KSSs) computed during evaluation of the second-generation DHAF correlated well with Physician's Global Assessment of Disease Activity (r=0.91, p<0.0001) and patient reports of limitations of daily activities (r=0.68, p<0.0001). Test-retest reliability and Cronbach's alpha were high (0.93 and 0.94, respectively) for the second-generation DHAF. Further evaluation of this DHAF during a baseline period and placebo treatment in a phase 3 clinical study of CAPS patients indicated strong correlations between baseline KSS and Physician's Global Assessment of Disease Activity. Cronbach's alpha at baseline and test-retest reliability were also high. Potentially important study limitations include small sample size, the lack of a standard tool for CAPS symptom assessment against which to validate the DHAF, and no assessment of the instrument's responsiveness to CAPS therapy. The DHAF is a new instrument that may be useful for capturing symptom patterns and severity in CAPS patients and monitoring responses to therapies for these conditions.
Orrung Wallin, Anneli; Edberg, Anna-Karin; Beck, Ingela; Jakobsson, Ulf
2013-01-01
There are many instruments assessing the wellbeing of staff, but far from all have been psychometrically investigated. When evaluating supportive interventions directed toward nurse assistants in residential care, valid and reliable instruments are needed in order to detect possible changes. The aim of the study was to investigate validity in terms of data quality, construct validity, convergent and divergent validity and reliability in terms of the internal consistency and stability of the Job Satisfaction Questionnaire, the Psychosocial Aspects of Job Satisfaction, the Strain in Dementia Care Scale (SDCS), and the Stress of Conscience Questionnaire (SCQ) in a residential care context. The psychometric properties of the instruments were investigated in terms of data quality, construct validity, convergent and divergent validity and reliability, including test-retest reliability, in a residential care context with a sample consisting of nurse assistants (n=114). The four instruments responded with different psychometric-related problems such as internal missing data, floor and ceiling effects, problems with construct validity and low test-retest reliability, especially when assessed on the item level. These problems were however reduced or disappeared completely when assessed for total and factor scores. From a psychometric perspective, the SDCS seemed to stand out as the best instrument. However, it should be modified in order to reduce floor effects on item level and thereby gain sensitivity. The Job Satisfaction Questionnaire seemed to have problems both with the construct validity and test-retest reliability. The final choice of instrument must, however, be made dependent on what one intends to measure. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
Validity and reliability of Optojump photoelectric cells for estimating vertical jump height.
Glatthorn, Julia F; Gouge, Sylvain; Nussbaumer, Silvio; Stauffacher, Simone; Impellizzeri, Franco M; Maffiuletti, Nicola A
2011-02-01
Vertical jump is one of the most prevalent acts performed in several sport activities. It is therefore important to ensure that the measurements of vertical jump height made as a part of research or athlete support work have adequate validity and reliability. The aim of this study was to evaluate concurrent validity and reliability of the Optojump photocell system (Microgate, Bolzano, Italy) with force plate measurements for estimating vertical jump height. Twenty subjects were asked to perform maximal squat jumps and countermovement jumps, and flight time-derived jump heights obtained by the force plate were compared with those provided by Optojump, to examine its concurrent (criterion-related) validity (study 1). Twenty other subjects completed the same jump series on 2 different occasions (separated by 1 week), and jump heights of session 1 were compared with session 2, to investigate test-retest reliability of the Optojump system (study 2). Intraclass correlation coefficients (ICCs) for validity were very high (0.997-0.998), even if a systematic difference was consistently observed between force plate and Optojump (-1.06 cm; p < 0.001). Test-retest reliability of the Optojump system was excellent, with ICCs ranging from 0.982 to 0.989, low coefficients of variation (2.7%), and low random errors (±2.81 cm). The Optojump photocell system demonstrated strong concurrent validity and excellent test-retest reliability for the estimation of vertical jump height. We propose the following equation that allows force plate and Optojump results to be used interchangeably: force plate jump height (cm) = 1.02 × Optojump jump height + 0.29. In conclusion, the use of Optojump photoelectric cells is legitimate for field-based assessments of vertical jump height.
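The conversion equation proposed in this abstract is simple enough to express directly. The sketch below wraps it in a function and adds its algebraic inverse; the function names and the 30 cm example are illustrative and do not come from the study.

```python
def force_plate_height_cm(optojump_height_cm):
    """Convert an Optojump jump height (cm) to the force plate equivalent,
    using the equation given in the abstract: 1.02 * Optojump + 0.29."""
    return 1.02 * optojump_height_cm + 0.29

def optojump_height_cm(force_plate_cm):
    """Algebraic inverse of the equation above (added here for convenience)."""
    return (force_plate_cm - 0.29) / 1.02

# Illustrative value: a 30 cm jump measured with Optojump.
converted = force_plate_height_cm(30.0)
print(round(converted, 2))
```

The inverse is only the rearranged equation; the abstract itself reports the conversion in the Optojump-to-force-plate direction.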
Development and validation of the Myasthenia Gravis Impairment Index.
Barnett, Carolina; Bril, Vera; Kapral, Moira; Kulkarni, Abhaya; Davis, Aileen M
2016-08-30
We aimed to develop a measure of myasthenia gravis impairment using a previously developed framework and to evaluate reliability and validity, specifically face, content, and construct validity. The first draft of the Myasthenia Gravis Impairment Index (MGII) included examination items from available measures enriched with newly developed, patient-reported items, modified after patient input. International neuromuscular specialists evaluated face and content validity via an e-mail survey. Test-retest reliability was assessed in stable patients at a 3-week interval and interrater reliability was evaluated in the same day. Construct validity was assessed through correlations between the MGII and other measures and by comparing scores in different patient groups. The first draft was assessed by 18 patients, and 72 specialists answered the survey. The second draft had 7 examination and 22 patient-reported items. Field testing included 200 patients, with 54 patients completing the reliability studies. Test-retest reliability of the total score was good (intraclass correlation coefficient 0.92; 95% confidence interval 0.79-0.94), as was interrater reliability of the examination component (intraclass correlation coefficient 0.81; 95% confidence interval 0.79-0.94). The MGII correlated well with comparison measures, with higher correlations with the MG-activities of daily living (r = 0.91) and MG-specific quality of life 15-item scale (r = 0.78). When assessing different patient groups, the scores followed expected patterns. The MGII was developed using a patient-centered framework of myasthenia-related impairments and incorporating patient input throughout the development process. It is reliable in an outpatient setting and has demonstrated construct validity. Responsiveness studies are under way. © 2016 American Academy of Neurology.
Ruan, W June; Goldstein, Risë B; Chou, S Patricia; Smith, Sharon M; Saha, Tulshi D; Pickering, Roger P; Dawson, Deborah A; Huang, Boji; Stinson, Frederick S; Grant, Bridget F
2008-01-01
This study presents test-retest reliability statistics and information on internal consistency for new diagnostic modules and risk factors for alcohol, drug, and psychiatric disorders from the Alcohol Use Disorder and Associated Disabilities Interview Schedule-IV (AUDADIS-IV). Test-retest statistics were derived from a random sample of 1899 adults selected from 34,653 respondents who participated in the 2004-2005 Wave 2 National Epidemiologic Survey on Alcohol and Related Conditions (NESARC). Internal consistency of continuous scales was assessed using the entire Wave 2 NESARC. Both test and retest interviews were conducted face-to-face. Test-retest and internal consistency results for diagnoses and symptom scales associated with posttraumatic stress disorder, attention-deficit/hyperactivity disorder, and borderline, narcissistic, and schizotypal personality disorders were predominantly good (kappa>0.63; ICC>0.69; alpha>0.75) and reliability for risk factor measures fell within the good to excellent range (intraclass correlations=0.50-0.94; alpha=0.64-0.90). The high degree of reliability found in this study suggests that new AUDADIS-IV diagnostic measures can be useful tools in research settings. The availability of highly reliable measures of risk factors for alcohol, drug, and psychiatric disorders will contribute to the validity of conclusions drawn from future research in the domains of substance use disorder and psychiatric epidemiology.
Standardization of Brief Inventory of Social Support Exchange Network (BISSEN) in Japan.
Aiba, Miyuki; Tachikawa, Hirokazu; Fukuoka, Yoshiharu; Lebowitz, Adam; Shiratori, Yuki; Doi, Nagafumi; Matsui, Yutaka
2017-07-01
This study describes the Brief Inventory of Social Support Exchange Network (BISSEN) as a standardized brief inventory measuring various aspects of social support. We confirmed the reliability and validity for function and direction of support and standardized the BISSEN. For Sample 1, a stratified random sampling method was used to select 5200 residents in Japan. We conducted mail surveys and responses were retrieved from 2274 participants (collection rate 43.7%). Participants completed a questionnaire packet that included BISSEN, suicidal ideation, depression, support seeking, and Multidimensional Scale of Perceived Social Support (MSPSS). Sample 2 surveys for test-retest reliability were conducted on 23 residents at approximately two-week intervals. Participants were asked about gender, age, and BISSEN. First, we assessed the internal consistency, test-retest reliability, and construct, convergent, and concurrent validity. McDonald's omega (.73-.92) and test-retest correlations (.78-.85) demonstrated adequate internal consistency and test-retest reliability. Depression, support seeking, and MSPSS were significantly correlated with all scores of BISSEN. The non-suicidal ideation group had significantly more support compared to the suicidal ideation group. Therefore, function and direction of support in BISSEN had sufficient reliability and validity. Next, we standardized BISSEN using Z-scores and percentile ranks for each of 12 norm groups defined by age and gender. Copyright © 2017 Elsevier Ireland Ltd. All rights reserved.
Ruiz, Begoña; Urzúa, Iván; Cabello, Rodrigo; Rodríguez, Gonzalo; Espelid, Ivar
2013-01-01
To translate and validate a Spanish version of the "Questionnaire on the treatment of approximal and occlusal caries" as a method of collecting information about treatment decisions on caries management in Chilean primary health care services. The original questionnaire proposed by Espelid et al. was translated into Spanish using the forward-backward translation technique. Subsequently, validation of the Spanish version was undertaken. Data were collected from two separate samples; first, from 132 Spanish-speaking dentists recruited from primary health care services and second, from 21 individuals characterised as cariologists. Internal consistency was evaluated by the generation of Cronbach's alpha, test-retest reliability was evaluated by Cohen's kappa, convergent validity was evaluated by comparing the total scale scores to a global evaluation of treatment trends and discriminant validity was evaluated by investigating the differences in total scale scores between the Spanish-speaking dentist and cariologist samples. Cronbach's alpha indicated an internal consistency of 0.63 for the entire scale. Cohen's kappa correlation coefficient expressed a test-retest reliability of 0.83. Convergent validity determined a Pearson's correlation coefficient of 0.24 (p < 0.01). The comparison of proportions (chi-squared) indicated that discriminant validity was statistically significant (p < 0.01), using a one-tailed test. The Spanish version of the "Questionnaire on the treatment of approximal and occlusal caries" is a valid and reliable instrument for collecting information regarding treatment decisions in cariology. The clinical relevance of this study is to acquire a reliable instrument that allows for the determination of treatment decisions in Spanish-speaking dentists.
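The abstract above reports test-retest agreement as Cohen's kappa. As a rough illustration of what that statistic measures, the following is a minimal sketch in plain Python; the function name `cohens_kappa` and the example answers are hypothetical and are not taken from the study.

```python
from collections import Counter


def cohens_kappa(first, second):
    """Unweighted Cohen's kappa for two sets of categorical answers.

    `first` and `second` hold the answers given by the same respondents
    on the test and on the retest (hypothetical data in this sketch).
    """
    n = len(first)
    # Proportion of respondents who gave the same answer both times.
    observed = sum(a == b for a, b in zip(first, second)) / n
    # Agreement expected by chance, from the marginal answer frequencies.
    c1, c2 = Counter(first), Counter(second)
    expected = sum(c1[c] * c2[c] for c in set(c1) | set(c2)) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical treatment decisions chosen on two occasions.
test_answers = ["restore", "restore", "monitor", "monitor"]
retest_answers = ["restore", "restore", "monitor", "restore"]
print(cohens_kappa(test_answers, retest_answers))
```

Kappa corrects raw percentage agreement for the agreement expected by chance alone, which is why it is usually lower than the simple proportion of matching answers.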
Reliability and validity of the Korean version of the Connor-Davidson Resilience Scale.
Baek, Hyun-Sook; Lee, Kyoung-Uk; Joo, Eun-Jeong; Lee, Mi-Young; Choi, Kyeong-Sook
2010-06-01
The Connor-Davidson Resilience Scale (CD-RISC) measures various aspects of psychological resilience in patients with posttraumatic stress disorder (PTSD) and other psychiatric ailments. This study sought to assess the reliability and validity of the Korean version of the Connor-Davidson Resilience Scale (K-CD-RISC). In total, 576 participants were enrolled (497 females and 79 males), including hospital nurses, university students, and firefighters. Subjects were evaluated using the K-CD-RISC, the Beck Depression Inventory (BDI), the Impact of Event Scale-Revised (IES-R), the Rosenberg Self-Esteem Scale (RSES), and the Perceived Stress Scale (PSS). Test-retest reliability and internal consistency were examined as a measure of reliability, and convergent validity and factor analysis were also performed to evaluate validity. Cronbach's alpha coefficient and test-retest reliability were 0.93 and 0.93, respectively. The total score on the K-CD-RISC was positively correlated with the RSES (r=0.56, p<0.01). Conversely, BDI (r=-0.46, p<0.01), PSS (r=-0.32, p<0.01), and IES-R scores (r=-0.26, p<0.01) were negatively correlated with the K-CD-RISC. The K-CD-RISC showed a five-factor structure that explained 57.2% of the variance. The K-CD-RISC showed good reliability and validity for measurement of resilience among Korean subjects.
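Internal consistency in the abstract above is summarised with Cronbach's alpha. A minimal sketch of the standard formula, alpha = k/(k-1) × (1 − Σ item variances / variance of total scores), is given below; the function name `cronbach_alpha` and the response matrix are hypothetical illustrations only.

```python
import statistics


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondent rows (one score per item)."""
    k = len(responses[0])
    # Sample variance of each item across respondents.
    item_vars = [
        statistics.variance([row[j] for row in responses]) for j in range(k)
    ]
    # Sample variance of each respondent's total score.
    total_var = statistics.variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical answers of five respondents to a three-item scale.
responses = [
    [4, 3, 4],
    [2, 2, 1],
    [3, 3, 3],
    [4, 4, 3],
    [1, 2, 2],
]
print(round(cronbach_alpha(responses), 3))
```

Alpha rises when items covary strongly relative to their individual variances, so a value near 0.93, as reported above, indicates that the items move together closely.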
Barzegar-Bafrooei, Ebrahim; Bakhtiary, Jalal; Khatoonabadi, Ahmad Reza; Fatehi, Farzad; Maroufizadeh, Saman; Fathali, Mojtaba
2016-01-01
Background: Dysphagia is a common condition affecting many aspects of the patient’s life. The Dysphagia Handicap Index (DHI) is a reliable self-reported questionnaire developed specifically to measure the impact of dysphagia on the patient’s quality of life. The aim of this study was to translate the questionnaire to Persian and to measure its validity and reliability in patients with neurogenic oropharyngeal dysphagia. Methods: A formal forward-backward translation of DHI was performed based on the guidelines for the cross-cultural adaptation of self-report measures. A total of 57 patients with neurogenic dysphagia who were referred to the neurology clinics of Tehran University of Medical Sciences, Iran, participated in this study. Internal consistency reliability of the DHI was examined using Cronbach’s alpha, and test-retest reliability of the scale was evaluated using intraclass correlation coefficient (ICC). Results: The internal consistency of the Persian DHI (P-DHI) was considered to be good; Cronbach’s alpha coefficient for the total P-DHI was 0.88. The test-retest reliability for the total and three subscales of the P-DHI ranged from 0.95 to 0.98 using ICC. Conclusion: The P-DHI demonstrated good reliability, and it can be a valid instrument for evaluating the effects of dysphagia on quality of life in the Persian-speaking population. PMID:27648173
Reliability of tristimulus colourimetry in the assessment of cutaneous bruise colour.
Scafide, Katherine N; Sheridan, Daniel J; Taylor, Laura A; Hayat, Matthew J
2016-06-01
Bruising is one of the most common types of injury clinicians observe among victims of violence and other trauma patients. However, research has shown commonly used qualitative description of cutaneous bruise colour via the naked eye is subjective and unreliable. No published work has formally evaluated the reliability of tristimulus colourimetry as an alternative for assessing bruise colour, despite its clinical and research applications in accurately assessing skin colour. The purpose of this study was to systematically evaluate the test-retest and inter-observer reliability of tristimulus colourimetry in the assessment of cutaneous bruise colour. Two researchers obtained repeated tristimulus colourimetry measures of cutaneous bruises with participants of diverse skin colour. Measures were obtained using the Minolta CR-400 Chromameter. Commission Internationale d'Eclairage (CIE) L*a*b* colour space was used. Data was analysed using intraclass correlation coefficients (ICC), Cronbach's alpha, and minimal detectable change (MDC) on all three L*a*b* values. The colorimeter demonstrated excellent test-retest or intra-rater reliability (L* ICC=0.999; a* ICC=0.973; b* ICC=0.892) and inter-rater reliability (L* ICC=0.997; a* ICC=0.976; b* ICC=0.982). With consistent placement, the tristimulus colourimetry is reliable for the objective assessment and documentation of cutaneous bruise colour for purposes of clinical practice and research. Recommendations for use in practice/research are provided. Copyright © 2016 Elsevier Ltd. All rights reserved.
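The abstract above mentions minimal detectable change (MDC) alongside the ICC but does not state how it was computed. One commonly used convention, assumed here rather than taken from the study, derives the standard error of measurement as SEM = SD × √(1 − ICC) and then MDC95 = 1.96 × √2 × SEM. The sketch below applies it to a hypothetical standard deviation.

```python
import math


def sem_from_icc(sd, icc):
    """Standard error of measurement under the SD * sqrt(1 - ICC) convention."""
    return sd * math.sqrt(1.0 - icc)


def mdc95(sem):
    """Minimal detectable change at the 95% level for a test-retest difference."""
    return 1.96 * math.sqrt(2.0) * sem


# ICC for b* is quoted from the abstract; the SD of 4.0 units is hypothetical.
sem = sem_from_icc(sd=4.0, icc=0.892)
print(round(sem, 2), round(mdc95(sem), 2))
```

Under this convention a change smaller than the MDC cannot be distinguished from measurement noise, so a high ICC translates directly into a smaller detectable change.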
Feenstra, Heleen E M; Murre, Jaap M J; Vermeulen, Ivar E; Kieffer, Jacobien M; Schagen, Sanne B
2018-04-01
To facilitate large-scale assessment of a variety of cognitive abilities in clinical studies, we developed a self-administered online neuropsychological test battery: the Amsterdam Cognition Scan (ACS). The current studies evaluate in a group of adult cancer patients: test-retest reliability of the ACS and the influence of test setting (home or hospital), and the relationship between our online and a traditional test battery (concurrent validity). Test-retest reliability was studied in 96 cancer patients (57 female; M age = 51.8 years) who completed the ACS twice. Intraclass correlation coefficients (ICCs) were used to assess consistency over time. The test setting was counterbalanced between home and hospital; influence on test performance was assessed by repeated measures analyses of variance. Concurrent validity was studied in 201 cancer patients (112 female; M age = 53.5 years) who completed both the online and an equivalent traditional neuropsychological test battery. Spearman or Pearson correlations were used to assess consistency between online and traditional tests. ICCs of the online tests ranged from .29 to .76, with an ICC of .78 for the ACS total score. These correlations are generally comparable with the test-retest correlations of the traditional tests as reported in the literature. Correlating online and traditional test scores, we observed medium to large concurrent validity (r/ρ = .42 to .70; total score r = .78), except for a visuospatial memory test (ρ = .36). Correlations were affected, as expected, by design differences between online tests and their offline counterparts. Although development and optimization of the ACS is an ongoing process, and reliability can be optimized for several tests, our results indicate that it is a highly usable tool to obtain (online) measures of various cognitive abilities. The ACS is expected to facilitate efficient gathering of data on cognitive functioning in the near future.
Mehta, Saurabh P; MacDermid, Joy C; Richardson, Julie; MacIntyre, Norma J; Grewal, Ruby
2015-01-01
Clinical measurement. This study examined test-retest reliability and convergent/divergent construct validity of selected tests and measures that assess balance impairment, fear of falling (FOF), impaired physical activity (PA), and lower extremity muscle strength (LEMS) in females >45 years of age after distal radius fracture (DRF). Twenty-one female participants with DRF were assessed on two occasions. Timed Up and Go, Functional Reach, and One Leg Standing tests assessed balance impairment. Shortened Falls Efficacy Scale, Activity-specific Balance Confidence scale, and Fall Risk Perception Questionnaire assessed FOF. International Physical Activity Questionnaire and Rapid Assessment of Physical Activity were administered to assess PA level. Chair stand test and isometric muscle strength testing for hip and knee assessed LEMS. Intraclass correlation coefficients (ICC) examined the test-retest reliability of the measures. Pearson correlation coefficients (r) examined concurrent relationships between the measures. The results demonstrated fair to excellent test-retest reliability (ICC between 0.50 and 0.96) and low to moderate concordance between the measures (low if r ≤ 0.4; moderate if r = 0.4-0.7). The results provide preliminary estimates of test-retest reliability and convergent/divergent construct validity of selected measures associated with increased risk for falling in females >45 years of age after DRF. Further research directions to advance knowledge regarding fall risk assessment in the DRF population have been identified. Copyright © 2015 Hanley & Belfus. Published by Elsevier Inc. All rights reserved.
Kolotkin, Ronette L; Crosby, Ross D
2002-03-01
The short form of impact of weight on quality of life (IWQOL)-Lite is a 31-item, self-report, obesity-specific measure of health-related quality of life (HRQOL) that consists of a total score and scores on each of five scales--physical function, self-esteem, sexual life, public distress, and work--and that exhibits strong psychometric properties. This study was undertaken in order to assess test-retest reliability and discriminant validity in a heterogeneous sample of individuals not in treatment. Individuals were recruited from the community to complete questionnaires that included the IWQOL-Lite, SF-36, Rosenberg self-esteem (RSE) scale, Marlowe-Crowne social desirability scale, global ratings of quality of life, and sexual functioning and public distress ratings. Persons currently enrolled in weight loss programs or with a body mass index (BMI) of less than 18.5 were dropped from the analyses, leaving 341 females and 153 males for analysis, with an average BMI of 27.4. For test-retest reliability, 112 participants completed the IWQOL-Lite again. ANOVA revealed significant main effects for BMI for all IWQOL-Lite scales and total score. Females showed greater impairment than males on all scales except public distress. Internal consistency ranged from 0.816 to 0.944 for IWQOL-Lite scales and was 0.958 for total score. Test-retest reliability ranged from 0.814 to 0.877 for scales and was 0.937 for total score. Internal consistency and test-retest results for overweight/obese subjects were similar to those obtained for the total sample. There was strong evidence for convergent and discriminant validity of the IWQOL-Lite in overweight/obese subjects. As in previous studies conducted on treatment-seeking obese persons, the IWQOL-Lite appears to be a reliable and valid measure of obesity-specific quality of life in overweight/obese persons not seeking treatment.
Arifin, Nooranida; Abu Osman, Noor Azuan; Wan Abas, Wan Abu Bakar
2014-04-01
The measurements of postural balance often involve measurement error, which affects the analysis and interpretation of the outcomes. In most of the existing clinical rehabilitation research, the ability to produce reliable measures is a prerequisite for an accurate assessment of an intervention after a period of time. Although clinical balance assessment has been performed in previous studies, none has determined the intrarater test-retest reliability of static and dynamic stability indexes during dominant single stance. In this study, one rater examined 20 healthy university students (female=12, male=8) in two sessions separated by a 7-day interval. Three stability indexes--the overall stability index (OSI), anterior/posterior stability index (APSI), and medial/lateral stability index (MLSI) in static and dynamic conditions--were measured during single dominant stance. Intraclass correlation coefficient (ICC), standard error of measurement (SEM) and 95% confidence interval (95% CI) were calculated. Test-retest ICCs for OSI, APSI, and MLSI were 0.85, 0.78, and 0.84 during static condition and were 0.77, 0.77, and 0.65 during dynamic condition, respectively. We concluded that the postural stability assessment using the Biodex stability system demonstrates good-to-excellent test-retest reliability over a 1-week interval.
Neto, Jose Osni Bruggemann; Gesser, Rafael Lehmkuhl; Steglich, Valdir; Bonilauri Ferreira, Ana Paula; Gandhi, Mihir; Vissoci, João Ricardo Nickenig; Pietrobon, Ricardo
2013-01-01
Background The validation of widely used scales facilitates the comparison across international patient samples. The objective of this study was to translate, culturally adapt and validate the Simple Shoulder Test into Brazilian Portuguese. Also we test the stability of factor analysis across different cultures. Objective The objective of this study was to translate, culturally adapt and validate the Simple Shoulder Test into Brazilian Portuguese. Also we test the stability of factor analysis across different cultures. Methods The Simple Shoulder Test was translated from English into Brazilian Portuguese, translated back into English, and evaluated for accuracy by an expert committee. It was then administered to 100 patients with shoulder conditions. Psychometric properties were analyzed including factor analysis, internal reliability, test-retest reliability at seven days, and construct validity in relation to the Short Form 36 health survey (SF-36). Results Factor analysis demonstrated a three factor solution. Cronbach’s alpha was 0.82. Test-retest reliability index as measured by intra-class correlation coefficient (ICC) was 0.84. Associations were observed in the hypothesized direction with all subscales of SF-36 questionnaire. Conclusion The Simple Shoulder Test translation and cultural adaptation to Brazilian-Portuguese demonstrated adequate factor structure, internal reliability, and validity, ultimately allowing for its use in the comparison with international patient samples. PMID:23675436
Benitez-Rosario, Miguel Angel; Caceres-Miranda, Raquel; Aguirre-Jaime, Armando
2016-03-01
A reliable and valid measure of the structure and process of end-of-life care is important for improving the outcomes of care. This study evaluated the validity and reliability of the Spanish adaptation of a satisfaction tool of the Care Evaluation Scale (CES), which was developed in Japan to evaluate palliative care structure and process from the perspective of family members. Standard forward-backward translation and a pilot test were conducted. A multicenter survey was conducted with the relatives of patients admitted to palliative care units for symptom control. The dimensional structure was assessed using confirmatory factor analyses. Concurrent and discriminant validity were tested by correlation with the SERQVHOS, a Spanish hospital care satisfaction scale and with an 11-point rating scale on satisfaction with care. The reliability of the CES was tested by Cronbach α and by test-retest correlation. A total of 284 primary caregivers completed the CES, with low missing response rates. The results of the factor analysis suggested a six-factor solution explaining 69% of the total variance. The CES moderately correlated with the SERQVHOS and with the overall satisfaction scale (intraclass correlation coefficients of 0.66 and 0.44, respectively; P = 0.001). Cronbach α was 0.90 overall and ranged from 0.85 to 0.89 for subdomains. Intraclass correlation coefficient was 0.88 (P = 0.001) for test-retest analysis. The Spanish CES was found to be a reliable and valid measure of the satisfaction with end-of-life care structure and process from family members' perspectives. Copyright © 2016 American Academy of Hospice and Palliative Medicine. Published by Elsevier Inc. All rights reserved.
Aertssen, W F M; Steenbergen, B; Smits-Engelsman, B C M
2018-06-07
There is a lack of valid and reliable field-based tests for assessing functional strength in young children with mild intellectual disabilities (IDs). The aim of this study was to investigate the test-retest reliability and construct validity of the Functional Strength Measurement in children with ID (FSM-ID). Fifty-two children with mild ID (40 boys and 12 girls, mean age 8.48 years, SD = 1.48) were tested with the FSM. Test-retest reliability (n = 32) was examined by a two-way intraclass correlation coefficient for agreement (ICC 2.1A). Standard error of measurement and smallest detectable change were calculated. Construct validity was determined by calculating correlations between the FSM-ID and handheld dynamometry (HHD) (convergent validity), the FSM-ID and the strength subtest of the Bruininks-Oseretsky test of motor proficiency - second edition (BOT-2) (convergent validity), and the FSM-ID and the balance subtest of the BOT-2 (discriminant validity). Test-retest reliability ICC ranged from 0.89 to 0.98. Correlation between the items of the FSM-ID and HHD ranged from 0.39 to 0.79 and between FSM-ID and BOT-2 (strength items) from 0.41 to 0.80. Correlation between items of the FSM-ID and BOT-2 (balance items) ranged from 0.41 to 0.70. The FSM-ID showed good test-retest reliability and good convergent validity with the HHD and the BOT-2 strength subtest. The correlations assessing discriminant validity were higher than expected. Poor levels of postural control and core stability in children with mild IDs may be the underlying factor of those higher correlations. © 2018 MENCAP and International Association of the Scientific Study of Intellectual and Developmental Disabilities and John Wiley & Sons Ltd.
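The abstract above names a two-way ICC for agreement, single measurement (often written ICC(2,1)). As a hedged sketch of how such a coefficient can be obtained from two-way ANOVA mean squares, following the Shrout and Fleiss form ICC(2,1) = (MSR − MSE) / (MSR + (k − 1)·MSE + k·(MSC − MSE)/n), the code below uses a hypothetical function name `icc_2_1` and hypothetical scores; it is not the authors' analysis.

```python
def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single measure.

    `scores` is a list of rows, one per participant, each holding the
    k scores from the k test sessions (hypothetical data in this sketch).
    """
    n = len(scores)
    k = len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Sums of squares for participants, sessions, and residual error.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)              # between-participant mean square
    msc = ss_cols / (k - 1)              # between-session mean square
    mse = ss_err / ((n - 1) * (k - 1))   # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical test and retest scores for five children.
scores = [[12, 13], [18, 17], [9, 10], [15, 15], [21, 22]]
print(round(icc_2_1(scores), 3))
```

Because the agreement form keeps the between-session mean square in the denominator, a systematic shift between test and retest lowers the coefficient even when the rank order of participants is preserved.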
Lubans, David R; Smith, Jordan J; Harries, Simon K; Barnett, Lisa M; Faigenbaum, Avery D
2014-05-01
The aim of this study was to describe the development and assess test-retest reliability and construct validity of the Resistance Training Skills Battery (RTSB) for adolescents. The RTSB provides an assessment of resistance training skill competency and includes 6 exercises (i.e., body weight squat, push-up, lunge, suspended row, standing overhead press, and front support with chest touches). Scoring for each skill is based on the number of performance criteria successfully demonstrated. An overall resistance training skill quotient (RTSQ) is created by adding participants' scores for the 6 skills. Participants (44 boys and 19 girls, mean age = 14.5 ± 1.2 years) completed the RTSB on 2 occasions separated by 7 days. Participants also completed the following fitness tests, which were used to create a muscular fitness score (MFS): handgrip strength, timed push-up, and standing long jump tests. Intraclass correlation (ICC), paired samples t-tests, and typical error were used to assess test-retest reliability. To assess construct validity, gender and RTSQ were entered into a regression model predicting MFS. The rank order repeatability of the RTSQ was high (ICC = 0.88). The model explained 39% of the variance in MFS (p ≤ 0.001) and RTSQ (r = 0.40, p ≤ 0.001) was a significant predictor. This study has demonstrated the construct validity and test-retest reliability of the RTSB in a sample of adolescents. The RTSB can reliably rank participants with regard to their resistance training competency and has the necessary sensitivity to detect small changes in resistance training skill proficiency.
Yang, Nan; Waddington, Gordon; Adams, Roger; Han, Jia
2018-05-01
Quantitative assessments of handedness and footedness are often required in studies of human cognition and behaviour, yet no reliable Chinese versions of commonly used handedness and footedness questionnaires are available. Accordingly, the objective of the present study was to translate the Edinburgh Handedness Inventory (EHI) and the Waterloo Footedness Questionnaire-Revised (WFQ-R) into Mandarin Chinese and to evaluate the reliability and validity of these translated versions in healthy Chinese people. In the first stage of the study, Chinese versions of the EHI and WFQ-R were produced from a process of translation, back translation and examination, with necessary cultural adaptations. The second stage involved determining the reliability and validity of the translated EHI and WFQ-R for the Chinese population. One hundred and ten Chinese participants were tested online, and the results showed that the Cronbach's alpha coefficient of internal consistency was 0.877 for the translated EHI and 0.855 for the translated WFQ-R. Another 170 Chinese participants were tested and re-tested after a 30-day interval. The intra-class correlation coefficients showed high reliability, 0.898 for the translated EHI and 0.869 for the translated WFQ-R. This preliminary validation study found the translated versions to be reliable and valid tools for assessing handedness and footedness in this population.
[Reliability and validity of Meaningful Life Measure-Chinese Revised in Chinese college students].
Xiao, Rong; Lai, Qiao-Zhen; Yang, Jia-Ping
2016-04-20
To test the reliability and validity of Meaningful Life Measure-Chinese Revised (MLM-CR) in Chinese college students. A total of 1035 college students were evaluated with MLM-CR, Satisfaction with Life Scale (SWLS), Purpose in Life (PIL) and Patient Health Questionnaire-2 (PHQ-2), and 120 of the students were examined with PIL-SF twice. All the items in MLM-CR had good discrimination indexes (r=0.753-0.838, P<0.001). Confirmatory factor analysis confirmed the hypothesized five-factor model of MLM-CR (χ²/df=3.4, GFI=0.946, AGFI=0.924, RMR=0.069, NFI=0.953, CFI=0.966, RMSEA=0.048). The total internal consistency reliability of MLM-CR was 0.942, and the alpha coefficients of the 5 dimensions ranged from 0.782 to 0.877; the total split-half reliability was 0.920, and the split-half reliability of the 5 dimensions ranged from 0.752 to 0.830; the total test-retest reliability was 0.871, and the test-retest reliability of the 5 dimensions ranged from 0.783 to 0.805. The criterion validity of MLM-CR in correlation with SWLS, PIL and PHQ-2 was 0.66, 0.755 and -0.388, respectively (P<0.01). The average score of MLM-CR of the college students was 5.20±0.90, and the scores were significantly higher in female students than in male students (P<0.001). MLM-CR has good psychometric properties for application in comprehensive evaluation of personal meaning in life.
Validity and reliability of a new tool to evaluate handwriting difficulties in Parkinson’s disease
Nackaerts, Evelien; Heremans, Elke; Smits-Engelsman, Bouwien C. M.; Broeder, Sanne; Vandenberghe, Wim; Bergmans, Bruno; Nieuwboer, Alice
2017-01-01
Background Handwriting in Parkinson’s disease (PD) features specific abnormalities which are difficult to assess in clinical practice since no specific tool for evaluation of spontaneous movement is currently available. Objective This study aims to validate the ‘Systematic Screening of Handwriting Difficulties’ (SOS-test) in patients with PD. Methods Handwriting performance of 87 patients and 26 healthy age-matched controls was examined using the SOS-test. Sixty-seven patients were tested a second time within a period of one month. Participants were asked to copy as much as possible of a text within 5 minutes with the instruction to write as neatly and quickly as in daily life. Writing speed (letters in 5 minutes), size (mm) and quality of handwriting were compared. Correlation analysis was performed between SOS outcomes and other fine motor skill measurements and disease characteristics. Intrarater, interrater and test-retest reliability were assessed using the intraclass correlation coefficient (ICC) and Spearman correlation coefficient. Results Patients with PD had a smaller (p = 0.043) and slower (p<0.001) handwriting and showed worse writing quality (p = 0.031) compared to controls. The outcomes of the SOS-test significantly correlated with fine motor skill performance and disease duration and severity. Furthermore, the test showed excellent intrarater, interrater and test-retest reliability (ICC > 0.769 for both groups). Conclusion The SOS-test is a short and effective tool to detect handwriting problems in PD with excellent reliability. It can therefore be recommended as a clinical instrument for standardized screening of handwriting deficits in PD. PMID:28253374
Validity, Reliability, and Sensitivity of a Volleyball Intermittent Endurance Test.
Rodríguez-Marroyo, Jose A; Medina-Carrillo, Javier; García-López, Juan; Morante, Juan C; Villa, José G; Foster, Carl
2017-03-01
To analyze the concurrent and construct validity of a volleyball intermittent endurance test (VIET). The VIET's test-retest reliability and sensitivity to assess seasonal changes were also studied. During the preseason, 71 volleyball players of different competitive levels took part in this study. All performed the VIET and a graded treadmill test with gas-exchange measurement (GXT). Thirty-one of the players performed an additional VIET to analyze the test-retest reliability. To test the VIET's sensitivity, 28 players repeated the VIET and GXT at the end of their season. Significant (P < .001) relationships between VIET distance and maximal oxygen uptake (r = .74) and GXT maximal speed (r = .78) were observed. There were no significant differences between the VIET performance test and retest (1542.1 ± 338.1 vs 1567.1 ± 358.2 m). Significant (P < .001) relationships and intraclass correlation coefficient (ICC) were found (r = .95, ICC = .96) for VIET performance. VIET performance increased significantly (P < .001) with player performance level and was sensitive to fitness changes across the season (1458.8 ± 343.5 vs 1581.1 ± 334.0 m, P < .01). The VIET may be considered a valid, reliable, and sensitive test to assess the aerobic endurance in volleyball players.
Reliabilities of mental rotation tasks: limits to the assessment of individual differences.
Hirschfeld, Gerrit; Thielsch, Meinald T; Zernikow, Boris
2013-01-01
Mental rotation tasks with objects and body parts as targets are widely used in cognitive neuropsychology. Even though these tasks are well established to study between-groups differences, the reliability on an individual level is largely unknown. We present a systematic study on the internal consistency and test-retest reliability of individual differences in mental rotation tasks comparing different target types and orders of presentations. In total n = 99 participants (n = 63 for the retest) completed the mental rotation tasks with hands, feet, faces, and cars as targets. Different target types were presented in either randomly mixed blocks or blocks of homogeneous targets. Across all target types, the consistency (split-half reliability) and stability (test-retest reliabilities) were good or acceptable both for intercepts and slopes. At the level of individual targets, only intercepts showed acceptable reliabilities. Blocked presentations resulted in significantly faster and numerically more consistent and stable responses. Mental rotation tasks, especially in blocked variants, can be used to reliably assess individual differences in global processing speed. However, the assessment of the theoretically important slope parameter for individual targets requires further adaptations to mental rotation tests.
Reliability and validity of the Chinese pediatric voice handicap index.
Liu, Kena; Liu, Shaofeng; Zhou, Zhou; Ren, Qinyi; Zhong, Jie; Luo, Renzhong; Qin, Huabiao; Zhang, Siyi; Ge, Pingjiang
2018-02-01
To evaluate the reliability and validity of the Chinese version of the pediatric voice handicap index (pVHI). The original English version of the pVHI was translated into Chinese. Parents of 52 children with dysphonia and 43 children with no history or symptoms of voice problems were asked to complete the Chinese pVHI questionnaire twice, with an interval of 2 weeks. The GRB (Grade, Roughness, Breathiness) scale was used for perceptual assessment of each child's voice by two otolaryngologists and one speech pathologist. Internal consistency was assessed using Cronbach's alpha coefficient. Pearson's correlation coefficient was used to evaluate test-retest reliability. Kendall's coefficient of concordance W was used to assess the consistency of the GRB scores of the 3 voice specialists. The nonparametric Mann-Whitney test was used to assess the differences between the dysphonia group and controls. The correlation between pVHI and GRB scores was assessed using Pearson's correlation coefficient. The internal consistency of the total score and the three subscale scores of the Chinese pVHI was 0.788-0.944. The test-retest reliability was 0.631-0.887 (P < .001). The pVHI scores of the control group were significantly lower than those of the pathological group (P = .000). The GRB scores of the 3 voice specialists showed excellent consistency (W = 0.694-0.807, P = .000). The pVHI scores correlated positively with the GRB assessment (P < .01). The Chinese version of the pVHI had good reliability and validity. It can be an applicable and useful supplementary tool for evaluating parents' perception of their children's dysphonia. Copyright © 2017. Published by Elsevier B.V.
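The internal-consistency statistic named in this abstract, Cronbach's alpha, can be sketched in a few lines. This is a minimal illustration using the standard textbook formula and only the Python standard library; the function name and the input layout (one list of item scores per respondent) are assumptions for illustration and are not taken from the study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score))
    Sample variances (n - 1 denominator) are used throughout.
    """
    k = len(scores[0])
    items = list(zip(*scores))  # transpose: one tuple of scores per item
    item_var_sum = sum(variance(item) for item in items)
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Hypothetical data: 4 respondents answering 3 items.
example = [[2, 3, 3], [4, 4, 5], [1, 2, 1], [3, 3, 4]]
alpha = cronbach_alpha(example)
```

Values closer to 1 indicate that the items vary together; the thresholds used to label a value "good" or "excellent" differ between fields and are not fixed by the formula.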
Olsen, J. Pat; Fellows, Robert P.; Rivera-Mindt, Monica; Morgello, Susan; Byrd, Desiree A.
2015-01-01
The Wide Range Achievement Test, 3rd edition, Reading-Recognition subtest (WRAT-3 RR) is an established measure of premorbid ability. However, its long-term reliability is not well documented, particularly in diverse populations with CNS-relevant disease. Objective: We examined test-retest reliability of the WRAT-3 RR over time in an HIV+ sample of predominantly racial/ethnic minority adults. Method: Participants (N = 88) completed a comprehensive neuropsychological battery, including the WRAT-3 RR, on at least two separate study visits. Intraclass correlation coefficients (ICCs) were computed using scores from baseline and follow-up assessments to determine the test-retest reliability of the WRAT-3 RR across racial/ethnic groups and changes in medical (immunological) and clinical (neurocognitive) factors. Additionally, Fisher’s Z tests were used to determine the significance of the differences between ICCs. Results: The average test-retest interval was 58.7 months (SD=36.4). The overall WRAT-3 RR test-retest reliability was high (r = .97, p < .001), and remained robust across all demographic, medical, and clinical variables (all r’s > .92). Intraclass correlation coefficients did not differ significantly between the subgroups tested (all Fisher’s Z p’s > .05). Conclusions: Overall, this study supports the appropriateness of word-reading tests, such as the WRAT-3 RR, for use as stable premorbid IQ estimates among ethnically diverse groups. Moreover, this study supports the reliability of this measure in the context of change in health and neurocognitive status, and in lengthy inter-test intervals. These findings offer strong rationale for reading as a “hold” test, even in the presence of a chronic, variable disease such as HIV. PMID:26689235
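The Fisher's Z comparison mentioned in this abstract can be illustrated as follows. This is a minimal sketch of the usual two-sample test for correlation coefficients from independent groups; treating an ICC like a Pearson r for this purpose is an assumption, the abstract does not describe the authors' exact procedure, and the function name and example numbers are hypothetical.

```python
from math import atanh, sqrt
from statistics import NormalDist


def fisher_z_test(r1, n1, r2, n2):
    """Two-sided test of whether two independent correlations differ.

    Each r is transformed with Fisher's z = atanh(r); the difference is
    divided by its standard error sqrt(1/(n1 - 3) + 1/(n2 - 3)).
    Returns (z statistic, two-sided p value).
    """
    z1, z2 = atanh(r1), atanh(r2)
    se = sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    z = (z1 - z2) / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


# Hypothetical subgroup reliabilities and sizes, not values from the study.
z_stat, p_value = fisher_z_test(0.97, 40, 0.95, 48)
```

A p value above .05 would be read as no detectable difference in reliability between the two subgroups, which is the pattern the abstract reports.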
Walton, David M; Macdermid, Joy C; Nielson, Warren; Teasell, Robert W; Chiasson, Marco; Brown, Lauren
2011-09-01
Clinical measurement. To evaluate the intrarater, interrater, and test-retest reliability of an accessible digital algometer, and to determine the minimum detectable change in normal healthy individuals and a clinical population with neck pain. Pressure pain threshold testing may be a valuable assessment and prognostic indicator for people with neck pain. To date, most of this research has been completed using algometers that are too resource intensive for routine clinical use. Novice raters (physiotherapy students or clinical physiotherapists) were trained to perform algometry testing over 2 clinically relevant sites: the angle of the upper trapezius and the belly of the tibialis anterior. A convenience sample of normal healthy individuals and a clinical sample of people with neck pain were tested by 2 different raters (all participants) and on 2 different days (healthy participants only). Intraclass correlation coefficient (ICC), standard error of measurement, and minimum detectable change were calculated. A total of 60 healthy volunteers and 40 people with neck pain were recruited. Intrarater reliability was almost perfect (ICC = 0.94-0.97), interrater reliability was substantial to near perfect (ICC = 0.79-0.90), and test-retest reliability was substantial (ICC = 0.76-0.79). Smaller change was detectable in the trapezius compared to the tibialis anterior. This study provides evidence that novice raters can perform digital algometry with adequate reliability for research and clinical use in people with and without neck pain.
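The standard error of measurement and minimum detectable change reported in studies like this one are often derived from the ICC and the sample standard deviation. The sketch below uses the commonly cited relations SEM = SD × √(1 − ICC) and MDC95 = 1.96 × √2 × SEM; the abstract does not state which formulas the authors used, so both the formulas and the example inputs are assumptions.

```python
from math import sqrt


def sem_from_icc(sd, icc):
    """Standard error of measurement from a sample SD and a reliability coefficient."""
    return sd * sqrt(1 - icc)


def mdc95(sem):
    """Minimum detectable change at the 95% confidence level for a test-retest difference."""
    return 1.96 * sqrt(2) * sem


# Hypothetical inputs: SD of 10 units and an ICC of 0.75.
sem = sem_from_icc(10.0, 0.75)
mdc = mdc95(sem)
```

The √2 factor reflects that a change score combines measurement error from two sessions, which is why the detectable change is larger than the error of a single measurement.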
Validation of the Fatigue Impact Scale in Hungarian patients with multiple sclerosis.
Losonczi, Erika; Bencsik, Krisztina; Rajda, Cecília; Lencsés, Gyula; Török, Margit; Vécsei, László
2011-03-01
Fatigue is one of the most frequent complaints of patients with multiple sclerosis (MS). The Fatigue Impact Scale (FIS), one of the 30 available fatigue questionnaires, is commonly applied because it evaluates multidimensional aspects of fatigue. The main purposes of this study were to test the validity, test-retest reliability, and internal consistency of the Hungarian version of the FIS. One hundred and eleven MS patients and 85 healthy control (HC) subjects completed the FIS and the Beck Depression Inventory, a large majority of them on two occasions, 3 months apart. The total FIS score and subscale scores differed significantly between the MS patients and the HC subjects in both FIS sessions. In the test-retest reliability assessment, the intraclass correlation coefficients were high in both the MS and HC groups. Cronbach's alpha values were also notably high. The results of this study indicate that the FIS can be regarded as a valid and reliable scale with which to improve our understanding of the impact of fatigue on the health-related quality of life in MS patients without severe disability.
Development of a Short Version of the New Brief Job Stress Questionnaire
INOUE, Akiomi; KAWAKAMI, Norito; SHIMOMITSU, Teruichi; TSUTSUMI, Akizumi; HARATANI, Takashi; YOSHIKAWA, Toru; SHIMAZU, Akihito; ODAGIRI, Yuko
2014-01-01
This study aimed to investigate the test-retest reliability and validity of a short version of the New Brief Job Stress Questionnaire (New BJSQ), in which each scale consists of one item selected from the standard version. Based on the results of an anonymous web-based questionnaire completed by occupational health staff and personnel/labor staff, we selected higher-priority scales from the standard version. After selecting the one item with the highest item-total correlation coefficient from each scale, a 23-item questionnaire was developed. A nationally representative survey was administered to Japanese employees (n=1,633) to examine test-retest reliability and validity. Most scales (or items) showed modest but adequate levels of test-retest reliability (r>0.50). Furthermore, job demands and job resources scales (or items) were associated with mental and physical stress reactions, while job resources scales (or items) were also associated with positive outcomes. These findings provide evidence that the short version of the New BJSQ is reliable and valid. PMID:24975108
Trippolini, Maurizio Alen; Janssen, Svenja; Hilfiker, Roger; Oesch, Peter
2018-06-01
Purpose To analyze the reliability and validity of a picture-based questionnaire, the Modified Spinal Function Sort (M-SFS). Methods Sixty-two injured workers with chronic musculoskeletal disorders (MSD) were recruited from two work rehabilitation centers. Internal consistency was assessed by Cronbach's alpha. Construct validity was tested based on four a priori hypotheses. Structural validity was measured with principal component analysis (PCA). Test-retest reliability and agreement were evaluated using the intraclass correlation coefficient (ICC), and measurement error with the limits of agreement (LoA). Results Total score of the M-SFS was 54.4 (SD 16.4) and 56.1 (16.4) for test and retest, respectively. Item distribution showed no ceiling effects. Cronbach's alpha was 0.94 and 0.95 for test and retest, respectively. PCA showed the presence of four components explaining a total of 74% of the variance. Item communalities were >0.6 in 17 out of 20 items. ICC was 0.90, LoA was ±12.6/16.2 points. The correlations of the M-SFS were 0.89 with the original SFS, 0.49 with the Pain Disability Index, -0.37 and -0.33 with the Numeric Rating Scale for actual pain, -0.52 with self-reported disability due to chronic low back pain, and 0.50 and 0.56-0.59 with three distinct lifting tests. No a priori defined hypothesis for construct validity was rejected. Conclusions The M-SFS allows reliable and valid assessment of perceived self-efficacy for work-related tasks and can be recommended for use in patients with chronic MSD. Further research should investigate the proposed M-SFS score of <56 for its predictive validity for non-return to work.
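The limits of agreement used here to express measurement error can be sketched with the Bland and Altman approach: the mean test-retest difference plus or minus 1.96 standard deviations of the differences. This is a minimal illustration with hypothetical scores; the abstract does not give the authors' exact computation or explain the two reported values (±12.6/16.2), so nothing below should be read as reproducing their result.

```python
from statistics import mean, stdev


def limits_of_agreement(test, retest):
    """Bland-Altman bias and 95% limits of agreement for paired scores.

    Returns (bias, lower limit, upper limit), where bias is the mean of
    (retest - test) and the limits are bias -/+ 1.96 * SD of the differences.
    """
    diffs = [b - a for a, b in zip(test, retest)]
    bias = mean(diffs)
    half_width = 1.96 * stdev(diffs)
    return bias, bias - half_width, bias + half_width


# Hypothetical questionnaire totals for five people on two occasions.
bias, lower, upper = limits_of_agreement([50, 62, 41, 70, 55], [53, 60, 45, 71, 58])
```

A retest score falling outside these limits for an individual would suggest a change larger than what measurement error alone usually produces.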
Lohr, Christine; Braumann, Klaus-Michael; Reer, Ruediger; Schroeder, Jan; Schmidt, Tobias
2018-04-20
Tensiomyography™ (TMG) and MyotonPRO® (MMT) are two non-invasive devices for monitoring muscle contractile and mechanical characteristics. This study aimed to evaluate the test-retest reliability of the TMG parameters muscle displacement (Dm), contraction time (Tc), and velocity (Vc) and the MMT parameters frequency (F), stiffness (S), and decrement (D) of the erector spinae muscles (ES) in healthy adults. A particular focus was set on the establishment of reliability measures for the previously barely evaluated secondary TMG parameter Vc. Twenty-four subjects (13 female and 11 male, mean ± SD, 38.0 ± 12.0 years) were measured using TMG and MMT over 2 consecutive days. Absolute and relative reliability was calculated by standard error of measurement (SEM, SEM%), minimum detectable change (MDC, MDC%), coefficient of variation (CV%) and intraclass correlation coefficient (ICC 3,1) with a 95% confidence interval (CI). The ICCs for all variables and test-retest intervals ranged from 0.75 to 0.99, indicating good to excellent relative reliability for both TMG and MMT, with the lowest values for TMG Tc and between-day MMT D (ICC < 0.90). Absolute reliability was suitable for all parameters (CV 2-8%) except for Dm (10-12%). Vc proved to be the most reliable and repeatable TMG parameter (ICC > 0.95, CV < 8%). The reliability of TMG Vc could be established successfully. Its further applicability needs to be confirmed in future studies. MMT was found to be more reliable on repeated testing than the two other TMG parameters Dm and Tc.
The Arthroscopic Surgical Skill Evaluation Tool (ASSET)
Koehler, Ryan J.; Amsdell, Simon; Arendt, Elizabeth A; Bisson, Leslie J; Braman, Jonathan P; Butler, Aaron; Cosgarea, Andrew J; Harner, Christopher D; Garrett, William E; Olson, Tyson; Warme, Winston J.; Nicandri, Gregg T.
2014-01-01
Background Surgeries employing arthroscopic techniques are among the most commonly performed in orthopaedic clinical practice; however, valid and reliable methods of assessing the arthroscopic skill of orthopaedic surgeons are lacking. Hypothesis The Arthroscopic Surgical Skill Evaluation Tool (ASSET) will demonstrate content validity, concurrent criterion-oriented validity, and reliability when used to assess the technical ability of surgeons performing diagnostic knee arthroscopy on cadaveric specimens. Study Design Cross-sectional study; Level of evidence, 3. Methods Content validity was determined by a group of seven experts using a Delphi process. Intra-articular performance of a right and left diagnostic knee arthroscopy was recorded for twenty-eight residents and two sports medicine fellowship-trained attending surgeons. Subject performance was assessed by two blinded raters using the ASSET. Concurrent criterion-oriented validity, inter-rater reliability, and test-retest reliability were evaluated. Results Content validity: The content development group identified 8 arthroscopic skill domains to evaluate using the ASSET. Concurrent criterion-oriented validity: Significant differences in total ASSET score (p<0.05) between novice, intermediate, and advanced experience groups were identified. Inter-rater reliability: The ASSET scores assigned by each rater were strongly correlated (r=0.91, p<0.01) and the intra-class correlation coefficient between raters for the total ASSET score was 0.90. Test-retest reliability: There was a significant correlation between ASSET scores for both procedures attempted by each individual (r=0.79, p<0.01). Conclusion The ASSET appears to be a useful, valid, and reliable method for assessing surgeon performance of diagnostic knee arthroscopy in cadaveric specimens. Studies are ongoing to determine its generalizability to other procedures as well as to the live OR and other simulated environments. PMID:23548808
Validation of an Italian version of the Fibromyalgia Impact Questionnaire (FIQ-I).
Sarzi-Puttini, P; Atzeni, F; Fiorini, T; Panni, B; Randisi, G; Turiel, M; Carrabba, M
2003-01-01
To validate a translated Italian version of the Fibromyalgia Impact Questionnaire (FIQ). The Italian version of the FIQ was administered to 50 patients affected by fibromyalgia (FM) (48 patients filled out the questionnaire again 10 days later) together with the Italian version of the Stanford Health Assessment Questionnaire (HAQ), the Medical Outcomes Survey Short Form-36 (SF-36), and a tender point count (TPC) obtained by summing the score (0-3) of each tender point tested by thumb palpation. All patients were asked about the severity of pain today (10 cm visual analog scale) and the duration of symptoms. Test-retest reliability was assessed using Spearman correlations. Internal consistency was evaluated with Cronbach's alpha of reliability. Construct validity of the FIQ was evaluated by correlations between the HAQ and subscales of the SF-36 as well as the TPC. The mean duration of symptoms was 6.5 years and the mean age of the participants was 57.4 years. Test-retest reliability was between 0.74 and 0.95 for physical functioning as well as for the total FIQ and other components. Internal consistency was 0.90 for the overall FIQ. Significant correlations were obtained between the FIQ items, the HAQ and the SF-36. The Italian FIQ is a reliable and valid instrument for detecting and measuring functional disability and health status in Italian patients with FM.
Kim, Kyoung-Eun; Lim, Jae-Young
2011-01-01
The Roland-Morris Disability Questionnaire (RMDQ) is a reliable tool for evaluating disability in patients with back pain, but no Korean version has been published and validated. We developed a cross-culturally adapted Korean version of the RMDQ (RMDQ-K) and validated its use for assessing disability in Korean patients with low back pain. Two hundred thirty-one patients with low back pain were assessed using the RMDQ-K, a visual analog scale (VAS) during rest and activity, and the Oswestry Disability Index (ODI). The results of 40 patients were used to evaluate the test-retest reliability. The correlations of the RMDQ-K with the VAS and ODI were used to assess validity. Internal consistency of the RMDQ-K was high, with a Cronbach's alpha of 0.893. Test-retest trials showed a high intraclass correlation coefficient of 0.837 (95% CI 0.833-0.953). The RMDQ-K was significantly correlated with the ODI (r=0.738) and VAS during rest (r=0.450) and activity (r=0.412). This study demonstrates that the RMDQ-K is a reliable, valid instrument for measuring disability in Korean patients with low back pain.
Krukowski, Rebecca A; Eddings, Kenya; West, Delia Smith
2011-06-01
Restaurant foods represent a substantial portion of children's dietary intake, and consumption of foods away from home has been shown to contribute to excess adiposity. This descriptive study aimed to pilot-test and establish the reliability of a standardized and comprehensive assessment tool, the Children's Menu Assessment, for evaluating the restaurant food environment for children. The tool is an expansion of the Nutrition Environment Measures Survey-Restaurant. In 2009-2010, a randomly selected sample of 130 local and chain restaurants were chosen from within 20 miles of Little Rock, AR, to examine the availability of children's menus and to conduct initial calibration of the Children's Menu Assessment tool (final sample: n=46). Independent raters completed the Children's Menu Assessment in order to determine inter-rater reliability. Test-retest reliability was also examined. Inter-rater reliability was high: percent agreement was 97% and Spearman correlation was 0.90. Test-retest was also high: percent agreement was 91% and Spearman correlation was 0.96. Mean Children's Menu Assessment completion time was 14 minutes, 56 seconds ± 10 minutes, 21 seconds. Analysis of Children's Menu Assessment findings revealed that few healthier options were available on children's menus, and most menus did not provide parents with information for making healthy choices, including nutrition information or identification of healthier options. The Children's Menu Assessment tool allows for comprehensive, rapid measurement of the restaurant food environment for children with high inter-rater reliability. This tool has the potential to contribute to public health efforts to develop and evaluate targeted environmental interventions and/or policy changes regarding restaurant foods. Copyright © 2011 American Dietetic Association. Published by Elsevier Inc. All rights reserved.
Boer, Annemarie; Dutmer, Alisa L; Schiphorst Preuper, Henrica R; van der Woude, Lucas H V; Stewart, Roy E; Deyo, Richard A; Reneman, Michiel F; Soer, Remko
2017-10-01
Validation study with cross-sectional and longitudinal measurements. To translate the US National Institutes of Health (NIH)-minimal dataset for clinical research on chronic low back pain into the Dutch language and to test its validity and reliability among people with chronic low back pain. The NIH developed a minimal dataset to encourage more complete and consistent reporting of clinical research and to be able to compare studies across countries in patients with low back pain. In the Netherlands, the NIH-minimal dataset has not been translated before and measurement properties are unknown. Cross-cultural validity was tested by a formal forward-backward translation. Structural validity was tested with exploratory factor analyses (comparative fit index, Tucker-Lewis index, and root mean square error of approximation). Hypothesis testing was performed to compare subscales of the NIH dataset with the Pain Disability Index and the EuroQol-5D (Pearson correlation coefficients). Internal consistency was tested with Cronbach α, and test-retest reliability at 2 weeks was calculated in a subsample of patients with Intraclass Correlation Coefficients and weighted Kappa (κω). In total, 452 patients were included, of whom 52 were included in the test-retest study. Factor analysis for structural validity pointed toward a seven-factor model (Cronbach α = 0.78). Factors and total score of the NIH-minimal dataset showed fair to good correlations with the Pain Disability Index (r = 0.43-0.70) and EuroQol-5D (r = -0.41 to -0.64). Reliability: test-retest reliability per item showed substantial agreement (κω=0.65). Test-retest reliability per factor was moderate to good (Intraclass Correlation Coefficient = 0.71). The measurement properties of the Dutch language version of the NIH-minimal dataset were satisfactory. N/A.
ERIC Educational Resources Information Center
Anderson, Daniel; Park, Jasmine, Bitnara; Lai, Cheng-Fei; Alonzo, Julie; Tindal, Gerald
2012-01-01
This technical report is one in a series of five describing the reliability (test-retest and alternate form) and G-Theory/D-Study research on the easyCBM reading measures, grades 1-5. Data were gathered in the spring of 2011 from a convenience sample of students nested within classrooms at a medium-sized school district in the Pacific Northwest. Due…
Hauge, Cindy Horst; Jacobs-Knight, Jacque; Jensen, Jamie L; Burgess, Katherine M; Puumala, Susan E; Wilton, Georgiana; Hanson, Jessica D
2015-06-01
The purpose of this study was to use a mixed-methods approach to determine the validity and reliability of measurements used within an alcohol-exposed pregnancy prevention program for American Indian women. To develop validity, content experts provided input into the survey measures, and a "think aloud" methodology was conducted with 23 American Indian women. After revising the measurements based on this input, a test-retest was conducted with 79 American Indian women who were randomized to complete either the original measurements or the new, modified measurements. The test-retest revealed that some of the questions performed better for the modified version, whereas others appeared to be more reliable for the original version. The mixed-methods approach was a useful methodology for gathering feedback on survey measurements from American Indian participants and in indicating specific survey questions that needed to be modified for this population. © The Author(s) 2015.
The Trojan Lifetime Champions Health Survey: Development, Validity, and Reliability
Sorenson, Shawn C.; Romano, Russell; Scholefield, Robin M.; Schroeder, E. Todd; Azen, Stanley P.; Salem, George J.
2015-01-01
Context Self-report questionnaires are an important method of evaluating lifespan health, exercise, and health-related quality of life (HRQL) outcomes among elite, competitive athletes. Few instruments, however, have undergone formal characterization of their psychometric properties within this population. Objective To evaluate the validity and reliability of a novel health and exercise questionnaire, the Trojan Lifetime Champions (TLC) Health Survey. Design Descriptive laboratory study. Setting A large National Collegiate Athletic Association Division I university. Patients or Other Participants A total of 63 university alumni (age range, 24 to 84 years), including former varsity collegiate athletes and a control group of nonathletes. Intervention(s) Participants completed the TLC Health Survey twice at a mean interval of 23 days with randomization to the paper or electronic version of the instrument. Main Outcome Measure(s) Content validity, feasibility of administration, test-retest reliability, parallel-form reliability between paper and electronic forms, and estimates of systematic and typical error versus differences of clinical interest were assessed across a broad range of health, exercise, and HRQL measures. Results Correlation coefficients, including intraclass correlation coefficients (ICCs) for continuous variables and κ agreement statistics for ordinal variables, for test-retest reliability averaged 0.86, 0.90, 0.80, and 0.74 for HRQL, lifetime health, recent health, and exercise variables, respectively. Correlation coefficients, again ICCs and κ, for parallel-form reliability (ie, equivalence) between paper and electronic versions averaged 0.90, 0.85, 0.85, and 0.81 for HRQL, lifetime health, recent health, and exercise variables, respectively. Typical measurement error was less than the a priori thresholds of clinical interest, and we found minimal evidence of systematic test-retest error. 
We found strong evidence of content validity, convergent construct validity with the Short-Form 12 Version 2 HRQL instrument, and feasibility of administration in an elite, competitive athletic population. Conclusions These data suggest that the TLC Health Survey is a valid and reliable instrument for assessing lifetime and recent health, exercise, and HRQL, among elite competitive athletes. Generalizability of the instrument may be enhanced by additional, larger-scale studies in diverse populations. PMID:25611315
Angers, Magalie; Svotelis, Amy; Balg, Frederic; Allard, Jean-Pascal
2016-04-01
The Ankle Osteoarthritis Scale (AOS) is a self-administered score specific for ankle osteoarthritis (OA) with excellent reliability and strong construct and criterion validity. Many recent randomized multicentre trials have used the AOS, and the involvement of the French-speaking population is limited by the absence of a French version. Our goal was to develop a French version and validate the psychometric properties to assure equivalence to the original English version. Translation was performed according to American Association of Orthopaedic Surgeons (AAOS) 2000 guidelines for cross-cultural adaptation. Similar to the validation process of the English AOS, we evaluated the psychometric properties of the French version (AOS-Fr): criterion validity (AOS-Fr v. Western Ontario and McMaster Universities Arthritis Index [WOMAC] and SF-36 scores), construct validity (AOS-Fr correlation to single heel-lift test), and reliability (AOS-Fr test-retest). Sixty healthy individuals tested a prefinal version of the AOS-Fr for comprehension, leading to modifications and a final version that was approved by C. Saltzman, author of the AOS. We then recruited patients with ankle OA for evaluation of the AOS-Fr psychometric properties. Twenty-eight patients with ankle OA participated in the evaluation. The AOS-Fr showed strong criterion validity (AOS:WOMAC r = 0.709 and AOS:SF-36 r = -0.654) and construct validity (r = 0.664) and proved to be reliable (test-retest intraclass correlation coefficient = 0.922). The AOS-Fr is a reliable and valid score equivalent to the English version in terms of psychometric properties, thus is available for use in multicentre trials.
Measuring family-centred practices of professionals in early intervention services in Taiwan.
Kang, L-J; Palisano, R J; Simeonsson, R J; Hwang, A-W
2017-09-01
Family-centred practices emphasize professional supports for forming partnerships with families in early intervention. The Measure of Processes of Care for Service Providers (MPOC-SP) measures the perceptions of paediatric service providers in supporting children and families. This study aimed to establish reliability of the Chinese version of the MPOC-SP (C-MPOC-SP) and to examine professional perceptions of family-centred practices in relation to professional discipline and years of experience. A convenience sample of 94 physical therapists, occupational therapists, speech-language pathologists, social workers and early childhood educators completed the C-MPOC-SP. Thirty-seven professionals completed the measure a second time within 2-4 weeks for test-retest reliability. Internal consistency and test-retest reliability were examined by Cronbach's α and intra-class correlation coefficient. Comparisons were made across professional disciplines by multivariate analyses of variance followed by analyses of variance. Relationships between years of experience and ratings of family-centred practices were examined by Pearson's correlation coefficients (r). Cronbach's α for items on each of the four scales of the C-MPOC-SP ranged from 0.80 to 0.92, indicating adequate internal consistency. Intra-class correlation coefficient between the initial and repeat completion of the C-MPOC-SP for each scale ranged from 0.56 to 0.77, indicating adequate to excellent test-retest reliability. Mean ratings for the Communicating Specific Information were significantly higher for physical therapists, occupational therapists and speech-language pathologists than for social workers (P = 0.001). The C-MPOC-SP scores were positively correlated with years of experience for all four scales (r = 0.23-0.38; P < 0.05). 
This study established adequate internal consistency and adequate to excellent test-retest reliability of the C-MPOC-SP in measuring perceptions of family centeredness of early intervention service providers. Cross-discipline differences were found in communicating specific information about the child. Higher perceptions of family centeredness were associated with more years of experience. The results support the utility of the C-MPOC-SP in professional education and programme evaluation of early intervention services in Taiwan. © 2017 John Wiley & Sons Ltd.
Extensive validation of the pain disability index in 3 groups of patients with musculoskeletal pain.
Soer, Remko; Köke, Albère J A; Vroomen, Patrick C A J; Stegeman, Patrick; Smeets, Rob J E M; Coppes, Maarten H; Reneman, Michiel F
2013-04-20
A cross-sectional study was performed. To validate the pain disability index (PDI) extensively in 3 groups of patients with musculoskeletal pain. The PDI is a widely used and studied instrument for disability related to various pain syndromes, although there is conflicting evidence concerning factor structure, test-retest reliability, and missing items. Additionally, an official translation of the Dutch language version has never been performed. For reliability, internal consistency, factor structure, test-retest reliability and measurement error were calculated. Validity was tested with hypothesized correlations with pain intensity, kinesiophobia, Rand-36 subscales, Depression, Roland-Morris Disability Questionnaire, Quality of Life, and Work Status. Structural validity was tested with independent backward translation and approval from the original authors. One hundred seventy-eight patients with acute back pain, 425 patients with chronic low back pain and 365 patients with widespread pain were included. Internal consistency of the PDI was good. One factor was identified with factor analyses. Test-retest reliability was good for the PDI (intraclass correlation coefficient, 0.76). Standard error of measurement was 6.5 points and smallest detectable change was 17.9 points. The PDI showed little correlation with kinesiophobia and depression, fair correlations with pain intensity, work status, and vitality, and moderate correlations with the Rand-36 subscales and the Roland-Morris Disability Questionnaire. The PDI-Dutch language version is internally consistent as a 1-factor structure, and test-retest reliable. The proportion of missing items seems high for the sexual and professional items. Using the PDI as a 2-factor questionnaire has no additional value and is unreliable.
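The two measurement-error figures in this abstract can be cross-checked against each other with the commonly used relation between the smallest detectable change (SDC) and the standard error of measurement (SEM). The abstract does not state which formula the authors applied, so the relation below is an assumption:

```latex
\[
\mathrm{SDC} = 1.96 \times \sqrt{2} \times \mathrm{SEM}
             \approx 1.96 \times 1.414 \times 6.5
             \approx 18.0 \text{ points}
\]
```

The result is close to the reported 17.9 points; a gap of this size is what would be expected if the SEM was rounded to 6.5 before being reported.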
Marques, Alda; Almeida, Sara; Carvalho, Joana; Cruz, Joana; Oliveira, Ana; Jácome, Cristina
2016-12-01
To assess the reliability, validity, and ability to identify fall status of the Balance Evaluation Systems Test (BESTest), Mini-BESTest, and Brief-BESTest, compared with the Berg Balance Scale (BBS), in older people living in the community. Cross-sectional. Community centers. Older adults (N=122; mean age ± SD, 76±9y). Not applicable. Participants reported on falls history in the preceding year and completed the Activities-Specific Balance Confidence (ABC) Scale. The BBS, BESTest, and the Five Times Sit-To-Stand Test were administered. Interrater (2 physiotherapists) and test-retest relative (48-72h) and absolute reliabilities were explored with the intraclass correlation coefficient (ICC) equation (2,1) and the Bland and Altman method. Minimal detectable changes at the 95% confidence level (MDC95) were established. Validity was assessed by correlating the balance tests with each other and with the ABC Scale (Spearman correlation coefficients-ρ). Receiver operating characteristics assessed the ability of each balance test to differentiate between people with and without a history of falls. All balance tests presented good to excellent interrater (ICC=.71-.93) and test-retest (ICC=.50-.82) relative reliability, with no evidence of bias. MDC95 values were 4.6, 9, 3.8, and 4.1 points for the BBS, BESTest, Mini-BESTest, and Brief-BESTest, respectively. All tests were significantly correlated with each other (ρ=.83-.96) and with the ABC Scale (ρ=.46-.61). Acceptable ability to identify fall status (areas under the curve, .71-.78) was found for all tests. Cutoff points were 48.5, 82, 19.5, and 12.5 points for the BBS, BESTest, Mini-BESTest, and Brief-BESTest, respectively. All balance tests are reliable, valid, and able to identify fall status in older people living in the community. Therefore, the choice of which test to use will depend on the level of balance impairment, purpose, and time availability. Copyright © 2016. Published by Elsevier Inc.
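As a rough illustration of the ICC equation (2,1) named above, the sketch below computes the two-way random-effects, single-measure, absolute-agreement coefficient from a subjects-by-raters table. It assumes the standard mean-squares formulation; the function name and the ratings table are illustrative and are not data from the study.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, single rater, absolute agreement.

    `ratings` is a list of rows, one row per subject and one column per
    rater (or per test occasion). All rows must have the same length.
    """
    n = len(ratings)      # number of subjects
    k = len(ratings[0])   # number of raters or occasions
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


# Illustrative table: 6 subjects rated by 4 raters.
example = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
print(round(icc_2_1(example), 2))  # 0.29
```

Because ICC(2,1) penalizes systematic differences between raters, the illustrative table scores low even though the raters rank the subjects similarly.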
Karanikola, Maria N K; Papathanassoglou, Elizabeth D E
2015-02-01
The Index of Work Satisfaction (IWS) is a comprehensive scale assessing nurses' professional satisfaction. The aim of the present study was to explore: a) the applicability, reliability and validity of the Greek version of the IWS and b) contrasts among the factors addressed by IWS against the main themes emerging from a qualitative phenomenological investigation of nurses' professional experiences. A descriptive correlational design was applied using a sample of 246 emergency and critical care nurses. Internal consistency and test-retest reliability were tested. Construct and content validity were assessed by factor analysis, and through qualitative phenomenological analysis with a purposive sample of 12 nurses. Scale factors were contrasted to qualitative themes to assure that IWS embraces all aspects of Greek nurses' professional satisfaction. The internal consistency (α = 0.81) and test-retest (tau = 1, p < 0.0001) reliability were adequate. Following appropriate modifications, factor analysis confirmed the construct validity of the scale and subscales. The qualitative data partially clarified the low reliability of one subscale. The Greek version of the IWS scale is supported for use in acute care. The mixed methods approach constitutes a powerful tool for transferring scales to different cultures and healthcare systems. Copyright © 2014 Elsevier Inc. All rights reserved.
van de Pol, Daan; Zacharian, Tigran; Maas, Mario; Kuijer, P Paul F M
2017-06-01
The Shoulder posterior circumflex humeral artery Pathology and digital Ischemia - questionnaire (SPI-Q) has been developed to enable periodic surveillance of elite volleyball players, who are at risk for digital ischemia. Prior to implementation, assessing reliability is mandatory. Therefore, the test-retest reliability and agreement of the SPI-Q were evaluated among the population at risk. A questionnaire survey was performed with a 2-week interval among 65 elite male volleyball players assessing symptoms of cold, pale and blue digits in the dominant hand during or after practice or competition using a 4-point Likert scale (never, sometimes, often and always). Kappa (κ) and percentage of agreement (POA) were calculated for individual symptoms, and to distinguish symptomatic and asymptomatic players. For the individual symptoms, κ ranged from "poor" (0.25) to "good" (0.63), and POA ranged from "moderate" (78%) to "good" (97%). To classify symptomatic players, the SPI-Q showed "good" reliability (κ = 0.83; 95%CI 0.69-0.97) and "good" agreement (POA = 92%). The current study has proven the SPI-Q to be reliable for detecting elite male indoor volleyball players with symptoms of digital ischemia.
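The two agreement statistics used above, Cohen's kappa and percentage of agreement (POA), can be sketched for a test-retest classification as follows. The function names and the ten-player example are hypothetical and only illustrate the calculation; they are not data from the study.

```python
from collections import Counter


def percent_agreement(test, retest):
    """Percentage of cases classified identically on both occasions."""
    matches = sum(a == b for a, b in zip(test, retest))
    return 100.0 * matches / len(test)


def cohens_kappa(test, retest):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(test)
    p_observed = sum(a == b for a, b in zip(test, retest)) / n
    counts_test, counts_retest = Counter(test), Counter(retest)
    categories = set(counts_test) | set(counts_retest)
    # Chance agreement from the marginal proportions of each category.
    p_expected = sum(counts_test[c] * counts_retest[c] for c in categories) / n ** 2
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when only one category occurs")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical data: 1 = symptomatic, 0 = asymptomatic, for 10 players.
first = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
second = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]
print(percent_agreement(first, second))
print(round(cohens_kappa(first, second), 2))
```

The example shows why the two statistics can diverge: when most players are asymptomatic, a high POA is partly expected by chance, so kappa is noticeably lower.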
Development of a scale to measure individuals’ ratings of peace
2014-01-01
Background The evolving concept of peace-building and the interplay between peace and health is examined in many venues, including at the World Health Assembly. However, without a metric to determine effectiveness of intervention programs all efforts are prone to subjective assessment. This paper develops a psychometric index that lays the foundation for measuring community peace stemming from intervention programs. Methods After developing a working definition of ‘peace’ and delineating a Peace Evaluation Across Cultures and Environments (PEACE) scale with seven constructs comprised of 71 items, a beta version of the index was pilot-tested. Two hundred and fifty subjects in three sites in the U.S. were studied using a five-point Likert scale to evaluate the psychometric functioning of the PEACE scale. Known groups validation was performed using the SOS-10. In addition, test-retest reliability was performed on 20 subjects. Results The preliminary data demonstrated that the scale has acceptable psychometric properties for measuring an individual’s level of peacefulness. The study also provides reliability and validity data for the scale. The data demonstrated internal consistency, correlation between data and psychological well-being, and test-retest reliability. Conclusions The PEACE scale may serve as a novel assessment tool in the health sector and be valuable in monitoring and evaluating the peace-building impact of health initiatives in conflict-affected regions. PMID:25298781
Chen, Y-W; HajGhanbari, B; Road, J D; Coxson, H O; Camp, P G; Reid, W D
2018-06-08
Pain is prevalent in chronic obstructive pulmonary disease (COPD) and the Brief Pain Inventory (BPI) appears to be a feasible questionnaire to assess this symptom. However, the reliability and validity of the BPI have not been determined in individuals with COPD. This study aimed to determine the internal consistency, test-retest reliability and validity (construct, convergent, divergent and discriminant) of the BPI in individuals with COPD. In order to examine the test-retest reliability, individuals with COPD were recruited from pulmonary rehabilitation programmes to complete the BPI twice 1 week apart. In order to investigate validity, de-identified data was retrieved from two previous studies, including forced expiratory volume in 1-s, age, sex and data from four questionnaires: the BPI, short-form McGill Pain Questionnaire (SF-MPQ), 36-Item Short Form Survey (SF-36) and Community Health Activities Model Program for Seniors (CHAMPS) questionnaire. In total, 123 participants were included in the analyses (eligible data were retrieved from 86 participants and additional 37 participants were recruited). The BPI demonstrated excellent internal consistency and test-retest reliability. It also showed convergent validity with the SF-MPQ and divergent validity with the SF-36. The factor analysis yielded two factors of the BPI, which demonstrated that the two domains of the BPI measure the intended constructs. The BPI can also discriminate pain levels among COPD patients with varied levels of quality of life (SF-36) and physical activity (CHAMPS). The BPI is a reliable and valid pain questionnaire that can be used to evaluate pain in COPD. This study formally established the reliability and validity of the BPI in individuals with COPD, which have not been determined in this patient group. The results of this study provide strong evidence that assessment results from this pain questionnaire are reliable and valid. © 2018 European Pain Federation - EFIC®.
Evaluation of lower leg function in patients with Achilles tendinopathy.
Silbernagel, Karin Grävare; Gustavsson, Alexander; Thomeé, Roland; Karlsson, Jon
2006-11-01
Achilles tendinopathy is considered to be one of the most common overuse injuries in elite and recreational athletes. However, the effect that the Achilles tendinopathy has on patients' physical performance is still unclear. The purpose of this study was to evaluate if Achilles tendinopathy caused functional deficits on the injured side compared with the non-injured side in patients. A test battery comprised of tests for different aspects of muscle-tendon function of the gastrocnemius, soleus and Achilles tendon complex was developed to evaluate lower leg function. The test battery's test-retest reliability and sensitivity (the percent probability that the tests would demonstrate abnormal lower limb symmetry index in patients) were also evaluated. The test battery consisted of three jump tests, a counter movements jump (CMJ), a drop counter movement jump (drop CMJ) and hopping, and two strength tests, concentric toe-raises, eccentric-concentric toe-raises and toe-raises for endurance. The reliability was evaluated through a test-retest design on 15 healthy subjects. The test battery's sensitivity and possible functional deficits in patients with Achilles tendinopathy were evaluated on 42 patients (19 women and 23 men). An excellent reliability was found between test days 1-2 and 2-3 for all tests (ICC = 0.76-0.94) except for concentric toe-raise, test 2-3, which had fair reliability (ICC = 0.73). The methodological error ranged from 8 to 17%. There were significant differences (P = 0.001-0.049) between the non-injured (or least symptomatic) side and injured (most symptomatic) side for hopping, drop CMJ, concentric and eccentric-concentric toe-raises, and significant differences (P = 0.000-0.012) in the level of pain during CMJ, hopping, and drop CMJ. The sensitivity of the test battery at a 90% capacity was 88. 
Achilles tendinopathy causes not only pain and symptoms in patients but also apparent impairments in various aspects of lower leg muscle-tendon function as measured with the test battery. This test battery is reliable and able to detect differences in lower leg function between the injured or "most symptomatic" and non-injured or "least symptomatic" side in patients with Achilles tendinopathy. The test battery has higher demand on patients' function compared with each individual test.
Loo, Jo Lin; Ang, Yee Kwang; Yim, Hip Seng
2013-01-01
To describe the development and validation of a cancer awareness questionnaire (CAQ) based on a literature review of previous studies, focusing on cancer awareness and prevention. A total of 388 Chinese undergraduate students in a private university in Kuala Lumpur, Malaysia, were recruited to evaluate the developed self-administered questionnaire. The CAQ consisted of four sections: awareness of cancer warning signs and screening tests; knowledge of cancer risk factors; barriers in seeking medical advice; and attitudes towards cancer and cancer prevention. The questionnaire was evaluated for construct validity using principal component analysis and internal consistency using Cronbach's alpha (α) coefficient. Test-retest reliability was assessed with a 10-14 day interval and measured using Pearson product-moment correlation. The initial 77-item CAQ was reduced to 63 items, with satisfactory construct validity, and a high total internal consistency (Cronbach's α=0.77). A total of 143 students completed the questionnaire for the test-retest reliability, obtaining a correlation of 0.72 (p<0.001) overall. The CAQ could provide a reliable and valid measure that can be used to assess cancer awareness among local Chinese undergraduate students. However, further studies among students from different backgrounds (e.g. ethnicity) are required in order to facilitate the use of the cancer awareness questionnaire among all university students.
Psychometric Properties of the Adolescent Health Concern Inventory: The Persian Version
Baheiraei, Azam; Ahmadi, Fazlollah; Foroushani, Abbas Rahimi; Ghofranipour, Fazlollah; Weiler, Robert M
2013-01-01
Objective It is important to consider the health concerns of adolescents before developing and implementing public health promotion or health education curriculum programs aimed at ameliorating priority health problems experienced by adolescents. The aim of this study was to test the psychometric properties of the original Adolescent Health Concern Inventory (AHCI) for use with an Iranian population. Methods This was a methodological study in which 50 adolescents with age range of 14-18 years were selected using convenience sampling. The translation and cultural adaptation process of the AHCI followed recognized and established guidelines. The face and content validity was established by analyzing feedback solicited from teenagers and professionals with expertise in health, sociology and psychology. Reliability was examined using test-retest and Cronbach's alpha for internal consistency reliability. Kappa and McNemar tests were used to examine test-retest reliability for each item. Results Minor cultural differences were identified and resolved during the translation process and determining the validity of the checklist. Results from Kappa and McNemar tests indicate a high degree of test-retest reliability. Internal consistency reliability as measured by Cronbach's alpha for the subscales was between 0.68 and 0.87, with total instrument reliability of 0.96 indicating considerable overall reliability. Conclusion The Persian version of the AHCI appears valid and reliable. Hence, it can be used for filling a gap in identifying the adolescents’ health concerns in the research and community settings and school health education programs in Iran to design appropriate interventions. PMID:23682249
Development of a Multidimensional Index for Assessing Social Support in Rehabilitation.
ERIC Educational Resources Information Center
McColl, Mary Ann; Friedland, Judith
1989-01-01
Discusses the development and psychometric evaluation of the Social Support Inventory for Stroke Survivors, a multidimensional instrument for measuring social support and its influence on the rehabilitation of stroke patients. Examines the test-retest reliability and internal consistency of the instrument and suggests modifications and clinical…
Butler, Andrew J.; Cazeaux, Jennifer; Fidler, Anna; Jansen, Jessica; Lefkove, Nehama; Gregg, Melanie; Hall, Craig; Easley, Kirk A.; Shenvi, Neeta; Wolf, Steven L.
2012-01-01
Mental imagery can improve motor performance in stroke populations when combined with physical therapy. Valid and reliable instruments to evaluate the imagery ability of stroke survivors are needed to maximize the benefits of mental imagery therapy. The purposes of this study were to: examine and compare the test-retest intra-rater reliability of the Movement Imagery Questionnaire-Revised, Second Edition (MIQ-RS) in stroke survivors and able-bodied controls, examine internal consistency of the visual and kinesthetic items of the MIQ-RS, determine if the MIQ-RS includes both the visual and kinesthetic dimensions of mental imagery, correlate impairment and motor imagery scores, and investigate the criterion validity of the MIQ-RS in stroke survivors by comparing the results to the KVIQ-10. Test-retest analysis indicated good levels of reliability (ICC range: .83–.99) and internal consistency (Cronbach α: .95–.98) of the visual and kinesthetic subscales in both groups. The two-factor structure of the MIQ-RS was supported by factor analysis, with the visual and kinesthetic components accounting for 88.6% and 83.4% of the total variance in the able-bodied and stroke groups, respectively. The MIQ-RS is a valid and reliable instrument in the stroke population examined and able-bodied populations and therefore useful as an outcome measure for motor imagery ability. PMID:22474504
Arbab, Dariusch; van Ochten, Johannes H M; Schnurr, Christoph; Bouillon, Bertil; König, Dietmar
2017-12-01
Patient-reported outcome measures are a critical tool in evaluating the efficacy of orthopedic procedures. The intention of this study was to evaluate reliability, validity, responsiveness and minimally important change of the German version of the Hip dysfunction and osteoarthritis outcome score (HOOS). The German HOOS was investigated in 251 consecutive patients before and 6 months after total hip arthroplasty. All patients completed HOOS, Oxford-Hip Score, Short-Form (SF-36) and numeric scales for pain and disability. Test-retest reliability, internal consistency, floor and ceiling effects, construct validity and minimal important change were analyzed. The German HOOS demonstrated excellent test-retest reliability with intraclass correlation coefficient values > 0.7. Cronbach's alpha values demonstrated strong internal consistency. As hypothesized, HOOS subscales strongly correlated with corresponding OHS and SF-36 domains. All subscales showed excellent (effect size/standardized response means > 0.8) responsiveness between preoperative assessment and postoperative follow-up. The HOOS and all subdomains showed changes larger than the minimal detectable change, which indicates true change. The German version of the HOOS demonstrated good psychometric properties. It proved to be a valid, reliable and responsive instrument for use in patients with hip osteoarthritis undergoing total hip replacement.
Assessing the Psychometric Properties of Two Food Addiction Scales
Lemeshow, Adina; Gearhardt, Ashley; Genkinger, Jeanine; Corbin, William R.
2016-01-01
Background While food addiction is well accepted in popular culture and mainstream media, its scientific validity as an addictive behavior is still under investigation. This study evaluated the reliability and validity of the Yale Food Addiction Scale and Modified Yale Food Addiction Scale using data from two community-based convenience samples. Methods We assessed the internal and test-retest reliability of the Yale Food Addiction Scale and Modified Yale Food Addiction Scale, and estimated the sensitivity and negative predictive value of the Modified Yale Food Addiction Scale using the Yale Food Addiction Scale as the benchmark. We calculated Cronbach’s alphas and 95% confidence intervals (CIs) for internal reliability and Cohen’s Kappa coefficients and 95% CIs for test-retest reliability. Results Internal consistency (n=232) was marginal to good, ranging from α=0.63 to 0.84. The test-retest reliability (n=45) for food addiction diagnosis was substantial, with Kappa=0.73 (95% CI, 0.48–0.88) (Yale Food Addiction Scale) and 0.79 (95% CI, 0.66–1.00) (Modified Yale Food Addiction Scale). Sensitivity and negative predictive value for classifying food addiction status were excellent: compared to the Yale Food Addiction Scale, the Modified Yale Food Addiction Scale’s sensitivity was 92.3% (95% CI, 64%–99.8%), and the negative predictive value was 99.5% (95% CI, 97.5%–100%). Conclusions Our analyses suggest that the Modified Yale Food Addiction Scale may be an appropriate substitute for the Yale Food Addiction Scale when a brief measure is needed, and support the continued use of both scales to investigate food addiction. PMID:27623221
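Sensitivity and negative predictive value, as used above to compare the modified scale against the benchmark scale, can be sketched from a 2×2 classification table. The function names and the counts below are purely illustrative assumptions and are not the study's data.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of benchmark-positive cases that the index test also flags."""
    return true_pos / (true_pos + false_neg)


def negative_predictive_value(true_neg: int, false_neg: int) -> float:
    """Share of index-test negatives that are also benchmark-negative."""
    return true_neg / (true_neg + false_neg)


# Illustrative counts only (index test versus benchmark).
tp, fn, tn, fp = 45, 5, 180, 20
print(f"sensitivity = {sensitivity(tp, fn):.1%}")
print(f"NPV = {negative_predictive_value(tn, fn):.1%}")
```

Note that the negative predictive value depends on how common the condition is in the sample, so a high value in a low-prevalence community sample may not carry over to a clinical sample.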
Alla, Arben; Czabanowska, Katarzyna; Kijowska, Violetta; Roshi, Enver; Burazeri, Genc
2012-01-01
Our aim was to validate an international instrument measuring self-perceived competency level of family physicians in Albania. A representative sample of 57 family physicians operating in primary health care services was interviewed twice in March-April 2012 in Tirana (26 men and 31 women; median age: 46 years, inter-quartile range: 38-56 years). A structured questionnaire was administered [and subsequently re-administered after two weeks (test-retest)] to all family physicians aiming to self-assess physicians' level of abilities, skills and competencies regarding different domains of quality of health care. The questionnaire included 37 items organized into 6 subscales/domains. Answers for each item of the tool ranged from 1 ("novice" physicians) to 5 ("expert" physicians). An overall summary score (range: 37-185) and a subscale summary score for each domain were calculated for the test and retest procedures. Cronbach's alpha was used to assess the internal consistency for both the test and the retest procedures, whereas Spearman's rho was employed to assess the stability over time (test-retest reliability) of the instrument. Cronbach's alpha was 0.87 for the test and 0.86 for the retest procedure. Overall, Spearman's rho was 0.84 (P<0.001). The overall summary score for the 37 items of the instrument was 96.3±10.0 for the test and 97.3±10.1 for the retest. All the subscale summary scores were very similar for the test and the retest procedure. This study provides evidence on cross-cultural adaptation of an international instrument tapping self-perceived level of competencies of family physicians in Albania. The questionnaire displayed a satisfactory internal consistency for both test and retest procedures in this sample of family physicians in Albania.
Furthermore, the high test-retest reliability (stability over time) of the instrument suggests a good potential for wide scale application to nationally representative samples of family physicians in Albanian populations.
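Cronbach's alpha, the internal-consistency statistic reported above, can be sketched from a respondents-by-items score table. This minimal version assumes the usual formula based on sample variances; the function name and the tiny data set are hypothetical.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a table with one row per respondent.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of totals),
    using sample variances (n - 1 in the denominator).
    """
    k = len(scores[0])  # number of items
    item_variances = [variance([row[j] for row in scores]) for j in range(k)]
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1.0 - sum(item_variances) / total_variance)


# Hypothetical answers of 4 respondents to 2 items on a 1-5 scale.
answers = [
    [2, 3],
    [4, 3],
    [3, 5],
    [5, 5],
]
print(round(cronbach_alpha(answers), 2))  # 0.62
```

Alpha rises as items covary more strongly and as the number of items grows, which is one reason a 37-item instrument can reach high values.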
Nagai, Takashi; Sell, Timothy C; Abt, John P; Lephart, Scott M
2012-11-01
To develop and assess the reliability and precision of knee internal/external rotation (IR/ER) threshold to detect passive motion (TTDPM) and determine if gender differences exist. Test-retest for the reliability/precision and cross-sectional for gender comparisons. University neuromuscular and human performance research laboratory. Ten subjects for the reliability and precision aim. Twenty subjects (10 males and 10 females) for gender comparisons. All TTDPM tests were performed using a multi-mode dynamometer. Subjects performed TTDPM at two knee positions (near IR or ER end-range). Intraclass correlation coefficient (ICC (3,k)) and standard error of measurement (SEM) were used to evaluate the reliability and precision. Independent t-tests were used to compare genders. TTDPM toward IR and ER at two knee positions. Intrasession and intersession reliability and precision were good (ICC=0.68-0.86; SEM=0.22°-0.37°). Females had significantly diminished TTDPM toward IR at the IR-test position (males: 0.77°±0.14°, females: 1.18°±0.46°, p=0.021) and TTDPM toward IR at the ER-test position (males: 0.87°±0.13°, females: 1.36°±0.58°, p=0.026). No other significant gender differences were found (p>0.05). The current IR/ER TTDPM methods are reliable and accurate for test-retest or cross-sectional research designs. Gender differences were found toward IR where the ACL acts as the secondary restraint. Copyright © 2011 Elsevier Ltd. All rights reserved.
Schrimshaw, Eric W.; Rosario, Margaret; Meyer-Bahlburg, Heino F. L.; Scharf-Matlick, Alice A.
2011-01-01
Despite the importance of reliable self-reported sexual information for research on sexuality and sexual health, research has not examined reliability of information provided by gay, lesbian, and bisexual (GLB) youths. Test-retest reliability of self-reported sexual behaviors, sexual orientation, sexual identity, and psychosexual developmental milestones was examined among an ethnically diverse sample of 64 self-identified GLB youths. Two face-to-face interviews were conducted approximately two weeks apart using the Sexual Risk Behavior Assessment Schedule for Homosexual Youths (SERBAS-Y-HM). Overall, the mean of the test-retest reliability coefficients was substantial for 6 of the 7 domains: lifetime sexual behaviors (M = .89), sexual behavior in the past 3 months (M = .96), unprotected sexual behavior in the past 3 months (M = .93), sexual identity (κ = .89), sexual orientation (M = .82), and ages of various psychosexual developmental milestones (M = .77). Inconsistent reliability was found for reports of sexual behaviors while using substances. A small number of gender differences emerged, with lower reliability among female youths in the lifetime number of same-sex partners. The overall findings suggest that a wide range of self-reported sexual information can be reliably assessed among GLB youths by means of interviewer-administered questionnaires, such as the SERBAS-Y-HM. PMID:16752124
Ruan, W. June; Goldstein, Risë B.; Chou, S. Patricia; Smith, Sharon M.; Saha, Tulshi D.; Pickering, Roger P.; Dawson, Deborah A.; Huang, Boji; Stinson, Frederick S.; Grant, Bridget F.
2008-01-01
This study presents test-retest reliability statistics and information on internal consistency for new diagnostic modules and risk factors for alcohol, drug, and psychiatric disorders from the Alcohol Use Disorder and Associated Disabilities Interview Schedule-IV (AUDADIS-IV). Test-retest statistics were derived from a random sample of 1,899 adults selected from 34,653 respondents who participated in the 2004–2005 Wave 2 National Epidemiologic Survey on Alcohol and Related Conditions (NESARC). Internal consistency of continuous scales was assessed using the entire Wave 2 NESARC. Both test and retest interviews were conducted face-to-face. Test-retest and internal consistency results for diagnoses and symptom scales associated with posttraumatic stress disorder, attention-deficit/hyperactivity disorder, and borderline, narcissistic, and schizotypal personality disorders were predominantly good (kappa > 0.63; ICC > 0.69; alpha > 0.75) and reliability for risk factor measures fell within the good to excellent range (intraclass correlations = 0.50–0.94; alpha = 0.64–0.90). The high degree of reliability found in this study suggests that new AUDADIS-IV diagnostic measures can be useful tools in research settings. The availability of highly reliable measures of risk factors for alcohol, drug, and psychiatric disorders will contribute to the validity of conclusions drawn from future research in the domains of substance use disorder and psychiatric epidemiology. PMID:17706375
Muhamad, Zailani; Ramli, Ayiesah; Amat, Salleh
2015-05-01
The aim of this study was to determine the content validity, internal consistency, test-retest reliability and inter-rater reliability of the Clinical Competency Evaluation Instrument (CCEVI) in assessing the clinical performance of physiotherapy students. This study was carried out between June and September 2013 at Universiti Kebangsaan Malaysia (UKM), Kuala Lumpur, Malaysia. A panel of 10 experts was identified to establish content validity by evaluating and rating each of the items used in the CCEVI with regard to their relevance in measuring students' clinical competency. A total of 50 UKM undergraduate physiotherapy students were assessed throughout their clinical placement to determine the construct validity of these items. The instrument's reliability was determined through a cross-sectional study involving a clinical performance assessment of 14 final-year undergraduate physiotherapy students. The content validity index of the entire CCEVI was 0.91, while the proportion of agreement on the content validity indices ranged from 0.83-1.00. The CCEVI construct validity was established with factor loading of ≥0.6, while internal consistency (Cronbach's alpha) overall was 0.97. Test-retest reliability of the CCEVI was confirmed with a Pearson's correlation range of 0.91-0.97 and an intraclass correlation coefficient range of 0.95-0.98. Inter-rater reliability of the CCEVI domains ranged from 0.59 to 0.97 on initial and subsequent assessments. This pilot study confirmed the content validity of the CCEVI. It showed high internal consistency, thereby providing evidence that the CCEVI has moderate to excellent inter-rater reliability. However, additional refinement in the wording of the CCEVI items, particularly in the domains of safety and documentation, is recommended to further improve the validity and reliability of the instrument.
O’Connor, David; Potler, Natan Vega; Kovacs, Meagan; Xu, Ting; Ai, Lei; Pellman, John; Vanderwal, Tamara; Parra, Lucas C.; Cohen, Samantha; Ghosh, Satrajit; Escalera, Jasmine; Grant-Villegas, Natalie; Osman, Yael; Bui, Anastasia; Craddock, R. Cameron
2017-01-01
Abstract Background: Although typically measured during the resting state, a growing literature is illustrating the ability to map intrinsic connectivity with functional MRI during task and naturalistic viewing conditions. These paradigms are drawing excitement due to their greater tolerability in clinical and developing populations and because they enable a wider range of analyses (e.g., inter-subject correlations). To be clinically useful, the test-retest reliability of connectivity measured during these paradigms needs to be established. This resource provides data for evaluating test-retest reliability for full-brain connectivity patterns detected during each of four scan conditions that differ with respect to level of engagement (rest, abstract animations, movie clips, flanker task). Data are provided for 13 participants, each scanned in 12 sessions with 10 minutes for each scan of the four conditions. Diffusion kurtosis imaging data was also obtained at each session. Findings: Technical validation and demonstrative reliability analyses were carried out at the connection-level using the Intraclass Correlation Coefficient and at network-level representations of the data using the Image Intraclass Correlation Coefficient. Variation in intrinsic functional connectivity across sessions was generally found to be greater than that attributable to scan condition. Between-condition reliability was generally high, particularly for the frontoparietal and default networks. Between-session reliabilities obtained separately for the different scan conditions were comparable, though notably lower than between-condition reliabilities. Conclusions: This resource provides a test-bed for quantifying the reliability of connectivity indices across subjects, conditions and time. The resource can be used to compare and optimize different frameworks for measuring connectivity and data collection parameters such as scan length. 
Additionally, investigators can explore the unique perspectives of the brain's functional architecture offered by each of the scan conditions. PMID:28369458
Utility of computer-assisted approaches for population surveillance of physical activity.
Creamer, MeLisa; Bowles, Heather R; von Hofe, Belinda; Pettee Gabriel, Kelley; Kohl, Harold W; Bauman, Adrian
2014-08-01
Computer-assisted techniques may be a useful way to enhance physical activity surveillance and increase accuracy of reported behaviors. Evaluate the reliability and validity of a physical activity (PA) self-report instrument administered by telephone and internet. The telephone-administered Active Australia Survey was adapted into 2 forms for internet self-administration: survey questions only (internet-text) and with videos demonstrating intensity (internet-video). Data were collected from 158 adults (20-69 years, 61% female) assigned to telephone (telephone-interview) (n = 56), internet-text (n = 51), or internet-video (n = 51). Participants wore an accelerometer and completed a logbook for 7 days. Test-retest reliability was assessed using intraclass correlation coefficients (ICC). Convergent validity was assessed using Spearman correlations. Strong test-retest reliability was observed for PA variables in the internet-text (ICC = 0.69 to 0.88), internet-video (ICC = 0.66 to 0.79), and telephone-interview (ICC = 0.69 to 0.92) groups (P-values < 0.001). For total PA, correlations (ρ) between the survey and Actigraph+logbook were ρ = 0.47 for the internet-text group, ρ = 0.57 for the internet-video group, and ρ = 0.65 for the telephone-interview group. For vigorous-intensity activity, the correlations between the survey and Actigraph+logbook were 0.52 for internet-text, 0.57 for internet-video, and 0.65 for telephone-interview (P < .05). Internet-video of the survey had similar test-retest reliability and convergent validity when compared with the telephone-interview, and should continue to be developed.
Standard setting: comparison of two methods.
George, Sanju; Haque, M Sayeed; Oyebode, Femi
2006-09-14
The outcome of assessments is determined by the standard-setting method used. There is a wide range of standard-setting methods and the two used most extensively in undergraduate medical education in the UK are the norm-reference and the criterion-reference methods. The aims of the study were to compare these two standard-setting methods for a multiple-choice question examination and to estimate the test-retest and inter-rater reliability of the modified Angoff method. The norm-reference method of standard-setting (mean minus 1 SD) was applied to the 'raw' scores of 78 4th-year medical students on a multiple-choice examination (MCQ). Two panels of raters also set the standard using the modified Angoff method for the same multiple-choice question paper on two occasions (6 months apart). We compared the pass/fail rates derived from the norm reference and the Angoff methods and also assessed the test-retest and inter-rater reliability of the modified Angoff method. The pass rate with the norm-reference method was 85% (66/78) and that by the Angoff method was 100% (78 out of 78). The percentage agreement between Angoff method and norm-reference was 78% (95% CI 69% - 87%). The modified Angoff method had an inter-rater reliability of 0.81-0.82 and a test-retest reliability of 0.59-0.74. There were significant differences in the outcomes of these two standard-setting methods, as shown by the difference in the proportion of candidates that passed and failed the assessment. The modified Angoff method was found to have good inter-rater reliability and moderate test-retest reliability.
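The two standard-setting rules compared above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function names, the use of the sample standard deviation, and the ratings layout are all assumptions. The norm-reference cut is the mean minus 1 SD as stated in the abstract; the Angoff cut is taken here as the mean, across raters, of the summed per-item probabilities that a borderline candidate answers correctly.

```python
from statistics import mean, stdev


def norm_reference_cut(scores):
    """Cut score = cohort mean minus 1 SD (sample SD is an assumption)."""
    return mean(scores) - stdev(scores)


def angoff_cut(ratings):
    """ratings[r][i]: rater r's estimate (0..1) that a borderline
    candidate answers item i correctly. Cut = mean of rater totals."""
    return mean(sum(rater) for rater in ratings)


def pass_rate(scores, cut):
    """Fraction of candidates scoring at or above the cut."""
    return sum(s >= cut for s in scores) / len(scores)


def percent_agreement(scores, cut_a, cut_b):
    """Fraction of candidates given the same pass/fail decision by both cuts."""
    same = sum((s >= cut_a) == (s >= cut_b) for s in scores)
    return same / len(scores)
```

With raw scores and panel ratings in hand, `pass_rate` can be evaluated once per cut score and `percent_agreement` compares the two pass/fail decisions candidate by candidate.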
Polcin, Douglas L.; Galloway, Gantt P.; Bond, Jason; Korcha, Rachael; Greenfield, Thomas K.
2008-01-01
The addiction field lacks an accepted definition and reliable measure of confrontation. The Alcohol and Drug Confrontation Scale (ADCS) defines confrontation as warnings about the potential consequences of substance use. To assess psychometric properties, 323 individuals entering recovery houses in U.S. urban and suburban areas were interviewed between 2003 and 2005 (20% women, 68% white). Analyses included test-retest reliability, confirmatory factor analysis, and measures of internal consistency. Findings support the ADCS as a reliable way of assessing two factors: Internal Support and External Intensity. Confrontation was experienced as supportive, accurate and helpful. Additional studies should assess confrontation in different contexts. PMID:20686635
Influences on and Limitations of Classical Test Theory Reliability Estimates.
ERIC Educational Resources Information Center
Arnold, Margery E.
It is incorrect to say "the test is reliable" because reliability is a function not only of the test itself, but of many factors. The present paper explains how different factors affect classical reliability estimates such as test-retest, interrater, internal consistency, and equivalent forms coefficients. Furthermore, the limits of classical test…
Alghadir, Ahmad H; Anwer, Shahnawaz; Iqbal, Amir; Iqbal, Zaheen Ahmed
2018-01-01
Objective Several scales are commonly used for assessing pain intensity. Among them, the numerical rating scale (NRS), visual analog scale (VAS), and verbal rating scale (VRS) are often used in clinical practice. However, no study has performed psychometric analyses of their reliability and validity in the measurement of osteoarthritic (OA) pain. Therefore, the present study examined the test–retest reliability, validity, and minimum detectable change (MDC) of the VAS, NRS, and VRS for the measurement of OA knee pain. In addition, the correlations of VAS, NRS, and VRS with demographic variables were evaluated. Methods The study included 121 subjects (65 women, 56 men; aged 40–80 years) with OA of the knee. Test–retest reliability of the VAS, NRS, and VRS was assessed during two consecutive visits in a 24 h interval. The validity was tested using Pearson’s correlation coefficients between the baseline scores of VAS, NRS, and VRS and the demographic variables (age, body mass index [BMI], sex, and OA grade). The standard error of measurement (SEM) and the MDC were calculated to assess statistically meaningful changes. Results The intraclass correlation coefficients of the VAS, NRS, and VRS were 0.97, 0.95, and 0.93, respectively. VAS, NRS, and VRS were significantly related to demographic variables (age, BMI, sex, and OA grade). The SEM of VAS, NRS, and VRS was 0.03, 0.48, and 0.21, respectively. The MDC of VAS, NRS, and VRS was 0.08, 1.33, and 0.58, respectively. Conclusion All the three scales had excellent test–retest reliability. However, the VAS was the most reliable, with the smallest errors in the measurement of OA knee pain. PMID:29731662
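The relationship between the SEM and MDC values reported above can be illustrated with a short sketch. The abstract does not state which formula was used; the code assumes the common convention MDC95 = 1.96 × √2 × SEM, which is consistent with the reported pairs. The function names are hypothetical, and `sem_from_icc` (SEM = SD × √(1 − ICC)) is likewise a conventional formula rather than one confirmed by the source.

```python
import math


def sem_from_icc(sd: float, icc: float) -> float:
    """Standard error of measurement from a score SD and a reliability
    coefficient (conventional formula; an assumption here)."""
    return sd * math.sqrt(1.0 - icc)


def mdc_from_sem(sem: float, z: float = 1.96) -> float:
    """Minimal detectable change at the confidence level implied by z.
    The sqrt(2) term reflects error in both the test and the retest."""
    return z * math.sqrt(2.0) * sem


# SEM values as reported in the abstract above
for scale, sem in (("VAS", 0.03), ("NRS", 0.48), ("VRS", 0.21)):
    print(scale, round(mdc_from_sem(sem), 2))
# prints: VAS 0.08, NRS 1.33, VRS 0.58 (one per line)
```

Under this convention, a change between two visits smaller than the MDC cannot be distinguished from measurement error at the chosen confidence level.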
Coolidge, Trilby; Hillstead, M Blake; Farjo, Nadia; Weinstein, Philip; Coldwell, Susan E
2010-05-13
Hispanics comprise the largest ethnic minority group in the United States. Previous work with the Spanish Modified Dental Anxiety Scale (MDAS) yielded good validity, but lower test-retest reliability. We report the performance of the Spanish MDAS in a new sample, as well as the performance of the Spanish Revised Dental Beliefs Survey (R-DBS). One hundred sixty two Spanish-speaking adults attending Spanish-language church services or an Hispanic cultural festival completed questionnaires containing the Spanish MDAS, Spanish R-DBS, and dental attendance questions, and underwent a brief oral examination. Church attendees completed the questionnaire a second time, for test-retest purposes. The Spanish MDAS and R-DBS were completed by 156 and 136 adults, respectively. The test-retest reliability of the Spanish MDAS was 0.83 (95% CI = 0.60-0.92). The internal reliability of the Spanish R-DBS was 0.96 (95% CI = 0.94-0.97), and the test-retest reliability was 0.86 (95% CI = 0.64-0.94). The two measures were significantly correlated (Spearman's rho = 0.38, p < 0.001). Participants who do not currently go to a dentist had significantly higher MDAS scores (t = 3.40, df = 106, p = 0.003) as well as significantly higher R-DBS scores (t = 2.21, df = 131, p = 0.029). Participants whose most recent dental visit was for pain or a problem, rather than for a check-up, scored significantly higher on both the MDAS (t = 3.00, df = 106, p = 0.003) and the R-DBS (t = 2.85, df = 92, p = 0.005). Those with high dental fear (MDAS score 19 or greater) were significantly more likely to have severe caries (Chi square = 6.644, df = 2, p = 0.036). Higher scores on the R-DBS were significantly related to having more missing teeth (Spearman's rho = 0.23, p = 0.009). In this sample, the test-retest reliability of the Spanish MDAS was higher. 
The significant relationships between dental attendance and questionnaire scores, as well as the difference in caries severity seen in those with high fear, add to the evidence of this scale's construct validity in Hispanic samples. Our results also provide evidence for the internal and test-retest reliabilities, as well as the construct validity, of the Spanish R-DBS.
Kim, Hannah; Ricketts, Todd A
2013-01-01
To investigate the test-retest reliability of real-ear aided response (REAR) measures in open and closed hearing aid fittings in children using appropriate probe-microphone calibration techniques (stored equalization for open fittings and concurrent equalization for closed fittings). Probe-microphone measurements were completed for two mini-behind-the-ear (BTE) hearing aids which were coupled to the ear using open and closed eartips via thin (0.9 mm) tubing. Before probe-microphone testing, the gain of each of the test hearing aids was programmed using an artificial ear simulator (IEC 711) and a Knowles Electronic Manikin for Acoustic Research to match the National Acoustic Laboratories-Non-Linear, version 1 targets for one of two separate hearing loss configurations using an Audioscan Verifit. No further adjustments were made, and the same amplifier gain was used within each hearing aid across both eartip configurations and all participants. Probe-microphone testing included real-ear occluded response (REOR) and REAR measures using the Verifit's standard speech signal (the carrot passage) presented at 65 dB sound pressure level (SPL). Two repeated probe-microphone measures were made for each participant with the probe-tube and hearing aid removed and repositioned between each trial in order to assess intrasubject measurement variability. These procedures were repeated using both open and closed domes. Thirty-two children, ages ranging from 4 to 14 yr. The test-retest standard deviations for open and closed measures did not exceed 4 dB at any frequency. There was also no significant difference between the open (stored equalization) and closed (concurrent equalization) methods. Reliability was particularly similar in the high frequencies and was also quite similar to that reported in previous research. There was no correlation between reliability and age, suggesting high reliability across all ages evaluated. 
The findings from this study suggest that reliable probe-microphone measurements are obtainable on children 4 yr and older for both traditional unvented and open-canal hearing aid fittings. These data suggest that clinicians should not avoid fitting open technology to children as young as 4 yr because of concerns regarding the reliability of verification techniques. American Academy of Audiology.
Stroke Impact Scale 3.0: Reliability and Validity Evaluation of the Korean Version
2017-01-01
Objective To establish the reliability and validity of the Korean version of the Stroke Impact Scale (K-SIS) 3.0. Methods A total of 70 post-stroke patients were enrolled. All subjects were evaluated for general characteristics, Mini-Mental State Examination (MMSE), the National Institutes of Health Stroke Scale (NIHSS), Modified Barthel Index, and Hospital Anxiety and Depression Scale (HADS). The SF-36 and K-SIS 3.0 assessed their health-related quality of life. Statistical analysis after evaluation determined the reliability and validity of the K-SIS 3.0. Results A total of 70 patients (mean age, 54.97 years) participated in this study. Internal consistency of the SIS 3.0 (Cronbach's alpha) was obtained, and all domains had good coefficients, above the threshold of 0.70. Test-retest reliability of SIS 3.0 was assessed with the correlation (Spearman's rho) between the same domain scores obtained on the first and second assessments. Results were above 0.5, with the exception of social participation and mobility. Concurrent validity of K-SIS 3.0 was assessed using the SF-36 and other scales with the same or similar domains. Each domain of K-SIS 3.0 had a positive correlation with the corresponding similar domain of SF-36 and other scales (HADS, MMSE, and NIHSS). Conclusion The newly developed K-SIS 3.0 showed high inter-intra reliability and test-retest reliabilities, together with high concurrent validity with the original and various other scales, for patients with stroke. K-SIS 3.0 can therefore be used for stroke patients, to assess their health-related quality of life and treatment efficacy. PMID:28758075
Stroke Impact Scale 3.0: Reliability and Validity Evaluation of the Korean Version.
Choi, Seong Uk; Lee, Hye Sun; Shin, Joon Ho; Ho, Seung Hee; Koo, Mi Jung; Park, Kyoung Hae; Yoon, Jeong Ah; Kim, Dong Min; Oh, Jung Eun; Yu, Se Hwa; Kim, Dong A
2017-06-01
To establish the reliability and validity of the Korean version of the Stroke Impact Scale (K-SIS) 3.0. A total of 70 post-stroke patients were enrolled. All subjects were evaluated for general characteristics, Mini-Mental State Examination (MMSE), the National Institutes of Health Stroke Scale (NIHSS), Modified Barthel Index, and Hospital Anxiety and Depression Scale (HADS). The SF-36 and K-SIS 3.0 assessed their health-related quality of life. Statistical analysis after evaluation determined the reliability and validity of the K-SIS 3.0. A total of 70 patients (mean age, 54.97 years) participated in this study. Internal consistency of the SIS 3.0 (Cronbach's alpha) was obtained, and all domains had good coefficients, above the threshold of 0.70. Test-retest reliability of SIS 3.0 was assessed with the correlation (Spearman's rho) between the same domain scores obtained on the first and second assessments. Results were above 0.5, with the exception of social participation and mobility. Concurrent validity of K-SIS 3.0 was assessed using the SF-36 and other scales with the same or similar domains. Each domain of K-SIS 3.0 had a positive correlation with the corresponding similar domain of SF-36 and other scales (HADS, MMSE, and NIHSS). The newly developed K-SIS 3.0 showed high inter-intra reliability and test-retest reliabilities, together with high concurrent validity with the original and various other scales, for patients with stroke. K-SIS 3.0 can therefore be used for stroke patients, to assess their health-related quality of life and treatment efficacy.
The De-Escalating Aggressive Behaviour Scale: development and psychometric testing.
Nau, Johannes; Halfens, Ruud; Needham, Ian; Dassen, Theo
2009-09-01
This paper is a report of a study to develop and test the psychometric properties of a scale measuring nursing students' performance in de-escalation of aggressive behaviour. Successful training should lead not merely to more knowledge and amended attitudes but also to improved performance. However, the quality of de-escalation performance is difficult to assess. Based on a qualitative investigation, seven topics pertaining to de-escalating behaviour were identified and the wording of items tested. The properties of the items and the scale were investigated quantitatively. A total of 1748 performance evaluations by students (rater group 1) from a skills laboratory were used to check distribution and conduct a factor analysis. Likewise, 456 completed evaluations by de-escalation experts (rater group 2) of videotaped performances at pre- and posttest were used to investigate internal consistency, interrater reliability, test-retest reliability, effect size and factor structure. Data were collected in 2007-2008 in German. Factor analysis showed a unidimensional 7-item scale with factor loadings ranging from 0.55 to 0.81 (rater group 1) and 0.48 to 0.88 (rater group 2). Cronbach's alphas of 0.87 and 0.88 indicated good internal consistency irrespective of rater group. A Pearson's r of 0.80 confirmed acceptable test-retest reliability, and interrater reliability Intraclass Correlation 3 ranging from 0.77 to 0.93 also showed acceptable results. The effect size r of 0.53 plus Cohen's d of 1.25 indicates the capacity of the scale to detect changes in performance. Further research is needed to test the English version of the scale and its validity.
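Internal consistency of a multi-item scale such as the 7-item scale above is commonly summarised with Cronbach's alpha. The sketch below uses the standard formula alpha = k/(k − 1) × (1 − Σ item variances / variance of total scores); the function name and the data layout are assumptions for illustration, and no study data are reproduced.

```python
from statistics import variance


def cronbach_alpha(items):
    """items[i][p] = score of person p on item i (hypothetical layout).
    Uses sample variances throughout."""
    k = len(items)
    # Total score per person: sum down each column of the item matrix
    totals = [sum(person) for person in zip(*items)]
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))
```

Alpha approaches 1 when the items rise and fall together across persons, and drops as the items become less correlated.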
Salyers, M P; McHugo, G J; Cook, J A; Razzano, L A; Drake, R E; Mueser, K T
2001-09-01
Reliability of well-known instruments was examined in 202 people with severe mental illness participating in a multisite vocational study. We examined interrater reliability of the Positive and Negative Syndrome Scale (PANSS) and the internal consistency and test-retest reliability of the PANSS, the Rosenberg Self-Esteem Scale, the Medical Outcomes Study Short Form-36 (SF-36), and the Quality of Life Interview. Most scales had good levels of reliability, with intraclass correlation coefficients (ICCs) and coefficient alphas above .70. However, the SF-36 scales were generally less stable over time, particularly Social Functioning (ICC = .55). Test-retest reliability was lower among less educated respondents and among ethnic minorities. We recommend close monitoring of psychometric issues in future multisite studies.
Medina-Mirapeix, Francesc; Vivo-Fernández, Iván; López-Cañizares, Juan; García-Vidal, José A; Benítez-Martínez, Josep Carles; Del Baño-Aledo, María Elena
2018-01-01
The objective was to determine the inter-observer and test-retest reliability of the "Five-repetition sit-to-stand" (5STS) test in patients with total knee replacement (TKR), and to explore the correlation between 5STS and two mobility tests. A reliability study was conducted among 24 (mean age 72.13, S.D. 10.67; 50% were women) outpatients with TKR. They were recruited from a traumatology unit of a public hospital via convenience sampling. A physiotherapist and trauma physician assessed each patient at the same time. The same physiotherapist performed a second 5STS measurement 45-60 min after the first one. Reliability was assessed with intraclass correlation coefficients (ICCs) and Bland-Altman plots. Pearson coefficient was calculated to assess the correlation between 5STS, timed up and go test (TUG) and four meters gait speed (4MGS). ICC for inter-observer and test-retest reliability of the 5STS were 0.998 (95% confidence interval [CI], 0.995-0.999) and 0.982 (95% CI, 0.959-0.992). Bland-Altman plot inter-observer showed limits between -0.82 and 1.06 with a mean of 0.11 and no heteroscedasticity within the data. Bland-Altman plot for test-retest showed the limits between 1.76 and 4.16, a mean of 1.20 and heteroscedasticity within the data. Pearson correlation coefficient revealed significant correlation between 5STS and TUG (r=0.7, p<0.001) and 4MGS (r=-0.583, p=0.003). This study demonstrates excellent inter-observer and test-retest reliability of the 5STS when it is used in people with TKR, and also significant correlation with other functional mobility tests. These findings support the use of 5STS as outcome measure in TKR population. Copyright © 2017 Elsevier B.V. All rights reserved.
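The Bland-Altman limits mentioned above are conventionally computed as the mean of the paired differences plus or minus 1.96 times their standard deviation. A minimal sketch under that assumption follows; the function name is hypothetical and the abstract does not describe the exact computation used.

```python
from statistics import mean, stdev


def bland_altman_limits(first, second, z=1.96):
    """Return (bias, lower limit, upper limit) for paired measurements.
    bias is the mean of the differences first - second."""
    diffs = [a - b for a, b in zip(first, second)]
    bias = mean(diffs)
    spread = z * stdev(diffs)  # sample SD of the differences
    return bias, bias - spread, bias + spread
```

The limits are symmetric around the bias by construction, so a reported mean difference would be expected to fall midway between the lower and upper limit.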
Validation of a pregnancy planning measure for Arabic-speaking women.
Almaghaslah, Eman; Rochat, Roger; Farhat, Ghada
2017-01-01
The prevalence of unplanned pregnancy in Saudi Arabia has not been thoroughly investigated. To conduct a psychometric evaluation study of the Arabic version of the London Measure of Unplanned Pregnancy (LMUP). To evaluate the psychometric properties of the LMUP, we conducted a self-administered online survey among 796 ever-married Saudi women aged 20-49 years, and a re-test survey among 24 women. The psychometric properties evaluated included content validity measured by content validity index (CVI), structural validity assessed by exploratory factor analysis (EFA), substantive validity assessed by hypothesis testing, contextual stability for the test-retest assessed by weighted Kappa, and internal consistency assessed by Cronbach's alpha. The psychometric analysis of the Arabic version of LMUP exhibited valid and reliable properties. The CVIs for individual items and at the scale level were >0.7. EFA confirmed a unidimensional extraction of the scale item. Hypothesis testing confirmed expected associations. The tool was stable with weighted kappa = 0.78 and Cronbach's alpha = 0.88. In this study, the validity and reliability of the Arabic version of the LMUP were confirmed according to well-known psychometric criteria. This LMUP version can be used in research studies among Arabic-speaking women to measure unplanned pregnancy and investigate correlates and outcomes related to unplanned pregnancy.
Evaluating the use of in-store measures in retail food stores and restaurants in Brazil.
Duran, Ana Clara; Lock, Karen; Latorre, Maria do Rosario D O; Jaime, Patricia Constante
2015-01-01
To assess inter-rater reliability, test-retest reliability, and construct validity of retail food store, open-air food market, and restaurant observation tools adapted to the Brazilian urban context. This study is part of a cross-sectional observation survey conducted in 13 districts across the city of Sao Paulo, Brazil in 2010-2011. Food store and restaurant observational tools were developed based on previously available tools, and then tested. They included measures on the availability, variety, quality, pricing, and promotion of fruits and vegetables and ultra-processed foods. We used Kappa statistics and intra-class correlation coefficients to assess inter-rater and test-retest reliabilities in samples of 142 restaurants, 97 retail food stores (including open-air food markets), and of 62 restaurants and 45 retail food stores (including open-air food markets), respectively. Construct validity, defined as the tools' ability to discriminate based on store types and different income contexts, was assessed in the entire sample: 305 retail food stores, 8 fruits and vegetable markets, and 472 restaurants. Inter-rater and test-retest reliability were generally high, with most Kappa values greater than 0.70 (range 0.49-1.00). Both tools discriminated between store types and neighborhoods with different median income. Fruits and vegetables were more likely to be found in middle to higher-income neighborhoods, while soda, fruit-flavored drink mixes, cookies, and chips were cheaper and more likely to be found in lower-income neighborhoods. The measures were reliable and able to reveal significant differences across store types and different contexts. Although some items may require revision, results suggest that the tools may be used to reliably measure the food stores and restaurant food environment in urban settings of middle-income countries. Such studies can help inform health promotion interventions and policies in these contexts.
Evaluating the use of in-store measures in retail food stores and restaurants in Brazil
Duran, Ana Clara; Lock, Karen; Latorre, Maria do Rosario D O; Jaime, Patricia Constante
2015-01-01
ABSTRACT OBJECTIVE To assess inter-rater reliability, test-retest reliability, and construct validity of retail food store, open-air food market, and restaurant observation tools adapted to the Brazilian urban context. METHODS This study is part of a cross-sectional observation survey conducted in 13 districts across the city of Sao Paulo, Brazil in 2010-2011. Food store and restaurant observational tools were developed based on previously available tools, and then tested. They included measures on the availability, variety, quality, pricing, and promotion of fruits and vegetables and ultra-processed foods. We used Kappa statistics and intra-class correlation coefficients to assess inter-rater and test-retest reliabilities in samples of 142 restaurants, 97 retail food stores (including open-air food markets), and of 62 restaurants and 45 retail food stores (including open-air food markets), respectively. Construct validity, defined as the tools' ability to discriminate based on store types and different income contexts, was assessed in the entire sample: 305 retail food stores, 8 fruits and vegetable markets, and 472 restaurants. RESULTS Inter-rater and test-retest reliability were generally high, with most Kappa values greater than 0.70 (range 0.49-1.00). Both tools discriminated between store types and neighborhoods with different median income. Fruits and vegetables were more likely to be found in middle to higher-income neighborhoods, while soda, fruit-flavored drink mixes, cookies, and chips were cheaper and more likely to be found in lower-income neighborhoods. CONCLUSIONS The measures were reliable and able to reveal significant differences across store types and different contexts. Although some items may require revision, results suggest that the tools may be used to reliably measure the food stores and restaurant food environment in urban settings of middle-income countries. Such studies can help inform health promotion interventions and policies in these contexts. PMID:26538101
Ruschel, Caroline; Haupenthal, Alessandro; Jacomel, Gabriel Fernandes; Fontana, Heiliane de Brito; Santos, Daniela Pacheco dos; Scoz, Robson Dias; Roesler, Helio
2015-05-20
Isometric muscle strength of knee extensors has been assessed for estimating performance, evaluating progress during physical training, and investigating the relationship between isometric and dynamic/functional performance. To assess the validity and reliability of an adapted leg-extension machine for measuring isometric knee extensor force. Validity (concurrent approach) and reliability (test and test-retest approach) study. University laboratory. 70 healthy men and women aged between 20 and 30 y (39 in the validity study and 31 in the reliability study). Intraclass correlation coefficient (ICC) values calculated for the maximum voluntary isometric torque of knee extensors at 30°, 60°, and 90°, measured with the prototype and with an isokinetic dynamometer (ICC2,1, validity study) and measured with the prototype in test and retest sessions, scheduled from 48 h to 72 h apart (ICC1,1, reliability study). In the validity analysis, the prototype showed good agreement for measurements at 30° (ICC2,1 = .75, SEM = 18.2 Nm) and excellent agreement for measurements at 60° (ICC2,1 = .93, SEM = 9.6 Nm) and at 90° (ICC2,1 = .94, SEM = 8.9 Nm). Regarding the reliability analysis, between-days' ICC1,1 were good to excellent, ranging from .88 to .93. Standard error of measurement and minimal detectable difference based on test-retest ranged from 11.7 Nm to 18.1 Nm and 32.5 Nm to 50.1 Nm, respectively, for the 3 analyzed knee angles. The analysis of validity and repeatability of the prototype for measuring isometric muscle strength has shown to be good or excellent, depending on the knee joint angle analyzed. The new instrument, which presents a relative low cost and easiness of transportation when compared with an isokinetic dynamometer, is valid and provides consistent data concerning isometric strength of knee extensors and, for this reason, can be used for practical, clinical, and research purposes.
Mahdavi, Mohammad Ebrahim; Pourbakht, Akram; Parand, Akram; Jalaie, Shohreh
2018-03-01
Evaluation of dichotic listening to digits is a common part of many studies for diagnosis and managing auditory processing disorders in children. Previous researchers have verified test-retest relative reliability of dichotic digits results in normal children and adults. However, detecting intervention-related changes in the ear scores after dichotic listening training requires information regarding trial-to-trial typical variation of individual ear scores that is estimated using indices of absolute reliability. Previous studies have not addressed absolute reliability of dichotic listening results. To compare the results of the Persian randomized dichotic digits test (PRDDT) and its relative and absolute indices of reliability between typical achieving (TA) and learning-disabled (LD) children. A repeated measures observational study. Fifteen LD children were recruited from a previously performed study with age range of 7-12 yr. The control group consisted of 15 TA schoolchildren with age range of 8-11 yr. The Persian randomized dichotic digits test was administered on the children under free recall condition in two test sessions 7-12 days apart. We compared the average of the ear scores and ear advantage between TA and LD children. Relative indices of reliability included Pearson's correlation and intraclass correlation (ICC 2,1 ) coefficients and absolute reliability was evaluated by calculation of standard error of measurement (SEM) and minimal detectable change (MDC) using the raw ear scores. The Pearson correlation coefficient indicated that in both groups of children the ear scores of test and retest sessions were strongly and positively (greater than +0.8) correlated. The ear scores showed excellent ICC coefficient of consistency (0.78-0.82) and fair to excellent ICC coefficient of absolute agreement (0.62-0.74) in TA children and excellent ICC coefficients of consistency and absolute agreement in LD children (0.76-0.87). 
SEM and SEM% of the ear scores in TA children were 1.46 and 1.44% for the right ear and 4.68 and 5.47% for the left ear. SEM and SEM% of the ear scores in LD children were 4.55 and 5.88% for the right ear to 7.56 and 12.81% for the left ear. MDC and MDC% of the ear scores in TA children varied from 4.03 and 3.99% for the right ear to 12.93 and 15.13% for the left ear. MDC and MDC% of the ear scores in LD children varied from 12.57 and 16.25% for the right ear to 20.89 and 35.39% for the left ear. The LD children indicated test-retest relative reliability as high as TA children in the ear scores measured by PRDDT. However, within-subject variations of the ear scores calculated by indices of absolute reliability were considerably higher in LD children versus TA children. The results of the current study could have implications for detecting real training-related changes in the ear scores. American Academy of Audiology
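The distinction drawn above between ICC coefficients of consistency and of absolute agreement can be made concrete with a small sketch. It assumes the usual single-measure formulas derived from a two-way ANOVA decomposition (subjects × sessions); the function name and data layout are hypothetical, and the abstract does not confirm the exact computation used.

```python
def icc_two_way(scores):
    """scores[i][j] = score of subject i in session j.
    Returns (consistency ICC, absolute-agreement ICC), single measures."""
    n, k = len(scores), len(scores[0])
    grand = sum(map(sum, scores)) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(col) / n for col in zip(*scores)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between sessions
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_err = ss_total - ss_rows - ss_cols                    # residual

    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))

    consistency = (msr - mse) / (msr + (k - 1) * mse)
    agreement = (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
    return consistency, agreement
```

A systematic shift between sessions (for example, every child scoring a few points higher at retest) leaves the consistency coefficient untouched but lowers the absolute-agreement coefficient, which is one possible reading of a gap between the two.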
John, Andrew B; Kreisman, Brian M
2017-09-01
Extended high-frequency (EHF) audiometry is useful for evaluating ototoxic exposures and may relate to speech recognition, localisation and hearing aid benefit. There is a need to determine whether common clinical practice for EHF audiometry using tone and noise stimuli is reliable. We evaluated equivalence and compared test-retest (TRT) reproducibility for audiometric thresholds obtained using pure tones and narrowband noise (NBN) from 0.25 to 16 kHz. Thresholds and test-retest reproducibility for stimuli in the conventional (0.25-6 kHz) and EHF (8-16 kHz) frequency ranges were compared in a repeated-measures design. A total of 70 ears of adults with normal hearing. Thresholds obtained using NBN were significantly lower than thresholds obtained using pure tones from 0.5 to 16 kHz, but not 0.25 kHz. Good TRT reproducibility (within 2 dB) was observed for both stimuli at all frequencies. Responses at the lower limit of the presentation range for NBN centred at 14 and 16 kHz suggest unreliability for NBN as a threshold stimulus at these frequencies. Thresholds in the conventional and EHF ranges showed good test-retest reproducibility, but differed between stimulus types. Care should be taken when comparing pure-tone thresholds with NBN thresholds especially at these frequencies.
Salavati, M; Krijnen, W P; Rameckers, E A A; Looijestijn, P L; Maathuis, C G B; van der Schans, C P; Steenbergen, B
2015-01-01
The aims of this study were to adapt the Gross Motor Function Measure-88 (GMFM-88) for children with Cerebral Palsy (CP) and Cerebral Visual Impairment (CVI) and to determine the test-retest and interobserver reliability of the adapted version. Sixteen paediatric physical therapists familiar with CVI participated in the adaptation process. The Delphi method was used to gain consensus among a panel of experts. Seventy-seven children with CP and CVI (44 boys and 33 girls, aged between 50 and 144 months) participated in this study. To assess test-retest and interobserver reliability, the GMFM-88 was administered twice within three weeks (Mean=9 days, SD=6 days) by trained paediatric physical therapists, one of whom was familiar with the child and one who wasn't. Percentages of identical scores, Cronbach's alphas and intraclass correlation coefficients (ICC) were computed for each dimension level. All experts agreed on the proposed adaptations of the GMFM-88 for children with CP and CVI. Test-retest reliability ICCs for dimension scores were between 0.94 and 1.00, mean percentages of identical scores between 29 and 71, and interobserver reliability ICCs of the adapted GMFM-88 were 0.99-1.00 for dimension scores. Mean percentages of identical scores varied between 53 and 91. Test-retest and interobserver reliability of the GMFM-88-CVI for children with CP and CVI was excellent. Internal consistency of dimension scores lay between 0.97 and 1.00. The psychometric properties of the adapted GMFM-88 for children with CP and CVI are reliable and comparable to the original GMFM-88. Copyright © 2015 Elsevier Ltd. All rights reserved.
Almarwani, Maha; Perera, Subashan; VanSwearingen, Jessie M; Sparto, Patrick J; Brach, Jennifer S
2016-02-01
Gait variability is a marker of gait performance and future mobility status in older adults. Reliability of gait variability has been examined mainly in community dwelling older adults who are likely to fluctuate over time. The purpose of this study was to compare test-retest reliability and determine minimal detectable change (MDC) of spatial and temporal gait variability in younger and older adults. Forty younger (mean age=26.6 ± 6.0 years) and 46 older adults (mean age=78.1 ± 6.2 years) were included in the study. Gait characteristics were measured twice, approximately 1 week apart, using a computerized walkway (GaitMat II). Participants completed 4 passes on the GaitMat II at their self-selected walking speed. Test-retest reliability was calculated using intra-class correlation coefficients (ICCs(2,1)), 95% limits of agreement (95% LoA) in conjunction with Bland-Altman plots, relative limits of agreement (LoA%) and standard error of measurement (SEM). The MDC at the 90% and 95% levels was also calculated. ICCs of gait variability ranged from 0.26 to 0.65 in younger and from 0.28 to 0.74 in older adults. The LoA% and SEM were consistently higher (i.e. less reliable) for all gait variables in older compared to younger adults except SEM for step width. The MDC was consistently larger for all gait variables in older compared to younger adults except step width. ICCs were of limited utility due to restricted ranges in younger adults. Based on absolute reliability measures and MDC, younger adults had greater test-retest reliability and smaller MDC of spatial and temporal gait variability compared to older adults. Copyright © 2015 Elsevier B.V. All rights reserved.
The Pareidolia Test: A Simple Neuropsychological Test Measuring Visual Hallucination-Like Illusions
Mamiya, Yasuyuki; Nishio, Yoshiyuki; Watanabe, Hiroyuki; Yokoi, Kayoko; Uchiyama, Makoto; Baba, Toru; Iizuka, Osamu; Kanno, Shigenori; Kamimura, Naoto; Kazui, Hiroaki; Hashimoto, Mamoru; Ikeda, Manabu; Takeshita, Chieko; Shimomura, Tatsuo; Mori, Etsuro
2016-01-01
Background: Visual hallucinations are a core clinical feature of dementia with Lewy bodies (DLB), and this symptom is important in the differential diagnosis and prediction of treatment response. The pareidolia test is a tool that evokes visual hallucination-like illusions, and these illusions may be a surrogate marker of visual hallucinations in DLB. We created a simplified version of the pareidolia test and examined its validity and reliability to establish the clinical utility of this test. Methods: The pareidolia test was administered to 52 patients with DLB, 52 patients with Alzheimer’s disease (AD) and 20 healthy controls (HCs). We assessed the test-retest/inter-rater reliability using the intra-class correlation coefficient (ICC) and the concurrent validity using the Neuropsychiatric Inventory (NPI) hallucinations score as a reference. A receiver operating characteristic (ROC) analysis was used to evaluate the sensitivity and specificity of the pareidolia test to differentiate DLB from AD and HCs. Results: The pareidolia test required approximately 15 minutes to administer, exhibited good test-retest/inter-rater reliability (ICC of 0.82), and moderately correlated with the NPI hallucinations score (rs = 0.42). Using an optimal cut-off score set according to the ROC analysis, the pareidolia test differentiated DLB from AD with a sensitivity of 81% and a specificity of 92%. Conclusions: Our study suggests that the simplified version of the pareidolia test is a valid and reliable surrogate marker of visual hallucinations in DLB. PMID:27171377
ERIC Educational Resources Information Center
Anderson, Daniel; Lai, Cheng-Fei; Park, Bitnara Jasmine; Alonzo, Julie; Tindal, Gerald
2012-01-01
This technical report is one in a series of five describing the reliability (test/retest and alternate form) and G-Theory/D-Study research on the easyCBM reading measures, grades 1-5. Data were gathered in the spring of 2011 from a convenience sample of students nested within classrooms at a medium-sized school district in the Pacific Northwest. Due to…
ERIC Educational Resources Information Center
Lai, Cheng-Fei; Park, Bitnara Jasmine; Anderson, Daniel; Alonzo, Julie; Tindal, Gerald
2012-01-01
This technical report is one in a series of five describing the reliability (test/retest and alternate form) and G-Theory/D-Study research on the easyCBM reading measures, grades 1-5. Data were gathered in the spring of 2011 from a convenience sample of students nested within classrooms at a medium-sized school district in the Pacific Northwest.…
ERIC Educational Resources Information Center
Park, Bitnara Jasmine; Anderson, Daniel; Alonzo, Julie; Lai, Cheng-Fei; Tindal, Gerald
2012-01-01
This technical report is one in a series of five describing the reliability (test/retest and alternate form) and G-Theory/D-Study research on the easyCBM reading measures, grades 1-5. Data were gathered in the spring of 2011 from a convenience sample of students nested within classrooms at a medium-sized school district in the Pacific Northwest.…
Alanazi, Fahad; Gleeson, Peggy; Olson, Sharon; Roddey, Toni
2017-04-01
Prospective cohort study of a cross-cultural low back pain (LBP) questionnaire. The objectives of the present study were to translate and cross-culturally adapt the Fear-Avoidance Beliefs Questionnaire (FABQ) to create a version in Arabic and to test its psychometric properties. The FABQ measures the effects that fear and avoidance beliefs have on work and on physical activity. An FABQ cross-culturally adapted for Arabic readers and speakers was created by forward translation, translation synthesis, and backward translation. Forty patients in Riyadh, Saudi Arabia, with LBP evaluated use of the questionnaire, and 70 patients from the same hospital participated in reliability, validity, and sensitivity studies. To determine test-retest reliability of the Arabic FABQ, patients completed it twice within 48 hours without receiving any active treatment between these two sessions. Patients completed the Arabic FABQ (and three other scales) at baseline and 14 days later to determine its validity and sensitivity. Test-retest reliability was good (FABQ-work: intraclass coefficient [ICC] = 0.74; FABQ-physical activity: ICC = 0.90; FABQ overall: ICC = 0.76). Correlations between the FABQ and three other instruments for measuring pain and disability were weak. The strongest correlation was found at the follow-up session with the Arabic Oswestry Questionnaire (r = 0.283; P ≤ 0.05). Sensitivity to change was low. The translation and adaptation of the Arabic version of the FABQ was successful. Overall, the Arabic FABQ had good test-retest reliability, acceptable construct validity, and low sensitivity to change. The Arabic version of the FABQ shows promise in the assessment of fear-avoidance beliefs among patients with LBP who speak and read Arabic. Level of Evidence: 3.
Palmer, Kara K.
2017-01-01
Assessing children’s perceptions of their movement abilities (i.e., perceived competence) is traditionally done using picture scales—Pictorial Scale of Perceived Competence and Acceptance for Young Children or Pictorial Scale of Perceived Movement Skill Competence. Pictures fail to capture the temporal components of movement. To address this limitation, we created a digital-based instrument to assess perceived motor competence: the Digital Scale of Perceived Motor Competence. The purpose of this study was to determine the validity, reliability, and internal consistency of the Digital-based Scale of Perceived Motor Skill Competence. The Digital-based Scale of Perceived Motor Skill Competence is based on the twelve fundamental motor skills from the Test of Gross Motor Development-2nd Edition with a similar layout and item structure as the Pictorial Scale of Perceived Movement Skill Competence. Face Validity of the instrument was examined in Phase I (n = 56; Mage = 8.6 ± 0.7 years, 26 girls). Test-retest reliability and internal consistency were assessed in Phase II (n = 54, Mage = 8.7 years ± 0.5 years, 26 girls). Intra-class correlations (ICC) and Cronbach’s alpha were conducted to determine test-retest reliability and internal consistency for all twelve skills along with locomotor and object control subscales. The Digital Scale of Perceived Motor Competence demonstrates excellent test-retest reliability (ICC = 0.83, total; ICC = 0.77, locomotor; ICC = 0.79, object control) and acceptable/good internal consistency (α = 0.62, total; α = 0.57, locomotor; α = 0.49, object control). Findings provide evidence of the reliability of the three level digital-based instrument of perceived motor competence for older children. PMID:29910408
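Internal consistency in this abstract is reported as Cronbach's alpha. The sketch below shows the textbook formula, alpha = k/(k-1) · (1 − Σ item variances / variance of total scores); the function name `cronbach_alpha` and the toy responses are illustrative assumptions, not data from the study.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(responses[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in responses]) for j in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1.0 - sum(item_vars) / total_var)


# Hypothetical two-item scale on which both items always agree.
alpha_identical = cronbach_alpha([[1, 1], [2, 2], [3, 3]])
```

Alpha reaches 1 only when items covary perfectly; it falls as item scores become less related to one another, and it also depends on the number of items in the scale.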
Tan, Christine L; Hassali, Mohamed A; Saleem, Fahad; Shafie, Asrul A; Aljadhey, Hisham; Gan, Vincent B
2015-01-01
(i) To develop the Pharmacy Value-Added Services Questionnaire (PVASQ) using emerging themes generated from interviews. (ii) To establish reliability and validity of questionnaire instrument. Using an extended Theory of Planned Behavior as the theoretical model, face-to-face interviews generated salient beliefs of pharmacy value-added services. The PVASQ was constructed initially in English incorporating important themes and later translated into the Malay language with forward and backward translation. Intention (INT) to adopt pharmacy value-added services is predicted by attitudes (ATT), subjective norms (SN), perceived behavioral control (PBC), knowledge and expectations. Using a 7-point Likert-type scale and a dichotomous scale, test-retest reliability (N=25) was assessed by administering the questionnaire instrument twice, one week apart. Internal consistency was measured by Cronbach's alpha and construct validity between two administrations was assessed using the kappa statistic and the intraclass correlation coefficient (ICC). Confirmatory Factor Analysis, CFA (N=410) was conducted to assess construct validity of the PVASQ. The kappa coefficients indicate a moderate to almost perfect strength of agreement between test and retest. The ICC for all scales tested for intra-rater (test-retest) reliability was good. The overall Cronbach's alpha (N=25) is 0.912 and 0.908 for the two time points. The result of CFA (N=410) showed most items loaded strongly and correctly into corresponding factors. Only one item was eliminated. This study is the first to develop and establish the reliability and validity of the Pharmacy Value-Added Services Questionnaire instrument using the Theory of Planned Behavior as the theoretical model. The translated Malay language version of PVASQ is reliable and valid to predict Malaysian patients' intention to adopt pharmacy value-added services to collect partial medicine supply.
Lee, Myungmo; Song, Changho; Lee, Kyoungjin; Shin, Doochul; Shin, Seungho
2014-07-14
Treadmill gait analysis was more advantageous than over-ground walking because it allowed continuous measurements of the gait parameters. The purpose of this study was to investigate the concurrent validity and the test-retest reliability of the OPTOGait photoelectric cell system against the treadmill-based gait analysis system by assessing spatio-temporal gait parameters. Twenty-six stroke patients and 18 healthy adults were asked to walk on the treadmill at their preferred speed. The concurrent validity was assessed by comparing data obtained from the 2 systems, and the test-retest reliability was determined by comparing data obtained from the 1st and the 2nd session of the OPTOGait system. The concurrent validity, identified by the intra-class correlation coefficients (ICC [2, 1]), coefficients of variation (CVME), and 95% limits of agreement (LOA) for the spatial-temporal gait parameters, were excellent but the temporal parameters expressed as a percentage of the gait cycle were poor. The test-retest reliability of the OPTOGait System, identified by ICC (3, 1), CVME, 95% LOA, standard error of measurement (SEM), and minimum detectable change (MDC95%) for the spatio-temporal gait parameters, was high. These findings indicated that the treadmill-based OPTOGait System had strong concurrent validity and test-retest reliability. This portable system could be useful for clinical assessments.
Slagers, Anton J; Reininga, Inge H F; van den Akker-Scheek, Inge
2017-02-01
The ACL-Return to Sport after Injury scale (ACL-RSI) measures athletes' emotions, confidence in performance, and risk appraisal in relation to return to sport after ACL reconstruction. The aim of this study was to assess the validity and reliability of the Dutch version of the ACL-RSI (ACL-RSI(NL)). A total of 150 patients, who were 3-16 months postoperative, completed the ACL-RSI(NL) and 5 other questionnaires regarding psychological readiness to return to sports, knee-specific physical functioning, kinesiophobia, and health-specific locus of control. Construct validity of the ACL-RSI(NL) was determined with factor analysis and by exploring 10 hypotheses regarding correlations between ACL-RSI(NL) and the other questionnaires. For test-retest reliability, 107 patients (5-16 months postoperative) completed the ACL-RSI(NL) again 2 weeks after the first administration. Cronbach's alpha, Intraclass Correlation Coefficient (ICC), SEM, and SDC were calculated. Bland-Altman analysis was conducted to assess bias between test and retest. Nine hypotheses (90%) were confirmed, indicating good construct validity. The ACL-RSI(NL) showed good internal consistency (Cronbach's alpha 0.94) and test-retest reliability (ICC 0.93). SEM was 5.5 and SDC was 15. A significant bias of 3.2 points between test and retest was found. Therefore, the ACL-RSI(NL) can be used to investigate psychological factors relevant to returning to sport after ACL reconstruction.
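The reported SEM of 5.5 and SDC of 15 are consistent with the commonly used relation SDC = 1.96 · √2 · SEM, although the abstract does not state which formula was applied. A minimal sketch under that assumption (function names are hypothetical):

```python
import math


def sem_from_icc(sd, icc):
    """Standard error of measurement from a between-subject SD and an ICC."""
    return sd * math.sqrt(1.0 - icc)


def smallest_detectable_change(sem, z=1.96):
    """Smallest detectable change at the individual level (z = 1.96 for 95%)."""
    return z * math.sqrt(2.0) * sem


# Using the SEM reported in the abstract.
sdc = smallest_detectable_change(5.5)
print(round(sdc, 1))  # 15.2
```

The √2 factor reflects that a change score combines the measurement error of two administrations.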
Zheng, Jing; You, Li-Ming; Lou, Tan-Qi; Chen, Nian-Chang; Lai, De-Yuan; Liang, Yan-Yi; Li, Ying-Na; Gu, Ying-Ming; Lv, Shao-Fen; Zhai, Cui-Qiu
2010-02-01
Perceptions of exercise benefits and barriers affect exercise behavior. Because of the clinical course and treatment, dialysis patients differ from the general population in their perceptions of exercise benefits and barriers, especially the latter. At present, no valid instruments for assessing perceived exercise benefits and barriers in dialysis patients are available. Our goal was to develop and test the psychometric properties of the Dialysis patient-perceived Exercise Benefits and Barriers Scale (DPEBBS). A literature review and two focus groups were conducted to generate the initial item pool. An expert panel examined the content validity. Then, 269 Chinese hemodialysis patients were recruited by convenience sampling. Exploratory and confirmatory factor analyses were used to test construct validity. Finally, internal consistency and test-retest reliability were assessed. The expert panel determined that the content validity index was satisfactory. The final 24-item scale consisted of six factors explaining 57% of the total variance in the data. Confirmatory factor analysis supported the six-factor structure and a higher-order model. Cronbach's alpha was 0.87 for the total scale, and the test-retest reliability coefficient was 0.84. The DPEBBS was a valid and reliable instrument for evaluating dialysis patients' perceived benefits and barriers to exercise. The application value of this scale remains to be investigated by increasing the sample size and evaluating patients undergoing different dialysis modalities and coming from different regions and cultural backgrounds. Copyright 2009 Elsevier Ltd. All rights reserved.
Mai, Zhi-Ming; Lin, Jia-Huang; Chiang, Shing-Chun; Ngan, Roger Kai-Cheong; Kwong, Dora Lai-Wan; Ng, Wai-Tong; Ng, Alice Wan-Ying; Yuen, Kam-Tong; Ip, Kai-Ming; Chan, Yap-Hang; Lee, Anne Wing-Mui; Ho, Sai-Yin; Lung, Maria Li; Lam, Tai-Hing
2018-05-04
We evaluated the reliability of early life nasopharyngeal carcinoma (NPC) aetiology factors in the questionnaire of an NPC case-control study in Hong Kong during 2014-2017. 140 subjects aged 18+ completed the same computer-assisted questionnaire twice, separated by at least 2 weeks. The questionnaire included most known NPC aetiology factors and the present analysis focused on early life exposure. Test-retest reliability of all the 285 questionnaire items was assessed in all subjects and in 5 subgroups defined by cases/controls, sex, time between 1st and 2nd questionnaire (2-29/≥30 weeks), education (secondary or less/postsecondary), and age (25-44/45-59/60+ years) at the first questionnaire. The reliability of items on dietary habits, body figure, skin tone and sun exposure in early life periods (age 6-12 and 13-18) was moderate-to-almost perfect, and most other items had fair-to-substantial reliability in all life periods (age 6-12, 13-18 and 19-30, and 10 years ago). Differences in reliability by strata of the 5 subgroups were only observed in a few items. This study is the first to report the reliability of an NPC questionnaire, and make the questionnaire available online. Overall, our questionnaire had acceptable reliability, suggesting that previous NPC study results on the same risk factors would have similar reliability.
Reliability and Validity of Ten Consumer Activity Trackers Depend on Walking Speed.
Fokkema, Tryntsje; Kooiman, Thea J M; Krijnen, Wim P; van der Schans, Cees P; de Groot, Martijn
2017-04-01
To examine the test-retest reliability and validity of ten activity trackers for step counting at three different walking speeds. Thirty-one healthy participants walked twice on a treadmill for 30 min while wearing 10 activity trackers (Polar Loop, Garmin Vivosmart, Fitbit Charge HR, Apple Watch Sport, Pebble Smartwatch, Samsung Gear S, Misfit Flash, Jawbone Up Move, Flyfit, and Moves). Participants walked at three speeds for 10 min each: slow (3.2 km·h⁻¹), average (4.8 km·h⁻¹), and vigorous (6.4 km·h⁻¹). To measure test-retest reliability, intraclass correlations (ICC) were determined between the first and second treadmill test. Validity was determined by comparing the trackers with the gold standard (hand counting), using mean differences, mean absolute percentage errors, and ICC. Statistical differences were calculated by paired-sample t tests, Wilcoxon signed-rank tests, and by constructing Bland-Altman plots. Test-retest reliability varied with ICC ranging from -0.02 to 0.97. Validity varied between trackers and different walking speeds with mean differences between the gold standard and activity trackers ranging from 0.0 to 26.4%. Most trackers showed relatively low ICC and broad limits of agreement of the Bland-Altman plots at the different speeds. For the slow walking speed, the Garmin Vivosmart and Fitbit Charge HR showed the most accurate results. The Garmin Vivosmart and Apple Watch Sport demonstrated the best accuracy at an average walking speed. For vigorous walking, the Apple Watch Sport, Pebble Smartwatch, and Samsung Gear S exhibited the most accurate results. Test-retest reliability and validity of activity trackers depend on walking speed. In general, consumer activity trackers perform better at an average and vigorous walking speed than at a slower walking speed.
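Two of the validity statistics named in this abstract, the mean absolute percentage error and the Bland-Altman limits of agreement, can be sketched as follows. The function names and the step counts are hypothetical, and the 1.96 multiplier assumes approximately normally distributed differences.

```python
from statistics import mean, stdev


def mean_absolute_percentage_error(reference, device):
    """MAPE (%) of device counts against a reference (e.g. hand counting)."""
    errors = [abs(d - r) / r for r, d in zip(reference, device)]
    return 100.0 * mean(errors)


def bland_altman_limits(reference, device):
    """Return (bias, lower limit, upper limit) for device minus reference."""
    diffs = [d - r for r, d in zip(reference, device)]
    bias = mean(diffs)
    half_width = 1.96 * stdev(diffs)  # 95% limits of agreement
    return bias, bias - half_width, bias + half_width


# Hypothetical step counts for three walking bouts.
hand_counted = [100, 200, 300]
tracker = [110, 190, 300]
mape = mean_absolute_percentage_error(hand_counted, tracker)
bias, lower, upper = bland_altman_limits(hand_counted, tracker)
```

In this toy case the over- and under-counts cancel, so the bias is zero while the limits of agreement remain wide, which is why both statistics are usually reported together.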
Method matters: Understanding diagnostic reliability in DSM-IV and DSM-5.
Chmielewski, Michael; Clark, Lee Anna; Bagby, R Michael; Watson, David
2015-08-01
Diagnostic reliability is essential for the science and practice of psychology, in part because reliability is necessary for validity. Recently, the DSM-5 field trials documented lower diagnostic reliability than past field trials and the general research literature, resulting in substantial criticism of the DSM-5 diagnostic criteria. Rather than indicating specific problems with DSM-5, however, the field trials may have revealed long-standing diagnostic issues that have been hidden due to a reliance on audio/video recordings for estimating reliability. We estimated the reliability of DSM-IV diagnoses using both the standard audio-recording method and the test-retest method used in the DSM-5 field trials, in which different clinicians conduct separate interviews. Psychiatric patients (N = 339) were diagnosed using the SCID-I/P; 218 were diagnosed a second time by an independent interviewer. Diagnostic reliability using the audio-recording method (N = 49) was "good" to "excellent" (M κ = .80) and comparable to the DSM-IV field trials estimates. Reliability using the test-retest method (N = 218) was "poor" to "fair" (M κ = .47) and similar to DSM-5 field-trials' estimates. Despite low test-retest diagnostic reliability, self-reported symptoms were highly stable. Moreover, there was no association between change in self-report and change in diagnostic status. These results demonstrate the influence of method on estimates of diagnostic reliability. (c) 2015 APA, all rights reserved.
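The κ values in this abstract are chance-corrected agreement coefficients. As an illustration only, the sketch below computes unweighted Cohen's kappa for two interviewers assigning a present/absent diagnosis; the function name and the ratings are hypothetical and are not taken from the study.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters' categorical judgments.

    Undefined (division by zero) when chance agreement equals 1, i.e. when
    both raters use a single identical category for every case.
    """
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    # Agreement expected by chance from each rater's marginal frequencies.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    return (observed - expected) / (1.0 - expected)


# Hypothetical diagnoses (1 = present, 0 = absent) for ten patients.
first_interview = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
second_interview = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
kappa = cohens_kappa(first_interview, second_interview)
```

Here the two interviewers agree on 8 of 10 patients, yet kappa is well below 0.8 because roughly half of that agreement would be expected by chance alone.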
Test-Retest Reliability of a Novel Isokinetic Squat Device With Strength-Trained Athletes.
Bridgeman, Lee A; McGuigan, Michael R; Gill, Nicholas D; Dulson, Deborah K
2016-11-01
Bridgeman, LA, McGuigan, MR, Gill, ND, and Dulson, DK. Test-retest reliability of a novel isokinetic squat device with strength-trained athletes. J Strength Cond Res 30(11): 3261-3265, 2016. The aim of this study was to investigate the test-retest reliability of a novel multijoint isokinetic squat device. The subjects in this study were 10 strength-trained athletes. Each subject completed 3 maximal testing sessions to assess peak concentric and eccentric force (N) over a 3-week period using the Exerbotics squat device. Mean differences between eccentric and concentric force across the trials were calculated. Intraclass correlation coefficients (ICCs) and coefficients of variation (CVs) for the variables of interest were calculated using an Excel reliability spreadsheet. Between trials 1 and 2 an 11.0 and 2.3% increase in mean concentric and eccentric forces, respectively, was reported. Between trials 2 and 3 a 1.35% increase in the mean concentric force production and a 1.4% increase in eccentric force production was reported. The mean concentric peak force CV and ICC across the 3 trials was 10% (7.6-15.4) and 0.95 (0.87-0.98) respectively. However, the mean eccentric peak force CV and ICC across the trials was 7.2% (5.5-11.1) and 0.90 (0.76-0.97), respectively. Based on these findings it is suggested that the Exerbotics squat device shows good test-retest reliability. Therefore practitioners and investigators may consider its use to monitor changes in concentric and eccentric peak force.
A New Protocol to Evaluate the Effect of Topical Anesthesia
List, Thomas; Mojir, Katerina; Svensson, Peter; Pigg, Maria
2014-01-01
This double-blind, placebo-controlled, randomized cross-over clinical experimental study tested the reliability, validity, and sensitivity to change of punctate pain thresholds and self-reported pain on needle penetration. Female subjects without orofacial pain were tested in 2 sessions at 1- to 2-week intervals. The test site was the mucobuccal fold adjacent to the first upper right premolar. Active lidocaine hydrochloride 2% (Dynexan) or placebo gel was applied for 5 minutes, and sensory testing was performed before and after application. The standardized quantitative sensory test protocol included mechanical pain threshold (MPT), pressure pain threshold (PPT), mechanical pain sensitivity (MPS), and needle penetration sensitivity (NPS) assessments. Twenty-nine subjects, mean (SD) age 29.0 (10.2) years, completed the study. Test-retest reliability intraclass correlation coefficient at 10-minute intervals between examinations was MPT 0.69, PPT 0.79, MPS 0.72, and NPS 0.86. A high correlation was found between NPS and MPS (r = 0.84; P < .001), whereas NPS and PPT were not significantly correlated. The study found good to excellent test-retest reliability for all measures. None of the sensory measures detected changes in sensitivity following lidocaine 2% or placebo gel. Electronic von Frey assessments of MPT/MPS on oral mucosa have good validity. PMID:25517548
Reliability of provocative tests of motion sickness susceptibility
NASA Technical Reports Server (NTRS)
Calkins, D. S.; Reschke, M. F.; Kennedy, R. S.; Dunlop, W. P.
1987-01-01
Test-retest reliability values were derived from motion sickness susceptibility scores obtained from two successive exposures to each of three tests: (1) Coriolis sickness sensitivity test; (2) staircase velocity movement test; and (3) parabolic flight static chair test. The reliability of the three tests ranged from 0.70 to 0.88. Normalizing values from predictors with skewed distributions improved the reliability.
Nguyen, Allison M; Arbuckle, Rob; Korver, Tjeerd; Chen, Fang; Taylor, Beverley; Turnbull, Alice; Norquist, Josephine M
2017-08-01
The objective of this study was to evaluate the psychometric properties of the Dysmenorrhea Daily Diary (DysDD), an electronic patient-reported outcome, in a sample of 355 women with primary dysmenorrhea enrolled in a phase IIb, multicenter, randomized, partially blinded, placebo-controlled trial for treatment of dysmenorrhea. Subjects completed the DysDD over three menstrual cycles, one pre-treatment baseline cycle and two treatment cycles. The DysDD was administered alongside the Menstrual Distress Questionnaire (MDQ), the Short-Form 36 Version 2.0 (SF-36v2), and a Global Assessment of Change (GAC). Item response distributions, test-retest reliability, concurrent and known groups validity, responsiveness, and minimally important difference (MID) were evaluated for the DysDD. As expected, item response distributions varied throughout the menstrual period for all items, with the response scales fully utilized. Within-cycle test-retest reliability was adequate (weighted kappa: 0.5-0.7), although between-cycle test-retest was poor (weighted kappa: 0.1-0.5), most likely due to the highly variable nature of dysmenorrhea between cycles rather than limitations of the measure. Correlations with the MDQ and SF-36v2 were low-moderate, but in the predicted direction, supporting concurrent validity. There were significant differences in DysDD scores across severity groups based on pain medication use. The DysDD was responsive to changes in patients' dysmenorrhea with significantly different changes in scores between change groups (p < 0.0001). MID analyses suggest changes on the DysDD 0-10 pelvic pain score of three points can be considered clinically meaningful. Overall, findings indicate that the DysDD has acceptable reliability and is a valid and responsive instrument for assessing dysmenorrhea.
Linder, Martin; Michaelson, Peter; Röijezon, Ulrik
2016-02-01
Disruption of cortical representation, or body schema, has been indicated as a factor in the persistence and recurrence of low back pain (LBP). This has been observed through impaired laterality judgment ability and it has been suggested that this ability is affected in a spatial rather than anatomical manner. We compared laterality judgment performance of foot and trunk movements between people with LBP with or without leg pain and healthy controls, and investigated associations between test performance and pain. We also assessed the test-retest reliability of the Recognise Online™ software when used in a clinical and a home setting. Cross-sectional observational and test-retest study. Thirty individuals with LBP and 30 healthy controls performed judgment tests of foot and trunk laterality once supervised in a clinic and twice at home. No statistically significant group differences were found. LBP intensity was negatively related to trunk laterality accuracy (p = 0.019). Intraclass correlation values ranged from 0.51 to 0.91. Reaction time improved significantly between test occasions while accuracy did not. Laterality judgments were not impaired in subjects with LBP compared to controls. Further research may clarify the relationship between pain mechanisms in LBP and laterality judgment ability. Reliability values were mostly acceptable, with wide and low confidence intervals, suggesting test-retest reliability for Recognise Online™ could be questioned in this trial. A significant learning effect was observed which should be considered in clinical and research application of the test. Copyright © 2015 Elsevier Ltd. All rights reserved.
Reliable change on the Boston naming test.
Sachs, Bonnie C; Lucas, John A; Smith, Glenn E; Ivnik, Robert J; Petersen, Ronald C; Graff-Radford, Neill R; Pedraza, Otto
2012-03-01
Serial assessments are commonplace in neuropsychological practice and used to document cognitive trajectory for many clinical conditions. However, true change scores may be distorted by measurement error, repeated exposure to the assessment instrument, or person variables. The present study provides reliable change indices (RCI) for the Boston Naming Test, derived from a sample of 844 cognitively normal adults aged 56 years and older. All participants were retested between 9 and 24 months after their baseline exam. Results showed that a 4-point decline during a 9-15 month retest period or a 6-point decline during a 16-24 month retest period represents reliable change. These cutoff values were further characterized as a function of a person's age and family history of dementia. These findings may help clinicians and researchers to characterize with greater precision the temporal changes in confrontation naming ability.
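The abstract does not say which reliable change formula was used. One widely used form (Jacobson and Truax, optionally adjusted for a mean practice effect) divides the observed change by the standard error of the difference. The sketch below assumes that form; the function name and all numeric inputs are hypothetical.

```python
import math


def reliable_change_index(baseline, retest, sd_baseline, r_xx, practice_effect=0.0):
    """Reliable change index: observed change over the SE of the difference.

    sd_baseline:     standard deviation of baseline scores in the reference sample
    r_xx:            test-retest reliability coefficient
    practice_effect: mean retest gain in the reference sample (0 to ignore)
    """
    sem = sd_baseline * math.sqrt(1.0 - r_xx)
    se_diff = math.sqrt(2.0) * sem
    return (retest - baseline - practice_effect) / se_diff


# Hypothetical case: a 6-point decline with SD = 3 and reliability = 0.5.
rci = reliable_change_index(baseline=55, retest=49, sd_baseline=3.0, r_xx=0.5)
```

Under this form, an absolute value above 1.96 is conventionally read as change unlikely to reflect measurement error alone at the 95% level (1.645 for 90%).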
Carlier, Ingrid V E; Kovács, Viktória; van Noorden, Martijn S; van der Feltz-Cornelis, Christina; Mooij, Nanda; Schulte-van Maaren, Yvonne W M; van Hemert, Albert M; Zitman, Frans G; Giltay, Erik J
2017-01-01
Assessment of psychological distress is important, because it may help to monitor treatment effects and predict treatment outcomes. We previously developed the 48-item Symptom Questionnaire (SQ-48) as a public domain self-report psychological distress instrument and showed good internal consistency as well as good convergent and divergent validity among clinical and non-clinical samples. The present study, conducted among psychiatric outpatients in a routine clinical setting, describes additional psychometric properties of the SQ-48. The primary focus is on responsiveness to therapeutic change, which to date has been rarely examined within psychiatry or clinical psychology. Since a questionnaire should also be stable when no clinically important change occurs, we also examined test-retest reliability within a test-retest design before treatment (n = 43). A pre-treatment/post-treatment design was used for responsiveness to therapeutic change, comparing the SQ-48 with two internationally widely used instruments: the Brief Symptom Inventory (n = 97) and the Outcome Questionnaire-45 (n = 109). The results showed that the SQ-48 has excellent test-retest reliability and good responsiveness to therapeutic change, without significant differences between the questionnaires in terms of responsiveness. In sum, the SQ-48 is a psychometrically sound public domain self-report instrument that can be used for routine outcome monitoring, as a benchmark tool or for research purposes. Copyright © 2015 John Wiley & Sons, Ltd. Key Practitioner Message The SQ-48 is developed as a public domain self-report questionnaire, in line with growing efforts to develop clinical instruments that are free of charge. The SQ-48 has excellent test-retest reliability and good responsiveness to therapeutic change or patient progress. There were no significant differences in terms of responsiveness between the SQ-48 and BSI or OQ-45. 
The SQ-48 can be used as a routine evaluation outcome measure for quality assurance in clinical practice. Providing feedback on patient progress via outcome measures could contribute to the enhancement of treatment outcomes.
Babor, Thomas F; Xuan, Ziming; Proctor, Dwayne
2008-03-01
The purposes of this study were to develop reliable procedures to monitor the content of alcohol advertisements broadcast on television and in other media, and to detect violations of the content guidelines of the alcohol industry's self-regulation codes. A set of rating-scale items was developed to measure the content guidelines of the 1997 version of the U.S. Beer Institute Code. Six focus groups were conducted with 60 college students to evaluate the face validity of the items and the feasibility of the procedure. A test-retest reliability study was then conducted with 74 participants, who rated five alcohol advertisements on two occasions separated by 1 week. Average correlations across all advertisements using three reliability statistics (r, rho, and kappa) were almost all statistically significant and the kappas were good for most items, which indicated high test-retest agreement. We also found high interrater reliabilities (intraclass correlations) among raters for item-level and guideline-level violations, indicating that regardless of the specific item, raters were consistent in their general evaluations of the advertisements. Naïve (untrained) raters can provide consistent (reliable) ratings of the main content guidelines proposed in the U.S. Beer Institute Code. The rating procedure may have future applications for monitoring compliance with industry self-regulation codes and for conducting research on the ways in which alcohol advertisements are perceived by young adults and other vulnerable populations.
Behrangrad, Shabnam; Kordi Yoosefinejad, Amin
2018-03-01
The purpose of this study is to investigate the validity and reliability of the Persian version of the Multidimensional Assessment of Fatigue Scale (MAFS) in an Iranian population with multiple sclerosis. A self-reported survey on fatigue including the MAFS, Fatigue Impact Scale and demographic measures was completed by 130 patients with multiple sclerosis and 60 healthy persons sampled with a convenience method. Test-retest reliability and validity were evaluated 3 days apart. Construct validity of the MAFS was assessed with the Fatigue Impact Scale. The MAFS had high internal consistency (Cronbach's alpha >0.9) and 3-d test-retest reliability (intraclass correlation coefficient = 0.99). Correlation between the Fatigue Impact Scale and MAFS was high (r = 0.99). Correlation between MAFS scores and the Expanded Disability Status Scale was also strong (r = 0.85). Questionnaire items showed acceptable item-scale correlation (0.968-0.993). The Persian version of the MAFS appears to be a valid and reliable questionnaire. It is an appropriate short multidimensional instrument to assess fatigue in patients with multiple sclerosis in clinical practice and research. Implications for Rehabilitation: The Persian version of the Multidimensional Assessment of Fatigue is a valid and reliable instrument for assessing and monitoring fatigue in Persian-speaking patients with multiple sclerosis. It is very easy to administer and time-efficient compared with other instruments evaluating fatigue in patients with multiple sclerosis.
Hendricson, William D; Rugh, John D; Hatch, John P; Stark, Debra L; Deahl, Thomas; Wallmann, Elizabeth R
2011-02-01
This article reports the validation of an assessment instrument designed to measure the outcomes of training in evidence-based practice (EBP) in the context of dentistry. Four EBP dimensions are measured by this instrument: 1) understanding of EBP concepts, 2) attitudes about EBP, 3) evidence-accessing methods, and 4) confidence in critical appraisal. The instrument, called the Knowledge, Attitudes, Access, and Confidence Evaluation (KACE), has four scales, with a total of thirty-five items: EBP knowledge (ten items), EBP attitudes (ten), accessing evidence (nine), and confidence (six). Four elements of validity were assessed: consistency of items within the KACE scales (extent to which items within a scale measure the same dimension), discrimination (capacity to detect differences between individuals with different training or experience), responsiveness (capacity to detect the effects of education on trainees), and test-retest reliability. Internal consistency of scales was assessed by analyzing responses of second-year dental students, dental residents, and dental faculty members using Cronbach coefficient alpha, a statistical measure of reliability. Discriminative validity was assessed by comparing KACE scores for the three groups. Responsiveness was assessed by comparing pre- and post-training responses for dental students and residents. To measure test-retest reliability, the full KACE was completed twice by a class of freshman dental students seventeen days apart, and the knowledge scale was completed twice by sixteen faculty members fourteen days apart. Item-to-scale consistency ranged from 0.21 to 0.78 for knowledge, 0.57 to 0.83 for attitude, 0.70 to 0.84 for accessing evidence, and 0.87 to 0.94 for confidence. For discrimination, ANOVA and post hoc testing by the Tukey-Kramer method revealed significant score differences among students, residents, and faculty members consistent with education and experience levels.
For responsiveness to training, dental students and residents demonstrated statistically significant changes, in desired directions, from pre- to post-test. For the student test-retest, Pearson correlations for KACE scales were as follows: knowledge 0.66, attitudes 0.66, accessing evidence 0.74, and confidence 0.76. For the knowledge scale test-retest by faculty members, the Pearson correlation was 0.79. The construct validity of the KACE is equivalent to that of instruments that assess similar EBP dimensions in medicine. Item consistency for the knowledge scale was more variable than for other KACE scales, a finding also reported for medically oriented EBP instruments. We conclude that the KACE has good discriminative validity, responsiveness to training effects, and test-retest reliability.
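Cronbach coefficient alpha, used above to assess internal consistency, can be sketched as follows. This is a minimal illustration of the standard formula, not the KACE authors' code; the function name `cronbach_alpha` and the response data are hypothetical.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a scale.

    `responses` holds one row per respondent, each row being that
    respondent's scores on the k items of the scale.
    """
    k = len(responses[0])
    # Sample variance of each item across respondents.
    item_variances = [variance([row[j] for row in responses]) for j in range(k)]
    # Sample variance of the respondents' total scale scores.
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical three-item scale answered identically on every item,
# which is the limiting case of perfect internal consistency.
perfectly_consistent = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
alpha = cronbach_alpha(perfectly_consistent)
```

Alpha approaches 1 as the items covary more strongly relative to their individual variances; the toy data above reach that limit.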
VALIDATION OF A CLINICAL ASSESSMENT OF SPECTRAL RIPPLE RESOLUTION FOR COCHLEAR-IMPLANT USERS
Drennan, Ward R.; Anderson, Elizabeth S.; Won, Jong Ho; Rubinstein, Jay T.
2013-01-01
Objectives: Non-speech psychophysical tests of spectral resolution, such as the spectral-ripple discrimination task, have been shown to correlate with speech recognition performance in cochlear implant (CI) users (Henry et al., 2005; Won et al. 2007, 2011; Drennan et al. 2008; Anderson et al. 2011). However, these tests are best suited for use in the research laboratory setting and are impractical for clinical use. A test of spectral resolution that is quicker and could more easily be implemented in the clinical setting has been developed. The objectives of this study were 1) To determine if this new clinical ripple test would yield individual results equivalent to the longer, adaptive version of the ripple discrimination test; 2) To evaluate test-retest reliability for the clinical ripple measure; and 3) To examine the relationship between clinical ripple performance and monosyllabic word recognition in quiet for a group of CI listeners. Design: Twenty-eight CI recipients participated in the study. Each subject was tested on both the adaptive and the clinical versions of spectral ripple discrimination, as well as CNC word recognition in quiet. The adaptive version of spectral ripple employed a 2-up, 1-down procedure for determining spectral ripple discrimination threshold. The clinical ripple test used a method of constant stimuli, with trials for each of 12 fixed ripple densities occurring six times in random order. Results from the clinical ripple test (proportion correct) were then compared to ripple discrimination thresholds (in ripples per octave) from the adaptive test. Results: The clinical ripple test showed strong concurrent validity, evidenced by a good correlation between clinical ripple and adaptive ripple results (r = 0.79), as well as a correlation with word recognition (r = 0.7). Excellent test-retest reliability was also demonstrated with a high test-retest correlation (r = 0.9).
Conclusions: The clinical ripple test is a reliable non-linguistic measure of spectral resolution, optimized for use with cochlear implant users in a clinical setting. The test might be useful as a diagnostic tool or as a possible surrogate outcome measure for evaluating treatment effects in hearing. PMID:24552679
Berghmans, Danielle D; Lenssen, Antoine F; Bastiaenen, Carolien H; Ilhan, Mustafa; Lencer, Nicole H; Roox, George M
2013-02-01
The 6-minute walk test (6 MWT) is widely used to assess exercise tolerance in cardiac rehabilitation (CR). However, previous research shows it to be insufficiently responsive, especially for patients with a relatively high maximal exercise tolerance at baseline. We therefore designed a 6-minute walk/run test (6 MWRT), which has the same duration as the 6 MWT but allows running. The objective of this study was to determine the test-retest reproducibility and responsiveness of this 6 MWRT. Responsiveness was investigated in a prospective cohort study among a group of patients entering CR at Maastricht University Medical Center, with a cross-sectional part to assess the test-retest reproducibility. Test-retest reproducibility (reliability and agreement) was investigated using the intraclass correlation coefficient (ICC) and a Bland-Altman plot of two measurements taken in the first week of rehabilitation. Responsiveness of the 6 MWT and the 6 MWRT was calculated using the standardized response mean (SRM) over a 6-week period. The first reproducibility analysis included 34 patients, the second 22 patients. The ICCs were 0.935 and 0.906, respectively, with limits of agreement of ± 79 and ± 61 m. The responsiveness analysis included 27 patients. The SRM values were 0.83 for the 6 MWT and 0.71 for the 6 MWRT. Although the 6 MWRT is a reproducible test in CR, its responsiveness is not superior to that of the 6 MWT. We therefore prefer the conventional 6 MWT and advise against using the 6 MWRT as an evaluative measurement in CR.
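Two of the statistics named above, Bland-Altman limits of agreement and the standardized response mean, can be sketched with the usual textbook definitions. This is an illustration under assumptions, not the study's code: the function names and the walking distances are hypothetical, and the 1.96 multiplier assumes approximately normal differences.

```python
from statistics import mean, stdev


def limits_of_agreement(test, retest):
    """Bland-Altman 95% limits of agreement for paired measurements."""
    differences = [second - first for first, second in zip(test, retest)]
    bias = mean(differences)                  # mean test-retest difference
    half_width = 1.96 * stdev(differences)    # assumes roughly normal differences
    return bias - half_width, bias + half_width


def standardized_response_mean(baseline, follow_up):
    """Mean change divided by the standard deviation of the change."""
    changes = [after - before for before, after in zip(baseline, follow_up)]
    return mean(changes) / stdev(changes)


# Hypothetical walking distances in metres for four patients.
first_test = [500, 520, 480, 510]
second_test = [510, 515, 490, 505]
lower, upper = limits_of_agreement(first_test, second_test)

# Hypothetical baseline and 6-week distances for three patients.
srm = standardized_response_mean([400, 410, 420], [410, 430, 450])
```

A narrower interval between `lower` and `upper` indicates better agreement between the two sessions, while a larger SRM indicates that the test is more responsive to change.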
Moran, Robert W; Rushworth, Wendy M; Mason, Jesse
2017-12-01
Healthcare practitioner beliefs influence advice and management provided to patients with back pain. Several instruments measuring practitioner beliefs have been developed but psychometric properties for some have not been investigated. To investigate internal consistency, test-retest reliability and convergent validity of the Fear Avoidance Beliefs Tool (FABT), the Tampa Scale of Kinesiophobia for Health Care Providers (TSK-HC), the Back Pain Attitudes Questionnaire (Back-PAQ), and the Health Care Pain and Impairment Relationship Scale (HC-PAIRS). A secondary aim was to explore beliefs of New Zealand osteopaths and physiotherapists regarding low back pain. FABT, TSK-HC, Back-PAQ, and HC-PAIRS were administered twice, 14 days apart. Data from 91 osteopaths and 35 physiotherapists were analysed. The FABT, TSK-HC and Back-PAQ each demonstrated excellent internal consistency (Cronbach's α = 0.92, 0.91, and 0.91, respectively) and excellent test-retest reliability (lower limit of 95% CI for intraclass correlation coefficient >0.75). Correlations between instruments (Pearson's r = 0.51 to 0.77, p < 0.001) demonstrated good convergent validity. There was a medium to large effect (Cohen's d > 0.47) for mean differences in scores, for all instruments, between professions. This study found excellent internal consistency, test-retest reliability and good convergent validity for the FABT, TSK-HC, and Back-PAQ. Previously reported internal consistency, test-retest reliability and convergent validity of the HC-PAIRS were confirmed, and test-retest reliability was excellent. There were significant scoring differences on each instrument between professions, and while both groups demonstrated fear-avoidant beliefs, physiotherapist respondent scores indicated that, as a group, they held fewer fear-avoidant beliefs than osteopath respondents. Copyright © 2017 Elsevier Ltd. All rights reserved.
Test-retest and between-site reliability in a multicenter fMRI study.
Friedman, Lee; Stern, Hal; Brown, Gregory G; Mathalon, Daniel H; Turner, Jessica; Glover, Gary H; Gollub, Randy L; Lauriello, John; Lim, Kelvin O; Cannon, Tyrone; Greve, Douglas N; Bockholt, Henry Jeremy; Belger, Aysenil; Mueller, Bryon; Doty, Michael J; He, Jianchun; Wells, William; Smyth, Padhraic; Pieper, Steve; Kim, Seyoung; Kubicki, Marek; Vangel, Mark; Potkin, Steven G
2008-08-01
In the present report, estimates of test-retest and between-site reliability of fMRI assessments were produced in the context of a multicenter fMRI reliability study (FBIRN Phase 1, www.nbirn.net). Five subjects were scanned on 10 MRI scanners on two occasions. The fMRI task was a simple block design sensorimotor task. The impulse response functions to the stimulation block were derived using an FIR-deconvolution analysis with FMRISTAT. Six functionally-derived ROIs covering the visual, auditory and motor cortices, created from a prior analysis, were used. Two dependent variables were compared: percent signal change and contrast-to-noise ratio. Reliability was assessed with intraclass correlation coefficients derived from a variance components analysis. Test-retest reliability was high, but initially, between-site reliability was low, indicating a strong contribution from site and site-by-subject variance. However, a number of factors that can markedly improve between-site reliability were uncovered, including increasing the size of the ROIs, adjusting for smoothness differences, and inclusion of additional runs. By employing multiple steps, between-site reliability for 3T scanners was increased by 123%. Dropping one site at a time and assessing reliability can be a useful method of assessing the sensitivity of the results to particular sites. These findings should provide guidance to others on the best practices for future multicenter studies.
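The idea of an intraclass correlation coefficient built from variance components can be shown with a deliberately simplified sketch. It assumes a one-way random-effects model with subjects as the only factor, which is much simpler than the multi-factor (subject, site, visit) decomposition the abstract describes; the function name `icc_one_way` and the data are hypothetical.

```python
from statistics import mean


def icc_one_way(scores):
    """One-way random-effects ICC for single measurements.

    `scores` holds one row per subject, each row containing that
    subject's k repeated measurements (e.g. test and retest).
    """
    n, k = len(scores), len(scores[0])
    grand_mean = mean(x for row in scores for x in row)
    subject_means = [mean(row) for row in scores]
    # Between-subject mean square.
    ms_between = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    # Within-subject mean square (measurement error in this model).
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(scores, subject_means) for x in row
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical percent-signal-change values: two sessions per subject.
identical_sessions = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
icc = icc_one_way(identical_sessions)
```

When all variance lies between subjects, as in the toy data, the coefficient is 1; variance attributable to session or site would lower it, which is the pattern the abstract reports for between-site reliability.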
Davies, Kylie; Bulsara, Max K; Ramelet, Anne-Sylvie; Monterosso, Leanne
2018-05-01
To establish criterion-related construct validity and test-retest reliability for the Endotracheal Suction Assessment Tool© (ESAT©). Endotracheal tube suction performed in children can significantly affect clinical stability. Previously identified clinical indicators for endotracheal tube suction were used as criteria when designing the ESAT©. Content validity was reported previously. The final stages of psychometric testing are presented. Observational testing was used to measure construct validity and determine whether the ESAT© could guide "inexperienced" paediatric intensive care nurses' decision-making regarding endotracheal tube suction. Test-retest reliability of the ESAT© was performed at two time points. The researchers and paediatric intensive care nurse "experts" developed 10 hypothetical clinical scenarios with predetermined endotracheal tube suction outcomes. "Experienced" (n = 12) and "inexperienced" (n = 14) paediatric intensive care nurses were presented with the scenarios and the ESAT© guiding decision-making about whether to perform endotracheal tube suction for each scenario. Outcomes were compared with those predetermined by the "experts" (n = 9). Test-retest reliability of the ESAT© was measured at two consecutive time points (4 weeks apart) with "experienced" and "inexperienced" paediatric intensive care nurses using the same scenarios and tool to guide decision-making. No differences were observed between endotracheal tube suction decisions made by "experts" (n = 9), "inexperienced" (n = 14) and "experienced" (n = 12) nurses confirming the tool's construct validity. No differences were observed between groups for endotracheal tube suction decisions at T1 and T2. Criterion-related construct validity and test-retest reliability of the ESAT© were demonstrated. Further testing is recommended to confirm reliability in the clinical setting with the "inexperienced" nurse to guide decision-making related to endotracheal tube suction. 
The ESAT© is the first validated tool to systematically guide endotracheal nursing practice for the "inexperienced" nurse. © 2018 John Wiley & Sons Ltd.
Validation of the VISA-A questionnaire for Turkish language: the VISA-A-Tr study.
Dogramaci, Yunus; Kalaci, Aydiner; Kücükkübas, Nigar; Inandi, Taceddin; Esen, Erdinc; Yanat, A Nedim
2011-04-01
To evaluate the validity and reliability of the Turkish version of the Victorian Institute of Sports Assessment-Achilles (VISA-A) questionnaire for patients with Achilles tendinopathy. Fifty-five patients with a diagnosis of Achilles tendinopathy and 55 healthy subjects were included in the study. VISA-A questionnaires were translated and culturally adapted into Turkish. The final Turkish version (VISA-A-Tr) was tested for reliability on healthy individuals and patients. Tests for internal consistency, validity and structure were performed on 55 patients. The VISA-A-Tr showed good test-retest reliability (Pearson's r=0.99, p<0.001). The patients with Achilles tendinopathy had a significantly lower score (p<0.001) than the healthy individuals. The VISA-A-Tr score correlated significantly with the Stanish tendon grading system (Spearman's r=-0.86; p<0.001). The VISA-A-Tr is a valid and reliable tool for evaluating the severity of Achilles tendinopathy.
[Translation and Development of the Chinese-Version Patient Privacy Scale].
Chen, Li; Feng, Xian-Qiong; Yang, Xiao-Li; Li, Luo-Hong
2017-06-01
The unauthorized release of confidential patient information is a serious problem worldwide. Nurses, the healthcare professionals who are in most frequent contact with patients, have access to a significant amount of confidential patient information and play a key role in protecting patient privacy. However, there is currently no proper tool to measure the level to which clinical nurses in China protect the privacy of their patients. To translate the patient privacy scale (PPS) into Chinese and to test the reliability and validity of this Chinese version. The original scale was developed by Özturk, Bahcecik, and Özçelik (2014) to identify whether nurses protect or violate patient privacy in the workplace. This study used the "back translation" method to translate the scale. A total of 616 nurses in two tertiary hospitals in the Western region of China were enrolled to test the internal consistency, test-retest reliability, and construct validity of the translated scale. The Cronbach's alpha coefficients of the total scale and its 5 factors ranged from .84 to .94; the split-half reliability was .91; the test-retest reliability was .82; and the content validity index was .95. Exploratory factor analysis revealed that the 5 factors explained 64.98% of the total variance. The Chinese version of the PPS is reliable and valid, and may be used to reliably assess the behaviors of nurses with regard to protecting the privacy of their patients. The scale may also be used to evaluate the effects of training on patient privacy protection.
Huang, X N; Zhang, Y; Feng, W W; Wang, H S; Cao, B; Zhang, B; Yang, Y F; Wang, H M; Zheng, Y; Jin, X M; Jia, M X; Zou, X B; Zhao, C X; Robert, J; Jing, Jin
2017-06-02
Objective: To evaluate the reliability and validity of the warning signs checklist developed by the National Health and Family Planning Commission of the People's Republic of China (NHFPC), so as to determine the screening effectiveness of warning signs for developmental problems of early childhood. Method: Stratified random sampling was used to assess the reliability and validity of the warning signs checklist, and 2,110 children 0 to 6 years of age (1,513 low-risk subjects and 597 high-risk subjects) were recruited from 11 provinces of China. The reliability evaluation for the warning signs included test-retest reliability and interrater reliability. With the Ages and Stages Questionnaire (ASQ) and the Gesell Development Diagnosis Scale (GESELL) as the criterion scales, criterion validity was assessed by determining the correlation and consistency between the screening results of warning signs and the criterion scales. Result: In terms of the warning signs, the screening positive rates at different ages ranged from 10.8% (21/141) to 26.2% (51/137). The median (interquartile range) testing time for each subject was 1 (0.6) minute. Both the test-retest reliability and interrater reliability of warning signs reached 0.7 or above, indicating that the stability was good. In terms of validity assessment, there was remarkable consistency between ASQ and warning signs, with a Kappa value of 0.63. With GESELL as the criterion, the sensitivity of warning signs in children with suspected developmental delay was 82.2%, and the specificity was 77.7%. The overall Youden index was 0.6. Conclusion: The reliability and validity of the warning signs checklist for screening early childhood developmental problems have met the basic requirements of psychological screening scales, with the characteristics of short testing time and easy operation.
Thus, this warning signs checklist can be used for screening psychological and behavioral problems of early childhood, especially in community settings.
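The Youden index reported above follows directly from the stated sensitivity and specificity. The short sketch below only reproduces that arithmetic with the standard definition (sensitivity + specificity - 1); the function name `youden_index` is hypothetical.

```python
def youden_index(sensitivity, specificity):
    """Youden's J: 0 means no better than chance, 1 means a perfect test."""
    return sensitivity + specificity - 1.0


# Values quoted in the abstract, expressed as proportions.
j = youden_index(0.822, 0.777)
j_rounded = round(j, 1)
```

With 82.2% sensitivity and 77.7% specificity, J is 0.599, which rounds to the 0.6 given in the abstract.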
ERIC Educational Resources Information Center
Park, Bitnara Jasmine; Anderson, Daniel; Alonzo, Julie; Lai, Cheng-Fei; Tindal, Gerald
2012-01-01
This technical report is one in a series of five describing the reliability (test/retest and alternate form) and G-Theory/D-Study research on the easyCBM reading measures, grades 1-5. Data were gathered in the spring of 2011 from a convenience sample of students nested within classrooms at a medium-sized school district in the Pacific Northwest.…
Wang, Jin-Hui; Zuo, Xi-Nian; Gohel, Suril; Milham, Michael P.; Biswal, Bharat B.; He, Yong
2011-01-01
Graph-based computational network analysis has proven a powerful tool to quantitatively characterize functional architectures of the brain. However, the test-retest (TRT) reliability of graph metrics of functional networks has not been systematically examined. Here, we investigated TRT reliability of topological metrics of functional brain networks derived from resting-state functional magnetic resonance imaging data. Specifically, we evaluated both short-term (<1 hour apart) and long-term (>5 months apart) TRT reliability for 12 global and 6 local nodal network metrics. We found that reliability of global network metrics was overall low, threshold-sensitive and dependent on several factors: scanning time interval (TI, long-term>short-term), network membership (NM, networks excluding negative correlations>networks including negative correlations) and network type (NT, binarized networks>weighted networks). The dependence was modulated by another factor, the node definition (ND) strategy. The local nodal reliability exhibited large variability across nodal metrics and a spatially heterogeneous distribution. Nodal degree was the most reliable metric and varied the least across the factors above. Hub regions in association and limbic/paralimbic cortices showed moderate TRT reliability. Importantly, nodal reliability was robust to the four above-mentioned factors. Simulation analysis revealed that global network metrics were extremely sensitive (to varying degrees) to noise in functional connectivity and that weighted networks generated numerically more reliable results compared with binarized networks. Nodal network metrics showed high resistance to noise in functional connectivity, and no NT-related differences were found in the resistance. These findings have important implications for how to choose reliable analytical schemes and network metrics of interest. PMID:21818285
Ryman, Tove K.; Boyer, Bert B.; Hopkins, Scarlett; Philip, Jacques; O’Brien, Diane; Thummel, Kenneth; Austin, Melissa A.
2015-01-01
Food frequency questionnaire (FFQ) data can be used to characterize dietary patterns for diet-disease association studies. Among a sample of Yup’ik people from Southwest Alaska, we evaluated three previously defined dietary patterns: “subsistence foods” and market-based “processed foods” and “fruits and vegetables”. We tested the reproducibility and reliability of the dietary patterns and tested associations of the patterns with dietary biomarkers and participant characteristics. We analyzed data from adult study participants who completed at least one FFQ with the Center for Alaska Native Health Research 9/2009–5/2013. To test reproducibility we conducted a confirmatory factor analysis (CFA) of a hypothesized model using 18 foods to measure the dietary patterns (n=272). To test the reliability of the dietary patterns, we used CFA to measure the composite reliability (n=272) and intraclass correlation coefficients for test-retest reliability (n=113). Finally, to test associations we used linear regression (n=637). All CFA factor loadings, except one, indicated acceptable correlations between foods and dietary patterns (r > 0.40) and model fit criteria were greater than 0.90. Composite and test-retest reliability of dietary patterns were respectively 0.56 and 0.34 for subsistence foods, 0.73 and 0.66 for processed foods, and 0.72 and 0.54 for fruits and vegetables. In the multi-predictor analysis, dietary patterns were significantly associated with dietary biomarkers, community location, age, sex, and self-reported lifestyle. This analysis confirmed the reproducibility and reliability of the dietary patterns in this study population. These dietary patterns can be used for future research and development of dietary interventions in this underserved population. PMID:25656871
Worts, Phillip R; Schatz, Philip; Burkhart, Scott O
2018-05-01
The Vestibular/Ocular Motor Screening (VOMS) and King-Devick (K-D) test are tools designed to assess ocular or vestibular function after a sport-related concussion. To determine the test-retest reliability and rate of false-positive results of the VOMS and K-D test in a healthy athlete sample. Cohort study (diagnosis); Level of evidence, 2. Forty-five healthy high school student-athletes (mean age, 16.11 ± 1.43 years) completed self-reported demographics and medical history and were administered the VOMS and K-D test during rest on day 1 (baseline). The VOMS and K-D test were administered again once during rest (prepractice) and once within 5 minutes of removal from sport practice on day 2 (removal). The Borg rating of perceived exertion scale was administered at removal. Intraclass correlation coefficients were used to determine test-retest reliability on the K-D test and the average near point of convergence (NPC) distance on the VOMS. Level of agreement was used to examine VOMS symptom provocation over the 3 administration times. Multivariate base rates were used to determine the rate of false-positive results when simultaneously considering multiple clinical cutoffs. Test-retest reliability of total time on the K-D test (0.91 [95% CI, 0.86-0.95]) and NPC distance (0.91 [95% CI, 0.85-0.95]) was high across the 3 administration times. Level of agreement ranged from 48.9% to 88.9% across all 3 times for the VOMS items. Using established clinical cutoffs, false-positive results occurred in 2% of the sample using the VOMS at removal and 36% using the K-D test. The VOMS displayed a false-positive rate of 2% in this high school student-athlete cohort. The K-D test's false-positive rate was 36% while maintaining a high level of test-retest reliability (0.91). Results from this study support future investigation of VOMS administration in an acutely injured high school athletic sample. 
Going forward, the VOMS may be more stable than other neurological and symptom report screening measures and less vulnerable to false-positive results than the K-D test.