TCOPPE school environmental audit tool: assessing safety and walkability of school environments.
Lee, Chanam; Kim, Hyung Jin; Dowdy, Diane M; Hoelscher, Deanna M; Ory, Marcia G
2013-09-01
Several environmental audit instruments have been developed for assessing streets, parks and trails, but none for schools. This paper introduces a school audit tool that includes 3 subcomponents: 1) street audit, 2) school site audit, and 3) map audit. It presents the conceptual basis and the development process of this instrument, and the methods and results of the reliability assessments. Reliability tests were conducted by 2 trained auditors on 12 study schools (high-low income and urban-suburban-rural settings). Kappa statistics (categorical, factual items) and ICC (Likert-scale, perceptual items) were used to assess a) interrater, b) test-retest, and c) peak vs. off-peak hour reliability tests. For the interrater reliability test, the average Kappa was 0.839 and the ICC was 0.602. For the test-retest reliability, the average Kappa was 0.903 and the ICC was 0.774. The peak-off peak reliability was 0.801. Rural schools showed the most consistent results in the peak-off peak and test-retest assessments. For interrater tests, urban schools showed the highest ICC, and rural schools showed the highest Kappa. Most items achieved moderate to high levels of reliabilities in all study schools. With proper training, this audit can be used to assess school environments reliably for research, outreach, and policy-support purposes.
Reliability Generalization of the Alcohol Use Disorder Identification Test.
ERIC Educational Resources Information Center
Shields, Alan L.; Caruso, John C.
2002-01-01
Evaluated the reliability of scores from the Alcohol Use Disorders Identification Test (AUDIT; J. Saunders and others, 1993) in a reliability generalization study based on 17 empirical journal articles. Results show AUDIT scores to be generally reliable for basic assessment. (SLD)
Test-retest reliability of infant event related potentials evoked by faces.
Munsters, N M; van Ravenswaaij, H; van den Boomen, C; Kemner, C
2017-04-05
Reliable measures are required to draw meaningful conclusions regarding developmental changes in longitudinal studies. Little is known, however, about the test-retest reliability of face-sensitive event related potentials (ERPs), a frequently used neural measure in infants. The aim of the current study is to investigate the test-retest reliability of ERPs typically evoked by faces in 9-10 month-old infants. The infants (N=31) were presented with neutral, fearful and happy faces that contained only the lower or higher spatial frequency information. They were tested twice within two weeks. The present results show that the test-retest reliability of the face-sensitive ERP components is moderate (P400 and Nc) to substantial (N290). However, there is low test-retest reliability for the effects of the specific experimental manipulations (i.e. emotion and spatial frequency) on the face-sensitive ERPs. To conclude, in infants the face-sensitive ERP components (i.e. N290, P400 and Nc) show adequate test-retest reliability, but not the effects of emotion and spatial frequency on these ERP components. We propose that further research focuses on investigating elements that might increase the test-retest reliability, as adequate test-retest reliability is necessary to draw meaningful conclusions on individual developmental trajectories of the face-sensitive ERPs in infants. Copyright © 2017 The Authors. Published by Elsevier Ltd. All rights reserved.
The long-term reliability of static and dynamic quantitative sensory testing in healthy individuals.
Marcuzzi, Anna; Wrigley, Paul J; Dean, Catherine M; Adams, Roger; Hush, Julia M
2017-07-01
Quantitative sensory tests (QSTs) have been increasingly used to investigate alterations in somatosensory function in a wide range of painful conditions. The interpretation of these findings is based on the assumption that the measures are stable and reproducible. To date, reliability of QST has been investigated for short test-retest intervals. The aim of this study was to investigate the long-term reliability of a multimodal QST assessment in healthy people, with testing conducted on 3 occasions over 4 months. Forty-two healthy people were enrolled in the study. Static and dynamic tests were performed, including cold and heat pain threshold (CPT, HPT), mechanical wind-up [wind-up ratio (WUR)], pressure pain threshold (PPT), 2-point discrimination (TPD), and conditioned pain modulation (CPM). Systematic bias, relative reliability and agreement were analysed using repeated measure analysis of variance, intraclass correlation coefficients [ICC(3,1)] and SE of the measurement (SEM), respectively. Static QST (CPT, HPT, PPT, and TPD) showed good-to-excellent reliability (ICCs: 0.68-0.90). Dynamic QST (WUR and CPM) showed poor-to-good reliability (ICCs: 0.35-0.61). A significant linear decrease over time was observed for mechanical QST at the back (PPT and TPD) and for CPM (P < 0.01). Static QST were stable over a period of 4 months; however, a small systematic decrease over time has been observed for mechanical QST. Dynamic QST showed considerable variability over time; in particular, CPM using PPT as the test stimulus did not show adequate reliability, suggesting that this test paradigm may be less useful for monitoring individuals over time.
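As a rough illustration of the relative-reliability statistic named in this abstract, the sketch below computes a two-way mixed, consistency, single-measurement intraclass correlation, commonly written ICC(3,1), from a subjects-by-sessions table. The function name `icc_3_1` and the sample scores are hypothetical; none of the numbers come from the study.

```python
def icc_3_1(scores):
    """ICC(3,1): two-way mixed effects, consistency, single measurement.

    `scores` is a list of rows, one row per subject, one column per
    session (or rater). Hypothetical helper for illustration only.
    """
    n = len(scores)          # number of subjects
    k = len(scores[0])       # number of sessions / raters
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need at least 2 subjects and 2 equal-length columns")

    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Sums of squares from the two-way ANOVA decomposition.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)                 # between-subjects mean square
    ms_error = ss_error / ((n - 1) * (k - 1))   # residual mean square

    denom = ms_rows + (k - 1) * ms_error
    if denom == 0:
        raise ValueError("ICC undefined: no variance in the data")
    return (ms_rows - ms_error) / denom


if __name__ == "__main__":
    # Made-up thresholds for 4 subjects measured in 2 sessions.
    example = [[1, 2], [2, 1], [3, 4], [4, 3]]
    print(round(icc_3_1(example), 3))
```

Because ICC(3,1) is a consistency coefficient, a constant shift between sessions does not lower it, which is why systematic bias is usually examined separately, as it was here with repeated-measures analysis of variance.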
Omari, Taher I.; Savilampi, Johanna; Kokkinn, Karmen; Schar, Mistyka; Lamvik, Kristin; Doeltgen, Sebastian; Cock, Charles
2016-01-01
Purpose. We evaluated the intra- and interrater agreement and test-retest reliability of analyst derivation of swallow function variables based on repeated high resolution manometry with impedance measurements. Methods. Five subjects swallowed 10 × 10 mL saline on two occasions one week apart producing a database of 100 swallows. Swallows were repeat-analysed by six observers using software. Swallow variables were indicative of contractility, intrabolus pressure, and flow timing. Results. The average intraclass correlation coefficients (ICC) for intra- and interrater comparisons of all variable means showed substantial to excellent agreement (intrarater ICC 0.85–1.00; mean interrater ICC 0.77–1.00). Test-retest results were less reliable. ICC for test-retest comparisons ranged from slight to excellent depending on the class of variable. Contractility variables differed most in terms of test-retest reliability. Amongst contractility variables, UES basal pressure showed excellent test-retest agreement (mean ICC 0.94), measures of UES postrelaxation contractile pressure showed moderate to substantial test-retest agreement (mean Interrater ICC 0.47–0.67), and test-retest agreement of pharyngeal contractile pressure ranged from slight to substantial (mean Interrater ICC 0.15–0.61). Conclusions. Test-retest reliability of HRIM measures depends on the class of variable. Measures of bolus distension pressure and flow timing appear to be more test-retest reliable than measures of contractility. PMID:27190520
Reliability of movement control tests in the lumbar spine
Luomajoki, Hannu; Kool, Jan; de Bruin, Eling D; Airaksinen, Olavi
2007-01-01
Background Movement control dysfunction [MCD] reduces active control of movements. Patients with MCD might form an important subgroup among patients with non specific low back pain. The diagnosis is based on the observation of active movements. Although widely used clinically, only a few studies have been performed to determine the test reliability. The aim of this study was to determine the inter- and intra-observer reliability of movement control dysfunction tests of the lumbar spine. Methods We videoed patients performing a standardized test battery consisting of 10 active movement tests for motor control in 27 patients with non specific low back pain and 13 patients with other diagnoses but without back pain. Four physiotherapists independently rated test performances as correct or incorrect per observation, blinded to all other patient information and to each other. The study was conducted in a private physiotherapy outpatient practice in Reinach, Switzerland. Kappa coefficients, percentage agreements and confidence intervals for inter- and intra-rater results were calculated. Results The kappa values for inter-tester reliability ranged between 0.24 – 0.71. Six tests out of ten showed a substantial reliability [k > 0.6]. Intra-tester reliability was between 0.51 – 0.96, all tests but one showed substantial reliability [k > 0.6]. Conclusion Physiotherapists were able to reliably rate most of the tests in this series of motor control tasks as being performed correctly or not, by viewing films of patients with and without back pain performing the task. PMID:17850669
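The kappa coefficients and percentage agreements reported in this abstract can be sketched for a single pair of raters as follows. This is a minimal illustration of unweighted Cohen's kappa, assuming two raters and categorical ratings; the function names and the ratings are hypothetical, and the study's own handling of four raters is not described in the abstract.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Share of observations on which the two raters give the same rating."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    return sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa: agreement corrected for chance agreement."""
    observed = percent_agreement(rater_a, rater_b)
    n = len(rater_a)
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    # Chance agreement from the product of each rater's marginal proportions.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa undefined: both raters used a single category")
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Made-up ratings of four test performances by two raters.
    a = ["correct", "correct", "incorrect", "incorrect"]
    b = ["correct", "incorrect", "incorrect", "incorrect"]
    print(percent_agreement(a, b), cohen_kappa(a, b))
```

In the made-up example the raters agree on 75% of observations, but half of that agreement is expected by chance, so kappa is considerably lower than the raw percentage.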
Test Reliability at the Individual Level
Hu, Yueqin; Nesselroade, John R.; Erbacher, Monica K.; Boker, Steven M.; Burt, S. Alexandra; Keel, Pamela K.; Neale, Michael C.; Sisk, Cheryl L.; Klump, Kelly
2016-01-01
Reliability has a long history as one of the key psychometric properties of a test. However, a given test might not measure people equally reliably. Test scores from some individuals may have considerably greater error than others. This study proposed two approaches using intraindividual variation to estimate test reliability for each person. A simulation study suggested that the parallel tests approach and the structural equation modeling approach recovered the simulated reliability coefficients. Then in an empirical study, where forty-five females were measured daily on the Positive and Negative Affect Schedule (PANAS) for 45 consecutive days, separate estimates of reliability were generated for each person. Results showed that reliability estimates of the PANAS varied substantially from person to person. The methods provided in this article apply to tests measuring changeable attributes and require repeated measures across time on each individual. This article also provides a set of parallel forms of PANAS. PMID:28936107
Lenzlinger-Asprion, Rahel; Keller, Niculina; Meichtry, André; Luomajoki, Hannu
2017-01-31
Hip joint complaints are a problem associated with increasing age and impair the mobility of a large section of the elderly population. Reliable and valid tests are necessary for a thorough investigation of a joint. A fundamental function of the hip joint is movement control and a test of this function forms a part of the standard examination. Until now there have been few scientific studies which specifically investigate the reliability of measurement tests of movement control of the hip joint. The aim of this study was to examine the intratester and intertester reliability of the movement control tests of the hip joint which are in use in current clinical practice. Sixteen participants with hip joint complaints and 14 without hip joint impairment were recruited. All participants performed five active movement control tests for the hip joint and were video filmed whilst performing these tests. These films formed the basis for the evaluation and were assessed by two independent physiotherapists. Weighted kappa values and percentage agreement were used for the intertester and intratester reliability calculations. The intertester reliability of the five examined movement control tests of the hip joint showed good to almost perfect values (weighted kappa (wk) = 0.56-0.87). The intratester reliability of the more experienced evaluator A was better than that of the less experienced evaluator B (average wk = 0.62 vs 0.38). The visual evaluation of movement control tests of the hip joint is especially reliable when carried out by an experienced evaluator. Four out of five tests also showed good results for intertester reliability, which supports their use in clinical practice.
Palmer, Clare E; Langbehn, Douglas; Tabrizi, Sarah J; Papoutsi, Marina
2017-01-01
Cognitive impairment is common amongst many neurodegenerative movement disorders such as Huntington's disease (HD) and Parkinson's disease (PD) across multiple domains. There are many tasks available to assess different aspects of this dysfunction, however, it is imperative that these show high test-retest reliability if they are to be used to track disease progression or response to treatment in patient populations. Moreover, in order to ensure effects of practice across testing sessions are not misconstrued as clinical improvement in clinical trials, tasks which are particularly vulnerable to practice effects need to be highlighted. In this study we evaluated test-retest reliability in mean performance across three testing sessions of four tasks that are commonly used to measure cognitive dysfunction associated with striatal impairment: a combined Simon Stop-Signal Task; a modified emotion recognition task; a circle tracing task; and the trail making task. Practice effects were seen between sessions 1 and 2 across all tasks for the majority of dependent variables, particularly reaction time variables; some, but not all, diminished in the third session. Good test-retest reliability across all sessions was seen for the emotion recognition, circle tracing, and trail making test. The Simon interference effect and stop-signal reaction time (SSRT) from the combined-Simon-Stop-Signal task showed moderate test-retest reliability, however, the combined SSRT interference effect showed poor test-retest reliability. Our results emphasize the need to use control groups when tracking clinical progression or use pre-baseline training on tasks susceptible to practice effects.
A Comparison of Three Multivariate Models for Estimating Test Battery Reliability.
ERIC Educational Resources Information Center
Wood, Terry M.; Safrit, Margaret J.
1987-01-01
A comparison of three multivariate models (canonical reliability model, maximum generalizability model, canonical correlation model) for estimating test battery reliability indicated that the maximum generalizability model showed the least degree of bias, smallest errors in estimation, and the greatest relative efficiency across all experimental…
Powell, T; Brooker, D J; Papadopolous, A
1993-05-01
Relative and absolute test-retest reliability of the MEAMS was examined in 12 subjects with probable dementia and 12 matched controls. Relative reliability was good. Measures of absolute reliability showed scores changing by up to 3 points over an interval of a week. A version effect was found to be in evidence.
Reliability and Concurrent Validity of Dynamic Rotator Stability Test-A Cross Sectional study.
Binoy Mathew, K V; Eapen, Charu; Kumar, P Senthil
2012-01-01
To determine the intra-rater and inter-rater reliability of the Dynamic Rotator Stability Test (DRST) and its concurrent validity with the University of Pennsylvania Shoulder Score (PENN) scale. Forty subjects of either gender aged 18-70 with painful shoulder conditions of musculoskeletal origin were selected through convenience sampling. Tester 1 and tester 2 administered the DRST and the PENN scale in random order. In a subgroup of 20 subjects the DRST was administered by both testers to determine the inter-rater reliability. A 180° Standard Universal Goniometer was used to take measurements. For intra-rater reliability, all the test variables showed highly significant correlation (p=.94-1). For inter-rater reliability with tester 2, test variables such as position, ROM, force, direction of abnormal translation, pain during the test, and compensatory movement during the test were found to be significant (p=.71-1). Only some variables of the DRST showed significant correlation with the PENN scale (P=.320-.450). The Dynamic Rotator Stability Test has good intra-rater and moderate inter-rater reliability. Concurrent validity of the Dynamic Rotator Stability Test was found to be poor when compared with the PENN Shoulder Score.
Bloemen, Manon A T; de Groot, Janke F; Backx, Frank J G; Westerveld, Rosalyne A; Takken, Tim
2015-05-01
To determine the best test performance and feasibility using a Graded Arm Cranking Test vs a Graded Wheelchair Propulsion Test in young people with spina bifida who use a wheelchair, and to determine the reliability of the best test. Validity and reliability study. Young people with spina bifida who use a wheelchair. Physiological responses were measured during a Graded Arm Cranking Test and a Graded Wheelchair Propulsion Test using a heart rate monitor and calibrated mobile gas analysis system (Cortex Metamax). For validity, peak oxygen uptake (VO2peak) and peak heart rate (HRpeak) were compared using paired t-tests. For reliability, the intra-class correlation coefficients, standard error of measurement, and standard detectable change were calculated. VO2peak and HRpeak were higher during wheelchair propulsion compared with arm cranking (23.1 vs 19.5 ml/kg/min, p = 0.11; 165 vs 150 beats/min, p < 0.05). Reliability of wheelchair propulsion showed high intra-class correlation coefficients (ICCs) for both VO2peak (ICC = 0.93) and HRpeak (ICC = 0.90). This pilot study shows higher HRpeak and a tendency to higher VO2peak in young people with spina bifida who are using a wheelchair when tested during wheelchair propulsion compared with arm cranking. Wheelchair propulsion showed good reliability. We recommend performing a wheelchair propulsion test for aerobic fitness testing in this population.
Lindskog, Marcus; Winman, Anders; Juslin, Peter; Poom, Leo
2013-01-01
Two studies investigated the reliability and predictive validity of commonly used measures and models of Approximate Number System acuity (ANS). Study 1 investigated reliability by both an empirical approach and a simulation of maximum obtainable reliability under ideal conditions. Results showed that common measures of the Weber fraction (w) are reliable only when using a substantial number of trials, even under ideal conditions. Study 2 compared different purported measures of ANS acuity as for convergent and predictive validity in a within-subjects design and evaluated an adaptive test using the ZEST algorithm. Results showed that the adaptive measure can reduce the number of trials needed to reach acceptable reliability. Only direct tests with non-symbolic numerosity discriminations of stimuli presented simultaneously were related to arithmetic fluency. This correlation remained when controlling for general cognitive ability and perceptual speed. Further, the purported indirect measure of ANS acuity in terms of the Numeric Distance Effect (NDE) was not reliable and showed no sign of predictive validity. The non-symbolic NDE for reaction time was significantly related to direct w estimates in a direction contrary to the expected. Easier stimuli were found to be more reliable, but only harder (7:8 ratio) stimuli contributed to predictive validity. PMID:23964256
Reliability Generalization (RG) Analysis: The Test Is Not Reliable
ERIC Educational Resources Information Center
Warne, Russell
2008-01-01
Literature shows that most researchers are unaware of some of the characteristics of reliability. This paper clarifies some misconceptions by describing the procedures, benefits, and limitations of reliability generalization while using it to illustrate the nature of score reliability. Reliability generalization (RG) is a meta-analytic method…
Reliability of two social cognition tests: The combined stories test and the social knowledge test.
Thibaudeau, Élisabeth; Cellard, Caroline; Legendre, Maxime; Villeneuve, Karèle; Achim, Amélie M
2018-04-01
Deficits in social cognition are common in psychiatric disorders. Validated social cognition measures with good psychometric properties are necessary to assess and target social cognitive deficits. Two recent social cognition tests, the Combined Stories Test (COST) and the Social Knowledge Test (SKT), respectively assess theory of mind and social knowledge. Previous studies have shown good psychometric properties for these tests, but the test-retest reliability has never been documented. The aim of this study was to evaluate the test-retest reliability and the inter-rater reliability of the COST and the SKT. The COST and the SKT were administered twice to a group of forty-two healthy adults, with a delay of approximately four weeks between the assessments. Excellent test-retest reliability was observed for the COST, and good test-retest reliability was observed for the SKT. There was no evidence of a practice effect. Furthermore, excellent inter-rater reliability was observed for both tests. This study shows good reliability of the COST and the SKT, which adds to the good validity previously reported for these two tests. These good psychometric properties thus support the COST and the SKT as adequate measures for the assessment of social cognition. Copyright © 2018. Published by Elsevier B.V.
Prospective patients rate practice factors: development of a questionnaire.
St Louis, Brian Lingg; Firestone, Allen R; Johnston, William; Shanker, Shiva; Vig, Katherine W L
2011-02-01
The importance that prospective patients place on practice characteristics when choosing an orthodontic practice has not been extensively reported. The objective of this research was to develop a valid and reliable questionnaire to address the relative importance of orthodontic office and doctor characteristics for prospective patients or parents of child patients during the initial orthodontic office consultation. An initial questionnaire, based on published literature, was field-tested on 16 subjects to assess its validity. Based on the field test, the questionnaire was modified and tested for reliability by using a test-retest method. The questionnaire covered the following areas: doctor, office, staff, and finances. The reliability study included 2 groups of subjects: 12 consecutive prospective adult patients and 41 consecutive parents of prospective child patients. The questionnaires consisted of 43 and 50 questions for the adult patients and the parents of patients, respectively. The subjects rated the importance of practice characteristics in their selection of an orthodontic practice using a 100-mm visual analog scale anchored at "not important at all" and "most important." Reliability was analyzed by using the intraclass correlation coefficient (ICC). Summary scores of all 53 subjects showed excellent reliability (ICC, 0.88; range, 0.61-1.0). Summary scores of all 50 questions showed acceptable reliability (ICC, 0.70; range, 0.45-0.88). Twenty-one questions had excellent reliability (ICC, >.75), and 29 questions had fair-to-good reliability (ICC, 0.41-0.75). No questions showed poor reliability (ICC, <0.4). The pilot study data indicated that the overall reliability of the questionnaire is acceptable. Copyright © 2011 American Association of Orthodontists. Published by Mosby, Inc. All rights reserved.
40 CFR 792.15 - Inspection of a testing facility.
Code of Federal Regulations, 2011 CFR
2011-07-01
...) EPA will not consider reliable for purposes of showing that a chemical substance or mixture does not... be considered reliable does not, however, relieve the sponsor of a required test of any obligation...
Im, Sun; Suntrup-Krueger, Sonja; Colbow, Sigrid; Sauer, Sonja; Claus, Inga; Meuth, Sven G; Dziewas, Rainer; Warnecke, Tobias
2018-05-26
Diagnosis of pharyngeal dysphagia caused by myasthenia gravis (MG) based on clinical examination alone is often challenging. Flexible endoscopic evaluation of swallowing (FEES) combined with Tensilon (edrophonium) application, referred to as the FEES-Tensilon Test, was developed to improve diagnostic accuracy and to detect the main symptoms of pharyngeal dysphagia in MG. Here we investigated inter- and intra-rater reliability of the FEES-Tensilon Test and analyzed the main endoscopic findings. Four experienced raters reviewed a total of 20 FEES-Tensilon-Test videos in randomized order. Residue severity was graded at 4 different pharyngeal spaces before and after Tensilon administration. All interpretations were performed twice per rater, 4 weeks apart (a total of 160 scorings). Intra-rater test-retest reliability and inter-rater reliability levels were calculated. The most frequent FEES findings in MG patients before Tensilon application were prominent residues of semi solids spread all over the hypopharynx in varying locations. The reliability level in the interpretation of the FEES-Tensilon test was excellent regardless of the raters' profession or years of experience with FEES. All 4 raters showed high inter- and intra-rater reliability levels in interpreting the FEES-Tensilon Test based on residue clearance (kappa=0.922, 0.981). Degree of residue normalization in the vallecular space after Tensilon application showed the highest inter- and intra-rater reliability level (kappa=0.863, 0.957) followed by the epiglottis (kappa=0.813, 0.946) and pyriform sinuses (kappa=0.836, 0.929). Interpretation of the FEES-Tensilon Test based on residue severity and degree of Tensilon clearance, especially in the vallecular space, is consistent and reliable. This article is protected by copyright. All rights reserved.
Exercise-Induced Hypoalgesia After Isometric Wall Squat Exercise: A Test-Retest Reliability Study.
Vaegter, Henrik Bjarke; Lyng, Kristian Damgaard; Yttereng, Fredrik Wannebo; Christensen, Mads Holst; Sørensen, Mathias Brandhøj; Graven-Nielsen, Thomas
2018-05-19
Isometric exercises decrease pressure pain sensitivity in exercising and nonexercising muscles, an effect known as exercise-induced hypoalgesia (EIH). No studies have assessed the test-retest reliability of EIH after isometric exercise. This study investigated the EIH on pressure pain thresholds (PPTs) after an isometric wall squat exercise. The relative and absolute test-retest reliability of the PPT as a test stimulus and the EIH response in exercising and nonexercising muscles were calculated. In two identical sessions, PPTs of the thigh and shoulder were assessed before and after three minutes of quiet rest and three minutes of wall squat exercise, respectively, in 35 healthy subjects. The relative test-retest reliability of PPT and EIH was determined using analysis of variance models, Pearson's r, and intraclass correlations (ICCs). The absolute test-retest reliability of EIH was determined based on PPT standard error of measurements and Cohen's kappa for agreement between sessions. Squat increased PPTs of exercising and nonexercising muscles by 16.8% ± 16.9% and 6.7% ± 12.9%, respectively (P < 0.001), with no significant differences between sessions. PPTs within and between sessions showed moderately strong correlations (r ≥ 0.74) and excellent (ICC ≥ 0.84) within-session (rest) and between-session test-retest reliability. EIH responses of exercising and nonexercising muscles showed no systematic errors between sessions; however, the relative test-retest reliability was low (ICCs = 0.03-0.43), and agreement in EIH responders and nonresponders between sessions was not significant (κ < 0.13, P > 0.43). A wall squat exercise increased PPTs compared with quiet rest; however, the relative and absolute reliability of the EIH response was poor. Future research is warranted to investigate the reliability of EIH in clinical pain populations.
An, Hyeong Su; Moon, Won-Jin; Ryu, Jae-Kyun; Park, Ju Yeon; Yun, Won Sung; Choi, Jin Woo; Jahng, Geon-Ho; Park, Jang-Yeon
2017-12-01
This prospective multi-center study aimed to evaluate the inter-vendor and test-retest reliabilities of resting-state functional magnetic resonance imaging (RS-fMRI) by assessing the temporal signal-to-noise ratio (tSNR) and functional connectivity. The study included 10 healthy subjects, and each subject was scanned using three 3T MR scanners (GE Signa HDxt, Siemens Skyra, and Philips Achieva) in two sessions. The tSNR was calculated from the time course data. Inter-vendor and test-retest reliabilities were assessed with intra-class correlation coefficients (ICCs) derived from variance component analysis. Independent component analysis was performed to identify the connectivity of the default-mode network (DMN). The tSNR for the DMN was not significantly different among the GE, Philips, and Siemens scanners (P=0.638). In terms of vendor differences, the inter-vendor reliability was good (ICC=0.774). Regarding the test-retest reliability, the GE scanner showed excellent correlation (ICC=0.961), while the Philips (ICC=0.671) and Siemens (ICC=0.726) scanners showed relatively good correlation. The DMN of the subjects showed identical patterns of functional connectivity between the two sessions for each scanner and between the three scanners. The inter-vendor and test-retest reliabilities of RS-fMRI using different 3T MR scanners are good. Thus, we suggest that RS-fMRI could be used in multicenter imaging studies as a reliable imaging marker. Copyright © 2017 Elsevier Inc. All rights reserved.
An Investigation of the Impact of Guessing on Coefficient α and Reliability
2014-01-01
Guessing is known to influence the test reliability of multiple-choice tests. Although there are many studies that have examined the impact of guessing, they used rather restrictive assumptions (e.g., parallel test assumptions, homogeneous inter-item correlations, homogeneous item difficulty, and homogeneous guessing levels across items) to evaluate the relation between guessing and test reliability. Based on the item response theory (IRT) framework, this study investigated the extent of the impact of guessing on reliability under more realistic conditions where item difficulty, item discrimination, and guessing levels actually vary across items with three different test lengths (TL). By accommodating multiple item characteristics simultaneously, this study also focused on examining interaction effects between guessing and other variables entered in the simulation to be more realistic. The simulation of the more realistic conditions and calculations of reliability and classical test theory (CTT) item statistics were facilitated by expressing CTT item statistics, coefficient α, and reliability in terms of IRT model parameters. In addition to the general negative impact of guessing on reliability, results showed interaction effects between TL and guessing and between guessing and test difficulty.
Impact on Participation and Autonomy: Test of Validity and Reliability for Older Persons.
Hammar, Isabelle Ottenvall; Ekelund, Christina; Wilhelmson, Katarina; Eklund, Kajsa
2014-11-06
In research and healthcare it is important to measure older persons' self-determination in order to improve their possibilities to decide for themselves in daily life. The questionnaire Impact on Participation and Autonomy (IPA) assesses self-determination, but is not constructed for older persons. The aim of this study was to examine the validity and reliability of the IPA-S questionnaire for persons aged 70 years and older. The study was performed in two steps: first a validity test of the Swedish version of the questionnaire, IPA-S, followed by a test-retest reliability assessment of an adjusted version. The validity was tested with focus groups and individual interviews with persons aged 77-88 years, and the reliability with persons aged 70-99 years. The validity test showed that the IPA-S is valid for older persons, but it was too extensive and the phrasing of the items needed adjustments. The test-retest reliability assessment of the adjusted questionnaire, IPA-Older persons (IPA-O), showed that 15 of 22 items had high agreement. The IPA-O can be used to measure older persons' self-determination in their care and rehabilitation.
Hamre, Charlotta; Botolfsen, Pernille; Tangen, Gro Gujord; Helbostad, Jorunn L
2017-04-20
The Balance Evaluation Systems Test (BESTest) was developed to assess underlying systems for balance control in order to be able to individually tailor rehabilitation interventions to people with balance disorders. A short form, the Mini-BESTest, was developed as a screening test. This observational study with a cross-sectional design aimed to assess interrater and test-retest reliability of the Norwegian version of the BESTest and the Mini-BESTest in community-dwelling people with increased risk of falling and to assess concurrent validity with the Fall Efficacy Scale-International (FES-I). Forty-two persons with increased risk of falling (elderly over 65 years of age, persons with a history of stroke or Multiple Sclerosis) were assessed twice by two raters. Relative reliability was analysed with the Intraclass Correlation Coefficient (ICC), and absolute reliability with the standard error of measurement (SEM) and smallest detectable change (SDC). Concurrent validity was assessed against the FES-I using Spearman's rho. The BESTest showed very good interrater reliability (ICC = 0.98, SEM = 1.79, SDC95 = 5.0) and test-retest reliability (rater A/rater B: ICC = 0.89/0.89, SEM = 3.9/4.3, SDC95 = 10.8/11.8). The Mini-BESTest also showed very good interrater reliability (ICC = 0.95, SEM = 1.19, SDC95 = 3.3) and test-retest reliability (rater A/rater B: ICC = 0.85/0.84, SEM = 1.8/1.9, SDC95 = 4.9/5.2). The correlations were moderate between the FES-I and both the BESTest and the Mini-BESTest (Spearman's rho -0.51 and -0.50, p < 0.01). The BESTest and its short form, the Mini-BESTest, showed very good interrater and test-retest reliability when assessed in a heterogeneous sample of people with increased risk of falling. The concurrent validity measured against the FES-I showed moderate correlation. The results are comparable with earlier studies and indicate that the Norwegian versions can be used in daily clinical practice and in research.
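The absolute-reliability figures in this abstract can be related to one another with a short sketch. The abstract does not state which formulas were used, so the code assumes the conventional definitions SEM = SD × sqrt(1 − ICC) and SDC95 = 1.96 × sqrt(2) × SEM; the function names and the standard deviation in the usage example are hypothetical.

```python
import math


def sem_from_icc(sd, icc):
    """Standard error of measurement from a between-subject SD and an ICC.

    Assumes the conventional formula SEM = SD * sqrt(1 - ICC).
    """
    if sd < 0 or not 0.0 <= icc <= 1.0:
        raise ValueError("sd must be >= 0 and icc must lie in [0, 1]")
    return sd * math.sqrt(1.0 - icc)


def sdc95(sem):
    """Smallest detectable change at 95% confidence.

    Assumes SDC95 = 1.96 * sqrt(2) * SEM; sqrt(2) reflects that a change
    score combines the measurement error of two assessments.
    """
    return 1.96 * math.sqrt(2.0) * sem


if __name__ == "__main__":
    # Hypothetical SD of 10 points with an ICC of 0.75.
    print(sem_from_icc(10.0, 0.75))
    # SEM values as reported in the abstract for interrater reliability.
    print(round(sdc95(1.79), 1), round(sdc95(1.19), 1))
```

Under these assumed formulas, the reported interrater SEM values of 1.79 and 1.19 give SDC95 values that round to the reported 5.0 and 3.3, which suggests the authors used the same or a closely related definition.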
Lim, J X; Toh, R X; Chook, S K H; Sebastin, S J; Karjalainen, T
2014-06-01
Previous studies have established the role of quantitative measurements of palmar abduction strength of the thumb (PAST). This study compares the reliability of the 'make' versus the 'break' test in measuring PAST in healthy volunteers. In a 'make' test, the body part being tested is positioned at the start of its range of motion and the participant is asked to exert his/her maximal force. In a 'break' test, increasing force is applied to a body part after it has completed its range of motion, until the joint being tested gives way. PAST was measured in both hands in 100 healthy volunteers using a handheld device. Two examiners measured PAST using both the 'make' and 'break' test to determine inter-rater reliability. The tests were repeated in 30 volunteers 6 weeks after the initial testing to determine intra-rater reliability. Our results showed that the 'make' test has better inter and intra-rater reliability.
El-Housseiny, Azza A; Alsadat, Farah A; Alamoudi, Najlaa M; El Derwi, Douaa A; Farsi, Najat M; Attar, Moaz H; Andijani, Basil M
2016-04-14
Early recognition of dental fear is essential for the effective delivery of dental care. This study aimed to test the reliability and validity of the Arabic version of the Children's Fear Survey Schedule-Dental Subscale (CFSS-DS). A school-based sample of 1546 children was randomly recruited. The Arabic version of the CFSS-DS was completed by children during class time. The scale was tested for internal consistency and test-retest reliability. To test criterion validity, children's behavior was assessed using the Frankl scale during dental examination, and results were compared with children's CFSS-DS scores. To test the scale's construct validity, scores on "fear of going to the dentist soon" were correlated with CFSS-DS scores. Factor analysis was also used. The Arabic version of the CFSS-DS showed high reliability regarding both test-retest reliability (intraclass correlation = 0.83, p < 0.001) and internal consistency (Cronbach's α = 0.88). It showed good criterion validity: children with negative behavior had significantly higher fear scores (t = 13.67, p < 0.001). It also showed moderate construct validity (Spearman's rho correlation, r = 0.53, p < 0.001). Factor analysis identified the following factors: "fear of invasive dental procedures," "fear of less invasive dental procedures" and "fear of strangers." The Arabic version of the CFSS-DS is a reliable and valid measure of dental fear in Arabic-speaking children. Pediatric dentists and researchers may use this validated version of the CFSS-DS to measure dental fear in Arabic-speaking children.
Kaukinen, P.T.; Arokoski, J.P.; Huber, E.O.; Luomajoki, H.A.
2017-01-01
Objectives: To develop a test battery of movement control (MC) tests and assess its intertester and intratester reliability. Methods: 29 subjects with knee OA with mean age of 64.7 (SD 8.7) years and 12 controls without either knee pain or previous diagnosis of OA (mean age 36.6 (SD 16.2) years) were included. Two experienced physiotherapists rated the filmed test performance of six MC tests blinded to the patients and to each other on 3-point scale as correct, incorrect or failed. Weighted kappa coefficient (wK) with 95% confidence interval (95%CI) and the percentage of agreement were calculated for each test. Results: One-leg stance, one-leg squat 30 degrees and step down tests showed moderate to excellent inter- and intratester reliability with wK ranging between 0.43-0.85 for intertester and 0.51-0.80 for intratester reliability. The reliability of the 90 degrees squat test, small squat and step up tests was poor (wK ranging between 0.09-0.50). Conclusions: One-leg stance test, one-leg squat 30 degrees and step down test are reliable in the subjects with knee OA and controls. Further studies are needed to evaluate the discriminative validity of the reliable tests. PMID:28860422
Face Perception and Test Reliabilities in Congenital Prosopagnosia in Seven Tests
Esins, Janina; Schultz, Johannes; Stemper, Claudia; Kennerknecht, Ingo
2016-01-01
Congenital prosopagnosia, the innate impairment in recognizing faces, is a very heterogeneous disorder with different phenotypical manifestations. To investigate the nature of prosopagnosia in more detail, we tested 16 prosopagnosics and 21 controls with an extended test battery addressing various aspects of face recognition. Our results show that prosopagnosics exhibited significant impairments in several face recognition tasks: impaired holistic processing (they were tested amongst others with the Cambridge Face Memory Test (CFMT)) as well as reduced processing of configural information of faces. This test battery also revealed some new findings. While controls recognized moving faces better than static faces, prosopagnosics did not exhibit this effect. Furthermore, prosopagnosics had significantly impaired gender recognition—which is shown on a groupwise level for the first time in our study. There was no difference between groups in the automatic extraction of face identity information or in object recognition as tested with the Cambridge Car Memory Test. In addition, a methodological analysis of the tests revealed reduced reliability for holistic face processing tests in prosopagnosics. To our knowledge, this is the first study to show that prosopagnosics showed a significantly reduced reliability coefficient (Cronbach’s alpha) in the CFMT compared to the controls. We suggest that compensatory strategies employed by the prosopagnosics might be the cause for the vast variety of response patterns revealed by the reduced test reliability. This finding raises the question whether classical face tests measure the same perceptual processes in controls and prosopagnosics. PMID:27482369
Translation, reliability, and clinical utility of the Melbourne Assessment 2.
Gerber, Corinna N; Plebani, Anael; Labruyère, Rob
2017-10-12
The aims were to (i) provide a German translation of the Melbourne Assessment 2 (MA2), a quantitative test to measure unilateral upper limb function in children with neurological disabilities and (ii) to evaluate its reliability and aspects of clinical utility. After its translation into German and approval of the back translation by the original authors, the MA2 was performed and videotaped twice with 30 children with neuromotor disorders. For each participant, two raters scored the video of the first test for inter-rater reliability. To determine test-retest reliability, one rater additionally scored the video of the second test while the other rater repeated the scoring of the first video to evaluate intra-rater reliability. Time needed for rater training, test administration, and scoring was recorded. The four subscale scores showed excellent intra-, inter-rater, and test-retest reliability with intraclass correlation coefficients of 0.90-1.00 (95%-confidence intervals 0.78-1.00). Score items revealed substantial to almost perfect intra-rater reliability (weighted kappa κw = 0.66-1.00) for the more affected side. Score item inter-rater and test-retest reliability of the same extremity were, with one exception, moderate to almost perfect (κw = 0.42-0.97; κw = 0.40-0.89). Furthermore, the MA2 was feasible and acceptable for patients and clinicians. The MA2 showed excellent subscale and moderate to almost perfect score item reliability. Implications for Rehabilitation: There is a lack of high-quality studies about psychometric properties of upper limb measurement tools in the neuropediatric population. The Melbourne Assessment 2 is a promising tool for reliable measurement of unilateral upper limb movement quality in the neuropediatric population. The Melbourne Assessment 2 is acceptable and practicable to therapists and patients for routine use in clinical care.
Evaluating the reliability of an injury prevention screening tool: Test-retest study.
Gittelman, Michael A; Kincaid, Madeline; Denny, Sarah; Wervey Arnold, Melissa; FitzGerald, Michael; Carle, Adam C; Mara, Constance A
2016-10-01
A standardized injury prevention (IP) screening tool can identify family risks and allow pediatricians to address behaviors. To assess behavior changes on later screens, the tool must be reliable for an individual and ideally between household members. Little research has examined the reliability of safety screening tool questions. This study utilized test-retest reliability of parent responses on an existing IP questionnaire and also compared responses between household parents. Investigators recruited parents of children 0 to 1 year of age during admission to a tertiary care children's hospital. When both parents were present, one was chosen as the "primary" respondent. Primary respondents completed the 30-question IP screening tool after consent, and they were re-screened approximately 4 hours later to test individual reliability. The "second" parent, when present, only completed the tool once. All participants received a 10-dollar gift card. Cohen's Kappa was used to estimate test-retest reliability and inter-rater agreement. Standard test-retest criteria consider Kappa values: 0.0 to 0.40 poor to fair, 0.41 to 0.60 moderate, 0.61 to 0.80 substantial, and 0.81 to 1.00 as almost perfect reliability. One hundred five families participated, with five lost to follow-up. Thirty-two (30.5%) parent dyads completed the tool. Primary respondents were generally mothers (88%) and Caucasian (72%). Test-retest of the primary respondents showed their responses to be almost perfect; average 0.82 (SD = 0.13, range 0.49-1.00). Seventeen questions had almost perfect test-retest reliability and 11 had substantial reliability. However, inter-rater agreement between household members for 12 objective questions showed little agreement between responses; inter-rater agreement averaged 0.35 (SD = 0.34, range -0.19-1.00). One question had almost perfect inter-rater agreement and two had substantial inter-rater agreement. 
The IP screening tool used by a single individual had excellent test-retest reliability for nearly all questions. However, when a reporter changes from pre- to postintervention, differences may reflect poor reliability or different subjective experiences rather than true change.
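The interpretive bands quoted in the abstract above can be applied mechanically once Cohen's kappa is computed. The following is a minimal sketch, assuming the standard definition kappa = (p_o − p_e) / (1 − p_e); the function names and the example answers are hypothetical and are not data from the study. The abstract does not classify negative kappa values, so placing them in the lowest band is an assumption.

```python
from collections import Counter


def cohens_kappa(first, second):
    """Cohen's kappa for two equally long sequences of categorical answers."""
    if len(first) != len(second) or len(first) == 0:
        raise ValueError("both sequences must be non-empty and of equal length")
    n = len(first)
    # Observed agreement: share of items answered identically.
    observed = sum(a == b for a, b in zip(first, second)) / n
    # Chance agreement from the marginal frequencies of each category.
    counts_first, counts_second = Counter(first), Counter(second)
    expected = sum(counts_first[c] * counts_second[c] for c in counts_first) / (n * n)
    if expected == 1.0:
        return 1.0  # both sequences use one identical category throughout
    return (observed - expected) / (1 - expected)


def interpret_kappa(kappa):
    """Bands as quoted in the abstract; values below 0 are lumped into the
    lowest band, which is an assumption."""
    if kappa <= 0.40:
        return "poor to fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Hypothetical yes/no answers (1 = yes, 0 = no) at screen and re-screen.
test_answers = [1, 1, 1, 1, 0, 0, 0, 0]
retest_answers = [1, 1, 1, 0, 0, 0, 0, 1]
kappa = cohens_kappa(test_answers, retest_answers)
print(kappa, interpret_kappa(kappa))  # 0.5 moderate
```

The same function serves for test-retest (one respondent, two occasions) and for agreement between two household members (two respondents, one occasion); only the inputs differ.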
Sleeper, Mark D; Kenyon, Lisa K; Elliott, James M; Cheng, M Samuel
2016-12-01
Despite the availability of various field-tests for many competitive sports, a reliable and valid test specifically developed for use in men's gymnastics has not yet been developed. The Men's Gymnastics Functional Measurement Tool (MGFMT) was designed to assess sport-specific physical abilities in male competitive gymnasts. The purpose of this study was to develop the MGFMT by establishing a scoring system for individual test items and to initiate the process of establishing test-retest reliability and construct validity. A total of 83 competitive male gymnasts ages 7-18 underwent testing using the MGFMT. Thirty of these subjects underwent re-testing one week later in order to assess test-retest reliability. Construct validity was assessed using a simple regression analysis between total MGFMT scores and the gymnasts' USA-Gymnastics competitive level to calculate the coefficient of determination (r²). Test-retest reliability was analyzed using Model 1 Intraclass correlation coefficients (ICC). Statistical significance was set at the p<0.05 level. The relationship between total MGFMT scores and subjects' current USA-Gymnastics competitive level was found to be good (r² = 0.63). Reliability testing of the MGFMT composite test score showed excellent test-retest reliability over a one-week period (ICC = 0.97). Test-retest reliability of the individual component tests ranged from good to excellent (ICC = 0.75-0.97). The results of this study provide initial support for the construct validity and test-retest reliability of the MGFMT. Level of evidence: 3.
Paap, Kenneth R; Sawi, Oliver
2016-12-01
Studies testing for individual or group differences in executive functioning can be compromised by unknown test-retest reliability. Test-retest reliabilities across an interval of about one week were obtained from performance in the antisaccade, flanker, Simon, and color-shape switching tasks. There is a general trade-off between the greater reliability of single mean RT measures, and the greater process purity of measures based on contrasts between mean RTs in two conditions. The individual differences in RT model recently developed by Miller and Ulrich was used to evaluate the trade-off. Test-retest reliability was statistically significant for 11 of the 12 measures, but was of moderate size, at best, for the difference scores. The test-retest reliabilities for the Simon and flanker interference scores were lower than those for switching costs. Standard practice evaluates the reliability of executive-functioning measures using split-half methods based on data obtained in a single day. Our test-retest measures of reliability are lower, especially for difference scores. These reliability measures must also take into account possible day effects that classical test theory assumes do not occur. Measures based on single mean RTs tend to have acceptable levels of reliability and convergent validity, but are "impure" measures of specific executive functions. The individual differences in RT model shows that the impurity problem is worse than typically assumed. However, the "purer" measures based on difference scores have low convergent validity that is partly caused by deficiencies in test-retest reliability. Copyright © 2016 Elsevier B.V. All rights reserved.
Reliability Measure of a Clinical Test: Appreciation of Music in Cochlear Implantees (AMICI)
Cheng, Min-Yu; Spitzer, Jaclyn B.; Shafiro, Valeriy; Sheft, Stanley; Mancuso, Dean
2014-01-01
Purpose The goals of this study were (1) to investigate the reliability of a clinical music perception test, Appreciation of Music in Cochlear Implantees (AMICI), and (2) examine associations between the perception of music and speech. AMICI was developed as a clinical instrument for assessing music perception in persons with cochlear implants (CIs). The test consists of four subtests: (1) music versus environmental noise discrimination, (2) musical instrument identification (closed-set), (3) musical style identification (closed-set), and (4) identification of musical pieces (open-set). To be clinically useful, it is crucial for AMICI to demonstrate high test-retest reliability, so that CI users can be assessed and retested after changes in maps or programming strategies. Research Design Thirteen CI subjects were tested with AMICI for the initial visit and retested again 10–14 days later. Two speech perception tests (consonant-nucleus-consonant [CNC] and Bamford-Kowal-Bench Speech-in-Noise [BKB-SIN]) were also administered. Data Analysis Test-retest reliability and equivalence of the test’s three forms were analyzed using paired t-tests and correlation coefficients, respectively. Correlation analysis was also conducted between results from the music and speech perception tests. Results Results showed no significant difference between test and retest (p > 0.05) with adequate power (0.9) as well as high correlations between the three forms (Forms A and B, r = 0.91; Forms A and C, r = 0.91; Forms B and C, r = 0.95). Correlation analysis showed high correlation between AMICI and BKB-SIN (r = −0.71), and moderate correlation between AMICI and CNC (r = 0.4). Conclusions The study showed AMICI is highly reliable for assessing musical perception in CI users. PMID:24384082
Kwon, Sungjun; Kim, Jeehoon; Kang, Seungwoo; Lee, Youngki; Baek, Hyunjae
2014-01-01
We propose CardioGuard, a brassiere-based reliable electrocardiogram (ECG) monitoring sensor system, for supporting daily smartphone healthcare applications. It is designed to satisfy two key requirements for user-unobtrusive daily ECG monitoring: reliability of ECG sensing and usability of the sensor. The system is validated through extensive evaluations. The evaluation results showed that the CardioGuard sensor reliably measured the ECG during 12 representative daily activities including diverse movement levels; 89.53% of QRS peaks were detected on average. The questionnaire-based user study with 15 participants showed that the CardioGuard sensor was comfortable and unobtrusive. Additionally, the signal-to-noise ratio test and the washing durability test were conducted to show the high-quality sensing of the proposed sensor and its physical durability in practical use, respectively. PMID:25405527
ASSOCIATIONS BETWEEN THREE CLINICAL ASSESSMENT TOOLS FOR POSTURAL STABILITY
Saxion, Casie E.; Cameron, Kenneth L.; Gerber, J. Parry
2010-01-01
Study Design: Clinical Measurement, Correlation, Reliability Objectives: To assess the relationship between the Single Leg Balance (SLB), modified Balance Error Scoring System (mBESS), and modified Star Excursion Balance (mSEBT) tests and secondarily to assess inter-rater and test-retest reliability of these tests. Background: Ankle sprains often result in chronic instability and dysfunction. Several clinical tests assess postural deficits as a potential cause of this dysfunction; however, limited information exists pertaining to the relationship that these tests have with one another. Methods: Two independent examiners measured the performance of 34 healthy participants completing the SLB Test, mBESS test, and mSEBT at two different time periods. The relationship between tests was assessed using the Pearson Correlation and Fisher's Exact Tests. Inter-rater and test-retest reliability were assessed using the intraclass correlation coefficient (ICC) and Kappa statistics. Results: A significant correlation (r = -0.35) was observed between the mSEBT and the mBESS. Fisher's Exact Test showed a significant association between the SLB Test and mBESS (P = .048), but no association between the SLB and mSEBT (P = 1.000). Inter-rater reliability was excellent for the mSEBT and fair for the mBESS (ICCs of .91 and .61 respectively). Excellent agreement was observed between raters for the SLB test (k = 1.00). Test-retest reliability was excellent for the mSEBT (ICC = 0.98) and fair for the mBESS (ICC = 0.74). There was poor test-retest agreement for the SLB test (k = .211). Conclusion: There was a significant relationship observed between the SLB Test, mBESS test, and mSEBT; however, strength-of-association measures showed limited overlap between these tests. This suggests that these tests are interrelated but may not assess equal components of postural stability. PMID:21589668
Larsson, Helena; Tegern, Matthias; Monnier, Andreas; Skoglund, Jörgen; Helander, Charlotte; Persson, Emelie; Malm, Christer; Broman, Lisbet; Aasa, Ulrika
2015-01-01
The objective of this study was to examine the content validity of commonly used muscle performance tests in military personnel and to investigate the reliability of a proposed test battery. For the content validity investigation, thirty selected tests were those described in the literature and/or commonly used in the Nordic and North Atlantic Treaty Organization (NATO) countries. Nine selected experts rated, on a four-point Likert scale, the relevance of these tests in relation to five different work tasks: lifting, carrying equipment on the body or in the hands, climbing, and digging. Thereafter, a content validity index (CVI) was calculated for each work task. The results showed excellent CVI (≥0.78) for sixteen tests in relation to one or more of the military work tasks. Three of the tests, namely the functional lower-limb loading test (the Ranger test), dead-lift with kettlebells, and back extension, showed excellent content validity for four of the work tasks. For the development of a new muscle strength/endurance test battery, these three tests were further supplemented with two other tests, namely, the chins and side-bridge test. The inter-rater reliability was high (intraclass correlation coefficient, ICC2,1 0.99) for all five tests. The intra-rater reliability was good to high (ICC3,1 0.82–0.96) with an acceptable standard error of mean (SEM), except for the side-bridge test (SEM%>15). Thus, the final suggested test battery for a valid and reliable evaluation of soldiers’ muscle performance comprised the following four tests: the Ranger test, dead-lift with kettlebells, chins, and back extension test. The criterion-related validity of the test battery should be further evaluated for soldiers exposed to varying physical workload. PMID:26177030
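The abstract above gives the rating scale (four-point Likert), the panel size (nine experts) and the threshold (CVI ≥ 0.78) but not the CVI formula. The sketch below assumes the commonly used item-level definition, the proportion of experts who rate an item 3 or 4; that definition, the function name, and the ratings are assumptions for illustration, not data from the study.

```python
def item_cvi(ratings, relevant_min=3):
    """Item-level content validity index: share of experts whose rating is
    at least `relevant_min` (assumed here to be 3 on a four-point scale)."""
    if not ratings:
        raise ValueError("at least one expert rating is required")
    return sum(r >= relevant_min for r in ratings) / len(ratings)


EXCELLENT_CVI = 0.78  # threshold quoted in the abstract

# Hypothetical ratings from nine experts for two tests on one work task.
ratings_by_test = {
    "test A": [4, 4, 3, 3, 4, 2, 4, 3, 4],  # 8 of 9 experts rate it relevant
    "test B": [4, 2, 3, 1, 4, 2, 3, 3, 4],  # 6 of 9 experts rate it relevant
}

for name, ratings in ratings_by_test.items():
    cvi = item_cvi(ratings)
    verdict = "excellent" if cvi >= EXCELLENT_CVI else "below threshold"
    print(f"{name}: CVI = {cvi:.2f} ({verdict})")
```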
Larson, Tomas; Kerekes, Nóra; Selinus, Eva Norén; Lichtenstein, Paul; Gumpert, Clara Hellner; Anckarsäter, Henrik; Nilsson, Thomas; Lundström, Sebastian
2014-02-01
The Autism-Tics, AD/HD, and other Comorbidities (A-TAC) inventory is used in epidemiological research to assess neurodevelopmental problems and coexisting conditions. Although the A-TAC has been applied in various populations, data on retest reliability are limited. The objective of the present study was to present additional reliability data. The A-TAC was administered by lay assessors and was completed on two occasions by parents of 400 individual twins, with an average interval of 70 days between test sessions. Intra- and inter-rater reliability were analysed with intraclass correlations and Cohen's kappa. A-TAC showed excellent test-retest intraclass correlations for both autism spectrum disorder and attention deficit hyperactivity disorder (each at .84). Most modules in the A-TAC had intra- and inter-rater reliability intraclass correlation coefficients of ≥ .60. Cohen's kappa indicated acceptable reliability. The current study provides statistical evidence that the A-TAC yields good test-retest reliability in a population-based cohort of children.
Reliability and validity of a Chinese version of the Diagnostic Interview for Borderlines-Revised.
Wang, Lanlan; Yuan, Chenmei; Qiu, Jianying; Gunderson, John; Zhang, Min; Jiang, Kaida; Leung, Freedom; Zhong, Jie; Xiao, Zeping
2014-09-01
Borderline personality disorder (BPD) is the most studied of the axis II disorders. One of the most widely used diagnostic instruments is the Diagnostic Interview for Borderline Patients-Revised (DIB-R). The aim of this study was to test the reliability and validity of DIB-R for use in the Chinese culture. The reliability and validity of the DIB-R Chinese version were assessed in a sample of 236 outpatients with a probable BPD diagnosis. The Structured Clinical Interview for DSM-IV Personality Disorders (SCID-II) was used as a standard. Test-retest reliability was tested six months later with 20 patients, and inter-rater reliability was tested on 32 patients. The Chinese version of the DIB-R showed good internal global consistency (Cronbach's α of 0.916), good test-retest reliability (Pearson correlation of 0.704), good inter-rater reliability (intra-class correlation coefficient of 0.892 and kappa of 0.861). When compared with the DSM-IV diagnosis as measured by the SCID-II, the DIB-R showed relatively good sensitivity (0.768) and specificity (0.891) at the cutoff of 7, moderate diagnostic convergence (kappa of 0.631), as well as good discriminating validity. The Chinese version of the DIB-R has good psychometric properties, which renders it a valuable method for examining the presence, the severity, and component phenotypes of BPD in Chinese samples. © 2013 Wiley Publishing Asia Pty Ltd.
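The abstract above reports sensitivity and specificity at a cutoff of 7 against a reference diagnosis. As a minimal sketch of how such figures are derived, the code below counts true and false positives and negatives at a given cutoff. It assumes that a score at or above the cutoff counts as a positive screen; that convention, the function name, and the example scores are hypothetical and are not taken from the study.

```python
def sensitivity_specificity(scores, has_diagnosis, cutoff):
    """Return (sensitivity, specificity) for `score >= cutoff` as the positive
    rule, judged against a reference diagnosis (True = disorder present)."""
    tp = fn = tn = fp = 0
    for score, positive_reference in zip(scores, has_diagnosis):
        screened_positive = score >= cutoff
        if positive_reference and screened_positive:
            tp += 1
        elif positive_reference:
            fn += 1
        elif screened_positive:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one reference-positive and one reference-negative case")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical interview scores and reference diagnoses for eight patients.
scores = [9, 8, 7, 6, 5, 3, 8, 2]
reference = [True, True, True, True, False, False, False, False]

sens, spec = sensitivity_specificity(scores, reference, cutoff=7)
print(sens, spec)  # 0.75 0.75
```

Raising the cutoff generally trades sensitivity for specificity, which is why a single reported cutoff is usually chosen from a sweep over candidate values.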
Kenyon, Lisa K.; Elliott, James M; Cheng, M. Samuel
2016-01-01
Purpose/Background Despite the availability of various field-tests for many competitive sports, a reliable and valid test specifically developed for use in men's gymnastics has not yet been developed. The Men's Gymnastics Functional Measurement Tool (MGFMT) was designed to assess sport-specific physical abilities in male competitive gymnasts. The purpose of this study was to develop the MGFMT by establishing a scoring system for individual test items and to initiate the process of establishing test-retest reliability and construct validity. Methods A total of 83 competitive male gymnasts ages 7-18 underwent testing using the MGFMT. Thirty of these subjects underwent re-testing one week later in order to assess test-retest reliability. Construct validity was assessed using a simple regression analysis between total MGFMT scores and the gymnasts’ USA-Gymnastics competitive level to calculate the coefficient of determination (r²). Test-retest reliability was analyzed using Model 1 Intraclass correlation coefficients (ICC). Statistical significance was set at the p<0.05 level. Results The relationship between total MGFMT scores and subjects’ current USA-Gymnastics competitive level was found to be good (r² = 0.63). Reliability testing of the MGFMT composite test score showed excellent test-retest reliability over a one-week period (ICC = 0.97). Test-retest reliability of the individual component tests ranged from good to excellent (ICC = 0.75-0.97). Conclusions The results of this study provide initial support for the construct validity and test-retest reliability of the MGFMT. Level of Evidence Level 3 PMID:27999723
Validity and Reliability Testing of an e-learning Questionnaire for Chemistry Instruction
NASA Astrophysics Data System (ADS)
Guspatni, G.; Kurniawati, Y.
2018-04-01
The aim of this paper is to examine validity and reliability of a questionnaire used to evaluate e-learning implementation in chemistry instruction. 48 questionnaires were filled in by students who had studied chemistry through e-learning system. The questionnaire consisted of 20 indicators evaluating students’ perception on using e-learning. Parametric testing was done as data were assumed to follow normal distribution. Item validity of the questionnaire was examined through item-total correlation using Pearson’s formula while its reliability was assessed with Cronbach’s alpha formula. Moreover, convergent validity was assessed to see whether indicators building a factor had theoretically the same underlying construct. The result of validity testing revealed 19 valid indicators while the result of reliability testing revealed Cronbach’s alpha value of .886. The result of factor analysis showed that questionnaire consisted of five factors, and each of them had indicators building the same construct. This article shows the importance of factor analysis to get a construct valid questionnaire before it is used as research instrument.
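The abstract above assesses questionnaire reliability with Cronbach's alpha. A minimal sketch of that computation follows, assuming the standard formula alpha = k/(k − 1) × (1 − Σ item variances / variance of total scores) with sample variances. The function name and the response matrix are hypothetical (five respondents, three items), not the study's 48 questionnaires with 20 indicators.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    Uses sample variances (n - 1 denominator) for items and total scores.
    """
    n_items = len(responses[0])
    if n_items < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    if any(len(row) != n_items for row in responses):
        raise ValueError("every respondent must answer every item")
    # Variance of each item across respondents (columns of the matrix).
    item_variances = [variance(col) for col in zip(*responses)]
    # Variance of each respondent's total score.
    total_variance = variance([sum(row) for row in responses])
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical Likert responses: rows are respondents, columns are items.
responses = [
    [4, 5, 4],
    [3, 3, 2],
    [5, 5, 5],
    [2, 3, 2],
    [4, 4, 3],
]
print(round(cronbach_alpha(responses), 3))  # 0.964
```

Item-total correlation, the validity check named in the abstract, would correlate each column with the respondents' total scores; it is left out here to keep the sketch short.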
González-Gil, E M; Mouratidou, T; Cardon, G; Androutsos, O; De Bourdeaudhuij, I; Góźdź, M; Usheva, N; Birnbaum, J; Manios, Y; Moreno, L A
2014-08-01
Reliable assessments of health-related behaviours are necessary for accurate evaluation on the efficiency of public health interventions. The aim of the current study was to examine the reliability of a self-administered primary caregivers questionnaire (PCQ) used in the ToyBox-intervention. The questionnaire consisted of six sections addressing sociodemographic and perinatal factors, water and beverages consumption, physical activity, snacking and sedentary behaviours. Parents/caregivers from six countries (Belgium, Bulgaria, Germany, Greece, Poland and Spain) were asked to complete the questionnaire twice within a 2-week interval. A total of 93 questionnaires were collected. Test-retest reliability was assessed using intra-class correlation coefficient (ICC). Reliability of the six questionnaire sections was assessed. A stronger agreement was observed in the questions addressing sociodemographic and perinatal factors as opposed to questions addressing behaviours. Findings showed that 92% of the ToyBox PCQ had a moderate-to-excellent test-retest reliability (defined as ICC values from 0.41 to 1) and less than 8% poor test-retest reliability (ICC < 0.40). Out of the total ICC values, 67% showed good-to-excellent reliability (ICC from 0.61 to 1). We conclude that the PCQ is a reliable tool to assess sociodemographic characteristics, perinatal factors and lifestyle behaviours of pre-school children and their families participating in the ToyBox-intervention. © 2014 World Obesity.
Bragança, Sara; Arezes, Pedro; Carvalho, Miguel; Ashdown, Susan P; Castellucci, Ignacio; Leão, Celina
2018-01-01
Collecting anthropometric data for real-life applications demands a high degree of precision and reliability. It is important to test new equipment that will be used for data collection. OBJECTIVE: Compare two anthropometric data gathering techniques - manual methods and a Kinect-based 3D body scanner - to understand which of them gives more precise and reliable results. The data was collected using a measuring tape and a Kinect-based 3D body scanner. It was evaluated in terms of precision by considering the regular and relative Technical Error of Measurement and in terms of reliability by using the Intraclass Correlation Coefficient, Reliability Coefficient, Standard Error of Measurement and Coefficient of Variation. The results obtained showed that both methods presented better results for reliability than for precision. Both methods showed relatively good results for these two variables, however, manual methods had better results for some body measurements. Despite being considered sufficiently precise and reliable for certain applications (e.g. apparel industry), the 3D scanner tested showed, for almost every anthropometric measurement, a different result than the manual technique. Many companies design their products based on data obtained from 3D scanners, hence, understanding the precision and reliability of the equipment used is essential to obtain feasible results.
Bergamin, Marco; Gobbo, Stefano; Bullo, Valentina; Vendramin, Barbara; Duregon, Federica; Frizziero, Antonio; Di Blasio, Andrea; Cugusi, Lucia; Zaccaria, Marco; Ermolao, Andrea
2017-01-01
Lower extremity muscle mass, strength, power, and physical performance are critical determinants of independent functioning in later life. Isokinetic dynamometers are becoming very common in assessing different features of muscle strength, in both research and clinical practice; however, reliability studies are still needed to support the extended use of those devices. The purpose of this study is to assess the test-retest reliability of knee and ankle isokinetic and isometric strength testing protocols in a sample of older healthy subjects, using a new and untested isokinetic multi-joint evaluation system. Sixteen male and fourteen female older adults (mean age 65.2 ± 4.6 years) were assessed in two testing sessions. Each participant performed a randomized testing procedure that included different isometric and isokinetic tests for knee and ankle joints. All participants concluded the trial safely and no subject reported any discomfort throughout the overall assessment. Coefficients of correlation between measures were calculated showing moderate to strong effects among all test-retest assessments and paired-sample t test showed only one significant difference (p<0.05) in the maximal isokinetic bilateral knee flexion torque. The multi-joint evaluation system for the assessment of knee and ankle isokinetic and isometric strength provided reliable test-retest measures in healthy older adults. Ib.
Wu, X; Lund, M S; Sun, D; Zhang, Q; Su, G
2015-10-01
One of the factors affecting the reliability of genomic prediction is the relationship among the animals of interest. This study investigated the reliability of genomic prediction in various scenarios with regard to the relationship between test and training animals, and among animals within the training data set. Different training data sets were generated from EuroGenomics data and a group of Nordic Holstein bulls (born in 2005 and afterwards) as a common test data set. Genomic breeding values were predicted using a genomic best linear unbiased prediction model and a Bayesian mixture model. The results showed that a closer relationship between test and training animals led to a higher reliability of genomic predictions for the test animals, while a closer relationship among training animals resulted in a lower reliability. In addition, the Bayesian mixture model in general led to a slightly higher reliability of genomic prediction, especially for the scenario of distant relationships between training and test animals. Therefore, to prevent a decrease in reliability, constant updates of the training population with animals from more recent generations are required. Moreover, a training population consisting of less-related animals is favourable for reliability of genomic prediction. © 2015 Blackwell Verlag GmbH.
Barbado, David; Moreside, Janice; Vera-Garcia, Francisco J
2017-03-01
Although unstable seat methodology has been used to assess trunk postural control, the reliability of the variables that characterize it remains unclear. To analyze reliability and learning effect of center of pressure (COP) and kinematic parameters that characterize trunk postural control performance in unstable seating. The relationships between kinematic and COP parameters also were explored. Test-retest reliability design. Biomechanics laboratory setting. Twenty-three healthy male subjects. Participants volunteered to perform 3 sessions at 1-week intervals, each consisting of five 70-second balancing trials. A force platform and a motion capture system were used to measure COP and pelvis, thorax, and spine displacements. Reliability was assessed through standard error of measurement (SEM) and intraclass correlation coefficients (ICC2,1) using 3 methods: (1) comparing the last trial score of each day; (2) comparing the best trial score of each day; and (3) calculating the average of the three last trial scores of each day. Standard deviation and mean velocity were calculated to assess balance performance. Although analyses of variance showed some differences in balance performance between days, these differences were not significant between days 2 and 3. Best result and average methods showed the greatest reliability. Mean velocity of the COP showed high reliability (0.71 < ICC < 0.86; 10.3 < SEM < 13.0), whereas standard deviation only showed a low to moderate reliability (0.37 < ICC < 0.61; 14.5 < SEM < 23.0). Regarding the kinematic variables, only pelvis displacement mean velocity achieved a high reliability using the average method (0.62 < ICC < 0.83; 18.8 < SEM < 23.1). Correlations between COP and kinematics were high only for mean velocity (0.45
Moro, Maria Francesca; Colom, Francesc; Floris, Francesca; Pintus, Elisa; Pintus, Mirra; Contini, Francesca; Carta, Mauro Giovanni
2012-01-01
Background: The Functioning Assessment Short Test (FAST) is a brief instrument designed to assess the main functioning problems experienced by psychiatric patients, specifically bipolar patients. It includes 24 items assessing impairment or disability in six domains of functioning: autonomy, occupational functioning, cognitive functioning, financial issues, interpersonal relationships and leisure time. The aim of this study is to measure the validity and reliability of the Italian version of this instrument. Methods: Twenty-four patients with DSM-IV TR bipolar disorder and 20 healthy controls were recruited and evaluated in three private clinics in Cagliari (Sardinia, Italy). The psychometric properties of FAST (feasibility, internal consistency, concurrent validity, discriminant validity [patients vs controls and euthymic patients vs manic and depressed], and test-retest reliability) were analyzed. Results: The internal consistency obtained was very high, with a Cronbach's alpha of 0.955. A highly significant negative correlation with GAF was obtained (r = -0.9; p < 0.001), pointing to a reasonable degree of concurrent validity. FAST showed good test-retest reliability between two independent evaluations one week apart (mean K = 0.73). The total FAST scores were lower in controls than in bipolar patients, and lower in euthymic patients than in depressed or manic patients. Conclusion: The Italian version of the FAST showed psychometric properties similar to those of the original version with regard to internal consistency and discriminant validity, and showed good test-retest reliability as measured by K statistics. PMID:22905035
Wright, F Virginia; Ryan, Jennifer; Brewer, Kelly
2010-01-01
To examine inter-rater, intra-rater and test-re-test reliability of the Community Balance and Mobility Scale (CB&M) and compare reliability in live vs videotape rating contexts for children with acquired brain injury (ABI). Repeated measures design. Seven physiotherapists (PTs) were trained as assessors. The primary assessor administered and scored baseline CB&M while the second assessor observed and scored independently (inter-rater reliability). Re-assessment occurred 3-10 days later by primary assessor (test-re-test reliability). Assessments were videotaped. There were 32 participants with ABI (mean age = 14 years 1 month (SD = 2 years 1 month)). Baseline mean scores were 67.4% (18.2) and 66.7% (18.3) for primary and second assessor, respectively. Primary assessors' re-test mean score was 69.3%. Inter-rater reliability ICC was 0.93 (95% confidence interval (CI) = 0.87-0.97). Test-re-test ICC was 0.90 (95%CI = 0.81-0.95) and Bland-Altman plot indicated greatest test-re-test differences for mid-range CB&M scores. Minimum detectable change (MDC₉₀) was 13.5% points. The CB&M showed excellent reliability in youth. Reliability was comparable for live and videotape rating approaches, meaning that the easier and less expensive live-rating can be recommended. Future work should focus on evaluation of responsiveness to change in rehabilitation centre and community intervention contexts.
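The reported MDC₉₀ can be roughly reproduced from the test-re-test ICC and a baseline standard deviation with the conventional formulas SEM = SD × sqrt(1 − ICC) and MDC₉₀ = 1.645 × sqrt(2) × SEM. The sketch below is only an illustration: it assumes that this standard approach applies and that the primary assessor's baseline SD (18.2) is the relevant spread, neither of which the abstract states, and the function names are hypothetical.

```python
import math


def standard_error_of_measurement(sd, icc):
    """SEM from a sample SD and a reliability coefficient (conventional formula)."""
    return sd * math.sqrt(1.0 - icc)


def minimal_detectable_change(sd, icc, z=1.645):
    """MDC at the confidence level implied by z (1.645 corresponds to 90%)."""
    return z * math.sqrt(2.0) * standard_error_of_measurement(sd, icc)


# Assumed inputs taken from the abstract: baseline SD 18.2, test-re-test ICC 0.90.
mdc90 = minimal_detectable_change(sd=18.2, icc=0.90)
print(round(mdc90, 1))  # about 13.4, close to the reported 13.5 percentage points
```

The small gap from the published 13.5 is expected, since the inputs here are rounded values from the abstract.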
Ribeiro, João Carlos; Simões, João; Silva, Filipe; Silva, Eduardo D.; Hummel, Cornelia; Hummel, Thomas; Paiva, António
2016-01-01
The cross-cultural adaptation and validation of the Sniffin' Sticks test for the Portuguese population is described. Over 270 people participated in four experiments. In Experiment 1, 67 participants rated the familiarity of presented odors and seven descriptors of the original test were adapted to a Portuguese context. In Experiment 2, the Portuguese version of Sniffin' Sticks test was administered to 203 healthy participants. Older age, male gender and active smoking status were confirmed as confounding factors. The third experiment showed the validity of the Portuguese version of Sniffin' Sticks test in discriminating healthy controls from patients with olfactory dysfunction. In Experiment 4, the test-retest reliability for both the composite score (r71 = 0.86) and the identification test (r71 = 0.62) was established (p<0.001). Normative data for the Portuguese version of Sniffin' Sticks test is provided, showing good validity and reliability and effectively distinguishing patients from healthy controls with high sensitivity and specificity. The Portuguese version of Sniffin' Sticks test identification test is a clinically suitable screening tool in routine outpatient Portuguese settings. PMID:26863023
Beemster, Timo T; van Velzen, Judith M; van Bennekom, Coen A M; Reneman, Michiel F; Frings-Dresen, Monique H W
2018-03-16
The purpose of this study was to assess test-retest reliability, agreement, and responsiveness of questionnaires on productivity loss (iPCQ-VR) and healthcare utilization (TiCP-VR) for sick-listed workers with chronic musculoskeletal pain who were referred to vocational rehabilitation. Methods: Test-retest reliability and agreement were assessed with a 2-week interval. Responsiveness was assessed at discharge after a 15-week vocational rehabilitation (VR) program. Data was obtained from six Dutch VR centers. Test-retest reliability was determined with intraclass correlation coefficient (ICC) and Cohen's kappa. Agreement was determined by Standard Error of Measurement (SEM), smallest detectable changes (on group and individual level), and percentage observed, positive and negative agreement. Responsiveness was determined with area under the curve (AUC) obtained from the receiver operating characteristic (ROC) curve. Results: A sample of 52 participants on test-retest reliability and agreement, and a sample of 223 on responsiveness were included in the analysis. Productivity loss (iPCQ-VR): ICCs ranged from 0.52 to 0.90, kappa ranged from 0.42 to 0.96, and AUC ranged from 0.55 to 0.86. Healthcare utilization (TiCP-VR): ICC was 0.81, and kappa values of the single healthcare utilization items ranged from 0.11 to 1.00. Conclusions: The iPCQ-VR showed good measurement properties on working status, number of hours working per week and long-term sick leave, and low measurement properties on short-term sick leave and presenteeism. The TiCP-VR showed adequate reliability on all healthcare utilization items together and medication use, but showed low measurement properties on the single healthcare utilization items.
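For categorical questionnaire items such as those above, Cohen's kappa compares the observed agreement between two administrations with the agreement expected by chance. The following is a minimal sketch of the unweighted statistic; the function name and the example answers are hypothetical and are not data from the study.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two equally long sequences of category labels."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both sequences must be non-empty and of equal length")
    n = len(ratings_a)
    # Proportion of items on which the two administrations agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement from the marginal category frequencies.
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return float("nan")  # kappa is undefined when only one category is ever used
    return (observed - expected) / (1.0 - expected)


# Hypothetical yes/no answers at test and retest.
test_answers = ["yes", "yes", "no", "no"]
retest_answers = ["yes", "no", "no", "no"]
print(cohens_kappa(test_answers, retest_answers))
```

A kappa near 0 means agreement no better than chance, which is why a single item can show high percentage agreement and still a low kappa when one answer dominates.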
Akram, A J; Ireland, A J; Postlethwaite, K C; Sandy, J R; Jerreat, A S
2013-11-01
This article describes the process of validity and reliability testing of a condition-specific quality-of-life measure for patients with hypodontia presenting for orthodontic treatment. The development of the instrument is described in a previous article. Royal Devon and Exeter NHS Foundation Trust & Musgrove Park Hospital, Taunton. The child perception questionnaire was used as a standard against which to test criterion validity. The Bland and Altman method was used to check agreement between the two questionnaires. Construct validity was tested using principal component analysis on the four sections of the questionnaire. Test-retest reliability was tested using intraclass correlation coefficient and Bland and Altman method. Cronbach's alpha was used to test internal consistency reliability. Overall the questionnaire showed good reliability, criterion and construct validity. This together with previous evidence of good face and content validity suggests that the instrument may prove useful in clinical practice and further research. This study has demonstrated that the newly developed condition-specific quality-of-life questionnaire is both valid and reliable for use in young patients with hypodontia. © 2013 John Wiley & Sons A/S. Published by Blackwell Publishing Ltd.
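The Bland and Altman method mentioned above summarizes agreement between two sets of paired scores as a mean difference (bias) and 95% limits of agreement. A minimal sketch follows, assuming the usual bias ± 1.96 × SD definition; the function name and the sample scores are hypothetical.

```python
import statistics


def bland_altman_limits(first, second):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need at least two paired measurements")
    diffs = [a - b for a, b in zip(first, second)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample SD of the paired differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical questionnaire totals from two administrations.
bias, lower, upper = bland_altman_limits([10, 12, 14, 16], [9, 13, 13, 17])
print(bias, round(lower, 2), round(upper, 2))
```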
Operator adaptation to changes in system reliability under adaptable automation.
Chavaillaz, Alain; Sauer, Juergen
2017-09-01
This experiment examined how operators coped with a change in system reliability between training and testing. Forty participants were trained for 3 h on a complex process control simulation modelling six levels of automation (LOA). In training, participants either experienced a high- (100%) or low-reliability system (50%). The impact of training experience on operator behaviour was examined during a 2.5 h testing session, in which participants either experienced a high- (100%) or low-reliability system (60%). The results showed that most operators did not often switch between LOA. Most chose an LOA that relieved them of most tasks but maintained their decision authority. Training experience did not have a strong impact on the outcome measures (e.g. performance, complacency). Low system reliability led to decreased performance and self-confidence. Furthermore, complacency was observed under high system reliability. Overall, the findings suggest benefits of adaptable automation because it accommodates different operator preferences for LOA. Practitioner Summary: The present research shows that operators can adapt to changes in system reliability between training and testing sessions. Furthermore, it provides evidence that each operator has his/her preferred automation level. Since this preference varies strongly between operators, adaptable automation seems to be suitable to accommodate these large differences.
Reliability based design optimization: Formulations and methodologies
NASA Astrophysics Data System (ADS)
Agarwal, Harish
Modern products ranging from simple components to complex systems should be designed to be optimal and reliable. The challenge of modern engineering is to ensure that manufacturing costs are reduced and design cycle times are minimized while achieving requirements for performance and reliability. If the market for the product is competitive, improved quality and reliability can generate very strong competitive advantages. Simulation based design plays an important role in designing almost any kind of automotive, aerospace, and consumer products under these competitive conditions. Single discipline simulations used for analysis are being coupled together to create complex coupled simulation tools. This investigation focuses on the development of efficient and robust methodologies for reliability based design optimization in a simulation based design environment. Original contributions of this research are the development of a novel efficient and robust unilevel methodology for reliability based design optimization, the development of an innovative decoupled reliability based design optimization methodology, the application of homotopy techniques in unilevel reliability based design optimization methodology, and the development of a new framework for reliability based design optimization under epistemic uncertainty. The unilevel methodology for reliability based design optimization is shown to be mathematically equivalent to the traditional nested formulation. Numerical test problems show that the unilevel methodology can reduce computational cost by at least 50% as compared to the nested approach. The decoupled reliability based design optimization methodology is an approximate technique to obtain consistent reliable designs at lesser computational expense. Test problems show that the methodology is computationally efficient compared to the nested approach. 
A framework for performing reliability based design optimization under epistemic uncertainty is also developed. A trust region managed sequential approximate optimization methodology is employed for this purpose. Results from numerical test studies indicate that the methodology can be used for performing design optimization under severe uncertainty.
ERIC Educational Resources Information Center
Vacha-Haase, Tammi; Kogan, Lori R.; Tani, Crystal R.; Woodall, Renee A.
2001-01-01
Used reliability generalization to explore the variance of scores on 10 Minnesota Multiphasic Personality Inventory (MMPI) clinical scales drawing on 1,972 articles in the literature on the MMPI. Results highlight the premise that scores, not tests, are reliable or unreliable, and they show that study characteristics do influence scores on the…
Safipour, Jalal; Tessma, Mesfin Kassaye; Higginbottom, Gina; Emami, Azita
2010-12-01
The objective of the study is to translate and examine the reliability and validity of the Jessor and Jessor Social Alienation Scale for use in a Swedish context. The study involved four phases of testing: (1) Translation and back-translation; (2) a pilot test to evaluate the translation; (3) reliability testing; and (4) a validity test. Main participants of this study were 446 students (Age = 15-19, SD = 1.01, Mean = 17). Results from the reliability test showed high internal consistency and stability. Face, content and construct validity were demonstrated using experts and confirmatory factor analysis. The Swedish version of the alienation scale showed an acceptable level of reliability and validity and is appropriate for use in the Swedish context. © 2010 The Authors. Scandinavian Journal of Psychology © 2010 The Scandinavian Psychological Associations.
System reliability of randomly vibrating structures: Computational modeling and laboratory testing
NASA Astrophysics Data System (ADS)
Sundar, V. S.; Ammanagi, S.; Manohar, C. S.
2015-09-01
The problem of determining the system reliability of randomly vibrating structures arises in many application areas of engineering. In this paper we discuss approaches based on Monte Carlo simulations and laboratory testing to tackle problems of time-variant system reliability estimation. The strategy we adopt is based on the application of Girsanov's transformation to the governing stochastic differential equations, which enables estimation of the probability of failure with significantly fewer samples than are needed in a direct simulation study. Notably, we show that the ideas from Girsanov's transformation based Monte Carlo simulations can be extended to conduct laboratory testing to assess system reliability of engineering structures with a reduced number of samples and hence with reduced testing times. Illustrative examples include computational studies on a 10-degree of freedom nonlinear system model and laboratory/computational investigations on road load response of an automotive system tested on a four-post test rig.
Rodríguez-Rosell, David; Mora-Custodio, Ricardo; Franco-Márquez, Felipe; Yáñez-García, Juan M; González-Badillo, Juan J
2017-01-01
Rodríguez-Rosell, D, Mora-Custodio, R, Franco-Márquez, F, Yáñez-García, JM, González-Badillo, JJ. Traditional vs. sport-specific vertical jump tests: reliability, validity, and relationship with the legs strength and sprint performance in adult and teen soccer and basketball players. J Strength Cond Res 31(1): 196-206, 2017-The vertical jump is considered an essential motor skill in many team sports. Many protocols have been used to assess vertical jump ability. However, controversy regarding test selection still exists based on the reliability and specificity of the tests. The main aim of this study was to analyze the reliability and validity of 2 standardized (countermovement jump [CMJ] and Abalakov jump [AJ]) and 2 sport-specific (run-up with 2 [2-LEGS] or 1 leg [1-LEG] take-off jump) vertical jump tests, and their usefulness as predictors of sprint and strength performance for soccer (n = 127) and basketball (n = 59) players in 3 different categories (Under-15, Under-18, and Adults). Three attempts for each of the 4 jump tests were recorded. Twenty-meter sprint time and estimated 1 repetition maximum in full squat were also evaluated. All jump tests showed high intraclass correlation coefficients (0.969-0.995) and low coefficients of variation (1.54-4.82%), although 1-LEG was the jump test with the lowest absolute and relative reliability. All selected jump tests were significantly correlated (r = 0.580-0.983). Factor analysis resulted in the extraction of one principal component, which explained 82.90-95.79% of the variance of all jump tests. The 1-LEG test showed the lowest associations with sprint and strength performance. The results of this study suggest that CMJ and AJ are the most reliable tests for the estimation of explosive force in soccer and basketball players in different age categories.
Gilkison, C R; Fenton, M V; Lester, J W
1992-05-01
This study was designed to establish the reliability of a health history questionnaire used as a screening tool for incoming university students. The authors used a test-retest design, with a test interval of 6 months, on a sample of medical and nursing students. The analysis focused on overall reliability of the questionnaire and reproducibility of specific items, based on question format. Questionnaire items of specific interest were those with dichotomous yes/no response options versus open-ended format questions, those using the words frequently or recently, or those that asked multiple questions. Demographic characteristics of the subjects were considered in the evaluation of reliability. Overall reliability of the questionnaire (93.6%) was above the anticipated level of 90%, and subject sex or program of study did not show any significant differences in reproducibility of responses. Although wording of questions did not affect item reliability, dichotomous format questions demonstrated a higher degree of reliability (96.4%) than the overall reliability of the questionnaire. Recommendations for enhancing the reliability of the questionnaire are based on item analysis and information gathered from interviews with subjects.
Tankevicius, Gediminas; Lankaite, Doanata; Krisciunas, Aleksandras
2013-08-01
The lack of knowledge about isometric ankle testing indicates the need for research in this area. To assess test-retest reliability and to determine the optimal position for isometric ankle-eversion and -inversion testing. Test-retest reliability study. Isometric ankle eversion and inversion were assessed in 3 different dynamometer foot-plate positions: 0°, 7°, and 14° of inversion. Two maximal repetitions were performed at each angle. Both limbs were tested (40 ankles in total). The test was performed 2 times with a period of 7 d between the tests. University hospital. The study was carried out on 20 healthy athletes with no history of ankle sprains. Reliability was assessed using intraclass correlation coefficient (ICC2,1); minimal detectable change (MDC) was calculated using a 95% confidence interval. Paired t test was used to measure statistically significant changes, and P < .05 was considered statistically significant. Eversion and inversion peak torques showed high ICCs in all 3 angles (ICC values .87-.96, MDC values 3.09-6.81 Nm). Eversion peak torque was the smallest when testing at the 0° angle and gradually increased, reaching maximum values at the 14° angle. The increase of eversion peak torque was statistically significant at 7° and 14° of inversion. Inversion peak torque showed an opposite pattern: it was the smallest when measured at the 14° angle and increased at the other 2 angles; statistically significant changes were seen only between measures taken at 0° and 14°. Isometric eversion and inversion testing using the Biodex 4 Pro system is a reliable method. The authors suggest that the angle of 7° of inversion is the best for isometric eversion and inversion testing.
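ICC(2,1), the coefficient used in this and several neighbouring abstracts, can be computed from the mean squares of a two-way ANOVA (subjects by sessions). The sketch below follows the standard Shrout and Fleiss formula for a single measurement under a two-way random-effects model; the function name and the torque values are hypothetical and are not taken from the study.

```python
def icc_2_1(scores):
    """ICC(2,1) for a table with one row per subject and one column per session or rater."""
    n = len(scores)
    k = len(scores[0]) if n else 0
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need a complete table with at least 2 rows and 2 columns")
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]
    # Sums of squares for subjects, sessions, and the residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical peak torques (Nm) for five ankles measured in two sessions.
torques = [[30.1, 31.0], [25.4, 24.8], [35.2, 36.5], [28.0, 27.1], [32.3, 33.0]]
print(round(icc_2_1(torques), 3))
```

Because the session mean square enters the denominator, a systematic shift between sessions lowers ICC(2,1) even when subjects keep the same rank order.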
NASA Astrophysics Data System (ADS)
Nair, S. P.; Righetti, R.
2015-05-01
Recent elastography techniques focus on imaging information on properties of materials which can be modeled as viscoelastic or poroelastic. These techniques often require the fitting of temporal strain data, acquired from either a creep or stress-relaxation experiment, to a mathematical model using least square error (LSE) parameter estimation. It is known that the strain versus time relationships for tissues undergoing creep compression are non-linear. In non-linear cases, devising a measure of estimate reliability can be challenging. In this article, we have developed and tested a method that provides a measure of reliability for non-linear LSE parameter estimates, which we call Resimulation of Noise (RoN). RoN provides a measure of reliability by estimating the spread of parameter estimates from a single experiment realization. We have tested RoN specifically for the case of axial strain time constant parameter estimation in poroelastic media. Our tests show that the RoN estimated precision has a linear relationship to the actual precision of the LSE estimator. We have also compared results from the RoN derived measure of reliability against a commonly used reliability measure: the correlation coefficient (CorrCoeff). Our results show that CorrCoeff is a poor measure of estimate reliability for non-linear LSE parameter estimation. While the RoN is specifically tested only for axial strain time constant imaging, a general algorithm is provided for use in all LSE parameter estimation.
Swanenburg, Jaap; Nevzati, Arian; Mittaz Hager, Anne Gabrielle; de Bruin, Eling D; Klipstein, Andreas
2013-01-01
The aim of this study was to test the reliability and validity of a preferred-standing test for measuring the risk of falling. The preferred-standing position of elderly fallers and non-fallers and healthy young adults was measured. The maximal BSW was measured. The absolute and relative reliability and discriminant validity were assessed. The expanded timed get-up-and-go test (ETGUG), one-leg stance test (OS), tandem stance (TS), and falls efficacy scale international version (FES-I) were used to determine criterion validity. In total, 146 persons (102 females, 44 males; mean age 55±22 years, range 20-94) were recruited. Forty elderly community dwellers (8 fallers) and 26 young adults were tested twice to determine the test-retest reliability. The BSW showed acceptable test-retest reliability (Intraclass correlation coefficient, ICC2,1=0.77-0.83) and inter-rater reliability (ICC3,1=0.77-0.95) for all groups. The standard error of measurement (SEM) was between 0.77 and 1.87, and the smallest detectable change (SDC) was between 2.14 cm and 5.19 cm. The Bland-Altman plot revealed no systematic errors. There was a significant difference between elderly fallers and non-fallers (F(1/75)=11.951; p=0.001). Spearman's rho coefficient values showed no correlation between the BSW and the ETGUG (-0.17, p=0.47), OS (-0.04, p=0.65), TS (-0.11, p=0.21), and FES-I (-0.10; p=0.27). Only the BSW was a significant predictor for falling (odds ratio=0.736, p=0.007). The reliability and validity of the BSW protocol were acceptable overall. Prospective studies are warranted to evaluate the predictive value of the BSW for determining the risk of falling. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
de Witte, Annemarie M H; Hoozemans, Marco J M; Berger, Monique A M; van der Slikke, Rienk M A; van der Woude, Lucas H V; Veeger, Dirkjan H E J
2018-01-01
The aim of this study was to develop and describe a wheelchair mobility performance test in wheelchair basketball and to assess its construct validity and reliability. To mimic mobility performance of wheelchair basketball matches in a standardised manner, a test was designed based on observation of wheelchair basketball matches and expert judgement. Forty-six players performed the test to determine its validity and 23 players performed the test twice for reliability. Independent-samples t-tests were used to assess whether the times needed to complete the test were different for classifications, playing standards and sex. Intraclass correlation coefficients (ICC) were calculated to quantify reliability of performance times. Males performed better than females (P < 0.001, effect size [ES] = -1.26) and international men performed better than national men (P < 0.001, ES = -1.62). Performance time of low (≤2.5) and high (≥3.0) classification players was borderline not significant with a moderate ES (P = 0.06, ES = 0.58). The reliability was excellent for overall performance time (ICC = 0.95). These results show that the test can be used as a standardised mobility performance test to validly and reliably assess the capacity in mobility performance of elite wheelchair basketball athletes. Furthermore, the described methodology of development is recommended for use in other sports to develop sport-specific tests.
Sekir, U; Yildiz, Y; Hazneci, B; Ors, F; Saka, T; Aydin, T
2008-12-01
In contrast to the single evaluation methods used in the past, the combination of multiple tests allows one to obtain a global assessment of the ankle joint. The aim of this study was to determine the reliability of the different tests in a functional test battery. Twenty-four male recreational athletes with unilateral functional ankle instability (FAI) were recruited for this study. One component of the test battery included five different functional ability tests. These tests included a single limb hopping course, single-legged and triple-legged hop for distance, and six and cross six meter hop for time. The ankle joint position sense and one leg standing test were used for evaluation of proprioception and sensorimotor control. The isokinetic strengths of the ankle invertor and evertor muscles were evaluated at a velocity of 120 degrees/s. The reliability of the test battery was assessed by calculating the intraclass correlation coefficient (ICC). Each subject was tested two times, with an interval of 3-5 days between the test sessions. The ICCs for ankle functional and proprioceptive ability showed high reliability (ICCs ranging from 0.94 to 0.98). Additionally, isokinetic ankle joint inversion and eversion strength measurements represented good to high reliability (ICCs between 0.82 and 0.98). The functional test battery investigated in this study proved to be a reliable tool for the assessment of athletes with functional ankle instability. Therefore, clinicians may obtain reliable information from the functional test battery during the assessment of ankle joint performance in patients with functional ankle instability.
Comparing the Fit of Item Response Theory and Factor Analysis Models
ERIC Educational Resources Information Center
Maydeu-Olivares, Alberto; Cai, Li; Hernandez, Adolfo
2011-01-01
Linear factor analysis (FA) models can be reliably tested using test statistics based on residual covariances. We show that the same statistics can be used to reliably test the fit of item response theory (IRT) models for ordinal data (under some conditions). Hence, the fit of an FA model and of an IRT model to the same data set can now be…
Quantitative metal magnetic memory reliability modeling for welded joints
NASA Astrophysics Data System (ADS)
Xing, Haiyan; Dang, Yongbin; Wang, Ben; Leng, Jiancheng
2016-03-01
Metal magnetic memory (MMM) testing has been widely used to inspect welded joints. However, load levels, the environmental magnetic field, and measurement noise make the MMM data dispersive and make quantitative evaluation difficult. In order to promote the development of quantitative MMM reliability assessment, a new MMM model is presented for welded joints. Steel Q235 welded specimens are tested along the longitudinal and horizontal lines by a TSC-2M-8 instrument in tensile fatigue experiments. X-ray testing is carried out synchronously to verify the MMM results. It is found that MMM testing can detect a hidden crack earlier than X-ray testing. Moreover, the MMM gradient vector sum Kvs is sensitive to the degree of damage, especially at the early and hidden damage stages. Considering the dispersion of MMM data, the statistical law of Kvs is investigated, which shows that Kvs follows a Gaussian distribution. Kvs is therefore a suitable MMM parameter for establishing a reliability model of welded joints. Finally, an original quantitative MMM reliability model is presented based on improved stress-strength interference theory. It is shown that the reliability degree R gradually decreases as the residual life ratio T decreases, and the maximal error between the predicted reliability degree R1 and the verification reliability degree R2 is 9.15%. The presented method provides a novel tool for reliability testing and evaluation of welded joints in practical engineering.
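For orientation, the classical stress-strength interference model with independent Gaussian stress and strength gives the reliability as R = Φ((μ_strength − μ_stress) / sqrt(σ_strength² + σ_stress²)). The sketch below shows only this textbook form; it is not the authors' improved model, whose details the abstract does not give, and the function name and numbers are hypothetical.

```python
import math
from statistics import NormalDist


def interference_reliability(mean_strength, sd_strength, mean_stress, sd_stress):
    """P(strength > stress) for independent, normally distributed strength and stress."""
    combined_sd = math.sqrt(sd_strength ** 2 + sd_stress ** 2)
    if combined_sd == 0:
        raise ValueError("at least one standard deviation must be positive")
    z = (mean_strength - mean_stress) / combined_sd
    return NormalDist().cdf(z)


# Hypothetical values: the safety margin is three combined standard deviations.
print(round(interference_reliability(100.0, 3.0, 85.0, 4.0), 5))
```

As the mean margin shrinks toward zero, R falls toward 0.5, which matches the qualitative trend of reliability decreasing with remaining life.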
Yang, Nan; Waddington, Gordon; Adams, Roger; Han, Jia
2018-05-01
Quantitative assessments of handedness and footedness are often required in studies of human cognition and behaviour, yet no reliable Chinese versions of commonly used handedness and footedness questionnaires are available. Accordingly, the objective of the present study was to translate the Edinburgh Handedness Inventory (EHI) and the Waterloo Footedness Questionnaire-Revised (WFQ-R) into Mandarin Chinese and to evaluate the reliability and validity of these translated versions in healthy Chinese people. In the first stage of the study, Chinese versions of the EHI and WFQ-R were produced from a process of translation, back translation and examination, with necessary cultural adaptations. The second stage involved determining the reliability and validity of the translated EHI and WFQ-R for the Chinese population. One hundred and ten Chinese participants were tested online, and the results showed that the Cronbach's alpha coefficient of internal consistency was 0.877 for the translated EHI and 0.855 for the translated WFQ-R. Another 170 Chinese participants were tested and re-tested after a 30-day interval. The intra-class correlation coefficients showed high reliability, 0.898 for the translated EHI and 0.869 for the translated WFQ-R. This preliminary validation study found the translated versions to be reliable and valid tools for assessing handedness and footedness in this population.
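Cronbach's alpha, the internal-consistency coefficient reported above, is computed from the item variances and the variance of the total score: alpha = k/(k − 1) × (1 − Σ item variances / total-score variance). A minimal sketch, with a hypothetical function name and made-up responses:

```python
import statistics


def cronbach_alpha(responses):
    """Cronbach's alpha for a table with one row per respondent and one column per item."""
    k = len(responses[0]) if responses else 0
    if len(responses) < 2 or k < 2 or any(len(row) != k for row in responses):
        raise ValueError("need a complete table with at least 2 respondents and 2 items")
    # Sample variance of each item across respondents.
    item_variances = [
        statistics.variance([row[j] for row in responses]) for j in range(k)
    ]
    total_variance = statistics.variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores have zero variance; alpha is undefined")
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Made-up answers from three respondents on two items.
print(round(cronbach_alpha([[1, 2], [2, 1], [3, 3]]), 3))
```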
Sawle, Leanne; Freeman, Jennifer; Marsden, Jonathan
2017-04-01
Balance is a complex construct, affected by multiple components such as strength and co-ordination. However, whilst assessing an athlete's dynamic balance is an important part of clinical examination, there is no gold standard measure. The multiple single-leg hop-stabilization test is a functional test which may offer a method of evaluating the dynamic attributes of balance, but it needs to show adequate intra-tester reliability. The purpose of this study was to assess the intra-rater reliability of a dynamic balance test, the multiple single-leg hop-stabilization test, on the dominant and non-dominant legs. Intra-rater reliability study. Fifteen active participants were tested twice with a 10-minute break between tests. The outcome measure was the multiple single-leg hop-stabilization test score, based on a clinically assessed numerical scoring system. Results were analysed using an Intraclass Correlation Coefficient (ICC2,1) and Bland-Altman plots. Regression analyses explored relationships between test scores, leg dominance, age and training (an alpha level of p = 0.05 was selected). ICCs for intra-rater reliability were 0.85 for the dominant and non-dominant legs (confidence intervals = 0.62-0.95 and 0.61-0.95 respectively). Bland-Altman plots showed scores within two standard deviations. A significant correlation was observed between the dominant and non-dominant leg on balance scores (R² = 0.49, p<0.05), and better balance was associated with younger participants in their non-dominant leg (R² = 0.28, p<0.05) and their dominant leg (R² = 0.39, p<0.05), and with a higher number of hours spent training for the non-dominant leg (R² = 0.37, p<0.05). The multiple single-leg hop-stabilisation test demonstrated strong intra-tester reliability with active participants. Younger participants who trained more had better balance scores. This test may be a useful measure for evaluating the dynamic attributes of balance.
Huang, Wenhao; Chapman-Novakofski, Karen M
2017-01-01
Background The extensive availability and increasing use of mobile apps for nutrition-based health interventions makes evaluation of the quality of these apps crucial for integration of apps into nutritional counseling. Objective The goal of this research was the development, validation, and reliability testing of the app quality evaluation (AQEL) tool, an instrument for evaluating apps’ educational quality and technical functionality. Methods Items for evaluating app quality were adapted from website evaluations, with additional items added to evaluate the specific characteristics of apps, resulting in 79 initial items. Expert panels of nutrition and technology professionals and app users reviewed items for face and content validation. After recommended revisions, nutrition experts completed a second AQEL review to ensure clarity. On the basis of 150 sets of responses using the revised AQEL, principal component analysis was completed, reducing AQEL into 5 factors that underwent reliability testing, including internal consistency, split-half reliability, test-retest reliability, and interrater reliability (IRR). Two additional modifiable constructs for evaluating apps based on the age and needs of the target audience as selected by the evaluator were also tested for construct reliability. IRR testing using intraclass correlations (ICC) with all 7 constructs was conducted, with 15 dietitians evaluating one app. Results Development and validation resulted in the 51-item AQEL. These were reduced to 25 items in 5 factors after principal component analysis, plus 9 modifiable items in two constructs that were not included in principal component analysis. Internal consistency and split-half reliability of the following constructs derived from principal components analysis was good (Cronbach alpha >.80, Spearman-Brown coefficient >.80): behavior change potential, support of knowledge acquisition, app function, and skill development. 
App purpose split-half reliability was .65. Test-retest reliability showed no significant change over time (P>.05) for all but skill development (P=.001). Construct reliability was good for items assessing age appropriateness of apps for children, teens, and a general audience. In addition, construct reliability was acceptable for assessing app appropriateness for various target audiences (Cronbach alpha >.70). For the 5 main factors, ICC (1,k) was >.80, with a P value of <.05. When 15 nutrition professionals evaluated one app, ICC (2,15) was .98, with a P value of <.001 for all 7 constructs when the modifiable items were specified for adults seeking weight loss support. Conclusions Our preliminary effort shows that AQEL is a valid, reliable instrument for evaluating nutrition apps’ qualities for clinical interventions by nutrition clinicians, educators, and researchers. Further efforts in validating AQEL in various contexts are needed. PMID:29079554
Reliability Testing of NASA Piezocomposite Actuators
NASA Technical Reports Server (NTRS)
Wilkie, W.; High, J.; Bockman, J.
2002-01-01
NASA Langley Research Center has developed a low-cost piezocomposite actuator which has application for controlling vibrations in large inflatable smart space structures, space telescopes, and high performance aircraft. Tests show the NASA piezocomposite device is capable of producing large, directional, in-plane strains on the order of 2000 parts-per-million peak-to-peak, with no reduction in free-strain performance to 100 million electrical cycles. This paper describes methods, measurements, and preliminary results from our reliability evaluation of the device under externally applied mechanical loads and at various operational temperatures. Tests performed to date show no net reductions in actuation amplitude while the device was moderately loaded through 10 million electrical cycles. Tests were performed at both room temperature and at the maximum operational temperature of the epoxy resin system used in manufacture of the device. Initial indications are that actuator reliability is excellent, with no actuator failures or large net reduction in actuator performance.
Ahlström, Isabell; Hellström, Karin; Emtner, Margareta; Anens, Elisabeth
2015-03-01
To examine the test-retest reliability of the Swedish translated version of the Exercise Self-Efficacy Scale (S-ESES) in people with neurological disease and to examine internal consistency. Test-retest study. A total of 30 adults with neurological diseases including: Parkinson's disease; Multiple Sclerosis; Cervical Dystonia; and Charcot-Marie-Tooth disease. The S-ESES was sent twice by surface mail. The mean interval between completions was 16 days. Weighted kappa, intraclass correlation coefficient 2,1 [ICC (2,1)], standard error of measurement (SEM), also expressed as a percentage value (SEM%), and Cronbach's alpha were calculated. The relative reliability of the test-retest results showed substantial agreement measured using weighted kappa (MD = 0.62) and very high reliability measured using ICC (2,1) (0.92). Absolute reliability measured using SEM was 5.3 and SEM% was 20.7. Excellent internal consistency was shown, with an alpha coefficient of 0.91 (test 1) and 0.93 (test 2). The S-ESES is recommended for use in research and in clinical work for people with neurological diseases. The low absolute reliability, however, indicates a limited ability to measure changes on an individual level.
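The abstract reports SEM and SEM% without stating the formula used. A minimal sketch of one widely used definition follows, assuming SEM = SD × √(1 − ICC) and SEM% = 100 × SEM / mean. The function names and the input values are hypothetical and are not taken from the study.

```python
import math

def standard_error_of_measurement(sd, icc):
    """SEM under the common definition SD * sqrt(1 - ICC) (an assumption here)."""
    return sd * math.sqrt(1.0 - icc)

def sem_percent(sem, mean_score):
    """SEM expressed as a percentage of the mean score."""
    return 100.0 * sem / mean_score

# Hypothetical inputs, chosen only to illustrate the arithmetic
sem = standard_error_of_measurement(sd=10.0, icc=0.75)
print(sem, sem_percent(sem, mean_score=25.0))
```

Under this definition a scale can combine a high ICC with a large SEM% when between-person spread is wide relative to the mean, which is consistent with the abstract's contrast between relative and absolute reliability.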
Bosakova, Lucia; Kolarcik, Peter; Bobakova, Daniela; Sulcova, Martina; Van Dijk, Jitse P; Reijneveld, Sijmen A; Geckova, Andrea Madarasova
2016-04-01
Participation in organized activities is associated with a range of positive outcomes, but the way such participation is measured has not been scrutinized. Test-retest reliability, an important indicator of a scale's reliability, has rarely been assessed, and for "The scale of participation in organized activities" it is lacking completely. This test-retest study is based on the Health Behaviour in School-aged Children study and is consistent with its methodology. We obtained data from 353 Czech (51.9 % boys) and 227 Slovak (52.9 % boys) primary school pupils, grades five and nine, who participated in this study in 2013. We used Cohen's kappa statistic and single measures of the intraclass correlation coefficient to estimate the test-retest reliability of all selected items in the sample, stratified by gender, age and country. We mostly observed a large correlation between the test and retest in all of the examined variables (κ ranged from 0.46 to 0.68). Test-retest reliability of the sum score of individual items showed substantial agreement (ICC = 0.64). The scale of participation in organized activities has an acceptable level of agreement, indicating good reliability.
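Cohen's kappa, used here for item-level test-retest agreement, corrects the observed proportion of agreement for the agreement expected by chance. The sketch below shows the unweighted form; the function name and the example answers are hypothetical, and the abstract does not say whether a weighted variant was applied.

```python
from collections import Counter

def cohen_kappa(first, second):
    """Unweighted Cohen's kappa for two equal-length lists of categorical answers."""
    n = len(first)
    observed = sum(a == b for a, b in zip(first, second)) / n
    counts_first, counts_second = Counter(first), Counter(second)
    categories = set(first) | set(second)
    # Chance agreement: product of the marginal proportions, summed over categories
    expected = sum(counts_first[c] * counts_second[c] for c in categories) / n ** 2
    # Undefined (division by zero) if both occasions use a single identical category
    return (observed - expected) / (1.0 - expected)

# Hypothetical yes/no answers from four pupils at test and retest
test_answers = ["yes", "yes", "no", "no"]
retest_answers = ["yes", "no", "no", "no"]
print(cohen_kappa(test_answers, retest_answers))
```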
Bergamin, Marco; Gobbo, Stefano; Bullo, Valentina; Vendramin, Barbara; Duregon, Federica; Frizziero, Antonio; Di Blasio, Andrea; Cugusi, Lucia; Zaccaria, Marco; Ermolao, Andrea
2017-01-01
Summary Background Lower extremity muscle mass, strength, power, and physical performance are critical determinants of independent functioning in later life. Isokinetic dynamometers are becoming very common in assessing different features of muscle strength, in both research and clinical practice; however, reliability studies are still needed to support the extended use of those devices. Objective The purpose of this study is to assess the test-retest reliability of knee and ankle isokinetic and isometric strength testing protocols in a sample of older healthy subjects, using a new and untested isokinetic multi-joint evaluation system. Methods Sixteen male and fourteen female older adults (mean age 65.2 ± 4.6 years) were assessed in two testing sessions. Each participant performed a randomized testing procedure that included different isometric and isokinetic tests for knee and ankle joints. Results All participants completed the trial safely, and no subject reported any discomfort throughout the overall assessment. Coefficients of correlation between measures were calculated, showing moderate to strong effects among all test-retest assessments, and a paired-sample t test showed only one significant difference (p<0.05), in the maximal isokinetic bilateral knee flexion torque. Conclusions The multi-joint evaluation system for the assessment of knee and ankle isokinetic and isometric strength provided reliable test-retest measures in healthy older adults. Level of evidence Ib. PMID:29264344
Test-retest reliability of sensor-based sit-to-stand measures in young and older adults.
Regterschot, G Ruben H; Zhang, Wei; Baldus, Heribert; Stevens, Martin; Zijlstra, Wiebren
2014-01-01
This study investigated test-retest reliability of sensor-based sit-to-stand (STS) peak power and other STS measures in young and older adults. In addition, test-retest reliability of the sensor method was compared to test-retest reliability of the Timed Up and Go Test (TUGT) and Five-Times-Sit-to-Stand Test (FTSST) in older adults. Ten healthy young female adults (20-23 years) and 31 older adults (21 females; 73-94 years) participated in two assessment sessions separated by 3-8 days. Vertical peak power was assessed during three (young adults) and five (older adults) normal and fast STS trials with a hybrid motion sensor worn on the hip. Older adults also performed the FTSST and TUGT. The average sensor-based STS peak power of the normal STS trials and the average sensor-based STS peak power of the fast STS trials showed excellent test-retest reliability in young adults (intra-class correlation (ICC)≥0.90; zero in 95% confidence interval of mean difference between test and retest (95%CI of D); standard error of measurement (SEM)≤6.7% of mean peak power) and older adults (ICC≥0.91; zero in 95%CI of D; SEM≤9.9%). Test-retest reliability of sensor-based STS peak power and TUGT (ICC=0.98; zero in 95%CI of D; SEM=8.5%) was comparable in older adults, whereas test-retest reliability of the FTSST was lower (ICC=0.73; zero outside 95%CI of D; SEM=14.4%). Sensor-based STS peak power demonstrated excellent test-retest reliability and may therefore be useful for clinical assessment of functional status and fall risk. Copyright © 2014 Elsevier B.V. All rights reserved.
Sauer, Juergen; Chavaillaz, Alain
2017-01-01
This experiment aimed to examine how skill lay-off and system reliability would affect operator behaviour in a simulated work environment under wide-range and large-choice adaptable automation comprising six different levels. Twenty-four participants were tested twice during a 2-hr testing session, with the second session taking place 8 months after the first. In the middle of the second testing session, system reliability changed. The results showed that after the retention interval trust increased and self-confidence decreased. Complacency was unaffected by the lay-off period. Diagnostic speed slowed down after the retention interval but diagnostic accuracy was maintained. No difference between experimental conditions was found for automation management behaviour (i.e. level of automation chosen and frequency of switching between levels). There were few effects of system reliability. Overall, the findings showed that subjective measures were more sensitive to the impact of skill lay-off than objective behavioural measures. Copyright © 2016 Elsevier Ltd. All rights reserved.
The reliability and validity of the SF-8 with a conflict-affected population in northern Uganda.
Roberts, Bayard; Browne, John; Ocaka, Kaducu Felix; Oyok, Thomas; Sondorp, Egbert
2008-12-02
The SF-8 is a health-related quality of life instrument that could provide a useful means of assessing general physical and mental health amongst populations affected by conflict. The purpose of this study was to test the validity and reliability of the SF-8 with a conflict-affected population in northern Uganda. A cross-sectional multi-staged, random cluster survey was conducted with 1206 adults in camps for internally displaced persons in Gulu and Amuru districts of northern Uganda. Data quality was assessed by analysing the number of incomplete responses to SF-8 items. Response distribution was analysed using aggregate endorsement frequency. Test-retest reliability was assessed in a separate smaller survey using the intraclass correlation test. Construct validity was measured using principal component analysis, and the Pearson Correlation test for item-summary score correlation and inter-instrument correlations. Known groups validity was assessed using a two-sample t-test to evaluate the ability of the SF-8 to discriminate between groups known to have, and not have, physical and mental health problems. The SF-8 showed excellent data quality. It showed acceptable item response distribution based upon analysis of aggregate endorsement frequencies. Test-retest showed a good intraclass correlation of 0.61 for PCS and 0.68 for MCS. The principal component analysis indicated strong construct validity and concurred with the results of the validity tests by the SF-8 developers. The SF-8 also showed strong construct validity between the 8 items and PCS and MCS summary score, moderate inter-instrument validity, and strong known groups validity. This study provides evidence on the reliability and validity of the SF-8 amongst IDPs in northern Uganda.
The reliability and validity of the SF-8 with a conflict-affected population in northern Uganda
Roberts, Bayard; Browne, John; Ocaka, Kaducu Felix; Oyok, Thomas; Sondorp, Egbert
2008-01-01
Background The SF-8 is a health-related quality of life instrument that could provide a useful means of assessing general physical and mental health amongst populations affected by conflict. The purpose of this study was to test the validity and reliability of the SF-8 with a conflict-affected population in northern Uganda. Methods A cross-sectional multi-staged, random cluster survey was conducted with 1206 adults in camps for internally displaced persons in Gulu and Amuru districts of northern Uganda. Data quality was assessed by analysing the number of incomplete responses to SF-8 items. Response distribution was analysed using aggregate endorsement frequency. Test-retest reliability was assessed in a separate smaller survey using the intraclass correlation test. Construct validity was measured using principal component analysis, and the Pearson Correlation test for item-summary score correlation and inter-instrument correlations. Known groups validity was assessed using a two-sample t-test to evaluate the ability of the SF-8 to discriminate between groups known to have, and not have, physical and mental health problems. Results The SF-8 showed excellent data quality. It showed acceptable item response distribution based upon analysis of aggregate endorsement frequencies. Test-retest showed a good intraclass correlation of 0.61 for PCS and 0.68 for MCS. The principal component analysis indicated strong construct validity and concurred with the results of the validity tests by the SF-8 developers. The SF-8 also showed strong construct validity between the 8 items and PCS and MCS summary score, moderate inter-instrument validity, and strong known groups validity. Conclusion This study provides evidence on the reliability and validity of the SF-8 amongst IDPs in northern Uganda. PMID:19055716
Burnstein, Bryan D; Steele, Russell J; Shrier, Ian
2011-01-01
Fitness testing is used frequently in many areas of physical activity, but the reliability of these measurements under real-world, practical conditions is unknown. To evaluate the reliability of specific fitness tests using the methods and time periods used in the context of real-world sport and occupational management. Cohort study. Eighteen different Cirque du Soleil shows. Cirque du Soleil physical performers who completed 4 consecutive tests (6-month intervals) and were free of injury or illness at each session (n = 238 of 701 physical performers). Performers completed 6 fitness tests on each assessment date: dynamic balance, Harvard step test, handgrip, vertical jump, pull-ups, and 60-second jump test. We calculated the intraclass coefficient (ICC) and limits of agreement between baseline and each time point and the ICC over all 4 time points combined. Reliability was acceptable (ICC > 0.6) over an 18-month time period for all pairwise comparisons and all time points together for the handgrip, vertical jump, and pull-up assessments. The Harvard step test and 60-second jump test had poor reliability (ICC < 0.6) between baseline and other time points. When we excluded the baseline data and calculated the ICC for 6-month, 12-month, and 18-month time points, both the Harvard step test and 60-second jump test demonstrated acceptable reliability. Dynamic balance was unreliable in all contexts. Limit-of-agreement analysis demonstrated considerable intraindividual variability for some tests and a learning effect by administrators on others. Five of the 6 tests in this battery had acceptable reliability over an 18-month time frame, but the values for certain individuals may vary considerably from time to time for some tests. Specific tests may require a learning period for administrators.
Brownson, Ross C.; Chang, Jen Jen; Eyler, Amy A.; Ainsworth, Barbara E.; Kirtland, Karen A.; Saelens, Brian E.; Sallis, James F.
2004-01-01
Objectives. We tested the reliability of 3 instruments that assessed social and physical environments. Methods. We conducted a test–retest study among US adults (n = 289). We used telephone survey methods to measure suitableness of the perceived (vs objective) environment for recreational physical activity and nonmotorized transportation. Results. Most questions in our surveys that attempted to measure specific characteristics of the built environment showed moderate to high reliability. Questions about the social environment showed lower reliability than those that assessed the physical environment. Certain blocks of questions appeared to be selectively more reliable for urban or rural respondents. Conclusions. Despite differences in content and in response formats, all 3 surveys showed evidence of reliability, and most items are now ready for use in research and in public health surveillance. PMID:14998817
Medina-Mirapeix, Francesc; Vivo-Fernández, Iván; López-Cañizares, Juan; García-Vidal, José A; Benítez-Martínez, Josep Carles; Del Baño-Aledo, María Elena
2018-01-01
The objective was to determine the inter-observer and test-retest reliability of the "Five-repetition sit-to-stand" (5STS) test in patients with total knee replacement (TKR), and to explore the correlation between the 5STS and two mobility tests. A reliability study was conducted among 24 outpatients with TKR (mean age 72.13, S.D. 10.67; 50% were women). They were recruited from a traumatology unit of a public hospital via convenience sampling. A physiotherapist and a trauma physician assessed each patient at the same time. The same physiotherapist performed a second 5STS measurement 45-60 min after the first one. Reliability was assessed with intraclass correlation coefficients (ICCs) and Bland-Altman plots. The Pearson coefficient was calculated to assess the correlation between the 5STS, the Timed Up and Go test (TUG) and four-meter gait speed (4MGS). ICCs for inter-observer and test-retest reliability of the 5STS were 0.998 (95% confidence interval [CI], 0.995-0.999) and 0.982 (95% CI, 0.959-0.992). The Bland-Altman plot for inter-observer reliability showed limits between -0.82 and 1.06 with a mean of 0.11 and no heteroscedasticity within the data. The Bland-Altman plot for test-retest reliability showed limits between 1.76 and 4.16, a mean of 1.20 and heteroscedasticity within the data. The Pearson correlation coefficient revealed significant correlation between 5STS and TUG (r=0.7, p<0.001) and 4MGS (r=-0.583, p=0.003). This study demonstrates excellent inter-observer and test-retest reliability of the 5STS when it is used in people with TKR, and also significant correlation with other functional mobility tests. These findings support the use of 5STS as an outcome measure in the TKR population. Copyright © 2017 Elsevier B.V. All rights reserved.
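The Bland-Altman limits quoted in this abstract summarise paired differences between two measurements. A minimal sketch of the usual calculation follows, assuming the conventional 95% limits of agreement (mean difference ± 1.96 × SD of the differences); the function name and the timing values are hypothetical and the abstract does not confirm which multiplier the authors used.

```python
import statistics

def bland_altman_limits(first, second):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    diffs = [a - b for a, b in zip(first, second)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical 5STS times in seconds from two observers
observer_a = [12.1, 15.4, 10.8, 18.2, 14.0]
observer_b = [12.0, 15.1, 11.0, 18.0, 13.7]
print(bland_altman_limits(observer_a, observer_b))
```

The bias should lie midway between the two limits, so a reported mean that falls outside its own limits is a cue to check the source figures.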
Reliability and validity of television food advertising questionnaire in Malaysia.
Zalma, Abdul Razak; Safiah, Md Yusof; Ajau, Danis; Khairil Anuar, Md Isa
2015-09-01
Interventions to counter the influence of television food advertising amongst children are important. Thus, a reliable and valid instrument to assess its effect is needed. The objective of this study was to determine the reliability and validity of such a questionnaire. The questionnaire was administered twice to 32 primary schoolchildren aged 10-11 years in Selangor, Malaysia. The interval between the first and second administration was 2 weeks. The test-retest method was used to examine the reliability of the questionnaire. Intra-rater reliability was determined by kappa coefficient and internal consistency by Cronbach's alpha coefficient. Construct validity was evaluated using factor analysis. The test-retest correlation showed moderate-to-high reliability for all scores (r = 0.40*, p = 0.02 to r = 0.95**, p = 0.00), with one exception, consumption of fast foods (r = 0.24, p = 0.20). Kappa coefficient showed acceptable-to-strong intra-rater reliability (K = 0.40-0.92), except for two items under knowledge on television food advertising (K = 0.26 and K = 0.21) and one item under preference for healthier foods (K = 0.33). Cronbach's alpha coefficient indicated acceptable internal consistency for all scores (0.45-0.60). After deleting two items under Consumption of Commonly Advertised Food, the items showed moderate-to-high loading (0.52, 0.84, 0.42 and 0.42) with the Scree plot showing that there was only one factor. The Kaiser-Meyer-Olkin was 0.60, showing that the sample was adequate for factor analysis. The questionnaire on television food advertising is reliable and valid to assess the effect of media literacy education on television food advertising on schoolchildren. © The Author (2013). Published by Oxford University Press. All rights reserved.
Reliability and Validity of Korean Version of Apraxia Screen of TULIA (K-AST).
Kim, Soo Jin; Yang, You-Na; Lee, Jong Won; Lee, Jin-Youn; Jeong, Eunhwa; Kim, Bo-Ram; Lee, Jongmin
2016-10-01
To evaluate the reliability and validity of the Korean version of AST (K-AST) as a bedside screening test of apraxia in patients with stroke for early and reliable detection. AST was translated into Korean, and the translated version received authorization from the author of AST. The performances of K-AST in 26 patients (21 males, 5 females; mean age 65.42±17.31 years) with stroke (23 ischemic, 3 hemorrhagic) were videotaped. To test the reliability and validity of K-AST, the recorded performances were assessed by two physiatrists and two occupational therapists twice at a 1-week interval. The patient performances at admission in the Korean version of Mini-Mental State Examination (K-MMSE), self-care and transfer categories of Functional Independence Measure (FIM), and motor praxis area of Loewenstein Occupational Therapy Cognitive Assessment, the second edition (LOTCA-II) were also evaluated. Scores of the motor praxis area of LOTCA-II were used to assess the validity of K-AST. Inter-rater reliabilities were 0.983 (p<0.001) at the first assessment and 0.982 (p<0.001) at the second assessment. For intra-rater (test-retest) reliabilities, the values of four raters were 0.978 (p<0.001), 0.957 (p<0.001), 0.987 (p<0.001), and 0.977 (p<0.001). K-AST showed significant correlation (r=0.758, p<0.001) with the motor praxis area of the LOTCA-II test. K-AST also showed positive correlations with the total FIM score (r=0.694, p<0.001), the self-care category of FIM (r=0.705, p<0.001) and the transfer category of FIM (r=0.653, p<0.001). K-AST is a reliable and valid test for bedside screening of apraxia.
[Design and validation of a questionnaire for psychosocial nursing diagnosis in Primary Care].
Brito-Brito, Pedro Ruymán; Rodríguez-Álvarez, Cristobalina; Sierra-López, Antonio; Rodríguez-Gómez, José Ángel; Aguirre-Jaime, Armando
2012-01-01
To develop a valid, reliable and easy-to-use questionnaire for a psychosocial nursing diagnosis. The study was performed in two phases: first phase, questionnaire design and construction; second phase, validity and reliability tests. A bank of items was constructed using the NANDA classification as a theoretical framework. Each item was assigned a Likert scale or dichotomous response. The combination of responses to the items constituted the diagnostic rules to assign up to 28 labels. A group of experts carried out the validity test for content. Other validated scales were used as reference standards for the criterion validity tests. Forty-five nurses provided the questionnaire to the patients on three separate occasions over a period of three weeks, and the other validated scales only once to 188 randomly selected patients in Primary Care centres in Tenerife (Spain). Validity tests for construct confirmed the six dimensions of the questionnaire with 91% of total variance explained. Validity tests for criterion showed a specificity of 66%-100%, and showed high correlations with the reference scales when the questionnaire was assigning nursing diagnoses. Reliability tests showed agreement of 56%-91% (P<.001), and a 93% internal consistency. The Questionnaire for Psychosocial Nursing Diagnosis was called CdePS, and included 61 items. The CdePS is a valid, reliable and easy-to-use tool in Primary Care centres to improve the assigning of a psychosocial nursing diagnosis. Copyright © 2011 Elsevier España, S.L. All rights reserved.
De Kegel, A; Dhooge, I; Cambier, D; Baetens, T; Palmans, T; Van Waelvelde, H
2011-04-01
The purpose of this study was to establish test-retest reliability of centre of pressure (COP) measurements obtained by an AccuGait portable forceplate (ACG), mean COG sway velocity measured by a Basic Balance Master (BBM) and clinical balance tests in children with and without balance difficulties. 49 typically developing children and 23 hearing impaired children, with a higher risk for stability problems, between 6 and 12 years of age participated. Each child performed the modified Clinical Test of Sensory Interaction on Balance (mCTSIB), Unilateral Stance (US) and Tandem Stance on ACG, mCTSIB and US on BBM and clinical balance tests: one-leg standing, balance beam walking and one-leg hopping. All subjects completed 2 test sessions on 2 different days in the same week assessed by the same examiner. Among COP measurements obtained by the ACG, mean sway velocity was the most reliable parameter with all ICCs higher than 0.72. The standard deviation (SD) of sway velocity, sway area, SD of anterior-posterior and SD of medio-lateral COP data showed moderate to excellent reliability with ICCs between 0.55 and 0.96 but some caution must be taken into account in some conditions. BBM is less reliable but clinical balance tests are as reliable as ACG. Hearing impaired children exhibited better relative reliability (ICC) and comparable absolute reliability (SEM) for most balance parameters compared to typically developing children. Reliable information regarding postural stability of typically developing children and hearing impaired children may be obtained utilizing COP measurements generated by an AccuGait system and clinical balance tests. Copyright © 2011 Elsevier B.V. All rights reserved.
Duanngai, Krit; Sirasaporn, Patpiya; Ngaosinchai, Siriwan Surapaitoon
2017-01-01
The aim of this study was to evaluate the reliability of the urine dipstick test by patients' self-assessment for urinary tract infection (UTI) screening and to determine the validity of the urine dipstick test. Rehabilitation Department, Srinagarind Hospital, Thailand. A diagnostic study. This study compared the urine dipstick test (index test) with the National Institute on Disability and Rehabilitation Research (NIDRR) criteria (gold standard test) in spinal cord injury (SCI) patients. The urine dipstick test gave positive or negative results, and the NIDRR criteria classified patients as UTI or no UTI. The interrater reliability was measured using Kappa, whereas the validity of the urine dipstick test was reported in terms of sensitivity, specificity, positive likelihood ratio (LR) (+LR), negative LR (-LR), positive predictive value (PPV), and negative predictive value (NPV). Among the 56 participants, the kappa values of the urine dipstick test for leukocyte esterase, nitrite, and combined leukocyte esterase and nitrite were 0.09, 0.21, and 0.52, respectively. The nitrite urine dipstick test showed the highest sensitivity (90%). The combined leukocyte esterase and nitrite urine dipstick test gave the highest specificity (87%), PPV (60%), NPV (93%), and +LR (5.63). The interrater reliability of the combined leukocyte esterase and nitrite urine dipstick test showed moderate agreement. The combined leukocyte esterase and nitrite urine dipstick test showed a high level of both sensitivity and specificity. The combined leukocyte esterase and nitrite urine dipstick test should be promoted for patients' self-assessment for UTI screening in SCI patients.
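The validity measures listed in this abstract all derive from a 2×2 table of index-test results against the reference standard. The sketch below shows the standard definitions; the function name and the cell counts are hypothetical, since the abstract does not report the underlying table.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard accuracy measures from a 2x2 table (index test vs. reference standard)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),                      # positive predictive value
        "npv": tn / (tn + fn),                      # negative predictive value
        "lr_pos": sensitivity / (1 - specificity),  # positive likelihood ratio
        "lr_neg": (1 - sensitivity) / specificity,  # negative likelihood ratio
    }

# Hypothetical counts, not the study's data
for name, value in diagnostic_metrics(tp=8, fp=2, fn=2, tn=8).items():
    print(f"{name}: {value:.2f}")
```

Sensitivity and specificity are properties of the test, whereas PPV and NPV also depend on how common UTI is in the sample, so predictive values from one population do not transfer directly to another.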
Test-retest reliability of cognitive EEG
NASA Technical Reports Server (NTRS)
McEvoy, L. K.; Smith, M. E.; Gevins, A.
2000-01-01
OBJECTIVE: Task-related EEG is sensitive to changes in cognitive state produced by increased task difficulty and by transient impairment. If task-related EEG has high test-retest reliability, it could be used as part of a clinical test to assess changes in cognitive function. The aim of this study was to determine the reliability of the EEG recorded during the performance of a working memory (WM) task and a psychomotor vigilance task (PVT). METHODS: EEG was recorded while subjects rested quietly and while they performed the tasks. Within session (test-retest interval of approximately 1 h) and between session (test-retest interval of approximately 7 days) reliability was calculated for four EEG components: frontal midline theta at Fz, posterior theta at Pz, and slow and fast alpha at Pz. RESULTS: Task-related EEG was highly reliable within and between sessions (r > 0.9 for all components in WM task, and r > 0.8 for all components in the PVT). Resting EEG also showed high reliability, although the magnitude of the correlation was somewhat smaller than that of the task-related EEG (r > 0.7 for all 4 components). CONCLUSIONS: These results suggest that under appropriate conditions, task-related EEG has sufficient retest reliability for use in assessing clinical changes in cognitive status.
Boer, Annemarie; Dutmer, Alisa L; Schiphorst Preuper, Henrica R; van der Woude, Lucas H V; Stewart, Roy E; Deyo, Richard A; Reneman, Michiel F; Soer, Remko
2017-10-01
Validation study with cross-sectional and longitudinal measurements. To translate the US National Institutes of Health (NIH)-minimal dataset for clinical research on chronic low back pain into the Dutch language and to test its validity and reliability among people with chronic low back pain. The NIH developed a minimal dataset to encourage more complete and consistent reporting of clinical research and to be able to compare studies across countries in patients with low back pain. In the Netherlands, the NIH-minimal dataset has not been translated before and measurement properties are unknown. Cross-cultural validity was tested by a formal forward-backward translation. Structural validity was tested with exploratory factor analyses (comparative fit index, Tucker-Lewis index, and root mean square error of approximation). Hypothesis testing was performed to compare subscales of the NIH dataset with the Pain Disability Index and the EuroQol-5D (Pearson correlation coefficients). Internal consistency was tested with Cronbach α, and test-retest reliability at 2 weeks was calculated in a subsample of patients with Intraclass Correlation Coefficients and weighted Kappa (κω). In total, 452 patients were included, of which 52 were included for the test-retest study. Factor analysis for structural validity pointed in the direction of a seven-factor model (Cronbach α = 0.78). Factors and total score of the NIH-minimal dataset showed fair to good correlations with Pain Disability Index (r = 0.43-0.70) and EuroQol-5D (r = -0.41 to -0.64). Reliability: test-retest reliability per item showed substantial agreement (κω=0.65). Test-retest reliability per factor was moderate to good (Intraclass Correlation Coefficient = 0.71). The measurement properties of the Dutch language version of the NIH-minimal dataset were satisfactory.
One-year test-retest reliability of intrinsic connectivity network fMRI in older adults
Guo, Cong C.; Kurth, Florian; Zhou, Juan; Mayer, Emeran A.; Eickhoff, Simon B; Kramer, Joel H.; Seeley, William W.
2014-01-01
“Resting-state” or task-free fMRI can assess intrinsic connectivity network (ICN) integrity in health and disease, suggesting a potential for use of these methods as disease-monitoring biomarkers. Numerous analytical options are available, including model-driven ROI-based correlation analysis and model-free, independent component analysis (ICA). High test-retest reliability will be a necessary feature of a successful ICN biomarker, yet available reliability data remains limited. Here, we examined ICN fMRI test-retest reliability in 24 healthy older subjects scanned roughly one year apart. We focused on the salience network, a disease-relevant ICN not previously subjected to reliability analysis. Most ICN analytical methods proved reliable (intraclass coefficients > 0.4) and could be further improved by wavelet analysis. Seed-based ROI correlation analysis showed high map-wise reliability, whereas graph theoretical measures and temporal concatenation group ICA produced the most reliable individual unit-wise outcomes. Including global signal regression in ROI-based correlation analyses reduced reliability. Our study provides a direct comparison between the most commonly used ICN fMRI methods and potential guidelines for measuring intrinsic connectivity in aging control and patient populations over time. PMID:22446491
Test-retest reliability and practice effects of the Wechsler Memory Scale-III.
Lo, Ada H Y; Humphreys, Michael; Byrne, Gerard J; Pachana, Nancy A
2012-09-01
Although serial administration of cognitive tests is increasingly common, there is a paucity of research on test-retest reliabilities and practice effects, both of which are important for evaluating changes in functioning. Reliability is generally conceptualized as involving short-lasting changes in performance. However, when repeated testing occurs over a period of years, there will be some longer lasting effects. The implications of these longer lasting effects and practice effects on reliability were examined in the context of repeated administrations of the Wechsler Memory Scale-III in 339 community-dwelling women aged 40-79 years over 2 to 7 years. The results showed that Logical Memory and Verbal Paired Associates subtests were consistently the most reliable subtests across the age cohorts. The magnitude of practice effects varied as a function of subtests and age. The largest practice effects were found in the youngest age cohort, especially on the Faces, Logical Memory, and Verbal Paired Associates subtests. ©2012 The British Psychological Society.
Munguía-Izquierdo, Diego; Legaz-Arrese, Alejandro
2012-11-01
To evaluate the reliability, standard error of the mean (SEM), clinically significant change, and known group validity of 2 assessments of endurance strength to low loads in patients with fibromyalgia syndrome (FS). Cross-sectional reliability and comparative study. University Pablo de Olavide, Seville, Spain. Middle-aged women with FS (n=95) and healthy women (n=64) matched for age, weight, and body mass index (BMI) were recruited for the study. Not applicable. The endurance strength to low loads tests of the upper and lower extremities and anthropometric measures (BMI) were used for the evaluations. The differences between the readings (tests 1 and 2) and the SDs of the differences, intraclass correlation coefficient (ICC) model (2,1), 95% confidence interval for the ICC, coefficient of repeatability, intrapatient SD, SEM, Wilcoxon signed-rank test, and Bland-Altman plots were used to examine reliability. A Mann-Whitney U test was used to analyze the differences in test values between the patient group and the control group. We hypothesized that patients with FS would have an endurance strength to low loads performance in lower and upper extremities at least twice as low as that of the healthy controls. Satisfactory test-retest reliability and SEMs were found for the lower extremity, dominant arm, and nondominant arm tests (ICC=.973-.979; P<.001; SEMs=1.44-1.66 repetitions). The differences in the mean between the test and retest were lower than the SEM for all performed tests, varying from -.10 to .29 repetitions. No significant differences were found between the test and retest (P>.05 for all). The Bland-Altman plots showed 95% limits of agreement for the lower extremity (4.7 to -4.5), dominant arm (3.8 to -4.4), and nondominant arm (3.9 to -4.1) tests. The endurance strength to low loads test scores for the patients with FS were 4-fold lower than for the controls in all performed tests (P<.001 for all).
The endurance strength to low loads tests showed good reliability and known group validity and can be recommended for evaluating endurance strength to low loads in patients with FS. For individual evaluation, however, an improved score of at least 4 and 5 repetitions for the upper and lower extremities, respectively, was required for the differences to be considered as substantial clinical change. Patients with FS showed impaired endurance strength to low loads performance when compared with the general population. Copyright © 2012 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.
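As a rough illustration of two of the reliability statistics named in the abstract above, the following Python sketch computes a two-way random-effects, single-measure ICC (the Shrout and Fleiss ICC(2,1) form) and Bland-Altman 95% limits of agreement. The function names and the repetition counts are hypothetical and are not taken from the study, whose software and exact computations are not described here.

```python
from statistics import mean, stdev


def icc_2_1(scores):
    """ICC(2,1) for a table with one row per subject and one column per session/rater."""
    n = len(scores)        # number of subjects
    k = len(scores[0])     # number of measurements per subject
    grand = mean(x for row in scores for x in row)
    row_means = [mean(row) for row in scores]
    col_means = [mean(row[j] for row in scores) for j in range(k)]

    # Two-way ANOVA sums of squares (subjects, sessions, residual)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)               # between-subjects mean square
    msc = ss_cols / (k - 1)               # between-sessions mean square
    mse = ss_err / ((n - 1) * (k - 1))    # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def bland_altman_limits(test, retest):
    """Return (bias, lower limit, upper limit) of the 95% limits of agreement."""
    diffs = [a - b for a, b in zip(test, retest)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample SD of the paired differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


if __name__ == "__main__":
    # Hypothetical repetition counts for five participants (test, retest)
    test = [20, 25, 31, 18, 27]
    retest = [21, 24, 30, 19, 29]
    print(icc_2_1([list(pair) for pair in zip(test, retest)]))
    print(bland_altman_limits(test, retest))
```

The ICC form is chosen here only because the abstract names model (2,1); other ICC models use different denominators and would give different values on the same data.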
Validity and Reliability Study of the Korean Tinetti Mobility Test for Parkinson's Disease.
Park, Jinse; Koh, Seong-Beom; Kim, Hee Jin; Oh, Eungseok; Kim, Joong-Seok; Yun, Ji Young; Kwon, Do-Young; Kim, Younsoo; Kim, Ji Seon; Kwon, Kyum-Yil; Park, Jeong-Ho; Youn, Jinyoung; Jang, Wooyoung
2018-01-01
Postural instability and gait disturbance are the cardinal symptoms associated with falling among patients with Parkinson's disease (PD). The Tinetti mobility test (TMT) is a well-established measurement tool used to predict falls among elderly people. However, the TMT has not been established or widely used among PD patients in Korea. The purpose of this study was to evaluate the reliability and validity of the Korean version of the TMT for PD patients. Twenty-four patients diagnosed with PD were enrolled in this study. For the interrater reliability test, thirteen clinicians scored the TMT after watching a video clip. We also used the test-retest method to determine intrarater reliability. For concurrent validation, the unified Parkinson's disease rating scale, Hoehn and Yahr staging, Berg Balance Scale, Timed-Up and Go test, 10-m walk test, and gait analysis by three-dimensional motion capture were also used. We analyzed the receiver operating characteristic curve to predict falling. The interrater reliability and intrarater reliability of the Korean Tinetti balance scale were 0.97 and 0.98, respectively. The interrater reliability and intrarater reliability of the Korean Tinetti gait scale were 0.94 and 0.96, respectively. The Korean TMT scores were significantly correlated with the other clinical scales and three-dimensional motion capture. The cutoff values for predicting falling were 14 points (balance subscale) and 10 points (gait subscale). We found that the Korean version of the TMT showed excellent validity and reliability for gait and balance and had high sensitivity and specificity for predicting falls among patients with PD.
Lie, Marie Udnesseter; Matre, Dagfinn; Hansson, Per; Stubhaug, Audun; Zwart, John-Anker; Nilsen, Kristian Bernhard
2017-01-01
Introduction: The interest in conditioned pain modulation (CPM) as a clinical tool for measuring endogenously induced analgesia is increasing. There is, however, large variation in the CPM methodology, hindering comparison of results across studies. Research comparing different CPM protocols is needed in order to obtain a standardized test paradigm. Objectives: The aim of the study was to assess whether a protocol with phasic heat stimuli as test-stimulus is preferable to a protocol with tonic heat stimulus as test-stimulus. Methods: In this experimental crossover study, we compared 2 CPM protocols with different test-stimuli: one with a tonic test-stimulus (constant heat stimulus of 120-second duration) and one with phasic test-stimuli (3 heat stimulations of 5 seconds duration separated by 10 seconds). The conditioning stimulus was a 7°C water bath in parallel with the test-stimulus. Twenty-four healthy volunteers were assessed on 2 occasions at least 1 week apart. Differences in the magnitude and test–retest reliability of the CPM effect in the 2 protocols were investigated with repeated-measures analysis of variance and by relative and absolute reliability indices. Results: The protocol with tonic test-stimulus induced a significantly larger CPM effect compared to the protocol with phasic test-stimuli (P < 0.001). Fair and good relative reliability was found with the phasic and tonic test-stimuli, respectively. Absolute reliability indices showed large intraindividual variability from session to session in both protocols. Conclusion: The present study shows that a CPM protocol with a tonic test-stimulus is preferable to a protocol with phasic test-stimuli. However, we emphasize that one should be cautious about using the CPM effect as a biomarker or in clinical decision making on an individual level due to large intraindividual variability. PMID:29392240
Xiao, Yuan-mei; Wang, Zhi-ming; Wang, Mian-zhen; Lan, Ya-jia
2005-06-01
To test the reliability and validity of two mental workload assessment scales, i.e. the subjective workload assessment technique (SWAT) and the NASA task load index (NASA-TLX). One thousand two hundred and sixty-eight mental workers were sampled from various occupations, such as scientific research, education, administration and medicine, with randomized cluster sampling. The re-test reliability, split-half reliability, Cronbach's alpha coefficient and correlation coefficients between item score and total score were adopted to test the reliability. The test of validity included structure validity. The re-test reliability coefficients of these two scales and their items ranged from 0.516 to 0.753 (P < 0.01), indicating the two scales had good re-test reliability; the split-half reliability of SWAT was 0.645, its Cronbach's alpha coefficient was more than 0.80, and all the correlation coefficients between its item scores and total score were more than 0.70; as for NASA-TLX, both the split-half reliability and Cronbach's alpha coefficient were more than 0.80, and the correlation coefficients between its item scores and total score were all more than 0.60 (P < 0.01) except for the performance item. Both scales had good internal consistency. The Pearson correlation coefficient between the two scales was 0.492 (P < 0.01), implying the results of the two scales had good consistency. Factor analysis showed that the two scales had good structure validity. Both SWAT and NASA-TLX have good reliability and validity and may be used as valid tools to assess mental workload in China after being revised properly.
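The internal-consistency statistics reported above can be sketched as follows. This is a minimal Python illustration of Cronbach's alpha and an odd-even split-half coefficient with the Spearman-Brown correction; the function names, the odd-even split, and the sample ratings are assumptions made for illustration, since the abstract does not say how the halves were formed or which software was used.

```python
from statistics import mean, variance


def cronbach_alpha(responses):
    """responses: one list of k item scores per respondent."""
    k = len(responses[0])
    item_vars = [variance([resp[j] for resp in responses]) for j in range(k)]
    total_var = variance([sum(resp) for resp in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def split_half_reliability(responses):
    """Odd-even split-half correlation, stepped up with the Spearman-Brown formula."""
    first_half = [sum(resp[0::2]) for resp in responses]   # items 1, 3, 5, ...
    second_half = [sum(resp[1::2]) for resp in responses]  # items 2, 4, 6, ...
    r = pearson_r(first_half, second_half)
    return 2 * r / (1 + r)


if __name__ == "__main__":
    # Hypothetical ratings: five respondents, four items each
    ratings = [
        [3, 4, 3, 5],
        [2, 2, 3, 2],
        [5, 4, 5, 5],
        [1, 2, 1, 2],
        [4, 3, 4, 4],
    ]
    print(cronbach_alpha(ratings))
    print(split_half_reliability(ratings))
```

Both coefficients depend on the number of items, so values from scales of different lengths (such as SWAT and NASA-TLX) are not directly comparable item for item.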
Validity and reliability of a new ankle dorsiflexion measurement device.
Gatt, Alfred; Chockalingam, Nachiappan
2013-08-01
The assessment of the maximum ankle dorsiflexion angle is an important clinical examination procedure. Evidence shows that the traditional goniometer is highly unreliable, and various designs of goniometers to measure the maximum ankle dorsiflexion angle rely on the application of a known force to obtain reliable results. Hence, an innovative ankle dorsiflexion measurement device was designed to make this measurement more reliable by holding the foot in a selected posture without the application of a known moment. To report on the comprehensive validity and reliability testing carried out on the new device. Following validity testing, four different trials to test reliability of the ankle dorsiflexion measurement device were performed. These trials included inter-rater and intra-rater testings with a controlled moment, intra-rater reliability testing with knees flexed and extended without a controlled moment, intra-rater testing with a patient population, and inter-rater reliability testing between four raters of varying experience without controlling moment. All raters were blinded. A series of trials to test intra-rater and inter-rater reliabilities. Intra-rater reliability intraclass correlation coefficient was 0.98 and inter-rater reliability intraclass correlation coefficient (2,1) was 0.953 with a controlled moment. With uncontrolled moment, very high reliability for intra-tester was also achieved (intraclass correlation coefficient = 0.94 with knees extended and intraclass correlation coefficient = 0.95 with knees flexed). For the trial investigating test-retest reliability with actual patients, intraclass correlation coefficient of 0.99 was obtained. In the trial investigating four different raters with uncontrolled moment, intraclass correlation coefficient of 0.91 was achieved. 
The new ankle dorsiflexion measurement device is a valid and reliable device for measuring ankle dorsiflexion in both healthy subjects and patients, with both controlled and uncontrolled moments, even by multiple raters of varying experience when the foot is dorsiflexed to its end of range of motion. An ankle dorsiflexion measuring device has been designed to increase the reliability of ankle dorsiflexion measurement and replace the traditional goniometer. While the majority of similar devices rely on application of a known moment to perform this measurement, it has been shown that this is not required with the new ankle dorsiflexion measurement device and, rather, foot posture should be taken into consideration as this affects the maximum ankle dorsiflexion angle.
[Reliability and validity of the Chinese version on Alcohol Use Disorders Identification Test].
Zhang, C; Yang, G P; Li, Z; Li, X N; Li, Y; Hu, J; Zhang, F Y; Zhang, X J
2017-08-10
Objective: To assess the reliability and validity of the Chinese version of the Alcohol Use Disorders Identification Test (AUDIT) among medical students in China and to provide guidance on the correct application of the scale. Methods: An E-questionnaire was developed and sent to medical students in five different colleges. All students participated voluntarily. Cronbach's α and split-half reliability were calculated to evaluate the reliability of AUDIT, while content, construct, discriminant and convergent validity were assessed to measure the validity of the scale. Results: The overall Cronbach's α of AUDIT was 0.782 and the split-half reliability was 0.711. The domain Cronbach's α and split-half reliability were 0.796 and 0.794 for hazardous alcohol use, 0.561 and 0.623 for dependence symptoms, and 0.647 and 0.640 for harmful alcohol use. The item-level content validity indices (I-CVI) ranged from 0.83 to 1.00, the scale-level content validity index (S-CVI/UA) was 0.90, the average scale-level content validity index (S-CVI/Ave) was 0.99, and the content validity ratios (CVR) ranged from 0.80 to 1.00. Exploratory factor analysis of the simplified version of AUDIT supported the presupposed three-factor structure, which explained 61.175% of the total variance. AUDIT seemed to have good convergent and discriminant validity, with a calibration success rate of 100%. Conclusion: AUDIT showed good reliability and validity among medical students in China, and its use is worth promoting.
DiFilippo, Kristen Nicole; Huang, Wenhao; Chapman-Novakofski, Karen M
2017-10-27
The extensive availability and increasing use of mobile apps for nutrition-based health interventions makes evaluation of the quality of these apps crucial for integration of apps into nutritional counseling. The goal of this research was the development, validation, and reliability testing of the app quality evaluation (AQEL) tool, an instrument for evaluating apps' educational quality and technical functionality. Items for evaluating app quality were adapted from website evaluations, with additional items added to evaluate the specific characteristics of apps, resulting in 79 initial items. Expert panels of nutrition and technology professionals and app users reviewed items for face and content validation. After recommended revisions, nutrition experts completed a second AQEL review to ensure clarity. On the basis of 150 sets of responses using the revised AQEL, principal component analysis was completed, reducing AQEL into 5 factors that underwent reliability testing, including internal consistency, split-half reliability, test-retest reliability, and interrater reliability (IRR). Two additional modifiable constructs for evaluating apps based on the age and needs of the target audience as selected by the evaluator were also tested for construct reliability. IRR testing using intraclass correlations (ICC) with all 7 constructs was conducted, with 15 dietitians evaluating one app. Development and validation resulted in the 51-item AQEL. These were reduced to 25 items in 5 factors after principal component analysis, plus 9 modifiable items in two constructs that were not included in principal component analysis. Internal consistency and split-half reliability of the following constructs derived from principal components analysis were good (Cronbach alpha >.80, Spearman-Brown coefficient >.80): behavior change potential, support of knowledge acquisition, app function, and skill development. Split-half reliability for app purpose was .65.
Test-retest reliability showed no significant change over time (P>.05) for all but skill development (P=.001). Construct reliability was good for items assessing age appropriateness of apps for children, teens, and a general audience. In addition, construct reliability was acceptable for assessing app appropriateness for various target audiences (Cronbach alpha >.70). For the 5 main factors, ICC (1,k) was >.80, with a P value of <.05. When 15 nutrition professionals evaluated one app, ICC (2,15) was .98, with a P value of <.001 for all 7 constructs when the modifiable items were specified for adults seeking weight loss support. Our preliminary effort shows that AQEL is a valid, reliable instrument for evaluating nutrition apps' qualities for clinical interventions by nutrition clinicians, educators, and researchers. Further efforts in validating AQEL in various contexts are needed. ©Kristen Nicole DiFilippo, Wenhao Huang, Karen M. Chapman-Novakofski. Originally published in JMIR Mhealth and Uhealth (http://mhealth.jmir.org), 27.10.2017.
40 CFR 792.17 - Effects of non-compliance.
Code of Federal Regulations, 2011 CFR
2011-07-01
... U.S.C. 2 or 1001. (b) EPA, at its discretion, may not consider reliable for purposes of showing that... reliable does not, however, relieve the sponsor of a required test of the obligation under any applicable...
Varikuti, Deepthi P; Hoffstaedter, Felix; Genon, Sarah; Schwender, Holger; Reid, Andrew T; Eickhoff, Simon B
2017-04-01
Resting-state functional connectivity analysis has become a widely used method for the investigation of human brain connectivity and pathology. The measurement of neuronal activity by functional MRI, however, is impeded by various nuisance signals that reduce the stability of functional connectivity. Several methods exist to address this predicament, but little consensus has yet been reached on the most appropriate approach. Given the crucial importance of reliability for the development of clinical applications, we here investigated the effect of various confound removal approaches on the test-retest reliability of functional-connectivity estimates in two previously defined functional brain networks. Our results showed that gray matter masking improved the reliability of connectivity estimates, whereas denoising based on principal components analysis reduced it. We additionally observed that refraining from using any correction for global signals provided the best test-retest reliability, but failed to reproduce anti-correlations between what have been previously described as antagonistic networks. This suggests that improved reliability can come at the expense of potentially poorer biological validity. Consistent with this, we observed that reliability was proportional to the retained variance, which presumably included structured noise, such as reliable nuisance signals (for instance, noise induced by cardiac processes). We conclude that compromises are necessary between maximizing test-retest reliability and removing variance that may be attributable to non-neuronal sources.
Varikuti, Deepthi P.; Hoffstaedter, Felix; Genon, Sarah; Schwender, Holger; Reid, Andrew T.; Eickhoff, Simon B.
2016-01-01
Resting-state functional connectivity analysis has become a widely used method for the investigation of human brain connectivity and pathology. The measurement of neuronal activity by functional MRI, however, is impeded by various nuisance signals that reduce the stability of functional connectivity. Several methods exist to address this predicament, but little consensus has yet been reached on the most appropriate approach. Given the crucial importance of reliability for the development of clinical applications, we here investigated the effect of various confound removal approaches on the test-retest reliability of functional-connectivity estimates in two previously defined functional brain networks. Our results showed that grey matter masking improved the reliability of connectivity estimates, whereas de-noising based on principal components analysis reduced it. We additionally observed that refraining from using any correction for global signals provided the best test-retest reliability, but failed to reproduce anti-correlations between what have been previously described as antagonistic networks. This suggests that improved reliability can come at the expense of potentially poorer biological validity. Consistent with this, we observed that reliability was proportional to the retained variance, which presumably included structured noise, such as reliable nuisance signals (for instance, noise induced by cardiac processes). We conclude that compromises are necessary between maximizing test-retest reliability and removing variance that may be attributable to non-neuronal sources. PMID:27550015
Four-way-leaning test shows larger limits of stability than a circular-leaning test.
Thomsen, Mikkel Højgaard; Støttrup, Nicolai; Larsen, Frederik Greve; Pedersen, Ann-Marie Sydow Krogh; Poulsen, Anne Grove; Hirata, Rogerio Pessoto
2017-01-01
Limits of stability (LOS) have extensive clinical and rehabilitational value, yet no standard consensus on measuring LOS exists. LOS measured using a leaning or a circling protocol is commonly used in research and clinical settings; however, differences in protocols and reliability problems exist. This study measured LOS using a four-way-leaning test and a circular-leaning test to test which showed larger LOS measurements. Furthermore, the number of adaptation trials needed for consistent results was assessed. Limits of stability were measured using a force plate (Metitur Good Balance System®) sampling at 50 Hz. Thirty healthy subjects completed 30 trials assessing LOS, alternating between the four-way-leaning test and the circular-leaning test. A main effect of method was found (ANOVA: F(1,28)=45.86, P<0.01), with the four-way-leaning test showing larger values than the circular-leaning test (NK, P<0.01). An interaction between method×directions was found (ANOVA: F(3,84)=24.87, P<0.01). The four-way-leaning test showed larger LOS in the anterior (NK, P<0.05), right (NK, P<0.01) and left directions (NK, P<0.01). Analysis of LOS for the four-way-leaning test showed a difference between trials (ANOVA: F(14,392)=7.81, P<0.01). Differences were found between trial 1 and 7 (NK, P<0.03), trial 6 and 8 (NK, P<0.02) and trial 7 and 15 (NK, P<0.02). The four-way-leaning test showed high correlation (ICC>0.87) between first and second trial for all directions. The four-way-leaning test yields larger LOS in the anterior, right and left directions, making it more reliable when measuring LOS. A learning effect was found up to the 8th trial, which suggests using 8 adaptation trials before reliable LOS is measured. Copyright © 2016 Elsevier B.V. All rights reserved.
Analysis of strain gage reliability in F-100 jet engine testing at NASA Lewis Research Center
NASA Technical Reports Server (NTRS)
Holanda, R.
1983-01-01
A reliability analysis was performed on 64 strain gage systems mounted on the 3 rotor stages of the fan of a YF-100 engine. The strain gages were used in a 65 hour fan flutter research program which included about 5 hours of blade flutter. The analysis was part of a reliability improvement program. Eighty-four percent of the strain gages survived the test and performed satisfactorily. A post test analysis determined most failure causes. Five failures were caused by open circuits, three failed gages showed elevated circuit resistance, and one gage circuit was grounded. One failure was undetermined.
Emotional-volitional components of operator reliability. [sensorimotor function testing under stress
NASA Technical Reports Server (NTRS)
Mileryan, Y. A.
1975-01-01
Sensorimotor function testing in a tracking task under stressful working conditions established a psychological characterization for a successful aviation pilot: Motivation significantly increased the reliability and effectiveness of their work. Their activities were aimed at suppressing weariness and the feeling of fear caused by the stress factors; they showed patience, endurance, persistence, and a capacity for lengthy volitional efforts.
Killgore, William D S; Gogel, Hannah
2014-01-01
Neuropsychological assessments are frequently time-consuming and fatiguing for patients. Brief screening evaluations may reduce test duration and allow more efficient use of time by permitting greater attention toward neuropsychological domains showing probable deficits. The Design Organization Test (DOT) was initially developed as a 2-min paper-and-pencil alternative for the Block Design (BD) subtest of the Wechsler scales. Although initially validated for clinical neurologic patients, we sought to further establish the reliability and validity of this test in a healthy, more diverse population. Two alternate versions of the DOT and the Wechsler Abbreviated Scale of Intelligence (WASI) were administered to 61 healthy adult participants. The DOT showed high alternate forms reliability (r = .90-.92), and the two versions yielded equivalent levels of performance. The DOT was highly correlated with BD (r = .76-.79) and was significantly correlated with all subscales of the WASI. The DOT proved useful when used in lieu of BD in the calculation of WASI IQ scores. Findings support the reliability and validity of the DOT as a measure of visuospatial ability and suggest its potential worth as an efficient estimate of intellectual functioning in situations where lengthier tests may be inappropriate or unfeasible.
Laser notching ceramics for reliable fracture toughness testing
Barth, Holly D.; Elmer, John W.; Freeman, Dennis C.; ...
2015-09-19
A new method for notching ceramics was developed using a picosecond laser for fracture toughness testing of alumina samples. The test geometry incorporated a single-edge-V-notch that was notched using picosecond laser micromachining. This method has been used in the past for cutting ceramics, and is known to remove material with little to no thermal effect on the surrounding material matrix. This study showed that laser-assisted-machining for fracture toughness testing of ceramics was reliable, quick, and cost effective. In order to assess the laser notched single-edge-V-notch beam method, fracture toughness results were compared to results from other more traditional methods, specifically surface-crack in flexure and the chevron notch bend tests. Lastly, the results showed that picosecond laser notching produced precise notches in post-failure measurements, and that the measured fracture toughness results showed improved consistency compared to traditional fracture toughness methods.
Thermal shock testing for assuring reliability of glass-sealed microelectronic packages
NASA Technical Reports Server (NTRS)
Thomas, Walter B., III; Lewis, Michael D.
1991-01-01
Tests were performed to determine if thermal shocking is destructive to glass-to-metal seal microelectronic packages and if thermal shock step stressing can compare package reliabilities. Thermal shocking was shown to be not destructive to highly reliable glass seals. Pin-pull tests used to compare the interfacial pin glass strengths showed no differences between thermal shocked and not-thermal shocked headers. A 'critical stress resistance temperature' was not exhibited by the 14 pin Dual In-line Package (DIP) headers evaluated. Headers manufactured in cryogenic nitrogen based and exothermically generated atmospheres showed differences in as-received leak rates, residual oxide depths and pin glass interfacial strengths; these were caused by the different manufacturing methods, in particular, by the chemically etched pins used by one manufacturer. Both header types passed thermal shock tests to temperature differentials of 646 C. The sensitivity of helium leak rate measurements was improved up to 70 percent by baking headers for two hours at 200 C after thermal shocking.
Drake, David; Kennedy, Rodney; Wallace, Eric
2018-02-06
Isometric multi-joint tests are considered reliable and have strong relationships with 1RM performance. However, limited evidence is available for the isometric squat in terms of effects of familiarization and reliability. This study aimed to assess the effect of familiarization and stability reliability, to determine the smallest detectable difference, and to examine the correlation of the isometric squat test with 1RM squat performance. Thirty-six strength-trained participants volunteered to take part in this study. Following three familiarization sessions, test-retest reliability was evaluated with a 48-hour window between each time point. Isometric squat peak, net and relative force were assessed. Results showed that three familiarization sessions were required, and that the isometric squat had a high level of stability reliability and a smallest detectable difference of 11% for peak and relative force. Isometric strength at a knee angle of ninety degrees had a strong significant relationship with 1RM squat performance. In conclusion, the isometric squat is a valid test to assess multi-joint strength and can discriminate between strong and weak 1RM squat performance. Changes greater than 11% in peak and relative isometric squat performance should be considered meaningful in participants who are familiar with the test.
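The abstract above reports a smallest detectable difference but not how it was computed. One common formulation, shown here purely as an assumption, derives the standard error of measurement from the between-subject SD and the ICC, then scales it to a 95% threshold for a difference between two measurements. The function names and the numbers in the usage example are hypothetical.

```python
import math


def standard_error_of_measurement(sd, icc):
    """SEM = SD * sqrt(1 - ICC), using the between-subject SD of the scores."""
    return sd * math.sqrt(1 - icc)


def smallest_detectable_difference(sem, z=1.96):
    """SDD = z * sqrt(2) * SEM; sqrt(2) accounts for error in both test and retest."""
    return z * math.sqrt(2) * sem


def as_percent_of_mean(sdd, group_mean):
    """Express the SDD relative to the group mean, as a percentage."""
    return 100 * sdd / group_mean


if __name__ == "__main__":
    # Hypothetical peak force data: mean 2500 N, SD 400 N, ICC 0.95
    sem = standard_error_of_measurement(400, 0.95)
    sdd = smallest_detectable_difference(sem)
    print(sem, sdd, as_percent_of_mean(sdd, 2500))
```

Under this formulation, a change smaller than the SDD cannot be distinguished from measurement error at the chosen confidence level, which is the sense in which the abstract treats changes above 11% as meaningful.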
Reliability and Validity of the Korean Version of the Internet Addiction Test among College Students
Lee, Kounseok; Lee, Hye-Kyung; Gyeong, Hyunsu; Yu, Byeongkwan; Song, Yul-Mai
2013-01-01
We developed a Korean translation of the Internet Addiction Test (KIAT), a widely used self-report measure of internet addiction, and tested its reliability and validity in a sample of college students. Two hundred seventy-nine college students at a national university completed the KIAT. Internal consistency and two-week test-retest reliability were calculated from the data, and principal component factor analysis was conducted. Participants also completed the Internet Addiction Diagnostic Questionnaire (IADQ), the Korea Internet addiction scale (K-scale), and the Patient Health Questionnaire-9 for the criterion validity. Cronbach's alpha of the whole scale was 0.91, and test-retest reliability was also good (r = 0.73). The IADQ, the K-scale, and depressive symptoms were significantly correlated with the KIAT scores, demonstrating concurrent and convergent validity. The factor analysis extracted four factors (Excessive use, Dependence, Withdrawal, and Avoidance of reality) that accounted for 59% of total variance. The KIAT has outstanding internal consistency and high test-retest reliability. Also, the factor structure and validity data show that the KIAT is comparable to the original version. Thus, the KIAT is a psychometrically sound tool for assessing internet addiction in the Korean-speaking population. PMID:23678270
Daneshfar, Amin; Gahreman, Daniel E.; Koozehchian, Majid S.; Amani Shalamzari, Sadegh; Hassanzadeh Sablouei, Mozhgan; Rosemann, Thomas; Knechtle, Beat; Nikolaidis, Pantelis T.
2018-01-01
The aim of the present study was to examine the validity and reliability of a 10 × (6 × 5 m) multi-directional repeated sprint ability test (RSM) in elite young team handball (TH) players. Participants were members of the Iranian national team (n = 20, age 16.4 ± 0.7 years, weight 82.5 ± 5.5 kg, height 184.8 ± 4.6 cm, body fat 15.4 ± 4.3%). The validity of RSM was tested against a 10 × (15 + 15 m) repeated sprint ability test (RSA), Yo-Yo Intermittent Recovery test Level 1 (Yo-Yo IR1), squat jump (SJ) and countermovement jump (CMJ). To test the reliability of RSM, the participants repeated the testing sessions of RSM and RSA 1 week later. Both RSA and RSM tests showed good to excellent reliability of the total time (TT), best time (BT), and weakest time (WT). The results of the correlation analysis showed significant inverse correlations between maximum aerobic capacity and TT in RSA (r = −0.57, p ≤ 0.05) and RSM (r = −0.76, p ≤ 0.01). There was also a significant inverse correlation between maximum aerobic capacity and fatigue index (FI) in the RSA test (r = −0.64, p ≤ 0.01) and in the RSM test (r = −0.53, p ≤ 0.05). BT, WT, and TT of RSA were largely to very largely correlated with BT (r = 0.58, p ≤ 0.01), WT (r = 0.62, p ≤ 0.01), and TT (r = 0.65, p ≤ 0.01) of RSM. BT in RSM was also correlated with FI in RSM (r = 0.88, p ≤ 0.01). In conclusion, based on the findings of the current study, the recently developed RSM test is a valid and reliable test and should be utilized for assessment of repeated sprint ability in handball players. PMID:29670536
Reliability of resting-state microstate features in electroencephalography.
Khanna, Arjun; Pascual-Leone, Alvaro; Farzan, Faranak
2014-01-01
Electroencephalographic (EEG) microstate analysis is a method of identifying quasi-stable functional brain states ("microstates") that are altered in a number of neuropsychiatric disorders, suggesting their potential use as biomarkers of neurophysiological health and disease. However, use of EEG microstates as neurophysiological biomarkers requires assessment of the test-retest reliability of microstate analysis. We analyzed resting-state, eyes-closed, 30-channel EEG from 10 healthy subjects over 3 sessions spaced approximately 48 hours apart. We identified four microstate classes and calculated the average duration, frequency, and coverage fraction of these microstates. Using Cronbach's α and the standard error of measurement (SEM) as indicators of reliability, we examined: (1) the test-retest reliability of microstate features using a variety of different approaches; (2) the consistency between TAAHC and k-means clustering algorithms; and (3) whether microstate analysis can be reliably conducted with 19 and 8 electrodes. The approach of identifying a single set of "global" microstate maps showed the highest reliability (mean Cronbach's α > 0.8, SEM ≈ 10% of mean values) compared to microstates derived by each session or each recording. There was notably low reliability in features calculated from maps extracted individually for each recording, suggesting that the analysis is most reliable when maps are held constant. Features were highly consistent across clustering methods (Cronbach's α > 0.9). All features had high test-retest reliability with 19 and 8 electrodes. High test-retest reliability and cross-method consistency of microstate features suggests their potential as biomarkers for assessment of the brain's neurophysiological health.
Validity and reliability of a new tool to evaluate handwriting difficulties in Parkinson's disease.
Nackaerts, Evelien; Heremans, Elke; Smits-Engelsman, Bouwien C M; Broeder, Sanne; Vandenberghe, Wim; Bergmans, Bruno; Nieuwboer, Alice
2017-01-01
Handwriting in Parkinson's disease (PD) features specific abnormalities which are difficult to assess in clinical practice since no specific tool for evaluation of spontaneous movement is currently available. This study aims to validate the 'Systematic Screening of Handwriting Difficulties' (SOS-test) in patients with PD. Handwriting performance of 87 patients and 26 healthy age-matched controls was examined using the SOS-test. Sixty-seven patients were tested a second time within a period of one month. Participants were asked to copy as much as possible of a text within 5 minutes with the instruction to write as neatly and quickly as in daily life. Writing speed (letters in 5 minutes), size (mm) and quality of handwriting were compared. Correlation analysis was performed between SOS outcomes and other fine motor skill measurements and disease characteristics. Intrarater, interrater and test-retest reliability were assessed using the intraclass correlation coefficient (ICC) and Spearman correlation coefficient. Patients with PD had a smaller (p = 0.043) and slower (p<0.001) handwriting and showed worse writing quality (p = 0.031) compared to controls. The outcomes of the SOS-test significantly correlated with fine motor skill performance and disease duration and severity. Furthermore, the test showed excellent intrarater, interrater and test-retest reliability (ICC > 0.769 for both groups). The SOS-test is a short and effective tool to detect handwriting problems in PD with excellent reliability. It can therefore be recommended as a clinical instrument for standardized screening of handwriting deficits in PD.
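The record above reports reliability as intraclass correlation coefficients but does not state which ICC form was used. As one possibility, the sketch below computes ICC(2,1) (two-way random effects, absolute agreement, single measurement) from the usual ANOVA mean squares; the choice of form, the function name and the example scores are assumptions made for illustration.

```python
def icc_2_1(ratings):
    """ICC(2,1) for a table with one row per subject and one column per rater or session."""
    n, k = len(ratings), len(ratings[0])
    if n < 2 or k < 2:
        raise ValueError("need at least two subjects and two raters")
    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    # Two-way ANOVA sums of squares without replication.
    ss_rows = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand_mean) ** 2 for m in col_means)
    ss_total = sum((x - grand_mean) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical writing-speed scores (letters in 5 minutes) for five people on two occasions.
example_scores = [[310, 318], [250, 245], [402, 390], [198, 210], [275, 280]]
example_icc = icc_2_1(example_scores)
```

Because ICC(2,1) penalises systematic differences between raters or sessions as well as random error, it is usually lower than a consistency-type ICC computed on the same table.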
Aartun, Ellen; Degerfalk, Anna; Kentsdotter, Linn; Hestbaek, Lise
2014-02-10
Evidence on the reliability of clinical tests used for the spinal screening of children and adolescents is currently lacking. The aim of this study was to determine the inter- and intra-rater reliability and measurement error of clinical tests commonly used when screening young spines. Two experienced chiropractors independently assessed 111 adolescents aged 12-14 years who were recruited from a primary school in Denmark. A standardised examination protocol was used to test inter-rater reliability including tests for scoliosis, hypermobility, general mobility, inter-segmental mobility and end range pain in the spine. Seventy-five of the 111 subjects were re-examined after one to four hours to test intra-rater reliability. Percentage agreement and Cohen's Kappa were calculated for binary variables, and intraclass correlation coefficients (ICC) and Bland-Altman plots with Limits of Agreement (LoA) were calculated for continuous measures. Inter-rater percentage agreement for binary data ranged from 59.5% to 100%. Kappa ranged from 0.06 to 1.00. Kappa ≥ 0.40 was seen for elbow, thumb, fifth finger and trunk/hip flexion hypermobility, pain response in inter-segmental mobility and end range pain in lumbar flexion and extension. For continuous data, ICCs ranged from 0.40 to 0.95. Only forward flexion as measured by finger-to-floor distance reached an acceptable ICC (≥ 0.75). Overall, results for intra-rater reliability were better than for inter-rater reliability, but for both components the LoA were quite wide compared with the range of assessments. Some clinical tests showed good, and some tests poor, reliability when applied in a spinal screening of adolescents. The results could probably be improved by additional training and further test standardization. This is the first step in evaluating the value of these tests for the spinal screening of adolescents. Future research should determine the association between these tests and current and/or future neck and back pain.
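The record above reports both percentage agreement and Cohen's kappa for binary findings. The sketch below shows how the two are computed for a pair of raters, using the standard definition kappa = (p_o − p_e) / (1 − p_e); the function names and the ratings are hypothetical.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Percentage of subjects on which the two raters gave the same label."""
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100.0 * matches / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: observed agreement corrected for agreement expected by chance."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement from each rater's marginal label frequencies.
    expected = sum(
        counts_a[label] * counts_b[label] for label in set(rater_a) | set(rater_b)
    ) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical binary findings (1 = positive test) from two raters on ten subjects.
rater_1 = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
rater_2 = [1, 0, 0, 0, 1, 0, 1, 1, 1, 0]
agreement = percent_agreement(rater_1, rater_2)  # 80.0
kappa = cohens_kappa(rater_1, rater_2)           # approximately 0.6
```

The chance correction explains why percentage agreement can look high while kappa stays low: when almost every subject has the same finding, the raters agree often by chance alone, so the expected agreement is large and kappa shrinks.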
Seo, Hyun-Ju; Kim, Soo Young; Lee, Yoon Jae; Jang, Bo-Hyoung; Park, Ji-Eun; Sheen, Seung-Soo; Hahn, Seo Kyung
2016-02-01
To develop a study Design Algorithm for Medical Literature on Intervention (DAMI) and test its interrater reliability, construct validity, and ease of use. We developed and then revised the DAMI to include detailed instructions. To test the DAMI's reliability, we used a purposive sample of 134 primary, mainly nonrandomized studies. We then compared the study designs as classified by the original authors and through the DAMI. Unweighted kappa statistics were computed to test interrater reliability and construct validity based on the level of agreement between the original and DAMI classifications. Assessment time was also recorded to evaluate ease of use. The DAMI includes 13 study designs, including experimental and observational studies of interventions and exposure. Both the interrater reliability (unweighted kappa = 0.67; 95% CI [0.64-0.75]) and construct validity (unweighted kappa = 0.63, 95% CI [0.52-0.67]) were substantial. Mean classification time using the DAMI was 4.08 ± 2.44 minutes (range, 0.51-10.92). The DAMI showed substantial interrater reliability and construct validity. Furthermore, given its ease of use, it could be used to accurately classify medical literature for systematic reviews of interventions while minimizing disagreement between authors of such reviews. Copyright © 2016 Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Treichel, Todd H.
Commercial space designers are required to manage space flight designs in accordance with parts selections made from qualified parts listings approved by Department of Defense and NASA agencies for reliability and safety. The research problem was a government and private aerospace industry problem involving how LEDs cannot replace existing fluorescent lighting in manned space flight vehicles until such technology meets DOD and NASA requirements for reliability and safety, and effects on astronaut cognition and health. The purpose of this quantitative experimental study was to determine to what extent commercial LEDs can suitably meet NASA requirements for manufacturer reliability, color reliability, robustness to environmental test requirements, and degradation effects from operational power, while providing comfortable ambient light free of eyestrain to astronauts in lieu of current fluorescent lighting. A fractional factorial experiment tested white and blue LEDs for NASA required space flight environmental stress testing and applied operating current. The second phase of the study used a randomized block design, to test human factor effects of LEDs and a qualified ISS fluorescent for retinal fatigue and eye strain. Eighteen human subjects were recruited from university student members of the American Institute of Aeronautics and Astronautics. Findings for Phase 1 testing showed that commercial LEDs met all DOD and NASA requirements for manufacturer reliability, color reliability, robustness to environmental requirements, and degradation effects from operational power. Findings showed statistical significance for LED color and operational power variables but degraded light output levels did not fall below the industry recognized <70%. 
Findings from Phase 2 human factors testing showed no statistically significant evidence that the NASA approved ISS fluorescent lights or blue or white LEDs caused fatigue, eye strain and/or headache, when study participants perform detailed tasks of reading and assembling mechanical parts for an extended period of two uninterrupted hours. However, human subjects self-reported that blue LEDs provided the most white light and the favored light source over the white LED and the ISS fluorescent as a sole artificial light source for space travel. According to NASA standards, findings from this study indicate that LEDs meet criteria for the NASA TRL 7 rating, as study findings showed that commercial LED manufacturers passed the rigorous testing standards of suitability for space flight environments and human factor effects. Recommendations for future research include further testing for space flight using the basis of this study for replication, but reduce study limitations by 1) testing human subjects exposure to LEDs in a simulated space capsule environment over several days, and 2) installing and testing LEDs in space modules being tested for human spaceflight.
Baad-Hansen, L; Pigg, M; Yang, G; List, T; Svensson, P; Drangsholt, M
2015-02-01
The reliability of comprehensive intra-oral quantitative sensory testing (QST) protocol has not been examined systematically in patients with chronic oro-facial pain. The aim of the present multicentre study was to examine test-retest and interexaminer reliability of intra-oral QST measures in terms of absolute values and z-scores as well as within-session coefficients of variation (CV) values in patients with atypical odontalgia (AO) and healthy pain-free controls. Forty-five patients with AO and 68 healthy controls were subjected to bilateral intra-oral gingival QST and unilateral extratrigeminal QST (thenar) on three occasions (twice on 1 day by two different examiners and once approximately 1 week later by one of the examiners). Intra-class correlation coefficients and kappa values for interexaminer and test-retest reliability were computed. Most of the standardised intra-oral QST measures showed fair to excellent interexaminer (9-12 of 13 measures) and test-retest (7-11 of 13 measures) reliability. Furthermore, no robust differences in reliability measures or within-session variability (CV) were detected between patients with AO and the healthy reference group. These reliability results in chronic orofacial pain patients support earlier suggestions based on data from healthy subjects that intra-oral QST is sufficiently reliable for use as a part of a comprehensive evaluation of patients with somatosensory disturbances or neuropathic pain in the trigeminal region. © 2014 John Wiley & Sons Ltd.
Burnstein, Bryan D.; Steele, Russell J.; Shrier, Ian
2011-01-01
Context: Fitness testing is used frequently in many areas of physical activity, but the reliability of these measurements under real-world, practical conditions is unknown. Objective: To evaluate the reliability of specific fitness tests using the methods and time periods used in the context of real-world sport and occupational management. Design: Cohort study. Setting: Eighteen different Cirque du Soleil shows. Patients or Other Participants: Cirque du Soleil physical performers who completed 4 consecutive tests (6-month intervals) and were free of injury or illness at each session (n = 238 of 701 physical performers). Intervention(s): Performers completed 6 fitness tests on each assessment date: dynamic balance, Harvard step test, handgrip, vertical jump, pull-ups, and 60-second jump test. Main Outcome Measure(s): We calculated the intraclass coefficient (ICC) and limits of agreement between baseline and each time point and the ICC over all 4 time points combined. Results: Reliability was acceptable (ICC > 0.6) over an 18-month time period for all pairwise comparisons and all time points together for the handgrip, vertical jump, and pull-up assessments. The Harvard step test and 60-second jump test had poor reliability (ICC < 0.6) between baseline and other time points. When we excluded the baseline data and calculated the ICC for 6-month, 12-month, and 18-month time points, both the Harvard step test and 60-second jump test demonstrated acceptable reliability. Dynamic balance was unreliable in all contexts. Limit-of-agreement analysis demonstrated considerable intraindividual variability for some tests and a learning effect by administrators on others. Conclusions: Five of the 6 tests in this battery had acceptable reliability over an 18-month time frame, but the values for certain individuals may vary considerably from time to time for some tests. Specific tests may require a learning period for administrators. PMID:22488138
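The record above complements the ICC with limits of agreement between baseline and later sessions. A minimal sketch of the Bland-Altman calculation (mean difference ± 1.96 × SD of the differences) follows; the function name and the handgrip values are hypothetical and assume paired measurements on the same performers.

```python
from statistics import mean, stdev


def limits_of_agreement(first, second, z=1.96):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    differences = [b - a for a, b in zip(first, second)]
    bias = mean(differences)          # systematic shift between sessions
    spread = stdev(differences)       # sample SD of the paired differences
    return bias, bias - z * spread, bias + z * spread


# Hypothetical handgrip strength (kg) at baseline and at the 6-month test.
baseline = [42.0, 55.5, 38.0, 61.0, 47.5]
six_months = [43.5, 54.0, 40.0, 63.5, 47.0]
bias, lower, upper = limits_of_agreement(baseline, six_months)
```

A nonzero bias across sessions is the kind of pattern that would point to a learning effect, while wide limits reflect the intraindividual variability the authors describe.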
Reliability and Validity of Ten Consumer Activity Trackers Depend on Walking Speed.
Fokkema, Tryntsje; Kooiman, Thea J M; Krijnen, Wim P; VAN DER Schans, Cees P; DE Groot, Martijn
2017-04-01
To examine the test-retest reliability and validity of ten activity trackers for step counting at three different walking speeds. Thirty-one healthy participants walked twice on a treadmill for 30 min while wearing 10 activity trackers (Polar Loop, Garmin Vivosmart, Fitbit Charge HR, Apple Watch Sport, Pebble Smartwatch, Samsung Gear S, Misfit Flash, Jawbone Up Move, Flyfit, and Moves). Participants walked at three speeds for 10 min each: slow (3.2 km·h⁻¹), average (4.8 km·h⁻¹), and vigorous (6.4 km·h⁻¹). To measure test-retest reliability, intraclass correlations (ICC) were determined between the first and second treadmill test. Validity was determined by comparing the trackers with the gold standard (hand counting), using mean differences, mean absolute percentage errors, and ICC. Statistical differences were calculated by paired-sample t tests, Wilcoxon signed-rank tests, and by constructing Bland-Altman plots. Test-retest reliability varied with ICC ranging from -0.02 to 0.97. Validity varied between trackers and different walking speeds with mean differences between the gold standard and activity trackers ranging from 0.0 to 26.4%. Most trackers showed relatively low ICC and broad limits of agreement of the Bland-Altman plots at the different speeds. For the slow walking speed, the Garmin Vivosmart and Fitbit Charge HR showed the most accurate results. The Garmin Vivosmart and Apple Watch Sport demonstrated the best accuracy at an average walking speed. For vigorous walking, the Apple Watch Sport, Pebble Smartwatch, and Samsung Gear S exhibited the most accurate results. Test-retest reliability and validity of activity trackers depend on walking speed. In general, consumer activity trackers perform better at an average and vigorous walking speed than at a slower walking speed.
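One of the validity measures named above is the mean absolute percentage error (MAPE) of tracker step counts against hand counting. A minimal sketch of that calculation follows; the function name and the step counts are hypothetical, not values from the study.

```python
def mean_absolute_percentage_error(reference, measured):
    """MAPE in percent: average of |measured - reference| / reference over all pairs."""
    if len(reference) != len(measured) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [abs(m - r) / r for r, m in zip(reference, measured)]
    return 100.0 * sum(errors) / len(errors)


# Hypothetical 10-minute step counts: hand-counted reference versus one tracker.
hand_counted = [980, 1015, 1002, 995]
tracker = [930, 1010, 1040, 900]
mape = mean_absolute_percentage_error(hand_counted, tracker)
```

Unlike the mean difference, MAPE does not let overcounts and undercounts cancel, so a tracker with near-zero mean difference can still show a large MAPE.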
Questionnaire for low back pain in the garment industry workers
Bindra, Supreet; Sinha, A. G. K.; Benjamin, A. I.
2013-01-01
Low back pain affects up to 90% of the world's population at some point in their lives. To date, no questionnaire has been designed for back pain in garment industry workers. Therefore, the objective of this study is to design a questionnaire to determine the prevalence, risk factors, impact, health care service utilization and back pain features in garment industry workers and to gain preliminary experience of its use. The content validity and reliability of the questionnaire were established. Items showing acceptable internal consistency and moderate to high test-retest reliability were retained in the questionnaire. Items showing unacceptable internal consistency, low test-retest reliability or poor differentiation were reworded, redrafted and re-tested on the workers. It took 20 min to complete one interview schedule. Environmental factors such as the absence of the garment industry owner/supervisor or co-workers at the time of the interview and interview during leisure hours need to be standardized. Thus, the final questionnaire is ready for use after necessary amendments and will be used on a larger sample in the main study. PMID:24421591
Assessment Instrument for Problem-focused Coping. Reliability test of APC. Part 1.
Tollén, A; Ahlström, G
1998-01-01
A new self-report instrument, the Assessment Instrument of Problem-focused Coping (APC) developed from qualitative interviews, is described. This instrument provides knowledge of the patients' own competence in coping with activities of daily living (ADL), the patients' own assessment of what they experience as problems, and the extent to which they are satisfied with their ADL. The purpose of the study was to test the reliability of the instrument with regard to intra-rater reliability and internal consistency. The study group comprised 40 patients with muscular weakness and other symptoms relating to the postpolio syndrome. The result showed an acceptable internal consistency (alpha 0.70), which confirms the construct validity of the instrument. The test-retest showed that the stability over a period of time varied from low to high for a total of 28 items. At the same time, it is evident that the instrument does not achieve the aim of being a good evaluation instrument, because the stability over a period of time was unsatisfactory. The test-retest should be repeated with a larger test group in future research projects.
Costa, Y M; Morita-Neto, O; de Araújo-Júnior, E N S; Sampaio, F A; Conti, P C R; Bonjardim, L R
2017-03-01
Assessing the reliability of medical measurements is a crucial step towards the elaboration of an applicable clinical instrument. There are few studies that evaluate the reliability of somatosensory assessment and pain modulation of masticatory structures. This study estimated the test-retest reliability, that is, the stability over time, of the mechanical somatosensory assessment of anterior temporalis, masseter and temporomandibular joint (TMJ) and the conditioned pain modulation (CPM) using the anterior temporalis as the test site. Twenty healthy women were evaluated in two sessions (1 week apart) by the same examiner. Mechanical detection threshold (MDT), mechanical pain threshold (MPT), wind-up ratio (WUR) and pressure pain threshold (PPT) were assessed on the skin overlying the anterior temporalis, masseter and TMJ of the dominant side. CPM was tested by comparing PPT before and during the hand immersion in a hot water bath. ANOVA and intra-class correlation coefficients (ICCs) were applied to the data (α = 5%). The overall ICCs showed acceptable values for the test-retest reliability of mechanical somatosensory assessment of masticatory structures. The ICC values of 75% of all quantitative sensory measurements were considered fair to excellent (fair = 8.4%, good = 33.3% and excellent = 33.3%). However, the CPM paradigm presented poor reliability (ICC = 0.25). The mechanical somatosensory assessment of the masticatory structures, but not the proposed CPM protocol, can be considered sufficiently reliable over time to evaluate the trigeminal sensory function. © 2016 John Wiley & Sons Ltd.
Evaluation of tools used to measure calcium and/or dairy consumption in children and adolescents.
Magarey, Anthea; Yaxley, Alison; Markow, Kylie; Baulderstone, Lauren; Miller, Michelle
2014-08-01
To identify and critique tools that assess Ca and/or dairy intake in children to ascertain the most accurate and reliable tools available. A systematic review of the literature was conducted using defined inclusion and exclusion criteria. Articles were included on the basis that they reported on a tool measuring Ca and/or dairy intake in children in Western countries and reported on originally developed tools or tested the validity or reliability of existing tools. Defined criteria for reporting reliability and validity properties were applied. Studies in Western countries. Children. Eighteen papers reporting on two tools that assessed dairy intake, ten that assessed Ca intake and five that assessed both dairy and Ca were identified. An examination of tool testing revealed high reliance on lower-order tests such as correlation and failure to differentiate between statistical and clinically meaningful significance. Only half of the tools were tested for reliability and results indicated that only one Ca tool and one dairy tool were reliable. Validation studies showed acceptable levels of agreement (<100 mg difference) and/or sensitivity (62-83 %) and specificity (55-77 %) in three Ca tools. With reference to the testing methodology and results, no tools were considered both valid and reliable for the assessment of dairy intake and only one tool proved valid and reliable for the assessment of Ca intake. These results clearly indicate the need for development and rigorous testing of tools to assess Ca and/or dairy intake in children and adolescents.
Singh, Amika S; Vik, Froydis N; Chinapaw, Mai J M; Uijtdewilligen, Léonie; Verloigne, Maïté; Fernández-Alvira, Juan M; Stomfai, Sarolta; Manios, Yannis; Martens, Marloes; Brug, Johannes
2011-12-09
Insight in children's energy balance-related behaviours (EBRBs) and their determinants is important to inform obesity prevention research. Therefore, reliable and valid tools to measure these variables in large-scale population research are needed. To examine the test-retest reliability and construct validity of the child questionnaire used in the ENERGY-project, measuring EBRBs and their potential determinants among 10-12 year old children. We collected data among 10-12 year old children (n = 730 in the test-retest reliability study; n = 96 in the construct validity study) in six European countries, i.e. Belgium, Greece, Hungary, the Netherlands, Norway, and Spain. Test-retest reliability was assessed using the intra-class correlation coefficient (ICC) and percentage agreement comparing scores from two measurements, administered one week apart. To assess construct validity, the agreement between questionnaire responses and a subsequent face-to-face interview was assessed using ICC and percentage agreement. Of the 150 questionnaire items, 115 (77%) showed good to excellent test-retest reliability as indicated by ICCs > .60 or percentage agreement ≥ 75%. Test-retest reliability was moderate for 34 items (23%) and poor for one item. Construct validity appeared to be good to excellent for 70 (47%) of the 150 items, as indicated by ICCs > .60 or percentage agreement ≥ 75%. From the other 80 items, construct validity was moderate for 39 (26%) and poor for 41 items (27%). Our results demonstrate that the ENERGY-child questionnaire, assessing EBRBs of the child as well as personal, family, and school-environmental determinants related to these EBRBs, has good test-retest reliability and moderate to good construct validity for the large majority of items. PMID:22152048
Strand, Edythe A; McCauley, Rebecca J; Weigand, Stephen D; Stoeckel, Ruth E; Baas, Becky S
2013-04-01
In this article, the authors report reliability and validity evidence for the Dynamic Evaluation of Motor Speech Skill (DEMSS), a new test that uses dynamic assessment to aid in the differential diagnosis of childhood apraxia of speech (CAS). Participants were 81 children between 36 and 79 months of age who were referred to the Mayo Clinic for diagnosis of speech sound disorders. Children were given the DEMSS and a standard speech and language test battery as part of routine evaluations. Subsequently, intrajudge, interjudge, and test-retest reliability were evaluated for a subset of participants. Construct validity was explored for all 81 participants through the use of agglomerative cluster analysis, sensitivity measures, and likelihood ratios. The mean percentage of agreement for 171 judgments was 89% for test-retest reliability, 89% for intrajudge reliability, and 91% for interjudge reliability. Agglomerative hierarchical cluster analysis showed that total DEMSS scores largely differentiated clusters of children with CAS vs. mild CAS vs. other speech disorders. Positive and negative likelihood ratios and measures of sensitivity and specificity suggested that the DEMSS does not overdiagnose CAS but sometimes fails to identify children with CAS. The value of the DEMSS in differential diagnosis of severe speech impairments was supported on the basis of evidence of reliability and validity.
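The record above interprets positive and negative likelihood ratios alongside sensitivity and specificity. The sketch below applies the standard definitions LR+ = sensitivity / (1 − specificity) and LR− = (1 − sensitivity) / specificity; the function name and the example figures are hypothetical and are not the DEMSS results.

```python
import math


def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a test with the given sensitivity and specificity."""
    if not (0.0 <= sensitivity <= 1.0 and 0.0 < specificity <= 1.0):
        raise ValueError("sensitivity must be in [0, 1] and specificity in (0, 1]")
    # LR+ is unbounded when there are no false positives at all.
    lr_positive = math.inf if specificity == 1.0 else sensitivity / (1.0 - specificity)
    lr_negative = (1.0 - sensitivity) / specificity
    return lr_positive, lr_negative


# Hypothetical profile: very few false positives, but some true cases are missed.
lr_pos, lr_neg = likelihood_ratios(sensitivity=0.65, specificity=0.97)
```

A large LR+ combined with an LR− that is not close to zero is the numerical signature of a test that rarely overdiagnoses yet sometimes fails to identify affected children.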
Baradaran, Aslan; Ebrahimzadeh, Mohammad H; Birjandinejad, Ali; Kachooei, Amir Reza
2016-04-01
Prospective study. We aimed to validate the Persian version of the modified Oswestry disability questionnaire (MODQ) in patients with low back pain. Modified Oswestry low back pain disability questionnaire is a well-known condition-specific outcome measure that helps quantify disability in patients with lumbar syndromes. To test the validity in a pilot study, the Persian MODQ was administered to 25 individuals with low back pain. We then enrolled 200 consecutive patients with low back pain to fill the Persian MODQ as well as the short form 36 (SF-36) questionnaire. Convergent validity of the MODQ was tested using the Spearman's correlation coefficient between the MODQ and SF-36 subscales. Intraclass correlation coefficient (ICC) and Cronbach's α coefficient were measured to test the reliability between test and retest and internal consistency of all items, respectively. ICC for individual items ranged from 0.43 to 0.80 showing good reliability and reproducibility of each individual item. Cronbach's α coefficient was 0.69 showing good internal consistency across all 10 items of the Persian MODQ. Total MODQ score showed moderate to strong correlation with the eight subscales and the two domains of the SF-36. The highest correlation was between the MODQ and the physical functioning subscale of the SF-36 (r=-0.54, p<0.001) and the physical component domain of the SF-36 (r=-0.55, p<0.001) showing that MODQ is measuring what it is supposed to measure in terms of disability and physical function. Persian version of the MODQ is a valid and reliable tool for the assessment of the disability following low back pain.
Validation of the Brazilian Portuguese Version of Geriatric Anxiety Inventory--GAI-BR.
Massena, Patrícia Nitschke; de Araújo, Narahyana Bom; Pachana, Nancy; Laks, Jerson; de Pádua, Analuiza Camozzato
2015-07-01
The Geriatric Anxiety Inventory (GAI) is a recently developed scale aiming to evaluate symptoms of anxiety in later life. This 20-item scale uses dichotomous answers highlighting non-somatic anxiety complaints of elderly people. The present study aimed to evaluate the psychometric properties of the Brazilian Portuguese version of the GAI (GAI-BR) in a sample drawn from the community and an outpatient psychogeriatric clinic. A mixed convenience sample of 72 subjects was recruited to answer the research protocol. The interview procedures were structured with questionnaires about sociodemographic data and clinical health status, previously validated anxiety and depression instruments, the Mini-Mental State Examination, the Mini International Neuropsychiatric Interview, and the GAI-BR. Twenty-two percent of the sample were interviewed twice for test-retest reliability. For internal consistency analyses, the Cronbach's α test was applied. The Spearman correlation test was applied to evaluate the test-retest GAI-BR reliability. A ROC (receiver operating characteristic) curve study was made to estimate the GAI-BR area under curve, cut-off points, sensitivity, and specificity for the Generalized Anxiety Disorder diagnosis. The GAI-BR version showed high internal consistency (Cronbach's α = 0.91) and strong and significant test-retest reliability (ρ = 0.85, p < 0.001). It also showed moderate and significant correlation with the Beck Anxiety Inventory (ρ = 0.68, p < 0.001) and the State-Trait Anxiety Inventory (ρ = 0.61, p < 0.001) showing evidence of concurrent validation. The cut-off point of 13 estimated by ROC curve analyses showed sensitivity of 83.3% and specificity of 84.6% to detect Generalized Anxiety Disorder (DSM-IV). GAI-BR has demonstrated very good psychometric properties and can be a reliable instrument to measure anxiety in Brazilian elderly people.
Calès, Paul; Halfon, Philippe; Batisse, Dominique; Carrat, Fabrice; Perré, Philippe; Penaranda, Guillaume; Guyader, Dominique; d'Alteroche, Louis; Fouchard-Hubert, Isabelle; Michelet, Christian; Veillon, Pascal; Lambert, Jérôme; Weiss, Laurence; Salmon, Dominique; Cacoub, Patrice
2010-08-01
We compared 5 non-specific and 2 specific blood tests for liver fibrosis in HCV/HIV co-infection. Four hundred and sixty-seven patients were included into derivation (n=183) or validation (n=284) populations. Within these populations, the diagnostic target, significant fibrosis (Metavir F ≥ 2), was found in 66% and 72% of the patients, respectively. Two new fibrosis tests, FibroMeter HICV and HICV test, were constructed in the derivation population. Unadjusted AUROCs in the derivation population were: APRI: 0.716, Fib-4: 0.722, Fibrotest: 0.778, Hepascore: 0.779, FibroMeter: 0.783, HICV test: 0.822, FibroMeter HICV: 0.828. AUROCs adjusted on classification and distribution of fibrosis stages in a reference population showed similar values in both populations. FibroMeter, FibroMeter HICV and HICV test had the highest correct classification rates in F0/1 and F3/4 (which account for high predictive values): 77-79% vs. 70-72% in the other tests (p=0.002). Reliable individual diagnosis based on predictive values ≥ 90% distinguished three test categories: poorly reliable: Fib-4 (2.4% of patients), APRI (8.9%); moderately reliable: Fibrotest (25.4%), FibroMeter (26.6%), Hepascore (30.2%); acceptably reliable: HICV test (40.2%), FibroMeter HICV (45.6%) (p<10⁻³ between tests). FibroMeter HICV classified all patients into four reliable diagnosis intervals (≤F1, F1±1, ≥F1, ≥F2) with an overall accuracy of 93% vs. 79% (p<10⁻³) for a binary diagnosis of significant fibrosis. Tests designed for HCV infections are less effective in HIV/HCV infections. A specific test, like FibroMeter HICV, was the most interesting test for diagnostic accuracy, correct classification profile, and a reliable diagnosis. With reliable diagnosis intervals, liver biopsy can therefore be avoided in all patients. Copyright 2010 European Association for the Study of the Liver. Published by Elsevier B.V. All rights reserved.
Cannon, Joanna E; Hubley, Anita M; Millhoff, Courtney; Mazlouman, Shahla
2016-01-01
The aim of the current study was to gather validation evidence for the Comprehension of Written Grammar (CWG; Easterbrooks, 2010) receptive test of 26 grammatical structures of English print for use with children who are deaf and hard of hearing (DHH). Reliability and validity data were collected for 98 participants (49 DHH and 49 hearing) in Grades 2-6. The objectives were to: (a) examine 4-week test-retest reliability data; and (b) provide evidence of known-groups validity by examining expected differences between the groups on the CWG vocabulary pretest and main test, as well as selected structures. Results indicated excellent test-retest reliability estimates for CWG test scores. DHH participants performed statistically significantly lower on the CWG vocabulary pretest and main test than the hearing participants. Significantly lower performance by DHH participants on most expected grammatical structures (e.g., basic sentence patterns, auxiliary "be" singular/plural forms, tense, comparatives, and complementation) also provided known groups evidence. Overall, the findings of this study showed strong evidence of the reliability of scores and known group-based validity of inferences made from the CWG. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Reliability systems for implantable cardiac defibrillator batteries
NASA Astrophysics Data System (ADS)
Takeuchi, Esther S.
The reliability of the power sources used in implantable cardiac defibrillators is critical due to the life-saving nature of the device. Achieving a high reliability power source depends on several systems functioning together. Appropriate cell design is the first step in assuring a reliable product. Qualification of critical components and of the cells using those components is done prior to their designation as implantable grade. Product consistency is assured by control of manufacturing practices and verified by sampling plans using both accelerated and real-time testing. Results to date show that lithium/silver vanadium oxide cells used for implantable cardiac defibrillators have a calculated maximum random failure rate of 0.005% per test month.
Test-retest reliability of the multifocal photopic negative response.
Van Alstine, Anthony W; Viswanathan, Suresh
2017-02-01
To assess the test-retest reliability of the multifocal photopic negative response (mfPhNR) of normal human subjects. Multifocal electroretinograms were recorded from one eye of 61 healthy adult subjects on two separate days using a Visual Evoked Response Imaging System software version 4.3 (EDI, San Mateo, California). The visual stimulus delivered on a 75-Hz monitor consisted of seven equal-sized hexagons each subtending 12° of visual angle. The m-step exponent was 9, and the m-sequence was slowed to include at least 30 blank frames after each flash. Only the first slice of the first-order kernel was analyzed. The mfPhNR amplitude was measured at a fixed time in the trough from baseline (BT) as well as at the same fixed time in the trough from the preceding b-wave peak (PT). Additionally, we also analyzed BT normalized either to PT (BT/PT) or to the b-wave amplitude (BT/b-wave). The relative reliability of test-retest differences for each test location was estimated by the Wilcoxon matched-pair signed-rank test and intraclass correlation coefficients (ICC). Absolute test-retest reliability was estimated by Bland-Altman analysis. The test-retest amplitude differences for neither of the two measurement techniques were statistically significant as determined by Wilcoxon matched-pair signed-rank test. PT measurements showed greater ICC values than BT amplitude measurements for all test locations. For each measurement technique, the ICC value of the macular response was greater than that of the surrounding locations. The mean test-retest difference was close to zero for both techniques at each of the test locations, and while the coefficient of reliability (COR; 1.96 times the standard deviation of the test-retest difference) was comparable for the two techniques at each test location when expressed in nanovolts, the %COR (COR normalized to the mean test and retest amplitudes) was better for PT than for BT measurements.
The ICC and COR were comparable for the BT/PT and BT/b-wave ratios and were better than the ICC and COR for BT but worse than PT. mfPhNR amplitude measured at a fixed time in the trough from the preceding b-wave peak (PT) shows greater test-retest reliability when compared to amplitude measurement from baseline (BT) or BT amplitude normalized to either the PT or b-wave amplitudes.
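The Bland-Altman quantities described in this abstract can be sketched as follows. This is an illustrative sketch only, not the authors' analysis code: the function name and the amplitude values are hypothetical, and normalizing COR by the pooled mean of test and retest amplitudes is an assumed reading of "%COR".

```python
import numpy as np

def coefficient_of_reliability(test, retest):
    """Return (mean difference, COR, %COR) for paired test-retest values.

    COR is 1.96 times the SD of the test-retest differences.
    %COR divides COR by the pooled mean amplitude (assumed normalization).
    """
    test = np.asarray(test, dtype=float)
    retest = np.asarray(retest, dtype=float)
    diff = retest - test
    mean_diff = diff.mean()
    cor = 1.96 * diff.std(ddof=1)          # sample SD of the differences
    pooled_mean = np.concatenate([test, retest]).mean()
    pct_cor = 100.0 * cor / pooled_mean
    return mean_diff, cor, pct_cor

# Hypothetical amplitudes in nanovolts, for illustration only
test = [100.0, 120.0, 90.0, 110.0]
retest = [104.0, 116.0, 94.0, 106.0]
mean_diff, cor, pct_cor = coefficient_of_reliability(test, retest)
```

Expressing COR as a percentage of the mean amplitude is what allows techniques with different absolute amplitudes, such as BT and PT, to be compared on a common scale.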
Age-Related Differences in Test-Retest Reliability in Resting-State Brain Functional Connectivity
Song, Jie; Desphande, Alok S.; Meier, Timothy B.; Tudorascu, Dana L.; Vergun, Svyatoslav; Nair, Veena A.; Biswal, Bharat B.; Meyerand, Mary E.; Birn, Rasmus M.; Bellec, Pierre; Prabhakaran, Vivek
2012-01-01
Resting-state functional MRI (rs-fMRI) has emerged as a powerful tool for investigating brain functional connectivity (FC). Research in recent years has focused on assessing the reliability of FC across younger subjects within and between scan-sessions. Test-retest reliability in resting-state functional connectivity (RSFC) has not yet been examined in older adults. In this study, we investigated age-related differences in reliability and stability of RSFC across scans. In addition, we examined how global signal regression (GSR) affects RSFC reliability and stability. Three separate resting-state scans from 29 younger adults (18–35 yrs) and 26 older adults (55–85 yrs) were obtained from the International Consortium for Brain Mapping (ICBM) dataset made publicly available as part of the 1000 Functional Connectomes project www.nitrc.org/projects/fcon_1000. 92 regions of interest (ROIs) with 5 cubic mm radius, derived from the default, cingulo-opercular, fronto-parietal and sensorimotor networks, were previously defined based on a recent study. Mean time series were extracted from each of the 92 ROIs from each scan and three matrices of z-transformed correlation coefficients were created for each subject, which were then used for evaluation of multi-scan reliability and stability. The young group showed higher reliability of RSFC than the old group with GSR (p-value = 0.028) and without GSR (p-value <0.001). Both groups showed a high degree of multi-scan stability of RSFC and no significant differences were found between groups. By comparing the test-retest reliability of RSFC with and without GSR across scans, we found a significantly higher proportion of reliable connections in both groups without GSR, but decreased stability.
Our results suggest that aging is associated with reduced reliability of RSFC which itself is highly stable within-subject across scans for both groups, and that GSR reduces the overall reliability but increases the stability in both age groups and could potentially alter group differences of RSFC. PMID:23227153
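The step of turning ROI mean time series into a matrix of z-transformed correlation coefficients can be illustrated with a short sketch. It assumes the z-transform meant here is the Fisher transform (arctanh of Pearson r). The function name and the random stand-in data are hypothetical, and zeroing the diagonal is a choice of this sketch rather than something the abstract states.

```python
import numpy as np

def fisher_z_connectivity(time_series):
    """time_series: array of shape (n_timepoints, n_rois).

    Returns an (n_rois, n_rois) matrix of Fisher z-transformed
    Pearson correlations between ROI time series.
    """
    r = np.corrcoef(time_series, rowvar=False)
    np.fill_diagonal(r, 0.0)   # self-correlation of 1 would map to infinity
    return np.arctanh(r)       # Fisher z-transform

# Random stand-in for one scan: 200 time points, 92 ROIs (illustrative only)
rng = np.random.default_rng(0)
scan = rng.standard_normal((200, 92))
z_matrix = fisher_z_connectivity(scan)
```

With one such matrix per scan, reliability across the three scans can then be assessed connection by connection.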
Wixted, John T; Mickes, Laura; Fisher, Ronald P
2018-05-01
The available real-world evidence suggests that, on an initial test, eyewitness memory is often reliable. Ironically, even the DNA exoneration cases, which generally involved nonpristine testing conditions and which are usually construed as an indictment of eyewitness memory, show how reliable an initial test of eyewitness memory can be in the real world. We endorse the use of pristine testing procedures, but their absence does not automatically imply that eyewitness memory is unreliable.
Roaldsen, Kirsti Skavberg; Måøy, Åsa Blad; Jørgensen, Vivien; Stanghelle, Johan Kvalvik
2016-05-01
Translation of the Spinal Cord Injury Falls Concern Scale (SCI-FCS), and investigation of test-retest reliability on item-level and total-score-level. Translation, adaptation and test-retest study. A specialized rehabilitation setting in Norway. Fifty-four wheelchair users with a spinal cord injury. The median age of the cohort was 49 years, and the median number of years after injury was 13. Interventions/measurements: The SCI-FCS was translated and back-translated according to guidelines. Individuals answered the SCI-FCS twice over the course of one week. We investigated item-level test-retest reliability using Svensson's rank-based statistical method for disagreement analysis of paired ordinal data. For relative reliability, we analyzed the total-score-level test-retest reliability with intraclass correlation coefficients (ICC2.1), the standard error of measurement (SEM), and the smallest detectable change (SDC) for absolute reliability/measurement-error assessment and Cronbach's alpha for internal consistency. All items showed satisfactory percentage agreement (≥69%) between test and retest. There were small but non-negligible systematic disagreements among three items; we found an 11-13% higher chance for a lower second score. There was no disagreement due to random variance. The test-retest agreement (ICC2.1) was excellent (0.83). The SEM was 2.6 (12%), and the SDC was 7.1 (32%). The Cronbach's alpha was high (0.88). The Norwegian SCI-FCS is highly reliable for wheelchair users with chronic spinal cord injuries.
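The relationship between the SEM and the SDC can be made concrete with a small sketch. It uses the commonly applied formulas SEM = SD × √(1 − ICC) and SDC = 1.96 × √2 × SEM; the abstract does not state which formulas the authors used, so this is an assumption, and the function names are hypothetical.

```python
import math

def sem_from_icc(sd, icc):
    """Standard error of measurement from a score SD and a reliability coefficient."""
    return sd * math.sqrt(1.0 - icc)

def smallest_detectable_change(sem, z=1.96):
    """Smallest detectable change at the confidence level implied by z."""
    return z * math.sqrt(2.0) * sem

# Using the SEM reported in the abstract (2.6 scale points)
sdc = smallest_detectable_change(2.6)
```

Under these assumed formulas the result is close to, but not exactly, the reported SDC of 7.1; the small gap may reflect rounding of the SEM before publication.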
Merritt, Victoria C; Bradson, Megan L; Meyer, Jessica E; Arnett, Peter A
2018-05-01
The Immediate Post-Concussion Assessment and Cognitive Testing (ImPACT) is a commonly used tool in sports concussion assessment. While test-retest reliabilities have been established for the ImPACT cognitive composites, few studies have evaluated the psychometric properties of the ImPACT's Post-Concussion Symptom Scale (PCSS). The purpose of this study was to establish the test-retest reliability of symptom indices associated with the PCSS. Participants included 38 undergraduate students (50.0% male) who underwent neuropsychological testing as part of their participation in their psychology department's research subject pool. The majority of the participants were Caucasian (94.7%) and had no history of concussion (73.7%). All participants completed the ImPACT at two time points, approximately 6 weeks apart. The PCSS was the main outcome measure, and eight symptom indices were calculated (a total symptom score, three symptom summary indices, and four symptom clusters). Pearson correlations (r) and intraclass correlation coefficients (ICCs) were computed as measures of test-retest reliability. Overall, reliabilities ranged from low to high (r = .44 to .80; ICC = .44 to .77). The cognitive symptom cluster exhibited the highest test-retest reliability (r = .80, ICC = .77), followed by the positive symptom total (PST) index, an indicator of the total number of symptoms endorsed (r = .71, ICC = .69). In contrast, the commonly used total symptom score showed lower test-retest reliability (r = .67, ICC = .62). Paired-samples t tests revealed no significant differences between test and retest for any of the symptom variables (all p > .01). Finally, reliable change indices (RCI) were computed to determine whether differences observed between test and retest represented clinically significant change. RCI values were provided for each symptom index at the 80%, 90%, and 95% confidence intervals. 
These results suggest that evaluating additional symptom indices beyond the total symptom score from the PCSS is beneficial. Findings from this study can be applied to athlete samples to assess reliable change in symptoms following concussion.
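Reliable change thresholds at several confidence levels can be sketched as below. This follows the widely used Jacobson-Truax form, where the standard error of the difference is √2 × SD × √(1 − r); the abstract does not say which RCI variant was used, and the baseline SD of 10 is a hypothetical value chosen for illustration. Only the reliability of .62 (the ICC reported for the total symptom score) comes from the abstract.

```python
import math
from statistics import NormalDist

def rci_thresholds(sd_baseline, reliability, levels=(0.80, 0.90, 0.95)):
    """Return {confidence level: minimum score change counted as reliable}."""
    sem = sd_baseline * math.sqrt(1.0 - reliability)
    s_diff = math.sqrt(2.0) * sem              # standard error of the difference
    thresholds = {}
    for level in levels:
        z = NormalDist().inv_cdf(0.5 + level / 2.0)   # two-sided critical value
        thresholds[level] = z * s_diff
    return thresholds

# Hypothetical baseline SD; reliability taken from the reported total-score ICC
thresholds = rci_thresholds(sd_baseline=10.0, reliability=0.62)
```

A retest change smaller than the threshold for a given confidence level would be treated as within measurement error at that level.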
Jung, Sung-Hoon; Kwon, Oh-Yun; Jeon, In-Cheol; Hwang, Ui-Jae; Weon, Jong-Hyuck
2018-01-01
The purposes of this study were to determine the intra-rater test-retest reliability of a smart phone-based measurement tool (SBMT) and a three-dimensional (3D) motion analysis system for measuring the transverse rotation angle of the pelvis during single-leg lifting (SLL) and the criterion validity of the transverse rotation angle of the pelvis measurement using SBMT compared with a 3D motion analysis system (3DMAS). Seventeen healthy volunteers performed SLL with their dominant leg without bending the knee until they reached a target placed 20 cm above the table. This study used a 3DMAS, considered the gold standard, to measure the transverse rotation angle of the pelvis to assess the criterion validity of the SBMT measurement. Intra-rater test-retest reliability was determined using the SBMT and 3DMAS using intra-class correlation coefficient (ICC) [3,1] values. The criterion validity of the SBMT was assessed with ICC [3,1] values. Both the 3DMAS (ICC = 0.77) and SBMT (ICC = 0.83) showed excellent intra-rater test-retest reliability in the measurement of the transverse rotation angle of the pelvis during SLL in a supine position. Moreover, the SBMT showed an excellent correlation with the 3DMAS (ICC = 0.99). Measurement of the transverse rotation angle of the pelvis using the SBMT showed excellent reliability and criterion validity compared with the 3DMAS.
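For readers unfamiliar with the ICC [3,1] form named in this abstract, the following sketch computes it from a subjects-by-sessions table using the Shrout and Fleiss two-way mixed, consistency, single-measurement definition. The function name and example data are hypothetical and are not taken from the study.

```python
import numpy as np

def icc_3_1(scores):
    """scores: array of shape (n_subjects, k_sessions).

    ICC(3,1) = (MS_rows - MS_error) / (MS_rows + (k - 1) * MS_error)
    """
    x = np.asarray(scores, dtype=float)
    n, k = x.shape
    grand = x.mean()
    ss_rows = k * ((x.mean(axis=1) - grand) ** 2).sum()   # between subjects
    ss_cols = n * ((x.mean(axis=0) - grand) ** 2).sum()   # between sessions
    ss_error = ((x - grand) ** 2).sum() - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)

# Hypothetical pelvic rotation angles (degrees) for three subjects, two sessions
icc = icc_3_1([[1.0, 2.0], [2.0, 1.0], [3.0, 3.0]])
```

Because ICC(3,1) is a consistency coefficient, a constant offset between sessions does not lower it, which matters when it is used to compare two devices as in the criterion validity analysis.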
Alyusuf, Raja H; Prasad, Kameshwar; Abdel Satir, Ali M; Abalkhail, Ali A; Arora, Roopa K
2013-01-01
The exponential growth in the use of the internet as a learning resource, coupled with the varied quality of many websites, has led to a need to identify suitable websites for teaching purposes. The aim of this study is to develop and to validate a tool, which evaluates the quality of undergraduate medical educational websites; and apply it to the field of pathology. A tool was devised through several steps of item generation, reduction, weightage, pilot testing, post-pilot modification of the tool and validating the tool. Tool validation included measurement of inter-observer reliability; and generation of criterion related, construct related and content related validity. The validated tool was subsequently tested by applying it to a population of pathology websites. Reliability testing showed a high internal consistency reliability (Cronbach's alpha = 0.92), high inter-observer reliability (Pearson's correlation r = 0.88), intraclass correlation coefficient = 0.85 and κ =0.75. It showed high criterion related, construct related and content related validity. The tool showed moderately high concordance with the gold standard (κ =0.61); 92.2% sensitivity, 67.8% specificity, 75.6% positive predictive value and 88.9% negative predictive value. The validated tool was applied to 278 websites; 29.9% were rated as recommended, 41.0% as recommended with caution and 29.1% as not recommended. A systematic tool was devised to evaluate the quality of websites for medical educational purposes. The tool was shown to yield reliable and valid inferences through its application to pathology websites.
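The four accuracy figures reported against the gold standard all derive from a 2×2 table of tool ratings versus reference ratings. A minimal sketch of those definitions follows; the counts are hypothetical and chosen only to give round numbers, since the abstract does not report the underlying table.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Accuracy measures from a 2x2 table of tool rating vs. gold standard."""
    return {
        "sensitivity": tp / (tp + fn),   # true positives among reference positives
        "specificity": tn / (tn + fp),   # true negatives among reference negatives
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
    }

# Hypothetical counts for illustration only
metrics = diagnostic_metrics(tp=45, fp=15, fn=5, tn=35)
```

Sensitivity and specificity are properties of the tool, whereas the predictive values also depend on how common recommended websites are in the sample.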
Baert, Isabel A C; Lluch, Enrique; Struyf, Thomas; Peeters, Greta; Van Oosterwijck, Sophie; Tuynman, Joanna; Rufai, Salim; Struyf, Filip
2018-06-01
The therapeutic value of proprioceptive-based exercises in knee osteoarthritis (KOA) management warrants investigation of proprioceptive testing methods easily accessible in clinical practice. To estimate inter- and intrarater reliability of the knee joint position sense (KJPS) test and knee force sense (KFS) test in subjects with and without KOA. Cross-sectional test-retest design. Two blinded raters independently performed repeated measures of the KJPS and KFS test, using an analogue inclinometer and handheld dynamometer, respectively, in eight KOA patients (12 symptomatic knees) and 26 healthy controls (52 asymptomatic knees). Intraclass correlation coefficients (ICCs; model 2,1), standard error of measurement (SEM) and minimal detectable change with 95% confidence bounds (MDC95) were calculated. For KJPS, results showed good to excellent test-retest agreement (ICCs 0.70-0.95 in KOA patients; ICCs 0.65-0.85 in healthy controls). A 2° measurement error (SEM 1°) was reported when measuring KJPS in multiple test positions and calculating mean repositioning error. When testing KOA patients pre and post therapy, a repositioning error larger than 4° (MDC95) is needed to consider true change. Measuring KFS using handheld dynamometry showed poor to fair interrater and poor to excellent intrarater reliability in subjects with and without KOA. Measuring KJPS in multiple test positions using an analogue inclinometer and calculating mean repositioning error is reliable and can be used in clinical practice. We do not recommend the use of the KFS test to clinicians. Further research is required to establish diagnostic accuracy and validity of our KJPS test in larger knee pain populations. Copyright © 2017 Elsevier Ltd. All rights reserved.
López-Pina, José Antonio; Sánchez-Meca, Julio; López-López, José Antonio; Marín-Martínez, Fulgencio; Núñez-Núñez, Rosa Ma; Rosa-Alcázar, Ana I; Gómez-Conesa, Antonia; Ferrer-Requena, Josefa
2015-01-01
The Yale-Brown Obsessive-Compulsive Scale for children and adolescents (CY-BOCS) is a frequently applied test to assess obsessive-compulsive symptoms. We conducted a reliability generalization meta-analysis on the CY-BOCS to estimate the average reliability, search for reliability moderators, and propose a predictive model that researchers and clinicians can use to estimate the expected reliability of the CY-BOCS scores. A total of 47 studies reporting a reliability coefficient with the data at hand were included in the meta-analysis. The results showed good reliability and a large variability associated with the standard deviation of total scores and sample size.
van der Meulen, Ineke; van de Sandt-Koenderman, W Mieke E; Duivenvoorden, Hugo J; Ribbers, Gerard M
2010-01-01
This study explores the psychometric qualities of the Scenario Test, a new test to assess daily-life communication in severe aphasia. The test is innovative in that it: (1) examines the effectiveness of verbal and non-verbal communication; and (2) assesses patients' communication in an interactive setting, with a supportive communication partner. To determine the reliability, validity, and sensitivity to change of the Scenario Test and discuss its clinical value. The Scenario Test was administered to 122 persons with aphasia after stroke and to 25 non-aphasic controls. Analyses were performed for the entire group of persons with aphasia, as well as for a subgroup of persons unable to communicate verbally (n = 43). Reliability (internal consistency, test-retest reliability, inter-judge, and intra-judge reliability) and validity (internal validity, convergent validity, known-groups validity) and sensitivity to change were examined using standard psychometric methods. The Scenario Test showed high levels of reliability. Internal consistency (Cronbach's alpha = 0.96; item-rest correlations = 0.58-0.82) and test-retest reliability (ICC = 0.98) were high. Agreement between judges in total scores was good, as indicated by the high inter- and intra-judge reliability (ICC = 0.86-1.00). Agreement in scores on the individual items was also good (square-weighted kappa values 0.61-0.92). The test demonstrated good levels of validity. A principal component analysis for categorical data identified two dimensions, interpreted as general communication and communicative creativity. Correlations with three other instruments measuring communication in aphasia, that is, Spontaneous Speech interview from the Aachen Aphasia Test (AAT), Amsterdam-Nijmegen Everyday Language Test (ANELT), and Communicative Effectiveness Index (CETI), were moderate to strong (0.50-0.85) suggesting good convergent validity. 
Group differences were observed between persons with aphasia and non-aphasic controls, as well as between persons with aphasia unable to use speech to convey information and those able to communicate verbally; this indicates good known-groups validity. The test was sensitive to changes in performance, measured over a period of 6 months. The data support the reliability and validity of the Scenario Test as an instrument for examining daily-life communication in aphasia. The test focuses on multimodal communication; its psychometric qualities enable future studies on the effect of Alternative and Augmentative Communication (AAC) training in aphasia.
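Internal consistency, reported here as Cronbach's alpha, can be computed from a respondents-by-items score table. The sketch below uses the standard variance-based formula; the function name and example scores are hypothetical and do not come from the Scenario Test data.

```python
import numpy as np

def cronbach_alpha(item_scores):
    """item_scores: array of shape (n_respondents, n_items)."""
    x = np.asarray(item_scores, dtype=float)
    k = x.shape[1]
    sum_item_var = x.var(axis=0, ddof=1).sum()   # sum of per-item variances
    total_var = x.sum(axis=1).var(ddof=1)        # variance of total scores
    return k / (k - 1) * (1.0 - sum_item_var / total_var)

# Hypothetical scores: three respondents, three perfectly consistent items
alpha = cronbach_alpha([[1, 1, 1], [2, 2, 2], [3, 3, 3]])
```

Alpha approaches 1 as items covary more strongly, so a very high value such as the 0.96 reported above indicates that the items rank patients in nearly the same order.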
Determining the Appropriateness of the "What If" Situations Test (WIST) with Turkish Pre-Schoolers.
Citak Tunc, Gulseren; Gorak, Gulay; Ozyazicioglu, Nurcan; Ak, Bedriye; Isil, Ozlem; Vural, Pinar
2018-04-01
Measurement instruments are needed to assess child sexual abuse prevention programs. The purpose of the study was to determine the reliability and validity of the WIST (What If Situations Test) for Turkish culture. Participants were children of the 3-6 age group attending pre-school education institutions and the sample size was identified by means of a power analysis. Seventy children were identified as the sample with 0.85 power and 0.05 type I error according to the power analysis. Language validity, content validity, internal validity coefficient (Cronbach alpha coefficient), and test-retest analyses were conducted in terms of validity and reliability in the scope of efforts for adaptation to Turkish culture. Firstly, Kendall W = 0.83 was the score for the expert opinions concerning the content validity of the language validity scale. It was found that the Cronbach alpha coefficients were between 0.68 and 0.90 for the scale sub-dimensions of appropriate and inappropriate recognition, saying, doing, telling, and reporting. The test-retest reliability of the scale was found to be r = 0.89 and the test-retest reliabilities for the sub-dimensions (appropriate recognition, inappropriate recognition, say skills, do skills, tell skills, and reporting skills) were between r = 0.48 and r = 0.92. The test-retest reliability for the Personal Safety Questionnaire (PSQ), as having complementary items to the WIST, was found to be r = 0.82. The reliability and validity analysis of the 'What If' Situations Test (WIST), used to evaluate pre-schoolers' skills regarding self-protection against sexual abuse, showed that the Test's adaptation to Turkish culture was reliable and valid.
Almeida, Gabriel Peixoto Leão; das Neves Rodrigues, Helena Larissa; de Freitas, Bruno Wesley; de Paula Lima, Pedro Olavo
2017-12-01
Study Design Cross-sectional study. Background The Hip Stability Isometric Test (HipSIT) evaluates the strength of the hip posterolateral stabilizers in a position that favors greater activation of the gluteus maximus and gluteus medius and lower activation of the tensor fascia lata. Objectives To check the validity and reliability of the HipSIT and to evaluate the HipSIT in women with patellofemoral pain (PFP). Methods The HipSIT was evaluated with a handheld dynamometer. During testing, the participants were sidelying, with their legs positioned at 45° of hip flexion and 90° of knee flexion. Participants were instructed to raise the knee of the upper leg while keeping the upper and lower heels in contact. To establish reliability and validity, 49 women were tested with the HipSIT by 2 different evaluators on day 1, and then again 7 days later. The strength of the hip extensors, abductors, and external rotators was also evaluated. Twenty women with unilateral PFP were also evaluated. Results The HipSIT has excellent intrarater and interrater reliability. The standard error of measurement was 0.01 kgf/kg, and the minimal detectable change was 0.036 kgf/kg. The HipSIT showed good validity in isolated hip abduction, external rotation, and extension (P<.01). Women with PFP showed a 10% deficit in the HipSIT results for the symptomatic limb (P = .01). Conclusion The HipSIT showed excellent interrater and intrarater reliability, moderate to good validity in women, and was able to identify strength deficits in women with PFP. J Orthop Sports Phys Ther 2017;47(12):906-913. Epub 9 Oct 2017. doi:10.2519/jospt.2017.7274.
Malinowsky, Camilla; Kassberg, Ann-Charlotte; Larsson-Lund, Maria; Kottorp, Anders
2016-01-01
To evaluate the test-retest reliability of the Management of Everyday Technology Assessment (META) in a sample of people with acquired brain injury (ABI). The META was administered twice within a two-week period to 25 people with ABI. A Rasch measurement model was used to convert the META ordinal raw scores into equal-interval linear measures of each participant's ability to manage everyday technology (ET). Test-retest reliability of the stability of the person ability measures in the META was examined by a standardized difference Z-test and an intra-class correlations analysis (ICC 1). The results showed that the paired person ability measures generated from the META were stable over the test-retest period for 22 of the 25 subjects. The ICC 1 correlation was 0.63, which indicates good overall reliability. The META demonstrated acceptable test-retest reliability in a sample of people with ABI. The results illustrate the importance of using sufficiently challenging ETs (relative to a person's abilities) to generate stable META measurements over time. Implications for Rehabilitation The findings add evidence regarding the test-retest reliability of the person ability measures generated from the observation assessment META in a sample of people with ABI. The META might support professionals in the evaluation of interventions that are designed to improve clients' performance of activities including the ability to manage ET.
Rubio-Ochoa, J; Benítez-Martínez, J; Lluch, E; Santacruz-Zaragozá, S; Gómez-Contreras, P; Cook, C E
2016-02-01
It has been suggested that differential diagnosis of headaches should consist of a robust subjective examination and a detailed physical examination of the cervical spine. Cervicogenic headache (CGH) is a form of headache that involves referred pain from the neck. To our knowledge, no studies have summarized the reliability and diagnostic accuracy of physical examination tests for CGH. The aim of this study was to summarize the reliability and diagnostic accuracy of physical examination tests used to diagnose CGH. A systematic review following PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines was performed in four electronic databases (MEDLINE, Web of Science, Embase and Scopus). Full text reports concerning physical tests for the diagnosis of CGH which reported the clinometric properties for assessment of CGH, were included and screened for methodological quality. Quality Appraisal for Reliability Studies (QAREL) and Quality Assessment of Studies of Diagnostic Accuracy (QUADAS-2) scores were completed to assess article quality. Eight articles were retrieved for quality assessment and data extraction. Studies investigating diagnostic reliability of physical examination tests for CGH scored poorer on methodological quality (higher risk of bias) than those of diagnostic accuracy. There is sufficient evidence showing high levels of reliability and diagnostic accuracy of the selected physical examination tests for the diagnosis of CGH. The cervical flexion-rotation test (CFRT) exhibited both the highest reliability and the strongest diagnostic accuracy for the diagnosis of CGH. Copyright © 2015 Elsevier Ltd. All rights reserved.
Reliabilities of mental rotation tasks: limits to the assessment of individual differences.
Hirschfeld, Gerrit; Thielsch, Meinald T; Zernikow, Boris
2013-01-01
Mental rotation tasks with objects and body parts as targets are widely used in cognitive neuropsychology. Even though these tasks are well established to study between-groups differences, the reliability on an individual level is largely unknown. We present a systematic study on the internal consistency and test-retest reliability of individual differences in mental rotation tasks comparing different target types and orders of presentations. In total n = 99 participants (n = 63 for the retest) completed the mental rotation tasks with hands, feet, faces, and cars as targets. Different target types were presented in either randomly mixed blocks or blocks of homogeneous targets. Across all target types, the consistency (split-half reliability) and stability (test-retest reliabilities) were good or acceptable both for intercepts and slopes. At the level of individual targets, only intercepts showed acceptable reliabilities. Blocked presentations resulted in significantly faster and numerically more consistent and stable responses. Mental rotation tasks, especially in blocked variants, can be used to reliably assess individual differences in global processing speed. However, the assessment of the theoretically important slope parameter for individual targets requires further adaptations to mental rotation tests.
Apivatgaroon, Adinun; Angthong, Chayanin; Sanguanjit, Prakasit; Chernchujit, Bancha
2016-10-01
To develop a Thai version of the Kujala score and evaluate its validity and reliability. The Thai version of the Kujala score was developed using the forward-backward translation protocol. Forty-nine patients with patellofemoral pain syndrome (PFPS) answered the Thai versions of the questionnaires, including the Kujala score, Short Form-36 (SF-36) and International Knee Documentation Committee (IKDC) Subjective Knee Form. Validity was tested by correlating the scores. Reliability was assessed using test-retest reliability and internal consistency. The Thai version of the Kujala score showed a good correlation with the Thai IKDC Subjective Knee Form (Pearson's correlation coefficient; r = 0.74: p < 0.01) and moderate correlation with the Thai SF-36 subscales of physical component summary, total score and role physical (r = 0.586, 0.571 and 0.524, respectively: p < 0.01). The test-retest reliability was excellent with an intra-class correlation coefficient of 0.908 (p < 0.001; 95% CI [0.842-0.947]). The internal consistency was strong with Cronbach's alpha of 0.952 (p < 0.001). No floor and ceiling effects were observed. The Thai version of the Kujala score has shown good validity and reliability. This score can be effectively used for evaluating Thai patients with patellofemoral pain syndrome. Implications for Rehabilitation The Kujala score is a self-administered questionnaire for patients with PFPS. The validity and reliability of the Thai version of the Kujala score are comparable with those of other versions (Turkish, Chinese and Persian). The Thai version of the Kujala score has been shown to be valid and reliable in Thai PFPS patients and can be used for clinical evaluation and in research.
Amariles, Pedro; Pino-Marín, Daniel; Sabater-Hernández, Daniel; García-Jiménez, Emilio; Roig-Sánchez, Inés; Faus, María José
2016-11-01
To determine the test-retest reliability of a questionnaire, with preliminary validation, for assessing knowledge of cardiovascular risk (CVR) and cardiovascular disease in patients attending community pharmacies in Spain, and to complement its external validity by establishing the relationship between an educational activity and an increase in knowledge about CVR and cardiovascular disease. Sub-analysis of a controlled clinical study, EMDADER-CV, in which a questionnaire about knowledge concerning CVR was applied at 4 different times. Spanish community pharmacies. There were 323 patients in the control group, from the 640 who completed the study. Intraclass correlation coefficient to assess the reliability in 3 comparisons (post-educational activity with week 16, post-educational activity with week 32, and week 16 with week 32); and the non-parametric Friedman test to establish the relationship between an oral and written educational activity and increasing knowledge. For the 323 patients in the 3 comparisons, the intraclass correlation coefficient values were 0.624, 0.608 and 0.801, respectively (fair-good to excellent reliability). In addition, the Friedman test showed a statistically significant relationship between the educational activity and increased knowledge (p < .0001). According to the intraclass correlation coefficient, the questionnaire for assessing knowledge of CVR and cardiovascular disease has a reliability between acceptable and excellent, which, added to the previous validation, shows that the instrument meets the criteria of validity and reliability. Furthermore, the questionnaire was able to relate an increase in knowledge to an educational intervention, a feature that complements its external validity. Copyright © 2016 Elsevier España, S.L.U. All rights reserved.
De Blaiser, C; De Ridder, R; Willems, T; Danneels, L; Vanden Bossche, L; Palmans, T; Roosen, P
2018-02-01
The aims of this study were to investigate the amplitude and median frequency characteristics of selected abdominal, back, and hip muscles of healthy subjects during a prone bridging endurance test, based on surface electromyography (sEMG), (a) to determine whether the prone bridging test is a valid field test to measure abdominal muscle fatigue, and (b) to evaluate whether the current method of administering the prone bridging test is reliable. Thirty healthy subjects participated in this experiment. The sEMG activity of seven abdominal, back, and hip muscles was measured bilaterally. Normalized median frequencies were computed from the EMG power spectra. The prone bridging tests were repeated on separate days to evaluate inter- and intratester reliability. Significant differences in normalized median frequency slope (NMF slope) values between several abdominal, back, and hip muscles could be demonstrated. Moderate-to-high correlation coefficients were shown between NMF slope values and endurance time. Multiple backward linear regression revealed that the test endurance time could only be significantly predicted by the NMF slope of the rectus abdominis. Statistical analysis showed excellent reliability (ICC=0.87-0.89). The findings of this study support the validity and reliability of the prone bridging test for evaluating abdominal muscle fatigue. © 2017 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Ruiz-Cárdenas, Juan Diego; Rodríguez-Juan, Juan José; Smart, Rowan R; Jakobi, Jennifer M; Jones, Gareth R
2018-01-01
The purposes of this study were to (i) analyze the concurrent validity and reliability of an iPhone App for measuring time, velocity and power during a single sit-to-stand (STS) test compared with measurements recorded from a force plate; and (ii) evaluate the relationship of the iPhone App measures with age and functional performance. Forty-eight healthy individuals (age range: 26-81 years) were recruited. All participants completed a STS test on a force plate with the movement recorded on an iPhone 6 at 240 frames-per-second. Functional ability was also measured using isometric handgrip strength and self-paced walking time tests. Intraclass correlation coefficients (ICC), Pearson's correlation coefficient, Cronbach's alpha (α) and Bland-Altman plots with 95% confidence intervals (CI) were used to test validity and reliability between instruments. The results showed good agreement between instruments for all STS measurement variables: time (ICC=0.864, 95%CI=0.77-0.92; α=0.926), velocity (ICC=0.912, 95%CI=0.85-0.95; α=0.953) and power (ICC=0.846, 95%CI=0.74-0.91; α=0.917), with no systematic bias between instruments for any variable analyzed. STS time, velocity and power derived from the iPhone App show moderate to strong associations with age (|r|=0.63-0.83) and handgrip strength (|r|=0.4-0.64) but not with the walking test. The results of this study indicate that this iPhone App is reliable for measuring STS and that the derived values of time, velocity and power show strong associations with age and handgrip strength. Copyright © 2017 Elsevier B.V. All rights reserved.
Gunaydin, Gurkan; Citaker, Seyit; Meray, Jale; Cobanoglu, Gamze; Gunaydin, Ozge Ece; Hazar Kanik, Zeynep
2016-11-01
Validation of a self-report questionnaire. The purpose of this study was to investigate the adaptation, validity, and reliability of the Turkish version of the Bournemouth Questionnaire. Low back pain is one of the most frequent disorders leading to activity limitation and affects most people at some point in their lives. Precise use of assessment questionnaires is essential for evaluating a patient's functional abilities and for deciding on a successful course of therapy. One hundred ten patients with chronic low back pain were included in the present study. To assess reliability, test-retest and internal consistency analyses were applied. The results of the test-retest analysis were assessed using the intraclass correlation coefficient (95% confidence interval). For internal consistency, the Cronbach alpha value was calculated. Validity of the questionnaire was assessed in terms of construct validity, for which factor analysis and convergent validity were tested. For convergent validity, total scores on the Bournemouth Questionnaire were compared with total scores on the Quebec Back Pain Disability Scale and the Roland Morris Disability Questionnaire using Pearson correlation coefficient analysis. The Cronbach alpha value was 0.914, showing that the questionnaire has high internal consistency. The test-retest results ranged between 0.851 and 0.927, which shows that test-retest results are highly correlated. Factor analysis indicated that the questionnaire had one factor. The Pearson correlation coefficient of the Bournemouth Questionnaire was 0.703 with the Roland Morris Disability Questionnaire and 0.659 with the Quebec Back Pain Disability Scale. These results showed that the Bournemouth Questionnaire correlates well with the Roland Morris Disability Questionnaire and the Quebec Back Pain Disability Scale. The Turkish version of the Bournemouth Questionnaire is valid and reliable. Level of Evidence: 3.
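The internal consistency statistic named in this abstract, Cronbach's alpha, can be illustrated with a short sketch. This is the standard textbook formula, not the authors' analysis; the function name cronbach_alpha and the tiny score matrix are hypothetical and chosen only for illustration.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score)),
    using sample variances throughout.
    """
    k = len(scores[0])  # number of items
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical data: 3 respondents, 2 items (not taken from the study).
example = [[1, 2], [2, 1], [3, 3]]
print(round(cronbach_alpha(example), 3))
```

Alpha rises as items covary more strongly relative to their individual variances, which is why a single-factor questionnaire such as the one described tends to score high.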
The development and validation of a test of science critical thinking for fifth graders.
Mapeala, Ruslan; Siew, Nyet Moi
2015-01-01
The paper described the development and validation of the Test of Science Critical Thinking (TSCT) to measure the three critical thinking skill constructs: comparing and contrasting, sequencing, and identifying cause and effect. The initial TSCT consisted of 55 multiple choice test items, each of which required participants to select a correct response and a correct choice of critical thinking used for their response. Data were obtained from a purposive sampling of 30 fifth graders in a pilot study carried out in a primary school in Sabah, Malaysia. Students underwent the sessions of teaching and learning activities for 9 weeks using the Thinking Maps-aided Problem-Based Learning Module before they answered the TSCT test. Analyses were conducted to check on difficulty index (p) and discrimination index (d), internal consistency reliability, content validity, and face validity. Analysis of the test-retest reliability data was conducted separately for a group of fifth graders with similar ability. Findings of the pilot study showed that out of initial 55 administered items, only 30 items with relatively good difficulty index (p) ranged from 0.40 to 0.60 and with good discrimination index (d) ranged within 0.20-1.00 were selected. The Kuder-Richardson reliability value was found to be appropriate and relatively high with 0.70, 0.73 and 0.92 for identifying cause and effect, sequencing, and comparing and contrasting respectively. The content validity index obtained from three expert judgments equalled or exceeded 0.95. In addition, test-retest reliability showed good, statistically significant correlations ([Formula: see text]). From the above results, the selected 30-item TSCT was found to have sufficient reliability and validity and would therefore represent a useful tool for measuring critical thinking ability among fifth graders in primary science.
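The item statistics mentioned in this abstract (difficulty index p, discrimination index d, and Kuder-Richardson reliability) can be sketched as follows. The abstract does not state which variants were used, so this sketch assumes the common upper/lower 27% group definition of d and the KR-20 formula with population variance; all function names and the response matrix are hypothetical.

```python
from statistics import pvariance


def item_difficulty(responses, item):
    """Proportion of examinees answering the item correctly (1 = correct)."""
    return sum(r[item] for r in responses) / len(responses)


def item_discrimination(responses, item, fraction=0.27):
    """Upper-group minus lower-group proportion correct (assumed 27% groups)."""
    ranked = sorted(responses, key=sum, reverse=True)
    g = max(1, round(len(ranked) * fraction))  # size of each extreme group
    upper = sum(r[item] for r in ranked[:g])
    lower = sum(r[item] for r in ranked[-g:])
    return (upper - lower) / g


def kr20(responses):
    """Kuder-Richardson 20 for dichotomous items (population variance assumed)."""
    k = len(responses[0])
    pq = 0.0
    for j in range(k):
        p = item_difficulty(responses, j)
        pq += p * (1 - p)
    total_var = pvariance([sum(r) for r in responses])
    return k / (k - 1) * (1 - pq / total_var)


# Hypothetical 4 examinees x 3 items, not data from the study.
data = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
print(item_difficulty(data, 1), item_discrimination(data, 0), kr20(data))
```

Under these assumptions, an item would pass the selection rule described in the abstract when its p falls in 0.40 to 0.60 and its d falls in 0.20 to 1.00.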
Lee, Kyoung Min; Lee, Jaebong; Chung, Chin Youb; Ahn, Soyeon; Sung, Ki Hyuk; Kim, Tae Won; Lee, Hui Jong; Park, Moon Seok
2012-06-01
Intra-class correlation coefficients (ICCs) provide a statistical means of testing reliability. However, their interpretation is not well documented in the orthopedic field. The purpose of this study was to investigate the use of ICCs in the orthopedic literature and to demonstrate pitfalls regarding their use. First, orthopedic articles that used ICCs were retrieved from the PubMed database, and journal demography, ICC models and concurrent statistics used were evaluated. Second, reliability testing was performed on three common physical examinations in cerebral palsy, namely, the Thomas test, the Staheli test, and popliteal angle measurement. Thirty patients were assessed by three orthopedic surgeons to explore the statistical methods for testing reliability. Third, the factors affecting the ICC values were examined by simulating data sets based on the physical examination data, in which the ranges, slopes, and interobserver variability were modified. Of the 92 orthopedic articles identified, 58 articles (63%) did not clarify the ICC model used, and only 5 articles (5%) described all models, types, and measures. In reliability testing, although the popliteal angle showed a larger mean absolute difference than the Thomas test and the Staheli test, the ICC of the popliteal angle was higher, which was believed to be contrary to the context of measurement. In addition, the ICC values were affected by the model, type, and measures used. In simulated data sets, the ICC showed higher values when the range of the data sets was larger, the slopes of the data sets were parallel, and the interobserver variability was smaller. Care should be taken when interpreting the absolute ICC values, i.e., a higher ICC does not necessarily mean less variability because the ICC values can also be affected by various factors. The authors recommend that researchers clarify the ICC models used and that ICC values be interpreted in the context of measurement.
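The range effect described in this abstract can be reproduced with a small simulation. The sketch below is not the authors' simulation: it assumes the Shrout and Fleiss ICC(2,1) form (two-way random effects, absolute agreement, single rater), and the function names, seed, and standard deviations are hypothetical choices for illustration.

```python
import random


def icc_2_1(ratings):
    """ICC(2,1) from a subjects x raters table (list of equal-length rows)."""
    n = len(ratings)        # subjects
    k = len(ratings[0])     # raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)                 # between-subjects mean square
    msc = ss_cols / (k - 1)                 # between-raters mean square
    mse = ss_err / ((n - 1) * (k - 1))      # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def simulate(true_sd, noise_sd, n=30, k=3, seed=0):
    """Hypothetical data: n subjects with true values, rated by k noisy raters."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        true_value = rng.gauss(50, true_sd)
        data.append([true_value + rng.gauss(0, noise_sd) for _ in range(k)])
    return data


# Same rater noise in both cases; only the spread of true values differs.
narrow = icc_2_1(simulate(true_sd=3, noise_sd=3))
wide = icc_2_1(simulate(true_sd=15, noise_sd=3))
print(f"narrow range ICC = {narrow:.3f}, wide range ICC = {wide:.3f}")
```

Because the rater noise is identical in both runs, the higher ICC for the wide-range data reflects between-subject spread rather than better agreement, which is the caution the abstract raises about reading absolute ICC values.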
Validity and reliability of a new tool to evaluate handwriting difficulties in Parkinson’s disease
Nackaerts, Evelien; Heremans, Elke; Smits-Engelsman, Bouwien C. M.; Broeder, Sanne; Vandenberghe, Wim; Bergmans, Bruno; Nieuwboer, Alice
2017-01-01
Background Handwriting in Parkinson’s disease (PD) features specific abnormalities which are difficult to assess in clinical practice since no specific tool for evaluation of spontaneous movement is currently available. Objective This study aims to validate the ‘Systematic Screening of Handwriting Difficulties’ (SOS-test) in patients with PD. Methods Handwriting performance of 87 patients and 26 healthy age-matched controls was examined using the SOS-test. Sixty-seven patients were tested a second time within a period of one month. Participants were asked to copy as much as possible of a text within 5 minutes with the instruction to write as neatly and quickly as in daily life. Writing speed (letters in 5 minutes), size (mm) and quality of handwriting were compared. Correlation analysis was performed between SOS outcomes and other fine motor skill measurements and disease characteristics. Intrarater, interrater and test-retest reliability were assessed using the intraclass correlation coefficient (ICC) and Spearman correlation coefficient. Results Patients with PD had a smaller (p = 0.043) and slower (p<0.001) handwriting and showed worse writing quality (p = 0.031) compared to controls. The outcomes of the SOS-test significantly correlated with fine motor skill performance and disease duration and severity. Furthermore, the test showed excellent intrarater, interrater and test-retest reliability (ICC > 0.769 for both groups). Conclusion The SOS-test is a short and effective tool to detect handwriting problems in PD with excellent reliability. It can therefore be recommended as a clinical instrument for standardized screening of handwriting deficits in PD. PMID:28253374
Müller-Staub, Maria; Lunney, Margaret; Lavin, Mary Ann; Needham, Ian; Odenbreit, Matthias; van Achterberg, Theo
2010-04-01
The Q-DIO instrument was developed in 2005 and 2006 to measure the quality of documented nursing diagnoses, interventions, and nursing-sensitive patient outcomes. The aim of the study was to test the psychometric properties of the Q-DIO (Quality of nursing Diagnoses, Interventions and Outcomes). Instrument testing included internal consistency, test-retest reliability, interrater reliability, item analyses, and an assessment of objectivity. To ensure variation in scores, a stratified random sample of 60 nursing documentations was drawn. The strata represented 30 nursing documentations with and 30 without application of theory-based, standardised nursing language. Internal consistency of the subscale nursing diagnoses as process showed Cronbach's Alpha 0.83 [0.78, 0.88]; nursing diagnoses as product 0.98 [0.94, 0.99]; nursing interventions 0.90 [0.85, 0.94]; and nursing-sensitive patient outcomes 0.99 [0.95, 0.99]. With Cohen's Kappa of 0.95, the intrarater reliability was good. The interrater reliability showed a Kappa of 0.94 [0.90, 0.96]. Item analyses confirmed the fulfilment of criteria for degree of difficulty and discriminative validity of the items. In this study, Q-DIO was shown to be a reliable instrument. It allows measuring the documented quality of nursing diagnoses, interventions and outcomes with and without implementation of theory-based, standardised nursing languages. Studies for further testing of Q-DIO in other settings are recommended. The results implicitly support the use of nursing classifications such as NANDA, NIC and NOC.
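Cohen's Kappa, used above for intrarater and interrater reliability, corrects observed agreement for the agreement expected by chance. The following is a minimal sketch of the unweighted two-rater statistic; whether the study used a weighted variant is not stated, and the function name and ratings are hypothetical.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equal-length lists of category labels."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    # Chance agreement: product of the two raters' marginal proportions.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical ratings of four documents by two raters.
print(cohen_kappa([1, 1, 0, 0], [1, 0, 0, 0]))
```

Kappa is undefined when expected agreement equals 1 (both raters always use the same single category), so real analyses need a guard for that case.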
Murray, Nicholas P.; Hunfalvay, Melissa; Bolte, Takumi
2017-01-01
Purpose The purpose of this study was to determine the reliability of interpupillary distance (IPD) and pupil diameter (PD) measures using an infrared eye tracker and central point stimuli. Validity of the test compared to known clinical tools was determined, and normative data was established against which individuals can measure themselves. Methods Participants (416) across various demographics were examined for normative data. Of these, 50 were examined for reliability and validity. Validity for IPD measured the test (RightEye IPD/PD) against the PL850 Pupilometer and the Essilor Digital CRP. For PD, the test was measured against the Rosenbaum Pocket Vision Screener (RPVS). Reliability was analyzed with intraclass correlation coefficients (ICC) between trials with Cronbach's alpha (CA) and the standard error of measurement for each ICC. Convergent validity was investigated by calculating the bivariate correlation coefficient. Results Reliability results were strong (CA > 0.7) for all measures. High positive significant correlations were found between the RightEye IPD test and the PL850 Pupilometer (P < 0.001) and Essilor Digital CRP (P < 0.001) and for the RightEye PD test and the RPVS (P < 0.001). Conclusions Using infrared eye tracking and the RightEye IPD/PD test stimuli, reliable and accurate measures of IPD and PD were found. Results from normative data showed an adequate comparison for people with normal vision development. Translational Relevance Results revealed a central point of fixation may remove variability in examining PD reliably using infrared eye tracking when consistent environmental and experimental procedures are conducted. PMID:28685104
Validity and Reliability of Farsi Version of Youth Sport Environment Questionnaire
Eshghi, Mohammad Ali; Kordi, Ramin; Memari, Amir Hossein; Ghaziasgar, Ahmad; Mansournia, Mohammad-Ali; Zamani Sani, Seyed Hojjat
2015-01-01
The Youth Sport Environment Questionnaire (YSEQ) had been developed from Group Environment Questionnaire, a well-known measure of team cohesion. The aim of this study was to adapt and examine the reliability and validity of the Farsi version of the YSEQ. This version was completed by 455 athletes aged 13–17 years. Results of confirmatory factor analysis indicated that two-factor solution showed a good fit to the data. The results also revealed that the Farsi YSEQ showed high internal consistency, test-retest reliability, and good concurrent validity. This study indicated that the Farsi version of the YSEQ is a valid and reliable measure to assess team cohesion in sport setting. PMID:26464900
Brurok, Berit; Tjønna, Arnt Erik; Tørhaug, Tom; Askim, Torunn
2017-01-01
Background People with stroke have a low peak aerobic capacity and experience increased effort during performance of daily activities. The purpose of this study was to examine test-retest reliability of a portable ergospirometry system in people with stroke during performance of functional activities in a field-test. Secondary aims were to examine the proportion of oxygen consumed during the field-test in relation to the peak-test and to analyse the correlation between the oxygen uptake during the field-test and peak-test in order to support the validity of the field-test. Methods With simultaneous measurement of oxygen consumption, participants performed a standardized field-test consisting of five activities: walking over ground, stair walking, stepping over obstacles, walking slalom between cones and, from a standing position, lifting objects from one height to another. All activities were performed at self-selected speed. Prior to the field-test, a peak aerobic capacity test was performed. The field-test was repeated with a minimum of 2 and a maximum of 14 days between the tests. ICC2,1 and Bland Altman tests (Limits of Agreement, LoA) were used to analyse test-retest reliability. Results In total 31 participants (39% women, mean (SD) age 54.5 (12.7) years and 21.1 (14.3) months post-stroke) were included. The ICC2,1 was ≥ 0.80 for absolute V̇O2, relative V̇O2, minute ventilation, CO2, respiratory exchange ratio, heart rate and Borg's rating of perceived exertion. ICC2,1 for total time to complete the field-test was 0.99. Mean difference in steady state V̇O2 during Test 1 and Test 2 was -0.40 (2.12). The LoAs were -3.75 and 4.51. Participants spent 60.7% of their V̇O2peak performing functional activities. Correlation between field-test and peak-test was 0.689, p = 0.001 for absolute and 0.733, p = 0.001 for relative V̇O2.
Conclusions This study presents first evidence on reliability of oxygen uptake during performance of functional activities after stroke, showing very good test-retest reliability. The secondary analysis showed that the amount of energy spent during the field-test relative to the peak-test was high and the correlation between the two tests was good, supporting the validity of this method. PMID:29065164
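The Bland Altman limits of agreement used in this study are conventionally computed as the mean test-retest difference plus or minus 1.96 standard deviations of the differences. The sketch below assumes that conventional definition; the function name and the oxygen uptake values are hypothetical and are not the study's data.

```python
from statistics import mean, stdev


def bland_altman(test1, test2):
    """Return (bias, lower LoA, upper LoA) for paired measurements."""
    diffs = [a - b for a, b in zip(test1, test2)]
    bias = mean(diffs)      # mean difference between sessions
    sd = stdev(diffs)       # sample SD of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical steady-state oxygen uptake on two occasions.
session_1 = [14.2, 16.8, 12.5, 18.1, 15.0]
session_2 = [15.0, 16.1, 13.4, 17.5, 15.9]
bias, lower, upper = bland_altman(session_1, session_2)
print(f"bias = {bias:.2f}, LoA = [{lower:.2f}, {upper:.2f}]")
```

Under this definition the limits are symmetric about the bias, so a narrow interval centred near zero indicates both small systematic change and small random variation between sessions.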
Aerospace reliability applied to biomedicine.
NASA Technical Reports Server (NTRS)
Lalli, V. R.; Vargo, D. J.
1972-01-01
An analysis is presented that indicates that the reliability and quality assurance methodology selected by NASA to minimize failures in aerospace equipment can be applied directly to biomedical devices to improve hospital equipment reliability. The Space Electric Rocket Test project is used as an example of NASA application of reliability and quality assurance (R&QA) methods. By analogy a comparison is made to show how these same methods can be used in the development of transducers, instrumentation, and complex systems for use in medicine.
Validation of the VISA-A questionnaire for Turkish language: the VISA-A-Tr study.
Dogramaci, Yunus; Kalaci, Aydiner; Kücükkübas, Nigar; Inandi, Taceddin; Esen, Erdinc; Yanat, A Nedim
2011-04-01
To evaluate the validity and reliability of the Turkish version of the Victorian Institute of Sports Assessment-Achilles (VISA-A) questionnaire for patients with Achilles tendinopathy. Fifty-five patients with a diagnosis of Achilles tendinopathy and 55 healthy subjects were included in the study. VISA-A questionnaires were translated and culturally adapted into Turkish. The final Turkish version (VISA-A-Tr) was tested for reliability on healthy individuals and patients. Tests for internal consistency, validity and structure were performed on 55 patients. The VISA-A-Tr showed good test-retest reliability (Pearson's r=0.99, p<0.001). The patients with Achilles tendinopathy had a significantly lower score (p<0.001) than the healthy individuals. The VISA-A-Tr score correlated significantly with the Stanish tendon grading system (Spearman's r=-0.86; p<0.001). The VISA-A-Tr is a valid and reliable tool for evaluating the severity of Achilles tendinopathy.
Reliability and criterion-related validity of a new repeated agility test
Makni, E; Jemni, M; Elloumi, M; Chamari, K; Nabli, MA; Padulo, J; Moalla, W
2016-01-01
The study aimed to assess the reliability and the criterion-related validity of a new repeated sprint T-test (RSTT) that includes intense multidirectional intermittent efforts. The RSTT consisted of 7 maximal repeated executions of the agility T-test with 25 s of passive recovery rest in between. Forty-five team sports players performed two RSTTs separated by 3 days to assess the reliability of best time (BT) and total time (TT) of the RSTT. The intra-class correlation coefficient analysis revealed a high relative reliability between test and retest for BT and TT (>0.90). The standard error of measurement (<0.50) showed that the RSTT has a good absolute reliability. The minimal detectable change values for BT and TT related to the RSTT were 0.09 s and 0.58 s, respectively. To check the criterion-related validity of the RSTT, players performed a repeated linear sprint (RLS) and a repeated sprint with changes of direction (RSCD). Significant correlations between the BT and TT of the RLS, RSCD and RSTT were observed (p<0.001). The RSTT is, therefore, a reliable and valid measure of the intermittent repeated sprint agility performance. As this ability is required in all team sports, it is suggested that team sports coaches, fitness coaches and sports scientists consider this test in their training follow-up. PMID:27274109
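The absolute reliability figures in this abstract (standard error of measurement and minimal detectable change) are often derived from the ICC. The abstract does not say which formulas were applied, so the sketch below assumes the widely used forms SEM = SD x sqrt(1 - ICC) and MDC95 = 1.96 x sqrt(2) x SEM; the function names and input values are hypothetical.

```python
import math


def sem(sd, icc):
    """Standard error of measurement from the between-subject SD and the ICC."""
    return sd * math.sqrt(1.0 - icc)


def mdc95(sem_value):
    """Minimal detectable change at 95% confidence for a test-retest difference."""
    return 1.96 * math.sqrt(2.0) * sem_value


# Hypothetical values: SD of 1.0 s and ICC of 0.91.
s = sem(1.0, 0.91)
print(f"SEM = {s:.3f} s, MDC95 = {mdc95(s):.3f} s")
```

A change between two sessions smaller than the MDC cannot be distinguished from measurement error, which is how such thresholds are typically used in training follow-up.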
NASA Technical Reports Server (NTRS)
Lathrop, J. W.; Davis, C. W.; Royal, E.
1982-01-01
The use of accelerated testing methods in a program to determine the reliability attributes of terrestrial silicon solar cells is discussed. Different failure modes are to be expected when cells with and without encapsulation are subjected to accelerated testing and separate test schedules for each are described. Unencapsulated test cells having slight variations in metallization are used to illustrate how accelerated testing can highlight different diffusion related failure mechanisms. The usefulness of accelerated testing when applied to encapsulated cells is illustrated by results showing that moisture related degradation may be many times worse with some forms of encapsulation than with no encapsulation at all.
Kanehara, Akiko; Kotake, Risa; Miyamoto, Yuki; Kumakura, Yousuke; Morita, Kentaro; Ishiura, Tomoko; Shimizu, Kimiko; Fujieda, Yumiko; Ando, Shuntaro; Kondo, Shinsuke; Kasai, Kiyoto
2017-11-07
Personal recovery is increasingly recognised as an important outcome measure in mental health services. This study aimed to develop a Japanese version of the Questionnaire about the Process of Recovery (QPR-J) and test its validity and reliability. The study comprised two stages that employed cross-sectional and prospective cohort designs, respectively. We translated the questionnaire using a standard translation/back-translation method. Convergent validity was examined by calculating Pearson's correlation coefficients with scores on the Recovery Assessment Scale (RAS) and the Short-Form-8 Health Survey (SF-8). An exploratory factor analysis (EFA) was conducted to examine factorial validity. We used intraclass correlation and Cronbach's alpha to examine the test-retest and internal consistency reliability of the QPR-J's 22-item full scale, 17-item intrapersonal and 5-item interpersonal subscales. We conducted an EFA along with a confirmatory factor analysis (CFA). Data were obtained from 197 users of mental health services (mean age: 42.0 years; 61.9% female; 49.2% diagnosed with schizophrenia). The QPR-J showed adequate convergent validity, exhibiting significant, positive correlations with the RAS and SF-8 scores. The QPR-J's full version and subscales showed excellent test-retest and internal consistency reliability, with the exception of acceptable but relatively low internal consistency reliability for the interpersonal subscale. Based on the results of the CFA and EFA, we adopted the factor structure extracted from the original 2-factor model based on the present CFA. The QPR-J is an adequately valid and reliable measure of the process of recovery among Japanese users of mental health services.
Deskovitz, Mark A; Weed, Nathan C; McLaughlan, Joseph K; Williams, John E
2016-04-01
The reliability of six Minnesota Multiphasic Personality Inventory-Second edition (MMPI-2) computer-based test interpretation (CBTI) programs was evaluated across a set of 20 commonly appearing MMPI-2 profile codetypes in clinical settings. Evaluation of CBTI reliability comprised examination of (a) interrater reliability, the degree to which raters arrive at similar inferences based on the same CBTI profile and (b) interprogram reliability, the level of agreement across different CBTI systems. Profile inferences drawn by four raters were operationalized using q-sort methodology. Results revealed no significant differences overall with regard to interrater and interprogram reliability. Some specific CBTI/profile combinations (e.g., the CBTI by Automated Assessment Associates on a within normal limits profile) and specific profiles (e.g., the 4/9 profile displayed greater interprogram reliability than the 2/4 profile) were interpreted with variable consensus (α range = .21-.95). In practice, users should consider that certain MMPI-2 profiles are interpreted more or less consensually and that some CBTIs show variable reliability depending on the profile. © The Author(s) 2015.
Lo, Wing-Sze; Ho, Sai-Yin; Wong, Bonny Yee-Man; Mak, Kwok-Kei; Lam, Tai-Hing
2011-06-01
The reliability and validity of Stunkard's Figure Rating Scale (FRS) as a measure of current body size (CBS) was established in Western adolescent girls but not in non-Western population. We examined the validity and test-retest reliability of Stunkard's FRS in assessing CBS among Chinese adolescents. Methods. In a school-based survey in Hong Kong, 5666 adolescents (boys: 45.1%; mean age 14.7 years) provided data on self-reported height and weight, CBS, perceived weight status, and health-related quality of life using the Medical Outcomes Study Short-Form version 2 (SF-12v2). Height and weight were also objectively measured. Spearman's correlation was used to assess construct validity, concurrent validity and test-retest reliability. Convergent and discriminant validity were good: CBS correlated strongly with weight and self-reported/measured BMI, but only weakly with SF-12v2. CBS correlated strongly with perceived weight status, showing concurrent validity. Spearman's correlation (r) for CBS was 0.78 for girls and 0.72 for boys indicating good test-retest reliability. Validity and reliability results did not differ significantly between senior and junior grade adolescents. Our findings support the use of Stunkard's FRS to measure body size among Chinese adolescents.
An empirical look at the Defense Mechanism Test (DMT): reliability and construct validity.
Ekehammar, Bo; Zuber, Irena; Konstenius, Marja-Liisa
2005-07-01
Although the Defense Mechanism Test (DMT) has been in use for almost half a century, there are still quite contradictory views about whether it is a reliable instrument, and if so, what it really measures. Thus, based on data from 39 female students, we first examined DMT inter-coder reliability by analyzing the agreement among trained judges in their coding of the same DMT protocols. Second, we constructed a "parallel" photographic picture that retained all structural characteristics of the original and analyzed DMT parallel-test reliability. Third, we examined the construct validity of the DMT by (a) employing three self-report defense-mechanism inventories and analyzing the intercorrelations between DMT defense scores and corresponding defenses in these instruments, (b) studying the relationships between DMT responses and scores on trait and state anxiety, and (c) relating DMT-defense scores to measures of self-esteem. The main results showed that the DMT can be coded with high reliability by trained coders, that the parallel-test reliability is unsatisfactory compared to traditional psychometric standards, that there is a certain generalizability in the number of perceptual distortions that people display from one picture to another, and that the construct validation provided meager empirical evidence for the conclusion that the DMT measures what it purports to measure, that is, psychological defense mechanisms.
Grooten, Wilhelmus Johannes Andreas; Sandberg, Lisa; Ressman, John; Diamantoglou, Nicolas; Johansson, Elin; Rasmussen-Barr, Eva
2018-01-08
Clinical examinations are subjective and often show a low validity and reliability. Objective and highly reliable quantitative assessments are available in laboratory settings using 3D motion analysis, but these systems are too expensive to use for simple clinical examinations. Qinematic™ is an interactive movement analysis system based on the Kinect camera and is an easy-to-use clinical measurement system for assessing posture, balance and side-bending. The aim of the study was to test the test-retest reliability and construct validity of Qinematic™ in a healthy population, and to calculate the minimal clinical differences for the variables of interest. A further aim was to identify the discriminative validity of Qinematic™ in people with low-back pain (LBP). We performed a test-retest reliability study (n = 37) with around 1 week between the occasions, a construct validity study (n = 30) in which Qinematic™ was tested against a 3D motion capture system, and a discriminative validity study, in which a group of people with LBP (n = 20) was compared to healthy controls (n = 17). We tested a large range of psychometric properties of 18 variables in three sections: posture (head and pelvic position, weight distribution), balance (sway area and velocity in single- and double-leg stance), and side-bending. The majority of the variables in the posture and balance sections showed poor/fair reliability (ICC < 0.4) and poor/fair validity (Spearman < 0.4), with significant differences between occasions and between Qinematic™ and the 3D motion capture system. In the clinical study, Qinematic™ did not differ between people with LBP and healthy controls for these variables. For one variable, side-bending to the left, there was excellent reliability (ICC = 0.898), excellent validity (r = 0.943), and Qinematic™ could differentiate between LBP and healthy individuals (p = 0.012).
This paper shows that a novel software program (Qinematic™) based on the Kinect camera for measuring balance, posture and side-bending has poor psychometric properties, indicating that the variables on balance and posture should not be used for monitoring individual changes over time or in research. Future research on the dynamic tasks of Qinematic™ is warranted.
Steenson, Sharalyn; Özcebe, Hilal; Arslan, Umut; Konşuk Ünlü, Hande; Araz, Özgür M; Yardim, Mahmut; Üner, Sarp; Bilir, Nazmi; Huang, Terry T-K
2018-01-01
Childhood obesity rates have been rising rapidly in developing countries. A better understanding of the risk factors and social context is necessary to inform public health interventions and policies. This paper describes the validation of several measurement scales for use in Turkey, which relate to child and parent perceptions of physical activity (PA) and enablers and barriers of physical activity in the home environment. The aim of this study was to assess the validity and reliability of several measurement scales in Turkey using a population sample across three socio-economic strata in the Turkish capital, Ankara. Surveys were conducted in Grade 4 children (mean age = 9.7 years for boys; 9.9 years for girls), and their parents, across 6 randomly selected schools, stratified by SES (n = 641 students, 483 parents). Construct validity of the scales was evaluated through exploratory and confirmatory factor analysis. Internal consistency of scales and test-retest reliability were assessed by Cronbach's alpha and intra-class correlation. The scales as a whole were found to have acceptable-to-good model fit statistics (PA Barriers: RMSEA = 0.076, SRMR = 0.0577, AGFI = 0.901; PA Outcome Expectancies: RMSEA = 0.054, SRMR = 0.0545, AGFI = 0.916, and PA Home Environment: RMSEA = 0.038, SRMR = 0.0233, AGFI = 0.976). The PA Barriers subscales showed good internal consistency and poor to fair test-retest reliability (personal α = 0.79, ICC = 0.29, environmental α = 0.73, ICC = 0.59). The PA Outcome Expectancies subscales showed good internal consistency and test-retest reliability (negative α = 0.77, ICC = 0.56; positive α = 0.74, ICC = 0.49). Only the PA Home Environment subscale on support for PA was validated in the final confirmatory model; it showed moderate internal consistency and test-retest reliability (α = 0.61, ICC = 0.48). This study is the first to validate measures of perceptions of physical activity and the physical activity home environment in Turkey. 
Our results support the originally hypothesized two-factor structures for Physical Activity Barriers and Physical Activity Outcome Expectancies. However, we found the one-factor rather than two-factor structure for Physical Activity Home Environment had the best model fit. This study provides general support for the use of these scales in Turkey in terms of validity, but test-retest reliability warrants further research.
Validity of a novel computerized screening test system for mild cognitive impairment.
Park, Jin-Hyuck; Jung, Minye; Kim, Jongbae; Park, Hae Yean; Kim, Jung-Ran; Park, Ji-Hyuk
2018-06-20
Background: The mobile screening test system for screening mild cognitive impairment (mSTS-MCI) was developed for clinical use. However, the clinical usefulness of mSTS-MCI to detect elderly with MCI from those who are cognitively healthy has yet to be validated. Moreover, the comparability between this system and traditional screening tests for MCI has not been evaluated. The purpose of this study was to examine the validity and reliability of the mSTS-MCI and confirm the cut-off scores to detect MCI. The data were collected from 107 healthy elderly people and 74 elderly people with MCI. Concurrent validity was examined using the Korean version of Montreal Cognitive Assessment (MoCA-K) as a gold standard test, and test-retest reliability was investigated using 30 of the study participants at four-week intervals. The sensitivity, specificity, positive predictive value, and negative predictive value (NPV) were confirmed through Receiver Operating Characteristic (ROC) analysis, and the cut-off scores for elderly people with MCI were identified. Concurrent validity showed statistically significant correlations between the mSTS-MCI and MoCA-K, and test-retest reliability indicated high correlation. As a result of screening predictability, the mSTS-MCI had a higher NPV than the MoCA-K. The mSTS-MCI was identified as a system with a high degree of validity and reliability. In addition, the mSTS-MCI showed high screening predictability, indicating it can be used in the clinical field as a screening test system for mild cognitive impairment.
Development, validity, and reliability of a ballet-specific aerobic fitness test.
Twitchett, Emily; Nevill, Alan; Angioi, Manuela; Koutedakis, Yiannis; Wyon, Matthew
2011-09-01
The aim of this study was to develop and assess the reliability and validity of a multi-stage, ballet-specific aerobic fitness test to be used in a dance studio setting. The test consists of five stages, each four minutes long, that increase in intensity. It uses classical ballet movement of an intermediate-level of difficulty, thus emphasizing physiological demand rather than skill. The demand of each stage was determined by calculating the mean oxygen uptake during its final minute using a portable gas analyser. After an initial familiarization period, eight female subjects performed the test twice within seven days. The results showed significant differences in oxygen consumption between stages (p < 0.001), but not between trials. Pearson correlation co-efficients produced a very good linear relationship between trials (r = 0.998, p < 0.001). Bland-Altman reliability analysis revealed the 95% limits of agreement to be ± 6.2 ml·kg(-1)·min(-1), showing good agreement between trials. The oxygen uptake in our subjects equated positively to previous estimates for class and performance, confirming validity. It was concluded that the test is suitable for use among classical ballet dancers, with many possible applications.
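The Bland-Altman limits of agreement mentioned above can be illustrated with a minimal sketch. It assumes the conventional definition (mean difference ± 1.96 × sample SD of the paired differences); the abstract does not state the exact computation, and the function name and the oxygen uptake values below are hypothetical, not the study's data.

```python
from statistics import mean, stdev


def bland_altman_limits(trial_1, trial_2, z=1.96):
    """Return (bias, lower, upper) for the 95% limits of agreement."""
    if len(trial_1) != len(trial_2):
        raise ValueError("both trials need one value per subject")
    diffs = [a - b for a, b in zip(trial_1, trial_2)]
    bias = mean(diffs)              # systematic difference between trials
    spread = z * stdev(diffs)       # stdev uses the sample (n - 1) estimator
    return bias, bias - spread, bias + spread


# Hypothetical oxygen uptake values (ml/kg/min) for eight subjects, two trials.
trial_1 = [38.0, 41.5, 36.2, 44.0, 39.8, 42.1, 37.5, 40.3]
trial_2 = [37.2, 42.0, 35.0, 45.1, 39.0, 43.5, 36.9, 41.0]
bias, lower, upper = bland_altman_limits(trial_1, trial_2)
print(f"bias={bias:.2f}, limits of agreement=({lower:.2f}, {upper:.2f})")
```

A bias near zero with narrow limits would correspond to the "good agreement between trials" reported in the abstract.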
NASA Astrophysics Data System (ADS)
Mattila, Toni T.; Hokka, Jussi; Paulasto-Kröckel, Mervi
2014-11-01
In this study, the performance of three microalloyed Sn-Ag-Cu solder interconnection compositions (Sn-3.1Ag-0.52Cu, Sn-3.0Ag-0.52Cu-0.24Bi, and Sn-1.1Ag-0.52Cu-0.1Ni) was compared under mechanical shock loading (JESD22-B111 standard) and cyclic thermal loading (40 ± 125°C, 42 min cycle) conditions. In the drop tests, the component boards with the low-silver nickel-containing composition (Sn-Ag-Cu-Ni) showed the highest average number of drops-to-failure, while those with the bismuth-containing alloy (Sn-Ag-Cu-Bi) showed the lowest. Results of the thermal cycling tests showed that boards with Sn-Ag-Cu-Bi interconnections performed the best, while those with Sn-Ag-Cu-Ni performed the worst. Sn-Ag-Cu was placed in the middle in both tests. In this paper, we demonstrate that solder strength is an essential reliability factor and that higher strength can be beneficial for thermal cycling reliability but detrimental to drop reliability. We discuss these findings from the perspective of the microstructures and mechanical properties of the three solder interconnection compositions and, based on a comprehensive literature review, investigate how the differences in the solder compositions influence the mechanical properties of the interconnections and discuss how the differences are reflected in the failure mechanisms under both loading conditions.
Alyusuf, Raja H.; Prasad, Kameshwar; Abdel Satir, Ali M.; Abalkhail, Ali A.; Arora, Roopa K.
2013-01-01
Background: The exponential use of the internet as a learning resource, coupled with the varied quality of many websites, has led to a need to identify suitable websites for teaching purposes. Aim: The aim of this study is to develop and to validate a tool, which evaluates the quality of undergraduate medical educational websites; and apply it to the field of pathology. Methods: A tool was devised through several steps of item generation, reduction, weightage, pilot testing, post-pilot modification of the tool and validating the tool. Tool validation included measurement of inter-observer reliability; and generation of criterion related, construct related and content related validity. The validated tool was subsequently tested by applying it to a population of pathology websites. Results and Discussion: Reliability testing showed a high internal consistency reliability (Cronbach's alpha = 0.92), high inter-observer reliability (Pearson's correlation r = 0.88), intraclass correlation coefficient = 0.85 and κ = 0.75. It showed high criterion related, construct related and content related validity. The tool showed moderately high concordance with the gold standard (κ = 0.61); 92.2% sensitivity, 67.8% specificity, 75.6% positive predictive value and 88.9% negative predictive value. The validated tool was applied to 278 websites; 29.9% were rated as recommended, 41.0% as recommended with caution and 29.1% as not recommended. Conclusion: A systematic tool was devised to evaluate the quality of websites for medical educational purposes. The tool was shown to yield reliable and valid inferences through its application to pathology websites. PMID:24392243
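The sensitivity, specificity and predictive values quoted above all derive from a 2×2 table of tool ratings against a gold standard. The sketch below shows the standard definitions; the function name and the counts are hypothetical and chosen only for illustration, since the abstract does not report the underlying table.

```python
def screening_metrics(tp, fp, fn, tn):
    """Standard accuracy measures from a 2x2 table against a gold standard."""
    return {
        "sensitivity": tp / (tp + fn),  # true positives among gold-standard positives
        "specificity": tn / (tn + fp),  # true negatives among gold-standard negatives
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }


# Hypothetical counts, not taken from the study.
metrics = screening_metrics(tp=45, fp=15, fn=5, tn=35)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Note that the predictive values depend on how common the positive class is in the sample, whereas sensitivity and specificity do not.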
Short- and long-term reliability of language fMRI.
Nettekoven, Charlotte; Reck, Nicola; Goldbrunner, Roland; Grefkes, Christian; Weiß Lucas, Carolin
2018-08-01
When using functional magnetic resonance imaging (fMRI) for mapping important language functions, a high test-retest reliability is mandatory, both in basic scientific research and for clinical applications. We, therefore, systematically tested the short- and long-term reliability of fMRI in a group of healthy subjects using a picture naming task and a sparse-sampling fMRI protocol. We hypothesized that test-retest reliability might be higher for (i) speech-related motor areas than for other language areas and for (ii) the short as compared to the long intersession interval. 16 right-handed subjects (mean age: 29 years) participated in three sessions separated by 2-6 (session 1 and 2, short-term) and 21-34 days (session 1 and 3, long-term). Subjects were asked to perform the same overt picture naming task in each fMRI session (50 black-white images per session). Reliability was tested using the following measures: (i) Euclidean distances (ED) between local activation maxima and Centers of Gravity (CoGs), (ii) overlap volumes and (iii) voxel-wise intraclass correlation coefficients (ICCs). Analyses were performed for three regions of interest which were chosen based on whole-brain group data: primary motor cortex (M1), superior temporal gyrus (STG) and inferior frontal gyrus (IFG). Our results revealed that the activation centers were highly reliable, independent of the time interval, ROI or hemisphere with significantly smaller ED for the local activation maxima (6.45 ± 1.36 mm) as compared to the CoGs (8.03 ± 2.01 mm). In contrast, the extent of activation revealed rather low reliability values with overlaps ranging from 24% (IFG) to 56% (STG). Here, the left hemisphere showed significantly higher overlap volumes than the right hemisphere. Although mean ICCs ranged between poor (ICC<0.5) and moderate (ICC 0.5-0.74) reliability, highly reliable voxels (ICC>0.75) were found for all ROIs. 
Voxel-wise reliability of the different ROIs was influenced by the intersession interval. Taken together, we could show that, despite considerable ROI-dependent variations of the extent of activation over time, highly reliable centers of activation can be identified using an overt picture naming paradigm. Copyright © 2018 Elsevier Inc. All rights reserved.
Validation of the German version of the Ford Insomnia Response to Stress Test.
Dieck, Arne; Helbig, Susanne; Drake, Christopher L; Backhaus, Jutta
2018-06-01
The purpose of this study was to assess the psychometric properties of a German version of the Ford Insomnia Response to Stress Test with groups with and without sleep problems. Three studies were analysed. Data set 1 was based on an initial screening for a sleep training program (n = 393), data set 2 was based on a study to test the test-retest reliability of the Ford Insomnia Response to Stress Test (n = 284) and data set 3 was based on a study to examine the influence of competitive sport on sleep (n = 37). Data sets 1 and 2 were used to test internal consistency, factor structure, convergent validity, discriminant validity and test-retest reliability of the Ford Insomnia Response to Stress Test. Content validity was tested using data set 3. Cronbach's alpha of the Ford Insomnia Response to Stress Test was good (α = 0.80) and test-retest reliability was satisfactory (r = 0.72). Overall, the one-factor model showed the best fit. Furthermore, significant positive correlations between the Ford Insomnia Response to Stress Test and impaired sleep quality, depression and stress reactivity were in line with the expectations regarding the convergent validity. Subjects with sleep problems had significantly higher scores in the Ford Insomnia Response to Stress Test than subjects without sleep problems (P < 0.01). Competitive athletes with higher scores in the Ford Insomnia Response to Stress Test had significantly lower sleep quality (P = 0.01), demonstrating that vulnerability for stress-induced sleep disturbances accompanies poorer sleep quality in stressful episodes. The findings show that the German version of the Ford Insomnia Response to Stress Test is a reliable and valid questionnaire to assess the vulnerability to stress-induced sleep disturbances. © 2017 European Sleep Research Society.
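Internal consistency figures such as the α = 0.80 reported above are usually obtained with Cronbach's alpha. The following is a minimal sketch of the textbook formula, α = k/(k − 1) × (1 − Σ item variances / variance of total scores); the function name and the response matrix are hypothetical and do not reproduce the questionnaire's items or data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha; responses holds one list of item scores per respondent."""
    k = len(responses[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    items = list(zip(*responses))                      # one tuple per item
    item_variance = sum(variance(col) for col in items)
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_variance / total_variance)


# Hypothetical answers of five respondents to four items scored 1-4.
responses = [
    [3, 4, 3, 4],
    [2, 2, 1, 2],
    [4, 4, 4, 3],
    [1, 2, 1, 1],
    [3, 3, 2, 3],
]
alpha = cronbach_alpha(responses)
print(f"Cronbach's alpha = {alpha:.2f}")
```

The sample variance (n − 1 denominator) is used for both items and totals; using the population variance throughout gives the same ratio.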
[Reliability and validity of the Braden Scale for predicting pressure sore risk].
Boes, C
2000-12-01
For more accurate and objective pressure sore risk assessment various risk assessment tools were developed mainly in the USA and Great Britain. The Braden Scale for Predicting Pressure Sore Risk is one such example. By means of a literature analysis of German and English texts referring to the Braden Scale the scientific control criteria reliability and validity will be traced and consequences for application of the scale in Germany will be demonstrated. Analysis of 4 reliability studies shows an exclusive focus on interrater reliability. Further, even though examination of 19 validity studies occurs in many different settings, such examination is limited to the criteria sensitivity and specificity (accuracy). The range of sensitivity and specificity level is 35-100%. The recommended cut-off points range from 10 to 19 points. The studies prove to be not comparable with each other. Furthermore, distortions in these studies can be found which affect accuracy of the scale. The results of the analysis presented here show insufficient proof of reliability and validity in the American studies. In Germany, the Braden scale has not yet been tested under scientific criteria. Such testing is needed before using the scale in different German settings. During the course of such testing, the construction and study procedures of the American studies can be used as a basis, as can the problems identified in the analysis presented below.
Miniature sheathed thermocouples for turbine blade temperature measurement
NASA Technical Reports Server (NTRS)
Holanda, R.; Glawe, G. E.; Krause, L. N.
1974-01-01
An investigation was made of sheathed thermocouples for turbine blade temperature measurements. Tests were performed on the Chromel-Alumel sheathed thermocouples with both two-wire and single-wire configurations. Sheath diameters ranged from 0.25 to 0.76 mm, and temperatures ranged from 1080 to 1250 K. Both steady-state and thermal cycling tests were performed for times up to 450 hr. Special-order and commercial-grade thermocouples were tested. The tests showed that special-order single-wire sheathed thermocouples can be obtained that are reliable and accurate with diameters as small as 0.25 mm. However, all samples of 0.25-mm-diameter sheathed commercial-grade two-wire and single-wire thermocouples that were tested showed unacceptable drift rates for long-duration engine testing programs. The drift rates were about 1 percent in 10 hr. A thermocouple drift test is recommended in addition to the normal acceptance tests in order to select reliable miniature sheathed thermocouples for turbine blade applications.
Toward extending the educational interpreter performance assessment to cued speech.
Krause, Jean C; Kegl, Judy A; Schick, Brenda
2008-01-01
The Educational Interpreter Performance Assessment (EIPA) is an important research tool for examining the quality of interpreters who use American Sign Language or a sign system in classroom settings, but it is not currently applicable to educational interpreters who use Cued Speech (CS). In order to determine the feasibility of extending the EIPA to include CS, a pilot EIPA test was developed and administered to 24 educational CS interpreters. Fifteen of the interpreters' performances were evaluated two to three times in order to assess reliability. Results show that the instrument has good construct validity and test-retest reliability. Although more interrater reliability data are needed, intrarater reliability was quite high (0.9), suggesting that the pilot test can be rated as reliably as signing versions of the EIPA. Notably, only 48% of interpreters who formally participated in pilot testing performed at a level that could be considered minimally acceptable. In light of similar performance levels previously reported for interpreters who sign (e.g., Schick, Williams, & Kupermintz, 2006), these results suggest that interpreting services for deaf and hard-of-hearing students, regardless of the communication option used, are often inadequate and could seriously hinder access to the classroom environment.
Hartling, Lisa; Bond, Kenneth; Santaguida, P Lina; Viswanathan, Meera; Dryden, Donna M
2011-08-01
To develop and test a study design classification tool. We contacted relevant organizations and individuals to identify tools used to classify study designs and ranked these using predefined criteria. The highest ranked tool was a design algorithm developed, but no longer advocated, by the Cochrane Non-Randomized Studies Methods Group; this was modified to include additional study designs and decision points. We developed a reference classification for 30 studies; 6 testers applied the tool to these studies. Interrater reliability (Fleiss' κ) and accuracy against the reference classification were assessed. The tool was further revised and retested. Initial reliability was fair among the testers (κ=0.26) and the reference standard raters (κ=0.33). Testing after revisions showed improved reliability (κ=0.45, moderate agreement) with improved, but still low, accuracy. The most common disagreements were whether the study design was experimental (5 of 15 studies), and whether there was a comparison of any kind (4 of 15 studies). Agreement was higher among testers who had completed graduate level training versus those who had not. The moderate reliability and low accuracy may be because of lack of clarity and comprehensiveness of the tool, inadequate reporting of the studies, and variability in tester characteristics. The results may not be generalizable to all published studies, as the test studies were selected because they had posed challenges for previous reviewers with respect to their design classification. Application of such a tool should be accompanied by training, pilot testing, and context-specific decision rules. Copyright © 2011 Elsevier Inc. All rights reserved.
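Fleiss' κ, the agreement statistic used above for several testers, can be sketched as follows. The input layout (one row per study, one column per design category, each cell holding the number of testers who chose that category) and the example counts are assumptions for illustration; they are not the study's data.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a subjects x categories table of rating counts.

    Every row must sum to the same number of raters. The result is undefined
    (division by zero) if all ratings fall into a single category.
    """
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every subject needs the same number of ratings")
    n_categories = len(counts[0])
    # Share of all ratings that fell into each category.
    p_cat = [
        sum(row[j] for row in counts) / (n_subjects * n_raters)
        for j in range(n_categories)
    ]
    # Observed pairwise agreement for each subject.
    p_subj = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(p_subj) / n_subjects
    p_expected = sum(p * p for p in p_cat)
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical ratings: 5 studies, 3 design categories, 6 testers each.
ratings = [
    [6, 0, 0],
    [3, 3, 0],
    [1, 4, 1],
    [0, 2, 4],
    [2, 2, 2],
]
print(f"Fleiss' kappa = {fleiss_kappa(ratings):.2f}")
```

Values are often read against conventional bands (for example, "fair" or "moderate" agreement), as the abstract does, although such labels are rules of thumb rather than fixed thresholds.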
Reliability of cognitive tests of ELSA-Brasil, the Brazilian longitudinal study of adult health
Batista, Juliana Alves; Giatti, Luana; Barreto, Sandhi Maria; Galery, Ana Roscoe Papini; Passos, Valéria Maria de Azeredo
2013-01-01
Cognitive function evaluation entails the use of neuropsychological tests, applied exclusively or in sequence. The results of these tests may be influenced by factors related to the environment, the interviewer or the interviewee. OBJECTIVES We examined the test-retest reliability of some tests of the Brazilian version from the Consortium to Establish a Registry for Alzheimer's disease. METHODS The ELSA-Brasil is a multicentre study of civil servants (35-74 years of age) from public institutions across six Brazilian States. The same tests were applied, in different order of appearance, by the same trained and certified interviewer, with an approximate 20-day interval, to 160 adults (51% men, mean age 52 years). The Intraclass Correlation Coefficient (ICC) was used to assess the reliability of the measures; and a dispersion graph was used to examine the patterns of agreement between them. RESULTS We observed higher retest scores in all tests as well as a shorter test completion time for the Trail Making Test B. ICC values for each test were as follows: Word List Learning Test (0.56), Word Recall (0.50), Word Recognition (0.35), Phonemic Verbal Fluency Test (VFT, 0.61), Semantic VFT (0.53) and Trail B (0.91). The Bland-Altman plot showed better correlation of executive function (VFT and Trail B) than of memory tests. CONCLUSIONS Better performance in retest may reflect a learning effect, and suggests that retest should be repeated using alternate forms or after longer periods. In this sample of adults with high schooling level, reliability was only moderate for memory tests whereas the measurement of executive function proved more reliable. PMID:29213860
Test-retest reliability of an fMRI paradigm for studies of cardiovascular reactivity.
Sheu, Lei K; Jennings, J Richard; Gianaros, Peter J
2012-07-01
We examined the reliability of measures of fMRI, subjective, and cardiovascular reactions to standardized versions of a Stroop color-word task and a multisource interference task. A sample of 14 men and 12 women (30-49 years old) completed the tasks on two occasions, separated by a median of 88 days. The reliability of fMRI BOLD signal changes in brain areas engaged by the tasks was moderate, and aggregating fMRI BOLD signal changes across the tasks improved test-retest reliability metrics. These metrics included voxel-wise intraclass correlation coefficients (ICCs) and overlap ratio statistics. Task-aggregated ratings of subjective arousal, valence, and control, as well as cardiovascular reactions evoked by the tasks showed ICCs of 0.57 to 0.87 (ps < .001), indicating moderate-to-strong reliability. These findings support using these tasks as a battery for fMRI studies of cardiovascular reactivity. Copyright © 2012 Society for Psychophysiological Research.
NASA Astrophysics Data System (ADS)
Rosas, Pedro; Wagemans, Johan; Ernst, Marc O.; Wichmann, Felix A.
2005-05-01
A number of models of depth-cue combination suggest that the final depth percept results from a weighted average of independent depth estimates based on the different cues available. The weight of each cue in such an average is thought to depend on the reliability of each cue. In principle, such a depth estimation could be statistically optimal in the sense of producing the minimum-variance unbiased estimator that can be constructed from the available information. Here we test such models by using visual and haptic depth information. Different texture types produce differences in slant-discrimination performance, thus providing a means for testing a reliability-sensitive cue-combination model with texture as one of the cues to slant. Our results show that the weights for the cues were generally sensitive to their reliability but fell short of statistically optimal combination - we find reliability-based reweighting but not statistically optimal cue combination.
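The reliability-weighted average described above can be made concrete with a small sketch. It assumes the usual formulation for independent, unbiased cues, in which each cue's weight is proportional to its inverse variance; the function name and the slant estimates and variances are hypothetical, not values from the experiments.

```python
def combine_cues(estimates, variances):
    """Inverse-variance weighted average of independent, unbiased cue estimates.

    Returns (combined_estimate, weights, combined_variance).
    """
    precisions = [1.0 / v for v in variances]   # reliability of each cue
    total = sum(precisions)
    weights = [p / total for p in precisions]   # weights sum to 1
    combined = sum(w * x for w, x in zip(weights, estimates))
    return combined, weights, 1.0 / total


# Hypothetical slant estimates in degrees: texture (visual) and haptic cues.
slant, weights, combined_variance = combine_cues(
    estimates=[30.0, 36.0], variances=[4.0, 12.0]
)
print(f"combined slant = {slant:.1f} deg, weights = {weights}")
print(f"combined variance = {combined_variance:.1f}")
```

Under this scheme the combined variance is never larger than that of the most reliable single cue. The abstract's finding is that observers' weights shifted with cue reliability in this direction but did not reach the statistically optimal combination.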
Fieseler, Georg; Molitor, Thomas; Irlenbusch, Lars; Delank, Karl-Stefan; Laudner, Kevin G; Hermassi, Souhail; Schwesig, Rene
2015-12-01
To evaluate the intrarater reliability for examining active range of motion (ROM) and isometric strength of the shoulder and elbow among asymptomatic female team handball athletes and a control group using a manual goniometer and hand-held dynamometry (HHD). 22 female team handball athletes (age: 21.0 ± 3.7 years) and 25 volunteers (13 female, 12 male, age: 21.9 ± 1.24 years) participated to determine bilateral ROM for shoulder rotation and elbow flexion/extension, as well as isometric shoulder rotation and elbow flexion/extension strength. Subjects were assessed on two separate test sessions with 7 days between sessions. Relative reliability (intraclass correlation coefficients, ICC) and the standard error of measurement (SEM) were calculated. Reliability for ROM and strength were good to excellent for both shoulders and groups (athletes: ICC = 0.94-0.97, SEM = 1.07°-4.76 N, controls: ICC = 0.96-1.00, SEM = 0.00 N-4.48 N). Elbow measurements for both groups also showed good-to-excellent reliability (athletes: ICC = 0.79-0.97, SEM = 0.98°-5.94 N, controls: ICC = 0.87-1.00, SEM = 0.00 N-5.43 N). It is important to be able to reliably reproduce active ROM and isometric strength evaluations. Using a standardized testing position, goniometry and HHD are reliable instruments in the assessment of shoulder and elbow joint performance testing. We showed good-to-excellent reproducible results for male and female control subjects and female handball athletes, although the single parameters in ROM and strength were different for each group and between the shoulders and elbows.
Rosales, Roberto S; Martin-Hidalgo, Yolanda; Reboso-Morales, Luis; Atroshi, Isam
2016-03-03
The purpose of this study was to assess the reliability and construct validity of the Spanish version of the 6-item carpal tunnel syndrome (CTS) symptoms scale (CTS-6). In this cross-sectional study 40 patients diagnosed with CTS based on clinical and neurophysiologic criteria, completed the standard Spanish versions of the CTS-6 and the disabilities of the arm, shoulder and hand (QuickDASH) scales on two occasions with a 1-week interval. Internal-consistency reliability was assessed with the Cronbach alpha coefficient and test-retest reliability with the intraclass correlation coefficient, two way random effect model and absolute agreement definition (ICC2,1). Cross-sectional precision was analyzed with the Standard Error of the Measurement (SEM). Longitudinal precision for test-retest reliability coefficient was assessed with the Standard Error of the Measurement difference (SEMdiff) and the Minimal Detectable Change at 95 % confidence level (MDC95). For assessing construct validity it was hypothesized that the CTS-6 would have a strong positive correlation with the QuickDASH, analyzed with the Pearson correlation coefficient (r). The standard Spanish version of the CTS-6 presented a Cronbach alpha of 0.81 with a SEM of 0.3. Test-retest reliability showed an ICC of 0.85 with a SEMdiff of 0.36 and a MDC95 of 0.7. The correlation between CTS-6 and the QuickDASH was concordant with the a priori formulated construct hypothesis (r = 0.69). CONCLUSIONS: The standard Spanish version of the 6-item CTS symptoms scale showed good internal consistency, test-retest reliability and construct validity for outcomes assessment in CTS. The CTS-6 will be useful to clinicians and researchers in Spanish speaking parts of the world. The use of standardized outcome measures across countries also will facilitate comparison of research results in carpal tunnel syndrome.
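The precision measures named above can be sketched in a few lines. The ICC function follows the two-way random effects, absolute agreement, single-measure form (ICC2,1) that the abstract specifies. The SEM, SEM of the difference and MDC95 formulas are the common textbook definitions (SD × √(1 − reliability), SEM × √2, and 1.96 × SEM of the difference) and are an assumption here, since the abstract does not spell them out. The function names and the test-retest scores are hypothetical.

```python
from math import sqrt


def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single measure.

    scores holds one row per subject and one column per occasion (or rater).
    """
    n, k = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)                  # between-subjects mean square
    ms_cols = ss_cols / (k - 1)                  # between-occasions mean square
    ms_error = ss_error / ((n - 1) * (k - 1))    # residual mean square
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


def sem(sd, reliability):
    """Standard error of measurement from a score SD and a reliability coefficient."""
    return sd * sqrt(1.0 - reliability)


def sem_diff(sem_value):
    """Standard error of the difference between two measurements."""
    return sem_value * sqrt(2.0)


def mdc95(sem_diff_value, z=1.96):
    """Minimal detectable change at the 95% confidence level."""
    return z * sem_diff_value


# Hypothetical test-retest scores for six patients on two occasions.
retest = [[2.1, 2.3], [3.4, 3.2], [1.8, 2.0], [4.0, 3.7], [2.9, 3.1], [3.6, 3.5]]
print(f"ICC(2,1) = {icc_2_1(retest):.3f}")
print(f"MDC95 for a difference SE of 0.36 = {mdc95(0.36):.2f}")
```

Applied to the reported standard error of the difference of 0.36, the last formula gives 1.96 × 0.36 ≈ 0.71, which is in line with the reported MDC95 of 0.7.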
Jensen, Christian Gaden; Niclasen, Janni; Vangkilde, Signe Allerup; Petersen, Anders; Hasselbalch, Steen Gregers
2016-05-01
The Mindful Attention Awareness Scale (MAAS) measures perceived degree of inattentiveness in different contexts and is often used as a reversed indicator of mindfulness. MAAS is hypothesized to reflect a psychological trait or disposition when used outside attentional training contexts, but the long-term test-retest reliability of MAAS scores is virtually untested. It is unknown whether MAAS predicts psychological health after controlling for standardized socioeconomic status classifications. First, MAAS translated to Danish was validated psychometrically within a randomly invited healthy adult community sample (N = 490). Factor analysis confirmed that MAAS scores quantified a unifactorial construct of excellent composite reliability and consistent convergent validity. Structural equation modeling revealed that MAAS scores contributed independently to predicting psychological distress and mental health, after controlling for age, gender, income, socioeconomic occupational class, stressful life events, and social desirability (β = 0.32-.42, ps < .001). Second, MAAS scores showed satisfactory short-term test-retest reliability in 100 retested healthy university students. Finally, MAAS sample mean scores as well as individuals' scores demonstrated satisfactory test-retest reliability across a 6-month interval in the adult community (retested N = 407), intraclass correlations ≥ .74. MAAS scores displayed significantly stronger long-term test-retest reliability than scores measuring psychological distress (z = 2.78, p = .005). Test-retest reliability estimates did not differ within demographic and socioeconomic strata. Scores on the Danish MAAS were psychometrically validated in healthy adults. MAAS's inattentiveness scores reflected a unidimensional construct, long-term reliable disposition, and a factor of independent significance for predicting psychological health. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
Nazary-Moghadam, Salman; Zeinalzadeh, Afsaneh; Salavati, Mahyar; Almasi, Simin; Negahban, Hossein
2017-01-01
The aim of the present study was to culturally adapt and evaluate the reliability and validity of the Health Assessment Questionnaire-Disability Index (HAQ-DI) in Iranian patients with rheumatoid arthritis (RA). A total of 234 patients with RA took part in the validation study and 86 participants in the reliability study. Test-retest relative reliability and internal consistency of the Persian version of the HAQ-DI (PHAQ-DI) were examined by intraclass correlation coefficient (ICC) and Cronbach's alpha, respectively. Additionally, HAQ-DI construct validity (Spearman's correlation) was examined using the Persian version of the Short-Form 36 Health Survey (SF-36) and activity and severity parameters. The Persian version of the HAQ-DI total score showed excellent test-retest reliability (ICC = 0.98) and internal consistency (Cronbach's alpha = 0.95). Spearman's correlations between the total PHAQ-DI score and activity and severity parameters were above 0.55. The correlation between PHAQ-DI and SF-36 Physical Health was higher than that with SF-36 Mental Health. The Persian version of the HAQ-DI is a reliable and valid culturally adapted instrument for measuring functional limitations in Iranian people with RA. Copyright © 2016 Elsevier Ltd. All rights reserved.
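Several of the abstracts in this list report internal consistency as Cronbach's alpha. As a rough illustration of what that statistic summarizes, the sketch below computes alpha from a respondents-by-items score table using the usual variance formula. The function name and the toy scores are hypothetical and are not taken from any of the studies above.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a table of scores.

    scores: list of respondents, each a list of k item scores
    (hypothetical input layout chosen for this sketch).
    """
    k = len(scores[0])
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    # Sample variance of each item (column) across respondents.
    item_var_sum = sum(variance(col) for col in zip(*scores))
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Toy data: 3 respondents, 2 items (illustrative values only).
toy = [[2, 3], [4, 4], [6, 8]]
alpha = cronbach_alpha(toy)
```

Alpha rises when items vary together across respondents, which is why it is read as internal consistency rather than as stability over time; test-retest reliability needs repeated administrations and a separate statistic such as the ICC.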
Matrisch, M; Trampisch, U; Klaassen-Mielke, R; Pientka, L; Trampisch, H J; Thiem, U
2012-04-01
To assess cognitive impairment or dementia in epidemiologic studies using telephone interviews for data acquisition, valid, reliable and short instruments suitable for telephone administration are required. For the Telephone Interview for Cognitive Status (TICS) in its modified German version, the only instrument used in Germany so far, more data on reliability and practicability are needed. Participants were recruited in the offices of nine primary care physicians. Data from 197 participants (115 females, mean age 78.5±4.1 years) who were tested by telephone and in the office by means of the Mini-Mental State Examination (MMSE) were used for the evaluation. For assessing reliability, a group of 91 participants (55 females, mean age 78.1±4.1 years) was contacted twice during 30 days to be tested during a telephone interview by means of the TICS in its modified German version. The intraclass correlation coefficient (ICC), a measure of reliability, was 0.67 [95% confidence interval (CI): 0.53; 0.77]. The Bland-Altman plot did not reveal any relationship between the variability of the difference between repeated measures and the total amount of the measure. For the overall TICS score, no differences were found between repeated measurements. However, the tasks recall of the word list and counting backwards showed some improvement in the repeated tests. TICS and MMSE showed only moderate correlation, with a correlation coefficient of 0.48 (95% CI: 0.36; 0.58). TICS values were dependent on age and educational level of the person tested. The TICS in its modified German version appears to be of acceptable reliability for the assessment of cognitive impairment during a telephone interview. TICS values depend on age and educational level of the person tested. TICS and MMSE correlate only moderately.
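The Bland-Altman analysis mentioned in the abstract above compares repeated measurements through their differences. A minimal sketch of the standard calculation (mean difference plus 95% limits of agreement) follows; the function name and the scores are made up for illustration and are not the study's data.

```python
from statistics import mean, stdev


def bland_altman(first, second):
    """Return (bias, lower limit, upper limit) for paired measurements.

    bias is the mean of (second - first); the limits of agreement are
    bias +/- 1.96 standard deviations of the differences.
    """
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equally long series with >= 2 pairs")
    diffs = [b - a for a, b in zip(first, second)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical test and retest scores for four participants.
test_scores = [30, 32, 28, 35]
retest_scores = [31, 32, 30, 36]
bias, lower, upper = bland_altman(test_scores, retest_scores)
```

In a full Bland-Altman plot the differences are drawn against the mean of each pair, which is how one checks whether disagreement grows with the size of the score.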
Haugum, Mona; Iversen, Hilde Hestad; Bjertnaes, Oyvind; Lindahl, Anne Karin
2017-02-20
Patient experiences are an important aspect of health care quality, but there is a lack of validated instruments for their measurement in the substance dependence literature. A new questionnaire to measure inpatients' experiences of interdisciplinary treatment for substance dependence has been developed in Norway. The aim of this study was to psychometrically test the new questionnaire, using data from a national survey in 2013. The questionnaire was developed based on a literature review, qualitative interviews with patients, expert group discussions and pretesting. Data were collected in a national survey covering all residential facilities with inpatients in treatment for substance dependence in 2013. Data quality and psychometric properties were assessed, including ceiling effects, item missing, exploratory factor analysis, and tests of internal consistency reliability, test-retest reliability and construct validity. The sample included 978 inpatients present at 98 residential institutions. After correcting for excluded patients (n = 175), the response rate was 91.4%. 28 out of 33 items had less than 20.5% of missing data or replies in the "not applicable" category. All but one item met the ceiling effect criterion of less than 50.0% of the responses in the most favorable category. Exploratory factor analysis resulted in three scales: "treatment and personnel", "milieu" and "outcome". All scales showed satisfactory internal consistency reliability (Cronbach's alpha ranged from 0.75-0.91) and test-retest reliability (ICC ranged from 0.82-0.85). 17 of 18 significant associations between single variables and the scales supported construct validity of the PEQ-ITSD. The content validity of the PEQ-ITSD was secured by a literature review, consultations with an expert group and qualitative interviews with patients. 
The PEQ-ITSD was used in a national survey in Norway in 2013 and psychometric testing showed that the instrument had satisfactory internal consistency reliability and construct validity.
Indrebø, Kirsten Lerum; Andersen, John Roger; Natvig, Gerd Karin
2014-01-01
The purpose of this study was to adapt the Ostomy Adjustment Scale to a Norwegian version and to assess its construct validity and 2 components of its reliability (internal consistency and test-retest reliability). One hundred fifty-eight of 217 patients (73%) with a colostomy, ileostomy, or urostomy participated in the study. Slightly more than half (56%) were men. Their mean age was 64 years (range, 26-91 years). All respondents had undergone ostomy surgery at least 3 months before participation in the study. The Ostomy Adjustment Scale was translated into Norwegian according to standard procedures for forward and backward translation. The questionnaire was sent to the participants via regular post. The Cronbach alpha and test-retest were computed to assess reliability. Construct validity was evaluated via correlations between each item and score sums; correlations were used to analyze relationships between the Ostomy Adjustment Scale and the 36-item Short Form Health Survey, the Quality of Life Scale, the Hospital Anxiety & Depression Scale, and the General Self-Efficacy Scale. The Cronbach alpha was 0.93, and test-retest reliability r was 0.69. The average correlation quotient item to sum score was 0.49 (range, 0.31-0.73). Results showed moderate negative correlations between the Ostomy Adjustment Scale and the Hospital Anxiety and Depression Scale (-0.37 and -0.40), and moderate positive correlations between the Ostomy Adjustment Scale and the 36-item Short Form Health Survey, the Quality of Life Scale, and the General Self-Efficacy Scale (0.30-0.45) with the exception of the pain domain in the Short Form 36 (0.28). Regression analysis showed linear associations between the Ostomy Adjustment Scale and sociodemographic and clinical variables with the exception of education. The Norwegian language version of the Ostomy Adjustment Scale was found to possess construct validity, along with internal consistency and test-retest reliability. 
The instrument is sensitive to sociodemographic and clinical variables pertinent to persons with urostomies, colostomies, and ileostomies.
Unreliability as a Threat to Understanding Psychopathology: The Cautionary Tale of Attentional Bias
Rodebaugh, Thomas L.; Scullin, Rachel B.; Langer, Julia K.; Dixon, David J.; Huppert, Jonathan D.; Bernstein, Amit; Zvielli, Ariel; Lenze, Eric J.
2016-01-01
The use of unreliable measures constitutes a threat to our understanding of psychopathology, because advancement of science using both behavioral and biologically-oriented measures can only be certain if such measurements are reliable. Two pillars of NIMH’s portfolio – the Research Domain Criteria (RDoC) initiative for psychopathology and the target engagement initiative in clinical trials – cannot succeed without measures that possess the high reliability necessary for tests involving mediation and selection based on individual differences. We focus on the historical lack of reliability of attentional bias measures as an illustration of how reliability can pose a threat to our understanding. Our own data replicate previous findings of poor reliability for traditionally-used scores, which suggests a serious problem with the ability to test theories regarding attentional bias. This lack of reliability may also suggest problems with the assumption (in both theory and the formula for the scores) that attentional bias is consistent and stable across time. In contrast, measures accounting for attention as a dynamic process in time show good reliability in our data. The field is sorely in need of research reporting findings and reliability for attentional bias scores using multiple methods, including those focusing on dynamic processes over time. We urge researchers to test and report reliability of all measures, considering findings of low reliability not just as a nuisance but as an opportunity to modify and improve upon the underlying theory. Full assessment of reliability of measures will maximize the possibility that RDoC (and psychological science more generally) will succeed. PMID:27322741
The analysis of reliability and validity of the IT-MAIS, MAIS and MUSS.
Zhong, Yan; Xu, Tianqiu; Dong, Ruijuan; Lyu, Jing; Liu, Bo; Chen, Xueqing
2017-05-01
The aim of this study was to investigate the reliability and validity of the Infant-toddler Meaningful Auditory Integration Scale (IT-MAIS), Meaningful Auditory Integration Scale (MAIS), and Meaningful Use of Speech Scale (MUSS). IT-MAIS, MAIS and MUSS were each divided into 3 sub-dimensions. 300 children with cochlear implants (CI) were included in the investigation. To assess test-retest reliability of these questionnaires, 30 children were selected randomly and evaluated twice at a two-week interval; the results indicated no significant changes between test and retest. Furthermore, a random test analysis by different evaluators was also administered to 30 users. Reliability test: Test-retest reliability of the three scales proved to be satisfactory. All domains had correlation coefficients that exceeded 0.750 (P < 0.01). The Cronbach's α values of the three scales and their three domains were greater than 0.700. Reliability between evaluators of the three scales was considered satisfactory. All domains had correlation coefficients that exceeded 0.750 (P < 0.01). Validity test: The evaluation of content validity by expert review showed that the questionnaires had good content validity; the correlation coefficients between the overall scores of the three scales and their three domains were 0.699-0.978 (P < 0.01). There were correlations among the three sub-domains, but the strength of the correlations was relatively low. The scales showed a certain degree of construct validity. The IT-MAIS, MAIS and MUSS scales have good reliability and validity, and can be used to evaluate hearing and speech outcomes in children with cochlear implants. Copyright © 2017 Elsevier B.V. All rights reserved.
Validity and reliability of a video questionnaire to assess physical function in older adults.
Balachandran, Anoop; N Verduin, Chelsea; Potiaumpai, Melanie; Ni, Meng; Signorile, Joseph F
2016-08-01
Self-report questionnaires are widely used to assess physical function in older adults. However, they often lack a clear frame of reference and hence interpreting and rating task difficulty levels can be problematic for the responder. Consequently, the usefulness of traditional self-report questionnaires for assessing higher-level functioning is limited. Video-based questionnaires can overcome some of these limitations by offering a clear and objective visual reference for the performance level against which the subject is to compare his or her perceived capacity. Hence the purpose of the study was to develop and validate a novel, video-based questionnaire to assess physical function in older adults independently living in the community. A total of 61 community-living adults, 60 years or older, were recruited. To examine validity, 35 of the subjects completed the video questionnaire and two types of physical performance tests: a test of instrumental activity of daily living (IADL) included in the Short Physical Functional Performance battery (PFP-10), and a composite of 3 performance tests (30 s chair stand, single-leg balance and usual gait speed). To ascertain reliability, two-week test-retest reliability was assessed in the remaining 26 subjects who did not participate in validity testing. The video questionnaire showed a moderate correlation with the IADLs (Spearman rho=0.64, p<0.001; 95% CI (0.4, 0.8)), and a lower correlation with the composite score of physical performance tests (Spearman rho=0.49, p<0.01; 95% CI (0.18, 0.7)). The test-retest assessment yielded an intra-class correlation (ICC) of 0.87 (p<0.001; 95% CI (0.70, 0.94)) and a Cronbach's alpha of 0.89, demonstrating good reliability and internal consistency. Our results show that the video questionnaire developed to evaluate physical function in community-living older adults is a valid and reliable assessment tool; however, further validation is needed for definitive conclusions.
Copyright © 2016 Elsevier Inc. All rights reserved.
Psychometric Properties of the Persian Version of the Simple Shoulder Test (SST) Questionnaire.
Ebrahimzadeh, Mohammad H; Vahedi, Ehsan; Baradaran, Aslan; Birjandinejad, Ali; Seyyed-Hoseinian, Seyyed-Hadi; Bagheri, Farshid; Kachooei, Amir Reza
2016-10-01
To validate the Persian version of the Simple Shoulder Test (SST) in patients with shoulder joint problems. Following Beaton's guideline, translation and back translation were conducted, and a consensus was reached on the Persian version of the SST. To test face validity in a pilot study, the Persian SST was administered to 20 individuals with shoulder joint conditions. We enrolled 148 consecutive patients with shoulder problems to complete the Persian SST, a shoulder-specific measure (the Oxford Shoulder Score, OSS) and two general measures (DASH and SF-36). To measure test-retest reliability, 42 randomly selected patients were asked to complete the Persian SST a second time after one week. Cronbach's alpha coefficient was used to demonstrate internal consistency over the 12 items of the Persian SST. ICC for the total questionnaire was 0.61, showing good and acceptable test-retest reliability. ICC for individual items ranged from 0.32 to 0.79. The total Cronbach's alpha was 0.84, showing good internal consistency over the 12 items of the Persian SST. Validity testing showed strong correlation between SST and both OSS and DASH. The correlation with OSS was positive, while the correlation with DASH scores was negative. The correlation was also good to strong with all physical and most mental subscales of the SF-36. The correlation coefficient was higher with DASH and OSS than with SF-36. The Persian version of the SST was found to be a valid and reliable instrument for shoulder joint pain and function assessment in the Iranian population.
Kloos, Anne D; Fritz, Nora E; Kostyk, Sandra K; Young, Gregory S; Kegelmeyer, Deb A
2014-09-01
Individuals with Huntington's disease (HD) experience balance and gait problems that lead to falls. Clinicians currently have very little information about the reliability and validity of outcome measures to determine the efficacy of interventions that aim to reduce balance and gait impairments in HD. This study examined the reliability and concurrent validity of spatiotemporal gait measures, the Tinetti Mobility Test (TMT), Four Square Step Test (FSST), and Activities-specific Balance Confidence (ABC) Scale in individuals with HD. Participants with HD [n = 20; mean age ± SD=50.9 ± 13.7; 7 male] were tested on spatiotemporal gait measures and the TMT, FSST, and ABC Scale before and after a six week period to determine test-retest reliability and minimal detectable change (MDC) values. Linear relationships between gait and clinical measures were estimated using Pearson's correlation coefficients. Spatiotemporal gait measures, the TMT total and the FSST showed good to excellent test-retest reliability (ICC > 0.75). MDC values were 0.30 m/s and 0.17 m/s for velocity in forward and backward walking respectively, four points for the TMT, and 3s for the FSST. The TMT and FSST were highly correlated with most spatiotemporal measures. The ABC Scale demonstrated lower reliability and less concurrent validity than other measures. The high test-retest reliability over a six week period and concurrent validity between the TMT, FSST, and spatiotemporal gait measures suggest that the TMT and FSST may be useful outcome measures for future intervention studies in ambulatory individuals with HD. Copyright © 2014 Elsevier B.V. All rights reserved.
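The abstract above reports minimal detectable change (MDC) values but does not say how they were derived. One widely used convention, assumed here purely for illustration, obtains the standard error of measurement (SEM) from the between-subject standard deviation and the ICC, then scales it to a 95% confidence level. The function names and inputs are hypothetical.

```python
import math


def standard_error_of_measurement(sd, icc):
    """SEM = SD * sqrt(1 - ICC), a common convention (assumed here)."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie between 0 and 1 for this formula")
    return sd * math.sqrt(1.0 - icc)


def mdc95(sd, icc):
    """MDC at 95% confidence: 1.96 * sqrt(2) * SEM.

    The sqrt(2) factor reflects that a change score combines the
    measurement error of two separate test occasions.
    """
    return 1.96 * math.sqrt(2.0) * standard_error_of_measurement(sd, icc)


# Illustrative inputs only: SD of 10 units and an ICC of 0.75.
example_mdc = mdc95(10.0, 0.75)
```

Under this convention a higher ICC shrinks the SEM and therefore the MDC, so more reliable measures can detect smaller real changes in an individual.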
Normative Data for an Instrumental Assessment of the Upper-Limb Functionality
Caimmi, Marco; Guanziroli, Eleonora; Malosio, Matteo; Pedrocchi, Nicola; Vicentini, Federico; Molinari Tosatti, Lorenzo; Molteni, Franco
2015-01-01
Upper-limb movement analysis is important for objectively monitoring rehabilitation interventions, contributing to improved overall treatment outcomes. Simple, fast, easy-to-use, and applicable methods are required to allow routine functional evaluation of patients with different pathologies and clinical conditions. This paper describes the Reaching and Hand-to-Mouth Evaluation Method, a fast procedure to assess upper-limb motor control and functional ability, and provides a set of normative data from 42 healthy subjects of different ages, evaluated for both dominant and nondominant limb motor performance. Sixteen of them were reevaluated after two weeks to perform test-retest reliability analysis. Data were clustered into three subgroups of different ages to test the method's sensitivity to motor control differences. Experimental data show notable test-retest reliability in all tasks. Data from older and younger subjects show significant differences in the measures related to the ability for coordination, thus showing the high sensitivity of the method to motor control differences. The presented method, provided with control data from healthy subjects, appears to be a suitable and reliable tool for upper-limb functional assessment in the clinical environment. PMID:26539500
Establishing the validity and reliability of the Project Talent Personality Inventory
Pozzebon, Julie; Damian, Rodica I.; Hill, Patrick L.; Lin, Yuchen; Lapham, Susan; Roberts, Brent W.
2013-01-01
Project Talent is a national longitudinal study that started in 1960. The original sample included over 440,000 students, which amounted to a 5% representative sample of high school students across the United States. Previous research has not yet established the validity and reliability of the personality measure used in this study, that is, the Project Talent Personality Inventory (PTPI). Given the potential interest and use of the PTPI in forthcoming research, the goals of the present paper were to establish (a) the construct and predictive validity and (b) the internal consistency and test-retest reliability of the PTPI. This information will be valuable to researchers who might be interested in using the PTPI to predict life course outcomes, such as mortality, occupational success, relationship success, and health. Study 1 found that the 10 sub-scales of the PTPI showed good internal consistency reliability, as well as good construct and predictive validity. With the use of several modern personality measures, we showed how the 10 PTPI scales can be mapped onto the Big Five personality traits, and we examined their relations with health, well-being, and life satisfaction outcomes. Study 2 found that the 10 PTPI scales showed good test-retest reliability. Together, these findings allow researchers to better understand and use the PTPI scales, as they are available in Project Talent. PMID:24399984
Conceição, Cristiano Sena da; Neto, Mansueto Gomes; Neto, Anolino Costa; Mendes, Selena M D; Baptista, Abrahão Fontes; Sá, Kátia Nunes
2016-01-01
To test the reliability and validity of the AOFAS scale in a sample of rheumatoid arthritis patients. The scale was administered to rheumatoid arthritis patients twice by interviewer 1 and once by interviewer 2. The AOFAS was subjected to test-retest reliability analysis (with 20 rheumatoid arthritis subjects). The psychometric properties were investigated using Rasch analysis on 33 rheumatoid arthritis patients. Intra-Class Correlation Coefficient (ICC) were (0.90
[Evaluation of Suicide Risk Levels in Hospitals: Validity and Reliability Tests].
Macagnino, Sandro; Steinert, Tilman; Uhlmann, Carmen
2018-05-01
Examination of in-hospital suicide risk levels concerning their validity and their reliability. The internal suicide risk levels were evaluated in a cross-sectional study of 163 inpatients. A reliability check was performed by determining the interrater reliability of the senior physician, the therapist and the responsible nurse. Within the scope of the validity check, we conducted analyses of criterion validity and construct validity. For the total sample, an "acceptable" to "good" interrater reliability (Kendall's W = .77) of suicide risk levels was obtained. Schizophrenic disorders showed the lowest values; for personality disorders we found the highest level of interrater reliability. When examining criterion validity, Item 9 of the BDI-II was substantially correlated with our suicide risk levels (ρ m = .54, p < .01). Within the scope of the construct validity check, affective disorders showed the highest correlation (ρ = .77), compatible also with "convergent validity". They differed from schizophrenic disorders, which showed the least concordance (ρ = .43). In-hospital suicide risk levels may represent an important contribution to the assessment of suicidal behavior of inpatients experiencing psychiatric treatment due to their overall good validity and reliability. © Georg Thieme Verlag KG Stuttgart · New York.
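Kendall's W, used in the abstract above for agreement among three raters, can be sketched as follows. This is a minimal illustration with average ranks and the usual tie correction, which matters because ordinal risk levels produce many tied ratings. The function names and the toy ratings are hypothetical, and the study's exact computation is not described in the abstract.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank.

    Returns (ranks, tie_term), where tie_term is the sum of t**3 - t
    over each group of t tied values.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    tie_term = 0
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1
    return ranks, tie_term


def kendalls_w(ratings):
    """Kendall's coefficient of concordance with tie correction.

    ratings: one list per rater, each holding that rater's scores for
    the same n subjects in the same order.
    """
    m, n = len(ratings), len(ratings[0])
    ranked = [average_ranks(r) for r in ratings]
    totals = [sum(ranks[i] for ranks, _ in ranked) for i in range(n)]
    mean_total = m * (n + 1) / 2
    s = sum((t - mean_total) ** 2 for t in totals)
    ties = sum(tie for _, tie in ranked)
    denominator = m ** 2 * (n ** 3 - n) - m * ties
    if denominator == 0:
        raise ValueError("W is undefined when every rater ties all subjects")
    return 12 * s / denominator


# Hypothetical risk levels from three raters for four patients.
w = kendalls_w([[1, 1, 2, 3], [1, 1, 2, 3], [1, 2, 2, 3]])
```

W ranges from 0 (no agreement in rank order) to 1 (identical rank orders across raters).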
Wang, Jin-Hui; Zuo, Xi-Nian; Gohel, Suril; Milham, Michael P.; Biswal, Bharat B.; He, Yong
2011-01-01
Graph-based computational network analysis has proven a powerful tool to quantitatively characterize functional architectures of the brain. However, the test-retest (TRT) reliability of graph metrics of functional networks has not been systematically examined. Here, we investigated TRT reliability of topological metrics of functional brain networks derived from resting-state functional magnetic resonance imaging data. Specifically, we evaluated both short-term (<1 hour apart) and long-term (>5 months apart) TRT reliability for 12 global and 6 local nodal network metrics. We found that reliability of global network metrics was overall low, threshold-sensitive and dependent on several factors: scanning time interval (TI, long-term>short-term), network membership (NM, networks excluding negative correlations>networks including negative correlations) and network type (NT, binarized networks>weighted networks). The dependence was modulated by another factor, the node definition (ND) strategy. The local nodal reliability exhibited large variability across nodal metrics and a spatially heterogeneous distribution. Nodal degree was the most reliable metric and varied the least across the factors above. Hub regions in association and limbic/paralimbic cortices showed moderate TRT reliability. Importantly, nodal reliability was robust to the above-mentioned four factors. Simulation analysis revealed that global network metrics were extremely sensitive (but to varying degrees) to noise in functional connectivity, and that weighted networks generated numerically more reliable results compared with binarized networks. Nodal network metrics showed high resistance to noise in functional connectivity, and no NT-related differences were found in the resistance. These findings provide important implications on how to choose reliable analytical schemes and network metrics of interest. PMID:21818285
The De-Escalating Aggressive Behaviour Scale: development and psychometric testing.
Nau, Johannes; Halfens, Ruud; Needham, Ian; Dassen, Theo
2009-09-01
This paper is a report of a study to develop and test the psychometric properties of a scale measuring nursing students' performance in de-escalation of aggressive behaviour. Successful training should lead not merely to more knowledge and amended attitudes but also to improved performance. However, the quality of de-escalation performance is difficult to assess. Based on a qualitative investigation, seven topics pertaining to de-escalating behaviour were identified and the wording of items tested. The properties of the items and the scale were investigated quantitatively. A total of 1748 performance evaluations by students (rater group 1) from a skills laboratory were used to check distribution and conduct a factor analysis. Likewise, 456 completed evaluations by de-escalation experts (rater group 2) of videotaped performances at pre- and posttest were used to investigate internal consistency, interrater reliability, test-retest reliability, effect size and factor structure. Data were collected in 2007-2008 in German. Factor analysis showed a unidimensional 7-item scale with factor loadings ranging from 0.55 to 0.81 (rater group 1) and 0.48 to 0.88 (rater group 2). Cronbach's alphas of 0.87 and 0.88 indicated good internal consistency irrespective of rater group. A Pearson's r of 0.80 confirmed acceptable test-retest reliability, and interrater reliability Intraclass Correlation 3 ranging from 0.77 to 0.93 also showed acceptable results. The effect size r of 0.53 plus Cohen's d of 1.25 indicates the capacity of the scale to detect changes in performance. Further research is needed to test the English version of the scale and its validity.
[*C]octanoic acid breath test to measure gastric emptying rate of solids.
Maes, B D; Ghoos, Y F; Rutgeerts, P J; Hiele, M I; Geypens, B; Vantrappen, G
1994-12-01
We have developed a breath test to measure solid gastric emptying using a standardized scrambled egg test meal (250 kcal) labeled with [14C]octanoic acid or [13C]octanoic acid. In vitro incubation studies showed that octanoic acid is a reliable marker of the solid phase. The breath test was validated in 36 subjects by simultaneous radioscintigraphic and breath test measurements. Nine healthy volunteers were studied after intravenous administration of 200 mg erythromycin and peroral administration of 30 mg propantheline, respectively. Erythromycin significantly enhanced gastric emptying, while propantheline significantly reduced gastric emptying rates. We conclude that the [*C]octanoic breath test is a promising and reliable test for measuring the gastric emptying rate of solids.
Milanović, Zoran; Pantelić, Saša; Trajković, Nebojša; Jorgić, Bojan; Sporiš, Goran; Bratić, Milovan
2014-01-01
The purpose of this study was to determine the test-retest reliability of the International Physical Activity Questionnaire (IPAQ) for older adults in Serbia. Six hundred and sixty older adults (352 men, 53%; 308 women, 47%; mean age 67.65±5.76 years) participated in the study. To examine test-retest reliability, the participants were asked to complete the IPAQ on two occasions 2 weeks apart. Moderate reliability was observed between the repeated IPAQ, with intraclass correlation coefficients ranging from 0.53 to 0.91. The least reliability was established in leisure time activity (0.53) and the most reliability in the transport domain (0.91). Men and women had similar intraclass correlation coefficients for total physical activity (0.71 versus 0.74, respectively), while the biggest difference was obtained for housework in men (0.68) and in women (0.90). Our study shows that the long version of the IPAQ is a reliable instrument for assessing physical activity levels in older adults and that it may be useful for generating internationally comparable data.
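Many entries in this list summarize test-retest reliability with an intraclass correlation coefficient, usually without naming the ICC form. As a hedged illustration, the sketch below computes two common single-measure forms from a subjects-by-sessions table using the Shrout and Fleiss mean-square formulas: ICC(2,1) for absolute agreement and ICC(3,1) for consistency. The function name and data are hypothetical.

```python
def icc_single(table):
    """Return (ICC(2,1), ICC(3,1)) for a subjects-by-sessions table.

    table: list of n subjects, each a list of k repeated scores.
    """
    n, k = len(table), len(table[0])
    if n < 2 or k < 2:
        raise ValueError("need at least 2 subjects and 2 sessions")
    grand = sum(sum(row) for row in table) / (n * k)
    row_means = [sum(row) / k for row in table]
    col_means = [sum(row[j] for row in table) / n for j in range(k)]
    # Mean squares for subjects (rows), sessions (columns), residual error.
    ms_rows = k * sum((r - grand) ** 2 for r in row_means) / (n - 1)
    ms_cols = n * sum((c - grand) ** 2 for c in col_means) / (k - 1)
    ss_err = sum(
        (table[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(n)
        for j in range(k)
    )
    ms_err = ss_err / ((n - 1) * (k - 1))
    icc21 = (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )
    icc31 = (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err)
    return icc21, icc31


# Toy data: every subject scores exactly 1 point higher at retest.
agreement, consistency = icc_single([[1, 2], [3, 4], [5, 6]])
```

With these toy scores the consistency form is 1 while the absolute-agreement form is lower, because a uniform shift between sessions preserves the ordering of subjects but not their actual values. This is one reason ICCs from different studies are hard to compare unless the form is reported.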
Kim, Min-Beom; Ban, Jae Ho
2012-12-01
To evaluate the test-retest reliability and convenience of simultaneous binaural acoustic-evoked ocular vestibular evoked myogenic potentials (oVEMP). Thirteen healthy subjects with no history of ear diseases participated in this study. All subjects underwent oVEMP test with both separated monaural acoustic stimulation and simultaneous binaural acoustic stimulation. For evaluating test-retest reliability, three repetitive sessions were performed in each ear for calculating the intraclass correlation coefficient (ICC) for both monaural and binaural tests. We analyzed data from the biphasic n1-p1 complex, such as latency of peak, inter-peak amplitude, and asymmetric ratio of amplitude in both ears. Finally, we checked the total time required to complete each test for evaluating test convenience. No significant difference was observed in amplitude and asymmetric ratio in comparison between monaural and binaural oVEMP. However, latency was slightly delayed in binaural oVEMP. In test-retest reliability analysis, binaural oVEMP showed excellent ICC values ranging from 0.68 to 0.98 in latency, asymmetric ratio, and inter-peak amplitude. Additionally, the test time was shorter in binaural than monaural oVEMP. oVEMP elicited from binaural acoustic stimulation yields similar satisfactory results as monaural stimulation. Further, excellent test-retest reliability and shorter test time were achieved in binaural than in monaural oVEMP.
Brett, Benjamin L; Solomon, Gary S
2017-04-01
Research findings to date on the stability of Immediate Post-Concussion Assessment and Cognitive Testing (ImPACT) Composite scores have been inconsistent, requiring further investigation. The use of test validity criteria across these studies also has been inconsistent. Using multiple measures of stability, we examined test-retest reliability of repeated ImPACT baseline assessments in high school athletes across various validity criteria reported in previous studies. A total of 1146 high school athletes completed baseline cognitive testing using the online ImPACT test battery at two time points approximately two years apart. No participant sustained a concussion between assessments. Five forms of validity criteria used in previous test-retest studies were applied to the data, and differences in reliability were compared. Intraclass correlation coefficients (ICCs) ranged in composite scores from .47 (95% confidence interval, CI [.38, .54]) to .83 (95% CI [.81, .85]) and showed little change across a two-year interval for all five sets of validity criteria. Regression-based methods (RBMs) examining the test-retest stability demonstrated a lack of significant change in composite scores across the two-year interval for all forms of validity criteria, with no cases falling outside the expected range of 90% confidence intervals. The application of more stringent validity criteria does not alter test-retest reliability, nor does it account for some of the variation observed across previously performed studies. As such, the ImPACT manual validity criteria should be used in the determination of test validity and in the individualized approach to concussion management. Potential future efforts to improve test-retest reliability are discussed.
Psychometric evaluation of a motor control test battery of the craniofacial region.
von Piekartz, H; Stotz, E; Both, A; Bahn, G; Armijo-Olivo, S; Ballenberger, N
2017-12-01
The primary objective of this study was to determine the structural and known-group validity as well as the inter-rater reliability of a test battery to evaluate the motor control of the craniofacial region. Seventy volunteers without TMD and 25 subjects with TMD (Axes I) per the DC/TMD were asked to execute a test battery consisting of eight tests. The tests were video-taped in the same sequence in a standardised manner. Two experienced physical therapists participated in this study as blinded assessors. We used exploratory factor analysis to identify the underlying component structure of the eight tests. Internal consistency (Cronbach's α), inter-rater reliability (intra-class correlation coefficient) and construct validity (ie, hypothesis testing-known-group validity) (receiver operating curves) were also explored for the test battery. The structural validity showed the presence of one factor underlying the construct of the test battery. The internal consistency was excellent (0.90) as well as the inter-rater reliability. All values of reliability were close to 0.9 or above indicating very high inter-rater reliability. The area under the curve (AUC) was 0.93 for rater 1 and 0.94 for rater two, respectively, indicating excellent discrimination between subjects with TMD and healthy controls. The results of the present study support the psychometric properties of test battery to measure motor control of the craniofacial region when evaluated through videotaping. This test battery could be used to differentiate between healthy subjects and subjects with musculoskeletal impairments in the cervical and oro-facial regions. In addition, this test battery could be used to assess the effectiveness of management strategies in the craniofacial region. © 2017 John Wiley & Sons Ltd.
Reliability of doming and toe flexion testing to quantify foot muscle strength.
Ridge, Sarah Trager; Myrer, J William; Olsen, Mark T; Jurgensmeier, Kevin; Johnson, A Wayne
2017-01-01
Quantifying the strength of the intrinsic foot muscles has been a challenge for clinicians and researchers. The reliable measurement of this strength is important in order to assess weakness, which may contribute to a variety of functional issues in the foot and lower leg, including plantar fasciitis and hallux valgus. This study reports 3 novel methods for measuring foot strength - doming (previously unmeasured), hallux flexion, and flexion of the lesser toes. Twenty-one healthy volunteers performed the strength tests during two testing sessions which occurred one to five days apart. Each participant performed each series of strength tests (doming, hallux flexion, and lesser toe flexion) four times during the first testing session (twice with each of two raters) and two times during the second testing session (once with each rater). Intra-class correlation coefficients were calculated to test for reliability for the following comparisons: between raters during the same testing session on the same day (inter-rater, intra-day, intra-session), between raters on different days (inter-rater, inter-day, inter-session), between days for the same rater (intra-rater, inter-day, inter-session), and between sessions on the same day by the same rater (intra-rater, intra-day, inter-session). ICCs showed good to excellent reliability for all tests between days, raters, and sessions. Average doming strength was 99.96 ± 47.04 N. Average hallux flexion strength was 65.66 ± 24.5 N. Average lateral toe flexion was 50.96 ± 22.54 N. These simple tests using relatively low cost equipment can be used for research or clinical purposes. If repeated testing will be conducted on the same participant, it is suggested that the same researcher or clinician perform the testing each time for optimal reliability.
Test-retest reliability of functional connectivity networks during naturalistic fMRI paradigms.
Wang, Jiahui; Ren, Yudan; Hu, Xintao; Nguyen, Vinh Thai; Guo, Lei; Han, Junwei; Guo, Christine Cong
2017-04-01
Functional connectivity analysis has become a powerful tool for probing human brain function and its breakdown in neuropsychiatric disorders. So far, most studies have adopted the resting-state paradigm to examine functional connectivity networks in the brain, thanks to its low demand and high tolerance, which are essential for clinical studies. However, the test-retest reliability of resting-state connectivity measures is moderate, potentially due to its low behavioral constraint. On the other hand, naturalistic neuroimaging paradigms, an emerging approach for cognitive neuroscience with high ecological validity, could potentially improve the reliability of functional connectivity measures. To test this hypothesis, we characterized the test-retest reliability of functional connectivity measures during a natural viewing condition, and benchmarked it against resting-state connectivity measures acquired within the same functional magnetic resonance imaging (fMRI) session. We found that the reliability of connectivity and graph theoretical measures of brain networks is significantly improved during natural viewing conditions over resting-state conditions, with an average increase of almost 50% across various connectivity measures. Not only do sensory networks for audio-visual processing become more reliable, but higher order brain networks, such as default mode and attention networks, also appear to show higher reliability during natural viewing. Our results support the use of natural viewing paradigms in estimating functional connectivity of brain networks, and have important implications for clinical application of fMRI. Hum Brain Mapp 38:2226-2241, 2017. © 2017 Wiley Periodicals, Inc.
Selcuk, Selcuk; Kucukbas, Mehmet; Cam, Cetin; Eser, Ahmet; Devranoglu, Belgin; Turkyilmaz, Sebnem; Karateke, Ates
2016-06-01
The Sexual Health Outcomes in Women Questionnaire (SHOW-Q) is designed to evaluate the sexual life of women for satisfaction, orgasm, desire, and pelvic problem interference. The SHOW-Q is important for evaluating worsening of sexual life for patients with pelvic problems and the management of these women to improve their sexual life. To validate the Turkish versions of the SHOW-Q for Turkish-speaking women. The Turkish version of the SHOW-Q was generated by two independent professional English-to-Turkish translators. The translated version of the SHOW-Q was reverse translated by two bilingual translators whose native language was English. Women with at least one symptom related to pelvic problems (n = 71) and those with no symptoms (n = 38) were included in the present study. Test-retest reliability analysis, content-face validity, internal consistency reliability, item-total correlations, convergent validity, construct validity, and factorial validity were performed to assess the psychometric properties of the Turkish versions of the SHOW-Q. Test-retest reliability demonstrated good correlation for all subscales. Cronbach α values ranged from 0.735 to 0.892 and indicated high internal consistency. There was a strong correlation for the corresponding subscales between the SHOW-Q and the Female Sexual Function Index. The mean score of each SHOW-Q subscale showed significant differences between symptomatic and asymptomatic patients. The Turkish version of the SHOW-Q is a valid and reliable instrument that can be used to evaluate the sexual life of Turkish-speaking women with different pelvic problems. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.
Assessing fear-avoidance beliefs in patients with cervical radiculopathy.
Dedering, Asa; Börjesson, Tina
2013-12-01
The study sought to evaluate validity and reliability of the Fear Avoidance Beliefs Questionnaire and the Tampa Scale for Kinesiophobia in patients with cervical radiculopathy. A test-retest design was used to test stability over time in 46 patients with cervical radiculopathy. Differences between patients and healthy subjects were also evaluated comparing the patients with 41 physically active and healthy subjects. The patients answered the Fear Avoidance Beliefs Questionnaire and the Tampa Scale for Kinesiophobia twice. To test for differences between the patients and the healthy subjects, the latter answered the same questionnaires once. Questionnaires about activity, personal factors and health were also used. The test-retest reliability assessed with weighted kappa was 0.68 for the Fear Avoidance Beliefs Questionnaire and 0.45 for the Tampa Scale for Kinesiophobia. Only six of the 11 single items of the Fear Avoidance Beliefs Questionnaire and none of the single items of the Tampa Scale of Kinesiophobia showed kappa coefficients exceeding 0.60 (good reliability). Patients with cervical radiculopathy rated significantly worse on the Fear Avoidance Beliefs Questionnaire and the Tampa Scale for Kinesiophobia than the healthy subjects did. The Fear Avoidance Beliefs Questionnaire may be recommended for test-retest evaluations because 'good' reliability was found. The Tampa Scale for Kinesiophobia had only 'moderate' test-retest reliability, and this should be considered when using this scale in test-retest evaluations. Both questionnaires can discriminate between patients with cervical radiculopathy and healthy subjects. Copyright © 2012 John Wiley & Sons, Ltd.
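The weighted kappa reported above can be illustrated with a minimal sketch. The abstract does not say which weighting scheme was used, so the linear/quadratic option below, the function name `weighted_kappa`, and the integer coding of answers are assumptions made for illustration only.

```python
def weighted_kappa(ratings_a, ratings_b, n_categories, weight="linear"):
    """Weighted Cohen's kappa for two sets of ordinal ratings.

    Ratings are assumed to be coded as integers 0..n_categories-1
    (an illustrative convention, not taken from the abstract).
    """
    n = len(ratings_a)
    if n == 0 or n != len(ratings_b):
        raise ValueError("need two equal-length, non-empty rating lists")
    k = n_categories
    if k < 2:
        raise ValueError("need at least two categories")

    # Joint proportions of (first rating, second rating).
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        observed[a][b] += 1.0 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def disagreement(i, j):
        # 0 on the diagonal, 1 for the most distant pair of categories.
        d = abs(i - j) / (k - 1)
        return d if weight == "linear" else d * d

    obs = sum(disagreement(i, j) * observed[i][j]
              for i in range(k) for j in range(k))
    exp = sum(disagreement(i, j) * row[i] * col[j]
              for i in range(k) for j in range(k))
    if exp == 0:
        raise ValueError("kappa is undefined when all ratings fall in one category")
    return 1.0 - obs / exp
```

With only two categories the weighting has no effect and the value reduces to ordinary Cohen's kappa, which is a convenient sanity check.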
Reliability and convergent validity of the five-step test in people with chronic stroke.
Ng, Shamay S M; Tse, Mimi M Y; Tam, Eric W C; Lai, Cynthia Y Y
2018-01-10
(i) To estimate the intra-rater, inter-rater and test-retest reliabilities of the Five-Step Test (FST), as well as the minimum detectable change in FST completion times in people with stroke. (ii) To estimate the convergent validity of the FST with other measures of stroke-specific impairments. (iii) To identify the best cut-off times for distinguishing FST performance in people with stroke from that of healthy older adults. A cross-sectional study. University-based rehabilitation centre. Forty-eight people with stroke and 39 healthy controls. None. The FST, along with (for the stroke survivors only) scores on the Fugl-Meyer Lower Extremity Assessment (FMA-LE), the Berg Balance Scale (BBS), Limits of Stability (LOS) tests, and Activities-specific Balance Confidence (ABC) scale were tested. The FST showed excellent intra-rater (intra-class correlation coefficient; ICC = 0.866-0.905), inter-rater (ICC = 0.998), and test-retest (ICC = 0.838-0.842) reliabilities. A minimum detectable change of 9.16 s was found for the FST in people with stroke. The FST correlated significantly with the FMA-LE, BBS, and LOS results in the forward and sideways directions (r = -0.411 to -0.716, p < 0.004). The FST completion time of 13.35 s was shown to discriminate reliably between people with stroke and healthy older adults. The FST is a reliable, easy-to-administer clinical test for assessing stroke survivors' ability to negotiate steps and stairs.
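A minimum detectable change is commonly derived from the standard error of measurement (SEM). The sketch below uses the widely used formulas SEM = SD x sqrt(1 - ICC) and MDC95 = 1.96 x sqrt(2) x SEM; the abstract does not state which formula the authors applied, and the function names are hypothetical.

```python
import math


def standard_error_of_measurement(sd, icc):
    """SEM = SD * sqrt(1 - ICC), with SD the between-subject standard deviation."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("ICC is expected to lie between 0 and 1")
    return sd * math.sqrt(1.0 - icc)


def minimum_detectable_change_95(sem):
    """MDC at the 95% confidence level: 1.96 * sqrt(2) * SEM.

    The sqrt(2) term accounts for measurement error in both the
    first and the second measurement.
    """
    return 1.96 * math.sqrt(2.0) * sem
```

For example, `minimum_detectable_change_95(standard_error_of_measurement(sd, icc))` gives the smallest change in completion time that exceeds measurement error under these assumptions.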
Y-balance test: a reliability study involving multiple raters.
Shaffer, Scott W; Teyhen, Deydre S; Lorenson, Chelsea L; Warren, Rick L; Koreerat, Christina M; Straseske, Crystal A; Childs, John D
2013-11-01
The Y-balance test (YBT) is one of the few field expedient tests that have shown predictive validity for injury risk in an athletic population. However, analysis of the YBT in a heterogeneous population of active adults (e.g., military, specific occupations) involving multiple raters with limited experience in a mass screening setting is lacking. The primary purpose of this study was to determine interrater test-retest reliability of the YBT in a military setting using multiple raters. Sixty-four service members (53 males, 11 females) actively conducting military training volunteered to participate. Interrater test-retest reliability of the maximal reach had intraclass correlation coefficients (2,1) of 0.80 to 0.85 with a standard error of measurement ranging from 3.1 to 4.2 cm for the 3 reach directions (anterior, posteromedial, and posterolateral). Interrater test-retest reliability of the average reach of 3 trials had intraclass correlation coefficients (2,3) ranging from 0.85 to 0.93, with an associated standard error of measurement ranging from 2.0 to 3.5 cm. The YBT showed good interrater test-retest reliability with an acceptable level of measurement error among multiple raters screening active duty service members. In addition, 31.3% (n = 20 of 64) of participants exhibited an anterior reach asymmetry of >4 cm, suggesting impaired balance symmetry and potentially increased risk for injury. Reprint & Copyright © 2013 Association of Military Surgeons of the U.S.
Valle, Susanne Collier; Støen, Ragnhild; Sæther, Rannei; Jensenius, Alexander Refsum; Adde, Lars
2015-10-01
A computer-based video analysis has recently been presented for quantitative assessment of general movements (GMs). This method's test-retest reliability, however, has not yet been evaluated. The aim of the current study was to evaluate the test-retest reliability of computer-based video analysis of GMs, and to explore the association between computer-based video analysis and the temporal organization of fidgety movements (FMs). Test-retest reliability study. 75 healthy, term-born infants were recorded twice the same day during the FMs period using a standardized video set-up. The computer-based movement variables "quantity of motion mean" (Qmean), "quantity of motion standard deviation" (QSD) and "centroid of motion standard deviation" (CSD) were analyzed, reflecting the amount of motion and the variability of the spatial center of motion of the infant, respectively. In addition, the association between the variable CSD and the temporal organization of FMs was explored. Intraclass correlation coefficients (ICC 1.1 and ICC 3.1) were calculated to assess test-retest reliability. The ICC values for the variables CSD, Qmean and QSD were 0.80, 0.80 and 0.86 for ICC (1.1), respectively; and 0.80, 0.86 and 0.90 for ICC (3.1), respectively. There were significantly lower CSD values in the recordings with continual FMs compared to the recordings with intermittent FMs (p<0.05). This study showed high test-retest reliability of computer-based video analysis of GMs, and a significant association between our computer-based video analysis and the temporal organization of FMs. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
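The two ICC forms named above differ in how systematic differences between recordings are treated. The following is a minimal sketch of the Shrout and Fleiss definitions of ICC(1,1) and ICC(3,1) computed from ANOVA mean squares; the function name and the list-of-rows input layout are assumptions for illustration, not the authors' software.

```python
def icc_single_measure(scores):
    """Return (ICC(1,1), ICC(3,1)) for a subjects x measurements table.

    `scores` is a list of rows, one row per subject, each holding that
    subject's k repeated measurements (k >= 2, at least 2 subjects).
    """
    n = len(scores)
    k = len(scores[0])
    if n < 2 or k < 2 or any(len(r) != k for r in scores):
        raise ValueError("need a rectangular table with n >= 2 and k >= 2")

    grand = sum(sum(r) for r in scores) / (n * k)
    row_means = [sum(r) / k for r in scores]
    col_means = [sum(r[j] for r in scores) / n for j in range(k)]

    ss_total = sum((x - grand) ** 2 for r in scores for x in r)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between measurements
    ss_within = ss_total - ss_rows
    ss_error = ss_within - ss_cols

    bms = ss_rows / (n - 1)                 # between-subjects mean square
    wms = ss_within / (n * (k - 1))         # within-subjects mean square
    ems = ss_error / ((n - 1) * (k - 1))    # residual mean square

    # One-way random effects: session differences count as error.
    icc_1_1 = (bms - wms) / (bms + (k - 1) * wms)
    # Two-way mixed effects, consistency: session differences are removed.
    icc_3_1 = (bms - ems) / (bms + (k - 1) * ems)
    return icc_1_1, icc_3_1
```

Because ICC(3,1) removes any systematic shift between the first and second recording, it is typically at least as large as ICC(1,1) on the same data, which matches the pattern of values reported in the abstract.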
Silva, Paula F. S.; Quintino, Ludmylla F.; Franco, Juliane; Faria, Christina D. C. M.
2014-01-01
Background Subjects with neurological disease (ND) usually show impaired performance during sit-to-stand and stand-to-sit tasks, with a consequent reduction in their mobility levels. Objective To determine the measurement properties and feasibility previously investigated for clinical tests that evaluate sit-to-stand and stand-to-sit in subjects with ND. Method A systematic literature review following the PRISMA (Preferred Reporting Items for Systematic reviews and Meta-Analyses) protocol was performed. Systematic literature searches of databases (MEDLINE/SCIELO/LILACS/PEDro) were performed to identify relevant studies. In all studies, the following inclusion criteria were assessed: investigation of any measurement property or the feasibility of clinical tests that evaluate sit-to-stand and stand-to-sit tasks in subjects with ND published in any language through December 2012. The COSMIN checklist was used to evaluate the methodological quality of the included studies. Results Eleven studies were included. The measurement properties/feasibility were most commonly investigated for the five-repetition sit-to-stand test, which showed good test-retest reliability (Intraclass Correlation Coefficient:ICC=0.94-0.99) for subjects with stroke, cerebral palsy and dementia. The ICC values were higher for this test than for the number of repetitions in the 30-s test. The five-repetition sit-to-stand test also showed good inter/intra-rater reliabilities (ICC=0.97-0.99) for stroke and inter-rater reliability (ICC=0.99) for subjects with Parkinson disease and incomplete spinal cord injury. For this test, the criterion-related validity for subjects with stroke, cerebral palsy and incomplete spinal cord injury was, in general, moderate (correlation=0.40-0.77), and the feasibility and safety were good for subjects with Alzheimer's disease. 
Conclusions The five-repetition sit-to-stand test was used more often in subjects with ND, and most of the measurement properties were investigated and showed adequate results. PMID:24839043
Li, Jing-Jing; Gao, Qi; Liu, Zhi-Dong; Kang, Qiong-Hua; Hou, Yi-Jun; Zhang, Luo-Chuan; Hu, Xiao-Mei; Li, Jie; Zhang, Juan
2015-01-01
Internal quality control (IQC) is a critical component of laboratory quality management, and IQC products can determine the reliability of testing results. In China, given the fact that most blood transfusion compatibility laboratories do not employ IQC products or do so minimally, there is a lack of uniform and standardized IQC methods. To explore the reliability of IQC products and methods, we studied 697 results from IQC samples in our laboratory from 2012 to 2014. The results showed that the sensitivity and specificity of the IQCs in anti-B testing were 100% and 99.7%, respectively. The sensitivity and specificity of the IQCs in forward blood typing, anti-A testing, irregular antibody screening, and cross-matching were all 100%. The reliability analysis indicated that 97% of anti-B testing results were at a 99% confidence level, and 99.9% of forward blood typing, anti-A testing, irregular antibody screening, and cross-matching results were at a 99% confidence level. Therefore, our IQC products and methods are highly sensitive, specific, and reliable. Our study paves the way for the establishment of a uniform and standardized IQC method for pre-transfusion compatibility testing in China and other parts of the world. PMID:26488582
Comparison of subjective olfaction ratings in patients with and without olfactory disorders.
Haxel, B R; Bertz-Duffy, S; Fruth, K; Letzel, S; Mann, W J; Muttray, A
2012-07-01
Olfactory dysfunction is common. The reliability of self-assessment tools for smell testing is still controversial. This study aimed to provide new data about the accuracy of olfactory self-assessment compared with a standardised smell test. Prospective, controlled, cohort study of patients with olfactory disorders and healthy controls. Ninety-six patients with a smell deficit and 71 controls were asked to rate their sense of smell on a visual analogue scale. Their olfactory abilities were also evaluated with the Sniffin' Sticks tests. The whole cohort showed a significant correlation between visual analogue scale smell scores and Sniffin' Sticks total scores. This correlation was also significant in the patient group, but not in the control group. These results were independent of olfactory deficit aetiology and subject age. Self-assessment of olfaction is only a reliable indicator in smell-impaired patients, not in healthy controls. For an accurate assessment of olfaction, reliable, standardised tests are needed.
Feasibility of a Semi-computerized Line Bisection Test for Unilateral Visual Neglect Assessment.
Jee, H; Kim, J; Kim, C; Kim, T; Park, J
2015-01-01
Commonly used paper-and-pencil based test modalities for assessing the degree of unilateral visual neglect (ULN) in patients with hemispheric cerebral lesions consume human resources with a significant inter- and intra-rater variability. To explore the feasibility of a semi-computerized electronic-pen based ULN assessment system (e-system) to improve assessment quality without altering the conventional user interface. Thirty cognitively healthy participants (HG) and 11 participants diagnosed with right-hemispheric lesion and unilateral visual neglect (NG) were recruited to evaluate the e-system. Line bisection tests (LBT) were repeatedly conducted twice for the inter-rater and intra-rater (reliability) comparisons. The LBT results were assessed by the e-system and the gold standard methods (manual rater assessment). The percent deviation (%), assessment duration (sec), and number of neglected lines (each) were evaluated. The inter-rater comparisons of the assessed deviation (%) variable showed excellent interrater reliabilities (CCCs) ranging from .84 (.59 to .95 (p < .001)) to .99 (.90 to .99 (p < .001)) for HG and NG. The Bland Altman mean difference (B-A) plots with bias (95% LOA (limits of agreement)) showed similar agreements between the e-system and the raters ranging from -.04 % (-2.10 to 1.97) to 1.30 % (-2.23 to 4.84) for HG and NG. The effect sizes (ES), which show similarities between the assessment methods, yielded smaller ranges from .01 to .30 for HG and NG. The reliability (test-retest) comparisons showed similar assessment results between the e-system, rater 1, and rater 2. The manual rater assessment time ranging from 5.85 to 6.00 minutes and inter- and intra-assessment variations were virtually eliminated with the e-system. The semi-computerized system with the conventional paper-and-pencil user-interface showed valid and reliable assessment results.
It may be a feasible replacement for the manual rater assessment modality even in a clinical setting.
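The Bland-Altman bias and 95% limits of agreement quoted in this abstract follow a standard calculation, sketched here under the usual assumption that the paired differences are roughly normally distributed. The function name and argument names are illustrative and do not come from the study.

```python
from statistics import mean, stdev


def bland_altman_limits(method_a, method_b):
    """Return (bias, lower_loa, upper_loa) for paired measurements.

    bias is the mean of the paired differences (a - b); the 95% limits
    of agreement are bias +/- 1.96 * SD of those differences.
    """
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need at least two paired measurements")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation (n - 1 denominator)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd
```

A bias near zero with narrow limits, as reported for the e-system versus the manual raters, indicates that the two methods can be used interchangeably within the stated margin.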
Mak, M K Y; Lau, E T L; Tam, V W K; Woo, C W Y; Yuen, S K Y
2015-01-01
To investigate the test-retest reliability of the Jebsen Taylor Hand Function Test (JTT) in older patients with Parkinson's disease (PD); and to compare JTT scores between PD and healthy subjects. Cross-sectional comparative study. Fifteen PD and fifteen healthy subjects performed the JTT and the time taken to complete the JTT was recorded. Test-retest reliabilities of JTT subtests and total score of both dominant and non-dominant hand were good to excellent (ICCs = 0.77-0.97) except J5 checkers which had moderate reliability. PD subjects required significantly longer time to finish subtests and the whole JTT (p < 0.05), except the subtest J1 writing of dominant hand that showed marginal significance (p = 0.059). JTT is a reliable and easily available assessment tool for assessing the hand function of PD subjects. PD subjects took a longer time to complete the JTT, suggesting that they have deficits in gross and fine functional dexterity. Copyright © 2015 Hanley & Belfus. Published by Elsevier Inc. All rights reserved.
Reliability of a questionnaire on substance use among adolescent students, Brazil.
Machado Neto, Adelmo de Souza; Andrade, Tarcisio Matos; Fernandes, Gilênio Borges; Zacharias, Helder Paulo; Carvalho, Fernando Martins; Machado, Ana Paula Souza; Dias, Ana Carmen Costa; Garcia, Ana Carolina Rocha; Santana, Lauro Reis; Rolin, Carlos Eduardo; Sampaio, Cyntia; Ghiraldi, Gisele; Bastos, Francisco Inácio
2010-10-01
To analyze reliability of a self-applied questionnaire on substance use and misuse among adolescent students. Two cross-sectional studies were carried out for the instrument test-retest. The sample comprised male and female students aged 11-19 years from public and private schools (elementary, middle, and high school students) in the city of Salvador, Northeastern Brazil, in 2006. A total of 591 questionnaires were applied in the test and 467 in the retest. Descriptive statistics, the Kappa index, Cronbach's alpha and intraclass correlation were estimated. The prevalence of substance use/misuse was similar in both test and retest. Sociodemographic variables showed a "moderate" to "almost perfect" agreement for the Kappa index, and a "satisfactory" (>0.75) consistency for Cronbach's alpha and intraclass correlation. The age at which psychoactive substances (tobacco, alcohol, and cannabis) were first used and chronological age were similar in both studies. Test-retest reliability was found to be a good indicator of students' age of initiation and their patterns of substance use. The questionnaire reliability was found to be satisfactory in the population studied.
JCQ scale reliability and responsiveness to changes in manufacturing process.
d'Errico, Angelo; Punnett, Laura; Gold, Judith E; Gore, Rebecca
2008-02-01
The job content questionnaire (JCQ) was administered to automobile manufacturing workers in two interviews, 5 years apart. Between the two interviews, the company introduced substantial changes in production technology in some production areas. The aims were: (1) to describe the impact of these changes on self-reported psychosocial exposures, and (2) to examine test-retest reliability of the JCQ scales, taking into account changes in job assignment and, for a subset of workers, physical ergonomic exposures as assessed through field observations. The study population included 790 subjects at the first and 519 at the second interview, of whom 387 were present in both. Differences in demand and control scores between interviews were analyzed by Wilcoxon matched-pairs signed-rank test. Test-retest reliability of these scales was evaluated by the intraclass correlation coefficient (ICC) and the Spearman's rho coefficient. The introduction of more automated technology produced an overall increase in job control but did not decrease psychological demand. The reliability of the control scale was low overall but increased to an acceptable level among workers who had not changed job. The demand scale had high reliability only among workers whose physical ergonomic exposures were similar on both survey occasions. These results show that 5-year test-retest reliability of self-reported psychosocial exposures is adequate among workers whose job assignment and ergonomic exposures have remained stable over time.
Stroke and aphasia quality-of-life scale-39: Reliability and validity of the Turkish version.
Noyan-Erbaş, Ayşin; Toğram, Bülent
2016-10-01
The aim of this study was to adapt the stroke and aphasia quality-of-life scale-39 (SAQoL-39) to the Turkish language and carry out a reliability and validity study of the instrument in a group of patients with aphasia. The study was a descriptive study and contained three phases: adaptation of the SAQoL-39 to the Turkish language, administration of the scale to 30 aphasia patients and reliability and validity studies of the scale. Internal consistency was assessed with Cronbach's alpha and test-re-test reliability was explored (n = 14). The adaptation process was completed based on inter-rater agreement on the translated items and within the scope of final editing by the authors of the study. The SAQoL-39 in Turkish exhibited high test-re-test reliability (ICC =0.97) as well as acceptability with minimal missing data (0-1.4). This instrument exhibited high internal consistency (Cronbach's α = 0.70-0.97), domain-total correlations (r = 0.76-0.85) and inter-domain correlations (r = 0.40-0.68). The analysis shows that the Turkish version of SAQoL-39 is a scale that is highly acceptable, valid and reliable and can be easily used in evaluating the quality-of-life of Turkish people with aphasia.
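Cronbach's alpha, used above as the measure of internal consistency, can be computed from item variances and the variance of the total score. This is a minimal sketch of the standard formula; the function name and the respondents-by-items input layout are assumptions for illustration.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a respondents x items table.

    `responses` is a list of rows, one per respondent, each holding that
    respondent's scores on the k items of a scale (k >= 2).
    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score))
    """
    k = len(responses[0])
    if k < 2 or len(responses) < 2 or any(len(r) != k for r in responses):
        raise ValueError("need a rectangular table with >= 2 respondents and >= 2 items")
    item_variances = [variance([r[j] for r in responses]) for j in range(k)]
    total_variance = variance([sum(r) for r in responses])
    if total_variance == 0:
        raise ValueError("alpha is undefined when total scores do not vary")
    return k / (k - 1) * (1.0 - sum(item_variances) / total_variance)
```

Alpha approaches 1 when items rise and fall together across respondents and drops toward 0 when items are unrelated, which is why it is read as an index of how consistently the items of a domain measure the same construct.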
López-de-Uralde-Villanueva, Ibai; Acuyo-Osorio, Mario; Prieto-Aldana, María; La Touche, Roy
2017-04-01
The Passive Neck Flexion Test (PNFT) can diagnose meningitis and potential spinal disorders. Little evidence is available concerning the use of a modified version of the PNFT (mPNFT) in patients with chronic nonspecific neck pain (CNSNP). To assess the reliability of the mPNFT in subjects with and without CNSNP. The secondary objective was to assess the differences in the symptoms provoked by the mPNFT between these two populations. We used repeated measures concordance design for the main objective and cross-sectional design for the secondary objective. A total of 30 asymptomatic subjects and 34 patients with CNSNP were recruited. The following measures were recorded: the range of motion at the onset of symptoms (OS-mPNFT), the range of motion at the submaximal pain (SP-mPNFT), and evoked pain intensity on the mPNFT (VAS-mPNFT). Good to excellent reliability was observed for OS-mPNFT and SP-mPNFT in the asymptomatic group (intra-examiner reliability: 0.95-0.97; inter-examiner reliability: 0.86-0.90; intra-examiner test-retest reliability: 0.84-0.87). In the CNSNP group, good to excellent reliability was obtained for the OS-mPNFT (intra-examiner reliability: 0.89-0.96; inter-examiner reliability: 0.83-0.86; intra-examiner test-retest reliability: 0.83-0.85) and the SP-mPNFT (intra-examiner reliability: 0.94-0.98; inter-examiner reliability: 0.80-0.82; intra-examiner test-retest reliability: 0.88-0.91). The CNSNP group showed statistically significant differences in OS-mPNFT (t = 4.92; P < 0.001), SP-mPNFT (t = 2.79; P = 0.007) and in VAS-mPNFT (t = -10.39; P < 0.001) versus the asymptomatic group. The mPNFT is a reliable tool regardless of the examiner and the time factor. Patients with CNSNP have a decreased range of motion and more pain than asymptomatic subjects in the mPNFT. This exceeds the minimal detectable changes for OS-mPNFT and VAS-mPNFT. Copyright © 2017 Elsevier Ltd. All rights reserved.
Reliability and validity of a questionnaire for self-assessment of complete dentures.
Komagamine, Yuriko; Kanazawa, Manabu; Kaiba, Yoshinori; Sato, Yusuke; Minakuchi, Shunsuke
2014-05-02
Demand for complete denture treatment is expected to rise over several decades. However, to date, no questionnaire on complete dentures, as evaluated by edentulous patients, has been shown to be reliable and valid. This study sought to assess the reliability and validity of Patient's Denture Assessment (PDA), which provides a multidimensional evaluation of dentures among edentulous patients. Patients who had new complete dentures fabricated at the University Hospital of Dentistry, Tokyo Medical and Dental University from 2009 to 2010 were enrolled. The reliability of the PDA was determined by examining internal consistency and test-retest reliability. Internal consistency for all of the question items and the six subscales was measured using Cronbach's α and average inter-item correlation coefficients among 93 participants. For 33 of these participants, test-retest reliability was determined at a 2-month interval using the intraclass correlation coefficients (ICCs) and 95% confidence interval for the summary scores and the six subscale scores. The PDA was validated in 93 participants by examining the difference in the summary score and the six subscale scores of the PDA before and after replacement with new dentures by the paired t-test. Ability to detect change was also tested in 93 patients using effect size. The Cronbach's α for the PDA ranged from 0.56 to 0.93. The average inter-item correlation coefficients ranged from 0.28 to 0.83. ICCs for the PDA ranged from 0.37 to 0.83. The paired t-test showed a significant difference between the summary score and the six subscale scores before and after replacement with new dentures (p < 0.05) and the effect size was 0.97. The PDA demonstrated good reliability by assessing internal consistency and test-retest reliability. In addition, the PDA demonstrated good validity by assessing discriminant validity.
Thus, the PDA could help dentists obtain a detailed understanding of the patients' perceptions in using their dentures.
Aertssen, W F M; Steenbergen, B; Smits-Engelsman, B C M
2018-06-07
There is a lack of valid and reliable field-based tests for assessing functional strength in young children with mild intellectual disabilities (IDs). The aim of this study was to investigate the test-retest reliability and construct validity of the Functional Strength Measurement in children with ID (FSM-ID). Fifty-two children with mild ID (40 boys and 12 girls, mean age 8.48 years, SD = 1.48) were tested with the FSM. Test-retest reliability (n = 32) was examined by a two-way intraclass correlation coefficient for agreement (ICC 2.1A). Standard error of measurement and smallest detectable change were calculated. Construct validity was determined by calculating correlations between the FSM-ID and handheld dynamometry (HHD) (convergent validity), FSM-ID and subtest strength of the Bruininks-Oseretsky test of motor proficiency - second edition (BOT-2) (convergent validity) and the FSM-ID and balance subtest of the BOT-2 (discriminant validity). Test-retest reliability ICC ranged 0.89-0.98. Correlation between the items of the FSM-ID and HHD ranged 0.39-0.79 and between FSM-ID and BOT-2 (strength items) 0.41-0.80. Correlation between items of the FSM-ID and BOT-2 (balance items) ranged 0.41-0.70. The FSM-ID showed good test-retest reliability and good convergent validity with the HHD and BOT-2 subtest strength. The correlations assessing discriminant validity were higher than expected. Poor levels of postural control and core stability in children with mild IDs may be the underlying factor of those higher correlations. © 2018 MENCAP and International Association of the Scientific Study of Intellectual and Developmental Disabilities and John Wiley & Sons Ltd.
Further Study of the Choice of Anchor Tests in Equating
ERIC Educational Resources Information Center
Trierweiler, Tammy J.; Lewis, Charles; Smith, Robert L.
2016-01-01
In this study, we describe what factors influence the observed score correlation between an (external) anchor test and a total test. We show that the anchor to full-test observed score correlation is based on two components: the true score correlation between the anchor and total test, and the reliability of the anchor test. Findings using an…
Kuzmanova, Rumyana; Stefanova, Irina; Velcheva, Irena; Stambolieva, Katerina
2014-10-01
Adverse effects (AEs) of antiepileptic drugs (AEDs) affect the quality of life of patients with epilepsy and their outcomes. There are no questionnaires or studies on the reliability and validity of instruments measuring AEs of AEDs in patients with epilepsy in the Bulgarian language. The aim of the present study was the translation, cross-cultural adaptation, and validation of the Liverpool Adverse Event Profile (LAEP) in the Bulgarian language, in order to provide the Bulgarian-speaking population with a reliable instrument for the clinical monitoring of patients with epilepsy. One hundred thirty-one patients (57 men and 74 women, mean age: 40.13±13.37 years) took part in the investigation. The internal consistency and test-retest reliability were tested by Cronbach's α and ICC estimations. The convergent construct validity was tested by estimating the correlation of the LAEP-BG with the QOLIE-89, and the discriminant validity by evaluating the difference between LAEP-BG scores and clinical parameters such as the type of epilepsy using Kruskal-Wallis ANOVA. The LAEP-BG showed high internal consistency and reliability. The Cronbach's α of the total scale was 0.86. No significant differences between the Cronbach's α coefficients of the total LAEP-BG and the original English, Chinese, Spanish, Korean, and Portuguese-Brazilian versions of the questionnaire were observed. The ICCs, which evaluate the test-retest reliability, were higher than the recommended value of 0.75 and indicated strong positive correlations between the first and second examinations. The two subscales of the LAEP-BG that we propose, "Neurological and psychiatric side effects" and "Non neurological side effects", showed good internal consistency (Cronbach's α of 0.85 and 0.71, respectively). The LAEP-BG scores significantly correlated with other questionnaires such as the Quality of Life in Epilepsy Inventory-89 (QOLIE-89) and showed good discriminative validity between groups with different levels of self-assessed AEs of AEDs. The Bulgarian version of the LAEP is a reliable and valid tool for assessing patient-reported AEs of AEDs and their impact on the patient's outcome. Copyright © 2014 Elsevier Inc. All rights reserved.
Development of a Short Version of the New Brief Job Stress Questionnaire
INOUE, Akiomi; KAWAKAMI, Norito; SHIMOMITSU, Teruichi; TSUTSUMI, Akizumi; HARATANI, Takashi; YOSHIKAWA, Toru; SHIMAZU, Akihito; ODAGIRI, Yuko
2014-01-01
This study aimed to investigate the test-retest reliability and validity of a short version of the New Brief Job Stress Questionnaire (New BJSQ), in which each scale consists of one item selected from the standard version. Based on the results of an anonymous web-based questionnaire administered to occupational health staff and personnel/labor staff, we selected higher-priority scales from the standard version. After selecting the one item with the highest item-total correlation coefficient from each scale, a 23-item questionnaire was developed. A nationally representative survey was administered to Japanese employees (n=1,633) to examine test-retest reliability and validity. Most scales (or items) showed modest but adequate levels of test-retest reliability (r>0.50). Furthermore, job demands and job resources scales (or items) were associated with mental and physical stress reactions, while job resources scales (or items) were also associated with positive outcomes. These findings provide evidence that the short version of the New BJSQ is reliable and valid. PMID:24975108
Ashur, S T; Shamsuddin, K; Shah, S A; Bosseri, S; Morisky, D E
2015-12-13
No validation study has previously been made for the Arabic version of the 8-item Morisky Medication Adherence Scale (MMAS-8(©)) as a measure for medication adherence in diabetes. This study in 2013 tested the reliability and validity of the Arabic MMAS-8 for type 2 diabetes mellitus patients attending a referral centre in Tripoli, Libya. A convenience sample of 103 patients self-completed the questionnaire. Reliability was tested using Cronbach alpha, average inter-item correlation and Spearman-Brown coefficient. Known-group validity was tested by comparing MMAS-8 scores of patients grouped by glycaemic control. The Arabic version showed adequate internal consistency (α = 0.70) and moderate split-half reliability (r = 0.65). Known-group validity was supported as a significant association was found between medication adherence and glycaemic control, with a moderate effect size (ϕc = 0.34). The Arabic version displayed good psychometric properties and could support diabetes research and practice in Arab countries.
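Split-half reliability with the Spearman-Brown correction, one of the statistics named in this abstract, can be sketched as below. The odd/even item split, the function names, and the toy responses are assumptions for illustration only; the abstract does not say how the halves were formed or whether the reported r = 0.65 is the corrected value.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_brown(r, factor=2.0):
    """Predicted reliability when test length is multiplied by `factor`."""
    return factor * r / (1 + (factor - 1) * r)


def split_half(scores):
    """Return (half-test correlation, Spearman-Brown corrected value).

    Assumes an odd/even split of the items in each respondent's row.
    """
    odd = [sum(row[0::2]) for row in scores]
    even = [sum(row[1::2]) for row in scores]
    r_half = pearson_r(odd, even)
    return r_half, spearman_brown(r_half)


# Hypothetical responses: 4 respondents, 4 items.
toy = [[1, 2, 1, 2], [3, 3, 4, 3], [5, 4, 5, 5], [2, 2, 1, 2]]
r_half, r_full = split_half(toy)
print(f"half-test r={r_half:.3f}, corrected={r_full:.3f}")
```

The correction matters because the raw correlation describes a test half as long as the full scale, and shorter tests are less reliable.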
Clinical applications of correlational vestibular autorotation test.
Hsieh, Li-Chun; Lin, Te-Ming; Chang, Yu-Min; Kuo, Terry B J; Lee, Gho-She
2015-06-01
The correlational vestibular autorotation test (VAT) system has the advantage of good test-retest reliability, and calibration of absolute degrees of eye movement is unnecessary when acquiring a cross-correlation coefficient (CCC). The approach is able to efficiently detect peripheral vestibulopathies. A conventional VAT has some drawbacks, including poor test-retest reliability and sensor slippage. This study aimed to develop a correlational VAT system and to evaluate the reliability and applicability of this system. Twenty healthy participants and 10 vertiginous patients were enrolled. Vertical and horizontal autorotations from 0 to 3 Hz with either closed or open eyes were performed. A small sensor and a wireless transmission technique were used to acquire the electro-oculogram and head velocity signals. The two signals were analyzed using CCCs to assess the functioning of the vestibulo-ocular reflex (VOR). The results showed a significantly greater CCC for open-eye versus closed-eye head autorotations. The CCCs also increased significantly with head rotational frequency. Moreover, the CCCs significantly correlated with the VOR gains at autorotation frequencies ≥1.0 Hz. The test-retest reliability was good (intraclass correlation coefficients ≥0.85). The vertiginous participants had significantly lower individual CCCs and overall average CCC than age- and gender-matched controls.
Ruiz, Jonatan R; Ortega, Francisco B; Castro-Piñero, Jose
2014-11-30
We investigated the criterion-related validity and the reliability of the 1/4-mile run-walk test (1/4MRWT) in children and adolescents. A total of 86 children (n=42 girls) completed a maximal graded treadmill test using a gas analyzer and the 1/4MRWT. We investigated the test-retest reliability of the 1/4MRWT in a different group of children and adolescents (n=995, n=418 girls). The 1/4MRWT time, sex, and BMI significantly contributed to the prediction of measured VO2peak (R² = 0.32). There was no systematic bias in the cross-validation group (P>0.1). The root mean squared error (RMSE) and the percentage error were 6.9 ml/kg/min and 17.7%, respectively, and the accurate prediction (i.e. the percentage of estimations within ±4.5 ml/kg/min of VO2peak) was 48.8%. The reliability analysis showed that the mean inter-trial difference ranged from 0.6 seconds in children aged 6-11 years to 1.3 seconds in adolescents aged 12-17 years (all P. Copyright AULA MEDICA EDICIONES 2014. Published by AULA MEDICA. All rights reserved.
Effects of back posture education on elementary schoolchildren's back function.
Geldhof, Elisabeth; Cardon, Greet; De Bourdeaudhuij, Ilse; Danneels, Lieven; Coorevits, Pascal; Vanderstraeten, Guy; De Clercq, Dirk
2007-06-01
The possible effects of back education on children's back function have never been evaluated. Therefore, the main aim of the present study was to evaluate the effects of back education in elementary schoolchildren on back function parameters. Since the reliability of back function measurement in children is poorly defined, another objective was to test the selected instruments for reliability in 8-11-year-olds. The multi-factorial intervention, lasting two school years, consisted of a back education program and the stimulation of postural dynamism in the class. Trunk muscle endurance, leg muscle capacity and spinal curvature were evaluated in a pre-post design including 41 children who received the back education program (mean age at post-test: 11.2 +/- 0.9 years) and 28 controls (mean age at post-test: 11.4 +/- 0.6 years). In addition, test-retest reliability with a 1-week interval was investigated in a separate sample: 47 children (mean age: 10.1 +/- 0.5 years) were tested for reliability of trunk muscle endurance and 40 children (mean age: 10.2 +/- 0.7 years) for the assessment of spinal curvatures. Reliability of endurance testing was very good to good for the trunk flexors (ICC = 0.82) and trunk extensors (ICC = 0.63). The assessment of the thoracic (ICC = 0.69) and the lumbar curvature (ICC = 0.52) in the sitting position showed good to acceptable reliability. Low ICCs were found for the assessment of the thoracic (ICC = 0.39) and the lumbar curvature (ICC = 0.37) in stance. Two years of back education led to an increase in trunk flexor endurance in the intervention group compared to a decrease in the controls, and a trend towards significance for a higher increase in trunk extensor endurance in the intervention group. For leg muscle capacity and spinal curvature no intervention effects were found. Because of the small samples, the intervention effects should be interpreted cautiously. However, the present study's findings favor the implementation of back education with a focus on postural dynamism in the class as an integral part of the elementary school curriculum, with the aim of optimizing spinal loading through the school environment.
Mieritz, Rune M; Bronfort, Gert; Jakobsen, Markus D; Aagaard, Per; Hartvigsen, Jan
2014-09-01
A basic premise for any instrument measuring spinal motion is that reliable outcomes can be obtained on a relevant sample under standardized conditions. The purpose of this study was to assess the overall reliability and measurement error of regional spinal sagittal plane motion in patients with chronic low back pain (LBP), and then to evaluate the influence of body mass index, examiner, gender, stability of pain, and pain distribution on reliability and measurement error. This study used a test-retest design with sessions separated by 7 to 14 days. The patient cohort consisted of 220 individuals with chronic LBP. Kinematics of the lumbar spine were sampled during standardized spinal extension-flexion testing using a 6-df instrumented spatial linkage system. Test-retest reliability and measurement error were evaluated using intraclass correlation coefficients (ICC(1,1)) and Bland-Altman limits of agreement (LOAs). The overall test-retest reliability (ICC(1,1)) for various motion parameters ranged from 0.51 to 0.70, and relatively wide LOAs were observed for all parameters. Reliability measures in patient subgroups (ICC(1,1)) ranged between 0.34 and 0.77. In general, greater ICC(1,1) coefficients and smaller LOAs were found in subgroups of patients examined by the same examiner, patients with a stable pain level, patients with a body mass index below 30 kg/m², patients who were men, and patients in Quebec Task Force classification Group 1. This study shows that sagittal plane kinematic data from patients with chronic LBP may be sufficiently reliable in measurements of groups of patients. However, because of the large LOAs, this test procedure appears unusable at the individual patient level. Furthermore, reliability and measurement error vary substantially among subgroups of patients. Copyright © 2014 Elsevier Inc. All rights reserved.
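The Bland-Altman limits of agreement mentioned in this abstract can be illustrated with a short sketch. It assumes the usual definition (mean test-retest difference ± 1.96 × the standard deviation of the differences); the function name and the toy range-of-motion values are hypothetical and do not come from the study.

```python
from statistics import mean, stdev


def bland_altman_limits(test, retest, z=1.96):
    """Return (bias, lower LOA, upper LOA) for paired measurements.

    bias is the mean of (retest - test); the limits are bias +/- z times
    the sample standard deviation of the paired differences.
    """
    diffs = [b - a for a, b in zip(test, retest)]
    bias = mean(diffs)
    spread = z * stdev(diffs)
    return bias, bias - spread, bias + spread


# Hypothetical lumbar flexion values (degrees) at test and retest.
test_session = [10, 12, 14, 16]
retest_session = [11, 11, 15, 15]
bias, lower, upper = bland_altman_limits(test_session, retest_session)
print(f"bias={bias:.2f}, LOA=({lower:.2f}, {upper:.2f})")
```

Wide limits relative to clinically meaningful change are what make a measure hard to use for individual patients, even when group-level ICCs look acceptable.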
Angeltveit, Andreas; Paulsen, Gøran; Solberg, Paul A; Raastad, Truls
2016-02-01
Operators in Special Operation Forces (SOF) have a particularly demanding profession where physical and psychological capacities can be challenged to the extremes. The diversity of physical capacities needed depends on the mission. Consequently, tests used to monitor SOF operators' physical fitness should cover a broad range of physical capacities. Whereas tests for strength and aerobic endurance are established, there is no test for specific anaerobic work capacity described in the literature. The purpose of this study was therefore to evaluate the reliability and validity, and to identify the performance determinants, of a new test developed for testing specific anaerobic work capacity in SOF operators. Nineteen active young students were included in the concurrent validity part of the study. The students performed the evacuation (EVAC) test 3 times, and the results were compared for reliability and with performance in the Wingate cycle test, 300-m sprint, and a maximal accumulated oxygen deficit (MAOD) test. In part II of the study, 21 Norwegian Navy Special Operations Command operators completed the EVAC test, anthropometric measurements, a dual x-ray absorptiometry scan, leg press, isokinetic knee extensions, a maximal oxygen uptake test, and a countermovement jump (CMJ) test. The EVAC test showed good reliability after 1 familiarization trial (intraclass correlation = 0.89; coefficient of variation = 3.7%). The EVAC test correlated well with the Wingate test (r = -0.68), 300-m sprint time (r = 0.51), and 300-m mean power (W) (r = -0.67). No significant correlation was found with the MAOD test. In part II of the study, height, body mass, lean body mass, isokinetic knee extension torque, maximal oxygen uptake, and maximal power in a CMJ were significantly correlated with performance in the EVAC test. The EVAC test is a reliable and valid test of anaerobic work capacity for SOF operators, and muscle mass, leg strength, and leg power seem to be the most important determinants of performance.
Ke, Hong-Liang; Jing, Lei; Gao, Qun; Wang, Yao; Hao, Jian; Sun, Qiang; Xu, Zhi-Jun
2015-11-20
Accelerated aging tests are the main method used in the evaluation of LED reliability, and can be performed in either online or offline modes. The goal of this study is to characterize the difference between the two test modes. In the experiments, the sample is attached to different heat sinks to acquire the optical parameters under different junction temperatures of LEDs. By measuring the junction temperature in the aging process (Tj1) and the junction temperature in the testing process (Tj2), we obtain matching Tj1 and Tj2 in the online test and differing Tj1 and Tj2 in the offline test. Experimental results show that the degradation rate of the luminous flux rises as Tj2 increases, which yields a difference in projected life L(70%) of 8% to 13%. For color shifts over 5000 h of aging, the online test shows a larger variation of the distance from the Planckian locus, about 40% to 50% more than the normal test at an ambient temperature of 25°C.
Reliability and Validity of the Footprint Assessment Method Using Photoshop CS5 Software.
Gutiérrez-Vilahú, Lourdes; Massó-Ortigosa, Núria; Costa-Tutusaus, Lluís; Guerra-Balic, Myriam
2015-05-01
Several sophisticated methods of footprint analysis currently exist. However, it is sometimes useful to apply standard measurement methods of recognized evidence with an easy and quick application. We sought to assess the reliability and validity of a new method of footprint assessment in a healthy population using Photoshop CS5 software (Adobe Systems Inc, San Jose, California). Forty-two footprints, corresponding to 21 healthy individuals (11 men with a mean ± SD age of 20.45 ± 2.16 years and 10 women with a mean ± SD age of 20.00 ± 1.70 years) were analyzed. Footprints were recorded in static bipedal standing position using optical podography and digital photography. Three trials for each participant were performed. The Hernández-Corvo, Chippaux-Smirak, and Staheli indices and the Clarke angle were calculated by manual method and by computerized method using Photoshop CS5 software. Test-retest was used to determine reliability. Validity was obtained by intraclass correlation coefficient (ICC). The reliability test for all of the indices showed high values (ICC, 0.98-0.99). Moreover, the validity test clearly showed no difference between techniques (ICC, 0.99-1). The reliability and validity of a method to measure, assess, and record the podometric indices using Photoshop CS5 software has been demonstrated. This provides a quick and accurate tool useful for the digital recording of morphostatic foot study parameters and their control.
Reliability and validity of the Korean version of the Connor-Davidson Resilience Scale.
Baek, Hyun-Sook; Lee, Kyoung-Uk; Joo, Eun-Jeong; Lee, Mi-Young; Choi, Kyeong-Sook
2010-06-01
The Connor-Davidson Resilience Scale (CD-RISC) measures various aspects of psychological resilience in patients with posttraumatic stress disorder (PTSD) and other psychiatric ailments. This study sought to assess the reliability and validity of the Korean version of the Connor-Davidson Resilience Scale (K-CD-RISC). In total, 576 participants were enrolled (497 females and 79 males), including hospital nurses, university students, and firefighters. Subjects were evaluated using the K-CD-RISC, the Beck Depression Inventory (BDI), the Impact of Event Scale-Revised (IES-R), the Rosenberg Self-Esteem Scale (RSES), and the Perceived Stress Scale (PSS). Test-retest reliability and internal consistency were examined as a measure of reliability, and convergent validity and factor analysis were also performed to evaluate validity. Cronbach's alpha coefficient and test-retest reliability were 0.93 and 0.93, respectively. The total score on the K-CD-RISC was positively correlated with the RSES (r=0.56, p<0.01). Conversely, BDI (r=-0.46, p<0.01), PSS (r=-0.32, p<0.01), and IES-R scores (r=-0.26, p<0.01) were negatively correlated with the K-CD-RISC. The K-CD-RISC showed a five-factor structure that explained 57.2% of the variance. The K-CD-RISC showed good reliability and validity for measurement of resilience among Korean subjects.
Validity and reliability of the Utrecht Work Engagement Scale-Student Version in Sri Lanka.
Wickramasinghe, Nuwan Darshana; Dissanayake, Devani Sakunthala; Abeywardena, Gihan Sajiwa
2018-05-04
The present study was aimed at assessing the validity and the reliability of the Sinhala version of the Utrecht Work Engagement Scale-Student Version (UWES-S) among collegiate cycle students in Sri Lanka. The 17-item UWES-S was translated to Sinhala and the judgmental validity was assessed by a multi-disciplinary panel of experts. Construct validity of the UWES-S was appraised by using multi-trait scaling analysis and exploratory factor analysis (EFA) on data obtained from a sample of 194 grade thirteen students in the Kurunegala district, Sri Lanka. Reliability of the UWES-S was assessed by using internal consistency and test-retest reliability. Except for item 13, all other items showed good psychometric properties in judgemental validity, item-convergent validity and item-discriminant validity. EFA using principal component analysis with Oblimin rotation, suggested a three-factor solution (including vigor, dedication and absorption subscales) explaining 65.4% of the total variance for the 16-item UWES-S (with item 13 deleted). All three subscales show high internal consistency with Cronbach's α coefficient values of 0.867, 0.819, and 0.903 and test-retest reliability was high (p < 0.001). Hence, the Sinhala version of the 16-item UWES-S is a valid and a reliable instrument to assess work engagement among collegiate cycle students in Sri Lanka.
Prather, H; Harris-Hayes, M; Hunt, D; Steger-May, K; Mathew, V; Clohisy, JC
2012-01-01
Objective: The objectives of this study are the following: 1) report passive hip ROM in asymptomatic young adults, 2) report the intra-tester and inter-tester reliability of hip ROM measurements among testers of multiple disciplines, 3) report the results of provocative hip tests and tester agreement. Design: Descriptive epidemiology study. Setting: Tertiary university. Participants: Twenty-eight young adult volunteers without musculoskeletal symptoms or a history of disorder or surgery involving the lumbar spine or lower extremities were enrolled and completed the study. Methods: Asymptomatic young adult volunteers completed questionnaires and were examined by two blinded examiners during a single session. The testers were physical therapists and physicians. Hip range of motion and provocative tests were completed by both examiners on each hip. Main Outcome Measurements: Inter- and intra-rater reliability for ROM and agreement for provocative tests were determined. Results: Twenty-eight asymptomatic adults with a mean age of 31 years (range 18–51 years), a mean modified Harris Hip Score of 99.5 ± 1.5 and a UCLA Activity score of 8.8 ± 1.2 completed the study. Intra-rater agreement was excellent for all hip range of motion measurements, with intraclass correlation coefficients (ICCs) ranging from 0.76 to 0.97, with similar agreement whether the examiner was a physical therapist or a physician. Excellent inter-rater reliability was found for hip flexion (ICC 0.87, 95% CI 0.78 to 0.92), supine internal rotation (ICC 0.75, 95% CI 0.60 to 0.84) and prone internal rotation (ICC 0.79, 95% CI 0.66 to 0.87). The least reliable measurements were supine hip abduction (ICC 0.34) and supine external rotation (ICC 0.18). Agreement between examiners ranged from 96–100% for provocative hip tests, which included the hip impingement, resisted straight leg raise, FABER/Patrick's and log roll tests. Conclusions: Specific hip ROM measures show excellent inter-rater reliability, and provocative hip tests show good agreement among multiple examiners and medical disciplines. Further studies are needed to assess the utility of these measurements and tests as part of a hip screening examination for young adults at risk for intra-articular hip disorders prior to the onset of degenerative changes. PMID:20970757
Beardsley, Chris; Egerton, Tim; Skinner, Brendon
2016-01-01
Objective. The purpose of this study was to investigate the reliability of a digital pelvic inclinometer (DPI) for measuring sagittal plane pelvic tilt in 18 young, healthy males and females. Method. The inter-rater reliability and test-re-test reliabilities of the DPI for measuring pelvic tilt in standing on both the right and left sides of the pelvis were measured by two raters carrying out two rating sessions of the same subjects, three weeks apart. Results. For measuring pelvic tilt, inter-rater reliability was designated as good on both sides (ICC = 0.81-0.88), test-re-test reliability within a single rating session was designated as good on both sides (ICC = 0.88-0.95), and test-re-test reliability between two rating sessions was designated as moderate on the left side (ICC = 0.65) and good on the right side (ICC = 0.85). Conclusion. Inter-rater reliability and test-re-test reliability within a single rating session of the DPI in measuring pelvic tilt were both good, while test-re-test reliability between rating sessions was moderate-to-good. Caution is required regarding the interpretation of the test-re-test reliability within a single rating session, as the raters were not blinded. Further research is required to establish validity.
Mousavian, Alireza; Ebrahimzadeh, Mohammad H; Birjandinejad, Ali; Omidi-Kashani, Farzad; Kachooei, Amir Reza
2015-12-01
In this study, we aimed to translate the Manchester-Oxford Foot Questionnaire (MOXFQ) into Persian and to test the validity and reliability of the Persian version in foot and ankle patients. We translated the MOXFQ into Persian according to the accepted guidelines, then assessed its psychometric properties, including validity and reliability, in 308 patients with long-standing foot and ankle problems. To test the reliability, we calculated the intraclass correlation coefficient (ICC) for test-retest reliability and measured Cronbach's alpha to test the internal consistency. To test the construct validity of the MOXFQ, we also administered the Short-Form 36 (SF36) to patients. Construct validity was supported by significant correlations with the SF36 subscales, except for the pain subscale of the Persian MOXFQ with the mental health subscale of the SF36 (r=0.207). The intraclass correlation coefficient was 0.79 for the total MOXFQ and ranged from 0.83 to 0.89 for the three subscales. Cronbach's alpha for pain, walking/standing, and social interaction was 0.86, 0.88, and 0.89, respectively, and was 0.79 for the total MOXFQ, showing good internal consistency in each domain. The Persian Manchester-Oxford Foot Questionnaire health scoring system is a valid and reliable patient-reported instrument for foot and ankle problems. Copyright © 2015. Published by Elsevier Ltd.
Buiza, Cristina; Yanguas, Javier; Zulaica, Amaia; Antón, Iván; Arriola, Enrique; García, Alvaro
2018-04-13
Adaptation and validation in the Basque language of tests to assess advanced cognitive impairment is an unmet need for Basque-speaking people. The present work reports the validation of the Basque version of the Severe Mini Mental State Examination (SMMSE-eus). A total of 109 people with advanced dementia (MEC<15), classified as stages 5-7 on the Global Deterioration Scale (GDS), took part in the validation study. All participants were Spanish-Basque bilingual. The SMMSE-eus showed high internal consistency (alpha=0.92), good test-retest reliability (r=0.88; P<.01), and high inter-rater reliability (ICC=0.99; P<.00) for the overall score, as well as for each item. The high internal consistency and inter-rater reliability and, to a lesser extent, the test-retest reliability make the SMMSE-eus a valid test for the brief assessment of cognitive status in Basque-speaking people with advanced dementia. For this reason, the SMMSE-eus is a usable and reliable alternative for assessing Basque-speaking people in their mother tongue or preferred language. Copyright © 2017 SEGG. Published by Elsevier España, S.L.U. All rights reserved.
The Validity and reliability of the Comprehensive Home Environment Survey (CHES).
Pinard, Courtney A; Yaroch, Amy L; Hart, Michael H; Serrano, Elena L; McFerren, Mary M; Estabrooks, Paul A
2014-01-01
Few comprehensive measures exist to assess contributors to childhood obesity within the home, specifically among low-income populations. The current study describes the modification and psychometric testing of the Comprehensive Home Environment Survey (CHES), an inclusive measure of the home food, physical activity, and media environment related to childhood obesity. The items were tested for content relevance by an expert panel and piloted in the priority population. The CHES was administered to low-income parents of children 5 to 17 years (N = 150), including a subsample of parents a second time and additional caregivers to establish test-retest and interrater reliabilities. Children older than 9 years (n = 95), as well as parents (N = 150) completed concurrent assessments of diet and physical activity behaviors (predictive validity). Analyses and item trimming resulted in 18 subscales and a total score, which displayed adequate internal consistency (α = .74-.92) and high test-retest reliability (r ≥ .73, ps < .01) and interrater reliability (r ≥ .42, ps < .01). The CHES score and a validated screener for the home environment were correlated (r = .37, p < .01; concurrent validity). CHES subscales were significantly correlated with behavioral measures (r = -.20-.55, p < .05; predictive validity). The CHES shows promise as a valid/reliable assessment of the home environment related to childhood obesity, including healthy diet and physical activity.
Singh, Amika S; Chinapaw, Mai J M; Uijtdewilligen, Léonie; Vik, Froydis N; van Lippevelde, Wendy; Fernández-Alvira, Juan M; Stomfai, Sarolta; Manios, Yannis; van der Sluijs, Maria; Terwee, Caroline; Brug, Johannes
2012-08-13
Insight into parental energy balance-related behaviours, their determinants and parenting practices is important to inform childhood obesity prevention. Therefore, reliable and valid tools to measure these variables in large-scale population research are needed. The objective of the current study was to examine the test-retest reliability and construct validity of the parent questionnaire used in the ENERGY-project, assessing parental energy balance-related behaviours, their determinants, and parenting practices among parents of 10-12 year old children. We collected data among parents (n = 316 in the test-retest reliability study; n = 109 in the construct validity study) of 10-12 year-old children in six European countries, i.e. Belgium, Greece, Hungary, the Netherlands, Norway, and Spain. Test-retest reliability was assessed using the intra-class correlation coefficient (ICC) and percentage agreement comparing scores from two measurements, administered one week apart. To assess construct validity, the agreement between questionnaire responses and a subsequent interview was assessed using ICC and percentage agreement. All but one item showed good to excellent test-retest reliability as indicated by ICCs > .60 or percentage agreement ≥ 75%. Construct validity appeared to be good to excellent for 92 out of 121 items, as indicated by ICCs > .60 or percentage agreement ≥ 75%. Of the other 29 items, construct validity was moderate for 24 and poor for 5 items. The reliability and construct validity of the items of the ENERGY-parent questionnaire on multiple energy balance-related behaviours, their potential determinants, and parenting practices appear to be good. Based on the results of the validity study, we strongly recommend adapting parts of the ENERGY-parent questionnaire if used in future research.
Faux-Pas Test: A Proposal of a Standardized Short Version.
Fernández-Modamio, Mar; Arrieta-Rodríguez, Marta; Bengochea-Seco, Rosario; Santacoloma-Cabero, Iciar; Gómez de Tojeiro-Roce, Juan; García-Polavieja, Bárbara; González-Fraile, Eduardo; Martín-Carrasco, Manuel; Griffin, Kim; Gil-Sanz, David
2018-06-26
Previous research on theory of mind suggests that people with schizophrenia have difficulties with complex mentalization tasks that involve the integration of cognition and affective mental states. One of the tools most commonly used to assess theory of mind is the Faux-Pas Test. However, it presents two main methodological problems: 1) the lack of a standard scoring system; 2) the different versions are not comparable due to a lack of information on the stories used. These methodological problems make it difficult to draw conclusions about performance on this test by people with schizophrenia. The aim of this study was to develop a reduced version of the Faux-Pas test with adequate psychometric properties. The test was administered to control and clinical groups. Interrater and test-retest reliability were analyzed for each story in order to select the set of 10 stories included in the final reduced version. The shortened version showed good psychometric properties for controls and patients: test-retest reliability of 0.97 and 0.78, inter-rater reliability of 0.95 and 0.87 and Cronbach's alpha of 0.82 and 0.72.
Chiang, Hsin-Yu; Lu, Wen-Shian; Yu, Wan-Hui; Hsueh, I-Ping; Hsieh, Ching-Lin
2018-04-11
To examine the interrater and intrarater reliability of the Balance Computerized Adaptive Test (Balance CAT) in patients with chronic stroke having a wide range of balance functions. Repeated assessments design (1wk apart). Seven teaching hospitals. A pooled sample (N=102) including 2 independent groups of outpatients (n=50 for the interrater reliability study; n=52 for the intrarater reliability study) with chronic stroke. Not applicable. Balance CAT. For the interrater reliability study, the values of intraclass correlation coefficient, minimal detectable change (MDC), and percentage of MDC (MDC%) for the Balance CAT were .84, 1.90, and 31.0%, respectively. For the intrarater reliability study, the values of intraclass correlation coefficient, MDC, and MDC% ranged from .89 to .91, from 1.14 to 1.26, and from 17.1% to 18.6%, respectively. The Balance CAT showed sufficient intrarater reliability in patients with chronic stroke having balance functions ranging from sitting with support to independent walking. Although the Balance CAT may have good interrater reliability, we found substantial random measurement error between different raters. Accordingly, if the Balance CAT is used as an outcome measure in clinical or research settings, we suggest using the same raters across time points to ensure reliable assessments. Copyright © 2018 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.
Simulation-Based Training for Colonoscopy
Preisler, Louise; Svendsen, Morten Bo Søndergaard; Nerup, Nikolaj; Svendsen, Lars Bo; Konge, Lars
2015-01-01
The aim of this study was to create simulation-based tests with credible pass/fail standards for 2 different fidelities of colonoscopy models. Only competent practitioners should perform colonoscopy. Reliable and valid simulation-based tests could be used to establish basic competency in colonoscopy before practicing on patients. Twenty-five physicians (10 consultants with endoscopic experience and 15 fellows with very little endoscopic experience) were tested on 2 different simulator models: a virtual-reality simulator and a physical model. Tests were repeated twice on each simulator model. Metrics with discriminatory ability were identified for both modalities and reliability was determined. The contrasting-groups method was used to create pass/fail standards and the consequences of these were explored. The consultants performed significantly faster and scored higher than the fellows on both models (P < 0.001). Reliability analysis showed Cronbach α = 0.80 and 0.87 for the virtual-reality and the physical model, respectively. The established pass/fail standards failed one of the consultants (virtual-reality simulator) and allowed one fellow to pass (physical model). The 2 tested simulation-based modalities provided reliable and valid assessments of competence in colonoscopy, and credible pass/fail standards were established for both tests. We propose to use these standards in simulation-based training programs before proceeding to supervised training on patients. PMID:25634177
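Cronbach's α, the reliability coefficient reported above, can be computed from a participants-by-items score table. The sketch below is a minimal illustration using the standard formula with sample variances. The function name and the layout of the input are assumptions, and the abstract does not describe how the study arranged its metrics into items.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a table of scores.

    scores: one row per participant, each row a list of k item scores.
    Uses sample variances (n - 1 denominator) throughout.
    """
    k = len(scores[0])
    if k < 2:
        raise ValueError("at least two items are required")
    # Variance of each item (column) across participants.
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    # Variance of each participant's total score.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)
```

When every item ranks participants identically, the item variances account for only 1/k of the total-score variance and the coefficient reaches 1.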
Gergov, Vera; Lahti, Jari; Marttunen, Mauri; Lipsanen, Jari; Evans, Chris; Ranta, Klaus; Laitila, Aarno; Lindberg, Nina
2017-05-01
An increasing need exists for suitable measures to evaluate treatment outcome in adolescents. YP-CORE is a pan-theoretical brief questionnaire developed for this purpose, but it lacks studies in different cultures or languages. To explore the acceptability, factor structure, reliability, validity, and sensitivity to change of the Finnish translation of YP-CORE. The study was conducted at the Department of Adolescent Psychiatry, Helsinki University Central Hospital. A Finnish translation was prepared by a team of professionals and adolescents. A clinical sample of 104 patients was asked to complete the form together with BDI-21 and BAI, and 92 of them completed the forms again after a 3-month treatment. Analysis included acceptability, confirmatory factor analysis, internal and test-retest reliability, concurrent validity, influence of gender and age, and criteria for reliable change. YP-CORE was well accepted, and the rate of missing values was low. Internal consistency (α = 0.83-0.92) and test-retest reliability (r = 0.69) were good, and the results of CFA supported a one-factor model. YP-CORE showed good concurrent validity against two widely used symptom-specific measures (r = 0.62-0.87). Gender had a moderately strong effect on the scores (d = 0.67), but the effect of age was not as evident. The measure was sensitive to change, showing a larger effect size (d = 0.55) than in the BDI-21 and BAI (d = 0.31-0.50). The results show that the translation of YP-CORE into Finnish has been successful, the YP-CORE has good psychometric properties, and the measure could be taken into wider use in clinical settings for outcome measurement in adolescents.
Gerhardsson, Lars; Gillström, Lennart; Hagberg, Mats
2014-01-01
Exposure to hand-held vibrating tools may cause the hand-arm vibration syndrome (HAVS). The aim was to study the test-retest reliability of hand and muscle strength tests, and tests for the determination of thermal and vibration perception thresholds, which are used when investigating signs of neuropathy in vibration-exposed workers. In this study, 47 vibration-exposed workers who had been investigated at the department of Occupational and Environmental Medicine in Gothenburg were compared with a randomized sample of 18 unexposed subjects from the general population of the city of Gothenburg. All participants underwent a structured interview, answered several questionnaires and had a physical examination including hand and finger muscle strength tests and determination of vibrotactile (VPT) and thermal (TPT) perception thresholds. Two weeks later, 23 workers and referents, selected in a randomized manner, were called back for the same test procedures for the evaluation of test-retest reliability. The test-retest reliability after a two-week interval expressed as limits of agreement (LOA; Bland-Altman), intra-class correlation coefficients (ICC) and Pearson correlation coefficients was excellent for tests with the Baseline hand grip, Pinch-grip and 3-Chuck grip among the exposed workers and referents (N = 23: percentage of differences within LOA 91-100%; ICC-values ≥ 0.93; Pearson r ≥ 0.93). The test-retest reliability was also excellent (percentage of differences within LOA 96-100%) for the determination of vibration perception thresholds in digits 2 and 5 bilaterally as well as for temperature perception thresholds in digits 2 and 5, bilaterally (percentage of differences within LOA 91-96%). For ICC and Pearson r the results for vibration perception thresholds were good for digit 2, left hand and for digit 5, bilaterally (ICC ≥ 0.84; r ≥ 0.85), and lower (ICC = 0.59; r = 0.59) for digit 2, right hand. For the latter two indices the test-retest reliability for the determination of temperature thresholds was lower and showed more varying results. The strong test-retest reliability for hand and muscle strength tests as well as for the determination of VPTs makes these procedures useful for diagnostic purposes and follow-up studies in vibration-exposed workers.
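The "percentage of differences within LOA" figure reported above follows from the Bland-Altman limits of agreement. A minimal Python sketch is given below, assuming the conventional limits of mean difference ± 1.96 standard deviations of the paired differences. The abstract does not give the exact computation, and the function names are hypothetical.

```python
from statistics import mean, stdev


def limits_of_agreement(test, retest):
    """Bland-Altman 95% limits of agreement for paired measurements."""
    diffs = [a - b for a, b in zip(test, retest)]
    bias = mean(diffs)   # systematic difference between sessions
    sd = stdev(diffs)    # sample SD of the paired differences
    return bias - 1.96 * sd, bias + 1.96 * sd


def percent_within_loa(test, retest):
    """Percentage of paired differences falling inside the limits."""
    lower, upper = limits_of_agreement(test, retest)
    diffs = [a - b for a, b in zip(test, retest)]
    inside = sum(lower <= d <= upper for d in diffs)
    return 100.0 * inside / len(diffs)
```

With 23 retested participants, each difference falling outside the limits lowers the percentage by roughly 4 points, which is consistent with the 91-100% range quoted in the abstract.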
Dennett, Hugh W; McKone, Elinor; Tavashmi, Raka; Hall, Ashleigh; Pidcock, Madeleine; Edwards, Mark; Duchaine, Bradley
2012-06-01
Many research questions require a within-class object recognition task matched for general cognitive requirements with a face recognition task. If the object task also has high internal reliability, it can improve accuracy and power in group analyses (e.g., mean inversion effects for faces vs. objects), individual-difference studies (e.g., correlations between certain perceptual abilities and face/object recognition), and case studies in neuropsychology (e.g., whether a prosopagnosic shows a face-specific or object-general deficit). Here, we present such a task. Our Cambridge Car Memory Test (CCMT) was matched in format to the established Cambridge Face Memory Test, requiring recognition of exemplars across view and lighting change. We tested 153 young adults (93 female). Results showed high reliability (Cronbach's alpha = .84) and a range of scores suitable both for normal-range individual-difference studies and, potentially, for diagnosis of impairment. The mean for males was much higher than the mean for females. We demonstrate independence between face memory and car memory (dissociation based on sex, plus a modest correlation between the two), including where participants have high relative expertise with cars. We also show that expertise with real car makes and models of the era used in the test significantly predicts CCMT performance. Surprisingly, however, regression analyses imply that there is an effect of sex per se on the CCMT that is not attributable to a stereotypical male advantage in car expertise.
Diagnostic reliability of MMPI-2 computer-based test interpretations.
Pant, Hina; McCabe, Brian J; Deskovitz, Mark A; Weed, Nathan C; Williams, John E
2014-09-01
Reflecting the common use of the MMPI-2 to provide diagnostic considerations, computer-based test interpretations (CBTIs) also typically offer diagnostic suggestions. However, these diagnostic suggestions can sometimes be shown to vary widely across different CBTI programs even for identical MMPI-2 profiles. The present study evaluated the diagnostic reliability of 6 commercially available CBTIs using a 20-item Q-sort task developed for this study. Four raters each sorted diagnostic classifications based on these 6 CBTI reports for 20 MMPI-2 profiles. Two questions were addressed. First, do users of CBTIs understand the diagnostic information contained within the reports similarly? Overall, diagnostic sorts of the CBTIs showed moderate inter-interpreter diagnostic reliability (mean r = .56), with sorts for the 1/2/3 profile showing the highest inter-interpreter diagnostic reliability (mean r = .67). Second, do different CBTI programs vary with respect to diagnostic suggestions? It was found that diagnostic sorts of the CBTIs had a mean inter-CBTI diagnostic reliability of r = .56, indicating moderate but not strong agreement across CBTIs in terms of diagnostic suggestions. The strongest inter-CBTI diagnostic agreement was found for sorts of the 1/2/3 profile CBTIs (mean r = .71). Limitations and future directions are discussed. PsycINFO Database Record (c) 2014 APA, all rights reserved.
Wilson, Stephen M; Eriksson, Dana K; Schneck, Sarah M; Lucanie, Jillian M
2018-01-01
This paper describes a quick aphasia battery (QAB) that aims to provide a reliable and multidimensional assessment of language function in about a quarter of an hour, bridging the gap between comprehensive batteries that are time-consuming to administer, and rapid screening instruments that provide limited detail regarding individual profiles of deficits. The QAB is made up of eight subtests, each comprising sets of items that probe different language domains, vary in difficulty, and are scored with a graded system to maximize the informativeness of each item. From the eight subtests, eight summary measures are derived, which constitute a multidimensional profile of language function, quantifying strengths and weaknesses across core language domains. The QAB was administered to 28 individuals with acute stroke and aphasia, 25 individuals with acute stroke but no aphasia, 16 individuals with chronic post-stroke aphasia, and 14 healthy controls. The patients with chronic post-stroke aphasia were tested 3 times each and scored independently by 2 raters to establish test-retest and inter-rater reliability. The Western Aphasia Battery (WAB) was also administered to these patients to assess concurrent validity. We found that all QAB summary measures were sensitive to aphasic deficits in the two groups with aphasia. All measures showed good or excellent test-retest reliability (overall summary measure: intraclass correlation coefficient (ICC) = 0.98), and excellent inter-rater reliability (overall summary measure: ICC = 0.99). Sensitivity and specificity for diagnosis of aphasia (relative to clinical impression) were 0.91 and 0.95 respectively. All QAB measures were highly correlated with corresponding WAB measures where available. Individual patients showed distinct profiles of spared and impaired function across different language domains. In sum, the QAB efficiently and reliably characterized individual profiles of language deficits. PMID:29425241
Behavioral and cognitive outcomes for clinical trials in children with neurofibromatosis type 1.
van der Vaart, Thijs; Rietman, André B; Plasschaert, Ellen; Legius, Eric; Elgersma, Ype; Moll, Henriëtte A
2016-01-12
To evaluate the appropriateness of cognitive and behavioral outcome measures in clinical trials in neurofibromatosis type 1 (NF1) by analyzing the degree of deficits compared to reference groups, test-retest reliability, and how scores correlate between outcome measures. Data were analyzed from the Simvastatin for cognitive deficits and behavioral problems in patients with neurofibromatosis type 1 (NF1-SIMCODA) trial, a randomized placebo-controlled trial of simvastatin for cognitive deficits and behavioral problems in children with NF1. Outcome measures were compared with age-specific reference groups to identify domains of dysfunction. Pearson r was computed for before and after measurements within the placebo group to assess test-retest reliability. Principal component analysis was used to identify the internal structure in the outcome data. Strongest mean score deviations from the reference groups were observed for full-scale intelligence (-1.1 SD), Rey Complex Figure Test delayed recall (-2.0 SD), attention problems (-1.2 SD), and social problems (-1.1 SD). Long-term test-retest reliability was excellent for Wechsler scales (r > 0.88), but poor to moderate for other neuropsychological tests (r range 0.52-0.81) and Child Behavioral Checklist subscales (r range 0.40-0.79). The correlation structure revealed 2 strong components in the outcome measures, behavior and cognition, with no correlation between these components. Scores on psychosocial quality of life correlate strongly with behavioral problems and less with cognitive deficits. Children with NF1 show distinct deficits in multiple domains. Many outcome measures showed weak test-retest correlations over the 1-year trial period. Cognitive and behavioral outcomes are complementary. This analysis demonstrates the need to include reliable outcome measures on a variety of cognitive and behavioral domains in clinical trials for NF1. © 2015 American Academy of Neurology.
Accelerated testing of module-level power electronics for long-term reliability
DOE Office of Scientific and Technical Information (OSTI.GOV)
Flicker, Jack David; Tamizhmani, Govindasamy; Moorthy, Mathan Kumar; ...
2016-11-10
This work has applied a suite of long-term-reliability accelerated tests to a variety of module-level power electronics (MLPE) devices (such as microinverters and optimizers) from five different manufacturers. This dataset is one of the first (only the paper by Parker et al. entitled “Dominant factors affecting reliability of alternating current photovoltaic modules,” in Proc. 42nd IEEE Photovoltaic Spec. Conf., 2015, is reported for reliability testing in the literature), as well as the largest, experimental sets in public literature, both in the sample size (five manufacturers including both dc/dc and dc/ac units and 20 units for each test) and the number of experiments (six different experimental test conditions) for MLPE devices. The accelerated stress tests (thermal cycling test per IEC 61215 profile, damp heat test per IEC 61215 profile, and static temperature tests at 100 and 125 °C) were performed under powered and unpowered conditions. The first independent long-term experimental data regarding damp heat and grid transient testing, as well as the longest term (>9 month) testing of MLPE units reported in the literature for thermal cycling and high-temperature operating life, are included in these experiments. Additionally, this work is the first to show in situ power measurements, as well as periodic efficiency measurements over a series of experimental tests, demonstrating whether certain tests result in long-term degradation or immediate catastrophic failures. Lastly, the result of this testing highlights the performance of MLPE units under the application of several accelerated environmental stressors.
Abma, Femke I; van der Klink, Jac J L; Bültmann, Ute
2013-03-01
The promotion of a sustainable, healthy and productive working life attracts more and more attention. Recently the Work Role Functioning Questionnaire (WRFQ) has been cross-culturally translated and adapted to Dutch. This questionnaire aims to measure the health-related work functioning of workers with health problems. The aim of this study is to evaluate the reliability, validity (including five new items) and responsiveness of the WRFQ 2.0 in the working population. A longitudinal study was conducted among workers. The reliability (internal consistency, test-retest reliability, measurement error), validity (structural validity by factor analysis, construct validity by means of hypotheses testing) and responsiveness of the WRFQ 2.0 were evaluated. A total of N = 553 workers completed the survey. The final WRFQ 2.0 has four subscales and showed very good internal consistency, moderate test-retest reliability, good construct validity and moderate responsiveness in the working population. The WRFQ was able to distinguish between groups with different levels of mental health, physical health, fatigue and need for recovery. A moderate correlation was found between the WRFQ and the related constructs of work ability and work productivity. A weak relationship was found with general self-rated health, work engagement and work involvement. The WRFQ 2.0 is a reliable and valid instrument to measure health-related work functioning in the working population. Further validation in larger samples is recommended, especially for test-retest reliability, responsiveness and the questionnaire's ability to predict the future course of health-related work functioning.
Chen, Yi-Miau; Huang, Yi-Jing; Huang, Chien-Yu; Lin, Gong-Hong; Liaw, Lih-Jiun; Lee, Shih-Chieh; Hsieh, Ching-Lin
2017-10-01
The 3-point Berg Balance Scale (BBS-3P) and 3-point Postural Assessment Scale for Stroke Patients (PASS-3P) were simplified from the BBS and PASS to overcome the complex scoring systems. The BBS-3P and PASS-3P were more feasible in busy clinical practice and showed similarly sound validity and responsiveness to the original measures. However, the reliability of the BBS-3P and PASS-3P is unknown, limiting their utility and the interpretability of scores. We aimed to examine the test-retest reliability and minimal detectable change (MDC) of the BBS-3P and PASS-3P in patients with stroke. Cross-sectional study. The rehabilitation departments of a medical center and a community hospital. A total of 51 chronic stroke patients (64.7% male). Both balance measures were administered twice 7 days apart. The test-retest reliability of both the BBS-3P and PASS-3P was examined by intraclass correlation coefficients (ICC). The MDC and its percentage over the total score (MDC%) of each measure were calculated for examining the random measurement errors. The ICC values of the BBS-3P and PASS-3P were 0.99 and 0.97, respectively. The MDC% (MDC) of the BBS-3P and PASS-3P were 9.1% (5.1 points) and 8.4% (3.0 points), respectively, indicating that both measures had small and acceptable random measurement errors. Our results showed that both the BBS-3P and the PASS-3P had good test-retest reliability, with small and acceptable random measurement error. These two simplified 3-level balance measures can provide reliable results over time. Our findings support the repeated administration of the BBS-3P and PASS-3P to monitor the balance of patients with stroke. The MDC values can help clinicians and researchers interpret the change scores more precisely.
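The relation between ICC, MDC and MDC% described above can be made concrete with a short Python sketch. The abstract defines MDC% as the MDC expressed as a percentage of the total score but does not give the MDC formula. The version below assumes the conventional one, MDC95 = 1.96 × √2 × SEM with SEM = SD × √(1 − ICC), and the function names are hypothetical.

```python
import math


def sem_from_icc(sd, icc):
    """Standard error of measurement from a score SD and an ICC (assumed formula)."""
    return sd * math.sqrt(1.0 - icc)


def mdc95(sem):
    """Minimal detectable change at the 95% confidence level (assumed formula)."""
    return 1.96 * math.sqrt(2.0) * sem


def mdc_percent(mdc, total_score):
    """MDC as a percentage of the scale's total score."""
    return 100.0 * mdc / total_score


# Reported BBS-3P MDC of 5.1 points; a 56-point total score is an assumption
# that reproduces the reported MDC% of 9.1%.
print(round(mdc_percent(5.1, 56), 1))
```

Because the SEM shrinks with √(1 − ICC), moving from an ICC of 0.84 to 0.99 cuts the measurement error to a quarter for the same score spread, which is why MDC% differs so much between instruments with superficially similar reliability.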
Internal consistency and stability of the CANTAB neuropsychological test battery in children.
Syväoja, Heidi J; Tammelin, Tuija H; Ahonen, Timo; Räsänen, Pekka; Tolvanen, Asko; Kankaanpää, Anna; Kantomaa, Marko T
2015-06-01
The Cambridge Neuropsychological Test Automated Battery (CANTAB) is a computer-assessed test battery widely used in different populations. The internal consistency and 1-year stability of CANTAB tests were examined in school-age children. Two hundred thirty children (57% girls) from five schools in the Jyväskylä school district in Finland participated in the study in spring 2011. The children completed the following CANTAB tests: (a) visual memory (pattern recognition memory [PRM] and spatial recognition memory [SRM]), (b) executive function (spatial span [SSP], Stockings of Cambridge [SOC], and intra-extra dimensional set shift [IED]), and (c) attention (reaction time [RTI] and rapid visual information processing [RVP]). Seventy-four children participated in the follow-up measurements (64% girls) in spring 2012. Cronbach's alpha reliability coefficient was used to estimate the internal consistency of the nonhampering test, and structural equation models were applied to examine the stability of these tests. The reliability and the stability could not be determined for IED or SSP because of the nature of these tests. The internal consistency was acceptable only in the RTI task. The 1-year stability was moderate-to-good for the PRM, RTI, and RVP. The SSP and IED showed a moderate correlation between the two measurement points. The SRM and the SOC tasks were not reliable or stable measures in this study population. For research purposes, we recommend using structural equation modeling to improve reliability. The results suggest that the reliability and the stability of computer-based test batteries should be confirmed in the target population before using them for clinical or research purposes. (c) 2015 APA, all rights reserved.
Reliability of the detailed assessment of speed of handwriting on Flemish children.
Simons, Johan; Probst, Michel
2014-01-01
This study evaluates the reliability of the Detailed Assessment of Speed of Handwriting (DASH) in a Dutch-speaking sample of children. The sample included 650 boys and 513 girls (age range = 9-16 years). Handwriting speed measurements were obtained using the DASH. Interrater agreement, test-retest reliability, and internal consistency were calculated; gender and age effects were analyzed. Interrater agreement was excellent, with intraclass correlation coefficients of at least 0.94. Test-retest correlations ranged from r = 0.65 to r = 0.81. The internal consistency measures, calculated with Cronbach's alpha, were between 0.88 and 0.94. Both gender and age have a significant effect on handwriting speed, with F(7, 1144) = 17.43 (P < .001) for gender and F(7, 1144) = 21.8 (P < .001) for age. The DASH is a reliable assessment tool to evaluate handwriting speed of Dutch-speaking children. Girls tend to write faster than boys.
Lario, Sergio; Ramírez-Lázaro, María José; Montserrat, Antònia; Quílez, María Elisa; Junquera, Félix; Martínez-Bauer, Eva; Sanfeliu, Isabel; Brullet, Enric; Campo, Rafael; Segura, Ferran; Calvet, Xavier
2016-06-01
Immunochromatographic tests need to be improved in order to enhance their reliability. Recently, several new kits have appeared on the market. The objective was to evaluate the diagnostic accuracy of three monoclonal rapid stool tests - the new Uni-Gold™ H.pylori Antigen (Trinity Biotech, Ireland), the RAPID Hp StAR (Oxoid Ltd., UK) and the ImmunoCard STAT! HpSA (Meridian Diagnostics, USA) - for detecting H. pylori infection prior to eradication treatment. Diagnostic accuracy (sensitivity and specificity) and reliability (concordance between observers) were evaluated in 250 untreated consecutive dyspeptic patients. The gold standard for diagnosing H. pylori infection was defined as the concordance of two or more of rapid urease test (RUT), histopathology and urease breath test (UBT) or positive culture in isolation. Readings of immunochromatographic tests were performed by two different observers. Sensitivity, specificity, positive and negative predictive values and 95% confidence intervals were calculated. Sensitivity and specificity were compared using the McNemar test. The three tests showed a good correlation, with Kappa values>0.9. RAPID Hp StAR had a sensitivity of 91%-92% and a specificity ranging from 77% to 85%. Its sensitivity was higher than that of Uni-Gold™ H.pylori Antigen and ImmunoCard STAT! HpSA (p<0.01). Uni-Gold™ H.pylori Antigen kit showed a sensitivity of 83%, similar to ImmunoCard STAT! HpSA. Specificity of Uni-Gold™ H.pylori Antigen approached 90% (87-89%) and was superior to that of RAPID Hp StAR (p<0.01). Uni-Gold™ H.pylori Antigen and ImmunoCard STAT! HpSA present similar levels of diagnostic accuracy. RAPID Hp StAR was the most sensitive but less reliable of the three immunochromatographic stool tests. None are as accurate and reliable as UBT, RUT and histology. Copyright © 2016 The Canadian Society of Clinical Chemists. Published by Elsevier Inc. All rights reserved.
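The accuracy and concordance measures in the abstract above (sensitivity, specificity, and Kappa between two observers) can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, the abstract does not report the underlying counts, and unweighted Cohen's kappa is assumed.

```python
def sensitivity(true_pos, false_neg):
    """Share of infected patients (per the gold standard) that the test detects."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Share of uninfected patients that the test correctly reports as negative."""
    return true_neg / (true_neg + false_pos)


def cohen_kappa(reader_a, reader_b):
    """Unweighted Cohen's kappa for two observers reading the same strips."""
    n = len(reader_a)
    labels = set(reader_a) | set(reader_b)
    # Observed agreement between the two readers.
    p_obs = sum(a == b for a, b in zip(reader_a, reader_b)) / n
    # Agreement expected by chance from each reader's marginal frequencies.
    p_exp = sum(
        (reader_a.count(lab) / n) * (reader_b.count(lab) / n) for lab in labels
    )
    if p_exp == 1.0:
        return 1.0  # both readers used a single identical label throughout
    return (p_obs - p_exp) / (1.0 - p_exp)
```

Kappa corrects raw agreement for chance, so two observers who agree on 75% of readings can still score only 0.5 when positive and negative results are evenly split.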
Measuring cognitive change with ImPACT: the aggregate baseline approach.
Bruce, Jared M; Echemendia, Ruben J; Meeuwisse, Willem; Hutchison, Michael G; Aubry, Mark; Comper, Paul
2017-11-01
The Immediate Post-Concussion Assessment and Cognitive Test (ImPACT) is commonly used to assess baseline and post-injury cognition among athletes in North America. Despite this, several studies have questioned the reliability of ImPACT when given at intervals employed in clinical practice. Poor test-retest reliability reduces test sensitivity to cognitive decline, increasing the likelihood that concussed athletes will be returned to play prematurely. We recently showed that the reliability of ImPACT can be increased when using a new composite structure and the aggregate of two baselines to predict subsequent performance. The purpose of the present study was to confirm our previous findings and determine whether the addition of a third baseline would further increase the test-retest reliability of ImPACT. Data from 97 English speaking professional hockey players who had received at least 4 ImPACT baseline evaluations were extracted from a National Hockey League Concussion Program database. Linear regression was used to determine whether each of the first three testing sessions accounted for unique variance in the fourth testing session. Results confirmed that the aggregate baseline approach improves the psychometric properties of ImPACT, with most indices demonstrating adequate or better test-retest reliability for clinical use. The aggregate baseline approach provides a modest clinical benefit when recent baselines are available - and a more substantial benefit when compared to approaches that obtain baseline measures only once during the course of a multi-year playing career. Pending confirmation in diverse samples, neuropsychologists are encouraged to use the aggregate baseline approach to best quantify cognitive change following sports concussion.
Donath, Lars; Ludyga, Sebastian; Hammes, Daniel; Rossmeissl, Anja; Andergassen, Nadin; Zahner, Lukas; Faude, Oliver
2017-10-25
Aging is accompanied by a decline of executive function. Aerobic exercise training induces moderate improvements of cognitive domains (i.e., attention, processing, executive function, memory) in seniors. Most conclusive data are obtained from studies with dementia or cognitive impairment. Confident detection of exercise training effects requires adequate between-day reliability and low day-to-day variability obtained from acute studies, respectively. These absolute and relative reliability measures have not yet been examined for a single aerobic training session in seniors. Twenty-two healthy and physically active seniors (age: 69 ± 3 y, BMI: 24.8 ± 2.2, VO2peak: 32 ± 6 mL/kg/bodyweight) were enrolled in this randomized controlled cross-over study. A repeated between-day comparison [i.e., day 1 (habituation) vs. day 2 & day 2 vs. day 3] of executive function testing (Eriksen-Flanker-Test, Stroop-Color-Test, Digit-Span, Five-Point-Test) before and after aerobic cycling exercise at 70% of the heart rate reserve [0.7 × (HRmax - HRrest)] was conducted. Reliability measures were calculated for pre, post and change scores. Large between-day differences between day 1 and 2 were found for reaction times (Flanker- and Stroop Color testing) and completed figures (Five-Point test) at pre and post testing (0.002 < p < 0.05, 0.16 < ηp² < 0.38). These differences notably declined when comparing day 2 and 3. Absolute between-day variability (CoV) dropped from 10 to 5% when comparing day 2 vs. day 3 instead of day 1 vs. day 2. Also, ICC ranges increased from day 1 vs. day 2 (0.65 < ICC < 0.87) to day 2 vs. day 3 (0.40 < ICC < 0.93). Interestingly, reliability measures for pre-post change scores were low (0.02 < ICC < 0.71). These data did not improve when comparing day 2 with day 3. During inhibition tests, reaction times showed excellent reliability values compared to the poor to fair reliability of accuracy.
Notable habituation to the whole testing procedure should be considered as it increased the reliability of different executive function tests. Change scores of executive function after acute aerobic exercise cannot be detected reliably. Large intra- and inter-individual variability of responses to acute aerobic exercise in seniors can be presumed.
Houx, P J; Shepherd, J; Blauw, G-J; Murphy, M B; Ford, I; Bollen, E L; Buckley, B; Stott, D J; Jukema, W; Hyland, M; Gaw, A; Norrie, J; Kamper, A M; Perry, I J; MacFarlane, P W; Meinders, A Edo; Sweeney, B J; Packard, C J; Twomey, C; Cobbe, S M; Westendorp, R G
2002-10-01
For large scale follow up studies with non-demented patients in which cognition is an endpoint, there is a need for short, inexpensive, sensitive, and reliable neuropsychological tests that are suitable for repeated measurements. The commonly used Mini-Mental-State-Examination fulfils only the first two requirements. In the PROspective Study of Pravastatin in the Elderly at Risk (PROSPER), 5804 elderly subjects aged 70 to 82 years were examined using a learning test (memory), a coding test (general speed), and a short version of the Stroop test (attention). Data presented here were collected at dual baseline, before randomisation for active treatment. The tests proved to be reliable (with test/retest reliabilities ranging from acceptable (r=0.63) to high (r=0.88)) and sensitive enough to detect small differences between subjects from different age categories. All tests showed significant practice effects: performance increased from the first measurement to the first follow up after two weeks. Normative data are provided that can be used for one time neuropsychological testing as well as for assessing individual and group change. Methods for analysing cognitive change are proposed.
Foerster, Rebecca M.; Poth, Christian H.; Behler, Christian; Botsch, Mario; Schneider, Werner X.
2016-01-01
Neuropsychological assessment of human visual processing capabilities strongly depends on visual testing conditions including room lighting, stimuli, and viewing-distance. This limits standardization, threatens reliability, and prevents the assessment of core visual functions such as visual processing speed. Increasingly available virtual reality devices make it possible to address these problems. One such device is the portable, light-weight, and easy-to-use Oculus Rift. It is head-mounted and covers the entire visual field, thereby shielding and standardizing the visual stimulation. A fundamental prerequisite for using Oculus Rift in neuropsychological assessment is sufficient test-retest reliability. Here, we compare the test-retest reliabilities of Bundesen’s visual processing components (visual processing speed, threshold of conscious perception, capacity of visual working memory) as measured with Oculus Rift and a standard CRT computer screen. Our results show that Oculus Rift allows the processing components to be measured as reliably as with the standard CRT. This means that Oculus Rift is applicable for standardized and reliable assessment and diagnosis of elementary cognitive functions in laboratory and clinical settings. Oculus Rift thus provides the opportunity to compare visual processing components between individuals and institutions and to establish statistical norm distributions. PMID:27869220
Sun, Wei; Song, Qipeng; Yu, Bing; Zhang, Cui; Mao, Dewei
2015-01-01
This study aimed to evaluate the test-retest reliability of a new device for assessing ankle joint kinesthesia. This device could measure the passive motion threshold of four ankle joint movements, namely plantarflexion, dorsiflexion, inversion and eversion. A total of 21 healthy adults, including 13 males and 8 females, participated in the study. Each participant completed two sessions on two separate days with a 1-week interval. The sessions were administered by the same experimenter in the same laboratory. At least 12 trials (three successful trials in each of the four directions) were performed in each session. The mean values in each direction were calculated and analysed. The ICC values of test-retest reliability ranged from 0.737 (dorsiflexion) to 0.935 (eversion), whereas the SEM values ranged from 0.21° (plantarflexion) to 0.52° (inversion). The Bland-Altman plots showed that the reliability of plantarflexion-dorsiflexion was better than that of inversion-eversion. The results indicate that the reliability of the new device is fair to excellent. The new device could be used to examine ankle joint kinesthesia.
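The abstract above reports an ICC together with a standard error of measurement (SEM) but does not say which ICC form was used. As a sketch under stated assumptions, the code below computes ICC(2,1) (two-way random effects, absolute agreement, single measure) from a subjects-by-sessions table and derives SEM with the common convention SEM = SD × sqrt(1 − ICC). The function names and the threshold values are hypothetical.

```python
import math
from statistics import mean, stdev

def icc_2_1(scores):
    """ICC(2,1) for a table with one row per subject, one column per session."""
    n, k = len(scores), len(scores[0])
    grand = mean(v for row in scores for v in row)
    row_means = [mean(row) for row in scores]
    col_means = [mean(row[j] for row in scores) for j in range(k)]
    # Two-way ANOVA sums of squares: subjects, sessions, residual error.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in scores for v in row)
    ss_err = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )

def sem_from_icc(scores, icc):
    """SEM = pooled SD * sqrt(1 - ICC); one common convention, assumed here."""
    pooled_sd = stdev(v for row in scores for v in row)
    return pooled_sd * math.sqrt(1 - icc)

# Hypothetical passive motion thresholds in degrees (session 1, session 2).
thresholds = [[1.2, 1.3], [0.9, 1.0], [1.8, 1.6], [1.1, 1.1], [1.5, 1.7]]
icc = icc_2_1(thresholds)
sem = sem_from_icc(thresholds, icc)
print(f"ICC(2,1) = {icc:.3f}, SEM = {sem:.3f} deg")
```

Other ICC forms (for example consistency rather than absolute agreement) would use the same mean squares with a different denominator, so the choice of form should be reported alongside the value.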
Validation and cross cultural adaptation of the Italian version of the Harris Hip Score.
Dettoni, Federico; Pellegrino, Pietro; La Russa, Massimo R; Bonasia, Davide E; Blonna, Davide; Bruzzone, Matteo; Castoldi, Filippo; Rossi, Roberto
2015-01-01
The Harris Hip Score (HHS) is one of the most widely used health related quality of life (HRQOL) measures for the assessment of hip pathology; in spite of this, neither a validation study nor an official Italian version has been provided yet. The aim of this study was to create a valid and reliable Italian version of the HHS. The score was translated into Italian and adapted; then 103 patients with different hip pathologies were evaluated using this HHS version and also with the WOMAC and the SF-12 questionnaires. Content, construct and criterion validities were tested, as were interobserver reliability, test-retest reliability and internal consistency. Cross-cultural adaptation was easy, and only minor adaptation was required in the translation process. Construct and criterion validity of the HHS Italian Version were confirmed by satisfactory values of Spearman's Rho for correlation between specific domains of HHS and WOMAC and SF-12 scores. Interobserver and test-retest reliabilities obtained values of 0.996 and 0.975 respectively; Cronbach's alpha for internal consistency was 0.816. Statistical and clinical analysis showed that HHS is highly valid and reliable in this new Italian version.
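Internal consistency in the abstract above is reported as Cronbach's alpha. A minimal sketch of the standard formula, alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores), is given below; the function name and the item responses are hypothetical and are not taken from the study.

```python
from statistics import variance

def cronbach_alpha(responses):
    """Cronbach's alpha; one row per respondent, one column per item."""
    k = len(responses[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in responses]) for j in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical answers of five respondents to four questionnaire items.
answers = [
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [3, 3, 4, 3],
    [5, 5, 5, 4],
    [1, 2, 1, 2],
]
alpha = cronbach_alpha(answers)
print(f"Cronbach's alpha = {alpha:.3f}")
```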
Foerster, Rebecca M; Poth, Christian H; Behler, Christian; Botsch, Mario; Schneider, Werner X
2016-11-21
Neuropsychological assessment of human visual processing capabilities strongly depends on visual testing conditions including room lighting, stimuli, and viewing-distance. This limits standardization, threatens reliability, and prevents the assessment of core visual functions such as visual processing speed. Increasingly available virtual reality devices make it possible to address these problems. One such device is the portable, light-weight, and easy-to-use Oculus Rift. It is head-mounted and covers the entire visual field, thereby shielding and standardizing the visual stimulation. A fundamental prerequisite for using Oculus Rift in neuropsychological assessment is sufficient test-retest reliability. Here, we compare the test-retest reliabilities of Bundesen's visual processing components (visual processing speed, threshold of conscious perception, capacity of visual working memory) as measured with Oculus Rift and a standard CRT computer screen. Our results show that Oculus Rift allows the processing components to be measured as reliably as with the standard CRT. This means that Oculus Rift is applicable for standardized and reliable assessment and diagnosis of elementary cognitive functions in laboratory and clinical settings. Oculus Rift thus provides the opportunity to compare visual processing components between individuals and institutions and to establish statistical norm distributions.
Toward Extending the Educational Interpreter Performance Assessment to Cued Speech
Krause, Jean C.; Kegl, Judy A.; Schick, Brenda
2008-01-01
The Educational Interpreter Performance Assessment (EIPA) is an important research tool for examining the quality of interpreters who use American Sign Language or a sign system in classroom settings, but it is not currently applicable to educational interpreters who use Cued Speech (CS). In order to determine the feasibility of extending the EIPA to include CS, a pilot EIPA test was developed and administered to 24 educational CS interpreters. Fifteen of the interpreters’ performances were evaluated two to three times in order to assess reliability. Results show that the instrument has good construct validity and test–retest reliability. Although more interrater reliability data are needed, intrarater reliability was quite high (0.9), suggesting that the pilot test can be rated as reliably as signing versions of the EIPA. Notably, only 48% of interpreters who formally participated in pilot testing performed at a level that could be considered minimally acceptable. In light of similar performance levels previously reported for interpreters who sign (e.g., Schick, Williams, & Kupermintz, 2006), these results suggest that interpreting services for deaf and hard-of-hearing students, regardless of the communication option used, are often inadequate and could seriously hinder access to the classroom environment. PMID:18042791
Intra- and interobserver reliability of quantitative ultrasound measurement of the plantar fascia.
Rathleff, Michael Skovdal; Moelgaard, Carsten; Lykkegaard Olesen, Jens
2011-01-01
To determine intra- and interobserver reliability and measurement precision of sonographic assessment of plantar fascia thickness when using one, the mean of two, or the mean of three measurements. Two experienced observers scanned 20 healthy subjects twice with 60 minutes between test and retest. A GE LOGIQe ultrasound scanner was used in the study. The built-in software in the scanner was used to measure the thickness of the plantar fascia (PF). Reliability was calculated using intraclass correlation coefficient (ICC) and limits of agreement (LOA). Intraobserver reliability (ICC) using one measurement was 0.50 for one observer and 0.52 for the other, and using the mean of three measurements intraobserver reliability increased up to 0.77 and 0.67, respectively. Interobserver reliability (ICC) when using one measurement was 0.62 and increased to 0.82 when using the average of three measurements. LOA showed that when using the average of three measurements, LOA decreased to 0.6 mm, corresponding to 17.5% of the mean thickness of the PF. The results showed that reliability increases when using the mean of three measurements compared with one. Limits of agreement based on intratester reliability show that changes in thickness that are larger than 0.6 mm can be considered actual changes in thickness and not a result of measurement error. Copyright © 2011 Wiley Periodicals, Inc.
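The gain in reliability from averaging repeated measurements, as described above, can be approximated with the Spearman-Brown prophecy formula. This is a sketch of that general formula and not the method the authors used to obtain their ICCs; the function name is hypothetical, and only the single-measurement interobserver ICC of 0.62 is taken from the abstract.

```python
def spearman_brown(single_reliability, k):
    """Predicted reliability of the mean of k parallel measurements."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return k * single_reliability / (1 + (k - 1) * single_reliability)

# Interobserver ICC for one measurement, as reported in the abstract above.
icc_single = 0.62
icc_mean_of_three = spearman_brown(icc_single, 3)
print(f"predicted ICC for the mean of 3 = {icc_mean_of_three:.2f}")
```

The prediction of roughly 0.83 is close to the reported 0.82, although the formula assumes parallel measurements with independent errors, which real repeated scans may not satisfy.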
Development and Validation of a Test for Bulimia.
ERIC Educational Resources Information Center
Smith, Marcia C.; Thelen, Mark H.
1984-01-01
Developed the Bulimia Test (BULIT) based on responses of clinically identified females (N=18) and normal female college students (N=119) to preliminary test items. Results showed that the BULIT provided an objective, reliable, and valid measure by which to identify individuals with symptoms of bulimia. (Instrument is appended.) (LLL)
How Many Sleep Diary Entries Are Needed to Reliably Estimate Adolescent Sleep?
Short, Michelle A; Arora, Teresa; Gradisar, Michael; Taheri, Shahrad; Carskadon, Mary A
2017-03-01
To investigate (1) how many nights of sleep diary entries are required for reliable estimates of five sleep-related outcomes (bedtime, wake time, sleep onset latency [SOL], sleep duration, and wake after sleep onset [WASO]) and (2) the test-retest reliability of sleep diary estimates of school night sleep across 12 weeks. Data were drawn from four adolescent samples (Australia [n = 385], Qatar [n = 245], United Kingdom [n = 770], and United States [n = 366]), who provided 1766 eligible sleep diary weeks for reliability analyses. We performed reliability analyses for each cohort using complete data (7 days), one to five school nights, and one to two weekend nights. We also performed test-retest reliability analyses on 12-week sleep diary data available from a subgroup of 55 US adolescents. Intraclass correlation coefficients for bedtime, SOL, and sleep duration indicated good-to-excellent reliability from five weekday nights of sleep diary entries across all adolescent cohorts. Four school nights were sufficient for wake times in the Australian and UK samples, but not the US or Qatari samples. Only Australian adolescents showed good reliability for two weekend nights of bedtime reports; estimates of SOL were adequate for UK adolescents based on two weekend nights. WASO was not reliably estimated using 1 week of sleep diaries. We observed excellent test-retest reliability across 12 weeks of sleep diary data in a subsample of US adolescents. We recommend that at least five weekday nights of sleep diary entries be collected when studying adolescent bedtimes, SOL, and sleep duration. Adolescent sleep patterns were stable across 12 consecutive school weeks. © Sleep Research Society 2017. Published by Oxford University Press on behalf of the Sleep Research Society. All rights reserved.
Manzoor, Behzad; Suleiman, Mahmood; Palmer, Richard M
2013-01-01
The crestal bone level around a dental implant may influence its strength characteristics by offering protection against mechanical failures. Therefore, the present study investigated the effect of simulated bone loss on modes, loads, and cycles to failure in an in vitro model. Different amounts of bone loss were simulated: 0, 1.5, 3.0, and 4.5 mm from the implant head. Forty narrow-diameter (3.0-mm) implant-abutment assemblies were tested using compressive bending and cyclic fatigue testing. Weibull and accelerated life testing analysis were used to assess reliability and functional life. Statistical analyses were performed using the Fisher-Exact test and the Spearman ranked correlation. Compressive bending tests showed that the level of bone loss influenced the load-bearing capacity of implant-abutment assemblies. Fatigue testing showed that the modes, loads, and cycles to failure had a statistically significant relationship with the level of bone loss. All 16 samples with bone loss of 3.0 mm or more experienced horizontal implant body fractures. In contrast, 14 of 16 samples with 0 and 1.5 mm of bone loss showed abutment and screw fractures. Weibull and accelerated life testing analysis indicated a two-group distribution: the 0- and 1.5-mm bone loss samples had better functional life and reliability than the 3.0- and 4.5-mm samples. Progressive bone loss had a significant effect on modes, loads, and cycles to failure. In addition, bone loss influenced the functional life and reliability of the implant-abutment assemblies. Maintaining crestal bone levels is important in ensuring biomechanical sustainability and predictable long-term function of dental implant assemblies.
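Weibull analysis, mentioned above for assessing reliability and functional life, models the probability that a specimen survives a given number of load cycles. The sketch below uses the two-parameter Weibull reliability function R(t) = exp(−(t/η)^β); the shape and scale values are hypothetical placeholders for two bone-loss groups and are not results from the study.

```python
import math

def weibull_reliability(cycles, shape, scale):
    """Probability of surviving `cycles` under a two-parameter Weibull model."""
    if cycles < 0 or shape <= 0 or scale <= 0:
        raise ValueError("cycles must be >= 0; shape and scale must be > 0")
    return math.exp(-((cycles / scale) ** shape))

# Hypothetical parameters: scale is the characteristic life in load cycles.
groups = {
    "low bone loss (hypothetical)": {"shape": 2.0, "scale": 500_000},
    "high bone loss (hypothetical)": {"shape": 2.0, "scale": 200_000},
}
for name, params in groups.items():
    r = weibull_reliability(100_000, **params)
    print(f"{name}: R(100k cycles) = {r:.3f}")
```

At a number of cycles equal to the scale parameter, survival is exp(−1), about 37%, regardless of the shape parameter, which is why the scale is called the characteristic life.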
Vertical jumping tests in volleyball: reliability, validity, and playing-position specifics.
Sattler, Tine; Sekulic, Damir; Hadzic, Vedran; Uljevic, Ognjen; Dervisevic, Edvin
2012-06-01
Vertical jumping is known to be important in volleyball, and jumping performance tests are frequently studied for their reliability and validity. However, most studies concerning jumping in volleyball have dealt with standard rather than sport-specific jumping procedures and tests. The aims of this study, therefore, were (a) to determine the reliability and factorial validity of 2 volleyball-specific jumping tests, the block jump (BJ) test and the attack jump (AJ) test, relative to 2 frequently used and systematically validated jumping tests, the countermovement jump test and the squat jump test and (b) to establish volleyball position-specific differences in the jumping tests and simple anthropometric indices (body height [BH], body weight, and body mass index [BMI]). The BJ was performed from a defensive volleyball position, with the hands positioned in front of the chest. During an AJ, the players used a 2- to 3-step approach and performed a drop jump with an arm swing followed by a quick vertical jump. A total of 95 high-level volleyball players (all men) participated in this study. The reliability of the jumping tests ranged from 0.97 to 0.99 for Cronbach's alpha coefficients, from 0.93 to 0.97 for interitem correlation coefficients and from 2.1 to 2.8 for coefficients of variation. The highest reliability was found for the specific jumping tests. The factor analysis extracted one significant component, and all of the tests were highly intercorrelated. The analysis of variance with post hoc analysis showed significant differences between 5 playing positions in some of the jumping tests. In general, receivers had a greater jumping capacity, followed by libero players. The differences in jumping capacities should be emphasized vis-a-vis differences in the anthropometric measures of players, where middle hitters had higher BH and body weight, followed by opposite hitters and receivers, with no differences in the BMI between positions.
Iversen, J V; Bartels, E M; Jørgensen, J E; Nielsen, T G; Ginnerup, C; Lind, M C; Langberg, H
2016-12-01
The VISA-A questionnaire has proven to be a valid and reliable tool for assessing severity of Achilles tendinopathy (AT). The aim was to translate and cross-culturally adapt the VISA-A questionnaire for a Danish-speaking AT population, and subsequently perform validity and reliability tests. Translation and subsequent cross-cultural adaptation were performed as translation, synthesis, reverse translation, expert review, and pretesting. The final Danish version (VISA-A-DK) was tested for reliability on healthy controls (n = 75) and patients (n = 36). Tests for internal consistency, validity, and structure were performed on 71 patients. VISA-A-DK showed good reliability for patients (r = 0.80, ICC = 0.79) and healthy individuals (r = 0.98, ICC = 0.97). Internal consistency was 0.73 (Cronbach's alpha). The mean VISA-A-DK score in AT patients was 51 (47-55). This was significantly lower than that of healthy controls, who had a score of 93 (90-95). Criterion validity was considered good when comparing the scores of the Danish version with the original version in both healthy individuals and patients. VISA-A-DK is a valid and reliable instrument and has been shown to be comparable to the original version in the assessment of AT patients. VISA-A-DK is a useful tool in the assessment of AT, both in research and in a clinical setting. © 2015 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
López-Plaza, Diego; Juan-Recio, Casto; Barbado, David; Ruiz-Pérez, Iñaki; Vera-Garcia, Francisco J
2018-05-18
Although the Star Excursion Balance test (SEBT) has shown good intrasession reliability, the intersession reliability of this test has not been studied in depth. Furthermore, the lower limbs clearly have a strong influence on SEBT performance, so even if it has been used to measure core stability, it is possibly not the most suitable measurement. The aims of this study were (1) to assess the absolute and relative between-session reliability of the SEBT and 2 novel variations of this test to assess trunk postural control while sitting, i.e., the Star Excursion Sitting Test (SEST) and the Star Excursion Timing Test (SETT); and (2) to analyze the relationships between these 3 test scores. Correlational and reliability test-retest study. Controlled laboratory environment. Twenty-seven physically active men (age: 24.54 ± 3.05 years). Relative and absolute reliability of the SEBT, SEST, and SETT were calculated through the intraclass correlation coefficient (ICC) and standard error of measurement (SEM), respectively. A Pearson correlation analysis was carried out between the variables of the 3 tests. Maximum normalized reach distances were assessed for different SEBT and SEST directions. In addition, composite indexes were calculated for SEBT, SEST, and SETT. The SEBT (dominant leg: ICC = 0.87 [0.73-0.94], SEM = 2.12 [1.66-2.93]; nondominant leg: ICC = 0.74 [0.50-0.87], SEM = 3.23 [2.54-4.45]), SEST (ICC = 0.85 [0.68-0.92], SEM = 1.27 [1.03-1.80]), and SETT (ICC = 0.61 [0.30-0.80], SEM = 2.31 [1.82-3.17]) composite indexes showed moderate-to-high 1-month reliability. A learning effect was detected for some SEBT and SEST directions and for SEST and SETT composite indexes. No significant correlations were found between SEBT and its 2 variations (r ≤ .366; P > .05). A significant correlation was found between the SEST and SETT composite indexes (r = .520; P > .01). SEBT, SEST, and SETT are reliable field protocols to measure postural control. 
However, whereas the SEBT assesses postural control in single-leg stance, SEST and SETT provide trunk postural control measures with lower influence of the lower-limbs. To be determined. Copyright © 2018 American Academy of Physical Medicine and Rehabilitation. Published by Elsevier Inc. All rights reserved.
E-Service Quality Evaluation on E-Government Website: Case Study BPJS Kesehatan Indonesia
NASA Astrophysics Data System (ADS)
Rasyid, A.; Alfina, I.
2017-01-01
This research intends to develop a model to evaluate the quality of e-services on e-government. The proposed model consists of seven dimensions: web design, reliability, responsiveness, privacy and security, personalization, information, and ease of use. The model is used to measure the quality of the e-registration of BPJS Kesehatan, an Indonesian government health insurance program. The validation and reliability testing show that of the seven dimensions proposed, only four are suitable for the case study. The result shows that the BPJS Kesehatan e-registration service is good in the reliability and responsiveness dimensions, while in the web design and ease of use dimensions the e-service still needs to be optimized.
Inter-arch digital model vs. manual cast measurements: Accuracy and reliability.
Kiviahde, Heikki; Bukovac, Lea; Jussila, Päivi; Pesonen, Paula; Sipilä, Kirsi; Raustia, Aune; Pirttiniemi, Pertti
2017-06-28
The purpose of this study was to evaluate the accuracy and reliability of inter-arch measurements using digital dental models and conventional dental casts. Thirty sets of dental casts with permanent dentition were examined. Manual measurements were done with a digital caliper directly on the dental casts, and digital measurements were made on 3D models by two independent examiners. Intra-class correlation coefficients (ICC), a paired sample t-test or Wilcoxon signed-rank test, and Bland-Altman plots were used to evaluate intra- and inter-examiner error and to determine the accuracy and reliability of the measurements. The ICC values were generally good for manual and excellent for digital measurements. The Bland-Altman plots of all the measurements showed good agreement between the manual and digital methods and excellent inter-examiner agreement using the digital method. Inter-arch occlusal measurements on digital models are accurate and reliable and are superior to manual measurements.
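Agreement between two measurement methods, assessed above with Bland-Altman plots, is usually summarized by the mean difference (bias) and the 95% limits of agreement, bias ± 1.96 × SD of the differences. The sketch below computes those three numbers; the function name and the paired measurements in millimetres are hypothetical.

```python
from statistics import mean, stdev

def bland_altman_limits(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need at least two paired measurements")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)  # half-width of the 95% limits of agreement
    return bias, bias - spread, bias + spread

# Hypothetical overjet measurements (mm): manual caliper vs. digital model.
manual = [3.1, 2.8, 4.0, 3.5, 2.9, 3.7]
digital = [3.0, 2.9, 3.8, 3.6, 2.8, 3.6]
bias, lower, upper = bland_altman_limits(manual, digital)
print(f"bias = {bias:.3f} mm, limits of agreement = [{lower:.3f}, {upper:.3f}] mm")
```

Whether limits of a given width count as good agreement is a clinical judgment about how large a discrepancy matters, not a property of the statistic itself.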
Kneebone, Ian I.; Dewar, Sophie J.
2016-01-01
Background: The current study aimed to examine the psychometric properties of an attributional style measure that can be administered remotely to people who have multiple sclerosis (MS). Methods: A total of 495 participants with MS were recruited. Participants completed the Attributional Style Questionnaire-Survey (ASQ-S) and two comparison measures of cognitive variables via postal survey on three occasions, each 12 months apart. Internal reliability, test-retest reliability and congruent validity were considered. Results: The internal reliability of the ASQ-S was good (α > 0.7). The test-retest correlations were significant, but failed to reach the 0.7 threshold that had been set. The congruent validity of the ASQ-S was established relative to the comparisons. Conclusions: The psychometric properties of the ASQ-S indicate that it shows promise as a tool for researchers investigating depression in people with MS and is likely sound to use clinically in this population. PMID:28450893
Nutrition environment measures survey-vending: development, dissemination, and reliability.
Voss, Carol; Klein, Susan; Glanz, Karen; Clawson, Margaret
2012-07-01
Researchers determined a need to develop an instrument to assess the vending machine environment that was comparably reliable and valid to other Nutrition Environment Measures Survey tools and that would provide consistent and comparable data for businesses, schools, and communities. Tool development, reliability testing, and dissemination of the Nutrition Environment Measures Survey-Vending (NEMS-V) involved a collaboration of students, professionals, and community leaders. Interrater reliability testing showed high levels of agreement among trained raters on the products and evaluations of products. NEMS-V can benefit public health partners implementing policy and environmental change initiatives as a part of their community wellness activities. The vending machine project will support a policy calling for state facilities to provide a minimum of 30% of foods and beverages in vending machines as healthy options, based on NEMS-V criteria, which will be used as a model for other businesses.
Soltanparast, Sanaz; Jafari, Zahra; Sameni, Seyed Jalal; Salehi, Masoud
2014-01-01
The purpose of the present study was to evaluate the psychometric properties (validity and reliability) of the Persian version of the Sustained Auditory Attention Capacity Test in children with attention deficit hyperactivity disorder. The Persian version of the Sustained Auditory Attention Capacity Test was constructed to assess sustained auditory attention using the method provided by Feniman and colleagues (2007). In this test, comments were provided to assess the child's attentional deficit by determining inattention and impulsiveness error, the total scores of the sustained auditory attention capacity test and attention span reduction index. To determine validity and reliability in the present study, 46 normal children and 41 children with attention deficit hyperactivity disorder (ADHD), all right-handed, aged between 7 and 11 years, and of both genders, were evaluated with both the Rey Auditory Verbal Learning test and the Persian version of the Sustained Auditory Attention Capacity Test (SAACT). In determining convergent validity, a significant negative correlation was found between the three parts of the Rey Auditory Verbal Learning test (first, fifth, and immediate recall) and all indicators of the SAACT except attention span reduction. By comparing the test scores between the normal and ADHD groups, discriminant validity analysis showed significant differences in all indicators of the test except for attention span reduction (p< 0.001). The Persian version of the Sustained Auditory Attention Capacity Test has good validity and reliability, comparable to other reliable tests, and it can be used to identify children with attention deficits and those suspected of having attention deficit hyperactivity disorder.
Reliability and Validity of a New Test of Agility and Skill for Female Amateur Soccer Players
Kutlu, Mehmet; Yapici, Hakan; Yilmaz, Abdullah
2017-01-01
The aim of this study was to evaluate the Agility and Skill Test, which had been recently developed to assess agility and skill in female athletes. Following a 10 min warm-up, two trials to test the reliability and validity of the test were conducted one week apart. Measurements were collected to compare soccer players’ physical performance in a 20 m sprint, a T-Drill test, the Illinois Agility Run Test, change-of-direction and acceleration, as well as agility and skill. All tests were completed in the same order. Thirty-four amateur female soccer players were recruited (age = 20.8 ± 1.9 years; body height = 166 ± 6.9 cm; body mass = 55.5 ± 5.8 kg). To determine the reliability and usefulness of these tests, paired sample t-tests, intra-class correlation coefficients, typical error, coefficient of variation, and differences between the typical error and smallest worthwhile change statistics were computed. Test results showed no significant differences between the two sessions (p > 0.01). There were high intra-class correlations between the test and retest values (r = 0.94–0.99) for all tests. Typical error values were below the smallest worthwhile change, indicating ‘good’ usefulness for these tests. A near perfect Pearson correlation between the Agility and Skill Test (r = 0.98) was found, and there were moderate-to-large levels of correlation between the Agility and Skill Test and other measures (r = 0.37 to r = 0.56). The results of this study suggest that the Agility and Skill Test is a reliable and valid test for female soccer players and has significant value for assessing the integrative agility and skill capability of soccer players. PMID:28469760
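The usefulness criterion mentioned above compares the typical error (TE) of a test with its smallest worthwhile change (SWC). The abstract does not give the formulas, so the sketch below assumes two widely used conventions: TE = SD of the test-retest differences / sqrt(2), and SWC = 0.2 × between-subject SD. The function names and the trial times are hypothetical.

```python
import math
from statistics import stdev

def typical_error(trial_1, trial_2):
    """Typical error: SD of the test-retest differences divided by sqrt(2)."""
    diffs = [b - a for a, b in zip(trial_1, trial_2)]
    return stdev(diffs) / math.sqrt(2)

def smallest_worthwhile_change(trial_1, factor=0.2):
    """SWC as a fraction of between-subject SD (0.2 is an assumed convention)."""
    return factor * stdev(trial_1)

# Hypothetical test completion times in seconds, two sessions a week apart.
session_1 = [14.2, 15.8, 13.1, 16.4, 14.9, 17.0]
session_2 = [14.3, 15.7, 13.2, 16.5, 14.8, 17.1]
te = typical_error(session_1, session_2)
swc = smallest_worthwhile_change(session_1)
print(f"TE = {te:.3f} s, SWC = {swc:.3f} s, TE below SWC: {te < swc}")
```

When TE is below SWC, measurement noise is smaller than the smallest change considered meaningful, so the test can in principle detect such a change in an individual player.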
Tung, Li-Chen; Yu, Wan-Hui; Lin, Gong-Hong; Yu, Tzu-Ying; Wu, Chien-Te; Tsai, Chia-Yin; Chou, Willy; Chen, Mei-Hsiang; Hsieh, Ching-Lin
2016-09-01
To develop a Tablet-based Symbol Digit Modalities Test (T-SDMT) and to examine the test-retest reliability and concurrent validity of the T-SDMT in patients with stroke. The study had two phases. In the first phase, six experts, nine college students and five outpatients participated in the development and testing of the T-SDMT. In the second phase, 52 outpatients were evaluated twice (2 weeks apart) with the T-SDMT and SDMT to examine the test-retest reliability and concurrent validity of the T-SDMT. The T-SDMT was developed via expert input and college student/patient feedback. Regarding test-retest reliability, the practise effects of the T-SDMT and SDMT were both trivial (d=0.12) but significant (p≦0.015). The improvement in the T-SDMT (4.7%) was smaller than that in the SDMT (5.6%). The minimal detectable changes (MDC%) of the T-SDMT and SDMT were 6.7 (22.8%) and 10.3 (32.8%), respectively. The T-SDMT and SDMT were highly correlated with each other at the two time points (Pearson's r=0.90-0.91). The T-SDMT demonstrated good concurrent validity with the SDMT. Because the T-SDMT had a smaller practise effect and less random measurement error (superior test-retest reliability), it is recommended over the SDMT for assessing information processing speed in patients with stroke. Implications for Rehabilitation: The Symbol Digit Modalities Test (SDMT), a common measure of information processing speed, showed a substantial practise effect and considerable random measurement error in patients with stroke. The Tablet-based SDMT (T-SDMT) has been developed to reduce the practise effect and random measurement error of the SDMT in patients with stroke. The T-SDMT had smaller practise effect and random measurement error than the SDMT, which can provide more reliable assessments of information processing speed.
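The minimal detectable change (MDC) and MDC% reported above are typically derived from the standard error of measurement. The abstract does not state the exact formula, so the sketch below assumes the common 95% version, MDC = 1.96 × sqrt(2) × SEM, with MDC% expressing it relative to the mean score. The function names and input values are hypothetical.

```python
import math

def mdc95(sem):
    """Minimal detectable change at 95% confidence from the SEM."""
    return 1.96 * math.sqrt(2) * sem

def mdc_percent(mdc, mean_score):
    """MDC expressed as a percentage of the mean score."""
    return 100 * mdc / mean_score

# Hypothetical SEM and mean score for a processing speed test (not study data).
sem_value = 2.0
mean_value = 25.0
mdc = mdc95(sem_value)
print(f"MDC95 = {mdc:.2f}, MDC% = {mdc_percent(mdc, mean_value):.1f}%")
```

A change between two assessments smaller than the MDC cannot be distinguished from random measurement error for an individual patient.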
Fundamentals of endoscopic surgery: creation and validation of the hands-on test.
Vassiliou, Melina C; Dunkin, Brian J; Fried, Gerald M; Mellinger, John D; Trus, Thadeus; Kaneva, Pepa; Lyons, Calvin; Korndorffer, James R; Ujiki, Michael; Velanovich, Vic; Kochman, Michael L; Tsuda, Shawn; Martinez, Jose; Scott, Daniel J; Korus, Gary; Park, Adrian; Marks, Jeffrey M
2014-03-01
The Fundamentals of Endoscopic Surgery™ (FES) program consists of online materials and didactic and skills-based tests. All components were designed to measure the skills and knowledge required to perform safe flexible endoscopy. The purpose of this multicenter study was to evaluate the reliability and validity of the hands-on component of the FES examination, and to establish the pass score. Expert endoscopists identified the critical skill set required for flexible endoscopy. These skills were then modeled in a virtual reality simulator (GI Mentor™ II, Simbionix™ Ltd., Airport City, Israel) to create five tasks and metrics. Scores were designed to measure both speed and precision. Validity evidence was assessed by correlating performance with self-reported endoscopic experience (surgeons and gastroenterologists [GIs]). Internal consistency of each test task was assessed using Cronbach's alpha. Test-retest reliability was determined by having the same participant perform the test a second time and comparing their scores. Passing scores were determined by a contrasting groups methodology and use of receiver operating characteristic curves. A total of 160 participants (17% GIs) performed the simulator test. Scores on the five tasks showed good internal consistency reliability and all had significant correlations with endoscopic experience. Total FES scores correlated 0.73 with participants' level of endoscopic experience, providing evidence of their validity, and their internal consistency reliability (Cronbach's alpha) was 0.82. Test-retest reliability was assessed in 11 participants, and the intraclass correlation was 0.85. The passing score was determined and is estimated to have a sensitivity (true positive rate) of 0.81 and a 1-specificity (false positive rate) of 0.21. The FES hands-on skills test examines the basic procedural components required to perform safe flexible endoscopy. It meets rigorous standards of reliability and validity required for high-stakes examinations, and, together with the knowledge component, may help contribute to the definition and determination of competence in endoscopy.
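For readers unfamiliar with the internal consistency statistic reported above, the following is a minimal sketch of Cronbach's alpha in its standard form, k/(k-1) × (1 - sum of item variances / variance of total scores). The task scores are hypothetical and are not FES data.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """item_scores: one list per item, each holding one score per participant."""
    k = len(item_scores)
    # Total score per participant, summed across items.
    totals = [sum(scores) for scores in zip(*item_scores)]
    item_var_sum = sum(variance(item) for item in item_scores)
    return k / (k - 1) * (1 - item_var_sum / variance(totals))


# Hypothetical scores for three tasks and four participants.
tasks = [
    [2, 4, 6, 8],
    [1, 3, 5, 9],
    [2, 3, 6, 7],
]
print(f"alpha = {cronbach_alpha(tasks):.3f}")
```

Alpha approaches 1 when the items rise and fall together across participants, which is what the reported 0.82 suggests for the five FES tasks.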
Inter-rater reliability of three standardized functional tests in patients with low back pain
Tidstrand, Johan; Horneij, Eva
2009-01-01
Background Of all patients with low back pain, 85% are diagnosed as "non-specific lumbar pain". Lumbar instability has been described as one specific diagnosis which several authors have described as delayed muscular responses, impaired postural control as well as impaired muscular coordination among these patients. This has mostly been measured and evaluated in a laboratory setting. There are few standardized and evaluated functional tests, examining functional muscular coordination which are also applicable in the non-laboratory setting. In ordinary clinical work, tests of functional muscular coordination should be easy to apply. The aim of this present study was to therefore standardize and examine the inter-rater reliability of three functional tests of muscular functional coordination of the lumbar spine in patients with low back pain. Methods Nineteen consecutive individuals, ten men and nine women were included. (Mean age 42 years, SD ± 12 yrs). Two independent examiners assessed three tests: "single limb stance", "sitting on a Bobath ball with one leg lifted" and "unilateral pelvic lift" on the same occasion. The standardization procedure took altered positions of the spine or pelvis and compensatory movements of the free extremities into account. The inter-rater reliability was analyzed by Cohen's kappa coefficient (κ) and by percentage agreement. Results The inter-rater reliability for the right and the left leg respectively was: for the single limb stance very good (κ: 0.88–1.0), for sitting on a Bobath ball good (κ: 0.79) and very good (κ: 0.88) and for the unilateral pelvic lift: good (κ: 0.61) and moderate (κ: 0.47). Conclusion The present study showed good to very good inter-rater reliability for two standardized tests, that is, the single-limb stance and sitting on a Bobath-ball with one leg lifted. Inter-rater reliability for the unilateral pelvic lift test was moderate to good. 
Validation of the tests in their ability to evaluate lumbar stability is required. PMID:19490644
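The two agreement statistics used in the study above, percentage agreement and Cohen's kappa, can be sketched as below. The ratings are hypothetical; the function names are illustrative and not taken from the paper.

```python
from collections import Counter


def percent_agreement(rater_1, rater_2):
    """Share of subjects on which the two raters gave the same rating."""
    matches = sum(a == b for a, b in zip(rater_1, rater_2))
    return 100.0 * matches / len(rater_1)


def cohens_kappa(rater_1, rater_2):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(rater_1)
    p_observed = sum(a == b for a, b in zip(rater_1, rater_2)) / n
    counts_1, counts_2 = Counter(rater_1), Counter(rater_2)
    categories = set(counts_1) | set(counts_2)
    # Chance agreement from each rater's marginal category frequencies.
    p_expected = sum(counts_1[c] * counts_2[c] for c in categories) / n ** 2
    # Undefined (division by zero) if both raters use a single category.
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical ratings for ten patients ("pos" = test judged positive).
examiner_a = ["pos"] * 5 + ["neg"] * 5
examiner_b = ["pos"] * 4 + ["neg"] * 6
print(percent_agreement(examiner_a, examiner_b))
print(round(cohens_kappa(examiner_a, examiner_b), 2))
```

Kappa is lower than raw agreement because part of the observed agreement would be expected by chance alone.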
Zuvela, Frane; Bozanic, Ana; Miletic, Durdica
2011-01-01
Inadequately adopted fundamental movement skills (FMS) in early childhood may have a negative impact on the motor performance in later life (Gallahue and Ozmun, 2005). The need for an efficient FMS testing in Physical Education was recognized. The aim of this paper was to construct and validate a new FMS test for 8 year old children. Ninety-five 8 year old children were used for the testing. A total of 24 new FMS tasks were constructed and only the best representatives of movement areas entered into the final test product - FMS-POLYGON. The ICC showed high values for all 24 tasks (0.83-0.97) and the factorial analysis revealed the best representatives of each movement area that entered the FMS-POLYGON: tossing and catching the volleyball against a wall, running across obstacles, carrying the medicine balls, and straight running. The ICC for the FMS-POLYGON showed a very high result (0.98) and, therefore, confirmed the test's intra-rater reliability. Concurrent validity was tested with the use of the "Test of Gross Motor Development" (TGMD-2). Correlation analysis between the newly constructed FMS-POLYGON and the TGMD-2 revealed the coefficient of -0.82 which indicates a high correlation. In conclusion, the new test for FMS assessment proved to be a reliable and valid instrument for 8 year old children. Application of this test in schools is justified and could play an important factor in physical education and sport practice. Key points: All 21 newly constructed tasks demonstrated high intra-rater reliability (0.83-0.97) in FMS assessment. High reliability was also noted in the FMS-POLYGON test (0.98). A high correlation was found between the FMS-POLYGON and TGMD-2 which is a confirmation of the new test's concurrent validity. The research resolved the problem of long and detailed FMS assessment by adding a new dimension using quick and effective norm-referenced approach but also covering all the most important movement areas. New and validated test can be of great use primarily in school practice for physical education teachers and FMS experts.
Code of Federal Regulations, 2010 CFR
2010-01-01
... to verify the existence and reliability of such tests, bear directly on whether the firm acted... relevance to the latter two remedies is whether reasonable and representative tests were performed... fabrics or on reasonable and representative tests showing that the fabric covered by the guaranty or used...
Lam, S S
2001-02-01
In 1990 Podsakoff, MacKenzie, Moorman, and Fetter developed a scale to measure the five dimensions of organizational citizenship behavior. Test-retest data over 15 weeks are reported for this scale for a sample of 82 female and 32 male Chinese tellers (ages 18 to 54 years) from a large international bank in Hong Kong. Stability was .83, and there was no significant change between Times 1 and 2. Analysis indicated the five-factor structure and showed it to be a reliable measure when used with a nonwestern sample.
Llerena, Katiah; Wynn, Jonathan K; Hajcak, Greg; Green, Michael F; Horan, William P
2016-07-01
Accurately monitoring one's performance on daily life tasks, and integrating internal and external performance feedback are necessary for guiding productive behavior. Although internal feedback processing, as indexed by the error-related negativity (ERN), is consistently impaired in schizophrenia, initial findings suggest that external performance feedback processing, as indexed by the feedback negativity (FN), may actually be intact. The current study evaluated internal and external feedback processing task performance and test-retest reliability in schizophrenia. 92 schizophrenia outpatients and 63 healthy controls completed a flanker task (ERN) and a time estimation task (FN). Analyses examined the ΔERN and ΔFN defined as difference waves between correct/positive versus error/negative feedback conditions. A temporal principal component analysis was conducted to distinguish the ΔERN and ΔFN from overlapping neural responses. We also assessed test-retest reliability of ΔERN and ΔFN in patients over a 4-week interval. Patients showed reduced ΔERN accompanied by intact ΔFN. In patients, test-retest reliability for both ΔERN and ΔFN over a four-week period was fair to good. Individuals with schizophrenia show a pattern of impaired internal, but intact external, feedback processing. This pattern has implications for understanding the nature and neural correlates of impaired feedback processing in schizophrenia. Published by Elsevier B.V.
Glaister, Mark; Stone, Michael H; Stewart, Andrew M; Hughes, Michael; Moir, Gavin L
2004-08-01
The purpose of the present study was to assess the reliability and validity of fatigue measures, as derived from 4 separate formulae, during tests of repeat sprint ability. On separate days over a 3-week period, 2 groups of 7 recreationally active men completed 6 trials of 1 of 2 maximal (20 x 5 seconds) intermittent cycling tests with contrasting recovery periods (10 or 30 seconds). All trials were conducted on a friction-braked cycle ergometer, and fatigue scores were derived from measures of mean power output for each sprint. Apart from formula 1, which calculated fatigue from the percentage difference in mean power output between the first and last sprint, all remaining formulae produced fatigue scores that showed a reasonably good level of test-retest reliability in both intermittent test protocols (intraclass correlation range: 0.78-0.86; 95% likely range of true values: 0.54-0.97). Although between-protocol differences in the magnitude of the fatigue scores suggested good construct validity, within-protocol differences highlighted limitations with each formula. Overall, the results support the use of the percentage decrement score as the most valid and reliable measure of fatigue during brief maximal intermittent work.
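Two of the fatigue calculations discussed above can be sketched as follows. The first-versus-last formula follows the description in the abstract. The percentage decrement formula is written in the form commonly used for power data (total output compared with an "ideal" test in which every sprint equals the best one); the abstract does not print it, so treat that form as an assumption. The power values are hypothetical.

```python
def first_last_difference(powers):
    """Percentage drop in mean power output from the first to the last sprint."""
    return 100.0 * (powers[0] - powers[-1]) / powers[0]


def percentage_decrement(powers):
    """Shortfall of total output relative to an ideal test (best sprint repeated)."""
    ideal_total = max(powers) * len(powers)
    return 100.0 * (1 - sum(powers) / ideal_total)


# Hypothetical mean power output (W) for five sprints.
sprint_powers = [1000, 950, 900, 850, 800]
print(f"first vs last: {first_last_difference(sprint_powers):.1f}%")
print(f"percentage decrement: {percentage_decrement(sprint_powers):.1f}%")
```

The first-versus-last score depends on only two sprints, whereas the percentage decrement uses every sprint, which is one plausible reason for the difference in reliability reported above.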
The reliability and validity of fatigue measures during multiple-sprint work: an issue revisited.
Glaister, Mark; Howatson, Glyn; Pattison, John R; McInnes, Gill
2008-09-01
The ability to repeatedly produce a high-power output or sprint speed is a key fitness component of most field and court sports. The aim of this study was to evaluate the validity and reliability of eight different approaches to quantify this parameter in tests of multiple-sprint performance. Ten physically active men completed two trials of each of two multiple-sprint running protocols with contrasting recovery periods. Protocol 1 consisted of 12 x 30-m sprints repeated every 35 seconds; protocol 2 consisted of 12 x 30-m sprints repeated every 65 seconds. All testing was performed in an indoor sports facility, and sprint times were recorded using twin-beam photocells. All but one of the formulae showed good construct validity, as evidenced by similar within-protocol fatigue scores. However, the assumptions on which many of the formulae were based, combined with poor or inconsistent test-retest reliability (coefficient of variation range: 0.8-145.7%; intraclass correlation coefficient range: 0.09-0.75), suggested many problems regarding logical validity. In line with previous research, the results support the percentage decrement calculation as the most valid and reliable method of quantifying fatigue in tests of multiple-sprint performance.
Validation of the breast evaluation questionnaire for breast hypertrophy and breast reduction.
Lewin, Richard; Elander, Anna; Lundberg, Jonas; Hansson, Emma; Thorarinsson, Andri; Claudelin, Malin; Bladh, Helena; Lidén, Mattias
2018-06-13
There is a lack of published, validated questionnaires for evaluating psychosocial morbidity in patients with breast hypertrophy undergoing breast reduction surgery. To validate the breast evaluation questionnaire (BEQ), originally developed for the assessment of breast augmentation patients, for the assessment of psychosocial morbidity in patients with breast hypertrophy undergoing breast reduction surgery. Validation study. Subjects: Women with macromastia. Methods: The validation of the BEQ, adapted to breast reduction, was performed in several steps. Content validity, reliability, construct validity and responsiveness were assessed. The original version was adjusted according to the results for content validity and resulted in item reduction and a modified BEQ (mBEQ) that was then assessed for reliability, construct validity and responsiveness. Internal and external validation was performed for the modified BEQ. Convergent validity was tested against Breast-Q (reduction) and discriminant validity was tested against the SF-36. Known-groups validation revealed significant differences between the normal population and patients undergoing breast reduction surgery. The BEQ showed good reliability by test-retest analysis and high responsiveness. The modified BEQ may be a reliable, valid and responsive instrument for assessing women who undergo breast reduction.
Peterson, Jennifer R.; Hill, Catherine C.; Kirkpatrick, Kimberly
2016-01-01
Impulsive choice is typically measured by presenting smaller-sooner (SS) versus larger-later (LL) rewards, with biases towards the SS indicating impulsivity. The current study tested rats on different impulsive choice procedures with LL delay manipulations to assess same-form and alternate-form test-retest reliability. In the systematic-GE procedure (Green & Estle, 2003), the LL delay increased after several sessions of training; in the systematic-ER procedure (Evenden & Ryan, 1996), the delay increased within each session; and in the adjusting-M procedure (Mazur, 1987), the delay changed after each block of trials within a session based on each rat’s choices in the previous block. In addition to measuring choice behavior, we also assessed temporal tracking of the LL delays using the median times of responding during LL trials. The two systematic procedures yielded similar results in both choice and temporal tracking measures following extensive training, whereas the adjusting procedure resulted in relatively more impulsive choices and poorer temporal tracking. Overall, the three procedures produced acceptable same form test-retest reliability over time, but the adjusting procedure did not show significant alternate form test-retest reliability with the other two procedures. The results suggest that systematic procedures may supply better measurements of impulsive choice in rats. PMID:25490901
Fagerstone, Kathleen A.; Johns, Brad E.
1987-01-01
A 0.05-g transponder implanted subcutaneously was tested to see if it provided a reliable identification method. In laboratory tests 20 domestic ferrets (Mustela putorius furo) received transponders and were monitored for a minimum of 6 months. None showed signs of inflammation, and necropsies conducted at the end of the study showed no scar tissue or transponder migration. Seven of 23 transponders failed during the test because of leakage through the plastic case, and a glass case is now being manufactured that does not have the leakage problem. During mark-recapture studies in September and October 1985, transponders were implanted in 20 black-footed ferrets (M. nigripes), 11 of which were subsequently recaptured and 9 of which were brought into captivity; none showed signs of inflammation. Transponders provide a reliable new method for identifying hard-to-mark wildlife with a unique, permanent number that can be read with the animal in-hand or by remote equipment.
NASA Astrophysics Data System (ADS)
Yu, Zheng
2002-08-01
Under the new demands of the optical fiber communications market, the performance and reliability of optical network systems depend almost entirely on the qualification of their fiber optic components. Complying with system requirements through Telcordia/Bellcore reliability and high-power testing has therefore become a key issue for fiber optic component manufacturers, and it helps determine who stands out in an intensely competitive market. This testing also needs maintenance and optimization. Work on reliability and high-power testing has become a new market demand, and an approach is needed to reach the 'Triple-Win' goal expected by component makers, reliability testers, and system users. For those facing practical problems with the testing, the following seven topics address how to troubleshoot common mistakes in performing qualified reliability and high-power testing: (1) qualification maintenance requirements for reliability testing; (2) lot control when preparing for reliability testing; (3) sample selection for reliability testing; (4) interim measurements during reliability testing; (5) basic reference factors relating to high-power testing; (6) the necessity of re-qualification testing when production changes; and (7) understanding product-family similarity from the definitions.
Highly reliable contacts for lead-salt diode lasers
NASA Astrophysics Data System (ADS)
Lo, W.
1981-02-01
In order to improve the long term reliability of lead-salt diode lasers, ohmic contacts of multilayer, thin-film structures consisting of In plus Au, Pt, Ni, and Pd have been studied. Diode lasers of PbSnTe fabricated with a variety of contacts were tested during room-temperature storage and during accelerated aging tests. The results show that contact reliability can be improved when multiple overlapping films are used. After 4500 h of baking at 60 C, lasers with In-Au-Pd-Au contacts on both sides showed the least resistance increase (10%). For lasers with In-Au-Pt-Au contacts, 1 h of baking at 60 C is equivalent to 2 d storage at room temperature. Extrapolating these results, a 70% increase in contact resistance is expected for this type of laser after 9000 d of storage at room temperature. The data also suggests that a smaller increase in contact resistance can be expected for lasers fabricated with In-Au-Ni and In-Au-Pd-Au contacts.
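The extrapolation in the abstract above is simple arithmetic, sketched below. It assumes that the 9000 d figure comes from applying the stated equivalence (1 h at 60 C corresponds to 2 d at room temperature) to the 4500 h bake; the constant and function names are hypothetical.

```python
HOURS_PER_DAY = 24
DAYS_PER_BAKE_HOUR = 2  # stated in the abstract for In-Au-Pt-Au contacts


def equivalent_storage_days(bake_hours, days_per_bake_hour=DAYS_PER_BAKE_HOUR):
    """Room-temperature storage time matched by a given bake time at 60 C."""
    return bake_hours * days_per_bake_hour


def acceleration_factor(days_per_bake_hour=DAYS_PER_BAKE_HOUR):
    """Ratio of room-temperature time to bake time, both in hours."""
    return days_per_bake_hour * HOURS_PER_DAY


print(equivalent_storage_days(4500))  # days of storage matched by the 4500 h bake
print(acceleration_factor())          # hours of storage per hour of baking
```

Expressed this way, the 60 C bake ages the In-Au-Pt-Au contacts 48 times faster than room-temperature storage.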
Development of a reliable method to assess footwear comfort during running.
Mündermann, Anne; Nigg, Benno M; Stefanyshyn, Darren J; Humble, R Neil
2002-08-01
The purposes of this study were: (a) to determine whether subjects are able to distinguish between differences in footwear with respect to footwear comfort; and (b) to determine how reliably footwear comfort can be assessed using a visual analogue scale (VAS) and a protocol including a control condition during running. Intraclass correlation coefficients (ICCs) between comfort ratings for repeated conditions were high (ICC = 0.799). Differences in comfort ratings between the insert conditions were significant. A paired t-test revealed a significant difference in overall comfort ratings for the control insert when tested after the soft insert compared to when tested after the hard insert (P = 0.008). The results of this study showed that VASs provide a reliable measure to assess footwear comfort during running under the conditions that: (a) a control condition is included; and (b) the average comfort rating of sessions 4-6 is used. Copyright 2002 Elsevier Science B.V.
Long, Brandon R.; Rinaldo, Steven G.; Gallagher, Kevin G.; ...
2016-11-09
Coin-cells are often the test format of choice for laboratories engaged in battery research and development as they provide a convenient platform for rapid testing of new materials on a small scale. However, reliable, reproducible data via the coin-cell format is inherently difficult, particularly in the full-cell configuration. In addition, statistical evaluation to prove the consistency and reliability of such data is often neglected. Herein we report on several studies aimed at formalizing physical process parameters and coin-cell construction related to full cells. Statistical analysis and performance benchmarking approaches are advocated as a means to more confidently track changes in cell performance. Finally, we show that trends in the electrochemical data obtained from coin-cells can be reliable and informative when standardized approaches are implemented in a consistent manner.
ERIC Educational Resources Information Center
Meyer, Ilan H.; And Others
1996-01-01
Structured clinical interviews concerning childhood histories of physical and sexual abuse with 70 mentally ill women at 2 times found test-retest reliability of .63 for physical abuse and .82 for sexual abuse. Validity, assessed as consistency with an independent clinical assessment, showed 75% agreement for physical abuse and 93% agreement for…
Lima Rodríguez, Joaquín Salvador; Lima Serrano, Marta; Jiménez Picón, Nerea; Domínguez Sánchez, Isabel
2012-10-01
Family health determines, and is determined by, the family's capacity to function effectively as a biosocial unit in a given culture and society. The aim of the study was to test the reliability and construct validity of an instrument to assess the Self-perception of Family Health Status. We validated its content through an online Delphi panel of experts. We surveyed 258 families in their homes or in primary health centres in Seville, Spain. We administered the instrument, which has five Likert scales: Family climate, Family integrity, Family functioning, and Family resistance. We tested reliability by Cronbach's alpha and construct validity by exploratory factor analysis. The five scales obtained α values between 0.73 for Family Climate and 0.89 for Family Integrity. They showed evidence of one-dimensional interpretation after factor analysis: a) all items had weights r>0.30 on the first factor before rotation, b) the first factor explained a significant proportion of variance before rotation, and c) the total variance explained by the main factors extracted was greater than 50%. The scales showed reliability and validity. They could be employed to assess the self-perception of family health status.
Bower, Kelly J.; McGinley, Jennifer L.; Miller, Kimberly J.; Clark, Ross A.
2014-01-01
Background and Objectives The Wii Balance Board (WBB) is a globally accessible device that shows promise as a clinically useful balance assessment tool. Although the WBB has been found to be comparable to a laboratory-grade force platform for obtaining centre of pressure data, it has not been comprehensively studied in clinical populations. The aim of this study was to investigate the measurement properties of tests utilising the WBB in people after stroke. Methods Thirty individuals who were more than three months post-stroke and able to stand unsupported were recruited from a single outpatient rehabilitation facility. Participants performed standardised assessments incorporating the WBB and customised software (static stance with eyes open and closed, static weight-bearing asymmetry, dynamic mediolateral weight shifting and dynamic sit-to-stand) in addition to commonly employed clinical tests (10 Metre Walk Test, Timed Up and Go, Step Test and Functional Reach) on two testing occasions one week apart. Test-retest reliability and construct validity of the WBB tests were investigated. Results All WBB-based outcomes were found to be highly reliable between testing occasions (ICC = 0.82 to 0.98). Correlations were poor to moderate between WBB variables and clinical tests, with the strongest associations observed between task-related activities, such as WBB mediolateral weight shifting and the Step Test. Conclusions The WBB, used with customised software, is a reliable and potentially useful tool for the assessment of balance and weight-bearing asymmetry following stroke. Future research is recommended to further investigate validity and responsiveness. PMID:25541939
Bower, Kelly J; McGinley, Jennifer L; Miller, Kimberly J; Clark, Ross A
2014-01-01
The Wii Balance Board (WBB) is a globally accessible device that shows promise as a clinically useful balance assessment tool. Although the WBB has been found to be comparable to a laboratory-grade force platform for obtaining centre of pressure data, it has not been comprehensively studied in clinical populations. The aim of this study was to investigate the measurement properties of tests utilising the WBB in people after stroke. Thirty individuals who were more than three months post-stroke and able to stand unsupported were recruited from a single outpatient rehabilitation facility. Participants performed standardised assessments incorporating the WBB and customised software (static stance with eyes open and closed, static weight-bearing asymmetry, dynamic mediolateral weight shifting and dynamic sit-to-stand) in addition to commonly employed clinical tests (10 Metre Walk Test, Timed Up and Go, Step Test and Functional Reach) on two testing occasions one week apart. Test-retest reliability and construct validity of the WBB tests were investigated. All WBB-based outcomes were found to be highly reliable between testing occasions (ICC = 0.82 to 0.98). Correlations were poor to moderate between WBB variables and clinical tests, with the strongest associations observed between task-related activities, such as WBB mediolateral weight shifting and the Step Test. The WBB, used with customised software, is a reliable and potentially useful tool for the assessment of balance and weight-bearing asymmetry following stroke. Future research is recommended to further investigate validity and responsiveness.
Citronberg, Jessica S; Wilkens, Lynne R; Lim, Unhee; Hullar, Meredith A J; White, Emily; Newcomb, Polly A; Le Marchand, Loïc; Lampe, Johanna W
2016-09-01
Plasma lipopolysaccharide-binding protein (LBP), a measure of internal exposure to bacterial lipopolysaccharide, has been associated with several chronic conditions and may be a marker of chronic inflammation; however, no studies have examined the reliability of this biomarker in a healthy population. We examined the temporal reliability of LBP measured in archived samples from participants in two studies. In Study one, 60 healthy participants had blood drawn at two time points: baseline and follow-up (either three, six, or nine months). In Study two, 24 individuals had blood drawn three to four times over a seven-month period. We measured LBP in archived plasma by ELISA. Test-retest reliability was estimated by calculating the intraclass correlation coefficient (ICC). Plasma LBP concentrations showed moderate reliability in Study one (ICC 0.60, 95 % CI 0.43-0.75) and Study two (ICC 0.46, 95 % CI 0.26-0.69). Restricting the follow-up period improved reliability. In Study one, the reliability of LBP over a three-month period was 0.68 (95 % CI: 0.41-0.87). In Study two, the ICC of samples taken ≤seven days apart was 0.61 (95 % CI 0.29-0.86). Plasma LBP concentrations demonstrated moderate test-retest reliability in healthy individuals with reliability improving over a shorter follow-up period.
Combined evaluation of commonly used techniques, including PCR, for diagnosis of mouse fur mites.
Karlsson, Eleanor M; Pearson, Laura M; Kuzma, Kristen M; Burkholder, Tanya H
2014-01-01
Our study evaluated and compared the false-negative rates (FNR) of a wide array of fur-mite diagnostic tests, including 2 postmortem tests (pelt exam and sticky paper) and 3 antemortem tests (adhesive tape, fur pluck, and PCR). Past publications examining fur-mite diagnostic techniques primarily used paired comparisons, evaluating tests by their level of agreement with only one other test. However, different combinations or pairs of diagnostics are used in the different studies, making the results of these comparisons difficult to interpret across all available diagnostics. In the current study, mice from a conventionally maintained colony endemic for Myobia musculi were identified as positive based on at least one positive diagnostic test. From this pool of positive animals, the FNR of all tests were quantified. The PCR assay and the pelt exam performed the best, with 0% and 2% FNR respectively, whereas tape, fur-pluck, and sticky-paper tests showed 24%, 26%, and 36% FNR, respectively. Our study shows that for mice in a colony naturally infested with Myobia musculi, PCR testing can be used for reliable antemortem detection, and pelt exam performed by experienced examiners is reliable for postmortem detection.
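The pooled false-negative rate described above can be sketched as follows: an animal counts as positive if at least one test detects mites, and each test's FNR is the share of those positives it missed. The records and test names below are hypothetical, not the study's data.

```python
def false_negative_rates(results):
    """results: list of dicts mapping test name -> bool (True = mites detected)."""
    # Reference standard: positive on at least one diagnostic test.
    positives = [animal for animal in results if any(animal.values())]
    if not positives:
        raise ValueError("no positive animals in the pool")
    tests = positives[0].keys()
    return {
        test: sum(not animal[test] for animal in positives) / len(positives)
        for test in tests
    }


# Hypothetical results for five mice and two tests.
mice = [
    {"pcr": True, "tape": True},
    {"pcr": True, "tape": False},
    {"pcr": False, "tape": False},  # negative on every test, excluded from pool
    {"pcr": True, "tape": True},
    {"pcr": True, "tape": False},
]
print(false_negative_rates(mice))
```

Because the positive pool is defined by the tests themselves, an animal missed by every test never enters the denominator, so rates computed this way are lower bounds on the true FNR.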
Scaglioni-Solano, Pietro; Aragón-Vargas, Luis F
2014-06-01
Standing balance is an important motor task. Postural instability associated with age typically arises from deterioration of peripheral sensory systems. The modified Clinical Test of Sensory Integration for Balance and the Tandem test have been used to screen for balance. Timed tests present some limitations, whereas quantification of the motions of the center of pressure (CoP) with portable and inexpensive equipment may help to improve the sensitivity of these tests and give the possibility of widespread use. This study determines the validity and reliability of the Wii Balance Board (Wii BB) to quantify CoP motions during the mentioned tests. Thirty-seven older adults completed three repetitions of five balance conditions: eyes open, eyes closed, eyes open on a compliant surface, eyes closed on a compliant surface, and tandem stance, all performed on a force plate and a Wii BB simultaneously. Twenty participants repeated the trials for reliability purposes. CoP displacement was the main outcome measure. Regression analysis indicated that the Wii BB has excellent concurrent validity, and Bland-Altman plots showed good agreement between devices with small mean differences and no relationship between the difference and the mean. Intraclass correlation coefficients (ICCs) indicated modest-to-excellent test-retest reliability (ICC=0.64-0.85). Standard error of measurement and minimal detectable change were similar for both devices, except the 'eyes closed' condition, with greater standard error of measurement for the Wii BB. In conclusion, the Wii BB is shown to be a valid and reliable method to quantify CoP displacement in older adults.
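The Bland-Altman agreement analysis mentioned above reduces to a mean difference (bias) and limits of agreement at bias ± 1.96 SD of the differences. The sketch below uses that standard form with hypothetical centre-of-pressure displacements; the variable names are illustrative.

```python
from statistics import mean, stdev


def bland_altman(device_a, device_b, z=1.96):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    diffs = [a - b for a, b in zip(device_a, device_b)]
    bias = mean(diffs)
    spread = z * stdev(diffs)
    return bias, bias - spread, bias + spread


# Hypothetical CoP displacements (cm) from a force plate and a balance board.
force_plate = [1.0, 2.0, 3.0, 4.0]
balance_board = [1.1, 1.9, 3.2, 3.8]
bias, lower, upper = bland_altman(force_plate, balance_board)
print(f"bias = {bias:.3f}, limits of agreement = [{lower:.3f}, {upper:.3f}]")
```

A bias near zero with narrow limits, and no trend of the differences against the pairwise means, is the pattern the abstract describes as good agreement.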
The reliability of WorkWell Systems Functional Capacity Evaluation: a systematic review
2014-01-01
Background Functional capacity evaluation (FCE) determines a person’s ability to perform work-related tasks and is a major component of the rehabilitation process. The WorkWell Systems (WWS) FCE (formerly known as Isernhagen Work Systems FCE) is currently the most commonly used FCE tool in German rehabilitation centres. Our systematic review investigated the inter-rater, intra-rater and test-retest reliability of the WWS FCE. Methods We performed a systematic literature search of studies on the reliability of the WWS FCE and extracted item-specific measures of inter-rater, intra-rater and test-retest reliability from the identified studies. Intraclass correlation coefficients ≥ 0.75, percentages of agreement ≥ 80%, and kappa coefficients ≥ 0.60 were categorised as acceptable, otherwise they were considered non-acceptable. The extracted values were summarised for the five performance categories of the WWS FCE, and the results were classified as either consistent or inconsistent. Results From 11 identified studies, 150 item-specific reliability measures were extracted. 89% of the extracted inter-rater reliability measures, all of the intra-rater reliability measures and 96% of the test-retest reliability measures of the weight handling and strength tests had an acceptable level of reliability, compared to only 67% of the test-retest reliability measures of the posture/mobility tests and 56% of the test-retest reliability measures of the locomotion tests. Both of the extracted test-retest reliability measures of the balance test were acceptable. Conclusions Weight handling and strength tests were found to have consistently acceptable reliability. Further research is needed to explore the reliability of the other tests as inconsistent findings or a lack of data prevented definitive conclusions. PMID:24674029
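The acceptability rule stated in the Methods above can be expressed as a small classifier. The thresholds are those given in the abstract; the function names and the sample measures are hypothetical.

```python
# Thresholds from the review: values at or above these count as acceptable.
THRESHOLDS = {"icc": 0.75, "percent_agreement": 80.0, "kappa": 0.60}


def is_acceptable(kind, value):
    """True if a reliability measure meets the review's threshold for its type."""
    return value >= THRESHOLDS[kind]


def share_acceptable(measures):
    """Percentage of (kind, value) pairs that are acceptable."""
    flags = [is_acceptable(kind, value) for kind, value in measures]
    return 100.0 * sum(flags) / len(flags)


# Hypothetical item-specific measures, not values extracted by the review.
sample = [("icc", 0.80), ("kappa", 0.50), ("percent_agreement", 85.0), ("icc", 0.70)]
print(f"{share_acceptable(sample):.0f}% acceptable")
```

Percentages such as the 89% and 96% reported in the Results are shares of this kind, computed within each performance category.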
NASA Astrophysics Data System (ADS)
Wan, Fubin; Tan, Yuanyuan; Jiang, Zhenhua; Chen, Xun; Wu, Yinong; Zhao, Peng
2017-12-01
Lifetime and reliability are the two performance parameters of premium importance for modern space Stirling-type pulse tube refrigerators (SPTRs), which are required to operate in excess of 10 years. Demonstration of these parameters provides a significant challenge. This paper proposes a lifetime prediction and reliability estimation method that utilizes accelerated degradation testing (ADT) for SPTRs related to gaseous contamination failure. The method was experimentally validated via three groups of gaseous contamination ADT. First, the performance degradation model based on mechanism of contamination failure and material outgassing characteristics of SPTRs was established. Next, a preliminary test was performed to determine whether the mechanism of contamination failure of the SPTRs during ADT is consistent with normal life testing. Subsequently, the experimental program of ADT was designed for SPTRs. Then, three groups of gaseous contamination ADT were performed at elevated ambient temperatures of 40 °C, 50 °C, and 60 °C, respectively and the estimated lifetimes of the SPTRs under normal condition were obtained through acceleration model (Arrhenius model). The results show good fitting of the degradation model with the experimental data. Finally, we obtained the reliability estimation of SPTRs through using the Weibull distribution. The proposed novel methodology enables us to take less than one year time to estimate the reliability of the SPTRs designed for more than 10 years.
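The Arrhenius extrapolation from elevated test temperatures to normal conditions can be illustrated with a short sketch. It assumes the standard Arrhenius acceleration factor, AF = exp[(Ea/k)(1/T_use - 1/T_stress)] with temperatures in kelvin; the activation energy and the use temperature below are hypothetical placeholders, since the abstract reports neither.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def arrhenius_acceleration_factor(activation_energy_ev, use_temp_c, stress_temp_c):
    """Acceleration factor of a stress temperature relative to the use temperature."""
    t_use = use_temp_c + 273.15        # convert Celsius to kelvin
    t_stress = stress_temp_c + 273.15
    exponent = activation_energy_ev / BOLTZMANN_EV_PER_K * (1.0 / t_use - 1.0 / t_stress)
    return math.exp(exponent)


def lifetime_at_use(lifetime_at_stress, acceleration_factor):
    """Extrapolate a lifetime observed under stress to the use condition."""
    return lifetime_at_stress * acceleration_factor


EA_EV = 0.5        # hypothetical activation energy, not from the paper
USE_TEMP_C = 20.0  # hypothetical normal ambient temperature, not from the paper

# The three stress temperatures named in the abstract
factors = {t: arrhenius_acceleration_factor(EA_EV, USE_TEMP_C, t) for t in (40.0, 50.0, 60.0)}
for temp, factor in factors.items():
    print(f"{temp:.0f} C: acceleration factor {factor:.2f}")
```

Higher stress temperatures give larger acceleration factors, which is why a test lasting under a year can speak to a design life of more than 10 years.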
Knox's Cube Imitation Test: A Historical Review and an Experimental Analysis
ERIC Educational Resources Information Center
Richardson, John T. E.
2005-01-01
The cube imitation test was developed by Knox (1913) as a nonverbal test of intelligence. Many variants show satisfactory reliability, but performance is correlated both with Verbal IQ and with Performance IQ. Performance is impaired by cerebral lesions but unrelated to the side of lesion. Examinees describe both verbal and visuospatial…
Trippolini, Maurizio Alen; Janssen, Svenja; Hilfiker, Roger; Oesch, Peter
2018-06-01
Purpose To analyze the reliability and validity of a picture-based questionnaire, the Modified Spinal Function Sort (M-SFS). Methods Sixty-two injured workers with chronic musculoskeletal disorders (MSD) were recruited from two work rehabilitation centers. Internal consistency was assessed by Cronbach's alpha. Construct validity was tested based on four a priori hypotheses. Structural validity was measured with principal component analysis (PCA). Test-retest reliability and agreement were evaluated using the intraclass correlation coefficient (ICC), and measurement error with the limits of agreement (LoA). Results Total score of the M-SFS was 54.4 (SD 16.4) and 56.1 (16.4) for test and retest, respectively. Item distribution showed no ceiling effects. Cronbach's alpha was 0.94 and 0.95 for test and retest, respectively. PCA showed the presence of four components explaining a total of 74% of the variance. Item communalities were >0.6 in 17 out of 20 items. ICC was 0.90, LoA was ±12.6/16.2 points. The correlations of the M-SFS were 0.89 with the original SFS, 0.49 with the Pain Disability Index, -0.37 and -0.33 with the Numeric Rating Scale for actual pain, -0.52 for self-reported disability due to chronic low back pain, and 0.50, 0.56-0.59 with three distinct lifting tests. No a priori defined hypothesis for construct validity was rejected. Conclusions The M-SFS allows reliable and valid assessment of perceived self-efficacy for work-related tasks and can be recommended for use in patients with chronic MSD. Further research should investigate the proposed M-SFS score of <56 for its predictive validity for non-return to work.
Electrical impedance myography in facioscapulohumeral muscular dystrophy.
Statland, Jeffrey M; Heatwole, Chad; Eichinger, Katy; Dilek, Nuran; Martens, William B; Tawil, Rabi
2016-10-01
In this study we determined the reliability and validity of electrical impedance myography (EIM) in facioscapulohumeral muscular dystrophy (FSHD). We performed a prospective study of EIM on 16 bilateral limb and trunk muscles in 35 genetically defined and clinically affected FSHD patients (reliability testing on 18 patients). Summary scores based on body region were derived. Reactance and phase (50 and 100 kHz) were compared with measures of strength, FSHD disease severity, and functional outcomes. Participants were mostly men, mean age 53.0 years, and included a full range of severity. Limb and trunk muscles showed good to excellent reliability [intraclass correlation coefficients (ICC) 0.72-0.99]. Summary scores for the arm, leg, and trunk showed excellent reliability (ICC 0.89-0.98). Reactance was the most sensitive EIM parameter to a broad range of FSHD disease metrics. EIM is a reliable measure of muscle composition in FSHD that offers the possibility to serially evaluate affected muscles. Muscle Nerve 54: 696-701, 2016. © 2016 Wiley Periodicals, Inc.
System reliability, performance and trust in adaptable automation.
Chavaillaz, Alain; Wastell, David; Sauer, Jürgen
2016-01-01
The present study examined the effects of reduced system reliability on operator performance and automation management in an adaptable automation environment. 39 operators were randomly assigned to one of three experimental groups: low (60%), medium (80%), and high (100%) reliability of automation support. The support system provided five incremental levels of automation which operators could freely select according to their needs. After 3 h of training on a simulated process control task (AutoCAMS) in which the automation worked infallibly, operator performance and automation management were measured during a 2.5-h testing session. Trust and workload were also assessed through questionnaires. Results showed that although reduced system reliability resulted in lower levels of trust towards automation, there were no corresponding differences in the operators' reliance on automation. While operators showed overall a noteworthy ability to cope with automation failure, there were, however, decrements in diagnostic speed and prospective memory with lower reliability. Copyright © 2015. Published by Elsevier Ltd.
Validity and Reliability of a New Device (WIMU®) for Measuring Hamstring Muscle Extensibility.
Muyor, José M
2017-09-01
The aims of the current study were 1) to evaluate the validity of the WIMU® system for measuring hamstring muscle extensibility in the passive straight leg raise (PSLR) test using an inclinometer for the criterion and 2) to determine the test-retest reliability of the WIMU® system to measure hamstring muscle extensibility during the PSLR test. 55 subjects were evaluated on 2 separate occasions. Data from a Unilever inclinometer and WIMU® system were collected simultaneously. Intraclass correlation coefficients (ICCs) for the validity were very high (0.983-1); a very low systematic bias (-0.21°--0.42°), random error (0.05°-0.04°) and standard error of the estimate (0.43°-0.34°) were observed (left-right leg, respectively) between the 2 devices (inclinometer and the WIMU® system). The R² between the devices was 0.999 (p<0.001) in both the left and right legs. The test-retest reliability of the WIMU® system was excellent, with ICCs ranging from 0.972-0.995, low coefficients of variation (0.01%), and a low standard error of the estimate (0.19-0.31°). The WIMU® system showed strong concurrent validity and excellent test-retest reliability for the evaluation of hamstring muscle extensibility in the PSLR test. © Georg Thieme Verlag KG Stuttgart · New York.
Gustafsson, Margareta; Blomberg, Karin; Holmefur, Marie
2015-07-01
The Clinical Learning Environment, Supervision and Nurse Teacher (CLES + T) scale evaluates the student nurses' perception of the learning environment and supervision within the clinical placement. It has never been tested in a replication study. The aim of the present study was to evaluate the test-retest reliability of the CLES + T scale. The CLES + T scale was administered twice to a group of 42 student nurses, with a one-week interval. Test-retest reliability was determined by calculations of Intraclass Correlation Coefficients (ICCs) and weighted Kappa coefficients. Standard Error of Measurements (SEM) and Smallest Detectable Difference (SDD) determined the precision of individual scores. Bland-Altman plots were created for analyses of systematic differences between the test occasions. The results of the study showed that the stability over time was good to excellent (ICC 0.88-0.96) in the sub-dimensions "Supervisory relationship", "Pedagogical atmosphere on the ward" and "Role of the nurse teacher". Measurements of "Premises of nursing on the ward" and "Leadership style of the manager" had lower but still acceptable stability (ICC 0.70-0.75). No systematic differences occurred between the test occasions. This study supports the usefulness of the CLES + T scale as a reliable measure of the student nurses' perception of the learning environment within the clinical placement at a hospital. Copyright © 2015 Elsevier Ltd. All rights reserved.
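The precision measures named in this abstract, SEM and SDD, are commonly derived from the ICC. The sketch below assumes the widely used formulas SEM = SD × √(1 − ICC) and SDD = 1.96 × √2 × SEM; the abstract does not say which formulas the authors applied, and the SD value in the example is hypothetical.

```python
import math


def standard_error_of_measurement(sd, icc):
    """SEM from the between-subject SD and a reliability coefficient."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie between 0 and 1")
    return sd * math.sqrt(1.0 - icc)


def smallest_detectable_difference(sem):
    """95% smallest detectable difference for an individual score."""
    return 1.96 * math.sqrt(2.0) * sem


# Hypothetical sub-dimension: SD of 2.0 scale points, ICC of 0.92
sem = standard_error_of_measurement(2.0, 0.92)
sdd = smallest_detectable_difference(sem)
print(f"SEM={sem:.3f}, SDD={sdd:.3f}")
```

A change in an individual's score smaller than the SDD cannot be distinguished from measurement noise at the 95% level.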
Barbosa, Taís de Souza; Gavião, Maria Beatriz Duarte
2015-01-01
To test the validity and reliability of the Brazilian Portuguese version of the Parental-Caregiver Perceptions Questionnaire (P-CPQ) (Aim 1) and to assess the agreement between parents and children concerning the child's oral health-related quality of life (OHRQoL) (Aim 2). The P-CPQ and the Brazilian Portuguese versions of the Child Perceptions Questionnaires (CPQ8-10 and CPQ11-14) were used. Aim 1 was addressed in a study that involved 210 (validity and internal reliability) and 20 (test-retest reliability) parents, and Aim 2 in a study that involved 210 pairs of parents and children. Construct validity was calculated using Spearman's correlation and the Mann-Whitney/Kruskal-Wallis tests. Reliability was determined using Cronbach's alpha and the intraclass correlation coefficient (ICC). Agreement between overall and subscale scores derived from the P-CPQ and CPQ was assessed in comparison and correlation analyses. The P-CPQ discriminated among the categories of malocclusion and dmft. The P-CPQ showed good construct validity, good internal consistency reliability, and excellent test-retest reliability. There was systematic under- and overreporting in parents' assessments for younger and older children, respectively. However, the magnitude of the directional differences was small. At the individual level, agreement between parents and children was excellent. However, it ranged from excellent to moderate or substantial in subscales for CPQ8-10 and CPQ11-14 groups, respectively. The Portuguese version of P-CPQ is valid and reliable. Some parents have limited knowledge about child OHRQoL. Given that parental and child reports measure different realities concerning the child's OHRQoL, information provided by parents can complement the child's evaluation. © 2015 American Association of Public Health Dentistry.
Lamarão, Andressa M.; Costa, Lucíola C. M.; Comper, Maria L. C.; Padula, Rosimeire S.
2014-01-01
Background: Observational instruments, such as the Rapid Entire Body Assessment, quickly assess biomechanical risks present in the workplace. However, in order to use these instruments, it is necessary to conduct the translational/cross-cultural adaptation of the instrument and test its measurement properties. Objectives: To perform the translation and the cross-cultural adaptation to Brazilian-Portuguese and test the reliability of the REBA instrument. Method: The procedures of translation and cross-cultural adaptation to Brazilian-Portuguese were conducted following proposed guidelines that involved translation, synthesis of translations, back translation, committee review and testing of the pre-final version. In addition, reliability and the intra- and inter-rater percent agreement were obtained with the Linear Weighted Kappa Coefficient that was associated with the 95% Confidence Interval and the cross tabulation 2×2. Results: The procedures for translation and adaptation were adequate and the necessary adjustments were conducted on the instrument. The intra- and inter-rater reliability showed values of 0.104 to 0.504, respectively, ranging from very poor to moderate. The percentage agreement values ranged from 5.66% to 69.81%. The percentage agreement was closer to 100% at the item 'upper arm' (69.81%) for the Intra-rater 1 and at the items 'legs' and 'upper arm' for the Intra-rater 2 (62.26%). Conclusions: The processes of translation and cross-cultural adaptation were conducted on the REBA instrument and the Brazilian version of the instrument was obtained. However, despite the reliability of the tests used to correct the translated and adapted version, the reliability values are unacceptable according to the guidelines standard, indicating that the reliability must be re-evaluated. Therefore, caution in the interpretation of the biomechanical risks measured by this instrument should be taken. PMID:25003273
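The linear weighted kappa used here for intra- and inter-rater agreement can be computed from two raters' ordinal scores. The following is a minimal sketch of the standard linear-weights formulation (disagreement weight |i − j| / (k − 1)); the function name and the example ratings are hypothetical, and the confidence interval reported in the study is not reproduced.

```python
def linear_weighted_kappa(ratings_a, ratings_b, categories):
    """Linear weighted kappa for two raters scoring the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must score the same non-empty item set")
    k = len(categories)
    if k < 2:
        raise ValueError("need at least two ordered categories")
    index = {category: i for i, category in enumerate(categories)}
    n = len(ratings_a)

    # Observed joint proportions
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        observed[index[a]][index[b]] += 1.0 / n

    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    # Weighted observed and chance-expected disagreement
    obs_dis = sum(abs(i - j) / (k - 1) * observed[i][j] for i in range(k) for j in range(k))
    exp_dis = sum(abs(i - j) / (k - 1) * row[i] * col[j] for i in range(k) for j in range(k))
    if exp_dis == 0.0:
        raise ValueError("kappa is undefined when all ratings fall in one category")
    return 1.0 - obs_dis / exp_dis


# Hypothetical risk-level scores (1-4) from two raters on four postures
rater_1 = [1, 2, 3, 4]
rater_2 = [1, 2, 3, 3]
kappa = linear_weighted_kappa(rater_1, rater_2, [1, 2, 3, 4])
print(round(kappa, 3))
```

Because near-misses on an ordinal scale are penalised less than distant disagreements, weighted kappa is usually preferred over unweighted kappa for graded risk scores.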
Olsen, Cecilie Fromholt; Bergland, Astrid
2017-06-09
The purpose of the study was to establish the test-retest reliability of the Norwegian version of the Short Physical Performance Battery (SPPB). This was a cross-sectional reliability study. A convenience sample of 61 older adults with a mean age of 88.4 (8.1) was tested by two different physiotherapists at two time points. The mean time interval between tests was 2.5 days. The Intraclass Correlation Coefficient model 3.1 (ICC, 3.1) with 95% confidence intervals as well as the weighted Kappa (K) were used as measures of relative reliability. The Standard Error of Measurement (SEM) and Minimal Detectable Change (MDC) were used to measure absolute reliability. The results were also analyzed for a subgroup of 24 older people with dementia. The ICC reflected high relative reliability for the SPPB summary score and the 4 m walk test (4mwt), both for the total sample (ICC = 0.92 and 0.91, respectively) and for the subgroup with dementia (ICC = 0.84 and 0.90, respectively). Furthermore, weighted Ks for the SPPB subscales were 0.64 for the chair stand, 0.80 for gait and 0.52 for balance for the total sample and almost identical for the subgroup with dementia. MDC-values at the 95% confidence intervals (MDC95) were calculated at 0.8 for the total score of SPPB and 0.39 m/s for the 4mwt in the total sample. For the subgroup with dementia MDC95 was 1.88 for the total score of SPPB and 0.28 m/s for 4mwt. The SPPB total score and the timed walking test showed overall high relative and absolute reliability for the total sample, indicating that the Norwegian version of the SPPB is reliable when used by trained physiotherapists with older people. The reliability of the Norwegian SPPB in older people with dementia seems high, but due to a small sample size this needs further investigation.
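The ICC model 3.1 named in this abstract can be computed from a subjects × sessions table with a two-way ANOVA decomposition. This is a minimal sketch of the textbook consistency form, ICC(3,1) = (MS_rows − MS_error) / (MS_rows + (k − 1) MS_error); the function name and example scores are hypothetical, and confidence intervals are left out.

```python
def icc_3_1(scores):
    """ICC(3,1), consistency form, for a table of n subjects by k sessions."""
    n = len(scores)
    k = len(scores[0]) if n else 0
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need a rectangular table with >= 2 subjects and >= 2 sessions")

    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    denominator = ms_rows + (k - 1) * ms_error
    if denominator == 0.0:
        raise ValueError("ICC is undefined when all scores are identical")
    return (ms_rows - ms_error) / denominator


# Hypothetical test and retest summary scores for three participants
test_retest = [[1.0, 2.0], [2.0, 3.0], [3.0, 4.0]]
print(icc_3_1(test_retest))
```

In this form a constant shift between sessions does not lower the coefficient, because session effects are removed before the error term is formed.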
INOUE, Akiomi; KAWAKAMI, Norito; SHIMOMITSU, Teruichi; TSUTSUMI, Akizumi; HARATANI, Takashi; YOSHIKAWA, Toru; SHIMAZU, Akihito; ODAGIRI, Yuko
2014-01-01
This study aimed to investigate the reliability and construct validity of a new version of the Brief Job Stress Questionnaire (New BJSQ), which measures an extended set of psychosocial factors at work by adding new scales/items to the current version of the BJSQ. Additional scales/items were extensively collected from theoretical job stress models and similar questionnaires in several countries. Scales/items were field-tested and refined through a pilot internet survey. Finally, an 84-item questionnaire (141 items in total when combined with the current BJSQ) was developed. A nationally representative survey was administered to employees in Japan (n=1,633) to examine the reliability and construct validity. Most scales showed acceptable levels of internal consistency and test-retest reliability. Principal component analyses showed that the first factor explained 50% or greater proportion of the variance in most scales. A scale factor analysis and a correlation analysis showed that these scales fit the theoretical expectations. These findings provided a piece of evidence that the New BJSQ scales are reliable and valid. Although more detailed content and construct validity should be examined in future study, the New BJSQ is a useful instrument to evaluate psychosocial work environment and positive mental health outcomes in the current workplace. PMID:24492763
Translation and adaptation of the fatigue severity scale for use in Portugal.
Laranjeira, Carlos António
2012-08-01
The Fatigue Severity Scale (FSS) is a widely used instrument to measure the impact of fatigue on specific types of functioning. This study aims to translate and test the reliability and validity of the Portuguese version of the FSS. The questionnaire was administered to a worker sample of 424 nurses. Reliability analysis showed satisfactory results (Cronbach's alpha coefficient = .87). The test-retest reliability was .85. The principal component analysis showed that the FSS was a measure with a one-factor structure. The construct validity of the total FSS score was assessed by correlation with Maslach Burnout Inventory (MBI) score, Depression Anxiety Stress Scale (DASS) score, and Visual Analogue Scale (VAS) score. Each of the corresponding correlation coefficients among the total FSS score and MBI score, DASS score, and perceived fatigue score (VAS) were .55 (p < .01), .62 (p < .01), and .68 (p < .01), respectively, which shows sufficient construct validity. To measure the discriminant validity of FSS, we examined the differences in scores between groups in terms of the number of hours of sleep and overtime. The less nurses slept and the longer they worked, the higher their total FSS score became. This preliminary validation study of the Portuguese version of FSS proved that it is an acceptable, reliable, and valid measure of fatigue in the working population. Copyright © 2012 Elsevier Inc. All rights reserved.
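Cronbach's alpha, the internal-consistency statistic reported here, follows from the item variances and the variance of the total score: α = k/(k − 1) × (1 − Σ var(item) / var(total)). The sketch below is a generic illustration with hypothetical responses, not the study's data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    n = len(responses)
    k = len(responses[0]) if n else 0
    if n < 2 or k < 2 or any(len(row) != k for row in responses):
        raise ValueError("need >= 2 respondents and >= 2 items in a rectangular table")

    item_variances = [variance([row[j] for row in responses]) for j in range(k)]
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("alpha is undefined when total scores do not vary")
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical answers of four respondents to three 7-point items
answers = [[2, 3, 2], [4, 4, 5], [6, 5, 6], [7, 7, 6]]
alpha = cronbach_alpha(answers)
print(round(alpha, 3))
```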
Bussey, Melanie D; Aldabe, Daniela; Adhia, Divya; Mani, Ramakrishnan
2018-04-01
Normalizing to a reference signal is essential when analysing and comparing electromyography signals across or within individuals. However, studies have shown that MVC testing may not be as reliable in persons with acute and chronic pain. The purpose of this study was to compare the test-retest reliability of the muscle activity in the biceps femoris and gluteus maximus between a novel sub-MVC and standard MVC protocols. This study utilized a single individual repeated measures design with 12 participants performing multiple trials of both the sub-MVC and MVC tasks on two separate days. The participant position in the prone leg raise task was standardised with an ultrasonic sensor to improve task precision between trials/days. Day-to-day and trial-to-trial reliability of the maximal muscle activity was examined using ICC and SEM. Day-to-day and trial-to-trial reliability of the EMG activity in the BF and GM were high (0.70-0.89) to very high (≥0.90) for both test procedures. %SEM was <5-10% for both tests on a given day but higher in the day-to-day comparisons. The lower amplitude of the sub-MVC is a likely contributor to increased %SEM (8-13%) in the day-to-day comparison. The findings show that the sub-MVC modified prone double leg raise results in GM and BF EMG measures similar in reliability and precision to the standard MVC tasks. Therefore, the modified prone double leg raise may be a useful substitute for traditional MVC testing for normalizing EMG signals of the BF and GM. Copyright © 2017 Elsevier Ltd. All rights reserved.
Tan, Christine L; Hassali, Mohamed A; Saleem, Fahad; Shafie, Asrul A; Aljadhey, Hisham; Gan, Vincent B
2015-01-01
(i) To develop the Pharmacy Value-Added Services Questionnaire (PVASQ) using emerging themes generated from interviews. (ii) To establish reliability and validity of the questionnaire instrument. Using an extended Theory of Planned Behavior as the theoretical model, face-to-face interviews generated salient beliefs of pharmacy value-added services. The PVASQ was constructed initially in English incorporating important themes and later translated into the Malay language with forward and backward translation. Intention (INT) to adopt pharmacy value-added services is predicted by attitudes (ATT), subjective norms (SN), perceived behavioral control (PBC), knowledge and expectations. Using a 7-point Likert-type scale and a dichotomous scale, test-retest reliability (N=25) was assessed by administering the questionnaire instrument twice at an interval of one week. Internal consistency was measured by Cronbach's alpha and construct validity between two administrations was assessed using the kappa statistic and the intraclass correlation coefficient (ICC). Confirmatory Factor Analysis, CFA (N=410) was conducted to assess construct validity of the PVASQ. The kappa coefficients indicate a moderate to almost perfect strength of agreement between test and retest. The ICC for all scales tested for intra-rater (test-retest) reliability was good. The overall Cronbach's alpha (N=25) was 0.912 and 0.908 for the two time points. The result of CFA (N=410) showed most items loaded strongly and correctly into corresponding factors. Only one item was eliminated. This study is the first to develop and establish the reliability and validity of the Pharmacy Value-Added Services Questionnaire instrument using the Theory of Planned Behavior as the theoretical model. The translated Malay language version of PVASQ is reliable and valid to predict Malaysian patients' intention to adopt pharmacy value-added services to collect partial medicine supply.
Kloos, Anne D.; Fritz, Nora E.; Kostyk, Sandra K.; Young, Gregory S.; Kegelmeyer, Deb A.
2014-01-01
Background and purpose Individuals with Huntington's disease (HD) experience balance and gait problems that lead to falls. Clinicians currently have very little information about the reliability and validity of outcome measures to determine the efficacy of interventions that aim to reduce balance and gait impairments in HD. This study examined the reliability and concurrent validity of spatiotemporal gait measures, the Tinetti Mobility Test (TMT), Four Square Step Test (FSST), and Activities-specific Balance Confidence (ABC) Scale in individuals with HD. Methods Participants with HD [n = 20; mean age ± SD = 50.9 ± 13.7; 7 male] were tested on spatiotemporal gait measures, the TMT, FSST, and ABC Scale before and after a six week period to determine test–retest reliability and minimal detectable change (MDC) values. Linear relationships between gait and clinical measures were estimated using Pearson's correlation coefficients. Results Spatiotemporal gait measures, the TMT total and the FSST showed good to excellent test–retest reliability (ICC > 0.75). MDC values were 0.30 m/s and 0.17 m/s for velocity in forward and backward walking respectively, four points for the TMT, and 3 s for the FSST. The TMT and FSST were highly correlated with most spatiotemporal measures. The ABC Scale demonstrated lower reliability and less concurrent validity than other measures. Conclusions The high test–retest reliability over a six week period and concurrent validity between the TMT, FSST, and spatiotemporal gait measures suggest that the TMT and FSST may be useful outcome measures for future intervention studies in ambulatory individuals with HD. PMID:25128156
NASA Astrophysics Data System (ADS)
Iskandar, Ismed; Satria Gondokaryono, Yudi
2016-02-01
In reliability theory, the most important problem is to determine the reliability of a complex system from the reliability of its components. The weakness of most reliability theories is that the systems are described and explained as simply functioning or failed. In many real situations, failures may arise from many causes depending upon the age and the environment of the system and its components. Another problem in reliability theory is that of estimating the parameters of the assumed failure models. The estimation may be based on data collected over censored or uncensored life tests. In many reliability problems, the failure data are simply quantitatively inadequate, especially in engineering design and maintenance systems. Bayesian analyses are more beneficial than classical ones in such cases. Bayesian estimation analyses allow us to combine past knowledge or experience, in the form of an a priori distribution, with life test data to make inferences about the parameter of interest. In this paper, we have investigated the application of Bayesian estimation analyses to competing risk systems. The cases are limited to models with independent causes of failure, using the Weibull distribution as our model. A simulation is conducted for this distribution with the objectives of verifying the models and the estimators and investigating the performance of the estimators for varying sample size. The simulation data are analyzed by using Bayesian and maximum likelihood analyses. The simulation results show that a change in the true value of one parameter relative to another changes the value of the standard deviation in the opposite direction. With perfect information on the prior distribution, the Bayesian estimation methods are better than those of maximum likelihood. The sensitivity analyses show some amount of sensitivity to shifts of the prior locations.
They also show the robustness of the Bayesian analysis within the range between the true value and the maximum likelihood estimated value lines.
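For a competing risk system with independent causes of failure, each following a Weibull distribution, the system survives only if no cause has yet produced a failure, so the system reliability is the product of the cause-specific survival functions. The sketch below shows only that relationship, with hypothetical shape and scale values; it does not reproduce the paper's Bayesian or maximum likelihood estimators.

```python
import math


def weibull_survival(t, shape, scale):
    """Weibull survival function R(t) = exp(-(t / scale) ** shape)."""
    if t < 0 or shape <= 0 or scale <= 0:
        raise ValueError("t must be >= 0 and shape, scale must be > 0")
    return math.exp(-((t / scale) ** shape))


def system_reliability(t, causes):
    """Reliability under independent competing Weibull causes of failure."""
    reliability = 1.0
    for shape, scale in causes:
        reliability *= weibull_survival(t, shape, scale)
    return reliability


# Hypothetical (shape, scale) pairs for two independent failure causes
causes = [(1.5, 2000.0), (0.8, 5000.0)]
for hours in (0.0, 500.0, 1000.0):
    print(hours, round(system_reliability(hours, causes), 4))
```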
Ely, E Wesley; Truman, Brenda; Shintani, Ayumi; Thomason, Jason W W; Wheeler, Arthur P; Gordon, Sharon; Francis, Joseph; Speroff, Theodore; Gautam, Shiva; Margolin, Richard; Sessler, Curtis N; Dittus, Robert S; Bernard, Gordon R
2003-06-11
Goal-directed delivery of sedative and analgesic medications is recommended as standard care in intensive care units (ICUs) because of the impact these medications have on ventilator weaning and ICU length of stay, but few of the available sedation scales have been appropriately tested for reliability and validity. To test the reliability and validity of the Richmond Agitation-Sedation Scale (RASS). Prospective cohort study. Adult medical and coronary ICUs of a university-based medical center. Thirty-eight medical ICU patients enrolled for reliability testing (46% receiving mechanical ventilation) from July 21, 1999, to September 7, 1999, and an independent cohort of 275 patients receiving mechanical ventilation were enrolled for validity testing from February 1, 2000, to May 3, 2001. Interrater reliability of the RASS, Glasgow Coma Scale (GCS), and Ramsay Scale (RS); validity of the RASS correlated with reference standard ratings, assessments of content of consciousness, GCS scores, doses of sedatives and analgesics, and bispectral electroencephalography. In 290-paired observations by nurses, results of both the RASS and RS demonstrated excellent interrater reliability (weighted kappa, 0.91 and 0.94, respectively), which were both superior to the GCS (weighted kappa, 0.64; P<.001 for both comparisons). Criterion validity was tested in 411-paired observations in the first 96 patients of the validation cohort, in whom the RASS showed significant differences between levels of consciousness (P<.001 for all) and correctly identified fluctuations within patients over time (P<.001). 
In addition, 5 methods were used to test the construct validity of the RASS, including correlation with an attention screening examination (r = 0.78, P<.001), GCS scores (r = 0.91, P<.001), quantity of different psychoactive medication dosages 8 hours prior to assessment (eg, lorazepam: r = -0.31, P<.001), successful extubation (P = .07), and bispectral electroencephalography (r = 0.63, P<.001). Face validity was demonstrated via a survey of 26 critical care nurses, in which 92% agreed or strongly agreed with the RASS scoring scheme, and 81% agreed or strongly agreed that the instrument provided a consensus for goal-directed delivery of medications. The RASS demonstrated excellent interrater reliability and criterion, construct, and face validity. This is the first sedation scale to be validated for its ability to detect changes in sedation status over consecutive days of ICU care, against constructs of level of consciousness and delirium, and correlated with the administered dose of sedative and analgesic medications.
Zuvela, Frane; Bozanic, Ana; Miletic, Durdica
2011-01-01
Inadequately adopted fundamental movement skills (FMS) in early childhood may have a negative impact on motor performance in later life (Gallahue and Ozmun, 2005). The need for efficient FMS testing in Physical Education was recognized. The aim of this paper was to construct and validate a new FMS test for 8 year old children. Ninety-five 8 year old children were used for the testing. A total of 24 new FMS tasks were constructed and only the best representatives of movement areas entered into the final test product - FMS-POLYGON. The ICC showed high values for all 24 tasks (0.83-0.97) and the factorial analysis revealed the best representatives of each movement area that entered the FMS-POLYGON: tossing and catching the volleyball against a wall, running across obstacles, carrying the medicine balls, and straight running. The ICC for the FMS-POLYGON showed a very high result (0.98) and, therefore, confirmed the test’s intra-rater reliability. Concurrent validity was tested with the use of the “Test of Gross Motor Development” (TGMD-2). Correlation analysis between the newly constructed FMS-POLYGON and the TGMD-2 revealed the coefficient of -0.82 which indicates a high correlation. In conclusion, the new test for FMS assessment proved to be a reliable and valid instrument for 8 year old children. Application of this test in schools is justified and could be an important factor in physical education and sport practice. Key points: All 24 newly constructed tasks demonstrated high intra-rater reliability (0.83-0.97) in FMS assessment. High reliability was also noted in the FMS-POLYGON test (0.98). A high correlation was found between the FMS-POLYGON and TGMD-2 which is a confirmation of the new test’s concurrent validity. The research resolved the problem of long and detailed FMS assessment by adding a new dimension using a quick and effective norm-referenced approach while also covering all the most important movement areas. The new, validated test can be of great use primarily in school practice for physical education teachers and FMS experts. PMID:24149309
Ross, David E; Ochs, Alfred L; Seabaugh, Jan M; Demark, Michael F; Shrader, Carole R; Marwitz, Jennifer H; Havranek, Michael D
2012-01-01
NeuroQuant® is a recently developed, FDA-approved software program for measuring brain MRI volume in clinical settings. The aims of this study were as follows: (1) to examine the test-retest reliability of NeuroQuant®; (2) to test the hypothesis that patients with mild traumatic brain injury (TBI) would have abnormally rapid progressive brain atrophy; and (3) to test the hypothesis that progressive brain atrophy in patients with mild TBI would be associated with vocational outcome. Sixteen patients with mild TBI were compared to 20 normal controls. Vocational outcome was assessed with the Glasgow Outcome Scale-Extended (GOSE) and Disability Rating Scale (DRS). NeuroQuant® showed high test-retest reliability. Patients had abnormally rapid progressive atrophy in several brain regions and the rate of atrophy was associated with inability to return to work. NeuroQuant® is a reliable and valid method for assessing the anatomic effects of TBI. Progression of atrophy may continue for years after injury, even in patients with mild TBI.
Test-retest reliability of the Mandarin versions of the Hypertension Self-Care Profile instrument.
Ngoh, Soh Heng Agnes; Lim, Hazel Wai Ling; Koh, Yi Ling Eileen; Tan, Ngiap Chuan
2017-11-01
Self-efficacy in essential hypertension can be measured using scales, such as the "Hypertension Self-Care Profile" (HTN-SCP) questionnaire. It assesses "Behavior", "Motivation", and "Self-efficacy" in 3 domains, respectively. This study aimed to validate the Mandarin version of the HTN-SCP instrument (HTN-SCP-Mn) targeted at patients of Chinese ethnicity with hypertension. Our study recruited Chinese patients, aged 40 years and older, with essential hypertension from a public primary healthcare clinic in Singapore. The 60-item HTN-SCP-Mn questionnaire was completed online using a tablet or smartphone on enrolment. A retest was conducted 2 weeks after the initial test. Reliability was assessed by internal consistency and test-retest reliability using Cronbach alpha and intraclass correlation coefficients (ICC). Differences between the overall HTN-SCP-Mn scores of the patients and their self-reported self-management activities were also determined using the independent t test. Of the 153 patients who completed the HTN-SCP-Mn during the initial test, 79 responded to the test-retest evaluation. The 3 domains "Behavior", "Motivation", and "Self-efficacy" showed high internal consistency (Cronbach alpha = 0.838, 0.929, and 0.927, respectively). The item total correlation ranged from 0.058 to 0.677 for Behavior, 0.374 to 0.798 for Motivation, and 0.326 to 0.767 for Self-efficacy. The ICC indicated fair to good test-retest reliability with scores of 0.643, 0.579, and 0.710 for the respective domains. The results showed face validity of the HTN-SCP-Mn instrument, indicating its potential application in Mandarin-proficient patients. Further study is needed to correlate its scores with objective demonstration of self-efficacy.
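As an illustration of the internal-consistency statistic reported above, the following is a minimal Python sketch of Cronbach's alpha computed from a respondents-by-items score matrix. The function name `cronbach_alpha` and the toy data are assumptions made for illustration; they are not taken from the study.

```python
import numpy as np

def cronbach_alpha(item_scores):
    """Cronbach's alpha for a (respondents x items) score matrix.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score))
    """
    scores = np.asarray(item_scores, dtype=float)
    n_items = scores.shape[1]
    # Sample variance of each item across respondents.
    item_variances = scores.var(axis=0, ddof=1)
    # Sample variance of each respondent's summed score.
    total_variance = scores.sum(axis=1).var(ddof=1)
    return (n_items / (n_items - 1)) * (1.0 - item_variances.sum() / total_variance)

# Hypothetical toy data: 3 respondents, 2 items.
toy_scores = [[1, 2], [2, 1], [3, 3]]
print(round(cronbach_alpha(toy_scores), 3))
```

Alpha approaches 1 when items rise and fall together across respondents and drops when items vary independently of one another.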
Testing the Zimbardo Time Perspective Inventory in the Chinese context.
Wang, Ya; Chen, Xing-Jie; Cui, Ji-Fang; Liu, Lu-Lu
2015-09-01
In this study, the authors evaluated the Chinese version of the Zimbardo Time Perspective Inventory (ZTPI). The ZTPI was tested among a sample of 303 university students. A subsample of 51 participants was then asked to complete the ZTPI again along with another set of questionnaires. The five-factor model of a 20-item short version of the ZTPI showed good model fit, internal consistency, and test-retest reliability. The 20-item Chinese version of the ZTPI also provided good validity, showing correlations with other variables in expected directions. Past-Positive was positively correlated with reappraisal and negatively correlated with suppression emotion regulation strategies, and Present-Hedonistic was positively correlated with reappraisal emotion regulation strategies. These findings indicate that the ZTPI is a reliable and valid instrument for measuring time perspective in the Chinese setting. © 2015 The Institute of Psychology, Chinese Academy of Sciences and Wiley Publishing Asia Pty Ltd.
Metrics of Balance Control for Use in Screening Tests of Vestibular Function
NASA Technical Reports Server (NTRS)
Fiedler, Matthew; Cohen, Helen; Mulavara, Ajitkumar; Peters, Brian; Miller, Chris; Bloomberg, Jacob
2011-01-01
Decrements in balance control have been documented in astronauts after space flight. Reliable measures of balance control are needed for use in postflight field tests at remote landing sites. Diffusion analysis (DA) is a statistical mechanical tool that shows the average difference of the dependent variable on varying time scales. These techniques have been shown to measure differences in open-loop and closed-loop postural control in astronauts and elderly subjects. The goal of this study was to investigate the reliability of these measures of balance control. Eleven subjects were tested using the Clinical Test of Sensory Interaction on Balance: the subject stood with feet together and arms crossed on a stable or compliant surface, with eyes open or closed and with or without head movements in the pitch or yaw plane. Subjects were instrumented with inertial motion sensors attached to their trunk segment. The DA curves for linear acceleration measures were characterized by linear fits measuring open- (Ds) and closed-loop (Dl) control, and their intersection point (X-int, Y-int). Ds and Y-int showed significant differences between the test conditions. Additionally, Ds was correlated with the root mean square (RMS) of the signal, indicating that RMS was dominated by open-loop events (< 0.5 seconds). The Y-int was found to be correlated with the average linear velocity of trunk movements. Thus DA measures could be applied to derive reliable metrics of balance stability during field tests.
Toward lean satellites reliability improvement using HORYU-IV project as case study
NASA Astrophysics Data System (ADS)
Faure, Pauline; Tanaka, Atomu; Cho, Mengu
2017-04-01
Lean satellite programs are programs in which the satellite development philosophy is driven by fast delivery and low cost. Though this concept offers the possibility to develop and fly risky missions without jeopardizing a space program, most of these satellites suffer infant mortality and fail to achieve their mission minimum success. Lean satellites with high infant mortality rate indicate that testing prior to launch is insufficient. In this study, the authors monitored failures occurring during the development of the lean satellite HORYU-IV to identify the evolution of the cumulative number of failures against cumulative testing time. Moreover, the sub-systems driving the failures depending on the different development phases were identified. The results showed that half to 2/3 of the failures are discovered during the early stage of testing. Moreover, when the mean time before failure was calculated, it appeared that for any development phase considered, a new failure appears on average every 20 h of testing. Simulations were also performed and it showed that for an initial testing time of 50 h, reliability after 1 month launch can be improved by nearly 6 times as compared to an initial testing time of 20 h. Through this work, the authors aim at providing a qualitative reference for lean satellites developers to better help them manage resources to develop lean satellites following a fast delivery and low cost philosophy while ensuring sufficient reliability to achieve mission minimum success.
[Evaluation (assessment) of three tests for diagnosis of geohelminths in Colombia].
López, Myriam Consuelo; Moncada, Ligia Inés; Ariza-Araújo, Yoseth; Fernández-Niño, Julián Alfredo; Reyes, Patricia; Nicholls, Rubén Santiago
2013-01-01
Soil-transmitted helminth infections are considered a public health problem in developing countries. The diagnostic tests, both for individual patient diagnosis and for population studies, should be evaluated in terms of validity and reliability. To compare the direct examination, the modified Ritchie-Frick method, a Kato-Katz designed by a Brazilian group and one designed by the WHO, for the diagnosis of soil-transmitted helminths. A diagnostic test reliability study was performed. The same stool sample was analyzed by the same observer using four diagnostic tests. 204 samples were obtained, 194 of which fulfilled the inclusion criteria and were analyzed. The observers knew neither the participants' identities nor the other tests' results. For the analysis the Kato-Katz (WHO) was considered as the gold standard. For the reliability assessment percent agreement, positive percent agreement, Kappa statistic, and intraclass correlation were calculated. The Brazilian Kato-Katz showed a good performance with high sensitivity and specificity for T. trichiura and hookworm with values of 0.97 and 0.96 respectively, and a high specificity with mild sensitivity for A. lumbricoides (0.95 and 0.79), whereas the direct examination and the Ritchie-Frick method showed a performance between mild and poor. The differences were higher for hookworm and Trichuris trichiura than for Ascaris lumbricoides. The Brazilian Kato-Katz test could be implemented, but further studies are needed to correlate its operative capacity with its feasibility, availability and cost.
Item Response Theory analysis of Fagerström Test for Cigarette Dependence.
Svicher, Andrea; Cosci, Fiammetta; Giannini, Marco; Pistelli, Francesco; Fagerström, Karl
2018-02-01
The Fagerström Test for Cigarette Dependence (FTCD) and the Heaviness of Smoking Index (HSI) are the gold standard measures to assess cigarette dependence. However, FTCD reliability and factor structure have been questioned and HSI psychometric properties are in need of further investigation. The present study examined the psychometric properties of the FTCD and the HSI via Item Response Theory. The study was a secondary analysis of data collected in 862 Italian daily smokers. Confirmatory factor analysis was run to evaluate the dimensionality of FTCD. A Graded Response Model was applied to FTCD and HSI to verify the fit to the data. Both item and test functioning were analyzed and item statistics, Test Information Function, and scale reliabilities were calculated. Mokken Scale Analysis was applied to estimate homogeneity and Loevinger's coefficients were calculated. The FTCD showed unidimensionality and homogeneity for most of the items and for the total score. It also showed high sensitivity and good reliability from medium to high levels of cigarette dependence, although problems related to some items (i.e., items 3 and 5) were evident. HSI had good homogeneity, adequate item functioning, and high reliability from medium to high levels of cigarette dependence. Significant Differential Item Functioning was found for items 1, 4, 5 of the FTCD and for both items of HSI. HSI seems highly recommended in clinical settings addressed to heavy smokers while FTCD would be better used in smokers with a level of cigarette dependence ranging between low and high. Copyright © 2017 Elsevier Ltd. All rights reserved.
The validity and reliability of the ADL-Glittre test for children.
Martins, Renata; Assumpção, Maíra S de; Bobbio, Tatiana G; Mayer, Anamaria F; Schivinski, Camila
2018-04-16
The ADL-Glittre was created to assess more comprehensively the essential activities of daily living in adults with chronic obstructive pulmonary disease. The aim of this study was to validate the ADL-Glittre test adapted for children (TGlittre-P) and verify its reliability. This is a cross-sectional study with 87 healthy children aged 6 to 14 years (mean 10.36 ± 2.32 years). Biometric and spirometry data were collected from all participants. On the same day, part of the sample (36 children included in the validation process) performed two 6MWT and two TGlittre-P (30-minute interval between them). The other part of the sample just performed two TGlittre-P for the reliability process. Pearson and Spearman correlation tests were used to verify the correlation between the time spent on the TGlittre-P and the distance walked in the 6MWT. The intraclass correlation coefficient (ICC) was also used to assess the reproducibility of the TGlittre-P. The TGlittre-P showed a moderate negative correlation with the 6MWT (r = -0.490; p = 0.002; 95%CI -0.712 to -0.233). However, the behavior of the physiological variables that were monitored during the tests was similar and showed to be reproducible (ICC = 0.843; p = 0.000; 95%CI 0.695 to 0.911). The TGlittre-P proved to be a valid and reliable assessment of the functional capacity of healthy children aged 6 to 14 years.
de Vries, Merlijn W; Visscher, Corine; Delwel, Suzanne; van der Steen, Jenny T; Pieper, Marjoleine J C; Scherder, Erik J A; Achterberg, Wilco P; Lobbezoo, Frank
2016-01-01
Objectives. The aim of this study was to establish the reliability of the "chewing" subscale of the OPS-NVI, a novel tool designed to estimate presence and severity of orofacial pain in nonverbal patients. Methods. The OPS-NVI consists of 16 items for observed behavior, classified into four categories and a subjective estimate of pain. Two observers used the OPS-NVI for 237 video clips of people with dementia in Dutch nursing homes during their meal to observe their behavior and to estimate the intensity of orofacial pain. Six weeks later, the same observers rated the video clips a second time. Results. Bottom and ceiling effects for some items were found. This resulted in exclusion of these items from the statistical analyses. The categories which included the remaining items (n = 6) showed reliability varying between fair-to-good and excellent (interobserver reliability, ICC: 0.40-0.47; intraobserver reliability, ICC: 0.40-0.92). Conclusions. The "chewing" subscale of the OPS-NVI showed a fair-to-good to excellent interobserver and intraobserver reliability in this dementia population. This study contributes to the validation process of the OPS-NVI as a whole and stresses the need for further assessment of the reliability of the OPS-NVI with subjects that might already show signs of orofacial pain.
Ventura, Joseph; Cienfuegos, Angel; Boxer, Oren; Bilder, Robert
2008-11-01
Cognitive deficits are core features of schizophrenia that have been associated reliably with functional outcomes and now are a focus of treatment research. New rating scales are needed to complement current psychometric testing procedures, both to enable wider clinical use, and to serve as endpoints in clinical trials. Subjects were 35 schizophrenia patient-and-caregiver pairs recruited from the UCLA and West Los Angeles VA Outpatient Psychiatry Departments. Participants were assessed with the Clinical Global Impression of Cognition in Schizophrenia (CGI-CogS), an interview-based rating scale of cognitive functioning, on 3 occasions (baseline, 1 month, and 3 months). A computerized neurocognitive battery (Cogtest), an assessment of functioning, and symptom measures were administered at two occasions (baseline and one month). The CGI-CogS ratings generally showed a high level of internal consistency (Cronbach's alpha=.69 to .96), adequate levels of inter-rater reliability (ICC's=.71 to .80), and high test-retest stability (ICC's=.92 to .95). Correlations of caregiver and rater global (but not "patient only rating") CGI-CogS ratings with neurocognitive performance were in the moderate range (r's=-.27 to -.48), while most of the correlations with functional outcome were moderate to high (r's=-.41 to -.72). In fact, the CGI-CogS ratings were significantly more correlated with Social Functioning than were objective neurocognitive test scores (p=.02) and showed a trend in the same direction for predicting Instrumental Functioning (p=.06). We found moderate correlations between CGI-CogS global ratings and PANSS positive (r's=.36 to .49) and SANS negative symptoms (r=.41 to .61), but not with BPRS depression (r's=.11 to .13). An interview-based measure of cognition demonstrated high internal consistency, good inter-rater reliability, and high test-retest reliability. Caregiver ratings appear to add important clinical information over patient-only ratings. 
The CGI-CogS showed moderate validity with respect to neurocognitive performance and functional outcome, and correlations of CGI-CogS with functional outcomes were stronger than correlations of objective neurocognitive performance with functional outcomes. The CGI-CogS appears to offer a reliable and valid method for clinical rating of cognitive deficits and their impact on everyday functioning in schizophrenia.
Jorgensen, J E; Rathleff, C R; Rathleff, M S; Andreasen, J
2016-12-01
The Oslo Sports Trauma Research Centre Overuse Injury Questionnaire (OSTRC-O) and the Oslo Sports Trauma Research Centre questionnaire on Health Problems (OSTRC-H) make it possible to monitor illness and injury at regular intervals, capturing prevalence and incidence of acute injury, overuse injury, and illnesses. The aim of this study was to translate, culturally adapt, and establish the face validity of the OSTRC-O and the OSTRC-H into a Danish context (DK) through cognitive interviews and the assessment of test-retest reliability. The OSTRC-O.DK was distributed to 57 heterogeneous respondents; response rate was 89%. The OSTRC-H was distributed to 58 heterogeneous respondents; response rate was 86%. No major disagreements were observed between the original and translated versions of the questionnaires. The OSTRC-O had high internal consistency (Cronbach's alpha 0.80-0.93). The primary reliability analysis, which included all participants, showed an ICC of 0.62 (95% CI: 0.42-0.77). The secondary reliability analysis, which only included subjects who did not change injury region from the test to the retest, showed an ICC of 0.86 (95% CI: 0.77-0.92). The questionnaires were found to be valid, reliable, and acceptable for use in a Danish population. © 2015 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Arbab, Dariusch; van Ochten, Johannes H M; Schnurr, Christoph; Bouillon, Bertil; König, Dietmar
2017-12-01
Patient-reported outcome measures are a critical tool in evaluating the efficacy of orthopedic procedures. The intention of this study was to evaluate reliability, validity, responsiveness and minimally important change of the German version of the Hip dysfunction and osteoarthritis outcome score (HOOS). The German HOOS was investigated in 251 consecutive patients before and 6 months after total hip arthroplasty. All patients completed HOOS, Oxford-Hip Score (OHS), Short-Form (SF-36) and numeric scales for pain and disability. Test-retest reliability, internal consistency, floor and ceiling effects, construct validity and minimal important change were analyzed. The German HOOS demonstrated excellent test-retest reliability with intraclass correlation coefficient values > 0.7. Cronbach's alpha values demonstrated strong internal consistency. As hypothesized, HOOS subscales strongly correlated with corresponding OHS and SF-36 domains. All subscales showed excellent (effect size/standardized response means > 0.8) responsiveness between preoperative assessment and postoperative follow-up. The HOOS and all subdomains showed changes higher than the minimal detectable change, which indicates true changes. The German version of the HOOS demonstrated good psychometric properties. It proved to be a valid, reliable and responsive instrument for use in patients with hip osteoarthritis undergoing total hip replacement.
Martínez-Gómez, David; Martínez-de-Haro, Vicente; Pozo, Tamara; Welk, Gregory J; Villagra, Ariel; Calle, Marisa E; Marcos, Ascensión; Veiga, Oscar L
2009-01-01
Questionnaires are feasible instruments to assess physical activity (PA) in large samples. The aim of the current study was to evaluate the reliability and validity of the PAQ-A questionnaire in Spanish adolescents using the measurement of PA by accelerometer as criterion. In a sample of 82 adolescents, aged 12 to 17 years, 1-week PAQ-A test-retest was administered. Reliability was analyzed by the Intraclass Correlation Coefficient (ICC) and the internal consistency by the Cronbach's alpha Coefficient. Two hundred thirty-two adolescents, aged 13-17 years, completed the PAQ-A and wore the ActiGraph GT1M accelerometer during 7-days. The PAQ-A was compared against total PA and moderate to vigorous PA (MVPA) obtained by the accelerometer. Test-retest reliability showed ICC = 0.71 for the final score of PAQ-A. Internal consistency was alpha = 0.65 in the first self-report, alpha = 0.67 in the retest in 82 adolescents sample, and alpha = 0.74 in the 232 adolescents sample. The PAQ-A was moderately correlated with total PA (rho = 0.39) and MVPA (rho= 0.34) assessed by the accelerometer. The PAQ-A obtained significantly moderate correlations in boys but not in girls against the accelerometer. The PAQ-A questionnaire shows an adequate reliability and a reasonable validity for assessing PA in Spanish adolescents.
Park, Myung Sook; Kang, Kyung Ja; Jang, Sun Joo; Lee, Joo Yun; Chang, Sun Ju
2018-03-01
This study aimed to evaluate the components of test-retest reliability including time interval, sample size, and statistical methods used in patient-reported outcome measures in older people and to provide suggestions on the methodology for calculating test-retest reliability for patient-reported outcomes in older people. This was a systematic literature review. MEDLINE, Embase, CINAHL, and PsycINFO were searched from January 1, 2000 to August 10, 2017 by an information specialist. This systematic review was guided by both the Preferred Reporting Items for Systematic Reviews and Meta-Analyses checklist and the guideline for systematic review published by the National Evidence-based Healthcare Collaborating Agency in Korea. The methodological quality was assessed by the Consensus-based Standards for the selection of health Measurement Instruments checklist box B. Ninety-five out of 12,641 studies were selected for the analysis. The median time interval for test-retest reliability was 14 days, and the ratio of sample size for test-retest reliability to the number of items in each measure ranged from 1:1 to 1:4. The most frequently used statistical method for continuous scores was the intraclass correlation coefficient (ICC). Among the 63 studies that used ICCs, 21 studies presented models for ICC calculations and 30 studies reported 95% confidence intervals of the ICCs. Additional analyses using 17 studies that reported a strong ICC (>0.09) showed that the mean time interval was 12.88 days and the mean ratio of the number of items to sample size was 1:5.37. When researchers plan to assess the test-retest reliability of patient-reported outcome measures for older people, they need to consider an adequate time interval of approximately 13 days and a sample size of about 5 times the number of items. 
Particularly, statistical methods should not only be selected based on the types of scores of the patient-reported outcome measures, but should also be described clearly in the studies that report the results of test-retest reliability. Copyright © 2017 Elsevier Ltd. All rights reserved.
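Because the review notes that only some studies stated which ICC model they used, a concrete example may help. The sketch below computes one common variant, the two-way random-effects, absolute-agreement, single-measurement ICC (often written ICC(2,1) in the Shrout and Fleiss notation). The choice of this variant, the function name `icc_2_1`, and the toy scores are assumptions for illustration; the reviewed studies may have used other models.

```python
import numpy as np

def icc_2_1(ratings):
    """ICC(2,1) for a (subjects x sessions) matrix of scores."""
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape  # n subjects, k sessions (or raters)
    grand_mean = x.mean()
    row_means = x.mean(axis=1)
    col_means = x.mean(axis=0)

    # Mean squares from a two-way ANOVA without replication.
    ms_rows = k * np.sum((row_means - grand_mean) ** 2) / (n - 1)
    ms_cols = n * np.sum((col_means - grand_mean) ** 2) / (k - 1)
    residuals = x - row_means[:, None] - col_means[None, :] + grand_mean
    ms_error = np.sum(residuals ** 2) / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )

# Hypothetical test and retest scores for 3 subjects; the retest is
# shifted upward by one point for everyone.
toy_scores = [[1, 2], [2, 3], [3, 4]]
print(round(icc_2_1(toy_scores), 3))
```

In this variant a systematic shift between sessions lowers the coefficient even when subjects keep their rank order, which is one reason the model used should be reported alongside the value.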
The intra-individual reproducibility of flash-evoked potentials in a sample of children.
Schellberg, D; Gasser, T; Köhler, W
1987-07-01
Visual evoked potentials (VEPs) to flash stimuli were recorded twice from 26 children aged 10-13 years, with an intersession interval of about 10 months. Test-retest reliability was poor for recordings taken from scalp locations overlying non-specific cortex and somewhat better for specific cortex. The size of consistency coefficients (i.e. correlations within session) showed that noise and artefacts were not the decisive factors which lower reliability. A comparison with retest correlations of broad band parameters of the EEG at rest for the same sample showed, to our surprise, smaller retest reliability for VEP parameters. Variability of the VEP in children over time seems to be as substantial as its well-known inter-individual variability.
Methodology for the development of normative data for Spanish-speaking pediatric populations.
Rivera, D; Arango-Lasprilla, J C
2017-01-01
To describe the methodology utilized to calculate reliability and the generation of norms for 10 neuropsychological tests for children in Spanish-speaking countries. The study sample consisted of over 4,373 healthy children from nine countries in Latin America (Chile, Cuba, Ecuador, Guatemala, Honduras, Mexico, Paraguay, Peru, and Puerto Rico) and Spain. Inclusion criteria for all countries were to be between 6 and 17 years of age, to have an Intelligence Quotient of ≥80 on the Test of Non-Verbal Intelligence (TONI-2), and to score <19 on the Children's Depression Inventory. Participants completed 10 neuropsychological tests. Reliability and norms were calculated for all tests. Test-retest analysis showed excellent or good reliability on all tests (r's>0.55; p's<0.001) except M-WCST perseverative errors, whose coefficient magnitude was fair. All scores were normed using multiple linear regressions and standard deviations of residual values. Age, age², sex, and mean level of parental education (MLPE) were included as predictors in the models by country. The non-significant variables (p > 0.05) were removed and the analyses were run again. This is the largest Spanish-speaking children and adolescents normative study in the world. For the generation of normative data, the method based on linear regression models and the standard deviation of residual values was used. This method allows determination of the specific variables that predict test scores, helps identify and control for collinearity of predictive variables, and generates continuous and more reliable norms than those of traditional methods.
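The regression-based norming approach described above can be sketched roughly as follows: fit a linear model of the raw score on the predictors, take the standard deviation of the residuals, and express a new child's score as a z-score relative to the model's prediction. This is a minimal illustration under stated assumptions, not the authors' procedure: the function names `fit_norms` and `normed_z`, the use of ordinary least squares via NumPy, and the toy data with age as the only predictor are all hypothetical, and the removal of non-significant predictors is not shown.

```python
import numpy as np

def fit_norms(predictors, scores):
    """Fit score = b0 + b . predictors; return (coefficients, residual SD)."""
    p = np.asarray(predictors, dtype=float)
    if p.ndim == 1:
        p = p[:, None]
    y = np.asarray(scores, dtype=float)
    design = np.column_stack([np.ones(len(y)), p])  # intercept + predictors
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ beta
    dof = design.shape[0] - design.shape[1]
    residual_sd = float(np.sqrt(np.sum(residuals ** 2) / dof))
    return beta, residual_sd

def normed_z(beta, residual_sd, predictor_row, raw_score):
    """z-score of a raw score relative to the regression-predicted score."""
    expected = beta[0] + np.dot(beta[1:], np.asarray(predictor_row, dtype=float))
    return float((raw_score - expected) / residual_sd)

# Hypothetical toy data: age as the only predictor.
ages = [6, 8, 10, 12, 14, 16]
raw = [13, 14, 21, 25, 26, 33]
beta, sd = fit_norms(ages, raw)
print(normed_z(beta, sd, [10], 21.0))
```

In a fuller model the predictor row would hold age, age squared, sex, and MLPE in the same column order used for fitting.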
Reliability of provocative tests of motion sickness susceptibility
NASA Technical Reports Server (NTRS)
Calkins, D. S.; Reschke, M. F.; Kennedy, R. S.; Dunlop, W. P.
1987-01-01
Test-retest reliability values were derived from motion sickness susceptibility scores obtained from two successive exposures to each of three tests: (1) Coriolis sickness sensitivity test; (2) staircase velocity movement test; and (3) parabolic flight static chair test. The reliability of the three tests ranged from 0.70 to 0.88. Normalizing values from predictors with skewed distributions improved the reliability.
Manzi, Luigi; Villafañe, Jorge Hugo; Indino, Cristian; Tamini, Jacopo; Berjano, Pedro; Usuelli, Federico Giuseppe
2017-11-08
The purpose of this study was to investigate the test-retest reliability of the Phi angle in patients undergoing total ankle replacement (TAR) for end stage ankle osteoarthritis (OA) to assess the rotational alignment of the talar component. Retrospective observational cross-sectional study of prospectively collected data. Post-operative anteroposterior radiographs of the foot of 170 patients who underwent TAR for the ankle OA were evaluated. Three physicians measured Phi on the 170 randomly sorted and anonymized radiographs on two occasions, one week apart (test and retest conditions), inter and intra-observer agreement were evaluated. Test-retest reliability of Phi angle measurement was excellent for patients with Hintegra TAR (ICC=0.995; p<0.001) and Zimmer TAR (ICC=0.995; p<0.001) on radiographs of subjects with ankle OA. There were no significant differences in the reliability of the Phi angle measurement between patients with Hintegra vs. Zimmer implants (p>0.05). Measurement of Phi angle on weight-bearing dorsoplantar radiograph showed an excellent reliability among orthopaedic surgeons in determining the position of the talar component in the axial plane. Level II, cross sectional study. Copyright © 2017 European Foot and Ankle Society. Published by Elsevier Ltd. All rights reserved.
Validity and cross-cultural adaptation of the persian version of the oxford elbow score.
Ebrahimzadeh, Mohammad H; Kachooei, Amir Reza; Vahedi, Ehsan; Moradi, Ali; Mashayekhi, Zeinab; Hallaj-Moghaddam, Mohammad; Azami, Mehran; Birjandinejad, Ali
2014-01-01
Oxford Elbow Score (OES) is a patient-reported questionnaire used to assess outcomes after elbow surgery. The aim of this study was to validate and adapt the OES into Persian language. After forward-backward translation of the OES into Persian, a total number of 92 patients after elbow surgeries completed the Persian OES along with the Persian DASH and SF-36. To assess test-retest reliability, 31 randomly selected patients (34%) completed the Persian OES again after three days while abstaining from all forms of therapeutic regimens. Reliability of the Persian OES was assessed by measuring intraclass correlation coefficient (ICC) for test-retest reliability and Cronbach's alpha for internal consistency. Spearman's correlation coefficient was used to test the construct validity. Cronbach's alpha coefficient was 0.92 showing excellent reliability. Cronbach's alpha for function, pain, and social-psychological subscales was 0.95, 0.86, and 0.85, respectively. Intraclass correlation coefficient (ICC) was 0.85 for the overall questionnaire and 0.90, 0.76, and 0.75 for function, pain, and social-psychological subscales, respectively. Construct validity was confirmed as the Spearman correlation between OES and DASH was 0.80. Persian OES is a valid and reliable patient-reported outcome measure to assess postsurgical elbow status in Persian speaking population.
Sánchez-Ayala, Alfonso; Farias-Neto, Arcelino; Vilanova, Larissa Soares Reis; Costa, Marina Abrantes; Paiva, Ana Clara Soares; Carreiro, Adriana da Fonte Porto; Mestriner-Junior, Wilson
2016-08-01
Rehabilitation of masticatory function is inherent to prosthodontics; however, despite the various techniques for evaluating oral comminution, the methodological suitability of these has not been completely studied. The aim of this study was to determine the reproducibility, reliability, and validity of a test food based on fuchsin beads for masticatory function assessment. Masticatory performance was evaluated in 20 dentate subjects (mean age, 23.3 years) using two kinds of test foods and methods: fuchsin beads and ultraviolet-visible spectrophotometry, and silicone cubes and multiple sieving as gold standard. Three examiners conducted five masticatory performance trials with each test food. Reproducibility of the results from both test foods was separately assessed using the intraclass correlation coefficient (ICC). Reliability and validity of fuchsin bead data were measured by comparing the average mean of absolute differences and the measurement means, respectively, regarding silicone cube data using the paired Student's t-test (α = 0.05). Intraexaminer and interexaminer ICC for the fuchsin bead values were 0.65 and 0.76 (p < 0.001), respectively; those for the silicone cube values were 0.93 and 0.91 (p < 0.001), respectively. Reliability revealed intraexaminer (p < 0.001) and interexaminer (p < 0.05) differences between the average means of absolute differences of each test food. Validity also showed differences between the measurement means of each test food (p < 0.001). Intra- and interexaminer reproducibility of the test food based on fuchsin beads for evaluation of masticatory performance were good and excellent, respectively; however, the reliability and validity were low, because fuchsin beads do not measure the grinding capacity of masticatory function as silicone cubes do; instead, this test food describes the crushing potential of teeth. 
Thus, the two kinds of test foods evaluate different properties of masticatory capacity, confirming fuchsin beads as a useful tool for this purpose. © 2015 by the American College of Prosthodontists.
A clinical test of stepping and change of direction to identify multiple falling older adults.
Dite, Wayne; Temple, Viviene A
2002-11-01
To establish the reliability and validity of a new clinical test of dynamic standing balance, the Four Square Step Test (FSST), to evaluate its sensitivity, specificity, and predictive value in identifying subjects who fall, and to compare it with 3 established balance and mobility tests. A 3-group comparison performed by using 3 validated tests and 1 new test. A rehabilitation center and university medical school in Australia. Eighty-one community-dwelling adults over the age of 65 years. Subjects were age- and gender-matched to form 3 groups: multiple fallers, nonmultiple fallers, and healthy comparisons. Not applicable. Time to complete the FSST and Timed Up and Go test and the number of steps to complete the Step Test and Functional Reach Test distance. High reliability was found for interrater (n=30, intraclass correlation coefficient [ICC]=.99) and retest reliability (n=20, ICC=.98). Evidence for validity was found through correlation with other existing balance tests. Validity was supported, with the FSST showing significantly better performance scores (P<.01) for each of the healthier and less impaired groups. The FSST also revealed a sensitivity of 85%, a specificity of 88% to 100%, and a positive predictive value of 86%. As a clinical test, the FSST is reliable, valid, easy to score, quick to administer, requires little space, and needs no special equipment. It is unique in that it involves stepping over low objects (2.5cm) and movement in 4 directions. The FSST had higher combined sensitivity and specificity for identifying differences between groups in the selected sample population of older adults than the 3 tests with which it was compared. Copyright 2002 by the American Congress of Rehabilitation Medicine and the American Academy of Physical Medicine and Rehabilitation
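The sensitivity, specificity, and positive predictive value quoted above follow from a 2x2 table of test results against known faller status. A minimal Python sketch of those standard definitions is given below; the function name and the counts are hypothetical illustrations and are not taken from the study.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, positive predictive value).

    tp/fn: fallers flagged / missed by the test.
    tn/fp: non-fallers correctly cleared / wrongly flagged.
    """
    sensitivity = tp / (tp + fn)   # share of true fallers the test detects
    specificity = tn / (tn + fp)   # share of non-fallers the test clears
    ppv = tp / (tp + fp)           # share of positive results that are true fallers
    return sensitivity, specificity, ppv


# Hypothetical counts for illustration only (not study data).
sens, spec, ppv = diagnostic_metrics(tp=8, fp=1, tn=9, fn=2)
```

The positive predictive value depends on how common fallers are in the sample, so it does not transfer directly to populations with a different prevalence.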
Moore, Amy Lawson; Miller, Terissa M
2018-01-01
The purpose of the current study is to evaluate the validity and reliability of the revised Gibson Test of Cognitive Skills, a computer-based battery of tests measuring short-term memory, long-term memory, processing speed, logic and reasoning, visual processing, as well as auditory processing and word attack skills. This study included 2,737 participants aged 5-85 years. A series of studies was conducted to examine the validity and reliability using the test performance of the entire norming group and several subgroups. The evaluation of the technical properties of the test battery included content validation by subject matter experts, item analysis and coefficient alpha, test-retest reliability, split-half reliability, and analysis of concurrent validity with the Woodcock Johnson III Tests of Cognitive Abilities and Tests of Achievement. Results indicated strong sources of evidence of validity and reliability for the test, including internal consistency reliability coefficients ranging from 0.87 to 0.98, test-retest reliability coefficients ranging from 0.69 to 0.91, split-half reliability coefficients ranging from 0.87 to 0.91, and concurrent validity coefficients ranging from 0.53 to 0.93. The Gibson Test of Cognitive Skills-2 is a reliable and valid tool for assessing cognition in the general population across the lifespan.
A standard for test reliability in group research.
Ellis, Jules L
2013-03-01
Many authors adhere to the rule that test reliabilities should be at least .70 or .80 in group research. This article introduces a new standard according to which reliabilities can be evaluated. This standard is based on the costs or time of the experiment and of administering the test. For example, if test administration costs are 7 % of the total experimental costs, the efficient value of the reliability is .93. If the actual reliability of a test is equal to this efficient reliability, the test size maximizes the statistical power of the experiment, given the costs. As a standard in experimental research, it is proposed that the reliability of the dependent variable be close to the efficient reliability. Adhering to this standard will enhance the statistical power and reduce the costs of experiments.
Bonasia, Davide Edoardo; Marmotti, Antongiulio; Massa, Alessandro Domenico Felice; Ferro, Andrea; Blonna, Davide; Castoldi, Filippo; Rossi, Roberto
2015-09-01
In the last two decades, many surgical techniques have been described for articular cartilage repair. Reliable histological scoring systems are fundamental tools to evaluate new procedures. Several histological scoring systems have been described, and these can be divided into elementary and comprehensive scores, according to the number of sub-items. The aim of this study was to test the inter- and intra-observer reliability of ten main scores used for the histological evaluation of in vivo cartilage repair. The authors tested the starting hypothesis that elementary scores would show superior intra- and inter-observer reliability compared with comprehensive scores. Fifty histological sections obtained from the trochlea of New Zealand Rabbit and stained with Safranin-O fast green were used. The histological sections were analysed by 4 observers: 2 experienced in cartilage histology and 2 inexperienced. Histological evaluations were performed at time 1 and time 2, separated by a 30-day interval. The following scores were used: Mankin, O'Driscoll, Pineda, Wakitani, Fortier, Sellers, ICRS, ICRSII, Oswestry (OsScore) and modified O'Driscoll. Intra- and inter-observer reliability were evaluated for each score. In addition, the pavement-ceiling effect and the Bland-Altman Coefficient of Repeatability were then evaluated for each sub-item of every score. Intra-observer reliability was high for all observers in every score, even though the reliability was significantly lower for non-expert observers compared with expert counterparts. In terms of Coefficient of Repeatability, some scores performed better (O'Driscoll, Modified O'Driscoll and ICRSII) than others (Fortier, Sellers). Inter-observer reliability was high for all observers in every score, but significantly lower for non-expert compared with expert observers. In expert hands, all the scores showed high intra- and inter-observer reliability, independently of the complexity. 
Although every score has advantages and disadvantages, ICRSII, O'Driscoll and Modified O'Driscoll scores should be preferred for the evaluation of in vivo cartilage repair in animal models.
The psychometric properties of an Iranian translation of the Work Ability Index (WAI) questionnaire.
Abdolalizadeh, M; Arastoo, A A; Ghsemzadeh, R; Montazeri, A; Ahmadi, K; Azizi, A
2012-09-01
This study was carried out to evaluate the psychometric properties of an Iranian translation of the Work Ability Index (WAI) questionnaire. In this methodological study, nurses and healthcare workers aged 40 years and older who worked in educational hospitals in Ahvaz (236 workers) in 2010 completed the questionnaire, and 60 of the workers filled out the WAI questionnaire a second time to assess test-retest reliability. The forward-backward method was applied to translate the questionnaire from English into Persian. The psychometric properties of the Iranian translation of the WAI were assessed using the following tests: internal consistency (to test reliability), test-retest analysis, exploratory factor analysis (construct validity), discriminant validity by comparing the mean WAI score in two groups of employees that had different levels of sick leave, and criterion validity by determining the correlation between the Persian version of the short form health survey (SF-36) and the WAI score. Cronbach's alpha coefficient was estimated to be 0.79, and it was concluded that the internal consistency was high enough. The intraclass correlation coefficient was 0.92. Factor analysis indicated three factors in the structure of work ability: self-perceived work ability (24.5% of the variance), mental resources (22.23% of the variance), and presence of disease and health related limitation (18.55% of the variance). Statistical tests showed that this questionnaire was capable of discriminating two groups of employees who had different levels of sick leave. Criterion validity analysis showed that this instrument and all dimensions of the Iranian version of SF-36 were correlated significantly. Item correlation corrected for overlap showed that the items had a good correlation, except for one. 
The findings of the study showed that the Iranian version of the WAI is a reliable and valid measure of work ability and can be used both in research and practical activities.
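Internal consistency figures such as the Cronbach's alpha reported above are computed from item-level responses. The sketch below uses the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores); the function name and the response matrix are hypothetical and do not come from the WAI study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(scores[0])
    items = list(zip(*scores))                          # transpose: one tuple per item
    item_var_sum = sum(variance(item) for item in items)
    total_var = variance([sum(row) for row in scores])  # variance of summed scores
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Hypothetical responses: 4 respondents x 3 items (illustration only).
responses = [
    [3, 4, 3],
    [2, 2, 3],
    [5, 4, 4],
    [1, 2, 2],
]
alpha = cronbach_alpha(responses)
```

Sample variances (n - 1 denominator) are used for both the items and the totals; using population variances throughout gives the same ratio.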
Kim, Dong Hee; Im, Yeo Jin
2013-02-01
To develop and test the validity and reliability of the Korean version of the Family Management Measure (Korean FaMM) to assess applicability for families with children having chronic illnesses. The Korean FaMM was articulated through forward-backward translation methods. Internal consistency reliability, construct and criterion validity were calculated using PASW WIN (19.0) and AMOS (20.0). Survey data were collected from 341 mothers of children suffering from chronic disease enrolled in a university hospital in Seoul, South Korea. The Korean version of FaMM showed reliable internal consistency with Cronbach's alpha for the total scale of .69-.91. Factor loadings of the 53 items on the six sub-scales ranged from 0.28-0.84. The model of six subscales for the Korean FaMM was validated by exploratory and confirmatory factor analysis (χ²<.001, RMR<.05, GFI, AGFI, NFI, NNFI>.08). Criterion validity compared to the Parental Stress Index (PSI) showed significant correlation. The findings of this study demonstrate that the Korean FaMM showed satisfactory construct and criterion validity and reliability. It is useful for measuring the management style of Korean families with children who have a chronic illness.
Design and implementation of online automatic judging system
NASA Astrophysics Data System (ADS)
Liang, Haohui; Chen, Chaojie; Zhong, Xiuyu; Chen, Yuefeng
2017-06-01
To address the low efficiency and poor reliability of manual judging in programming training and competitions, an Online Automatic Judging (OAJ) system was designed. The OAJ system, which comprises a sandbox judging side and a Web side, automatically compiles and runs the submitted code and generates evaluation scores and corresponding reports. To prevent malicious code from damaging the system, the OAJ system runs submissions in a sandbox. The system uses thread pools to run tests in parallel and adopts database optimization mechanisms, such as horizontal table splitting, to improve system performance and resource utilization. The test results show that the system has high performance, high reliability, high stability and excellent extensibility.
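The thread-pool approach to parallel judging described above can be illustrated with a minimal Python sketch. It is not the OAJ implementation: the names `judge` and `judge_all`, the verdict strings, and the 5-second limit are assumptions made for illustration, and the child process here is not sandboxed, which a real judging system would require.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def judge(submission):
    """Run one submission and compare its stdout with the expected output.

    submission: (source_code, stdin_text, expected_stdout).
    NOTE: no sandboxing here; this only illustrates the control flow.
    """
    source, stdin_data, expected = submission
    try:
        result = subprocess.run(
            [sys.executable, "-c", source],
            input=stdin_data,
            capture_output=True,
            text=True,
            timeout=5,  # hypothetical time limit in seconds
        )
    except subprocess.TimeoutExpired:
        return "time limit exceeded"
    if result.returncode != 0:
        return "runtime error"
    if result.stdout.strip() == expected.strip():
        return "accepted"
    return "wrong answer"


def judge_all(submissions, workers=4):
    """Judge submissions concurrently; verdicts come back in submission order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(judge, submissions))


# Two hypothetical submissions for the task "read n, print 2n".
submissions = [
    ("print(int(input()) * 2)", "21\n", "42"),
    ("print(int(input()) + 1)", "21\n", "42"),
]
verdicts = judge_all(submissions)
```

Threads are adequate in this sketch because each worker spends its time waiting on a child process rather than executing Python bytecode.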
Assessing Perceptions AbouT Hazardous Substances (PATHS): The PATHS questionnaire
Amlôt, Richard; Page, Lisa; Pearce, Julia; Wessely, Simon
2013-01-01
How people perceive the nature of a hazardous substance may determine how they respond when potentially exposed to it. We tested a new Perceptions AbouT Hazardous Substances (PATHS) questionnaire. In Study 1 (N = 21), we assessed the face validity of items concerning perceptions about eight properties of a hazardous substance. In Study 2 (N = 2030), we tested the factor structure, reliability and validity of the PATHS questionnaire across four qualitatively different substances. In Study 3 (N = 760), we tested the impact of information provision on Perceptions AbouT Hazardous Substances scores. Our results showed that our eight measures demonstrated good reliability and validity when used for non-contagious hazards. PMID:23104995
Krüger-Gottschalk, Antje; Knaevelsrud, Christine; Rau, Heinrich; Dyer, Anne; Schäfer, Ingo; Schellong, Julia; Ehring, Thomas
2017-11-28
The Posttraumatic Stress Disorder (PTSD) Checklist (PCL, now PCL-5) has recently been revised to reflect the new diagnostic criteria of the disorder. A clinical sample of trauma-exposed individuals (N = 352) was assessed with the Clinician Administered PTSD Scale for DSM-5 (CAPS-5) and the PCL-5. Internal consistencies and test-retest reliability were computed. To investigate diagnostic accuracy, we calculated receiver operating curves. Confirmatory factor analyses (CFA) were performed to analyze the structural validity. Results showed high internal consistency (α = .95), high test-retest reliability (r = .91) and a high correlation with the total severity score of the CAPS-5, r = .77. In addition, the recommended cutoff of 33 on the PCL-5 showed high diagnostic accuracy when compared to the diagnosis established by the CAPS-5. CFAs comparing the DSM-5 model with alternative models (the three-factor solution, the dysphoria, anhedonia, externalizing behavior and hybrid model) to account for the structural validity of the PCL-5 remained inconclusive. Overall, the findings show that the German PCL-5 is a reliable instrument with good diagnostic accuracy. However, more research evaluating the underlying factor structure is needed.
Saengsuwan, Jittima; Berger, Lucia; Schuster-Amft, Corina; Nef, Tobias; Hunt, Kenneth J
2016-09-06
Exercise testing devices for evaluating cardiopulmonary fitness in patients with severe disability after stroke are lacking, but we have adapted a robotics-assisted tilt table (RATT) for cardiopulmonary exercise testing (CPET). Using the RATT in a sample of patients after stroke, this study aimed to investigate test-retest reliability and repeatability of CPET and to prospectively investigate changes in cardiopulmonary outcomes over a period of four weeks. Stroke patients with all degrees of disability underwent 3 separate CPET sessions: 2 tests at baseline (TB1 and TB2) and 1 test at follow up (TF). TB1 and TB2 were at least 24 h apart. TB2 and TF were 4 weeks apart. A RATT equipped with force sensors in the thigh cuffs, a work rate estimation algorithm and a real-time visual feedback system was used to guide the patients' exercise work rate during CPET. Test-retest reliability and repeatability of CPET variables were analysed using paired t-tests, the intraclass correlation coefficient (ICC), the coefficient of variation (CoV), and Bland and Altman limits of agreement. Changes in cardiopulmonary fitness during four weeks were analysed using paired t-tests. Seventeen sub-acute and chronic stroke patients (age 62.7 ± 10.4 years [mean ± SD]; 8 females) completed the test sessions. The median time post stroke was 350 days. There were 4 severely disabled, 1 moderately disabled and 12 mildly disabled patients. For test-retest, there were no statistically significant differences between TB1 and TB2 for most CPET variables. Peak oxygen uptake, peak heart rate, peak work rate and oxygen uptake at the ventilatory anaerobic threshold (VAT) and respiratory compensation point (RCP) showed good to excellent test-retest reliability (ICC 0.65-0.94). For all CPET variables, CoV was 4.1-14.5 %. The mean difference was close to zero in most of the CPET variables. There were no significant changes in most cardiopulmonary performance parameters during the 4-week period (TB2 vs TF). 
These findings provide the first evidence of test-retest reliability and repeatability of the principal CPET variables using the novel RATT system and testing methodology, and high success rates in identification of VAT and RCP: good to excellent test-retest reliability and repeatability were found for all submaximal and maximal CPET variables. Reliability and repeatability of the main CPET parameters in stroke patients on the RATT were comparable to previous findings in stroke patients using standard exercise testing devices. The RATT has potential to be used as an alternative exercise testing device in patients who have limitations for use of standard exercise testing devices.
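The Bland and Altman limits of agreement used above are conventionally the mean test-retest difference plus or minus 1.96 standard deviations of the differences. A minimal Python sketch under that convention follows; the function name and the values are hypothetical and are not the study's data.

```python
from statistics import mean, stdev


def bland_altman_limits(test, retest):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    diffs = [a - b for a, b in zip(test, retest)]
    bias = mean(diffs)                  # systematic difference between sessions
    spread = 1.96 * stdev(diffs)        # half-width of the 95% limits of agreement
    return bias, bias - spread, bias + spread


# Hypothetical peak oxygen uptake values from two sessions (illustration only).
session_1 = [10.0, 12.0, 14.0]
session_2 = [9.0, 12.0, 15.0]
bias, lower, upper = bland_altman_limits(session_1, session_2)
```

A bias close to zero with narrow limits indicates that the two sessions can be used interchangeably for that variable.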
Personality traits in companion dogs-Results from the VIDOPET.
Turcsán, Borbála; Wallis, Lisa; Virányi, Zsófia; Range, Friederike; Müller, Corsin A; Huber, Ludwig; Riemer, Stefanie
2018-01-01
Individual behavioural differences in pet dogs are of great interest from a basic and applied research perspective. Most existing dog personality tests have specific (practical) goals in mind and so focused only on a limited aspect of dogs' personality, such as identifying problematic (aggressive or fearful) behaviours, assessing suitability as working dogs, or improving the results of adoption. Here we aimed to create a comprehensive test of personality in pet dogs that goes beyond traditional practical evaluations by exposing pet dogs to a range of situations they might encounter in everyday life. The Vienna Dog Personality Test (VIDOPET) consists of 15 subtests and was performed on 217 pet dogs. A two-step data reduction procedure (principal component analysis on each subtest followed by an exploratory factor analysis on the subtest components) yielded five factors: Sociability-obedience, Activity-independence, Novelty seeking, Problem orientation, and Frustration tolerance. A comprehensive evaluation of reliability and validity measures demonstrated excellent inter- and intra-observer reliability and adequate internal consistency of all factors. Moreover the test showed good temporal consistency when re-testing a subsample of dogs after an average of 3.8 years-a considerably longer test-retest interval than assessed for any other dog personality test, to our knowledge. The construct validity of the test was investigated by analysing the correlations between the results of video coding and video rating methods and the owners' assessment via a dog personality questionnaire. The results demonstrated good convergent as well as discriminant validity. To conclude, the VIDOPET is not only a highly reliable and valid tool for measuring dog personality, but also the first test to show consistent behavioural traits related to problem solving ability and frustration tolerance in pet dogs. PMID:29634747
Infant polysomnography: reliability and validity of infant arousal assessment.
Crowell, David H; Kulp, Thomas D; Kapuniai, Linda E; Hunt, Carl E; Brooks, Lee J; Weese-Mayer, Debra E; Silvestri, Jean; Ward, Sally Davidson; Corwin, Michael; Tinsley, Larry; Peucker, Mark
2002-10-01
Infant arousal scoring based on the Atlas Task Force definition of transient EEG arousal was evaluated to determine (1) whether transient arousals can be identified and assessed reliably in infants and (2) whether arousal and no-arousal epochs scored previously by trained raters can be validated reliably by independent sleep experts. Phase I for inter- and intrarater reliability scoring was based on two datasets of sleep epochs selected randomly from nocturnal polysomnograms of healthy full-term, preterm, idiopathic apparent life-threatening event cases, and siblings of Sudden Infant Death Syndrome infants of 35 to 64 weeks postconceptional age. After training, test set 1 reliability was assessed and discrepancies identified. After retraining, test set 2 was scored by the same raters to determine interrater reliability. Later, three raters from the trained group rescored test set 2 to assess inter- and intrarater reliabilities. Interrater and intrarater reliability kappas, with 95% confidence intervals, ranged from substantial to almost perfect levels of agreement. Interrater reliabilities for spontaneous arousals were initially moderate and then substantial. During the validation phase, 315 previously scored epochs were presented to four sleep experts to rate as containing arousal or no-arousal events. Interrater expert agreements were diverse and considered as noninterpretable. Concordance in sleep experts' agreements, based on identification of the previously sampled arousal and no-arousal epochs, was used as a secondary evaluative technique. Results showed agreement by two or more experts on 86% of the Collaborative Home Infant Monitoring Evaluation Study arousal scored events. Conversely, only 1% of the Collaborative Home Infant Monitoring Evaluation Study-scored no-arousal epochs were rated as an arousal. 
In summary, this study presents an empirically tested model with procedures and criteria for attaining improved reliability in transient EEG arousal assessments in infants using the modified Atlas Task Force standards. With training based on specific criteria, substantial inter- and intrarater agreement in identifying infant arousals was demonstrated. Corroborative validation results were too disparate for meaningful interpretation. Alternate evaluation based on concordance agreements supports reliance on infant EEG criteria for assessment. Results mandate additional confirmatory validation studies with specific training on infant EEG arousal assessment criteria.
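The kappa statistics reported above correct raw agreement between raters for the agreement expected by chance. The Python sketch below computes Cohen's kappa for two raters as kappa = (p_o - p_e) / (1 - p_e); the function name and the epoch labels are hypothetical, and the study's exact kappa variant for more than two raters is not specified in the abstract.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    labels = set(counts_a) | set(counts_b)
    p_chance = sum((counts_a[l] / n) * (counts_b[l] / n) for l in labels)
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical epoch scores: 1 = arousal, 0 = no arousal (illustration only).
rater_1 = [1, 1, 0, 0]
rater_2 = [1, 0, 0, 0]
kappa = cohen_kappa(rater_1, rater_2)
```

Kappa is undefined when chance agreement equals 1, for example when both raters assign a single label to every epoch; the sketch does not guard against that case.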
ERIC Educational Resources Information Center
Garcia Laborda, Jesus
2007-01-01
Interface design and ergonomics, while already studied in much of educational theory, have not until recently been considered in language testing (Fulcher, 2003). In this paper, we revise the design principles of PLEVALEX, a fully operational prototype Internet based language testing platform. Our focus here is to show PLEVALEX's interfaces and…
Almeida, Gustavo J; Irrgang, James J; Fitzgerald, G Kelley; Jakicic, John M; Piva, Sara R
2016-06-01
Few instruments that measure physical activity (PA) can accurately quantify PA performed at light and moderate intensities, which is particularly relevant in older adults. The evidence of their reliability in free-living conditions is limited. The study objectives were: (1) to determine the test-retest reliability of the Actigraph (ACT), SenseWear Armband (SWA), and Community Healthy Activities Model Program for Seniors (CHAMPS) questionnaire in assessing free-living PA at light and moderate intensities in people after total knee arthroplasty; (2) to compare the reliability of the 3 instruments relative to each other; and (3) to determine the reliability of commonly used monitoring time frames (24 hours, waking hours, and 10 hours from awakening). A one-group, repeated-measures design was used. Participants wore the activity monitors for 2 weeks, and the CHAMPS questionnaire was completed at the end of each week. Test-retest reliability was determined by using the intraclass correlation coefficient (ICC [2,k]) to compare PA measures from one week with those from the other week. Data from 28 participants who reported similar PA during the 2 weeks were included in the analysis. The mean age of these participants was 69 years (SD=8), and 75% of them were women. Reliability ranged from moderate to excellent for the ACT (ICC=.75-.86) and was excellent for the SWA (ICC=.93-.95) and the CHAMPS questionnaire (ICC=.86-.92). The 95% confidence intervals (95% CI) of the ICCs from the SWA were the only ones within the excellent reliability range (.85-.98). The CHAMPS questionnaire showed systematic bias, with less PA being reported in week 2. The reliability of PA measures in the waking-hour time frame was comparable to that in the 24-hour time frame and reflected most PA performed during this period. Reliability may be lower for time intervals longer than 1 week. All PA measures showed good reliability. 
The reliability of the ACT was lower than those of the SWA and the CHAMPS questionnaire. The SWA provided more precise reliability estimates. Wearing PA monitors during waking hours provided sufficiently reliable measures and can reduce the burden on people wearing them. © 2016 American Physical Therapy Association.
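The ICC(2,k) named above is, in the Shrout and Fleiss formulation, a two-way random-effects coefficient for the average of k measurements: (MSR - MSE) / (MSR + (MSC - MSE) / n), with mean squares for subjects (MSR), measurement occasions (MSC), and residual error (MSE). The Python sketch below applies that formula; the function name and the weekly activity values are hypothetical, not study data.

```python
def icc_2k(data):
    """ICC(2,k) for data given as n subjects x k measurements."""
    n, k = len(data), len(data[0])
    grand = sum(sum(row) for row in data) / (n * k)
    row_means = [sum(row) / k for row in data]
    col_means = [sum(row[j] for row in data) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between occasions
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_error = ss_total - ss_rows - ss_cols                  # residual

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (ms_cols - ms_error) / n)


# Hypothetical minutes of moderate activity in week 1 and week 2 (illustration only).
weekly_activity = [
    [30.0, 32.0],
    [45.0, 41.0],
    [20.0, 25.0],
    [60.0, 58.0],
]
icc = icc_2k(weekly_activity)
```

Because the occasion mean square appears in the denominator, a systematic shift between weeks lowers ICC(2,k) even when subjects keep the same rank order.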
Reliability improvement of wire bonds subjected to fatigue stresses.
NASA Technical Reports Server (NTRS)
Ravi, K. V.; Philofsky, E. M.
1972-01-01
The failure of wire bonds due to repeated flexure when semiconductor devices are operated in an on-off mode has been investigated. An accelerated fatigue testing apparatus was constructed and the major fatigue variables, aluminum alloy composition, and bonding mechanism, were tested. The data showed Al-1% Mg wires to exhibit superior fatigue characteristics compared to Al-1% Cu or Al-1% Si and ultrasonic bonding to be better than thermocompression bonding for fatigue resistance. Based on these results highly reliable devices were fabricated using Al-1% Mg wire with ultrasonic bonding which withstood 120,000 power cycles with no failures.
Deb, Shoumitro; Bryant, Eleanor; Morris, Paul G; Prior, Lindsay; Lewis, Glyn; Haque, Sayeed
2007-06-01
To develop a measure to assess post-acute outcome following traumatic brain injury (TBI), with particular emphasis on the emotional and the behavioral outcome. The second objective was to assess the test-retest reliability, internal consistency, and factor structure of the newly developed patient version of the Head Injury Participation Scale (P-HIPS) and Patient-Head Injury Neurobehavioral Scale (P-HINAS). Thirty-two TBI individuals and 27 carers took part in in-depth qualitative interviews exploring the consequences of the TBI. Interview transcripts were analyzed and key themes and concepts were used to construct the 49-item P-HIPS. A postal survey was then conducted on a cohort of 113 TBI patients to 'field test' the P-HIPS and the P-HINAS. All 49 individual items of the P-HIPS and their total score showed good test-retest reliability (0.93) and internal consistency (0.95). The P-HIPS showed a very good correlation with the Mayo Portland Adaptability Inventory-3 (MPAI-3) (0.87) and a moderate negative correlation with the Glasgow Outcome Scale-Extended (GOSE) (-0.51). Factor analysis extracted the following domains: 'Emotion/Behavior,' 'Independence/Community Living,' 'Cognition' and 'Physical'. The 'Emotion/Behavior' factor constituted the P-HINAS, which showed good internal consistency (0.93), test-retest reliability (0.91) and concurrent validity with the MPAI subscale (0.82). Both the P-HIPS and the P-HINAS show strong psychometric properties. The qualitative methodology employed in the construction stage of the questionnaires provided good evidence of face and content validity.
Chan, Helen YL; Chun, Gloria KM; Man, C W; Leung, Edward MF
2018-05-01
Although much attention has been on integrating the palliative care approach into services of long-term care homes for older people living with frailty and progressive diseases, little is known about the staff preparedness for these new initiatives. The present study aimed to develop and test the psychometric properties of an instrument for measuring care home staff preparedness in providing palliative and end-of-life care. A 16-item instrument, covering perceived knowledge, skill and psychological readiness, was developed. A total of 247 staff members of different ranks from four care homes participated in the study. Exploratory factor analysis using the principal component analysis extraction method with varimax rotation was carried out for initial validation. Known group comparison was carried out to examine its discriminant validity. Reliability of the instrument was assessed based on test-retest reliability of a subsample of 20 participants and the Cronbach's alpha of the items. Exploratory factor analysis showed that the instrument yielded a three-factor solution, which cumulatively accounted for 68.5% of the total variance. Three subscales, namely, willingness, capability and resilience, showed high internal consistency and test-retest reliability. It also showed good discriminant validity between staff members of professional and non-professional groups. This is a brief, valid and reliable scale for measuring care home staff preparedness for providing palliative and end-of-life care. It can be used to identify their concerns and training needs in providing palliative and end-of-life care, and as an outcome measure to evaluate the effects of interventional studies for capacity building in this regard. Geriatr Gerontol Int 2018; 18: 745-749. © 2018 Japan Geriatrics Society.
Validity and reliability of the Spanish version of the 10-item CD-RISC in patients with fibromyalgia
2014-01-01
Background: No resilience scale has been validated in Spanish patients with fibromyalgia. The aim of this study was to evaluate the validity and reliability of the 10-item CD-RISC in a sample of Spanish patients with fibromyalgia. Methods: Design: Observational prospective multicenter study. Sample: Patients with diagnoses of fibromyalgia recruited from primary care settings (N = 208). Instruments: In addition to sociodemographic data, the following questionnaires were administered: Pain Visual Analogue Scale (PVAS), the 10-item Connor-Davidson Resilience scale (10-item CD-RISC), the Fibromyalgia Impact Questionnaire (FIQ), the Hospital Anxiety and Depression Scale (HADS), the Pain Catastrophizing Scale (PCS), the Chronic Pain Acceptance Questionnaire (CPAQ), and the Mindful Attention Awareness Scale (MAAS). Results: Regarding construct validity, the factor solution in the Principal Component Analysis (PCA) was considered adequate, as the KMO test had a value of 0.91 and Bartlett’s test of sphericity was significant (χ² = 852.8; df = 45; p < 0.001). Only one factor showed an eigenvalue greater than 1, and it explained 50.4% of the variance. PCA and Confirmatory Factor Analysis (CFA) results did not show significant differences between groups. The 10-item CD-RISC scale demonstrated good internal consistency (Cronbach’s alpha = 0.88) and test-retest reliability (r = 0.89 for a six-week interval). The 10-item CD-RISC score was significantly correlated with all of the other psychometric instruments in the expected direction, except for the PVAS (−0.115; p = 0.113). Conclusions: Our study confirms that the Spanish version of the 10-item CD-RISC shows, in patients with fibromyalgia, acceptable psychometric properties, with a high level of reliability and validity. PMID:24484847
NASA Astrophysics Data System (ADS)
Zima, W.; Kolenberg, K.; Briquet, M.; Breger, M.
2004-06-01
We have carried out a Hare-and-Hound test to determine the reliability of the Moment Method (Briquet & Aerts 2003) and the Pixel-by-Pixel Method (Mantegazza 2000) for the identification of pulsation modes in Delta Scuti stars. For this purpose we calculated synthetic line profiles, exhibiting six pulsation modes of low degree and with input parameters initially unknown to us. The aim was to test and increase the quality of the mode identification by applying both methods independently and by using a combined technique. Our results show that, whereas the azimuthal order m and its sign can be fixed by both methods, the degree l is not determined unambiguously. Both identification methods show a better reliability if multiple modes are fitted simultaneously. In particular, the inclination angle is better determined. We have to emphasize that the outcome of this test is only meaningful for stars having pulsational velocities below 0.2 vsini. This is the first part of a series of articles, in which we will test these spectroscopic identification methods.
Kim, Jongshin; Nam, Kyoung Won; Jang, Ik Gyu; Yang, Hee Kyung; Kim, Kwang Gi; Hwang, Jeong-Min
2012-03-15
To evaluate the accuracy, validity, and reliability of a newly developed infrared optical head tracker (IOHT) using Nintendo Wii remote controllers (WiiMote; Nintendo Co. Ltd., Kyoto, Japan) for measurement of the angle of head posture. The IOHT consists of two infrared (IR) receivers (WiiMote) that are fixed to a mechanical frame and connected to a monitoring computer via a Bluetooth communication channel and an IR beacon that consists of four IR light-emitting diodes (LEDs). With the use of the Cervical Range of Motion (CROM; Performance Attainment Associates, St. Paul, MN) as a reference, one- and three-dimensional (1- and 3-D) head postures of 20 normal adult subjects (20-37 years of age; 9 women and 11 men) were recorded with the IOHT. In comparison with the data from the CROM, the IOHT-derived results showed high consistency. The measurements of 1- and 3-D positions of the human head with the IOHT were very close to those of the CROM. The correlation coefficients of 1- and 3-D positions between the IOHT and the CROM were more than 0.99 and 0.96 (P < 0.05, Pearson's correlation test), respectively. Reliability tests of the IOHT for the normal adult subjects for 1- and 3-D positions of the human head had 95% limits of agreement angles of approximately ±4.5° and ±8.0°, respectively. The IOHT showed strong concordance with the CROM and relatively good test-retest reliability, thus proving its validity and reliability as a head-posture-measuring device. Considering its high performance, ease of use, and low cost, the IOHT has the potential to be widely used as a head-posture-measuring device in clinical practice.
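Note on the 95% limits of agreement reported above: under the usual Bland-Altman convention they are the mean of the paired differences plus or minus 1.96 standard deviations of those differences. The sketch below assumes that convention; the abstract does not spell out its exact computation. The function name limits_of_agreement and the angle values are hypothetical.

```python
from statistics import mean, stdev


def limits_of_agreement(a, b):
    """Return (bias, lower, upper) for paired measurements a and b.

    bias is the mean difference a - b; the limits are bias +/- 1.96 * SD
    of the differences (Bland-Altman convention).
    """
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical head-posture angles (degrees) from two devices
device_a = [10, 12, 14, 16]
device_b = [9, 12, 15, 15]
bias, lower, upper = limits_of_agreement(device_a, device_b)
print(f"bias={bias:.2f}, limits=({lower:.2f}, {upper:.2f})")
```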
Reliability of a new test battery for fitness assessment of the European Astronaut corps.
Petersen, Nora; Thieschäfer, Lutz; Ploutz-Snyder, Lori; Damann, Volker; Mester, Joachim
2015-01-01
To optimise health for space missions, European astronauts follow specific conditioning programs before, during and after their flights. To evaluate the effectiveness of these programs, the European Space Agency conducts an Astronaut Fitness Assessment (AFA), but the test-retest reliability of elements within it remains unexamined. The reliability study described here presents a scientific basis for implementing the AFA, but also highlights challenges faced by operational teams supporting humans in such unique environments, especially with respect to health and fitness monitoring of crew members travelling not only into space, but also across the world. The AFA tests assessed parameters known to be affected by prolonged exposure to microgravity: aerobic capacity (VO2max), muscular strength (one repetition max, 1 RM) and power (vertical jumps), core stability, flexibility and balance. Intraclass correlation coefficients (ICC3.1), standard error of measurement and coefficient of variation were used to assess relative and absolute test-retest reliability. Squat and bench 1 RM (ICC3.1 = 0.94-0.99), hip flexion (ICC3.1 = 0.99) and left and right handgrip strength (ICC3.1 = 0.95 and 0.97), showed the highest test-retest reliability, followed by VO2max (ICC3.1 = 0.91), core strength (ICC3.1 = 0.78-0.89), hip extension (ICC3.1 = 0.63), the countermeasure (ICC3.1 = 0.76) and squat (ICC3.1 = 0.63) jumps, and single right- and left-leg jump height (ICC3.1 = 0.51 and 0.14). For balance, relative reliability ranged from ICC3.1 = 0.78 for path length (two legs, head tilted back, eyes open) to ICC3.1 = 0.04 for average rotation velocity (one leg, eyes closed). In a small sample (n = 8) of young, healthy individuals, the AFA battery of tests demonstrated acceptable test-retest reliability for most parameters except some balance and single-leg jump tasks. 
These findings suggest that, for the application with astronauts, most AFA tests appear appropriate to be maintained in the test battery, but that some elements may be unreliable, and require either modification (duration, selection of task) or removal (single-leg jump, balance test on sphere) from the battery. The test battery is mobile and universally applicable for occupational and general fitness assessment by its comprehensive composition of tests covering many systems involved in whole body movement.
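Note on the ICC3.1 statistic reported above: in the Shrout and Fleiss scheme, ICC(3,1) is the two-way mixed-effects, consistency, single-measure coefficient, obtained from the between-subject and residual mean squares of a subjects-by-sessions table. The sketch below implements that standard definition; the function name icc_3_1 and the demo data are hypothetical and do not come from the study.

```python
def icc_3_1(data):
    """ICC(3,1) for a table of n subjects (rows) by k sessions (columns).

    ICC(3,1) = (MSR - MSE) / (MSR + (k - 1) * MSE), where MSR is the
    between-subject mean square and MSE the residual mean square.
    """
    n, k = len(data), len(data[0])
    grand = sum(sum(row) for row in data) / (n * k)
    row_means = [sum(row) / k for row in data]
    col_means = [sum(row[j] for row in data) / n for j in range(k)]

    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)


# Hypothetical test and retest scores for 4 subjects
demo = [[10, 12], [14, 13], [18, 19], [22, 20]]
print(round(icc_3_1(demo), 3))
```

Because this form measures consistency rather than absolute agreement, a constant shift between sessions (for example a uniform learning effect) does not lower the coefficient.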
Johansson, Fredrik R.; Skillgate, Eva; Lapauw, Mattis L.; Clijmans, Dorien; Deneulin, Valentijn P.; Palmans, Tanneke; Engineer, Human Kinetic; Cools, Ann M.
2015-01-01
Context Shoulder strength assessment plays an important role in the clinical examination of the shoulder region. Eccentric strength measurements are of special importance in guiding the clinician in injury prevention or return-to-play decisions after injury. Objective To examine the absolute and relative reliability and validity of a standardized eccentric strength-measurement protocol for the glenohumeral external rotators. Design Descriptive laboratory study. Setting Testing environment at the Department of Rehabilitation Sciences and Physiotherapy of Ghent University, Belgium. Patients or Other Participants Twenty-five healthy participants (9 men and 16 women) without any history of shoulder pain were tested by 2 independent assessors using a handheld dynamometer (HHD) and underwent an isokinetic testing procedure. Intervention(s) The clinical protocol used an HHD, a DynaPort accelerometer to measure acceleration and angular velocity of testing 30°/s over 90° of range of motion, and a Biodex dynamometer to measure isokinetic activity. Main Outcome Measure(s) Three eccentric strength measurements: (1) tester 1 with the HHD, (2) tester 2 with the HHD, and (3) Biodex isokinetic strength measurement. Results The intratester reliability was excellent (0.879 and 0.858), whereas the intertester reliability was good, with an intraclass correlation coefficient between testers of 0.714. Pearson product moment correlation coefficients of 0.78 and 0.70 were noted between the HHD and the isokinetic data, showing good validity of this new procedure. Conclusions Standardized eccentric rotator cuff strength can be tested and measured in the clinical setting with good-to-excellent reliability and validity using an HHD. PMID:25974381
Test-Retest Reliability of a Novel Isokinetic Squat Device With Strength-Trained Athletes.
Bridgeman, Lee A; McGuigan, Michael R; Gill, Nicholas D; Dulson, Deborah K
2016-11-01
Bridgeman, LA, McGuigan, MR, Gill, ND, and Dulson, DK. Test-retest reliability of a novel isokinetic squat device with strength-trained athletes. J Strength Cond Res 30(11): 3261-3265, 2016. The aim of this study was to investigate the test-retest reliability of a novel multijoint isokinetic squat device. The subjects in this study were 10 strength-trained athletes. Each subject completed 3 maximal testing sessions to assess peak concentric and eccentric force (N) over a 3-week period using the Exerbotics squat device. Mean differences between eccentric and concentric force across the trials were calculated. Intraclass correlation coefficients (ICCs) and coefficients of variation (CVs) for the variables of interest were calculated using an Excel reliability spreadsheet. Between trials 1 and 2, increases of 11.0 and 2.3% in mean concentric and eccentric forces, respectively, were reported. Between trials 2 and 3, a 1.35% increase in the mean concentric force production and a 1.4% increase in eccentric force production were reported. The mean concentric peak force CV and ICC across the 3 trials were 10% (7.6-15.4) and 0.95 (0.87-0.98), respectively. The mean eccentric peak force CV and ICC across the trials were 7.2% (5.5-11.1) and 0.90 (0.76-0.97), respectively. Based on these findings it is suggested that the Exerbotics squat device shows good test-retest reliability. Therefore practitioners and investigators may consider its use to monitor changes in concentric and eccentric peak force.
Erkan Turan, Kadriye; Taylan Sekeroglu, Hande; Karahan, Sevilay; Sanac, Ali Sefik
2017-12-01
The purpose of this study was to analyze the reliability of the fixation preference test (FPT) in the detection of amblyopia, and to determine interexaminer agreement. Eighty patients whose visual acuity could be tested objectively and who had a horizontal misalignment of more than 10 prism diopters were enrolled. The best corrected visual acuity (BCVA) and orthoptic findings were all recorded. The non-preferred eye in primary position and the fixation preference grade were assessed independently by two masked experienced examiners. The primary outcome measures were reliability of FPT in terms of its correlation with BCVA and interexaminer agreement. For either examiner, there was no significant correlation between fixation preference grades and interocular visual acuity difference, the type and amount of deviation, the presence of fusion, stereopsis, anisometropia, or previous strabismus surgery (p > 0.05 for all). Sensitivity was 52.0% for examiner 1 and 54.0% for examiner 2 while specificity was 50.0 and 46.7%, respectively. Interexaminer agreement was 76.7% (p < 0.001) for all patients. FPT is widely used in children particularly when the visual acuity cannot be determined in an objective manner. The test may not be accurate and reliable in the detection of amblyopia and also in predicting the visual acuity difference between both eyes, even though it was found to show a high degree of agreement between examiners. In conclusion, it should be kept in mind that the reliability of FPT may be limited and the results should be interpreted with caution and be supported by other tests.
Elliot, Catherine A; Hamlin, Michael J; Lizamore, Catherine A
2017-07-28
The purpose of this study was to investigate the validity and reliability of the Hexoskin® vest for measuring respiration and heart rate (HR) in elite cyclists during a progressive test to exhaustion. Ten male elite cyclists (age 28.8 ± 12.5 yr, height 179.3 ± 6.0 cm, weight 73.2 ± 9.1 kg, V̇O2max 60.7 ± 7.8 ml·kg-1·min-1; mean ± SD) conducted a maximal aerobic cycle ergometer test using a ramped protocol (starting at 100 W with 25 W increments each min to failure) on two separate occasions over a 3-4 day period. Compared to the criterion measure (Metamax 3B) the Hexoskin® vest showed mainly small typical errors (1.3-6.2%) for HR and breathing frequency (f), but larger typical errors (9.5-19.6%) for minute ventilation (V̇E) during the progressive test to exhaustion. The typical error indicating the reliability of the Hexoskin® vest at moderate intensity exercise between tests was small for HR (2.6-2.9%) and f (2.5-3.2%) but slightly larger for V̇E (5.3-7.9%). We conclude that the Hexoskin® vest is sufficiently valid and reliable for measurements of HR and f in elite athletes during high intensity cycling but the calculated V̇E value the Hexoskin® vest produces during such exercise should be used with caution due to the lower validity and reliability of this variable.
Melchiorri, Giovanni; Viero, Valerio; Triossi, Tamara; Padua, Elvira; Bonifazi, Marco
2017-11-01
This study investigated the applicability of a sport-specific test, the Shuttle Swim Test (SST), in young water polo players to measure RSA. The aims were: to assess the reliability and to measure the responsiveness of the SST in young water polo athletes, and to provide age-related values of SST. Three hundred thirty-three elite athletes (18.3±5.1 years) were involved in the study. Of these, 99 were young people under 13 (13.1±0.5 years) who also underwent measurements for reliability and responsiveness of the SST. The following six measures were used to assess anthropometric characteristics of the sample: height, weight, chest circumference, hip circumference, waist circumference, and arm span. Two performance measures were performed on dry land: push up and chin up. Reliability and responsiveness were measured by comparing the average speed of two trials: SST1 was 1.48±0.13 m·s-1 and SST2 1.47±0.12 m·s-1. The SST showed good reliability in younger athletes (r=0.96). The Minimal Detectable Change is 0.06 m·s-1 (6 seconds of the total time) which corresponds to 3.6% of the average value measured, confirming the good responsiveness of the test. Coaches and researchers can use this value in the interpretation of the SST test results: changes below these values could be related to a measurement error. The various age-related values reported may help technicians to better interpret the performance of their athletes during competition.
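Note on the Minimal Detectable Change mentioned above: the abstract does not state how its value was derived. One widely used convention computes the standard error of measurement as SEM = SD × sqrt(1 − r) and then MDC95 = 1.96 × sqrt(2) × SEM. The sketch below assumes that convention and feeds in the SD and r quoted in the abstract purely as illustrative inputs; its output need not equal the 0.06 m·s-1 reported, since the authors may have used a different SD, reliability coefficient, or confidence level. The function names sem and mdc95 are hypothetical.

```python
from math import sqrt


def sem(sd, r):
    """Standard error of measurement from a sample SD and a reliability coefficient r."""
    return sd * sqrt(1 - r)


def mdc95(sd, r):
    """Minimal detectable change at the 95% level: 1.96 * sqrt(2) * SEM."""
    return 1.96 * sqrt(2) * sem(sd, r)


# Illustrative inputs: SD of the first trial and r as quoted in the abstract
print(f"SEM   = {sem(0.13, 0.96):.3f} m/s")
print(f"MDC95 = {mdc95(0.13, 0.96):.3f} m/s")
```

Under this convention, a change between two trials smaller than MDC95 cannot be distinguished from measurement error, which is the interpretation the abstract gives.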
Jensen, K K; Kjaer, M; Jorgensen, L N
2016-12-01
To determine the reliability of measurements obtained by the Good Strength dynamometer, determining isometric abdominal wall and back muscle strength in patients with ventral incisional hernia (VIH) and healthy volunteers with an intact abdominal wall. Ten patients with VIH and ten healthy volunteers with an intact abdominal wall were each examined twice with a 1 week interval. Examination included the assessment of truncal flexion and extension as measured with the Good Strength dynamometer, the completion of the International Physical Activity Questionnaire (IPAQ) and the self-assessment of truncal strength on a visual analogue scale (SATS). The test-retest reliability of truncal flexion and extension was assessed by intraclass correlation coefficient (ICC), and Bland and Altman graphs. Finally, correlations between truncal strength, and IPAQ and SATS were examined. Truncal flexion and extension showed excellent test-retest reliability for both patients with VIH (ICC 0.91 and 0.99) and healthy controls (ICC 0.97 and 0.96). Bland and Altman plots showed that no systematic bias was present for either truncal flexion or extension when assessing reliability. For patients with VIH, no significant correlations between objective measures of truncal strength and IPAQ or SATS were found. For healthy controls, both truncal flexion (τ 0.58, p = 0.025) and extension (τ 0.58, p = 0.025) correlated significantly with SATS, while no significant correlation between truncal strength measures and IPAQ was found. The Good Strength dynamometer provided a reliable, low-cost measure of truncal flexion and extension in patients with VIH.
Casartelli, Nicola; Müller, Roland; Maffiuletti, Nicola A
2010-11-01
The aim of the present study was to verify the validity and reliability of the Myotest accelerometric system (Myotest SA, Sion, Switzerland) for the assessment of vertical jump height. Forty-four male basketball players (age range: 9-25 years) performed series of squat, countermovement and repeated jumps during 2 identical test sessions separated by 2-15 days. Flight height was simultaneously quantified with the Myotest system and validated photoelectric cells (Optojump). Two calculation methods were used to estimate the jump height from Myotest recordings: flight time (Myotest-T) and vertical takeoff velocity (Myotest-V). Concurrent validity was investigated comparing Myotest-T and Myotest-V to the criterion method (Optojump), and test-retest reliability was also examined. As regards validity, Myotest-T overestimated jumping height compared to Optojump (p < 0.001) with a systematic bias of approximately 7 cm, even though random errors were low (2.7 cm) and intraclass correlation coefficients (ICCs) were high (>0.98), that is, excellent validity. Myotest-V overestimated jumping height compared to Optojump (p < 0.001), with high random errors (>12 cm), high limits of agreement ratios (>36%), and low ICCs (<0.75), that is, poor validity. As regards reliability, Myotest-T showed high ICCs (range: 0.92-0.96), whereas Myotest-V showed low ICCs (range: 0.56-0.89), and high random errors (>9 cm). In conclusion, Myotest-T is a valid and reliable method for the assessment of vertical jump height, and its use is legitimate for field-based evaluations, whereas Myotest-V is neither valid nor reliable.
Keessen, Paul; Maaskant, Jolanda; Visser, Bart
2018-08-01
The standardized Mensendieck test (SMT) was developed to quantify posture, movement, gait, and respiration. In the hands of an experienced therapist, the SMT is proven to be a reliable tool. It is unclear whether posture, movement, gait, and respiration are related to the degree of functional disability in patients with chronic pain. The objective of this study was to assess the reliability and convergent validity of the SMT in a heterogeneous sample of 50 patients with chronic pain. Internal consistency was determined by Cronbach's α and interrater reliability by the intraclass correlation coefficient (ICC). Convergent validity was assessed by determining the Spearman rank correlation coefficient between the movement quality measured in the SMT and functional limitation measured on the disability rating index (DRI). The internal consistency was Cronbach's α 0.91. Substantial reliability was found for the items: movement (ICC = 0.68), gait (ICC = 0.69), sitting posture (ICC = 0.63), and respiration (ICC = 0.64). Insufficient reliability was found for standing posture (ICC = 0.23). A moderate correlation was found between average test score SMT and the DRI (r = -0.37) and respiration and DRI (r = -0.45). The SMT is a reasonably reliable tool to assess movement, gait, sitting posture, and respiration. None of the items in the domain standing posture has sufficient reliability. A thorough study of this domain should be considered. The results show little evidence for convergent validity. Several items of the SMT correlated moderately with functional limitation with the DRI. These items were global movement, hip flexion, pelvis rotation, and all respiration items.
Test-retest reliability of resting-state magnetoencephalography power in sensor and source space.
Martín-Buro, María Carmen; Garcés, Pilar; Maestú, Fernando
2016-01-01
Several studies have reported changes in spontaneous brain rhythms that could be used as clinical biomarkers or in the evaluation of neuropsychological and drug treatments in longitudinal studies using magnetoencephalography (MEG). There is an increasing necessity to use these measures in early diagnosis and pathology progression; however, there is a lack of studies addressing how reliable they are. Here, we provide the first test-retest reliability estimate of MEG power in resting-state at sensor and source space. In this study, we recorded 3 sessions of resting-state MEG activity from 24 healthy subjects with an interval of a week between each session. Power values were estimated at sensor and source space with beamforming for classical frequency bands: delta (2-4 Hz), theta (4-8 Hz), alpha (8-13 Hz), low beta (13-20 Hz), high beta (20-30 Hz), and gamma (30-45 Hz). Then, test-retest reliability was evaluated using the intraclass correlation coefficient (ICC). We also evaluated the relation between source power and the within-subject variability. In general, ICC of theta, alpha, and low beta power was fairly high (ICC > 0.6) while in delta and gamma power was lower. In source space, fronto-posterior alpha, frontal beta, and medial temporal theta showed the most reliable profiles. Signal-to-noise ratio could be partially responsible for reliability as low signal intensity resulted in high within-subject variability, but also the inherent nature of some brain rhythms in resting-state might be driving these reliability patterns. In conclusion, our results described the reliability of MEG power estimates in each frequency band, which could be considered in disease characterization or clinical trials. © 2015 Wiley Periodicals, Inc.
Çelik, Derya
2016-01-01
The Constant-Murley score (CMS) is widely used to evaluate disabilities associated with shoulder injuries, but it has been criticized for relying on imprecise terminology and a lack of standardized methodology. A modified guideline, therefore, was published in 2008 with several recommendations. This new version has not yet been translated or culturally adapted for Turkish-speaking populations. The purpose of this study was to translate and cross-culturally adapt the modified CMS and its test protocol, as well as define and measure its reliability and validity. The modified CMS was translated into Turkish, consistent with published methodological guidelines. The measurement properties of the Turkish version of the modified CMS were tested in 30 patients (12 males, 18 females; mean age: 59.5±13.5 years) with a variety of shoulder pathologies. Intraclass correlation coefficients (ICC) were used to estimate test-retest reliability. Construct validity was analyzed with the Turkish version of the American Shoulder and Elbow Surgeons (ASES) Standardized Shoulder Assessment Form and Short-Form Health Survey (SF-12). No difficulties were found in the translation process. The Turkish version of the modified CMS showed excellent test-retest reliability (ICC=0.86). The correlation coefficients between the Turkish version of the modified CMS and the ASES, SF-12-physical component score, and SF-12 mental component scores were found to be 0.48, 0.35, and 0.05, respectively. No floor or ceiling effects were found. The translation and cultural adaptation of the modified CMS and its standardized test protocol into Turkish were successful. The Turkish version of the modified CMS has sufficient reliability and validity to measure a variety of shoulder disorders for Turkish-speaking individuals.
Hajdú, Sara Fredslund; Plaschke, Christina Caroline; Johansen, Christoffer; Dalton, Susanne Oksbjerg; Wessel, Irene
2017-08-01
The objectives were to translate and culturally adapt the M.D. Anderson Dysphagia Inventory (MDADI) into Danish and subsequently test the reliability of the Danish version. The MDADI was translated into Danish and cross-culturally adapted through cognitive interviews. The final version was test-retest evaluated in a group of head and neck cancer (HNC) patients who responded to the questionnaire twice, a mean of eight days apart. Intraclass correlation coefficient, Cronbach's alpha, floor and ceiling effects, standard error of measurement and minimal detectable change were investigated. Fourteen patients were interviewed on the comprehensibility of the Danish MDADI, and all found the questionnaire meaningful, easy to understand, non-offensive and to include relevant aspects of dysphagia related to HNC. Sixty-four patients were included in the test-retest study. In particular, one item in the emotional scale (E7) appeared to be often misinterpreted, and ceiling effects were found in all four subdomains (global, emotional, functional and physical). The four subdomains and the composite score showed acceptable test-retest reliability and internal consistency in a Danish population of HNC patients. The Danish MDADI is reliable in terms of internal consistency and test-retest reproducibility and can be used in assessing the health-related quality of life in head and neck cancer patients with dysphagia.
Choosing a reliability inspection plan for interval censored data
Lu, Lu; Anderson-Cook, Christine Michaela
2017-04-19
Reliability test plans are important for producing precise and accurate assessment of reliability characteristics. This paper explores different strategies for choosing between possible inspection plans for interval censored data given a fixed testing timeframe and budget. A new general cost structure is proposed for guiding precise quantification of total cost in inspection test plan. Multiple summaries of reliability are considered and compared as the criteria for choosing the best plans using an easily adapted method. Different cost structures and representative true underlying reliability curves demonstrate how to assess different strategies given the logistical constraints and nature of the problem. Results show several general patterns exist across a wide variety of scenarios. Given the fixed total cost, plans that inspect more units with less frequency based on equally spaced time points are favored due to the ease of implementation and consistent good performance across a large number of case study scenarios. Plans with inspection times chosen based on equally spaced probabilities offer improved reliability estimates for the shape of the distribution, mean lifetime, and failure time for a small fraction of population only for applications with high infant mortality rates. The paper uses a Monte Carlo simulation based approach in addition to the common evaluation based on the asymptotic variance and offers comparison and recommendation for different applications with different objectives. Additionally, the paper outlines a variety of different reliability metrics to use as criteria for optimization, presents a general method for evaluating different alternatives, as well as provides case study results for different common scenarios.
Choosing a reliability inspection plan for interval censored data
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lu, Lu; Anderson-Cook, Christine Michaela
Reliability test plans are important for producing precise and accurate assessment of reliability characteristics. This paper explores different strategies for choosing between possible inspection plans for interval censored data given a fixed testing timeframe and budget. A new general cost structure is proposed for guiding precise quantification of total cost in inspection test plan. Multiple summaries of reliability are considered and compared as the criteria for choosing the best plans using an easily adapted method. Different cost structures and representative true underlying reliability curves demonstrate how to assess different strategies given the logistical constraints and nature of the problem. Results show several general patterns exist across a wide variety of scenarios. Given the fixed total cost, plans that inspect more units with less frequency based on equally spaced time points are favored due to the ease of implementation and consistent good performance across a large number of case study scenarios. Plans with inspection times chosen based on equally spaced probabilities offer improved reliability estimates for the shape of the distribution, mean lifetime, and failure time for a small fraction of population only for applications with high infant mortality rates. The paper uses a Monte Carlo simulation based approach in addition to the common evaluation based on the asymptotic variance and offers comparison and recommendation for different applications with different objectives. Additionally, the paper outlines a variety of different reliability metrics to use as criteria for optimization, presents a general method for evaluating different alternatives, as well as provides case study results for different common scenarios.
Hasanpour, Neda; Attarbashi Moghadam, Behrouz; Sami, Ramin; Tavakol, Kamran
2016-08-01
The clinical COPD questionnaire (CCQ) has been developed to measure the health status of COPD patients. The aim of this study was to translate the CCQ into the Persian language and assess the validity and reliability of the translated version. We used a forward-backward procedure to translate the questionnaire. In a cross-sectional study, 100 COPD patients and 50 healthy subjects over 40 years old were selected to assess the reliability and construct validity of the instrument. Face and content validity were used to assess the questionnaire's validity. Validity was examined in a population of patients with COPD, using the Persian validated version of the St George's Respiratory Questionnaire (PSGRQ). In order to assess the questionnaire's reliability, the intraclass correlation coefficient (ICC) and Cronbach's alpha were calculated. Test-retest reliability was tested by re-administering the Persian version of the CCQ (PCCQ) after 1 week. The test-retest data demonstrate that the PCCQ has excellent reliability (ICCs for all 3 domains were higher than 0.9). Internal consistency was found by Cronbach's alpha to be 0.96, 0.94, 0.97, and 0.98 for the symptom, mental state, functional state and total scores respectively. In addition, the correlation between the components of PCCQ and PSGRQ showed satisfactory construct validity. Analysis of the data from healthy subjects and patients showed that the PCCQ has acceptable discriminant validity. In general, the PCCQ had satisfactory reliability and validity for assessing health-related quality of life status of Iranian COPD patients.
Dyke, Katherine; Kim, Soyoung; Jackson, Georgina M; Jackson, Stephen R
Transcranial direct current stimulation (tDCS) is a popular non-invasive brain stimulation technique that has been shown to influence cortical excitability. While polarity specific effects have often been reported, this is not always the case, and variability in both the magnitude and direction of the effects has been observed. We aimed to explore the consistency and reliability of the effects of tDCS by investigating changes in cortical excitability across multiple testing sessions in the same individuals. A within subjects design was used to investigate the effects of anodal and cathodal tDCS applied to the motor cortex. Four experimental sessions were tested for each polarity in addition to two sham sessions. Transcranial magnetic stimulation (TMS) was used to measure cortical excitability (TMS recruitment curves). Changes in excitability were measured by comparing baseline measures and those taken immediately following 20 minutes of 2 mA stimulation or sham stimulation. Anodal tDCS significantly increased cortical excitability at a group level, whereas cathodal tDCS failed to have any significant effects. The sham condition also failed to show any significant changes. Analysis of intra-subject responses to anodal stimulation across four sessions suggests that the amount of change in excitability across sessions was only weakly associated, and was found to have poor reliability across sessions (ICC = 0.276). The effects of cathodal stimulation show even poorer reliability across sessions (ICC = 0.137). In contrast, ICC analysis for the two sessions of sham stimulation reflects a moderate level of reliability (ICC = 0.424). Our findings indicate that although 2 mA anodal tDCS is effective at increasing cortical excitability at group level, the effects are unreliable across repeated testing sessions within individual participants. 
Our results suggest that 2 mA cathodal tDCS does not significantly alter cortical excitability immediately following stimulation and that there is poor reliability of the effect within the same individual across different testing sessions. Copyright © 2016. Published by Elsevier Inc.
Liu, Chao; Liu, Jinhong; Zhang, Junxiang; Zhu, Shiyao
2018-02-05
Direct counterfactual quantum communication (DCQC) is a surprising phenomenon in which quantum information can be transmitted without any physical particles acting as carriers. Nested interferometers are promising devices for realizing DCQC, provided the number of interferometers goes to infinity. Considering the inevitable loss or dissipation in practical experimental interferometers, we analyze the dependence of reliability on the number of interferometers, and show that the reliability of direct communication degrades rapidly as the number of interferometers becomes large. Furthermore, we simulate and test this counterfactual deterministic communication protocol with a finite number of interferometers, and demonstrate the improvement of the reliability using dissipation compensation in interferometers.
Extending the validity of the Feeding Practices and Structure Questionnaire.
Jansen, Elena; Mallan, Kimberley M; Daniels, Lynne A
2015-06-30
Feeding practices are commonly examined as potentially modifiable determinants of children's eating behaviours and weight status. Although a variety of questionnaires exist to assess different feeding aspects, many lack thorough reliability and validity testing. The Feeding Practices and Structure Questionnaire (FPSQ) is a tool designed to measure early feeding practices related to non-responsive feeding and structure of the meal environment. Face validity, factorial validity, internal reliability and cross-sectional correlations with children's eating behaviours have been established in mothers with 2-year-old children. The aim of the present study was to further extend the validity of the FPSQ by examining factorial, construct and predictive validity, and stability. Participants were from the NOURISH randomised controlled trial which evaluated an intervention with first-time mothers designed to promote protective feeding practices. Maternal feeding practices (FP) and child eating behaviours were assessed when children were aged 2 years and 3.7 years (n = 388). Confirmatory Factor analysis, group differences, predictive relationships, and stability were tested. The original 9-factor structure was confirmed when children were aged 3.7 ± 0.3 years. Cronbach's alpha was above the recommended 0.70 cut-off for all factors except Structured Meal Timing, Over Restriction and Distrust in Appetite which were 0.58, 0.67 and 0.66 respectively. Allocated group differences reflected behaviour consistent with intervention content and all feeding practices were stable across both time points (range of r = 0.45-0.70). There was some evidence for the predictive validity of factors with 2 FP showing expected relationships, 2 FP showing expected and unexpected relationships and 5 FP showing no relationship. Reliability and validity was demonstrated for most subscales of the FPSQ. Future validation is warranted with culturally diverse samples and with fathers and other caregivers. 
The use of additional outcomes to further explore predictive validity is recommended, as is an assessment of the test-retest reliability of the questionnaire.
Comparing reliabilities of strip and conventional patch testing.
Dickel, Heinrich; Geier, Johannes; Kreft, Burkhard; Pfützner, Wolfgang; Kuss, Oliver
2017-06-01
The standardized protocol for performing the strip patch test has proven to be valid, but evidence on its reliability is still missing. To estimate the parallel-test reliability of the strip patch test as compared with the conventional patch test. In this multicentre, prospective, randomized, investigator-blinded reliability study, 132 subjects were enrolled. Simultaneous duplicate strip and conventional patch tests were performed with the Finn Chambers ® on Scanpor ® tape test system and the patch test preparations nickel sulfate 5% pet., potassium dichromate 0.5% pet., and lanolin alcohol 30% pet. Reliability was estimated by the use of Cohen's kappa coefficient. Parallel-test reliability values of the three standard patch test preparations turned out to be acceptable, with slight advantages for the strip patch test. The differences in reliability were 9% (95%CI: -8% to 26%) for nickel sulfate and 23% (95%CI: -16% to 63%) for potassium dichromate, both favouring the strip patch test. The standardized strip patch test method for the detection of allergic contact sensitization in patients with suspected allergic contact dermatitis is reliable. Its application in routine clinical practice can be recommended, especially if the conventional patch test result is presumably false negative. © 2017 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
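Cohen's kappa, the agreement statistic named in the abstract above, compares observed agreement between two sets of readings with the agreement expected by chance. The following is a minimal sketch only: the function name `cohen_kappa` and the duplicate readings are hypothetical illustrations, not data or code from the study.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two equal-length sequences of categorical ratings."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed proportion of items on which the two readings agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement from the marginal category frequencies.
    freq_a = Counter(ratings_a)
    freq_b = Counter(ratings_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


# Made-up duplicate readings, for illustration only.
first = ["pos", "pos", "neg", "neg", "pos", "neg", "neg", "neg", "pos", "neg"]
second = ["pos", "neg", "neg", "neg", "pos", "neg", "neg", "pos", "pos", "neg"]
kappa = cohen_kappa(first, second)
print(f"kappa = {kappa:.3f}")
```

A kappa near 1 indicates agreement well beyond chance, while a value near 0 indicates agreement no better than chance.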
São-João, Thaís Moreira; Rodrigues, Roberta Cunha Matheus; Gallani, Maria Cecilia Bueno Jayme; Miura, Cinthya Tamie de Passos; Domingues, Gabriela de Barros Leite; Godin, Gaston
2013-06-01
To conduct the cultural adaptation of the Brazilian version of the Godin-Shephard Leisure-Time Physical Activity Questionnaire (GSLTPAQ) and to assess its content validity, practicability, acceptability and reliability. The stages of translation, synthesis, back translation, expert committee review and pre-test were carried out, followed by the evaluation of the practicability, acceptability and reliability (test-retest). The judges assessed its semantic, idiomatic, conceptual, cultural and metabolic equivalences. The adapted version was submitted to the pre-test (n = 20), and test-retest (n = 80), in healthy individuals and in those suffering from cardiovascular disease in Limeira, SP, Southeastern Brazil, between 2010 and 2011. The proportion of agreement of the committee of judges was assessed using the Content Validity Index. Reliability was assessed by the criterion of stability, with 15 days between applications. Practicability was evaluated by the time spent interviewing and acceptability was estimated as the percentage of unanswered items and the proportion of patients who responded to all items. The translated version of the questionnaire showed evidence of appropriate semantic-idiomatic, conceptual, cultural and metabolic equivalence, with substitutions of several physical activities more appropriate to the Brazilian population. The practicability analysis showed short time needed for the application of the instrument (mean 3.0 minutes). As for acceptability, all patients answered 100% of the items. The test-retest analysis suggested that stability was good (Intraclass Correlation Coefficient value of 0.84). The Brazilian version of the questionnaire showed satisfactory measures of the qualities in question. Its application to diverse populations in future studies is recommended in order to provide robust measures of these qualities.
Ferrari, Silvano; Manni, Tiziana; Bonetti, Francesca; Villafañe, Jorge Hugo; Vanti, Carla
2015-01-01
Several clinical tests have been proposed for low back pain (LBP), but their usefulness in detecting lumbar instability is not yet clear. The objective of this literature review was to investigate the clinical validity of the main clinical tests used for the diagnosis of lumbar instability in individuals with LBP and to verify their applicability in everyday clinical practice. We searched for studies of the accuracy and/or reliability of the Prone Instability Test (PIT), Passive Lumbar Extension Test (PLE), Aberrant Movements Pattern (AMP), Posterior Shear Test (PST), Active Straight Leg Raise Test (ASLR) and Prone and Supine Bridge Tests (PB and SB) in the Medline, Embase, Cinahl, PubMed, and Scopus databases. A test was considered eligible only if it had been investigated by at least one study concerning both accuracy and reliability. The quality of the studies was evaluated by the QUADAS and QAREL scales. Six papers considering 333 LBP patients were included. The PLE was the most accurate and informative clinical test, with high sensitivity (0.84, 95% CI: 0.69 - 0.91) and high specificity (0.90, 95% CI: 0.85 - 0.97). The diagnostic accuracy of the AMP depends on each individual test. The PIT and the PST demonstrated fair to moderate sensitivity and specificity [PIT sensitivity = 0.71 (95% CI: 0.51 - 0.83), PIT specificity = 0.57 (95% CI: 0.39 - 0.78); PST sensitivity = 0.50 (95% CI: 0.41 - 0.76), PST specificity = 0.48 (95% CI: 0.22 - 0.58)]. The PLE showed good reliability (k = 0.76), but this result comes from a single study. The inter-rater reliability of the PIT ranged from slight (k = 0.10 and 0.04) to good (k = 0.87). The inter-rater reliability of the AMP ranged from slight (k = -0.07) to moderate (k = 0.64), whereas the inter-rater reliability of the PST was fair (k = 0.27). The data from the studies provided information on the methods used and suggest that the PLE is the most appropriate test to detect lumbar instability in specific LBP. 
However, due to the lack of available papers on other lumbar conditions, these findings should be confirmed with studies on non-specific LBP patients.
ERIC Educational Resources Information Center
Helms, LuAnn Sherbeck
This paper discusses the fact that reliability is about scores and not tests and how reliability limits effect sizes. The paper also explores the classical reliability coefficients of stability, equivalence, and internal consistency. Stability is concerned with how stable test scores will be over time, while equivalence addresses the relationship…
Wafer level reliability testing: An idea whose time has come
NASA Technical Reports Server (NTRS)
Trapp, O. D.
1987-01-01
Wafer level reliability testing has been nurtured in the DARPA supported workshops, held each autumn since 1982. The seeds planted in 1982 have produced an active crop of very large scale integration manufacturers applying wafer level reliability test methods. Computer Aided Reliability (CAR) is a new seed being nurtured. Users are now being awakened by the huge economic value of the wafer reliability testing technology.
Reliability approach to rotating-component design. [fatigue life and stress concentration
NASA Technical Reports Server (NTRS)
Kececioglu, D. B.; Lalli, V. R.
1975-01-01
A probabilistic methodology for designing rotating mechanical components using reliability to relate stress to strength is explained. The experimental test machines and data obtained for steel to verify this methodology are described. A sample mechanical rotating component design problem is solved by comparing a deterministic design method with the new design-by reliability approach. The new method shows that a smaller size and weight can be obtained for specified rotating shaft life and reliability, and uses the statistical distortion-energy theory with statistical fatigue diagrams for optimum shaft design. Statistical methods are presented for (1) determining strength distributions for steel experimentally, (2) determining a failure theory for stress variations in a rotating shaft subjected to reversed bending and steady torque, and (3) relating strength to stress by reliability.
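The idea of relating strength to stress by reliability can be illustrated with the textbook stress-strength interference calculation. This is a hedged sketch, not the report's procedure: it assumes strength and stress are independent and normally distributed, and the function name `interference_reliability` and the numbers used are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def interference_reliability(mean_strength, sd_strength, mean_stress, sd_stress):
    """P(strength > stress), assuming independent normal strength and stress."""
    # The margin (strength - stress) is normal with the difference of the
    # means and the root-sum-square of the two standard deviations.
    margin_sd = sqrt(sd_strength ** 2 + sd_stress ** 2)
    z = (mean_strength - mean_stress) / margin_sd
    return NormalDist().cdf(z)


# Illustrative values only (arbitrary units), not data from the report.
reliability = interference_reliability(100.0, 3.0, 90.0, 4.0)
print(f"reliability = {reliability:.5f}")
```

Under these assumptions, a designer can trade mean strength (and hence size and weight) against the target reliability instead of applying a fixed safety factor.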
Mash, Bob; Derese, Anselme
2013-01-01
Background: Competency-based education and the validity and reliability of workplace-based assessment of postgraduate trainees have received increasing attention worldwide. Family medicine was recognised as a speciality in South Africa six years ago and a satisfactory portfolio of learning is a prerequisite to sit the national exit exam. A massive scaling up of the number of family physicians is needed in order to meet the health needs of the country. Aim: The aim of this study was to develop a reliable, robust and feasible portfolio assessment tool (PAT) for South Africa. Methods: Six raters each rated nine portfolios from the Stellenbosch University programme, using the PAT, to test for inter-rater reliability. This rating was repeated three months later to determine test–retest reliability. Following initial analysis and feedback the PAT was modified and the inter-rater reliability again assessed on nine new portfolios. An acceptable intra-class correlation was considered to be > 0.80. Results: The total score was found to be reliable, with a coefficient of 0.92. For test–retest reliability, the difference in mean total score was 1.7%, which was not statistically significant. Amongst the subsections, only assessment of the educational meetings and the logbook showed reliability coefficients > 0.80. Conclusion: This was the first attempt to develop a reliable, robust and feasible national portfolio assessment tool to assess postgraduate family medicine training in the South African context. The tool was reliable for the total score, but the low reliability of several sections in the PAT helped us to develop 12 recommendations regarding the use of the portfolio, the design of the PAT and the training of raters.
Establishing a 'Physician's Spiritual Well-being Scale' and testing its reliability and validity.
Fang, C K; Li, P Y; Lai, M L; Lin, M H; Bridge, D T; Chen, H W
2011-01-01
The purpose of this study was to develop a Physician's Spiritual Well-Being Scale (PSpWBS). The significance of a physician's spiritual well-being was explored through in-depth interviews with and qualitative data collection from focus groups. Based on the results of qualitative analysis and related literature, the PSpWBS consisting of 25 questions was established. Reliability and validity tests were performed on 177 subjects. Four domains of the PSpWBS were devised: physician's characteristics; medical practice challenges; response to changes; and overall well-being. The explainable total variance was 65.65%. Cronbach α was 0.864 when the internal consistency of the whole scale was calculated. Factor analysis showed that the internal consistency Cronbach α value for each factor was between 0.625 and 0.794 and the split-half reliability was 0.865. The scale has satisfactory reliability and validity and could serve as the basis for assessment of the spiritual well-being of a physician.
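Cronbach's alpha, the internal-consistency coefficient reported above, can be computed from the item variances and the variance of the total score. The sketch below is illustrative only: the function name `cronbach_alpha` and the small score matrix are hypothetical and are not taken from the PSpWBS data.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha; item_scores holds one list of respondent scores per item."""
    k = len(item_scores)
    if k < 2:
        raise ValueError("at least two items are required")
    # Total score of each respondent across all items.
    totals = [sum(responses) for responses in zip(*item_scores)]
    # Sum of the sample variances of the individual items.
    item_variance_sum = sum(variance(item) for item in item_scores)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Three made-up items answered by five respondents (illustration only).
items = [
    [1, 2, 3, 4, 5],
    [2, 3, 4, 5, 6],
    [1, 3, 3, 5, 5],
]
alpha = cronbach_alpha(items)
print(f"alpha = {alpha:.3f}")
```

Values above roughly 0.70 are commonly treated as acceptable, which is the convention that several abstracts in this listing apply.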
Interobserver Reliability of the Total Body Score System for Quantifying Human Decomposition.
Dabbs, Gretchen R; Connor, Melissa; Bytheway, Joan A
2016-03-01
Several authors have tested the accuracy of the Total Body Score (TBS) method for quantifying decomposition, but none have examined the reliability of the method as a scoring system by testing interobserver error rates. Sixteen participants used the TBS system to score 59 observation packets including photographs and written descriptions of 13 human cadavers in different stages of decomposition (postmortem interval: 2-186 days). Data analysis used a two-way random model intraclass correlation in SPSS (v. 17.0). The TBS method showed "almost perfect" agreement between observers, with average absolute correlation coefficients of 0.990 and average consistency correlation coefficients of 0.991. While the TBS method may have sources of error, scoring reliability is not one of them. Individual component scores were examined, and the influences of education and experience levels were investigated. Overall, the trunk component scores were the least concordant. Suggestions are made to improve the reliability of the TBS method. © 2016 American Academy of Forensic Sciences.
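The distinction between absolute-agreement and consistency coefficients mentioned above can be made concrete with a two-way ANOVA decomposition. This is a minimal sketch under stated assumptions: it computes single-measure coefficients (the study may have reported average-measure values from SPSS), and the function name `icc_two_way` and the toy scores are hypothetical.

```python
def icc_two_way(scores):
    """Single-measure two-way ICCs for a complete subjects-by-raters table.

    Returns (absolute_agreement, consistency).
    """
    n = len(scores)       # number of subjects (rows)
    k = len(scores[0])    # number of raters (columns)
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Sums of squares for subjects, raters, and residual error.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    # Absolute agreement penalizes systematic rater offsets; consistency does not.
    agreement = (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )
    consistency = (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)
    return agreement, consistency


# Toy data: the second rater always scores one point higher than the first.
toy_scores = [[1, 2], [3, 4], [5, 6], [7, 8]]
agreement, consistency = icc_two_way(toy_scores)
print(f"agreement = {agreement:.3f}, consistency = {consistency:.3f}")
```

In the toy data the constant one-point offset leaves the consistency coefficient at its maximum while lowering the absolute-agreement coefficient, which is why the two values can differ for the same set of scores.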
Reliability of the Wii Balance Board in kayak.
Vando, Stefano; Laffaye, Guillaume; Masala, Daniele; Falese, Lavinia; Padulo, Johnny
2015-01-01
The seat of the kayaker represents the principal contact point for expressing mechanical energy. Therefore, we investigated the reliability of the Wii Balance Board measures in the kayak vs. on the ground. The Bland-Altman test showed a low systematic bias on the ground (2.85%) and in the kayak (-2.13%), while the intraclass correlation coefficient was 0.996. The Wii Balance Board is useful for assessing postural sway in the kayak.
Cross-cultural adaptation and validation of the neonatal/infant Braden Q risk assessment scale.
de Lima, Edson Luiz; de Brito, Maria José Azevedo; de Souza, Diba Maria Sebba Tosta; Salomé, Geraldo Magela; Ferreira, Lydia Masako
2016-02-01
To translate into Brazilian Portuguese and cross-culturally adapt the Neonatal/Infant Braden Q Risk Assessment Scale (Neonatal/Infant Braden Q Scale), and test the psychometric properties, reproducibility and validity of the instrument. There is a lack of studies on the development of pressure ulcers in children, especially in neonates. Thirty professionals participated in the cross-cultural adaptation of the Brazilian-Portuguese version of the scale. Fifty neonates of both sexes were assessed between July 2013 and June 2014. Reliability and reproducibility were tested in 20 neonates and construct validity was measured by correlating the Neonatal/Infant Braden Q Scale with the Braden Q Risk Assessment Scale (Braden Q Scale). Discriminant validity was assessed by comparing the scores of neonates with and without ulcers. The scale showed inter-rater reliability (ICC = 0.98; P < 0.001) and intra-rater reliability (ICC = 0.79; P < 0.001). A strong correlation was found between the Neonatal/Infant Braden Q Scale and Braden Q Scale (r = 0.96; P < 0.001). The cross-culturally adapted Brazilian version of the Neonatal/Infant Braden Q Scale is a reliable instrument, showing face, content and construct validity. Copyright © 2015 Tissue Viability Society. Published by Elsevier Ltd. All rights reserved.
Thickness effect of ultra-thin Ta2O5 resistance switching layer in 28 nm-diameter memory cell
NASA Astrophysics Data System (ADS)
Park, Tae Hyung; Song, Seul Ji; Kim, Hae Jin; Kim, Soo Gil; Chung, Suock; Kim, Beom Yong; Lee, Kee Jeung; Kim, Kyung Min; Choi, Byung Joon; Hwang, Cheol Seong
2015-11-01
Resistance switching (RS) devices with ultra-thin Ta2O5 switching layer (0.5-2.0 nm) with a cell diameter of 28 nm were fabricated. The performance of the devices was tested by voltage-driven current—voltage (I-V) sweep and closed-loop pulse switching (CLPS) tests. A Ta layer was placed beneath the Ta2O5 switching layer to act as an oxygen vacancy reservoir. The device with the smallest Ta2O5 thickness (0.5 nm) showed normal switching properties with gradual change in resistance in I-V sweep or CLPS and high reliability. By contrast, other devices with higher Ta2O5 thickness (1.0-2.0 nm) showed abrupt switching with several abnormal behaviours, degraded resistance distribution, especially in high resistance state, and much lower reliability performance. A single conical or hour-glass shaped double conical conducting filament shape was conceived to explain these behavioural differences that depended on the Ta2O5 switching layer thickness. Loss of oxygen via lateral diffusion to the encapsulating Si3N4/SiO2 layer was suggested as the main degradation mechanism for reliability, and a method to improve reliability was also proposed.
NASA Technical Reports Server (NTRS)
Vesely, William E.; Colon, Alfredo E.
2010-01-01
Design Safety/Reliability is associated with the probability of no failure-causing faults existing in a design. Confidence in the non-existence of failure-causing faults is increased by performing tests with no failure. Reliability-Growth testing requirements are based on initial assurance and fault detection probability. Using binomial tables generally gives too many required tests compared to reliability-growth requirements. Reliability-Growth testing requirements are based on reliability principles and factors and should be used.
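For context on the binomial baseline that the abstract contrasts with reliability-growth requirements, the classical zero-failure (success-run) calculation gives the number of failure-free tests needed to demonstrate a reliability R at confidence C, from the condition R^n <= 1 - C. This sketch shows only that baseline, not the authors' reliability-growth method; the function name `zero_failure_tests` is hypothetical.

```python
from math import ceil, log


def zero_failure_tests(reliability, confidence):
    """Smallest n with reliability**n <= 1 - confidence (no failures allowed)."""
    if not (0.0 < reliability < 1.0 and 0.0 < confidence < 1.0):
        raise ValueError("reliability and confidence must lie strictly between 0 and 1")
    return ceil(log(1.0 - confidence) / log(reliability))


# Illustrative targets only.
tests_for_99 = zero_failure_tests(0.99, 0.90)
tests_for_90 = zero_failure_tests(0.90, 0.90)
print(tests_for_99, tests_for_90)
```

The required count grows quickly as the reliability target approaches 1, which illustrates why a purely binomial demonstration can demand a large number of tests.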
Modeling Reliability Growth in Accelerated Stress Testing
2013-12-01
Modeling Reliability Growth in Accelerated Stress Testing. Dissertation. Jason K. Freels, Major...Defense, or the United States Government. AFIT-ENS-DS-13-D-02. Distribution unlimited.
Gutiérrez-Vilahú, Lourdes; Massó-Ortigosa, Núria; Rey-Abella, Ferran; Costa-Tutusaus, Lluís; Guerra-Balic, Myriam
2016-05-01
People with Down syndrome present skeletal abnormalities in their feet that can be analyzed by commonly used gold standard indices (the Hernández-Corvo index, the Chippaux-Smirak index, the Staheli arch index, and the Clarke angle) based on footprint measurements. The use of Photoshop CS5 software (Adobe Systems Software Ireland Ltd, Dublin, Ireland) to measure footprints has been validated in the general population. The present study aimed to assess the reliability and validity of this footprint assessment technique in the population with Down syndrome. Using optical podography and photography, 44 footprints from 22 patients with Down syndrome (11 men [mean ± SD age, 23.82 ± 3.12 years] and 11 women [mean ± SD age, 24.82 ± 6.81 years]) were recorded in a static bipedal standing position. A blinded observer performed the measurements using a validated manual method three times during the 4-month study, with 2 months between measurements. Test-retest was used to check the reliability of the Photoshop CS5 software measurements. Validity and reliability were obtained by intraclass correlation coefficient (ICC). The reliability test for all of the indices showed very good values for the Photoshop CS5 method (ICC, 0.982-0.995). Validity testing also found no differences between the techniques (ICC, 0.988-0.999). The Photoshop CS5 software method is reliable and valid for the study of footprints in young people with Down syndrome.
Kim, J K; Lim, H M
2015-02-01
The purpose of this study was to translate and culturally adapt the Carpal Tunnel Questionnaire to produce an equivalent Korean version. A total of 53 patients completed the Korean version of the Carpal Tunnel Questionnaire pre-operatively and 3 months after open carpal tunnel release. All 53 also completed the Korean version of the Disabilities of Arm, Shoulder, and Hand questionnaire pre-operatively and 3 months post-operatively. Reliability was measured by determining the test-retest reliability and internal consistency. Test-retest reliability was assessed using intraclass correlation coefficients and paired t-tests, and internal consistency using Cronbach's alpha coefficients. Pearson correlation analysis was carried out on the Korean version of the Carpal Tunnel Questionnaire scores and the Korean version of the Disabilities of Arm, Shoulder, and Hand scores to assess construct validity. Responsiveness was evaluated using effect sizes and standardized response means. The reliability of the Korean version of the Carpal Tunnel Questionnaire was good. The scores in the Korean version of the Disabilities of Arm, Shoulder, and Hand strongly correlated with the scores in the Korean version of the Carpal Tunnel Questionnaire. Standardized response mean and effect size were both large for the Korean version of the Carpal Tunnel Questionnaire. The study shows that the Korean version of the Carpal Tunnel Questionnaire is a reliable, valid and responsive instrument for measuring outcomes in carpal tunnel syndrome. © The Author(s) 2014.
Test-retest and interrater reliability of the functional lower extremity evaluation.
Haitz, Karyn; Shultz, Rebecca; Hodgins, Melissa; Matheson, Gordon O
2014-12-01
Repeated-measures clinical measurement reliability study. To establish the reliability and face validity of the Functional Lower Extremity Evaluation (FLEE). The FLEE is a 45-minute battery of 8 standardized functional performance tests that measures 3 components of lower extremity function: control, power, and endurance. The reliability and normative values for the FLEE in healthy athletes are unknown. A face validity survey for the FLEE was sent to sports medicine personnel to evaluate the level of importance and frequency of clinical usage of each test included in the FLEE. The FLEE was then administered and rated for 40 uninjured athletes. To assess test-retest reliability, each athlete was tested twice, 1 week apart, by the same rater. To assess interrater reliability, 3 raters scored each athlete during 1 of the testing sessions. Intraclass correlation coefficients were used to assess the test-retest and interrater reliability of each of the FLEE tests. In the face validity survey, the FLEE tests were rated as highly important by 58% to 71% of respondents but frequently used by only 26% to 45% of respondents. Interrater reliability intraclass correlation coefficients ranged from 0.83 to 1.00, and test-retest reliability ranged from 0.71 to 0.95. The FLEE tests are considered clinically important for assessing lower extremity function by sports medicine personnel but are underused. The FLEE also is a reliable assessment tool. Future studies are required to determine if use of the FLEE to make return-to-play decisions may reduce reinjury rates.
Demonstration of Essential Reliability Services by a 300-MW Solar Photovoltaic Power Plant
DOE Office of Scientific and Technical Information (OSTI.GOV)
Loutan, Clyde; Klauer, Peter; Chowdhury, Sirajul
The California Independent System Operator (CAISO), First Solar, and the National Renewable Energy Laboratory (NREL) conducted a demonstration project on a large utility-scale photovoltaic (PV) power plant in California to test its ability to provide essential ancillary services to the electric grid. With increasing shares of solar- and wind-generated energy on the electric grid, traditional generation resources equipped with automatic governor control (AGC) and automatic voltage regulation controls -- specifically, fossil thermal -- are being displaced. The deployment of utility-scale, grid-friendly PV power plants that incorporate advanced capabilities to support grid stability and reliability is essential for the large-scale integration of PV generation into the electric power grid, among other technical requirements. A typical PV power plant consists of multiple power electronic inverters and can contribute to grid stability and reliability through sophisticated 'grid-friendly' controls. In this way, PV power plants can be used to mitigate the impact of variability on the grid, a role typically reserved for conventional generators. In August 2016, testing was completed on First Solar's 300-MW PV power plant, and a large amount of test data was produced and analyzed that demonstrates the ability of PV power plants to use grid-friendly controls to provide essential reliability services. These data showed how the development of advanced power controls can enable PV to become a provider of a wide range of grid services, including spinning reserves, load following, voltage support, ramping, frequency response, variability smoothing, and frequency regulation to power quality. Specifically, the tests conducted included various forms of active power control such as AGC and frequency regulation; droop response; and reactive power, voltage, and power factor controls. 
This project demonstrated that advanced power electronics and solar generation can be controlled to contribute to system-wide reliability. It was shown that the First Solar plant can provide essential reliability services related to different forms of active and reactive power controls, including plant participation in AGC, primary frequency control, ramp rate control, and voltage regulation. For AGC participation in particular, by comparing the PV plant testing results to the typical performance of individual conventional technologies, we showed that regulation accuracy by the PV plant is 24-30 points better than fast gas turbine technologies. The plant's ability to provide volt-ampere reactive control during periods of extremely low power generation was demonstrated as well. The project team developed a pioneering demonstration concept and test plan to show how various types of active and reactive power controls can leverage PV generation's value from being a simple variable energy resource to a resource that provides a wide range of ancillary services. With this project's approach to a holistic demonstration on an actual, large, utility-scale, operational PV power plant and dissemination of the obtained results, the team sought to close some gaps in perspectives that exist among various stakeholders in California and nationwide by providing real test data.
Kudo, Yuka; Nakagawa, Atsuo; Tamura, Noriko; Kato, Noriko; Williams, Aya; Aida, Nobuo; Mimura, Masaru
2016-07-01
Parker et al. (2006) proposed a new approach to classify specific sub-types of non-melancholic depression caused by various stress factors and premorbid personality styles: the Temperament and Personality Questionnaire (T&P). The current study aim was to develop the Japanese version of the T&P and evaluate its reliability and validity. We studied 114 patients with non-melancholic depression. Reliability was assessed using the test-retest method. Convergent validity of the T&P was compared with the clinician ratings of each patient for the eight personality traits. We also assessed the impact of depressive state on the T&P. The test-retest intraclass correlation coefficients among eight constructs of the T&P ranged from 0.77 to 0.89, indicating good-to-excellent reliability. Anxious Worrying (rho=0.29), Perfectionism (rho=0.17), Personal Reserve (rho=0.18), Irritability (rho=0.38), and Social Avoidance (rho=0.32) showed adequate levels of convergent validity; Rejection Sensitivity (rho=0.16), Self-criticism (rho=-0.02), and Self-focus (rho=0.07) showed relatively weak convergent validity. Perfectionism (rho=-0.06), Social Avoidance (rho=0.17), Anxious Worrying (rho=0.40), Personal Reserve (rho=0.30), Irritability (rho=0.28), Rejection Sensitivity (rho=0.35), Self-criticism (rho=0.49), and Self-focus (rho=0.24) showed minimal sensitivity to mood state effects. Only one site was used. While a Likert scale was used, the clinician-rated personality trait measure had not been validated. The J-T&P is a reliable and valid measure for assessing temperament and personality in Japanese patients with non-melancholic depression. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.
A Human Reliability Based Usability Evaluation Method for Safety-Critical Software
DOE Office of Scientific and Technical Information (OSTI.GOV)
Phillippe Palanque; Regina Bernhaupt; Ronald Boring
2006-04-01
Recent years have seen an increasing use of sophisticated interaction techniques, including in the field of safety-critical interactive software [8]. The use of such techniques has been required in order to increase the bandwidth between the users and systems and thus to help them deal efficiently with increasingly complex systems. These techniques come from research and innovation done in the field of human-computer interaction (HCI). A significant effort is currently being undertaken by the HCI community in order to apply and extend current usability evaluation techniques to these new kinds of interaction techniques. However, very little has been done to improve the reliability of software offering these kinds of interaction techniques. Even testing basic graphical user interfaces remains a challenge that has rarely been addressed in the field of software engineering [9]. However, the non-reliability of interactive software can jeopardize usability evaluation by showing unexpected or undesired behaviors. The aim of this SIG is to provide a forum for both researchers and practitioners interested in testing interactive software. Our goal is to define a roadmap of activities to cross-fertilize usability and reliability testing of these kinds of systems to minimize duplicate efforts in both communities.
Buntragulpoontawee, Montana; Phutrit, Suphatha; Tongprasert, Siam; Wongpakaran, Tinakon; Khunachiva, Jeeranan
2018-03-27
This study evaluated additional psychometric properties of the Thai version of the Disabilities of the Arm, Shoulder and Hand questionnaire (DASH-TH), namely test-retest reliability, construct validity, and internal consistency, in patients with carpal tunnel syndrome. To determine construct validity, the Thai EuroQOL questionnaire (EQ-5D-5L) was also administered in order to examine convergent and divergent validity. Fifty patients completed both questionnaires. The DASH-TH showed excellent test-retest reliability (intraclass correlation coefficient = 0.811) and internal consistency (Cronbach's alpha = 0.911). The exploratory factor analysis yielded a six-factor solution, while the confirmatory factor analysis indicated that the hypothesized model adequately fit the data, with a comparative fit index of 0.967 and a Tucker-Lewis index of 0.964. The related subscales between the DASH-TH and the Thai EQ-5D-5L were significantly correlated, indicating the DASH-TH's convergent and discriminant validity. The DASH-TH demonstrated good reliability, internal consistency, construct validity, and multidimensionality in assessing upper extremity function in carpal tunnel syndrome patients.
An Update on the Clinical Utility of the Children's Post-Traumatic Cognitions Inventory.
McKinnon, Anna; Smith, Patrick; Bryant, Richard; Salmon, Karen; Yule, William; Dalgleish, Tim; Dixon, Clare; Nixon, Reginald D V; Meiser-Stedman, Richard
2016-06-01
The Children's Post-Traumatic Cognitions Inventory (CPTCI) is a self-report questionnaire that measures maladaptive cognitions in children and young people following exposure to trauma. In this study, the psychometric properties of the CPTCI were examined in further detail with the objective of furthering its utility as a clinical tool. Specifically, we investigated the CPTCI's discriminant validity, test-retest reliability, and the potential for the development of a short form of the measure. Three samples (London, East Anglia, Australia) of children and young people exposed to trauma (N = 535; 7-17 years old) completed the CPTCI and a structured clinical interview to measure posttraumatic stress disorder (PTSD) symptoms between 1 and 6 months following trauma. Test-retest reliability was investigated in a subsample of 203 cases. The results showed that a score in the range of 46 to 48 on the CPTCI was indicative of clinically significant appraisals as determined by the presence of PTSD. The measure also had moderate-to-high test-retest reliability (r = .78) over a 2-month period. The Children's Post-Traumatic Cognitions Inventory-Short Form (CPTCI-S) had excellent internal consistency (α = .92), and moderate-to-high test-retest reliability (r = .78). The examination of construct validity showed the model had an excellent fitting factor structure (Comparative Fit index = 0.95, Tucker-Lewis index = 0.91, Root Mean Square Error of Approximation = .07). A score ranging from 16 to 18 was the best cutoff point on the CPTCI-S, in that it was indicative of clinically significant appraisals as determined by the presence of PTSD. Based on these results, we concluded that the CPTCI is a useful tool to support the practice of clinicians and that the CPTCI-S has excellent psychometric properties. Copyright © 2016 International Society for Traumatic Stress Studies.
Castro-Díaz, D M; Esteban-Fuertes, M; Salinas-Casado, J; Bustamante-Alarma, S; Gago-Ramos, J L; Galacho-Bech, A; García-Matres, M J; Rodríguez-Toves, L A; Zubiaur-Líbano, C; Collado-Serra, A; Batista-Miranda, J E; Ortiz-Gámiz, A
2014-03-01
To evaluate the psychometric properties of the Spanish version of the ICIQ-Male Lower Urinary Tract Symptoms Questionnaire (ICIQ-MLUTS): Feasibility (% of completion and ceiling/ground effects), reliability (Test-retest), convergent validity (vs Bladder Control Self-Assessment Questionnaire [BSAQ] and vs International Prostate Symptom Score [I-PSS]) and criterion validity (according to presence or absence of symptoms). This was an observational, non-interventionist and multicenter study. 223 male patients with lower urinary tract symptoms (LUTS), predominantly storage symptoms and aged 18-65, took part in the study. Patients completed the ICIQ-MLUTS (test), I-PSS and BSAQ questionnaires and referred their urinary symptoms in a single visit, with the exception of a subgroup composed by 49 patients that completed the questionnaire again 15 days after initial visit to evaluate test-retest reliability. The questionnaire includes 13 items divided in 2 sub-scales: Voiding symptoms (V) from 0-20 and Incontinence symptoms (I) from 0-24. Percentage of patients that completed all items: 98.84%. Ground effect is 0 and ceiling effect was under 6% in both sub-scales. Test-retest reliability: Intraclass correlation coefficient (ICC) ranged from 0.68 to 0.88, except on Delay. Kappa shows a good agreement, between 0.60 and 0.81, except for Nocturia. Convergent validity: Correlation (Spearman) between the questionnaire sub-scales scores and the rest of measures is statistically significant (P < .01 and P < .05). Criterion validity: Statistically significant differences (P < .05) between scores on ICIQ-MLUTS, from patients that refer experiencing symptoms and those who do not. The Spanish version of the ICIQ-MLUTS questionnaire shows adequate feasibility, reliability and validity. Copyright © 2013 AEU. Published by Elsevier Espana. All rights reserved.
Than, Christian; Seidl, Laura; Tosovic, Danijel; Brown, J Mark
2018-05-12
This study investigated test-retest reliability of mechanomyography (MMG) on lumbar paraspinal muscles. Healthy male and female subjects (mean ± standard deviation, 25 ± 9.4 years, BMI 21.8 ± 2.99, n = 34) were recruited. Two test sessions (one week apart) consisted of MMG (laser displacement sensor (LDS)) muscle evaluations over the 10 lumbar facet joints, and 2 bilateral sacral sites, in anatomical extension and flexion. Two-way repeated measures ANOVA with Tukey's post hoc showed no significant differences between testing sessions for the same position (p > 0.05). The intra-class correlation coefficients (ICCs) in extension were classified as 'very good' (0.8-0.9) for maximal muscle displacement (Dmax), contraction time (Tc) and velocity of contraction (Vr). Half relaxation time (½Tr) and half relaxation velocity (½Vr) were 'poor' (0.4-0.5) and 'good' (0.7-0.8). In flexion, Dmax, Tc and Vr were 'excellent' (≥0.9) whilst ½Tr and ½Vr were 'fair' (0.6-0.7) and 'very good'. Comparing extension against flexion, significant (p < 0.05) differences in Dmax and ½Vr were found (L1/L2-L5/S1). Tc was significant (p < 0.05) for all sites whilst Vc was for L1/L2 on both sides (p < 0.05). ½Tr showed no significance (p > 0.05). Most MMG-derived parameters thus appear as reliable measures of muscle contractile properties in lumbar extension and flexion, with flexion providing more reliable results (ICCs). Copyright © 2018. Published by Elsevier Ltd.
A Monte Carlo Simulation Study of the Reliability of Intraindividual Variability
Estabrook, Ryne; Grimm, Kevin J.; Bowles, Ryan P.
2012-01-01
Recent research has seen intraindividual variability (IIV) become a useful technique to incorporate trial-to-trial variability into many types of psychological studies. IIV as measured by individual standard deviations (ISDs) has shown unique prediction to several types of positive and negative outcomes (Ram, Rabbit, Stollery, & Nesselroade, 2005). One unanswered question regarding measuring intraindividual variability is its reliability and the conditions under which optimal reliability is achieved. Monte Carlo simulation studies were conducted to determine the reliability of the ISD compared to the intraindividual mean. The results indicate that ISDs generally have poor reliability and are sensitive to insufficient measurement occasions, poor test reliability, and unfavorable amounts and distributions of variability in the population. Secondary analysis of psychological data shows that use of individual standard deviations in unfavorable conditions leads to a marked reduction in statistical power, although careful adherence to underlying statistical assumptions allows their use as a basic research tool. PMID:22268793
Predictive models of safety based on audit findings: Part 1: Model development and reliability.
Hsiao, Yu-Lin; Drury, Colin; Wu, Changxu; Paquet, Victor
2013-03-01
This consecutive study was aimed at the quantitative validation of safety audit tools as predictors of safety performance, as we were unable to find prior studies that tested audit validity against safety outcomes. An aviation maintenance domain was chosen for this work as both audits and safety outcomes are currently prescribed and regulated. In Part 1, we developed a Human Factors/Ergonomics classification framework based on HFACS model (Shappell and Wiegmann, 2001a,b), for the human errors detected by audits, because merely counting audit findings did not predict future safety. The framework was tested for measurement reliability using four participants, two of whom classified errors on 1238 audit reports. Kappa values leveled out after about 200 audits at between 0.5 and 0.8 for different tiers of errors categories. This showed sufficient reliability to proceed with prediction validity testing in Part 2. Copyright © 2012 Elsevier Ltd and The Ergonomics Society. All rights reserved.
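The abstract above reports Kappa values for two raters classifying audit findings. As a rough illustration of how such an agreement statistic is obtained, the sketch below computes unweighted Cohen's kappa for two raters. The function name and the category labels are hypothetical, and the abstract does not say whether the authors used this exact variant.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Proportion of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal label frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical labels for four audit findings (not data from the study).
rater_1 = ["skill", "skill", "rule", "rule"]
rater_2 = ["skill", "rule", "rule", "rule"]
kappa = cohens_kappa(rater_1, rater_2)
```

Kappa corrects raw percentage agreement for the agreement two raters would reach by chance, which is why it is preferred for categorical items.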
Effects of extended lay-off periods on performance and operator trust under adaptable automation.
Chavaillaz, Alain; Wastell, David; Sauer, Jürgen
2016-03-01
Little is known about the long-term effects of system reliability when operators do not use a system during an extended lay-off period. To examine threats to skill maintenance, 28 participants operated twice a simulation of a complex process control system for 2.5 h, with an 8-month retention interval between sessions. Operators were provided with an adaptable support system, which operated at one of the following reliability levels: 60%, 80% or 100%. Results showed that performance, workload, and trust remained stable at the second testing session, but operators lost self-confidence in their system management abilities. Finally, the effects of system reliability observed at the first testing session were largely found again at the second session. The findings overall suggest that adaptable automation may be a promising means to support operators in maintaining their performance at the second testing session. Copyright © 2015 Elsevier Ltd and The Ergonomics Society. All rights reserved.
FACTOR ANALYSIS OF A SOCIAL SKILLS SCALE FOR HIGH SCHOOL STUDENTS.
Wang, H-Y; Lin, C-K
2015-10-01
The objective of this study was to develop a social skills scale for high school students in Taiwan. This study adopted stratified random sampling. A total of 1,729 high school students were included. The students ranged in age from 16 to 18 years. A Social Skills Scale was developed for this study and was designed for classroom teachers to fill out. The test-retest reliability of this scale was tested by Pearson's correlation coefficient. Exploratory factor analysis was used to determine construct validity. The Social Skills Scale had good overall test-retest reliability of .92, and the internal consistency of the five subscales was above .90. The results of the factor analysis showed that the Social Skills Scale covered the five domains of classroom learning skills, communication skills, individual initiative skills, interaction skills, and job-related social skills, and the five factors explained 68.34% of the variance. Thus, the Social Skills Scale had good reliability and validity and would be applicable to and could be promoted for use in schools.
Lange, Toni; Freiberg, Alice; Dröge, Patrik; Lützner, Jörg; Schmitt, Jochen; Kopkow, Christian
2015-06-01
Systematic literature review. Despite their frequent application in routine care, a systematic review on the reliability of clinical examination tests to evaluate the integrity of the ACL is missing. To summarize and evaluate intra- and interrater reliability research on physical examination tests used for the diagnosis of ACL tears. A comprehensive systematic literature search was conducted in MEDLINE, EMBASE and AMED until May 30th 2013. Studies were included if they assessed the intra- and/or interrater reliability of physical examination tests for the integrity of the ACL. Methodological quality was evaluated with the Quality Appraisal of Reliability Studies (QAREL) tool by two independent reviewers. 110 hits were achieved of which seven articles finally met the inclusion criteria. These studies examined the reliability of four physical examination tests. Intrarater reliability was assessed in three studies and ranged from fair to almost perfect (Cohen's k = 0.22-1.00). Interrater reliability was assessed in all included studies and ranged from slight to almost perfect (Cohen's k = 0.02-0.81). The Lachman test is the physical tests with the highest intrarater reliability (Cohen's k = 1.00), the Lachman test performed in prone position the test with the highest interrater reliability (Cohen's k = 0.81). Included studies were partly of low methodological quality. A meta-analysis could not be performed due to the heterogeneity in study populations, reliability measures and methodological quality of included studies. Systematic investigations on the reliability of physical examination tests to assess the integrity of the ACL are scarce and of varying methodological quality. Copyright © 2014 Elsevier Ltd. All rights reserved.
Developing self-concept instrument for pre-service mathematics teachers
NASA Astrophysics Data System (ADS)
Afgani, M. W.; Suryadi, D.; Dahlan, J. A.
2018-01-01
This study aimed to develop a self-concept instrument for undergraduate students of mathematics education in Palembang, Indonesia. The study was development research on a non-test instrument in questionnaire form. The validity of the instrument was assessed through construct validity testing using the Pearson product moment correlation and factor analysis, while reliability was assessed using Cronbach’s alpha. The instrument was tested on 65 undergraduate students of mathematics education at one of the universities in Palembang, Indonesia. The instrument consisted of 43 items covering 7 aspects of self-concept: individual concern, social identity, individual personality, view of the future, the influence of others who become role models, the influence of the environment inside or outside the classroom, and view of mathematics. The validity test showed one invalid item, because its Pearson’s r of 0.107 was less than the critical value (0.244; α = 0.05). The item belonged to the social identity aspect. After the invalid item was removed, the construct validity test with factor analysis generated only one factor. The Kaiser-Meyer-Olkin (KMO) coefficient was 0.846 and the reliability coefficient was 0.91. From these results, we concluded that the self-concept instrument for undergraduate students of mathematics education in Palembang, Indonesia was valid and reliable with 42 items.
Reliable change on the Boston naming test.
Sachs, Bonnie C; Lucas, John A; Smith, Glenn E; Ivnik, Robert J; Petersen, Ronald C; Graff-Radford, Neill R; Pedraza, Otto
2012-03-01
Serial assessments are commonplace in neuropsychological practice and used to document cognitive trajectory for many clinical conditions. However, true change scores may be distorted by measurement error, repeated exposure to the assessment instrument, or person variables. The present study provides reliable change indices (RCI) for the Boston Naming Test, derived from a sample of 844 cognitively normal adults aged 56 years and older. All participants were retested between 9 and 24 months after their baseline exam. Results showed that a 4-point decline during a 9-15 month retest period or a 6-point decline during a 16-24 month retest period represents reliable change. These cutoff values were further characterized as a function of a person's age and family history of dementia. These findings may help clinicians and researchers to characterize with greater precision the temporal changes in confrontation naming ability.
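The abstract above does not state which reliable change formula was used. As a hedged illustration only, the sketch below implements one common form (the Jacobson-Truax index), which scales an observed change by the standard error of the difference derived from the baseline standard deviation and the test-retest reliability. The function name and all numeric inputs are hypothetical and are not taken from the study.

```python
import math


def reliable_change_index(baseline, retest, sd_baseline, retest_reliability):
    """Jacobson-Truax style reliable change index (one common variant)."""
    if not 0.0 <= retest_reliability < 1.0:
        raise ValueError("retest_reliability must be in [0, 1)")
    # Standard error of measurement for a single score.
    sem = sd_baseline * math.sqrt(1.0 - retest_reliability)
    # Standard error of the difference between two scores.
    se_diff = math.sqrt(2.0) * sem
    return (retest - baseline) / se_diff


# Hypothetical inputs: SD of 4 points, test-retest reliability of 0.75,
# and a 6-point decline between baseline and retest.
rci = reliable_change_index(baseline=56, retest=50, sd_baseline=4.0, retest_reliability=0.75)
# |RCI| above 1.645 is often read as reliable change at the 90% level.
is_reliable_decline = rci < -1.645
```

This simple form ignores practice effects; variants that subtract the mean retest gain before scaling are also widely used.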
Cross-cultural Adaption and Validation of the Danish Voice Handicap Index.
Sorensen, Jesper Roed; Printz, Trine; Mehlum, Camilla Slot; Heidemann, Christian Hamilton; Groentved, Aagot Moeller; Godballe, Christian
2018-02-02
We aimed to assess psychometric properties, including internal consistency, reliability, and clinical validity of the Danish version of the Voice Handicap Index (VHI). A cross-sectional survey study was carried out. For validation, the existing nonvalidated Danish version of the VHI was used. Data from 208 patients with voice disorders of different etiology (neurogenic, functional, and structural) and a control group of 85 vocally healthy individuals were included. A test-retest reliability analysis of 42 patients and 45 control persons was performed. The internal consistency, test-retest reliability, and clinical validity of the questionnaire were assessed. Internal consistency was high with a Cronbach α >0.90 for both the patient and control group. Test-retest reliability measured as intraclass correlation coefficient was good, with 0.93 (95% confidence interval [CI]: 0.87-0.96) for patients and 0.78 (95% CI: 0.63-0.87) for the control group, which indicates sufficient reliability of the questionnaire. The Danish VHI has good clinical validity, with a Spearman correlation of 0.69 between patients' perception of the severity of their voice disorder and the VHI score. The existing Danish version of the VHI has been thoroughly validated and found to be in line with the original VHI from Jacobsen et al. It showed good internal consistency, test-retest reliability, and clinical validity. It is suitable for use in daily practice and in research projects as it is able to assess patients' perception of their voice disorder severity. Copyright © 2018 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
Validation of Hindi translation of DSM-5 level 1 cross-cutting symptom measure.
Goel, Ankit; Kataria, Dinesh
2018-04-01
The DSM-5 Level 1 Cross-Cutting Symptom Measure is a self- or informant-rated measure that assesses mental health domains which are important across psychiatric diagnoses. The absence of this self- or informant-administered instrument in Hindi, which is a major language in India, is an important limitation in using this scale. To translate the English version of the DSM-5 Level 1 Cross-Cutting Symptom Measure to Hindi and evaluate its psychometric properties. The study was conducted at a tertiary care hospital in Delhi. The DSM-5 Level 1 Cross-Cutting Symptom Measure was translated into Hindi using the World Health Organization's translation methodology. Mean and standard deviation were evaluated for continuous variables while for categorical variables frequency and percentages were calculated. The translated version was evaluated for cross-language equivalence, test-retest reliability, internal consistency, and split half reliability. Hindi version was found to have good cross-language equivalence and test-retest reliability at the level of items and domains. Twenty two of the 23 items and all the 23 items had a significant correlation (ρ < 0.001) in cross language concordance and test-retest reliability data, respectively. The Cronbach's alpha was 0.95, and the Spearman-Brown Sphericity value was 0.79 for the Hindi version. The present study shows that cross-language concordance, internal consistency, split-half reliability, and test-retest reliability of the Hindi version of the measure are excellent. Thus, the Hindi version of DSM-5 Level 1 Cross-Cutting Symptom Measure as translated in this study is a valid instrument. Copyright © 2018 Elsevier B.V. All rights reserved.
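Internal consistency and split-half reliability, both reported in the abstract above, can be computed from an item-by-respondent score matrix. The sketch below shows the textbook formulas for Cronbach's alpha and the Spearman-Brown correction of a half-test correlation. The function names and the toy scores are hypothetical, and the abstract does not describe the authors' exact procedure.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha; `rows` holds one list of item scores per respondent."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sum of the variances of the individual items (columns).
    item_variance_sum = sum(variance(column) for column in zip(*rows))
    # Variance of each respondent's total score.
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1.0 - item_variance_sum / total_variance)


def spearman_brown(half_correlation):
    """Step up a correlation between two half-tests to full-test reliability."""
    return 2.0 * half_correlation / (1.0 + half_correlation)


# Hypothetical scores: three respondents, two perfectly consistent items.
toy_scores = [[1, 1], [2, 2], [3, 3]]
alpha = cronbach_alpha(toy_scores)
stepped_up = spearman_brown(0.5)
```

Alpha rises as items covary more strongly relative to their individual variances, so a value near 1 can also signal redundant items rather than a better scale.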
Perez, Concepcion; Galvez, Rafael; Huelbes, Silvia; Insausti, Joaquin; Bouhassira, Didier; Diaz, Silvia; Rejas, Javier
2007-01-01
Background: This study assesses the validity and reliability of the Spanish version of DN4 questionnaire as a tool for differential diagnosis of pain syndromes associated to a neuropathic (NP) or somatic component (non-neuropathic pain, NNP). Methods: A study was conducted consisting of two phases: cultural adaptation into the Spanish language by means of conceptual equivalence, including forward and backward translations in duplicate and cognitive debriefing, and testing of psychometric properties in patients with NP (peripheral, central and mixed) and NNP. The analysis of psychometric properties included reliability (internal consistency, inter-rater agreement and test-retest reliability) and validity (ROC curve analysis, agreement with the reference diagnosis and determination of sensitivity, specificity, and positive and negative predictive values in different subsamples according to type of NP). Results: A sample of 164 subjects (99 women, 60.4%; age: 60.4 ± 16.0 years), 94 (57.3%) with NP (36 with peripheral, 32 with central, and 26 with mixed pain) and 70 with NNP was enrolled. The questionnaire was reliable [Cronbach's alpha coefficient: 0.71, inter-rater agreement coefficient: 0.80 (0.71–0.89), and test-retest intra-class correlation coefficient: 0.95 (0.92–0.97)] and valid for a cut-off value ≥ 4 points, which was the best value to discriminate between NP and NNP subjects. Discussion: This study, representing the first validation of the DN4 questionnaire into another language different than the original, not only supported its high discriminatory value for identification of neuropathic pain, but also provided supplemental psychometric validation (i.e. test-retest reliability, influence of educational level and pain intensity) and showed its validity in mixed pain syndromes. PMID:18053212
Terada, Tasuku; Loehr, Sarah; Guigard, Emmanuel; McCargar, Linda J; Bell, Gordon J; Senior, Peter; Boulé, Normand G
2014-08-01
This study determined the test-retest reliability of a continuous glucose monitoring system (CGMS) (iPro™2; Medtronic, Northridge, CA) under standardized conditions in individuals with type 2 diabetes (T2D). Fourteen individuals with T2D spent two nonconsecutive days in a calorimetry unit. On both days, meals, medication, and exercise were standardized. Glucose concentrations were measured continuously by CGMS, from which daily mean glucose concentration (GLU(mean)), time spent in hyperglycemia (t(>10.0 mmol/L)), and meal, exercise, and nocturnal mean glucose concentrations, as well as glycemic variability (SD(w), percentage coefficient of variation [%cv(w)], mean amplitude of glycemic excursions [MAGEc, MAGE(ave), and MAGE(abs.gos)], and continuous overlapping net glycemic action [CONGA(n)]) were estimated. Absolute and relative reliabilities were investigated using coefficient of variation (CV) and intraclass correlation, respectively. Relative reliability ranged from 0.77 to 0.95 (P<0.05) for GLU(mean) and meal, exercise, and nocturnal glycemia with CV ranging from 3.9% to 11.7%. Despite significant relative reliability (R=0.93; P<0.01), t(>10.0 mmol/L) showed larger CV (54.7%). Among the different glycemic variability measures, a significant between-day difference was observed in MAGEc, MAGE(ave), CONGA6, and CONGA12. The remaining measures (i.e., SD(w), %cv(w), MAGE(abs.gos), and CONGA1-4) indicated no between-day differences and significant relative reliability. In individuals with T2D, CGMS-estimated glycemic profiles were characterized by high relative and absolute reliability for both daily and shorter-term measurements as represented by GLU(mean) and meal, exercise, and nocturnal glycemia. Among the different methods to calculate glycemic variability, our results showed SD(w), %cv(w), MAGE(abs.gos), and CONGA(n) with n ≤ 4 were reliable measures. These results suggest the usefulness of CGMS in clinical trials utilizing repeated measures.
Reliability and Validity Assessment of a Linear Position Transducer
Garnacho-Castaño, Manuel V.; López-Lastra, Silvia; Maté-Muñoz, José L.
2015-01-01
The objectives of the study were to determine the validity and reliability of peak velocity (PV), average velocity (AV), peak power (PP) and average power (AP) measurements made using a linear position transducer. Validity was assessed by comparing measurements simultaneously obtained using the Tendo Weightlifting Analyzer System and T-Force Dynamic Measurement System (Ergotech, Murcia, Spain) during two resistance exercises, bench press (BP) and full back squat (BS), performed by 71 trained male subjects. For the reliability study, a further 32 men completed both lifts using the Tendo Weightlifting Analyzer System in two identical testing sessions one week apart (session 1 vs. session 2). Intraclass correlation coefficients (ICCs) indicating the validity of the Tendo Weightlifting Analyzer System were high, with values ranging from 0.853 to 0.989. Systematic biases and random errors were low to moderate for almost all variables, being higher in the case of PP (bias ±157.56 W; error ±131.84 W). Proportional biases were identified for almost all variables. Test-retest reliability was strong with ICCs ranging from 0.922 to 0.988. Reliability results also showed minimal systematic biases and random errors, which were only significant for PP (bias -19.19 W; error ±67.57 W). Only PV recorded in the BS showed no significant proportional bias. The Tendo Weightlifting Analyzer System emerged as a reliable system for measuring movement velocity and estimating power in resistance exercises. The low biases and random errors observed here (mainly AV, AP) make this device a useful tool for monitoring resistance training. Key points: This study determined the validity and reliability of peak velocity, average velocity, peak power and average power measurements made using a linear position transducer. The Tendo Weightlifting Analyzer System emerged as a reliable system for measuring movement velocity and power. PMID:25729300
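Several abstracts in this listing, including the one above, report test-retest ICCs without naming the ICC form. As one possible illustration, the sketch below computes ICC(2,1) (two-way random effects, absolute agreement, single measurement) from a subjects-by-sessions table using the usual ANOVA mean squares. The function name and the toy velocities are hypothetical, and the studies may have used a different ICC model.

```python
def icc_2_1(table):
    """ICC(2,1): rows are subjects, columns are repeated sessions."""
    n = len(table)          # number of subjects
    k = len(table[0])       # number of sessions
    if n < 2 or k < 2:
        raise ValueError("need at least two subjects and two sessions")
    grand = sum(sum(row) for row in table) / (n * k)
    row_means = [sum(row) / k for row in table]
    col_means = [sum(col) / n for col in zip(*table)]
    # Sums of squares for subjects, sessions, and residual error.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in table for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical peak velocities (m/s) for three lifters in two sessions,
# with a constant 1-unit shift between sessions.
toy_sessions = [[1.0, 2.0], [2.0, 3.0], [3.0, 4.0]]
icc = icc_2_1(toy_sessions)
```

Because this form measures absolute agreement, a systematic shift between sessions lowers the coefficient even when subjects keep exactly the same rank order.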
Perceiving numbers does not cause automatic shifts of spatial attention.
Fattorini, Enrico; Pinto, Mario; Rotondaro, Francesca; Doricchi, Fabrizio
2015-12-01
It is frequently assumed that the brain codes number magnitudes according to an inherent left-to-right spatial organization. In support of this hypothesis it has been reported that in humans, perceiving small numbers induces automatic shifts of attention toward the left side of space whereas perceiving large numbers automatically shifts attention to the right side of space (i.e., Attentional SNARC: Att-SNARC; Fischer, Castel, Dodd, & Pratt, 2003). Nonetheless, the Att-SNARC has been often not replicated and its reliability never tested. To ascertain whether the mere perception of numbers causes shifts of spatial attention or whether number-space interaction takes place at a different stage of cognitive processing, we re-assessed the consistency and reliability of the Att-SNARC and investigated its role in the production of SNARC effects in Parity Judgement (PJ) and Magnitude Comparison (MC) tasks. In a first study in 60 participants, we found no Att-SNARC, despite finding strong PJ- and MC-SNARC effects. No correlation was present between the Att-SNARC and the SNARC. Split-half tests showed no reliability of the Att-SNARC and high reliability of the PJ- and MC-SNARC. In a second study, we re-assessed the Att-SNARC and tested its direct influence on a MC-SNARC task with laterally presented targets. No Att-SNARC and no influence of the Att-SNARC on the MC-SNARC were found. Also in this case, the SNARC was reliable whereas the Att-SNARC task was not. Finally, in a third study we observed a significant Att-SNARC when participants were asked to recall the position occupied on a ruler by the numbers presented in each trial: however the Att-SNARC task was not reliable. These results show that perceiving numbers does not cause automatic shifts of spatial attention and that whenever present, these shifts do not modulate the SNARC. 
The same results suggest that numbers have no inherent mental left-to-right organization and that, whenever present, this organization can have both response-related and strategically driven memory-related origins. Nonetheless, response-related factors generate more reliable and stable spatial representations of numbers. Copyright © 2015 Elsevier Ltd. All rights reserved.
Influences on and Limitations of Classical Test Theory Reliability Estimates.
ERIC Educational Resources Information Center
Arnold, Margery E.
It is incorrect to say "the test is reliable" because reliability is a function not only of the test itself, but of many factors. The present paper explains how different factors affect classical reliability estimates such as test-retest, interrater, internal consistency, and equivalent forms coefficients. Furthermore, the limits of classical test…
Maki, Dana; Rajab, Ebrahim; Watson, Paul J; Critchley, Duncan J
2014-12-01
Cross-cultural translation, adaptation, and psychometric testing. To cross-culturally translate and adapt the Roland-Morris Disability Questionnaire (RMDQ) into Modern Standard Arabic and examine its validity with Arabic-speaking patients with low back pain (LBP). The English RMDQ is valid, reliable, and commonly used to assess LBP disability in clinical practice and research. There is no valid and reliable version of the RMDQ in Modern Standard Arabic. The RMDQ was forward translated and back translated. An expert committee of musculoskeletal physiotherapists reviewed the translation. Eight patients with LBP evaluated item-by-item comprehensibility. Ten patients piloted the RMDQ for overall comprehensibility and acceptability. Seventeen bilingual patients tested the agreement of the Arabic and English RMDQs. Two-hundred one patients completed the RMDQ and the visual analogue scale. Sixty-four patients were followed-up for test-retest reliability. Translation of most items was uncontroversial. The expert committee found the Arabic RMDQ clinically and culturally appropriate. They reviewed item 11, addressing bending and kneeling, because this has a clinical significance and cultural/religious implication regarding prayer positions. All patients reported that it was easy to understand and complete. The Arabic RMDQ had high overall agreement with the English RMDQ for the global score (intraclass correlation coefficient [ICC] = 0.925; 0.811-0.972). Kappa statistics showed good item-by-item agreement (none ≤0.30). Mean (SD) RMDQ and visual analog scale scores of 201 patients were 10.53 (4.80) and 5.11 (2.28), respectively. The RMDQ had a low correlation against pain intensity (r = 0.259; P < 0.01). A Cronbach α of 0.729 showed high internal consistency. Test-retest reliability of the Arabic RMDQ was good (ICC = 0.900; 95% confidence interval, 0.753-0.951). Kappa statistics were high for 18 items and fair for 6. 
The Arabic version of the RMDQ has good comprehensibility and acceptability, high internal consistency and reliability, low correlation against pain intensity, and good agreement with the English RMDQ. We recommend its use with Arabic-speaking patients with LBP.
Tucker, Neil; Reid, Duncan; McNair, Peter
2007-01-01
The slump test is a tool to assess the mechanosensitivity of the neuromeningeal structures within the vertebral canal. While some studies have investigated the reliability of aspects of this test within the same day, few have assessed the reliability across days. Therefore, the purpose of this pilot study was to investigate reliability when measuring active knee extension range of motion (AROM) in a modified slump test position within trials on a single day and across days. Ten male and ten female asymptomatic subjects, ages 20-49 (mean age 30.1, SD 6.4) participated in the study. Knee extension AROM in a modified slump position with the cervical spine in a flexed position and then in an extended position was measured via three trials on two separate days. Across three trials, knee extension AROM increased significantly with a mean magnitude of 2 degrees within days for both cervical spine positions (P>0.05). The findings showed that there was no statistically significant difference in knee extension AROM measurements across days (P>0.05). The intraclass correlation coefficients for the mean of the three trials across days were 0.96 (lower limit 95% CI: 0.90) with the cervical spine flexed and 0.93 (lower limit 95% CI: 0.83) with cervical extension. Measurement error was calculated by way of the typical error and 95% limits of agreement, and visually represented in Bland and Altman plots. The typical error for the cervical flexed and extended positions averaged across trials was 2.6 degrees and 3.3 degrees, respectively. The limits of agreement were narrow, and the Bland and Altman plots also showed minimal bias in the joint angles across days with a random distribution of errors across the range of measured angles. This study demonstrated that knee extension AROM could be reliably measured across days in subjects without pathology and that the measurement error was acceptable. Implications of variability over multiple trials are discussed.
The modified set-up for the test using the Kincom dynamometer and elevated thigh position may be useful to clinical researchers in determining the mechanosensitivity of the nervous system.
Tucker, Neil; Reid, Duncan; McNair, Peter
2007-01-01
The slump test is a tool to assess the mechanosensitivity of the neuromeningeal structures within the vertebral canal. While some studies have investigated the reliability of aspects of this test within the same day, few have assessed the reliability across days. Therefore, the purpose of this pilot study was to investigate reliability when measuring active knee extension range of motion (AROM) in a modified slump test position within trials on a single day and across days. Ten male and ten female asymptomatic subjects, ages 20–49 (mean age 30.1, SD 6.4) participated in the study. Knee extension AROM in a modified slump position with the cervical spine in a flexed position and then in an extended position was measured via three trials on two separate days. Across three trials, knee extension AROM increased significantly with a mean magnitude of 2° within days for both cervical spine positions (P<0.05). The findings showed that there was no statistically significant difference in knee extension AROM measurements across days (P>0.05). The intraclass correlation coefficients for the mean of the three trials across days were 0.96 (lower limit 95% CI: 0.90) with the cervical spine flexed and 0.93 (lower limit 95% CI: 0.83) with cervical extension. Measurement error was calculated by way of the typical error and 95% limits of agreement, and visually represented in Bland and Altman plots. The typical error for the cervical flexed and extended positions averaged across trials was 2.6° and 3.3°, respectively. The limits of agreement were narrow, and the Bland and Altman plots also showed minimal bias in the joint angles across days with a random distribution of errors across the range of measured angles. This study demonstrated that knee extension AROM could be reliably measured across days in subjects without pathology and that the measurement error was acceptable. Implications of variability over multiple trials are discussed.
The modified set-up for the test using the Kincom dynamometer and elevated thigh position may be useful to clinical researchers in determining the mechanosensitivity of the nervous system. PMID:19066666
Reliability of the Q Force; a mobile instrument for measuring isometric quadriceps muscle strength.
Douma, K W; Regterschot, G R H; Krijnen, W P; Slager, G E C; van der Schans, C P; Zijlstra, W
2016-01-01
The ability to generate muscle strength is a prerequisite for all human movement. Decreased quadriceps muscle strength is frequently observed in older adults and is associated with decreased performance and activity limitations. To quantify quadriceps muscle strength and to monitor changes over time, instruments and procedures with sufficient reliability are needed. The Q Force is an innovative mobile muscle strength measurement instrument suitable for measuring at various degrees of extension. Measurements between 110 and 130° extension present the highest values and the most significant increase after training. The objective of this study is to determine the test-retest reliability of muscle strength measurements by the Q Force in older adults at 110° extension. Forty-one healthy older adults, 13 males and 28 females, were included in the study. Mean (SD) age was 81.9 (4.89) years. Isometric strength of the quadriceps muscle was assessed with the Q Force at 110° of knee extension. Participants were measured at two sessions with a three to eight day interval between sessions. To determine relative reliability, the intraclass correlation coefficient (ICC) was calculated. To determine absolute reliability, Bland and Altman Limits of Agreement (LOA) were calculated and t-tests were performed. Relative reliability of the Q Force is good to excellent, as all ICC coefficients are higher than 0.75. Generally, a large 95% LOA was found, reflecting only moderate absolute reliability, as exemplified by the peak torque LOA of -18.6 N to 33.8 N for the left leg and -9.2 N to 26.4 N for the right leg, and was between 15.7 and 23.6 N, representing 25.2% to 39.9% of the size of the mean. Small systematic differences in mean were found between measurement sessions 1 and 2. The present study shows that the Q Force has excellent relative test-retest reliability, but limited absolute test-retest reliability.
Since the Q Force is relatively cheap and mobile, it is suitable for application in various clinical settings; however, its capability to detect changes in muscle force over time is limited but comparable to that of existing instruments.
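The Bland and Altman limits of agreement mentioned in the abstract above are conventionally computed as the mean of the between-session differences (the bias) plus or minus 1.96 times their standard deviation. The following is a minimal sketch of that calculation, assuming paired measurements from two sessions; the function name and the force values are hypothetical and are not data from the study.

```python
from statistics import mean, stdev


def limits_of_agreement(session1, session2):
    """Return (bias, lower, upper) for the 95% Bland-Altman limits of agreement."""
    # Per-participant difference: session 2 minus session 1
    diffs = [b - a for a, b in zip(session1, session2)]
    bias = mean(diffs)   # systematic difference between sessions
    sd = stdev(diffs)    # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical peak-force values in newtons (illustrative only)
session1 = [60.0, 72.0, 55.0, 81.0, 66.0]
session2 = [64.0, 70.0, 61.0, 85.0, 69.0]

bias, lower, upper = limits_of_agreement(session1, session2)
print(f"bias = {bias:.2f} N, 95% LOA = ({lower:.2f}, {upper:.2f}) N")
```

Wide limits relative to the mean, as reported in the abstract, indicate that a single participant's change has to be large before it can be distinguished from measurement noise, even when the ICC is high.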
Alagha, M Abdulhadi; Alagha, Mahmoud A; Dunstan, Eleanor; Sperwer, Olaf; Timmins, Kate A; Boszczyk, Bronek M
2017-04-01
To assess the reliability and validity of a hand motion sensor, Leap Motion Controller (LMC), in the 15-s hand grip-and-release test, as compared against human inspection of an external digital camera recording. Fifty healthy participants were asked to fully grip-and-release their dominant hand as rapidly as possible for two trials with a 10-min rest in-between, while wearing a non-metal wrist splint. Each test lasted for 15 s, and a digital camera was used to film the anterolateral side of the hand on the first test. Three assessors counted the frequency of grip-and-release (G-R) cycles independently and in a blinded fashion. The mean of the three assessors' counts was compared with that measured by LMC using the Bland-Altman method. Test-retest reliability was examined by comparing the two 15-s tests. The mean number of G-R cycles recorded was: 47.8 ± 6.4 (test 1, video observer); 47.7 ± 6.5 (test 1, LMC); and 50.2 ± 6.5 (test 2, LMC). Bland-Altman analysis indicated good agreement, with a low bias (0.15 cycles) and narrow limits of agreement. The ICC showed high inter-rater agreement and the coefficient of repeatability for the number of cycles was ±5.393, with a mean bias of 3.63. LMC appears to be valid and reliable in the 15-s grip-and-release test. This serves as a first step towards the development of an objective myelopathy assessment device and platform for the assessment of neuromotor hand function in general. Further assessment in a clinical setting and to gauge healthy benchmark values is warranted.
Forward Skirt Structural Testing on the Space Launch System (SLS) Program
NASA Technical Reports Server (NTRS)
Lohrer, J. D.; Wright, R. D.
2016-01-01
Structural testing was performed to evaluate heritage forward skirts from the Space Shuttle program for use on the NASA Space Launch System (SLS) program. Testing was needed because SLS ascent loads are 35% higher than Space Shuttle loads. Objectives of testing were to determine margins of safety, demonstrate reliability, and validate analytical models. Testing combined with analysis was able to show heritage forward skirts were acceptable to use on the SLS program.
Hashimoto, Ryusaku; Kashiwagi, Mitsuru; Suzuki, Shuhei
2008-09-01
We developed a rapid word reading test for examining the phonological processing ability of Japanese children. We prepared two versions of the test, versions A and B. Each version has word and non-word tasks. Twenty-two healthy third-grade primary school boys participated in this validation study. For criterion-related validity, we performed the serial Hiragana reading test, the sentence reading test, Raven's coloured progressive matrices (RCPM), the Token test for children, the Kana word dictation test, the standardized comprehension test of abstract words (SCTAW), and the Trail Circle test. The reading times of the newly developed test correlated moderately or highly with those of the serial Hiragana reading test and the sentence reading test. However, the scores of the other tests (RCPM, Token test for children, Kana word dictation test, SCTAW, Trail Circle test) did not correlate with the reading time of the rapid word reading test. Test-retest reliabilities in the word tasks were more than moderate: 0.52 and 0.76 in versions A and B, while those in the non-word tasks were high: 0.91 and 0.88 in versions A and B. The correlation coefficient between versions A and B was 0.7 for the word tasks and 0.92 for the non-word tasks. This study showed that the rapid word reading test has substantial validity and reliability for testing the phonological processing ability of Japanese children. In addition, the non-word tasks were more suitable for selectively examining the speed of the grapheme to phoneme conversion process.
A reliable unipedal stance test for the assessment of balance using a force platform.
Ponce-González, J G; Sanchis-Moysi, J; González-Henriquez, J J; Arteaga-Ortiz, R; Calbet, J A L; Dorado, C
2014-02-01
The aim was to develop a unipedal stance test for the assessment of balance using a force platform. A single-leg balance test was conducted in 23 students (mean ± SD age: 23 ± 3 years) in a standard position limiting the movement of the arms and non-supporting leg. Six attempts, with both the jumping leg (JL) and the contralateral leg (CL), were performed under 3 conditions: 1) eyes opened; 2) eyes closed; 3) eyes opened and executing a precision task. The same protocol was repeated two weeks apart. The mean and the best result of the six attempts performed each day were taken as representative of balance. The speed of the centre of pressure (CP-Speed) showed excellent reliability for the "best result" analysis in all tests (ICCs 0.87-0.97), except in the test with the eyes closed performed on the CL (ICC<0.4). The CP-Speed had better reliability with the "best result" than with the "mean result" analysis (P<0.05), whilst no significant differences were observed between the JL and the CL (P=0.71 and P=0.96 for mean and best results analysis, respectively). A lower dispersion in the Bland and Altman graph was observed with the eyes opened than with the eyes closed or in the dynamic test. The single-leg stance balance test proposed is a reliable method to assess balance, especially when performed in a static position, with the eyes opened and using the best result of six attempts as reference, independently of the stance leg.
Gehling, Julia; Mainka, Tina; Vollert, Jan; Pogatzki-Zahn, Esther M; Maier, Christoph; Enax-Krumova, Elena K
2016-08-05
Conditioned Pain Modulation (CPM) is often used to assess human descending pain inhibition. Nine different studies on the test-retest-reliability of different CPM paradigms have been published, but none of them has investigated the commonly used heat-cold-pain method. The results vary widely and therefore, reliability measures cannot be extrapolated from one CPM paradigm to another. The aim of the present study was to analyse the test-retest-reliability of the common heat-cold-pain method and its correlation to pain thresholds. We tested the short-term test-retest-reliability within 40 ± 19.9 h using a cold-water immersion (10 °C, left hand) as conditioning stimulus (CS) and heat pain (43-49 °C, pain intensity 60 ± 5 on the 101-point numeric rating scale, right forearm) as test stimulus (TS) in 25 healthy right-handed subjects (12 females, 31.6 ± 14.1 years). The TS was applied 30 s before (TSbefore), during (TSduring) and after (TSafter) the 60 s CS. The difference between the pain ratings for TSbefore and TSduring represents the early CPM-effect, and that between TSbefore and TSafter the late CPM-effect. Quantitative sensory testing (QST, DFNS protocol) was performed in both sessions before the CPM assessment. Statistical analysis comprised paired t-tests, intraclass correlation coefficient (ICC), standard error of measurement (SEM), smallest real difference (SRD), Pearson's correlation, and Bland-Altman analysis, with a significance level of p < 0.05 and Bonferroni correction for multiple comparisons when necessary. Pain ratings during CPM correlated significantly (ICC: 0.411…0.962) between both days, though ratings for TSafter were lower on day 2 (p < 0.005). The early (day 1: 16.7 ± 11.7; day 2: 19.5 ± 11.9; ICC: 0.618, SRD: 20.2) and late (day 1: 1.7 ± 9.2; day 2: 7.6 ± 11.5; ICC: 0.178, SRD: 27.0) CPM effect did not differ significantly between both days. Neither the early nor the late CPM-effect correlated with the pain thresholds.
The short-term test-retest-reliability of the early CPM-effect using the heat-cold-pain method in healthy subjects achieved satisfying results in terms of the ICC. The SRD of the early CPM effect showed that an individual change of > 20 NRS can be attributed to a real change rather than chance. The late CPM-effect was weaker and not reliable.
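The standard error of measurement (SEM) and smallest real difference (SRD) named in the abstract above are commonly derived from the ICC and the between-subject standard deviation as SEM = SD × sqrt(1 − ICC) and SRD = 1.96 × sqrt(2) × SEM. The sketch below applies these textbook formulas to the early CPM-effect figures quoted in the abstract. Whether the authors used exactly these formulas, and the choice of a pooled SD over both days, are assumptions made here for illustration.

```python
import math


def standard_error_of_measurement(sd, icc):
    """SEM from the between-subject SD and a reliability coefficient."""
    return sd * math.sqrt(1.0 - icc)


def smallest_real_difference(sd, icc, z=1.96):
    """SRD: smallest individual change exceeding measurement error at ~95%."""
    return z * math.sqrt(2.0) * standard_error_of_measurement(sd, icc)


# Early CPM-effect values quoted in the abstract: SD 11.7 (day 1), 11.9 (day 2), ICC 0.618.
# Pooling the two SDs is an assumption of this sketch, not a step stated in the source.
sd_day1, sd_day2 = 11.7, 11.9
pooled_sd = math.sqrt((sd_day1 ** 2 + sd_day2 ** 2) / 2)

early_srd = smallest_real_difference(pooled_sd, 0.618)
print(f"early CPM-effect SRD is roughly {early_srd:.1f} NRS points")
```

Under these assumptions the result lands close to the SRD of 20.2 reported for the early CPM-effect, which is why an individual change of more than about 20 NRS points is read as a real change.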
ERIC Educational Resources Information Center
Khorashad, Behzad S.; Baron-Cohen, Simon; Roshan, Ghasem M.; Kazemian, Mojtaba; Khazai, Ladan; Aghili, Zahra; Talaei, Ali; Afkhamizadeh, Mozhgan
2015-01-01
The psychometric properties of the Persian "Reading the Mind in the Eyes" test were investigated, so were the predictions from the Empathizing-Systemizing theory of psychological sex differences. Adults aged 16-69 years old (N = 545, female = 51.7%) completed the test online. The analysis of items showed them to be generally acceptable.…
ERIC Educational Resources Information Center
Lee, Guemin; Park, In-Yong
2012-01-01
Previous assessments of the reliability of test scores for testlet-composed tests have indicated that item-based estimation methods overestimate reliability. This study was designed to address issues related to the extent to which item-based estimation methods overestimate the reliability of test scores composed of testlets and to compare several…
Innes, Carrie R H; Jones, Richard D; Anderson, Tim J; Hollobon, Susan G; Dalrymple-Alford, John C
2009-05-01
Currently, there is no international standard for the assessment of fitness to drive for cognitively or physically impaired persons. A computerized battery of driving-related sensory-motor and cognitive tests (SMCTests) has been developed, comprising tests of visuoperception, visuomotor ability, complex attention, visual search, decision making, impulse control, planning, and divided attention. Construct validity analysis was conducted in 60 normal, healthy subjects and showed that, overall, the novel cognitive tests assessed cognitive functions similar to a set of standard neuropsychological tests. The novel tests were found to have greater perceived face validity for predicting on-road driving ability than was found in the equivalent standard tests. Test-retest stability and reliability of SMCTests measures, as well as correlations between SMCTests and on-road driving, were determined in a subset of 12 subjects. The majority of test measures were stable and reliable across two sessions, and significant correlations were found between on-road driving scores and measures from ballistic movement, footbrake reaction, hand-control reaction, and complex attention. The substantial face validity, construct validity, stability, and reliability of SMCTests, together with the battery's level of correlation with on-road driving in normal subjects, strengthen our confidence in the ability of SMCTests to detect and identify sensory-motor and cognitive deficits related to unsafe driving and increased risk of accidents.
Bruce, Jared; Echemendia, Ruben; Tangeman, Lindy; Meeuwisse, Willem; Comper, Paul; Hutchison, Michael; Aubry, Mark
2016-01-01
Computerized neuropsychological tests are frequently used to assist in return-to-play decisions following sports concussion. However, due to concerns about test reliability, the Centers for Disease Control and Prevention recommends yearly baseline testing. The standard practice that has developed in baseline/postinjury comparisons is to examine the difference between the most recent baseline test and postconcussion performance. Drawing from classical test theory, the present study investigated whether temporal stability could be improved by taking an alternate approach that uses the aggregate of 2 baselines to more accurately estimate baseline cognitive ability. One hundred fifteen English-speaking professional hockey players with 3 consecutive Immediate Postconcussion Assessment and Testing (ImPACT) baseline tests were extracted from a clinical program evaluation database overseen by the National Hockey League and National Hockey League Players' Association. The temporal stability of ImPACT composite scores was significantly increased by aggregating test performance during Sessions 1 and 2 to predict performance during Session 3. Using this approach, the 2-factor Memory (r = .72) and Speed (r = .79) composites of ImPACT showed acceptable long-term reliability. Using the aggregate of 2 baseline scores significantly improves temporal stability and allows for more accurate predictions of cognitive change following concussion. Clinicians are encouraged to estimate baseline abilities by taking into account all of an athlete's previous baseline scores.
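In classical test theory, the expected gain in reliability from aggregating repeated measurements is often described by the Spearman-Brown formula, r_k = k·r / (1 + (k − 1)·r), where r is the reliability of a single measurement and k the number of measurements combined. The sketch below illustrates that general result only. It is not the analysis performed in the study above, and the single-baseline reliability used is a hypothetical value.

```python
def spearman_brown(r, k):
    """Predicted reliability of the aggregate of k parallel measurements."""
    return k * r / (1 + (k - 1) * r)


# Hypothetical reliability of one baseline test (illustrative, not from the study)
single_baseline = 0.60

two_baselines = spearman_brown(single_baseline, 2)
print(f"one baseline: {single_baseline:.2f}, aggregate of two: {two_baselines:.2f}")
```

The formula assumes the baselines are parallel measurements with equal reliability; practice effects between sessions would violate that assumption.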
Gustafsson, Peik; Svedin, Carl Göran; Ericsson, Ingegerd; Lindén, Christian; Karlsson, Magnus K; Thernlund, Gunilla
2010-04-01
To study the value and reliability of an examination of neurological soft-signs, often used in Sweden, in the assessment of children with attention-deficit-hyperactivity disorder (ADHD), by examining children with and without ADHD, as diagnosed by an experienced clinician using the DSM-III-R. We have examined interrater reliability (26 males, nine females; age range 5y 6mo-11y), internal consistency (94 males, 43 females; age range 5y 6mo-11y), test-retest reliability (12 males, eight females; age range 6-9y), and validity (79 males, 33 females; age range 5y 6mo-9y). The sum of the scores for the items on the examination had good interrater reliability (intraclass correlation [ICC] 0.95) and acceptable internal consistency (Cronbach's alpha 0.76). The test-retest study also showed good reliability (ICC 0.91). There were modest associations between the examination and the assessment of motor function made by the physical education teacher (ICC 0.37) as well as from the parents' description (ICC 0.39). The examination of neurological soft-signs had a sensitivity of 0.80 and a specificity of 0.76 in predicting motor problems as evaluated by the physical education teacher. The reliability and validity of this examination seem to be good and can be recommended for clinical practice and research.
Miki, Emi; Yamane, Shingo; Yamaoka, Mai; Fujii, Hiroe; Ueno, Hiroka; Kawahara, Toshie; Tanaka, Keiko; Tamashiro, Hiroaki; Inoue, Eiji; Okamoto, Takatsugu; Kuriyama, Masaru
2016-09-01
The study aim was to investigate the validity and reliability of the Functional Independence Measure and Functional Assessment Measure (FIM + FAM), which is unfamiliar in Japan, by using its Japanese version (FIM + FAM-j) in patients with cerebrovascular accident (CVA). Forty-two CVA patients participated. Criterion validity was examined by correlating the full scale and subscales of FIM + FAM-j with several well-established measurements using Spearman's correlation coefficient. Reliability was evaluated by internal consistency (tested by Cronbach's alpha coefficient) and intra-rater reliability (tested by Kendall's tau correlation coefficient). Good-to-excellent criterion validity was found between the full scale and motor subscales of the FIM + FAM-j and the Barthel Index, National Institutes of Health Stroke Scale, modified Rankin Scale, and lower extremity Brunnstrom Recovery Stage. High internal consistency was observed within the full-scale FIM + FAM-j and the motor and cognitive subscales (Cronbach's alphas were 0.968, 0.954, and 0.948, respectively). Additionally, good intra-rater reliability was observed within the full scale and motor subscales, and excellent reliability for the cognitive subscales (taus were 0.83, 0.80, and 0.98, respectively). This study showed that the FIM + FAM-j demonstrated acceptable levels of validity and reliability when used for CVA as a measure of disability.
Test-Retest Reliability of Diffusion Tensor Imaging in Huntington's Disease.
Cole, James H; Farmer, Ruth E; Rees, Elin M; Johnson, Hans J; Frost, Chris; Scahill, Rachael I; Hobbs, Nicola Z
2014-03-21
Diffusion tensor imaging (DTI) has shown microstructural abnormalities in patients with Huntington's Disease (HD) and work is underway to characterise how these abnormalities change with disease progression. Using methods that will be applied in longitudinal research, we sought to establish the reliability of DTI in early HD patients and controls. Test-retest reliability, quantified using the intraclass correlation coefficient (ICC), was assessed using region-of-interest (ROI)-based white matter atlas and voxelwise approaches on repeat scan data from 22 participants (10 early HD, 12 controls). T1 data was used to generate further ROIs for analysis in a reduced sample of 18 participants. The results suggest that fractional anisotropy (FA) and other diffusivity metrics are generally highly reliable, with ICCs indicating considerably lower within-subject compared to between-subject variability in both HD patients and controls. Where ICC was low, particularly for the diffusivity measures in the caudate and putamen, this was partly influenced by outliers. The analysis suggests that the specific DTI methods used here are appropriate for cross-sectional research in HD, and give confidence that they can also be applied longitudinally, although this requires further investigation. An important caveat for DTI studies is that test-retest reliability may not be evenly distributed throughout the brain whereby highly anisotropic white matter regions tended to show lower relative within-subject variability than other white or grey matter regions.
49 CFR 192.717 - Transmission lines: Permanent field repair of leaks.
Code of Federal Regulations, 2012 CFR
2012-10-01
... encirclement welded split sleeve of appropriate design, unless the transmission line is joined by mechanical... method that reliable engineering tests and analyses show can permanently restore the serviceability of...
49 CFR 192.717 - Transmission lines: Permanent field repair of leaks.
Code of Federal Regulations, 2014 CFR
2014-10-01
... encirclement welded split sleeve of appropriate design, unless the transmission line is joined by mechanical... method that reliable engineering tests and analyses show can permanently restore the serviceability of...
49 CFR 192.717 - Transmission lines: Permanent field repair of leaks.
Code of Federal Regulations, 2013 CFR
2013-10-01
... encirclement welded split sleeve of appropriate design, unless the transmission line is joined by mechanical... method that reliable engineering tests and analyses show can permanently restore the serviceability of...
49 CFR 192.717 - Transmission lines: Permanent field repair of leaks.
Code of Federal Regulations, 2011 CFR
2011-10-01
... encirclement welded split sleeve of appropriate design, unless the transmission line is joined by mechanical... method that reliable engineering tests and analyses show can permanently restore the serviceability of...
Dijkhuizen, Annemarie; Douma, Rob K; Krijnen, Wim P; van der Schans, Cees P; Waninge, Aly
2018-05-30
A feasible and reliable instrument to measure strength in persons with severe intellectual and visual disabilities (SIVD) is lacking. The aim of our study was to determine the feasibility, learning period and reliability of three strength tests. Twenty-nine participants with SIVD performed the Minimum Sit-to-Stand Height test (MSST), the Leg Extension test (LE) and the 30 seconds Chair-Stand test (30sCS), once per week for 5 weeks. Feasibility was determined by the percentage of successful measurements; learning effect by using paired t tests between two consecutive measurements; test-retest reliability by intraclass correlation coefficient and Limits of Agreement; and correlations by Pearson correlation coefficients. A sufficient feasibility and learning period of the tests was shown. The methods had sufficient test-retest reliability and moderate-to-sufficient correlations. The MSST, the LE, and the 30sCS are feasible tests for measuring muscle strength in persons with SIVD, having sufficient test-retest reliability. © 2018 John Wiley & Sons Ltd.
Automatically generated acceptance test: A software reliability experiment
NASA Technical Reports Server (NTRS)
Protzel, Peter W.
1988-01-01
This study presents results of a software reliability experiment investigating the feasibility of a new error detection method. The method can be used as an acceptance test and is solely based on empirical data about the behavior of internal states of a program. The experimental design uses the existing environment of a multi-version experiment previously conducted at the NASA Langley Research Center, in which the launch interceptor problem is used as a model. This allows the controlled experimental investigation of versions with well-known single and multiple faults, and the availability of an oracle permits the determination of the error detection performance of the test. Fault interaction phenomena are observed that have an amplifying effect on the number of error occurrences. Preliminary results indicate that all faults examined so far are detected by the acceptance test. This shows promise for further investigations, and for the employment of this test method on other applications.
The development of the "Cantonese receptive vocabulary test" for children aged 2-6 in Hong Kong.
Cheung, P S; Lee, K Y; Lee, L W
1997-01-01
The study aims to develop a Cantonese receptive vocabulary test to assess 2-6-year-old children in Hong Kong. The test consists of 100 test items. Each target item is accompanied by a phonological distractor, a semantic distractor and an unrelated distractor. A sample of 609 normal children from four Maternal and Child Health Centres and nine kindergartens was selected. The results show that there is a significant effect of age on the correct score. ANOVA was performed to look at the age effect on each distractor individually. It was found that the scores of the three distractors decrease in their own patterns as age increases. With strong content validity, strong construct validity and high correlation coefficients in the split-half reliability, this test could be used as a reliable measurement for the Cantonese-speaking population in Hong Kong.
NASA Technical Reports Server (NTRS)
Patterson, Richard L.; Boomer, Kristen T.; Scheick, Leif; Lauenstein, Jean-Marie; Casey, Megan; Hammoud, Ahmad
2014-01-01
The power systems for use in NASA space missions must work reliably under harsh conditions including radiation, thermal cycling, and exposure to extreme temperatures. Gallium nitride semiconductors show great promise, but information pertaining to their performance is scarce. Gallium nitride N-channel enhancement-mode field effect transistors made by EPC Corporation in a 2nd generation of manufacturing were exposed to radiation followed by long-term thermal cycling and testing under high temperature reverse bias conditions in order to address their reliability for use in space missions. Results of the experimental work are presented and discussed.
Image Capture and Display Based on Embedded Linux
NASA Astrophysics Data System (ADS)
Weigong, Zhang; Suran, Di; Yongxiang, Zhang; Liming, Li
To meet the requirement of building a highly reliable communication system, SpaceWire was selected for the integrated electronic system, and its performance needed to be tested. As part of this testing work, the goal of this paper is to transmit image data from a CMOS camera through SpaceWire and display real-time images on a graphical user interface built with Qt on an embedded Linux and ARM development platform. A point-to-point mode of transmission was chosen; the results showed that the two communication ends displayed essentially the same pictures in succession, which suggests that SpaceWire can transmit the data reliably.
Effect of individual shades on reliability and validity of observers in colour matching.
Lagouvardos, P E; Diamanti, H; Polyzois, G
2004-06-01
The effect of individual shades in a shade guide on the reliability and validity of measurements in a colour matching process is very important. Observers' agreement on shades and the sensitivity/specificity of shades can give an estimate of a shade's effect on observers' reliability and validity. In the present study, a group of 16 students matched 15 shades of a Kulzer guide and 10 human incisors to Kulzer and/or Vita shade tabs in 4 different tests. The results showed that shades I, B10, C40, A35 and A10 were those with the highest reliability and validity values. In conclusion, a) the matching process with shades of different materials was not accurate enough, b) some shades produce a more reliable and valid match than others and c) teeth are matched with relative difficulty.
Andersen, Kenneth Geving; Kehlet, Henrik; Aasvang, Eske Kvanner
2015-05-01
Quantitative sensory testing (QST) is used to assess sensory dysfunction and nerve damage by examining psychophysical responses to controlled, graded stimuli such as mechanical and thermal detection and pain thresholds. In the breast cancer population, 4 studies have used QST to examine persistent pain after breast cancer treatment, suggesting neuropathic pain being a prominent pain mechanism. However, the agreement and reliability of QST has not been described in the postsurgical breast cancer population, hindering exact interpretation of QST studies in this population. The aim of the present study was to assess test-retest properties of QST after breast cancer surgery. A total of 32 patients recruited from a larger ongoing prospective trial were examined with QST 12 months after breast cancer surgery and reexamined a week later. A standardized QST protocol was used, including sensory mapping for mechanical, warmth and cold areas of sensory dysfunction, mechanical thresholds using monofilaments and pin-prick, thermal thresholds including warmth and cold detection thresholds and heat pain threshold, with bilateral examination. Agreement and reliability were assessed by Bland-Altman plots, descriptive statistics, coefficients of variance, and intraclass correlation. Bland-Altman plots showed high variation on the surgical side. Intraclass coefficients ranged from 0.356 to 0.847 (moderate to substantial reliability). Between-patient variation was generally higher (0.9 to 14.5 SD) than within-patient variation (0.23 to 3.55 SD). There were no significant differences between pain and pain-free patients. The individual test-retest variability was higher on the operated side compared with the nonoperated side. The QST protocol reliability allows for group-to-group comparison of sensory function, but less so for individual follow-up after breast cancer surgery.
Goetz, Katja; Hasse, Philipp; Szecsenyi, Joachim; Campbell, Stephen M
2016-04-01
The consideration of organisational aspects, such as shared goals and clear communication, within the health care team is important to ensure good quality care. In primary health care, the instrument Survey of Organizational Attributes for Primary Care (SOAPC) is available to measure organisational attributes of care. However, there is no instrument available for dental care. The aim of the present study was to investigate psychometric properties and test-retest reliability of the version of SOAPC adapted for dental care, namely the Survey of Organizational Attributes in Dental Care (SOADC). The SOADC consists of 21 items in the following four subscales: communication; decision making; stress/chaos; and history of change. Convergent construct validity was measured using the job satisfaction scale. A total of 287 dental-care practices were asked to participate in the validation study. Psychometric properties and test-retest reliability were observed. A total of 43 dental-care practices responded to the survey. At baseline, 178 dental-care staff completed the questionnaire, and 4 weeks later 138 did so. Internal consistency, measured by Cronbach's alpha, was 0.718 or higher in the subscales. The test-retest reliability for each subscale and the overall SOADC score demonstrated good correlations over the 4-week test-retest interval, except for 'history of change'. A strong correlation with the aggregated job-satisfaction scale showed high convergent construct validity of SOADC. The consideration of organisational aspects from the perspective of dental-care teams is important for providing good quality of care. The SOADC is a reliable instrument with good psychometric properties and is suitable for the evaluation of organisational attributes in dental-care practices. © 2015 FDI World Dental Federation.
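Internal consistency in the abstract above is summarised by Cronbach's alpha, which for k items is k/(k − 1) × (1 − sum of item variances / variance of the total score). The sketch below is a minimal illustration of that formula; the function name and the questionnaire responses are hypothetical and unrelated to the SOADC data.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha.

    items: one list of scores per item, all aligned by respondent.
    """
    k = len(items)
    # Total score per respondent across all items
    totals = [sum(scores) for scores in zip(*items)]
    sum_item_variances = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - sum_item_variances / variance(totals))


# Hypothetical responses: 3 items answered by 4 respondents (illustrative only)
items = [
    [1, 2, 3, 4],
    [2, 3, 4, 5],
    [1, 3, 3, 5],
]

alpha = cronbach_alpha(items)
print(f"Cronbach's alpha = {alpha:.3f}")
```

Values around 0.7 or higher, such as the 0.718 minimum reported for the subscales, are conventionally treated as acceptable internal consistency.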
Çelik, Derya; Can, Canan; Aslan, Yasemin; Ceylan, Hasan Huseyin; Bilsel, Kerem; Ozdincler, Arzu Razak
2014-01-01
The Harris Hip Score (HHS) was developed to assess function and pain from the perspective of patients with hip pathologies. The purpose of this study was to translate and culturally adapt the HHS into Turkish, and thereby determine the reliability and validity of the translated version. The HHS was translated into Turkish in accordance with the stages recommended by Beaton. The measurement properties of the HHS were tested in 80 patients (52 males; mean age 51 years, range 21-75 years) suffering from different hip pathologies. The test-retest reliability was tested in 58 patients (28 males; mean age 52 years, range 30-73 years) after an interval of seven days. Cronbach's alpha was used to assess internal consistency and the intra-class correlation coefficient (ICC) was used to estimate the test-retest reliability. To assess validity, patients were also asked to complete the Oxford Hip Score (OHS), the Western Ontario and McMaster Universities Arthritis Index (WOMAC), the VAS and the Short Form-36 (SF-36). The Turkish version of the HHS showed sufficient internal consistency (Cronbach's alpha, 0.70) and test-retest reliability (ICC = 0.91). The correlation coefficients between the HHS and the WOMAC and between the HHS and the OHS were 0.64 and 0.89, respectively. The highest correlations between the HHS and SF-36 were with the physical function scale (r = 0.72), and the lowest correlations were with the mental function scale (r = 0.10). We observed no floor or ceiling effects. The Turkish version of the HHS has sufficient reliability and validity to measure patient-reported outcome for Turkish-speaking individuals with a variety of hip disorders.
[Transcultural adaptation of the Antifat Attitudes Test to Brazilian Portuguese].
Obara, Angélica Almeida; Alvarenga, Marle Dos Santos
2018-05-01
Obese individuals are often blamed for their own condition and are the targets of discrimination and prejudice. The aim of this study is to describe the cross-cultural adaptation to Brazilian Portuguese and the validation of the Antifat Attitudes Test, which was specifically developed for the evaluation of negative attitudes toward obese individuals. The scale has 34 statements distributed in three subscales: Social/Character Disparagement (15 items), Physical/Romantic Unattractiveness (10 items) and Weight Control/Blame (9 items). The method involved the translation of the scale; evaluation of the conceptual, operational and item equivalence; evaluation of the semantic equivalence using the paired t test, the Pearson correlation coefficient and the intraclass correlation coefficient (ICC); evaluation of internal consistency (Cronbach's alpha) and test-retest reliability (ICC); and Confirmatory Factor Analysis, after application in 340 college students in the area of health. The results showed good global internal consistency and reliability (α 0.85; ICC 0.83), and factor analysis showed that the original subscales can be kept in the adaptation. The Brazilian Portuguese version of the scale is therefore valid and useful in studies exploring negative attitudes toward obese individuals.
NASA Astrophysics Data System (ADS)
Pyo, Ju-Young; Cho, Won-Ju
2017-09-01
In this paper, we propose an amorphous indium gallium zinc oxide (a-IGZO) thin-film transistor (TFT) with off-planed source/drain electrodes. We applied different metals for the source/drain electrodes with Ni and Ti to control the work function as high and low. When we measured the configuration of Ni to drain and source to Ti, the a-IGZO TFT showed increased driving current, decreased leakage current, a high on/off current ratio, low subthreshold swing, and high mobility. In addition, we conducted a reliability test with a gate bias stress test at various temperatures. The results of the reliability test showed the Ni drain and Ti drain had an equivalent effective energy barrier height. Thus, we confirmed that the proposed off-planed structure improved the electrical characteristics of the fabricated devices without any degradation of characteristics. Through the a-IGZO TFT with different source/drain electrode metal engineering, we realized high-performance TFTs for next-generation display devices.
Hedlund, Lena; Gyllensten, Amanda Lundvik; Hansson, Lars
2015-04-01
Fatigue is frequently reported by patients with mental illness. The multidimensional fatigue inventory (MFI-20) is a self-assessment instrument with 20 items including five dimensions of fatigue. The purpose of this study was to examine the test-retest reliability, internal consistency, convergent construct validity and feasibility of using MFI-20 in patients with schizophrenia spectrum disorders. Patients completed two self-assessment instruments, MFI-20 (n = 93) and Visual Analogue Scale (n = 79), twice within 1 week ± 2 days. Fifty-three patients also rated the feasibility of responding to the MFI-20 with a Likert scale. The test-retest reliability and validity were analysed by using Spearman's correlations and internal consistency by calculating Cronbach's α. The test-retest showed a correlation between .66 and .91 for all subscales of MFI. The internal consistency was .92. The analysis of convergent construct validity showed a correlation of .68 (time 1) and .77 (time 2). No item was systematically identified as being difficult to answer.
Koritar, Priscila; Philippi, Sonia Tucunduva; Alvarenga, Marle dos Santos; Santos, Bernardo dos
2014-08-01
The scope of this study was to show the cross-cultural adaptation and validation of the Health and Taste Attitude Scale in Portuguese. The methodology included translation of the scale; evaluation of conceptual, operational and item-based equivalence by 14 experts and 51 female undergraduates; semantic equivalence and measurement assessment by 12 bilingual women by the paired t-test, the Pearson correlation coefficient and the intraclass correlation coefficient; internal consistency and test-retest reliability by Cronbach's alpha and intraclass correlation coefficient, respectively, after application on 216 female undergraduates; assessment of discriminant and concurrent validity via the t-test and Spearman's correlation coefficient, respectively, in addition to Confirmatory Factor and Exploratory Factor Analysis. The scale was considered adequate and easily understood by the experts and university students and presented good internal consistency and reliability (α 0.86, ICC 0.84). The results show that the scale is valid and can be used in studies with women to better understand attitudes related to taste.
Development of self and peer performance assessment on iodometric titration experiment
NASA Astrophysics Data System (ADS)
Nahadi; Siswaningsih, W.; Kusumaningtyas, H.
2018-05-01
This study aims to describe the process of developing a reliable and valid assessment to measure students’ performance on iodometric titration and the effect of self and peer assessment on students’ performance. The self and peer instrument provides valuable feedback for improving student performance. The developed assessment contains a rubric and task for facilitating self and peer assessment. The participants were 24 second-grade students at a vocational high school in Bandung. The participants were divided into two groups. The first 12 students were involved in the validity test of the developed assessment, while the remaining 12 students participated in the reliability test. Content validity was evaluated based on expert judgment. The results show that the developed performance assessment instrument was categorized as valid on each task, with reliability classified as very good. Analysis of the impact of implementing self and peer assessment showed that the peer instrument supported the self assessment.
Muir-Hunter, Susan W; Graham, Laura; Montero Odasso, Manuel
2015-08-01
To measure test-retest and interrater reliability of the Berg Balance Scale (BBS) in community-dwelling adults with mild to moderate Alzheimer disease (AD). Method: A sample of 15 adults (mean age 80.20 [SD 5.03] years) with AD performed three balance tests: the BBS, timed up-and-go test (TUG), and Functional Reach Test (FRT). Both relative reliability, using the intra-class correlation coefficient (ICC), and absolute reliability, using standard error of measurement (SEM) and minimal detectable change (MDC95) values, were calculated; Bland-Altman plots were constructed to evaluate inter-tester agreement. The test-retest interval was 1 week. Results: For the BBS, relative reliability values were 0.95 (95% CI, 0.85-0.98) for test-retest reliability and 0.72 (95% CI, 0.31-0.91) for interrater reliability; SEM was 6.01 points and MDC95 was 16.66 points; and interrater agreement was 16.62 points. The BBS performed better in test-retest reliability than the TUG and FRT, tests with established reliability in AD. Between 33% and 50% of participants required cueing beyond standardized instructions because they were unable to remember test instructions. Conclusions: The BBS achieved relative reliability values that support its clinical utility, but MDC95 and agreement values indicate the scale has performance limitations in AD. Further research to optimize balance assessment for people with AD is required.
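The absolute reliability values in this abstract are linked by a conventional formula, MDC95 = 1.96 × √2 × SEM, which can be checked directly. The sketch below assumes that conventional definition, plus the common SEM = SD × √(1 − ICC) relation; the abstract does not state which SD or ICC the authors used, so the helper names and the second function are illustrative only.

```python
import math


def sem_from_icc(sd, icc):
    """Standard error of measurement, assuming SEM = SD * sqrt(1 - ICC)."""
    return sd * math.sqrt(1.0 - icc)


def mdc95(sem):
    """Minimal detectable change at 95% confidence: 1.96 * sqrt(2) * SEM."""
    return 1.96 * math.sqrt(2.0) * sem


# SEM reported in the abstract for the BBS, in points.
reported_sem = 6.01
print(round(mdc95(reported_sem), 2))  # 16.66
```

With the reported SEM of 6.01 points the formula gives 16.66 points, matching the MDC95 stated in the abstract. A change smaller than this in an individual's BBS score cannot be distinguished from measurement error at the 95% level.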
Newly developed double neural network concept for reliable fast plasma position control
NASA Astrophysics Data System (ADS)
Jeon, Young-Mu; Na, Yong-Su; Kim, Myung-Rak; Hwang, Y. S.
2001-01-01
Neural networks are considered as a parameter estimation tool in plasma control for next-generation tokamaks such as ITER. Neural networks have been reported to be so accurate and fast for plasma equilibrium identification that they may be applied to the control of complex tokamak plasmas. For this application, the reliability of the conventional neural network needs to be improved. In this study, a new double neural network concept is developed to achieve this. The concept has been applied to simple plasma position identification in the KSTAR tokamak as a feasibility test. The concept shows higher reliability and fault tolerance even in severe fault conditions, which may make neural networks reliably and widely applicable to plasma control in future tokamaks.
Reliability of the Melbourne assessment of unilateral upper limb function.
Randall, M; Carlin, J B; Chondros, P; Reddihough, D
2001-11-01
This study examines the reliability of the Melbourne Assessment of Unilateral Upper Limb Function: a quantitative test of quality of movement in children with neurological impairment. The assessment was administered to 20 children aged from 5 to 16 years (mean age 9 years 10 months, SD 2 years 10 months) who had various types and degrees of cerebral palsy (CP). The performances of the 20 children during assessment were videotaped for subsequent scoring by 15 occupational therapists. Scores were analyzed for internal consistency of test items, inter- and intrarater reliability of scorings of the same videotapes, and test-retest reliability using repeat videotaping. Results revealed very high internal consistency of test items (alpha=0.96), moderate to high agreement both within and between raters for all test items (intraclass correlations of at least 0.7) apart from item 16 (hand to mouth and down), and high interrater reliability (0.95) and intrarater reliability (0.97) for total test scores. Test-retest results revealed moderate to high intrarater reliability for item totals (mean of 0.83 and 0.79) for each rater and high reliability for test totals (0.98 and 0.97). These findings indicate that the Melbourne Assessment of Unilateral Upper Limb Function is a reliable tool for measuring the quality of unilateral upper-limb movement in children with CP.
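The intraclass correlations quoted above come from a two-way analysis of variance on a subjects-by-raters table. The abstract does not say which ICC form was used, so the sketch below assumes ICC(2,1) (two-way random effects, absolute agreement, single rater) purely for illustration. The function name is hypothetical, and the example table is the well-known Shrout and Fleiss textbook data set, not data from this study.

```python
def icc_2_1(table):
    """ICC(2,1) for a subjects-by-raters table (list of equal-length rows)."""
    n = len(table)      # number of subjects (rows)
    k = len(table[0])   # number of raters (columns)
    grand = sum(sum(row) for row in table) / (n * k)
    row_means = [sum(row) / k for row in table]
    col_means = [sum(row[j] for row in table) / n for j in range(k)]

    # Sums of squares for the two-way layout without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in table for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Textbook example: 6 subjects scored by 4 raters.
ratings = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
icc = icc_2_1(ratings)
```

For this table the result is approximately 0.29, the value usually quoted for ICC(2,1) on these data. The low value reflects large systematic differences between raters, which the absolute-agreement form penalises.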
Reliability and Validity of the Turkish Version of the Voice-Related Quality of Life Measure.
Tezcaner, Zahide Çiler; Aksoy, Songül
2017-03-01
This study aims to test the validity and reliability of the Turkish version of the Voice-Related Quality of Life (V-RQOL) questionnaire. This is a nonrandomized, prospective study with a control group. The questionnaire was administered to 249 individuals (130 with vocal complaint and 119 without) with a mean age of 37.8 ± 12.3 years. The Turkish version of the Voice Handicap Index (VHI) and perceptual voice evaluation measures were also administered at 2-14 days for retest reliability. The instrument was submitted to validity and reliability evaluation. The V-RQOL measure showed a strong internal consistency and test-retest reliability; the Cronbach's alpha coefficient for the overall V-RQOL was 0.969, the physical functioning domain was 0.949, and the social-emotional domain was 0.940. In the test-retest reliability test, the overall V-RQOL was found to be 0.989. The construct validity of the V-RQOL was determined based on the strength and direction of its relation to the VHI and the perceptual voice evaluation measure. The higher the VHI level, the lower the physical functioning, social-emotional, and overall score levels of the V-RQOL (r = -0.927, r = -0.912, r = -0.944, respectively; P < 0.001). Following the perceptual voice self-assessment, a statistically significant difference was found between the V-RQOL scores of individuals who defined their voices as good, very good, and perfect, and those who defined their voices as bad and very bad (P < 0.001). The results suggest that the Turkish version of the V-RQOL measure has reliability and validity and may play a crucial role in evaluating Turkish-speaking patients with voice disorders. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
van de Pol, Daan; Zacharian, Tigran; Maas, Mario; Kuijer, P Paul F M
2017-06-01
The Shoulder posterior circumflex humeral artery Pathology and digital Ischemia - questionnaire (SPI-Q) has been developed to enable periodic surveillance of elite volleyball players, who are at risk for digital ischemia. Prior to implementation, assessing reliability is mandatory. Therefore, the test-retest reliability and agreement of the SPI-Q were evaluated among the population at risk. A questionnaire survey was performed with a 2-week interval among 65 elite male volleyball players assessing symptoms of cold, pale and blue digits in the dominant hand during or after practice or competition using a 4-point Likert scale (never, sometimes, often and always). Kappa (κ) and percentage of agreement (POA) were calculated for individual symptoms, and to distinguish symptomatic and asymptomatic players. For the individual symptoms, κ ranged from "poor" (0.25) to "good" (0.63), and POA ranged from "moderate" (78%) to "good" (97%). To classify symptomatic players, the SPI-Q showed "good" reliability (κ = 0.83; 95%CI 0.69-0.97) and "good" agreement (POA = 92%). The current study has proven the SPI-Q to be reliable for detecting elite male indoor volleyball players with symptoms of digital ischemia.
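The two agreement statistics used in this abstract, Kappa (κ) and percentage of agreement (POA), can be sketched for a test-retest classification as follows. The abstract does not state whether a weighted kappa was applied to the 4-point items, so the sketch assumes unweighted Cohen's kappa on the symptomatic/asymptomatic split. The function names and the ten toy classifications are illustrative assumptions, not study data.

```python
from collections import Counter


def percent_agreement(first, second):
    """Percentage of cases classified identically on both occasions."""
    matches = sum(a == b for a, b in zip(first, second))
    return 100.0 * matches / len(first)


def cohen_kappa(first, second):
    """Unweighted Cohen's kappa: (p_observed - p_expected) / (1 - p_expected)."""
    n = len(first)
    p_observed = sum(a == b for a, b in zip(first, second)) / n
    counts_first, counts_second = Counter(first), Counter(second)
    categories = set(first) | set(second)
    # Chance agreement from the marginal frequencies of each occasion.
    p_expected = sum(counts_first[c] * counts_second[c] for c in categories) / (n * n)
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical test and retest classifications (1 = symptomatic, 0 = asymptomatic).
test_round = [1, 1, 0, 0, 0, 0, 0, 0, 1, 0]
retest_round = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
poa = percent_agreement(test_round, retest_round)
kappa = cohen_kappa(test_round, retest_round)
```

In the toy data 9 of 10 players are classified identically (POA of 90%), while kappa is about 0.74 because most of that agreement is expected by chance when asymptomatic players dominate. This is why the abstract reports both statistics side by side.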
Resting-state fMRI correlations: From link-wise unreliability to whole brain stability.
Pannunzi, Mario; Hindriks, Rikkert; Bettinardi, Ruggero G; Wenger, Elisabeth; Lisofsky, Nina; Martensson, Johan; Butler, Oisin; Filevich, Elisa; Becker, Maxi; Lochstet, Martyna; Kühn, Simone; Deco, Gustavo
2017-08-15
The functional architecture of spontaneous BOLD fluctuations has been characterized in detail by numerous studies, demonstrating its potential relevance as a biomarker. However, the systematic investigation of its consistency is still in its infancy. Here, we analyze within- and between-subject variability and test-retest reliability of resting-state functional connectivity (FC) in a unique data set comprising multiple fMRI scans (42) from 5 subjects, and 50 single scans from 50 subjects. We adopt a statistical framework that enables us to identify different sources of variability in FC. We show that the low reliability of single links can be significantly improved by using multiple scans per subject. Moreover, in contrast to earlier studies, we show that spatial heterogeneity in FC reliability is not significant. Finally, we demonstrate that despite the low reliability of individual links, the information carried by the whole-brain FC matrix is robust and can be used as a functional fingerprint to identify individual subjects from the population. Copyright © 2017 Elsevier Inc. All rights reserved.
Reliability of an fMRI Paradigm for Emotional Processing in a Multisite Longitudinal Study
Gee, Dylan G.; McEwen, Sarah C.; Forsyth, Jennifer K.; Haut, Kristen M.; Bearden, Carrie E.; Addington, Jean; Goodyear, Bradley; Cadenhead, Kristin S.; Mirzakhanian, Heline; Cornblatt, Barbara A.; Olvet, Doreen; Mathalon, Daniel H.; McGlashan, Thomas H.; Perkins, Diana O.; Belger, Aysenil; Seidman, Larry J.; Thermenos, Heidi; Tsuang, Ming T.; van Erp, Theo G.M.; Walker, Elaine F.; Hamann, Stephan; Woods, Scott W.; Constable, Todd; Cannon, Tyrone D.
2015-01-01
Multisite neuroimaging studies can facilitate the investigation of brain-related changes in many contexts, including patient groups that are relatively rare in the general population. Though multisite studies have characterized the reliability of brain activation during working memory and motor functional magnetic resonance imaging tasks, emotion processing tasks, pertinent to many clinical populations, remain less explored. A traveling participants study was conducted with eight healthy volunteers scanned twice on consecutive days at each of the eight North American Longitudinal Prodrome Study sites. Tests derived from generalizability theory showed excellent reliability in the amygdala (Eρ²=0.82), inferior frontal gyrus (IFG; Eρ²=0.83), anterior cingulate cortex (ACC; Eρ²=0.76), insula (Eρ²=0.85), and fusiform gyrus (Eρ²=0.91) for maximum activation and fair to excellent reliability in the amygdala (Eρ²=0.44), IFG (Eρ²=0.48), ACC (Eρ²=0.55), insula (Eρ²=0.42), and fusiform gyrus (Eρ²=0.83) for mean activation across sites and test days. For the amygdala, habituation (Eρ²=0.71) was more stable than mean activation. In a second investigation, data from 111 healthy individuals across sites were aggregated in a voxelwise, quantitative meta-analysis. When compared with a mixed effects model controlling for site, both approaches identified robust activation in regions consistent with expected results based on prior single-site research. Overall, regions central to emotion processing showed strong reliability in the traveling participants study and robust activation in the aggregation study. These results support the reliability of blood oxygen level-dependent signal in emotion processing areas across different sites and scanners and may inform future efforts to increase efficiency and enhance knowledge of rare conditions in the population through multisite neuroimaging paradigms. PMID:25821147
Duruöz, M T; Unal, C; Toprak, C Sanal; Sezer, I; Yilmaz, F; Ulutatar, F; Atagündüz, P; Baklacioglu, H S
2017-12-01
Background Systemic lupus erythematosus (SLE) may have a profound impact on quality of life. There is increasing interest in measuring quality of life in lupus patients. The purpose of this study was to investigate the validity and reliability of the SLE Quality of Life Questionnaire (L-QoL) in Turkish SLE patients. Methods Patients with SLE according to the 2012 Systemic Lupus International Collaborating Clinics Classification Criteria were recruited into the study. Demographic data, clinical parameters and disease activity measured with the Systemic Lupus Erythematosus Disease Activity Index-2000 (SLEDAI-2K) were noted. The Nottingham Health Profile and Health Assessment Questionnaire were filled out in addition to the Turkish L-QoL (LQoL-TR). Internal consistency, test-retest reliability, and convergent and discriminant validity were evaluated. Results The mean age of participants was 43.55 ± 14.33 years and the mean disease duration was 89.8 ± 92.1 months. The patients filled out the LQoL-TR in 2.5 min. Strong correlations of the LQoL-TR with all subgroups of the Nottingham Health Profile and the Health Assessment Questionnaire were established, showing convergent validity. The highest correlations were demonstrated with the emotional reactions (rho = 0.72) and sleep (rho = 0.65) components of the Nottingham Health Profile scale (p < 0.0001). Its poor and nonsignificant correlations with nonfunctional parameters (age, disease duration, perceived general health, SLEDAI-2K) showed its discriminative properties. The LQoL-TR demonstrated good internal reliability with a Cronbach's α of 0.93 and test-retest reliability with an intraclass correlation coefficient of 0.87. Conclusion The LQoL-TR is a practical and useful tool which demonstrates good validity and reliability.
Martin, T P C; Moualed, D; Paul, A; Ronan, N; Tysome, J R; Donnelly, N P; Cook, R; Axon, P R
2015-04-01
The Cambridge Otology Quality of Life Questionnaire (COQOL) is a patient-recorded outcome measurement (PROM) designed to quantify the quality of life of patients attending otology clinics. Item-reduction model. A systematically designed long-form version (74 items) was tested with patient focus groups before being presented to adult otology patients (n = 137). Preliminary item analysis tested reliability, reducing the COQOL to 24 questions. This was then presented in conjunction with the SF-36 (V1) questionnaire to a total of 203 patients. Subsequently, these were re-presented at T + 3 months, and patients recorded whether they felt their condition had improved, deteriorated or remained the same. Non-responders were contacted by post. A correlation between COQOL scores and patient perception of change was examined to analyse content validity. Teaching hospital and university psychology department. Adult patients attending otology clinics with a wide range of otological conditions. Item reliability measured by item–total correlation, internal consistency and test–retest reliability. Validity measured by correlation between COQOL scores and patient-reported symptom change. Reliability: the COQOL showed excellent internal consistency at both initial presentation (α = 0.90) and 3 months later (α = 0.93). Validity: One-way analysis of variance showed a significant difference between groups reporting change and those reporting no change in quality of life (F(2, 80) = 5.866, P < 0.01). The COQOL is the first otology-specific PROM. Initial studies demonstrate excellent reliability and encouraging preliminary criterion validity: further studies will allow a deeper validation of the instrument.
Process Skill Assessment Instrument: Innovation to measure student’s learning result holistically
NASA Astrophysics Data System (ADS)
Azizah, K. N.; Ibrahim, M.; Widodo, W.
2018-01-01
Science process skills (SPS) are very important skills for students. However, it is undeniable that SPS are not a main concern in primary school learning. This research aimed to develop a valid, practical, and effective assessment instrument to measure students’ SPS. The assessment instruments comprise a worksheet and a test. This development research used a one-group pre-test post-test design. Data were obtained with validation, observation, and test methods to investigate the validity, practicality, and effectiveness of the instruments. Results showed that the assessment instruments are highly valid, their reliability is categorized as reliable, student SPS activities have a high percentage, and there is significant improvement in students’ SPS scores. It can be concluded that the SPS assessment instruments are valid, practical, and effective for measuring students’ SPS results.
Quality of life in children with Prader Willi Syndrome: Parent and child reports.
Wilson, Kathleen S; Wiersma, Lenny D; Rubin, Daniela A
2016-10-01
The purpose of this study was to evaluate the use of the PedsQL 4.0 instrument to assess quality of life (QL) in children with Prader Willi Syndrome (PWS). This study also sought to compare differences in parent and child report as well as between children with PWS and without PWS. Parents and children with PWS (N=44) completed the PedsQL 4.0 instrument. A sub-sample of children completed the PedsQL 4.0 a second time to assess test-retest reliability. A comparison sample of children who were obese but without PWS (N=66) also completed the PedsQL 4.0. PedsQL 4.0 showed acceptable internal consistency for the child report (αs >0.72) and was acceptable for 4 out of the 6 scales for the parent report (αs >0.66). Test-retest reliability coefficients showed support for the reliability of the instrument (ICCs>0.64). Parents perceived lower QL than children with PWS. Children with PWS also showed lower QL than children without PWS. This study provides support for the use of the PedsQL 4.0 instrument in children with PWS. As observed in other populations, parents perceive a lower QL for their children with PWS than the children themselves. Copyright © 2016 Elsevier Ltd. All rights reserved.
Development and psychometric testing of the Cancer Knowledge Scale for Elders.
Su, Ching-Ching; Chen, Yuh-Min; Kuo, Bo-Jein
2009-03-01
To develop the Cancer Knowledge Scale for Elders and test its validity and reliability. The number of elders suffering from cancer is increasing. To facilitate cancer prevention behaviours among elders, they should be educated about cancer. Prior to designing a programme that would respond to the special needs of elders, understanding the cancer-related knowledge within this population was necessary. However, extensive review of the literature revealed a lack of appropriate instruments for measuring cancer-related knowledge. A valid and reliable cancer knowledge scale for elders is necessary. A non-experimental methodological design was used to test the psychometric properties of the Cancer Knowledge Scale for Elders. Item analysis was first performed to screen out items that had low corrected item-total correlation coefficients. Construct validity was examined with a principal component method of exploratory factor analysis. Cancer-related health behaviour was used as the criterion variable to evaluate criterion-related validity. Internal consistency reliability was assessed by the KR-20. Stability was determined by two-week test-retest reliability. The factor analysis yielded a four-factor solution accounting for 49.5% of the variance. For criterion-related validity, cancer knowledge was positively correlated with cancer-related health behaviour (r = 0.78, p < 0.001). The KR-20 coefficients of the four factors were 0.85, 0.76, 0.79 and 0.67, and the coefficient for the total scale was 0.87. Test-retest reliability over a two-week period was 0.83 (p < 0.001). This study provides evidence for content validity, construct validity, criterion-related validity, internal consistency and stability of the Cancer Knowledge Scale for Elders. The results show that this scale is an easy-to-use instrument for elders and has adequate validity and reliability. The scale can be used as an assessment instrument when implementing cancer education programmes for elders.
It can also be used to evaluate the effects of education programmes.
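The KR-20 coefficient used above for internal consistency applies to items scored right or wrong (1 or 0), as in a knowledge test. A minimal sketch follows, assuming the conventional form with population variance of the total scores; the function name and the toy answer matrix are illustrative assumptions, not study data.

```python
from statistics import pvariance


def kr20(responses):
    """Kuder-Richardson 20 for a respondents-by-items table of 0/1 scores.

    KR-20 = k / (k - 1) * (1 - sum(p_j * q_j) / variance(total scores))
    """
    n_people = len(responses)
    n_items = len(responses[0])
    total_var = pvariance([sum(row) for row in responses])
    pq_sum = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in responses) / n_people  # proportion correct
        pq_sum += p * (1.0 - p)
    return (n_items / (n_items - 1)) * (1.0 - pq_sum / total_var)


# Hypothetical answers of four respondents to three knowledge items.
answers = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
coefficient = kr20(answers)
```

For the toy matrix the item p × q terms sum to 0.625 and the total-score variance is 1.25, giving a KR-20 of 0.75.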
Pan, Xiaoping; Chen, Haobo; Bickerton, Wai-Ling; Lau, Johnny King Lam; Kong, Anthony Pak Hin; Rotshtein, Pia; Guo, Aihua; Hu, Jianxi; Humphreys, Glyn W
2015-01-01
Background There are no currently effective cognitive assessment tools for patients who have suffered stroke in the People’s Republic of China. The Birmingham Cognitive Screen (BCoS) has been shown to be a promising tool for revealing patients’ poststroke cognitive deficits in specific domains, which facilitates more individually designed rehabilitation in the long run. Hence we examined the reliability and validity of a Cantonese version BCoS in patients with acute ischemic stroke, in Guangzhou. Method A total of 98 patients with acute ischemic stroke were assessed with the Cantonese version of the BCoS, and an additional 133 healthy individuals were recruited as controls. Apart from the BCoS, the patients also completed a number of external cognitive tests, including the Montreal Cognitive Assessment Test (MoCA), Mini Mental State Examination (MMSE), Albert’s cancellation test, the Rey–Osterrieth Complex Figure Test, and six gesture matching tasks. Cutoff scores for failing each subtest, ie, deficits, were computed based on the performance of the controls. The validity and reliability of the Cantonese BCoS were examined, as well as interrater and test–retest reliability. We also compared the proportions of cases being classified as deficits in controlled attention, memory, character writing, and praxis, between patients with and without spoken language impairment. Results Analyses showed high test–retest reliability and agreement across independent raters on the qualitative aspects of measurement. Significant correlations were observed between the subtests of the Cantonese BCoS and the other external cognitive tests, providing evidence for convergent validity of the Cantonese BCoS. The screen was also able to generate measures of cognitive functions that were relatively uncontaminated by the presence of aphasia. Conclusion This study suggests good reliability and validity of the Cantonese version of the BCoS. 
The Cantonese BCoS is a very promising tool for the detection of cognitive problems in Cantonese speakers. PMID:26396522
Tan, Christine L.; Hassali, Mohamed A.; Saleem, Fahad; Shafie, Asrul A.; Aljadhey, Hisham; Gan, Vincent B.
2015-01-01
Objective: (i) To develop the Pharmacy Value-Added Services Questionnaire (PVASQ) using emerging themes generated from interviews. (ii) To establish reliability and validity of the questionnaire instrument. Methods: Using an extended Theory of Planned Behavior as the theoretical model, face-to-face interviews generated salient beliefs of pharmacy value-added services. The PVASQ was constructed initially in English incorporating important themes and later translated into the Malay language with forward and backward translation. Intention (INT) to adopt pharmacy value-added services is predicted by attitudes (ATT), subjective norms (SN), perceived behavioral control (PBC), knowledge and expectations. Using a 7-point Likert-type scale and a dichotomous scale, test-retest reliability (N=25) was assessed by administering the questionnaire instrument twice at an interval of one week. Internal consistency was measured by Cronbach’s alpha and construct validity between two administrations was assessed using the kappa statistic and the intraclass correlation coefficient (ICC). Confirmatory Factor Analysis, CFA (N=410) was conducted to assess construct validity of the PVASQ. Results: The kappa coefficients indicate a moderate to almost perfect strength of agreement between test and retest. The ICC for all scales tested for intra-rater (test-retest) reliability was good. The overall Cronbach’s alpha (N=25) is 0.912 and 0.908 for the two time points. The result of CFA (N=410) showed most items loaded strongly and correctly into corresponding factors. Only one item was eliminated. Conclusions: This study is the first to develop and establish the reliability and validity of the Pharmacy Value-Added Services Questionnaire instrument using the Theory of Planned Behavior as the theoretical model. The translated Malay language version of PVASQ is reliable and valid to predict Malaysian patients’ intention to adopt pharmacy value-added services to collect partial medicine supply.
PMID:26445622
Testing for PV Reliability (Presentation)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kurtz, S.; Bansal, S.
The DOE SUNSHOT workshop is seeking input from the community about PV reliability and how the DOE might address gaps in understanding. This presentation describes the types of testing that are needed for PV reliability and introduces a discussion to identify gaps in our understanding of PV reliability testing.
The seated medicine ball throw as a test of upper body power in older adults.
Harris, Chad; Wattles, Andrew P; DeBeliso, Mark; Sevene-Adams, Patricia G; Berning, Joseph M; Adams, Kent J
2011-08-01
Practitioners training the older adult may benefit from a low-cost, easy-to-administer field test of upper body power. This study evaluated validity and reliability of the seated medicine ball throw (SMBT) in older adults. Subjects (n = 33; age 72.4 ± 5.2 years) completed 6 trials of an SMBT in each of 2 testing days and 2 ball masses (1.5 and 3.0 kg). Subjects also completed 6 trials of an explosive push-up (EPU) on a force plate over 2 testing days. Validity was assessed via a Pearson Product-Moment correlation (PPM) between SMBT and EPU maximal vertical force. Reliability of the SMBT was determined using PPMs (r), Intraclass correlation (ICC, R) and Bland-Altman plots (BAPs). For validity, the association between the SMBT and the EPU revealed a PPM of r = 0.641 and r = 0.614 for the 1.5- and 3.0-kg medicine balls, respectively. Test-retest reliability of the 1.5- and 3.0-kg SMBT was r = 0.967 and r = 0.958, respectively. The ICC values of the 1.5- and 3.0-kg SMBT were R = 0.994 and 0.989, respectively. The BAPs revealed 94% of the differences between day 1 and 2 scores were within the 95% confidence interval of the mean difference. Test-retest reliability for the EPU was r = 0.944, R = 0.969. The BAPs showed 94% of the differences between day 1 and 2 scores were within the 95% confidence interval of the mean difference, for both medicine ball throws. In conclusion, for the older adult, the SMBT appears to be a highly reliable test of upper body power. Its validity relative to the maximal force exerted during the EPU is modest. The SMBT is an inexpensive, safe, and repeatable measure of upper body power for the older adult.
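A Bland-Altman analysis of the kind mentioned above typically summarizes day-to-day differences by their mean (bias) and by limits placed 1.96 standard deviations either side of it. The sketch below assumes that this is what the abstract's "95% confidence interval of the mean difference" refers to, which the abstract does not confirm; the function name `bland_altman` and the throw distances are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(day1, day2):
    """Return bias, lower limit, upper limit, and the share of differences inside the limits."""
    diffs = [b - a for a, b in zip(day1, day2)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)  # half-width of the 95% limits of agreement
    lower, upper = bias - spread, bias + spread
    inside = sum(lower <= d <= upper for d in diffs) / len(diffs)
    return bias, lower, upper, inside


# Hypothetical throw distances in metres for six subjects on two days.
day1 = [3.1, 3.5, 2.8, 4.0, 3.3, 3.7]
day2 = [3.2, 3.4, 2.9, 4.1, 3.3, 3.9]
bias, lower, upper, inside = bland_altman(day1, day2)
print(f"bias={bias:.3f}, limits=({lower:.3f}, {upper:.3f}), inside={inside:.0%}")
```

In a full analysis the differences would also be plotted against the pairwise means to check whether disagreement grows with the size of the measurement.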
de Vasconcelos, Rodrigo Antunes; Bevilaqua-Grossi, Débora; Shimano, Antonio Carlos; Paccola, Cleber Jansen; Salvini, Tânia Fátima; Prado, Christiane Lanatovits; Junior, Wilson A. Mello
2015-01-01
Objectives: The aim of this study was to evaluate the reliability and validity of a modified isometric dynamometer (MID) in assessing performance deficits of the knee extensor and flexor muscles in normal individuals and in those with ACL reconstructions. Methods: Sixty male subjects were invited to participate in the study and were divided into three groups of 20 subjects each: a control group (GC), a group of individuals with ACL reconstruction with patellar tendon graft (GTP), and a group of individuals with ACL reconstruction with hamstrings graft (GTF). All individuals performed isometric tests in the MID; the muscular strength deficits collected were subsequently compared to the tests performed on the Biodex System 3 operating in the isometric and isokinetic modes at speeds of 60°/s and 180°/s. Intraclass correlation coefficient (ICC) calculations were done to assess MID reliability; specificity, sensitivity and Kappa consistency coefficient calculations were done to assess the MID's validity in detecting muscular deficits; and intra- and intergroup comparisons across the four strength tests were made using ANOVA. Results: The modified isometric dynamometer (MID) showed excellent reliability and good validity in the assessment of the performance of the knee extensor and flexor muscle groups. In the comparison between groups, the GTP showed significantly greater deficits as compared to the GTF and GC groups. Conclusion: Isometric dynamometers connected to mechanotherapy equipment could be an alternative option for collecting data concerning performance deficits of the extensor and flexor muscle groups of the knee in subjects with ACL reconstruction. PMID:27004175
Structural Testing of the Blade Reliability Collaborative Effect of Defect Wind Turbine Blades
DOE Office of Scientific and Technical Information (OSTI.GOV)
Desmond, M.; Hughes, S.; Paquette, J.
Two 8.3-meter (m) wind turbine blades intentionally constructed with manufacturing flaws were tested to failure at the National Wind Technology Center (NWTC) at the National Renewable Energy Laboratory (NREL) south of Boulder, Colorado. Two blades were tested; one blade was manufactured with a fiberglass spar cap and the second blade was manufactured with a carbon fiber spar cap. Test loading primarily consisted of flap fatigue loading of the blades, with one quasi-static ultimate load case applied to the carbon fiber spar cap blade. Results of the test program were intended to provide the full-scale test data needed for validation of model and coupon test results of the effect of defects in wind turbine blade composite materials. Testing was part of the Blade Reliability Collaborative (BRC) led by Sandia National Laboratories (SNL). The BRC seeks to develop a deeper understanding of the causes of unexpected blade failures (Paquette 2012), and to develop methods to enable blades to survive to their expected operational lifetime. Recent work in the BRC includes examining and characterizing flaws and defects known to exist in wind turbine blades from manufacturing processes (Riddle et al. 2011). Recent results from reliability databases show that wind turbine rotor blades continue to be a leading contributor to turbine downtime (Paquette 2012).
Validity and reliability of the Turkish Migraine Disability Assessment (MIDAS) questionnaire.
Ertaş, Mustafa; Siva, Aksel; Dalkara, Turgay; Uzuner, Nevzat; Dora, Babür; Inan, Levent; Idiman, Fethi; Sarica, Yakup; Selçuki, Deniz; Sirin, Hadiye; Oğuzhanoğlu, Atilla; Irkeç, Ceyla; Ozmenoğlu, Mehmet; Ozbenli, Taner; Oztürk, Musa; Saip, Sabahattin; Neyal, Münife; Zarifoğlu, Mehmet
2004-09-01
The aim of this study is to assess the comprehensibility, internal consistency, patient-physician reliability, test-retest reliability, and validity of Turkish version of Migraine Disability Assessment (MIDAS) questionnaire in patients with headache. MIDAS questionnaire has been developed by Stewart et al and shown to be reliable and valid to determine the degree of disability caused by migraine. This study was designed as a national multicenter study to demonstrate the reliability and validity of Turkish version of MIDAS questionnaire. Patients applying to 17 Neurology Clinics in Turkey were evaluated at the baseline (visit 1), week 4 (visit 2), and week 12 (visit 3) visits in terms of disease severity and comprehensibility, internal consistency, test-retest reliability, and validity of MIDAS. Since the severity of the disease has been found to change significantly at visit 2 compared to visit 1, test-retest reliability was assessed using the MIDAS scores of a subgroup of patients whose disease severity remained unchanged (up to +/-3 days difference in the number of days with headache between visits 1 and 2). A total of 306 patients (86.2% female, mean age: 35.0 +/- 9.8 years) were enrolled into the study. A total of 65.7%, 77.5%, 82.0% of patients reported that "they had fully understood the MIDAS questionnaire" in visits 1, 2, and 3, respectively. A highly positive correlation was found between physician and patient and the applied total MIDAS scores in all three visits (Spearman correlation coefficients were R= 0.87, 0.83, and 0.90, respectively, P <.001). Internal consistency of MIDAS was assessed using Cronbach's alpha and was found at acceptable (>0.7) or excellent (>0.8) levels in both patient and physician applied MIDAS scores, respectively. Total MIDAS score showed good test-retest reliability (R= 0.68). 
Both the number of days with headache and the total MIDAS scores were positively correlated at all visits with correlation coefficients between 0.47 and 0.63. There was also a moderate degree of correlation (R= 0.54) between the total MIDAS score at week 12 and the number of days with headache at visit 2 + visit 3, which quantify headache-related disability over a 3-month period similar to MIDAS questionnaire. These findings demonstrated that the Turkish translation is equivalent to the English version of MIDAS in terms of internal consistency, test-retest reliability, and validity. Physicians can reliably use the Turkish translation of the MIDAS questionnaire in defining the severity of illness and its treatment strategy when applied as a self-administered report by migraine patients themselves.
Montazeri, Ali; Torkan, Behnaz; Omidvari, Sepideh
2007-04-04
The Edinburgh Postnatal Depression Scale (EPDS) is a widely used instrument to measure postnatal depression. This study aimed to translate and to test the reliability and validity of the EPDS in Iran. The English language version of the EPDS was translated into Persian (Iranian language) and was used in this study. The questionnaire was administered to a consecutive sample of 100 women with normal (n = 50) and caesarean section (n = 50) deliveries at two points in time: 6 to 8 weeks and 12 to 14 weeks after delivery. Statistical analysis was performed to test the reliability and validity of the EPDS. Overall 22% of women at time 1 and 18% at time 2 reported experiencing postpartum depression. In general, the Iranian version of the EPDS was found to be acceptable to almost all women. Cronbach's alpha coefficient (to test reliability) was found to be 0.77 at time 1 and 0.86 at time 2. In addition, test-retest reliability was assessed and the intraclass correlation coefficient was found to be 0.80. Validity as assessed using known groups comparison showed satisfactory results. The questionnaire discriminated well between sub-groups of women differing in mode of delivery in the expected direction. The factor analysis indicated a three-factor structure that jointly accounted for 58% of the variance. This preliminary validation study of the Iranian version of the EPDS proved that it is an acceptable, reliable and valid measure of postnatal depression. It seems that the EPDS not only measures postpartum depression but also may be measuring something more.
Papadopoulou, Soultana L.; Exarchakos, Georgios; Christodoulou, Dimitrios; Theodorou, Stavroula; Beris, Alexandre; Ploumis, Avraam
2016-01-01
Introduction The Ohkuma questionnaire is a validated screening tool originally used to detect dysphagia among patients hospitalized in Japanese nursing facilities. Objective The purpose of this study is to evaluate the reliability and validity of the adapted Greek version of the Ohkuma questionnaire. Methods Following the steps for cross-cultural adaptation, we delivered the validated Ohkuma questionnaire to 70 patients (53 men, 17 women) who were either suffering from dysphagia or not. All of them completed the questionnaire a second time within a month. For all of them, we performed a bedside and VFSS study of dysphagia and asked participants to undergo a second VFSS screening, with the exception of nine individuals. Statistical analysis included measurement of internal consistency with Cronbach's α coefficient, reliability with Cohen's Kappa, Pearson's correlation coefficient and construct validity with categorical components, and One-Way Anova test. Results According to Cronbach's α coefficient (0.976) for total score, there was high internal consistency for the Ohkuma Dysphagia questionnaire. Test-retest reliability (Cohen's Kappa) ranged from 0.586 to 1.00, exhibiting acceptable stability. We also estimated the Pearson's correlation coefficient for the test-retest total score, which reached high levels (0.952; p = 0.000). The One-Way Anova test in the two measurement times showed statistically significant correlation in both measurements (p = 0.02 and p = 0.016). Conclusion The adapted Greek version of the questionnaire is valid and reliable and can be used for the screening of dysphagia in the Greek-speaking patients. PMID:28050209
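Cohen's kappa, used above for item-level test-retest agreement, corrects the observed proportion of matching answers for the agreement expected by chance. The sketch below shows that calculation under stated assumptions: the function name `cohen_kappa` and the yes/no answers are hypothetical, and the data are not taken from the study.

```python
from collections import Counter


def cohen_kappa(first, second):
    """Cohen's kappa for two equally long lists of categorical answers."""
    n = len(first)
    observed = sum(a == b for a, b in zip(first, second)) / n
    counts1, counts2 = Counter(first), Counter(second)
    # Chance agreement: product of the marginal proportions, summed over categories.
    expected = sum(counts1[c] * counts2[c] for c in set(first) | set(second)) / n ** 2
    # Undefined (division by zero) when both administrations use a single identical category.
    return (observed - expected) / (1 - expected)


# Hypothetical answers to one questionnaire item at test and retest.
test_answers = ["yes", "yes", "no", "no", "yes", "no", "no", "yes", "no", "no"]
retest_answers = ["yes", "yes", "no", "no", "no", "no", "no", "yes", "no", "yes"]
kappa = cohen_kappa(test_answers, retest_answers)
print(round(kappa, 3))
```

Here 8 of 10 answers match (observed agreement 0.80) while chance agreement is 0.52, so kappa is noticeably lower than the raw agreement rate.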
Papadopoulou, Soultana L; Exarchakos, Georgios; Christodoulou, Dimitrios; Theodorou, Stavroula; Beris, Alexandre; Ploumis, Avraam
2017-01-01
Introduction The Ohkuma questionnaire is a validated screening tool originally used to detect dysphagia among patients hospitalized in Japanese nursing facilities. Objective The purpose of this study is to evaluate the reliability and validity of the adapted Greek version of the Ohkuma questionnaire. Methods Following the steps for cross-cultural adaptation, we delivered the validated Ohkuma questionnaire to 70 patients (53 men, 17 women) who were either suffering from dysphagia or not. All of them completed the questionnaire a second time within a month. For all of them, we performed a bedside and VFSS study of dysphagia and asked participants to undergo a second VFSS screening, with the exception of nine individuals. Statistical analysis included measurement of internal consistency with Cronbach's α coefficient, reliability with Cohen's Kappa, Pearson's correlation coefficient and construct validity with categorical components, and One-Way Anova test. Results According to Cronbach's α coefficient (0.976) for total score, there was high internal consistency for the Ohkuma Dysphagia questionnaire. Test-retest reliability (Cohen's Kappa) ranged from 0.586 to 1.00, exhibiting acceptable stability. We also estimated the Pearson's correlation coefficient for the test-retest total score, which reached high levels (0.952; p = 0.000). The One-Way Anova test in the two measurement times showed statistically significant correlation in both measurements ( p = 0.02 and p = 0.016). Conclusion The adapted Greek version of the questionnaire is valid and reliable and can be used for the screening of dysphagia in the Greek-speaking patients.
A sensitive and reliable test instrument to assess swimming in rats with spinal cord injury.
Xu, Ning; Åkesson, Elisabet; Holmberg, Lena; Sundström, Erik
2015-09-15
For clinical translation of experimental spinal cord injury (SCI) research, evaluation of animal SCI models should include several sensorimotor functions. Validated and reliable assessment tools should be applicable to a wide range of injury severity. The BBB scale is the most widely used test instrument, but similar to most others it is used to assess open field ambulation. We have developed an assessment tool for swimming in rats with SCI, with high discriminative power and sensitivity to functional recovery after mild and severe injuries, without need for advanced test equipment. We studied various parameters of swimming in four groups of rats with thoracic SCI of different severity and a control group, for 8 weeks after surgery. Six parameters were combined in a multiple item scale, the Karolinska Institutet Swim Assessment Tool (KSAT). KSAT scores for all SCI groups showed consistent functional improvement after injury, and significant differences between the five experimental groups. The internal consistency, the inter-rater and the test-retest reliability were very high. The KSAT score was highly correlated to the cross-section area of white matter spared at the injury epicenter. Importantly, even after 8 weeks of recovery the KSAT score reliably discriminated normal animals from those inflicted by the mildest injury, and also displayed the recovery of the most severely injured rats. We conclude that this swim scale is an efficient and reliable tool to assess motor activity during swimming, and an important addition to the methods available for evaluating rat models of SCI. Copyright © 2015 Elsevier B.V. All rights reserved.
Drew Sayer, R; Tamer, Gregory G; Chen, Ningning; Tregellas, Jason R; Cornier, Marc-Andre; Kareken, David A; Talavage, Thomas M; McCrory, Megan A; Campbell, Wayne W
2016-10-01
The brain's reward system influences ingestive behavior and subsequently obesity risk. Functional magnetic resonance imaging (fMRI) is a common method for investigating brain reward function. This study sought to assess the reproducibility of fasting-state brain responses to visual food stimuli using BOLD fMRI. A priori brain regions of interest included bilateral insula, amygdala, orbitofrontal cortex, caudate, and putamen. Fasting-state fMRI and appetite assessments were completed by 28 women (n = 16) and men (n = 12) with overweight or obesity on 2 days. Reproducibility was assessed by comparing mean fasting-state brain responses and measuring test-retest reliability of these responses on the two testing days. Mean fasting-state brain responses on day 2 were reduced compared with day 1 in the left insula and right amygdala, but mean day 1 and day 2 responses were not different in the other regions of interest. With the exception of the left orbitofrontal cortex response (fair reliability), test-retest reliabilities of brain responses were poor or unreliable. fMRI-measured responses to visual food cues in adults with overweight or obesity show relatively good mean-level reproducibility but considerable within-subject variability. Poor test-retest reliability reduces the likelihood of observing true correlations and increases the necessary sample sizes for studies. © 2016 The Obesity Society.
dos Anjos, Daniela Brianne Martins; Rodrigues, Roberta Cunha Matheus; Padilha, Kátia Melissa; Pedrosa, Rafaela Batista dos Santos; Gallani, Maria Cecília Bueno Jayme
2016-01-01
ABSTRACT Objective: to evaluate the practicality, acceptability, and floor and ceiling effects, estimate the reliability, and verify the convergent construct validity of the instrument called the Heart Valve Disease Impact on Daily Life (IDCV) in patients with mitral and/or aortic heart valve disease. Method: data were obtained from 86 heart valve disease patients in 3 phases: a face-to-face interview for sociodemographic and clinical characterization, followed by two telephone interviews for application of the instrument (test and retest). Results: as for practicality and acceptability, the instrument was applied in an average time of 9.9 minutes and with 100% of responses, respectively. Ceiling and floor effects were observed for all domains, especially the floor effect. Reliability was tested using the test-retest pattern to give evidence of temporal stability of the measurement. Significant negative correlations of moderate to strong magnitude were found between the score of the generic question about the impact of the disease and the IDCV scores, which points to the convergent construct validity of the instrument. Conclusion: the instrument to measure the impact of valve heart disease on the patient's daily life showed evidence of reliability and validity when applied to patients with heart valve disease. PMID:27992024
Kang, Qing; Chan, Raymond C K; Li, Xiaoping; Arcelus, Jon; Yue, Ling; Huang, Jiabin; Gu, Lian; Fan, Qing; Zhang, Haiyin; Xiao, Zeping; Chen, Jue
2017-11-01
The study aimed to investigate the reliability and validity of the Chinese version of the eating attitudes test (EAT-26) among female adolescents and young adults in Mainland China. This scale was administered to 396 female eating disorder patients and 406 noneating disorder healthy controls; in addition, 35 healthy controls completed a retest after a 4-week interval. Tests for reliability, convergent validity and receiver operating characteristic analysis were performed to detect the psychometric properties. The EAT-26 demonstrated good internal consistency (Cronbach's alpha = 0.822-0.922), test-retest reliability (intraclass correlation coefficient = 0.817) and convergent validity (r = 0.450-0.750). The receiver operating characteristic analysis showed that cut-offs of 14 for anorexia nervosa and 15 for bulimia nervosa represented good compromises with approximate sensitivity (0.66-0.68) and specificity (0.85-0.86). Our findings provided evidence that the Chinese version of the EAT-26 was a psychometrically reliable and valid self-rating instrument for identifying people suffering from an eating disorder in Mainland China. A clinical cut-off range between 14 and 15 could be used, but caution should be exercised because of the low sensitivity of the tool. Copyright © 2017 John Wiley & Sons, Ltd and Eating Disorders Association.
Slagers, Anton J; Reininga, Inge H F; van den Akker-Scheek, Inge
2017-02-01
The ACL-Return to Sport after Injury scale (ACL-RSI) measures athletes' emotions, confidence in performance, and risk appraisal in relation to return to sport after ACL reconstruction. Aim of this study was to study the validity and reliability of the Dutch version of the ACL-RSI (ACL-RSI (NL)). Total 150 patients, who were 3-16 months postoperative, completed the ACL-RSI(NL) and 5 other questionnaires regarding psychological readiness to return to sports, knee-specific physical functioning, kinesiophobia, and health-specific locus of control. Construct validity of the ACL-RSI(NL) was determined with factor analysis and by exploring 10 hypotheses regarding correlations between ACL-RSI(NL) and the other questionnaires. For test-retest reliability, 107 patients (5-16 months postoperative) completed the ACL-RSI(NL) again 2 weeks after the first administration. Cronbach's alpha, Intraclass Correlation Coefficient (ICC), SEM, and SDC, were calculated. Bland-Altman analysis was conducted to assess bias between test and retest. Nine hypotheses (90%) were confirmed, indicating good construct validity. The ACL-RSI(NL) showed good internal consistency (Cronbach's alpha 0.94) and test-retest reliability (ICC 0.93). SEM was 5.5 and SDC was 15. A significant bias of 3.2 points between test and retest was found. Therefore, the ACL-RSI(NL) can be used to investigate psychological factors relevant to returning to sport after ACL reconstruction.
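The standard error of measurement (SEM) and smallest detectable change (SDC) reported above are commonly derived with SEM = SD × √(1 − ICC) and SDC = 1.96 × √2 × SEM. The abstract does not state which formulas the authors used, so the sketch below is an assumption; the function names and the standard deviation of 20.8 are hypothetical (the SD is back-solved for illustration only).

```python
import math


def sem_from_icc(sd, icc):
    """Standard error of measurement from a score SD and a reliability coefficient."""
    return sd * math.sqrt(1 - icc)


def sdc_from_sem(sem):
    """Smallest detectable change at the individual level (95% confidence)."""
    return 1.96 * math.sqrt(2) * sem


# SEM of 5.5 is the value reported in the abstract.
sdc = sdc_from_sem(5.5)
print(round(sdc, 1))

# Hypothetical SD of 20.8 combined with the reported ICC of 0.93.
sem = sem_from_icc(20.8, 0.93)
print(round(sem, 1))
```

Under these common formulas, the reported SEM of 5.5 gives an SDC that rounds to the reported value of 15, so the two figures in the abstract are mutually consistent.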
Sayer, R Drew; Tamer, Gregory G; Chen, Ningning; Tregellas, Jason R; Cornier, Marc-Andre; Kareken, David A; Talavage, Thomas M; McCrory, Megan A; Campbell, Wayne W
2016-01-01
Objective The brain’s reward system influences ingestive behavior and subsequently, obesity risk. Functional magnetic resonance imaging (fMRI) is a common method for investigating brain reward function. We sought to assess the reproducibility of fasting-state brain responses to visual food stimuli using BOLD fMRI. Methods A priori brain regions of interest included bilateral insula, amygdala, orbitofrontal cortex, caudate, and putamen. Fasting-state fMRI and appetite assessments were completed by 28 women (n=16) and men (n=12) with overweight or obesity on 2 days. Reproducibility was assessed by comparing mean fasting-state brain responses and measuring test-retest reliability of these responses on the 2 testing days. Results Mean fasting-state brain responses on Day 2 were reduced compared to Day 1 in the left insula and right amygdala, but mean Day 1 and Day 2 responses were not different in the other regions of interest. With the exception of the left orbitofrontal cortex response (fair reliability), test-retest reliabilities of brain responses were poor or unreliable. Conclusion fMRI-measured responses to visual food cues in adults with overweight or obesity show relatively good mean-level reproducibility, but considerable within-subject variability. Poor test-retest reliability reduces the likelihood of observing true correlations and increases the necessary sample sizes for studies. PMID:27542906
Suen, Yi-Nam; Cerin, Ester; Mellecker, Robin R
2014-07-18
Parents' perceived informal social control, defined as the informal ways residents intervene to create a safe and orderly neighbourhood environment, may influence young children's physical activity (PA) in the neighbourhood. This study aimed to develop and test the reliability of a scale of PA-related informal social control relevant to Chinese parents/caregivers of pre-schoolers (children aged 3 to 5 years) living in Hong Kong. Nominal Group Technique (NGT), a structured, multi-step brainstorming technique, was conducted with two groups of caregivers (mainly parents; n = 11) of Hong Kong pre-schoolers in June 2011. Items collected in the NGT sessions and those generated by a panel of experts were used to compile a list of items (n = 22) for a preliminary version of a questionnaire of informal social control. The newly-developed scale was tested with 20 Chinese-speaking parents/caregivers using cognitive interviews (August 2011). The modified scale, including all 22 original items of which a few were slightly reworded, was subsequently administered on two occasions, a week apart, to 61 Chinese parents/caregivers of Hong Kong pre-schoolers in early 2012. The test-retest reliability and internal consistency of the items and scale were examined using intraclass correlation coefficients (ICC), paired t-tests, relative percentages of shifts in responses to items, and Cronbach's α coefficient. Thirteen items generated by parents/caregivers and nine items generated by the panel of experts (total 22 items) were included in a first working version of the scale and classified into three subscales: "Personal involvement and general informal supervision", "Civic engagement for the creation of a better neighbourhood environment" and "Educating and assisting neighbourhood children". Twenty out of 22 items showed moderate to excellent test-retest reliability (ICC range: 0.40-0.81).
All three subscales of informal social control showed acceptable levels of internal consistency (Cronbach's α >0.70). A reliable scale examining PA-related informal social control relevant to Chinese parents/caregivers of pre-schoolers living in Hong Kong was developed. Further studies should examine the factorial validity of the scale, its associations with Chinese children's PA and its appropriateness for other populations of parents of young children.
van Weemen, B.; Kacaki, J.
1976-01-01
A modified haemagglutination inhibition test for rubella antibodies, using standardized freeze-dried reagents, was developed and compared with haemagglutination inhibition tests using fresh erythrocytes. This comparison was made in collaboration with six European laboratories. A total of 4205 serum samples were tested. The results show that: (1) Sensitivity and reliability of the modified test are good; (2) the modified test can be performed in polystyrene microtitration plates. PMID:789763
Koh, Yi Ling Eileen; Lua, Yi Hui Adela; Hong, Liyue; Bong, Huey Shin Shirley; Yeo, Ling Sui Jocelyn; Tsang, Li Ping Marianne; Ong, Kai Zhi; Wong, Sook Wai Samantha; Tan, Ngiap Chuan
2016-03-01
Essential hypertension often requires affected patients to self-manage their condition most of the time. Besides seeking regular medical review of their life-long condition to detect vascular complications, patients have to maintain healthy lifestyles in between physician consultations via diet and physical activity, and to take their medications according to their prescriptions. Their self-management ability is influenced by their self-efficacy capacity, which can be assessed using questionnaire-based tools. The "Hypertension Self-Care Profile" (HTN-SCP) is 1 such questionnaire assessing self-efficacy in the domains of "behavior," "motivation," and "self-efficacy." This study aims to determine the test-retest reliability of HTN-SCP in an English-literate Asian population using a web-based approach. Multiethnic Asian patients, aged 40 years and older, with essential hypertension were recruited from a typical public primary care clinic in Singapore. The investigators guided the patients to fill up the web-based 60-item HTN-SCP in English using a tablet or smartphone on the first visit and refilled the instrument 2 weeks later in the retest. Internal consistency and test-retest reliability were evaluated using Cronbach's Alpha and intraclass correlation coefficients (ICC), respectively. The t test was used to determine the relationship between the overall HTN-SCP scores of the patients and their self-reported self-management activities. A total of 160 patients completed the HTN-SCP during the initial test, from which 71 test-retest responses were completed. No floor or ceiling effect was found for the scores for the 3 subscales. Cronbach's Alpha coefficients were 0.857, 0.948, and 0.931 for "behavior," "motivation," and "self-efficacy" domains respectively, indicating high internal consistency. The item-total correlation ranges for the 3 scales were from 0.105 to 0.656 for Behavior, 0.401 to 0.808 for Motivation, 0.349 to 0.789 for Self-efficacy. 
The corresponding ICC scores of 0.671, 0.762, and 0.720 for these respective domains showed good test-retest reliability. The correlation of the HTN-SCP scores and patients' reported self-management measures were significant, except for keeping their food diary. HTN-SCP showed satisfactory internal consistency and test-retest reliability in an English literate Asian population. A web-based approach is feasible if similar studies are needed to validate its translated versions of the tool for wider application in the local multilingual population.
Utility and reliability of non-invasive muscle function tests in high-fat-fed mice.
Martinez-Huenchullan, Sergio F; McLennan, Susan V; Ban, Linda A; Morsch, Marco; Twigg, Stephen M; Tam, Charmaine S
2017-07-01
What is the central question of this study? Non-invasive muscle function tests have not been validated for use in the study of muscle performance in high-fat-fed mice. What is the main finding and its importance? This study shows that grip strength, hang wire and four-limb hanging tests are able to discriminate the muscle performance between chow-fed and high-fat-fed mice at different time points, with grip strength being reliable after 5, 10 and 20 weeks of dietary intervention. Non-invasive tests are commonly used for assessing muscle function in animal models. The value of these tests in obesity, a condition where muscle strength is reduced, is unclear. We investigated the utility of three non-invasive muscle function tests, namely grip strength (GS), hang wire (HW) and four-limb hanging (FLH), in C57BL/6 mice fed chow (chow group, n = 48) or a high-fat diet (HFD group, n = 48) for 20 weeks. Muscle function tests were performed at 5, 10 and 20 weeks. After 10 and 20 weeks, HFD mice had significantly reduced GS (in newtons; mean ± SD: 10 weeks chow, 1.89 ± 0.1 and HFD, 1.79 ± 0.1; 20 weeks chow, 1.99 ± 0.1 and HFD, 1.75 ± 0.1), FLH [in seconds per gram body weight; median (interquartile range): 10 weeks chow, 2552 (1337-4964) and HFD, 1230 (749-1994); 20 weeks chow, 2048 (765-3864) and HFD, 1036 (717-1855)] and HW reaches [n; median (interquartile range): 10 weeks chow, 4 (2-5) and HFD, 2 (1-3); 20 weeks chow, 3 (1-5) and HFD, 1 (0-2)] and higher falls [n; median (interquartile range): 10 weeks chow, 0 (0-2) and HFD, 3 (1-7); 20 weeks chow, 1 (0-4) and HFD, 8 (5-10)]. Grip strength was reliable in both dietary groups [intraclass correlation coefficient (ICC) = 0.5-0.8; P < 0.05], whereas FLH showed good reliability in chow (ICC = 0.7; P < 0.05) but not in HFD mice after 10 weeks (ICC < 0.5). Our data demonstrate that non-invasive muscle function tests are valuable and reliable tools for assessment of muscle strength and function in high-fat-fed mice. 
© 2017 The Authors. Experimental Physiology © 2017 The Physiological Society.
Carvalho, Flávia A; Morelhão, Priscila K; Franco, Marcia R; Maher, Chris G; Smeets, Rob J E M; Oliveira, Crystian B; Freitas Júnior, Ismael F; Pinto, Rafael Z
2017-02-01
Although there is some evidence for reliability and validity of self-report physical activity (PA) questionnaires in the general adult population, it is unclear whether we can assume similar measurement properties in people with chronic low back pain (LBP). To determine the test-retest reliability of the International Physical Activity Questionnaire (IPAQ) long-version and the Baecke Physical Activity Questionnaire (BPAQ) and their criterion-related validity against data derived from accelerometers in patients with chronic LBP. Cross-sectional study. Patients with non-specific chronic LBP were recruited. Each participant attended the clinic twice (one week interval) and completed self-report PA. Accelerometer measures >7 days included time spent in moderate-and-vigorous physical activity, steps/day, counts/minute, and vector magnitude counts/minute. Intraclass Correlation Coefficients (ICC) and the Bland and Altman method were used to determine reliability, and Spearman's rho correlation was used for criterion-related validity. A total of 73 patients were included in our analyses. The reliability analyses revealed that the BPAQ and its subscales have moderate to excellent reliability (ICC2,1: 0.61 to 0.81), whereas IPAQ and most IPAQ domains (except walking) showed poor reliability (ICC2,1: 0.20 to 0.40). The Bland and Altman method revealed larger discrepancies for the IPAQ. For the validity analysis, questionnaire and accelerometer measures showed at best fair correlation (rho < 0.37). Although the BPAQ showed better reliability than the IPAQ long-version, both questionnaires did not demonstrate acceptable validity against accelerometer data. These findings suggest that questionnaire and accelerometer PA measures should not be used interchangeably in this population. Copyright © 2016 Elsevier Ltd. All rights reserved.
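The ICC(2,1) coefficient cited above is the two-way random-effects, absolute-agreement, single-measurement form, obtained from the mean squares of a subjects-by-sessions table. The following is a minimal sketch of that textbook calculation, not the authors' code; the function name `icc_2_1` and the scores are hypothetical.

```python
def icc_2_1(table):
    """ICC(2,1) for a list of n subjects, each a list of k scores (one per session or rater)."""
    n, k = len(table), len(table[0])
    grand = sum(map(sum, table)) / (n * k)
    row_means = [sum(row) / k for row in table]
    col_means = [sum(col) / n for col in zip(*table)]
    # Two-way ANOVA sums of squares: subjects (rows), sessions (columns), residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in table for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical questionnaire scores for five participants at two clinic visits.
visits = [[5.0, 5.5], [7.5, 7.0], [3.0, 4.0], [9.0, 8.5], [6.0, 6.5]]
icc = icc_2_1(visits)
print(round(icc, 2))
```

Because this form penalizes systematic shifts between sessions as well as random error, it can be lower than a Pearson correlation computed on the same data.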
Van de Vossenberg, B T L H; Van der Straten, M J
2014-08-01
The genus Spodoptera comprises 31 species, 4 of which are listed as quarantine pests for the European Union: Spodoptera eridania (Cramer), Spodoptera frugiperda (Smith), Spodoptera littoralis (Boisduval), and Spodoptera litura (F.). In international trade, the earlier life stages (eggs and larvae) are the stages most frequently intercepted at points of inspection, which limits the possibilities for morphological identification. To realize a rapid and reliable identification for all stages, we developed and validated four simplex real-time polymerase chain reaction identification tests based on the mitochondrial cytochrome b gene using dual-labeled hydrolysis probes. Method validation on dilutions of extracted DNA of the target organisms showed that low levels of template (up to 0.2-100 pg) can reliably be identified. No cross-reactivity was observed with 14 nontarget Spodoptera and 5 non-Spodoptera species in the specific Spodoptera tests. The tests proved to be repeatable, reproducible (both 100%), and robust. The new Spodoptera tests have proven to be suitable tools for routine identification of all life stages of S. eridania, S. frugiperda, S. littoralis, and S. litura.
Stone, Lisanne L; Janssens, Jan M A M; Vermulst, Ad A; Van Der Maten, Marloes; Engels, Rutger C M E; Otten, Roy
2015-01-01
The Strengths and Difficulties Questionnaire (SDQ) is one of the most widely used screening instruments. Although there is a large body of research investigating its psychometric properties, reliability and validity are not yet fully tested using modern techniques. Therefore, we investigate reliability, construct validity, measurement invariance, and predictive validity of the parent and teacher version in children aged 4-7. In addition, we intend to replicate previous studies by investigating test-retest reliability and criterion validity. In a Dutch community sample 2,238 teachers and 1,513 parents filled out questionnaires regarding problem behaviors and parenting, while 1,831 children reported on sociometric measures at T1. These children were followed-up during three consecutive years. Reliability was examined using Cronbach's alpha and McDonald's omega, construct validity was examined by Confirmatory Factor Analysis, and predictive validity was examined by calculating developmental profiles and linking these to measures of inadequate parenting, parenting stress and social preference. Further, mean scores and percentiles were examined in order to establish norms. Omega was consistently higher than alpha regarding reliability. The original five-factor structure was replicated, and measurement invariance was established on a configural level. Further, higher SDQ scores were associated with future indices of higher inadequate parenting, higher parenting stress and lower social preference. Finally, previous results on test-retest reliability and criterion validity were replicated. This study is the first to show SDQ scores are predictively valid, attesting to the feasibility of the SDQ as a screening instrument. Future research into predictive validity of the SDQ is warranted.
Urdu translation of the Hamilton Rating Scale for Depression: Results of a validation study
Hashmi, Ali M.; Naz, Shahana; Asif, Aftab; Khawaja, Imran S.
2016-01-01
Objective: To develop a standardized validated version of the Hamilton Rating Scale for Depression (HAM-D) in Urdu. Methods: After translation of the HAM-D into the Urdu language following standard guidelines, the final Urdu version (HAM-D-U) was administered to 160 depressed outpatients. Inter-item correlation was assessed by calculating Cronbach alpha. Correlation between HAM-D-U scores at baseline and after a 2-week interval was evaluated for test-retest reliability. Moreover, scores of two clinicians on HAM-D-U were compared for inter-rater reliability. For establishing concurrent validity, scores of HAM-D-U and BDI-U were compared by using Spearman correlation coefficient. The study was conducted at Mayo Hospital, Lahore, from May to December 2014. Results: The Cronbach alpha for HAM-D-U was 0.71. Composite scores for HAM-D-U at baseline and after a 2-week interval were also highly correlated with each other (Spearman correlation coefficient 0.83, p-value < 0.01), indicating good test-retest reliability. Composite scores for HAM-D-U and BDI-U were positively correlated with each other (Spearman correlation coefficient 0.85, p < 0.01), indicating good concurrent validity. Scores of two clinicians for HAM-D-U were also positively correlated (Spearman correlation coefficient 0.82, p-value < 0.01), indicating good inter-rater reliability. Conclusion: The HAM-D-U is a valid and reliable instrument for the assessment of depression. It shows good inter-rater and test-retest reliability. The HAM-D-U can be a tool either for clinical management or research. PMID:28083049
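Cronbach's alpha, reported in this and several neighbouring abstracts, can be computed directly from an item-by-respondent score table. The sketch below uses the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores); the function name `cronbach_alpha` and the ratings are hypothetical and are not taken from any of the studies listed here.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents' item scores.

    `responses` holds one row per respondent and one column per item.
    Sample variances (n - 1 denominator) are used throughout.
    """
    k = len(responses[0])  # number of items
    item_vars = [variance([row[j] for row in responses]) for j in range(k)]
    total_var = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical ratings: four respondents answering three items
ratings = [[2, 3, 3], [4, 4, 5], [1, 2, 1], [3, 3, 4]]
print(round(cronbach_alpha(ratings), 3))
```

Alpha rises with both the average inter-item correlation and the number of items, so a long scale can reach a high alpha even when individual items are only modestly related.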
Test battery for measuring the perception and recognition of facial expressions of emotion
Wilhelm, Oliver; Hildebrandt, Andrea; Manske, Karsten; Schacht, Annekathrin; Sommer, Werner
2014-01-01
Despite the importance of perceiving and recognizing facial expressions in everyday life, there is no comprehensive test battery for the multivariate assessment of these abilities. As a first step toward such a compilation, we present 16 tasks that measure the perception and recognition of facial emotion expressions, and data illustrating each task's difficulty and reliability. The scoring of these tasks focuses on either the speed or accuracy of performance. A sample of 269 healthy young adults completed all tasks. In general, accuracy and reaction time measures for emotion-general scores showed acceptable and high estimates of internal consistency and factor reliability. Emotion-specific scores yielded lower reliabilities, yet high enough to encourage further studies with such measures. Analyses of task difficulty revealed that all tasks are suitable for measuring emotion perception and emotion recognition related abilities in normal populations. PMID:24860528
A measure of state persecutory ideation for experimental studies.
Freeman, Daniel; Pugh, Katherine; Green, Catherine; Valmaggia, Lucia; Dunn, Graham; Garety, Philippa
2007-09-01
Experimental research is increasingly important in developing the understanding of paranoid thinking. An assessment measure of persecutory ideation is necessary for such work. We report the reliability and validity of the first state measure of paranoia: The State Social Paranoia Scale. The items in the measure conform to a recent definition in which persecutory thinking has the 2 elements of feared harm and perpetrator intent. The measure was tested with 164 nonclinical participants and 21 individuals at high risk of psychosis with attenuated positive symptoms. The participants experienced a social situation presented in virtual reality and completed the new measure. The State Social Paranoia Scale was found to have excellent internal reliability, adequate test-retest reliability, clear convergent validity as assessed by both independent interviewer ratings and self-report measures, and showed divergent validity with measures of positive and neutral thinking. The measure of paranoia in a recent social situation has good psychometric properties.
Tork, Hanan; Lohrmann, Christa; Dassen, Theo
2008-03-01
The objectives of this study were to examine the psychometric properties of the modified Care Dependency Scale in a pediatric setting and to explore the extent of dependency of school-aged children regarding their self-care. The data were collected from 130 hospitalized children, aged 6-12 years. The reliability was determined by Cronbach's alpha, which showed a high level of consistency. The subsequent inter-rater reliability revealed moderate-to-substantial agreement. The criterion-related validity was tested by comparing the sum scores of the Care Dependency Scale for Paediatrics and the Visual Analog Scale. Factor analysis was used to investigate the construct validity and resulted in a one-factor solution. In conclusion, this study provides evidence that the Care Dependency Scale for Paediatrics is a valid and reliable measure that offers a comprehensive assessment from a nursing perspective and enables nurses to help children acquire independence.
Ishii, Hitoshi; Shimatsu, Akira; Okimura, Yasuhiko; Tanaka, Toshiaki; Hizuka, Naomi; Kaji, Hidesuke; Hanew, Kunihiko; Oki, Yutaka; Yamashiro, Sayuri; Takano, Koji; Chihara, Kazuo
2012-01-01
To develop and validate the Adult Hypopituitarism Questionnaire (AHQ) as a disease-specific, self-administered questionnaire for evaluation of quality of life (QOL) in adult patients with hypopituitarism. We developed and validated this new questionnaire, using a standardized procedure which included item development, pilot-testing and psychometric validation. Of the patients who participated in psychometric validation, those whose clinical conditions were judged to be stable were asked to answer the survey questionnaire twice, in order to assess test-retest reliability. Content validity of the initial questionnaire was evaluated via two pilot tests. After these tests, we made minor revisions and finalized the initial version of the questionnaire. The questionnaire was constructed with two domains, one psycho-social and the other physical. For psychometric assessment, analyses were performed on the responses of 192 adult patients with various types of hypopituitarism. The intraclass correlations of the respective domains were 0.91 and 0.95, and the Cronbach's alpha coefficients were 0.96 and 0.95, indicating adequate test-retest reliability and internal consistency for each domain. For known-group validity, patients with hypopituitarism due to hypothalamic disorder showed significantly lower scores in 11 out of 13 sub-domains compared to those who had hypopituitarism due to pituitary disorder. Regarding construct validity, the domain structure was found to be almost the same as that initially hypothesized. Exploratory factor analysis (n = 228) demonstrated that the two domains consisted of six and seven sub-domains, respectively. The AHQ showed good reliability and validity for evaluating QOL in adult patients with hypopituitarism.
Brown, Laura J E; Adlam, Tim; Hwang, Faustina; Khadra, Hassan; Maclean, Linda M; Rudd, Bridey; Smith, Tom; Timon, Claire; Williams, Elizabeth A; Astell, Arlene J
2016-08-01
Patterns of cognitive change over micro-longitudinal timescales (i.e., ranging from hours to days) are associated with a wide range of age-related health and functional outcomes. However, practical issues of conducting high-frequency assessments make investigations of micro-longitudinal cognition costly and burdensome to run. One way of addressing this is to develop cognitive assessments that can be performed by older adults, in their own homes, without a researcher being present. Here, we address the question of whether reliable and valid cognitive data can be collected over micro-longitudinal timescales using unsupervised cognitive tests. In study 1, 48 older adults completed two touchscreen cognitive tests, on three occasions, in controlled conditions, alongside a battery of standard tests of cognitive functions. In study 2, 40 older adults completed the same two computerized tasks on multiple occasions, over three separate week-long periods, in their own homes, without a researcher present. Here, the tasks were incorporated into a wider touchscreen system (Novel Assessment of Nutrition and Ageing (NANA)) developed to assess multiple domains of health and behavior. Standard tests of cognitive function were also administered prior to participants using the NANA system. Performance on the two "NANA" cognitive tasks showed convergent validity with, and similar levels of reliability to, the standard cognitive battery in both studies. Completion and accuracy rates were also very high. These results show that reliable and valid cognitive data can be collected from older adults using unsupervised computerized tests, thus affording new opportunities for the investigation of cognitive.
Chiarotto, Alessandro; Vanti, Carla; Ostelo, Raymond W; Ferrari, Silvano; Tedesco, Giuseppe; Rocca, Barbara; Pillastrini, Paolo; Monticone, Marco
2015-11-01
The Pain Self-Efficacy Questionnaire (PSEQ) is a patient self-reported measurement instrument that evaluates pain self-efficacy beliefs in patients with chronic pain. The measurement properties of the PSEQ have been tested in its original and translated versions, showing satisfactory results for validity and reliability. The aims of this study were twofold: (1) to translate the PSEQ into Italian through a process of cross-cultural adaptation, and (2) to test the measurement properties of the Italian PSEQ (PSEQ-I). The cross-cultural adaptation was completed in 5 months without omitting any item of the original PSEQ. Measurement properties were tested in 165 patients with chronic low back pain (CLBP) (65% women, mean age 49.9 years). Factor analysis confirmed the one-factor structure of the questionnaire. Internal consistency (Cronbach's α = 0.94) and test-retest reliability (ICCagreement = 0.82) of the PSEQ-I showed good results. The smallest detectable change was equal to 15.69 scale points. The PSEQ-I displayed a high construct validity by meeting more than 75% of a priori hypotheses on correlations with measurement instruments assessing pain intensity, disability, anxiety, depression, pain catastrophizing, fear of movement, and coping strategies. Additionally, the PSEQ-I differentiated between patients who were taking pain medication and those who were not. The results of this study suggest that the PSEQ-I can be used as a valid and reliable tool in Italian patients with CLBP. © 2014 World Institute of Pain.
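The abstract above does not state how the smallest detectable change (SDC) was derived. A commonly used convention, assumed here purely for illustration, is SDC = 1.96 × √2 × SEM, where the standard error of measurement (SEM) is often estimated as SD × √(1 − ICC). The helper names below are hypothetical.

```python
import math


def sem_from_icc(sd, icc):
    """Standard error of measurement estimated from a sample SD and an ICC."""
    return sd * math.sqrt(1.0 - icc)


def smallest_detectable_change(sem, z=1.96):
    """SDC for an individual: z * sqrt(2) * SEM (95% level by default)."""
    return z * math.sqrt(2.0) * sem


# Working backward under this assumed convention, an SDC of 15.69 points
# would correspond to an SEM of roughly 15.69 / (1.96 * sqrt(2)).
implied_sem = 15.69 / (1.96 * math.sqrt(2.0))
print(round(implied_sem, 2))
print(round(smallest_detectable_change(implied_sem), 2))
```

Under this convention, a change in an individual's score smaller than the SDC cannot be distinguished from measurement error with 95% confidence.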
The validation of the visual analogue scale for patient satisfaction after total hip arthroplasty.
Brokelman, Roy B G; Haverkamp, Daniel; van Loon, Corné; Hol, Annemiek; van Kampen, Albert; Veth, Rene
2012-06-01
INTRODUCTION: Patient satisfaction is becoming more important in our modern health care system. The assessment of satisfaction is difficult because it is a multifactorial item for which no gold standard exists. One of the potential methods of measuring satisfaction is by using the well-known visual analogue scale (VAS). In this study, we validated VAS for satisfaction. PATIENT AND METHODS: In this prospective study, we studied 147 patients (153 hips). The construct validity was measured using the Spearman correlation test that compares the satisfaction VAS with the Harris hip score, pain VAS at rest and during activity, Oxford hip score, Short Form 36 and Western Ontario McMaster Universities Osteoarthritis Index. The reliability was tested using the intra-class coefficient. RESULTS: The Pearson correlation test showed correlations in the range of 0.40-0.80. The satisfaction VAS had a high correlation with the pain VAS and Oxford hip score, which could mean that pain is one of the most important factors in patient satisfaction. The intra-class coefficient was 0.95. CONCLUSIONS: There is a moderate to marked degree of correlation between the satisfaction VAS and the currently available subjective and objective scoring systems. The intra-class coefficient of 0.95 indicates an excellent test-retest reliability. The VAS satisfaction is a simple instrument to quantify the satisfaction of a patient after total hip arthroplasty. In this study, we showed that the satisfaction VAS has a good validity and reliability.
McCurdy, M; Bellows, A; Deng, D; Leppert, M; Mahone, E; Pritchard, A
2015-01-01
Reliable and valid screening and assessment tools are necessary to identify children at risk for neurodevelopmental disabilities who may require additional services. This study evaluated the test-retest reliability of the Capute Scales in a high-risk sample, hypothesizing adequate reliability across 6- and 12-month intervals. Capute Scales scores (N = 66) were collected via retrospective chart review from a NICU follow-up clinic within a large urban medical center spanning three age-ranges: 12-18, 19-24, and 25-36 months. On average, participants were classified as very low birth weight and premature. Reliability of the Capute Scales was evaluated with intraclass correlation coefficients across length of test-retest interval, age at testing, and degree of neonatal complications. The Capute Scales demonstrated high reliability, regardless of length of test-retest interval (ranging from 6 to 14 months) or age of participant, for all index scores, including overall Developmental Quotient (DQ), language-based skill index (CLAMS) and nonverbal reasoning index (CAT). Linear regressions revealed that greater neonatal risk was related to poorer test-retest reliability; however, reliability coefficients remained strong. The Capute Scales afford clinicians a reliable and valid means of screening and assessing for neurodevelopmental delay within high-risk infant populations.
Reliability of the Wii Balance Board in kayak
Vando, Stefano; Laffaye, Guillaume; Masala, Daniele; Falese, Lavinia; Padulo, Johnny
2015-01-01
Summary Background: the seat of the kayaker represents the principal contact point for expressing mechanical energy. Methods: we therefore investigated the reliability of the Wii Balance Board measures in the kayak vs. on the ground. Results: the Bland-Altman test showed a low systematic bias on the ground (2.85%) and in the kayak (−2.13%), and the intraclass correlation coefficient was 0.996. Conclusion: the Wii Balance Board is useful to assess postural sway in kayak. PMID:25878987
Aandstad, Anders; Holtberget, Kristian; Hageberg, Rune; Holme, Ingar; Anderssen, Sigmund A
2014-02-01
Previous studies show that body composition is related to injury risk and physical performance in soldiers. Thus, valid methods for measuring body composition in military personnel are needed. The frequently used body mass index method is not a valid measure of body composition in soldiers, but reliability and validity of alternative field methods are less investigated in military personnel. Thus, we carried out test and retest of skinfold (SKF), single frequency bioelectrical impedance analysis (SF-BIA), and multifrequency bioelectrical impedance analysis measurements in 65 male and female soldiers. Several validated equations were used to predict percent body fat from these methods. Dual-energy X-ray absorptiometry was also measured, and acted as the criterion method. Results showed that SF-BIA was the most reliable method in both genders. In women, SF-BIA was also the most valid method, whereas SKF or a combination of SKF and SF-BIA produced the highest validity in men. Reliability and validity varied substantially among the equations examined. The best methods and equations produced test-retest 95% limits of agreement below ±1% points, whereas the corresponding validity figures were ±3.5% points. Each investigator and practitioner must consider whether such measurement errors are acceptable for their specific use. Reprint & Copyright © 2014 Association of Military Surgeons of the U.S.
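The 95% limits of agreement quoted above come from the Bland-Altman approach: the mean of the paired differences gives the systematic bias, and bias ± 1.96 × SD of the differences gives the limits. The sketch below illustrates that calculation only; the function name `bland_altman` and the percent-body-fat values are hypothetical and not drawn from the study.

```python
from statistics import mean, stdev


def bland_altman(first, second, z=1.96):
    """Return (bias, lower limit, upper limit) for paired measurements.

    Bias is the mean of (first - second); the limits of agreement are
    bias -/+ z times the sample SD of the paired differences.
    """
    diffs = [a - b for a, b in zip(first, second)]
    bias = mean(diffs)
    spread = z * stdev(diffs)
    return bias, bias - spread, bias + spread


# Hypothetical percent body fat from a test and a retest session
test_session = [18.2, 22.5, 15.1, 27.3, 20.0]
retest_session = [18.6, 22.1, 15.4, 27.0, 20.3]
bias, lower, upper = bland_altman(test_session, retest_session)
print(round(bias, 2), round(lower, 2), round(upper, 2))
```

Whether limits of a given width are acceptable is a judgment about the intended use, which is the point the abstract makes in its closing sentences.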
Silva, Rita Oliveira da; Gomes, Mariano Tamura Vieira; Castro, Rodrigo de Aquino; Bonduki, Cláudio Emílio; Girão, Manoel João Batista Castello
2016-10-01
Purpose: To translate into Portuguese, culturally adapt and validate the Uterine Fibroid Symptom - Quality of Life (UFS-QoL) questionnaire for Brazilian women with uterine leiomyoma. Methods: Initially, the UFS-QoL questionnaire was translated into Brazilian Portuguese in accordance with international standards, with subsequent cultural, structural, conceptual and semantic adaptations, so that patients were able to properly answer the questionnaire. Fifty patients with uterine leiomyoma and 19 patients without the disease, confirmed by abdominal pelvic examination and/or transvaginal ultrasound, were selected at the outpatient clinics of the Department of Gynecology of the Universidade Federal de São Paulo (Unifesp). The UFS-QoL questionnaire was administered to all women twice on the same day, with two different interviewers, with an interval of 15 minutes between interviews. After 15 days, the questionnaire was re-administered by the first interviewer. Reliability (internal consistency and test-retest), construct and discriminative validity were tested to ratify the questionnaire. Results: The reliability of the instrument was assessed by Cronbach's α coefficient with an overall result of 0.97, indicating high reliability. The survey results showed a high correlation (ρ = 0.94; p ≤ 0.001). Conclusion: The UFS-QoL questionnaire was successfully adapted to the Brazilian Portuguese language and Brazilian culture, showing reliability and validity. Thieme Publicações Ltda Rio de Janeiro, Brazil.