ERIC Educational Resources Information Center
Ghazali, Nor Hasnida Md
2016-01-01
A valid, reliable and practical instrument is needed to evaluate the implementation of the school-based assessment (SBA) system. The aim of this study is to develop and assess the validity and reliability of an instrument to measure the perception of teachers towards the SBA implementation in schools. The instrument is developed based on a…
Charlton, Paula C; Mentiplay, Benjamin F; Pua, Yong-Hao; Clark, Ross A
2015-05-01
Traditional methods of assessing joint range of motion (ROM) involve specialized tools that may not be widely available to clinicians. This study assesses the reliability and validity of a custom Smartphone application for assessing hip joint range of motion. Intra-tester reliability with concurrent validity. Passive hip joint range of motion was recorded for seven different movements in 20 males on two separate occasions. Data from a Smartphone, bubble inclinometer and a three dimensional motion analysis (3DMA) system were collected simultaneously. Intraclass correlation coefficients (ICCs), coefficients of variation (CV) and standard error of measurement (SEM) were used to assess reliability. To assess validity of the Smartphone application and the bubble inclinometer against the three dimensional motion analysis system, intraclass correlation coefficients and fixed and proportional biases were used. The Smartphone demonstrated good to excellent reliability (ICCs>0.75) for four out of the seven movements, and moderate to good reliability for the remaining three movements (ICC=0.63-0.68). Additionally, the Smartphone application displayed comparable reliability to the bubble inclinometer. The Smartphone application displayed excellent validity when compared to the three dimensional motion analysis system for all movements (ICCs>0.88) except one, which displayed moderate to good validity (ICC=0.71). Smartphones are portable and widely available tools that are mostly reliable and valid for assessing passive hip range of motion, with potential for large-scale use when a bubble inclinometer is not available. However, caution must be taken in its implementation as some movement axes demonstrated only moderate reliability. Copyright © 2014 Sports Medicine Australia. Published by Elsevier Ltd. All rights reserved.
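The abstract above reports reliability as ICCs, coefficients of variation and standard error of measurement without giving formulas. A minimal sketch of two commonly used definitions follows, assuming SEM = SD × sqrt(1 − ICC) and CV = 100 × SD / mean; the study may have computed these differently, and the function names and numbers are illustrative only.

```python
import math


def standard_error_of_measurement(sd: float, icc: float) -> float:
    """SEM from the between-subject SD and a reliability coefficient (ICC).

    Uses the common textbook form SEM = SD * sqrt(1 - ICC); this is an
    assumption, not necessarily the formula used in the study above.
    """
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie in [0, 1]")
    return sd * math.sqrt(1.0 - icc)


def coefficient_of_variation(mean: float, sd: float) -> float:
    """CV expressed as a percentage of the mean."""
    if mean == 0:
        raise ValueError("mean must be non-zero")
    return 100.0 * sd / mean


# Hypothetical values: SD of 5 degrees and ICC of 0.75 for one movement.
sem_degrees = standard_error_of_measurement(5.0, 0.75)
cv_percent = coefficient_of_variation(50.0, 5.0)
```

Under this definition, a lower ICC inflates the SEM for the same spread of scores, which is one reason the two statistics are usually reported together.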
Gentile, Douglas A; Humphrey, Jeremy; Walsh, David A
2005-06-01
This review is organized around studies relevant to testing the reliability and validity of ratings systems. Specifically, the interrater reliability, consistency, temporal stability, content validity, construct validity, and criterion validity of media ratings systems are reviewed. Data that are related to testing the "forbidden fruit" and "tainted fruit" hypotheses also are reviewed. Several changes are recommended to improve the ratings systems, including the creation of a universal ratings system that could be applied equally to all media. The research reviewed here can provide a guide for how to construct a reliable, valid, and more useful ratings system. This is important because the decisions that parents make regarding their children's media use can be only as good as the information to which the parents have access.
NASA Astrophysics Data System (ADS)
Zhang, Ding; Zhang, Yingjie
2017-09-01
A framework for reliability and maintenance analysis of job shop manufacturing systems is proposed in this paper. An efficient preventive maintenance (PM) policy in terms of failure effects analysis (FEA) is proposed. Subsequently, reliability evaluation and component importance measurement based on FEA are performed under the PM policy. A job shop manufacturing system is used to validate the reliability evaluation and dynamic maintenance policy. The results are compared with existing methods, and the effectiveness of the approach is validated. Commonly misunderstood issues, such as network modelling, vulnerability identification, the evaluation criteria of repairable systems, and PM policy during manufacturing system reliability analysis, are clarified. This framework can support reliability optimisation and rational allocation of maintenance resources for job shop manufacturing systems.
Soleymani, Zahra; Joveini, Ghodsiye; Baghestani, Ahmad Reza
2015-03-01
This study developed a Farsi language Communication Function Classification System and then tested its reliability and validity. Communication Function Classification System is designed to classify the communication functions of individuals with cerebral palsy. Up until now, there has been no instrument for assessment of this communication function in Iran. The English Communication Function Classification System was translated into Farsi and cross-culturally modified by a panel of experts. Professionals and parents then assessed the content validity of the modified version. A backtranslation of the Farsi version was confirmed by the developer of the English Communication Function Classification System. Face validity was assessed by therapists and parents of 10 patients. The Farsi Communication Function Classification System was administered to 152 individuals with cerebral palsy (age, 2 to 18 years; median age, 10 years; mean age, 9.9 years; standard deviation, 4.3 years). Inter-rater reliability was analyzed between parents, occupational therapists, and speech and language pathologists. The test-retest reliability was assessed for 75 patients with a 14 day interval between tests. The inter-rater reliability of the Communication Function Classification System was 0.81 between speech and language pathologists and occupational therapists, 0.74 between parents and occupational therapists, and 0.88 between parents and speech and language pathologists. The test-retest reliability was 0.96 for occupational therapists, 0.98 for speech and language pathologists, and 0.94 for parents. The findings suggest that the Farsi version of Communication Function Classification System is a reliable and valid measure that can be used in clinical settings to assess communication function in patients with cerebral palsy. Copyright © 2015 Elsevier Inc. All rights reserved.
Chang, Jasper O; Levy, Susan S; Seay, Seth W; Goble, Daniel J
2014-05-01
Recent guidelines advocate that sports medicine professionals use balance tests to assess sensorimotor status in the management of concussions. The present study sought to determine whether a low-cost balance board could provide a valid, reliable, and objective means of performing this balance testing. Criterion validity testing relative to a gold standard and 7 day test-retest reliability. University biomechanics laboratory. Thirty healthy young adults. Balance ability was assessed on 2 days separated by 1 week using (1) a gold standard measure (ie, scientific grade force plate), (2) a low-cost Nintendo Wii Balance Board (WBB), and (3) the Balance Error Scoring System (BESS). Validity of the WBB center of pressure path length and BESS scores was determined relative to the force plate data. Test-retest reliability was established based on intraclass correlation coefficients. Composite scores for the WBB had excellent validity (r = 0.99) and test-retest reliability (R = 0.88). Both the validity (r = 0.10-0.52) and test-retest reliability (r = 0.61-0.78) were lower for the BESS. These findings demonstrate that a low-cost balance board can provide improved balance testing accuracy/reliability compared with the BESS. This approach provides a potentially more valid/reliable, yet affordable, means of assessing sports-related concussion compared with current methods.
Reliability of human-supervised formant-trajectory measurement for forensic voice comparison.
Zhang, Cuiling; Morrison, Geoffrey Stewart; Ochoa, Felipe; Enzinger, Ewald
2013-01-01
Acoustic-phonetic approaches to forensic voice comparison often include human-supervised measurement of vowel formants, but the reliability of such measurements is a matter of concern. This study assesses the within- and between-supervisor variability of three sets of formant-trajectory measurements made by each of four human supervisors. It also assesses the validity and reliability of forensic-voice-comparison systems based on these measurements. Each supervisor's formant-trajectory system was fused with a baseline mel-frequency cepstral-coefficient system, and performance was assessed relative to the baseline system. Substantial improvements in validity were found for all supervisors' systems, but some supervisors' systems were more reliable than others.
Validity and Reliability of the 8-Item Work Limitations Questionnaire.
Walker, Timothy J; Tullar, Jessica M; Diamond, Pamela M; Kohl, Harold W; Amick, Benjamin C
2017-12-01
Purpose: To evaluate factorial validity, scale reliability, test-retest reliability, convergent validity, and discriminant validity of the 8-item Work Limitations Questionnaire (WLQ) among employees from a public university system. Methods: A secondary analysis using de-identified data from employees who completed an annual Health Assessment between the years 2009-2015 tested research aims. Confirmatory factor analysis (CFA) (n = 10,165) tested the latent structure of the 8-item WLQ. Scale reliability was determined using a CFA-based approach while test-retest reliability was determined using the intraclass correlation coefficient. Convergent/discriminant validity was tested by evaluating relations between the 8-item WLQ with health/performance variables for convergent validity (health-related work performance, number of chronic conditions, and general health) and demographic variables for discriminant validity (gender and institution type). Results: A 1-factor model with three correlated residuals demonstrated excellent model fit (CFI = 0.99, TLI = 0.99, RMSEA = 0.03, and SRMR = 0.01). The scale reliability was acceptable (0.69, 95% CI 0.68-0.70) and the test-retest reliability was very good (ICC = 0.78). Low-to-moderate associations were observed between the 8-item WLQ and the health/performance variables while weak associations were observed between the demographic variables. Conclusions: The 8-item WLQ demonstrated sufficient reliability and validity among employees from a public university system. Results suggest the 8-item WLQ is a usable alternative for studies when the more comprehensive 25-item WLQ is not available.
Casartelli, Nicola; Müller, Roland; Maffiuletti, Nicola A
2010-11-01
The aim of the present study was to verify the validity and reliability of the Myotest accelerometric system (Myotest SA, Sion, Switzerland) for the assessment of vertical jump height. Forty-four male basketball players (age range: 9-25 years) performed series of squat, countermovement and repeated jumps during 2 identical test sessions separated by 2-15 days. Flight height was simultaneously quantified with the Myotest system and validated photoelectric cells (Optojump). Two calculation methods were used to estimate the jump height from Myotest recordings: flight time (Myotest-T) and vertical takeoff velocity (Myotest-V). Concurrent validity was investigated comparing Myotest-T and Myotest-V to the criterion method (Optojump), and test-retest reliability was also examined. As regards validity, Myotest-T overestimated jumping height compared to Optojump (p < 0.001) with a systematic bias of approximately 7 cm, even though random errors were low (2.7 cm) and intraclass correlation coefficients (ICCs) were high (>0.98), that is, excellent validity. Myotest-V overestimated jumping height compared to Optojump (p < 0.001), with high random errors (>12 cm), high limits of agreement ratios (>36%), and low ICCs (<0.75), that is, poor validity. As regards reliability, Myotest-T showed high ICCs (range: 0.92-0.96), whereas Myotest-V showed low ICCs (range: 0.56-0.89), and high random errors (>9 cm). In conclusion, Myotest-T is a valid and reliable method for the assessment of vertical jump height, and its use is legitimate for field-based evaluations, whereas Myotest-V is neither valid nor reliable.
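The two calculation methods named above, flight time and vertical takeoff velocity, correspond to standard projectile-motion formulas. The sketch below illustrates them under the usual assumptions (takeoff and landing at the same body position, no air resistance); it is not the Myotest device's internal algorithm, and the function names and sample inputs are hypothetical.

```python
G = 9.81  # gravitational acceleration in m/s^2 (assumed standard value)


def jump_height_from_flight_time(flight_time_s: float) -> float:
    """Jump height in metres from flight time: h = g * t^2 / 8.

    Follows from the rise phase lasting half of the total flight time.
    """
    if flight_time_s < 0:
        raise ValueError("flight time must be non-negative")
    return G * flight_time_s ** 2 / 8.0


def jump_height_from_takeoff_velocity(velocity_m_s: float) -> float:
    """Jump height in metres from vertical takeoff velocity: h = v^2 / (2 g)."""
    return velocity_m_s ** 2 / (2.0 * G)


# Hypothetical inputs: 0.5 s of flight, 2.5 m/s takeoff velocity.
height_t = jump_height_from_flight_time(0.5)
height_v = jump_height_from_takeoff_velocity(2.5)
```

Because height scales with the square of both inputs, small measurement errors in flight time or velocity are amplified in the height estimate.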
NASA Technical Reports Server (NTRS)
Fisher, Marcus S.; Northey, Jeffrey; Stanton, William
2014-01-01
The purpose of this presentation is to outline how the NASA Independent Verification and Validation (IVV) Program helps to build reliability into the Space Mission Software Systems (SMSSs) that its customers develop.
Validation of a new classification system for skin tears.
LeBlanc, Kimberly; Baranoski, Sharon; Holloway, Samantha; Langemo, Diane
2013-06-01
The aim of this study was to validate and establish reliability of the International Skin Tear classification system. A consensus panel of 12 internationally recognized key opinion leaders convened in 2011 to establish consensus statements on the prevention, prediction, assessment, and treatment of skin tears. Subsequently, a new skin tear classification system was proposed. The system was then tested for interrater and intrarater reliability between the experts before being tested more widely on a sample of 327 individuals from the United States, Canada, and Europe. The results of the study indicated a substantial level of agreement for the expert panel (Fleiss κ = 0.619; 2-month follow-up = 0.653). Intrarater reliability was high (Cohen κ = 0.877). Interrater reliability was moderate (Fleiss κ = 0.555) for healthcare professionals (n = 303) and fair for non-health professionals (Fleiss κ = 0.338; n = 24). This international study established the reliability and validity of a new classification system for skin tears.
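Fleiss κ, the agreement statistic reported above for multiple raters, can be sketched in a few lines. The example assumes every subject is rated by the same number of raters; the count table is the widely reproduced textbook example (10 subjects, 14 raters, 5 categories), not data from the skin tear study.

```python
def fleiss_kappa(counts: list[list[int]]) -> float:
    """Fleiss' kappa for a subjects x categories table of rating counts.

    counts[i][j] is the number of raters who assigned subject i to
    category j; every row must sum to the same number of raters.
    """
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("each subject must have the same number of ratings")

    # Observed agreement per subject, then averaged over subjects.
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_i) / n_subjects

    # Chance agreement from the overall category proportions.
    n_categories = len(counts[0])
    p_j = [
        sum(row[j] for row in counts) / (n_subjects * n_raters)
        for j in range(n_categories)
    ]
    p_e = sum(p * p for p in p_j)

    return (p_bar - p_e) / (1.0 - p_e)


# Textbook illustration only: 10 subjects, 14 raters, 5 categories.
example_counts = [
    [0, 0, 0, 0, 14],
    [0, 2, 6, 4, 2],
    [0, 0, 3, 5, 6],
    [0, 3, 9, 2, 0],
    [2, 2, 8, 1, 1],
    [7, 7, 0, 0, 0],
    [3, 2, 6, 3, 0],
    [2, 5, 3, 2, 2],
    [6, 5, 2, 1, 0],
    [0, 2, 2, 3, 7],
]
kappa = fleiss_kappa(example_counts)
```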
ERIC Educational Resources Information Center
Hidecker, Mary Jo Cooley; Paneth, Nigel; Rosenbaum, Peter L.; Kent, Raymond D.; Lillie, Janet; Eulenberg, John B.; Chester, Ken, Jr.; Johnson, Brenda; Michalsen, Lauren; Evatt, Morgan; Taylor, Kara
2011-01-01
Aim: The purpose of this study was to create and validate the Communication Function Classification System (CFCS) for children with cerebral palsy (CP), for use by a wide variety of individuals who are interested in CP. This paper reports the content validity, interrater reliability, and test-retest reliability of the CFCS for children with CP.…
Cachafeiro, Thais Hofmann; Escobar, Gabriela Fortes; Maldonado, Gabriela; Cestari, Tania Ferreira
2014-01-01
The "Quantitative Global Scarring Grading System for Postacne Scarring" was developed in English for acne scar grading, based on the number and severity of each type of scar. The aims of this study were to translate this scale into Brazilian Portuguese and verify its reliability and validity. The study followed five steps: Translation, Expert Panel, Back Translation, Approval of authors and Validation. The translated scale showed high internal consistency and high test-retest reliability, confirming its reproducibility. Therefore, it has been validated for our population and can be recommended as a reliable instrument to assess acne scarring. PMID:25184939
Lindskog, Marcus; Winman, Anders; Juslin, Peter; Poom, Leo
2013-01-01
Two studies investigated the reliability and predictive validity of commonly used measures and models of Approximate Number System acuity (ANS). Study 1 investigated reliability by both an empirical approach and a simulation of maximum obtainable reliability under ideal conditions. Results showed that common measures of the Weber fraction (w) are reliable only when using a substantial number of trials, even under ideal conditions. Study 2 compared different purported measures of ANS acuity with respect to convergent and predictive validity in a within-subjects design and evaluated an adaptive test using the ZEST algorithm. Results showed that the adaptive measure can reduce the number of trials needed to reach acceptable reliability. Only direct tests with non-symbolic numerosity discriminations of stimuli presented simultaneously were related to arithmetic fluency. This correlation remained when controlling for general cognitive ability and perceptual speed. Further, the purported indirect measure of ANS acuity in terms of the Numeric Distance Effect (NDE) was not reliable and showed no sign of predictive validity. The non-symbolic NDE for reaction time was significantly related to direct w estimates in a direction contrary to that expected. Easier stimuli were found to be more reliable, but only harder (7:8 ratio) stimuli contributed to predictive validity. PMID:23964256
Non-Technical Skills for Surgeons (NOTSS): Critical appraisal of its measurement properties.
Jung, James J; Borkhoff, Cornelia M; Jüni, Peter; Grantcharov, Teodor P
2018-02-17
To critically appraise the development and measurement properties, including sensibility, reliability, and validity of the Non-Technical Skills for Surgeons (NOTSS) system. Articles that described the development process of the NOTSS system were identified. Relevant primary studies that presented evidence of reliability and validity were identified through a comprehensive literature review. NOTSS was developed through robust item generation and reduction strategies. It was shown to have good content validity, acceptability, and feasibility. Inter-rater reliability increased with greater expertise and number of assessors. Studies demonstrated evidence of cross-sectional construct validity, in that the tool was able to differentiate known groups of varied non-technical skill levels. Evidence of longitudinal construct validity also existed to demonstrate that NOTSS detected changes in non-technical skills before and after targeted training. In populations and settings presented in our critical appraisal, NOTSS provided reliable and valid measurements of intraoperative non-technical skills of surgeons. Copyright © 2018 Elsevier Inc. All rights reserved.
Validation of highly reliable, real-time knowledge-based systems
NASA Technical Reports Server (NTRS)
Johnson, Sally C.
1988-01-01
Knowledge-based systems have the potential to greatly increase the capabilities of future aircraft and spacecraft and to significantly reduce support manpower needed for the space station and other space missions. However, a credible validation methodology must be developed before knowledge-based systems can be used for life- or mission-critical applications. Experience with conventional software has shown that the use of good software engineering techniques and static analysis tools can greatly reduce the time needed for testing and simulation of a system. Since exhaustive testing is infeasible, reliability must be built into the software during the design and implementation phases. Unfortunately, many of the software engineering techniques and tools used for conventional software are of little use in the development of knowledge-based systems. Therefore, research at Langley is focused on developing a set of guidelines, methods, and prototype validation tools for building highly reliable, knowledge-based systems. The use of a comprehensive methodology for building highly reliable, knowledge-based systems should significantly decrease the time needed for testing and simulation. A proven record of delivering reliable systems at the beginning of the highly visible testing and simulation phases is crucial to the acceptance of knowledge-based systems in critical applications.
Space Shuttle Propulsion System Reliability
NASA Technical Reports Server (NTRS)
Welzyn, Ken; VanHooser, Katherine; Moore, Dennis; Wood, David
2011-01-01
This session comprises the following parts: (1) External Tank (ET) System Reliability and Lessons, (2) Space Shuttle Main Engine (SSME) Reliability Validated by a Million Seconds of Testing, (3) Reusable Solid Rocket Motor (RSRM) Reliability via Process Control, and (4) Solid Rocket Booster (SRB) Reliability via Acceptance and Testing.
The Modified Cognitive Constructions Coding System: Reliability and Validity Assessments
ERIC Educational Resources Information Center
Moran, Galia S.; Diamond, Gary M.
2006-01-01
The cognitive constructions coding system (CCCS) was designed for coding client's expressed problem constructions on four dimensions: intrapersonal-interpersonal, internal-external, responsible-not responsible, and linear-circular. This study introduces, and examines the reliability and validity of, a modified version of the CCCS--a version that…
Reliability and validity analysis of the open-source Chinese Foot and Ankle Outcome Score (FAOS).
Ling, Samuel K K; Chan, Vincent; Ho, Karen; Ling, Fona; Lui, T H
2017-12-21
Develop the first reliable and validated open-source outcome scoring system in the Chinese language for foot and ankle problems. Translation of the English FAOS into Chinese following regular protocols. First, two forward translations were created separately; these were then combined into a preliminary version by an expert committee, which was subsequently back-translated into English. The process was repeated until the original and back translations were congruent. This version was then field tested on actual patients who provided feedback for modification. The final Chinese FAOS version was then tested for reliability and validity. Reliability analysis was performed on 20 subjects while validity analysis was performed on 50 subjects. Tools used to validate the Chinese FAOS were the SF36 and Pain Numeric Rating Scale (NRS). Internal consistency between the FAOS subgroups was measured using Cronbach's alpha. Spearman's correlation was calculated between each subgroup in the FAOS, SF36 and NRS. The Chinese FAOS passed both reliability and validity testing, meaning it is reliable, internally consistent and correlates positively with the SF36 and the NRS. The Chinese FAOS is a free, open-source scoring system that can be used to provide a relatively standardised outcome measure for foot and ankle studies. Copyright © 2017 Elsevier Ltd. All rights reserved.
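Cronbach's alpha, the internal-consistency statistic mentioned above, has a short closed form: alpha = k/(k−1) × (1 − Σ item variances / variance of total scores). The following is a minimal sketch using sample variances; the item scores are invented for illustration and are not FAOS data.

```python
from statistics import variance


def cronbach_alpha(items: list[list[float]]) -> float:
    """Cronbach's alpha for a list of items.

    items[i][s] is the score of subject s on item i; every item must
    contain scores for the same subjects in the same order.
    """
    k = len(items)
    if k < 2:
        raise ValueError("at least two items are required")
    n_subjects = len(items[0])
    if any(len(item) != n_subjects for item in items):
        raise ValueError("all items must have the same number of scores")

    # Total score per subject across all items.
    totals = [sum(item[s] for item in items) for s in range(n_subjects)]
    item_variance_sum = sum(variance(item) for item in items)
    return (k / (k - 1)) * (1.0 - item_variance_sum / variance(totals))


# Hypothetical 3-item subscale answered by 5 subjects.
demo_items = [
    [2, 3, 4, 4, 5],
    [1, 3, 3, 4, 5],
    [2, 2, 4, 5, 5],
]
alpha = cronbach_alpha(demo_items)
```

When all items rank subjects identically and share the same variance, alpha reaches 1; uncorrelated items push it toward 0.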
Malt, U F
1986-01-01
Experience from teaching DSM-III to more than three hundred Norwegian psychiatrists and clinical psychologists suggests that reliable DSM-III diagnoses can be achieved within a few hours' training with reference to the decision trees and the diagnostic criteria only. The diagnoses provided are more reliable than the corresponding ICD diagnoses, with which the participants were more familiar. The three main sources of reduced reliability of the DSM-III diagnoses are: poor knowledge of the criteria, which is often connected with failure to obtain key diagnostic information during the clinical interview; unfamiliar concepts; and vague or ambiguous criteria. The first two issues are related to the quality of the teaching of DSM-III. The third source of reduced reliability reflects unsolved validity issues. Using the classification of five affective case stories as examples, these diagnostic pitfalls that reduce reliability, and ways to overcome them when teaching the DSM-III system, are discussed. It is concluded that the DSM-III system of classification is easy to teach and that the system is superior to other available classification systems from a reliability point of view. The current version of the DSM-III system, however, partly owes its high degree of reliability to broad and heterogeneous diagnostic categories, such as the concept of major depression, which may have questionable validity. Thus, future revisions of the DSM-III system should, above all, address the issue of validity.
Hidecker, Mary Jo Cooley; Paneth, Nigel; Rosenbaum, Peter L; Kent, Raymond D; Lillie, Janet; Eulenberg, John B; Chester, Ken; Johnson, Brenda; Michalsen, Lauren; Evatt, Morgan; Taylor, Kara
2011-01-01
Aim: The purpose of this study was to create and validate a Communication Function Classification System (CFCS) for children with cerebral palsy (CP) that can be used by a wide variety of individuals who are interested in CP. This paper reports the content validity, interrater reliability, and test–retest reliability of the CFCS for children with CP. Method: An 11-member development team created comprehensive descriptions of the CFCS levels, and four nominal groups comprising 27 participants critiqued these levels. Within a Delphi survey, 112 participants commented on the clarity and usefulness of the CFCS. Interrater reliability was completed by 61 professionals and 68 parents/relatives who classified 69 children with CP aged 2 to 18 years. Test–retest reliability was completed by 48 professionals who allowed at least 2 weeks between classifications. The participants who assessed the CFCS were all relevant stakeholders: adults with CP, parents of children with CP, educators, occupational therapists, physical therapists, physicians, and speech–language pathologists. Results: The interrater reliability of the CFCS was 0.66 between two professionals and 0.49 between a parent and a professional. Professional interrater reliability improved to 0.77 for classification of children older than 4 years. The test–retest reliability was 0.82. Interpretation: The CFCS demonstrates content validity and shows very good test–retest reliability, good professional interrater reliability, and moderate parent–professional interrater reliability. Combining the CFCS with the Gross Motor Function Classification System and the Manual Ability Classification System contributes to a functional performance view of daily life for individuals with CP, in accordance with the World Health Organization’s International Classification of Functioning, Disability and Health. PMID:21707596
Dwyer, Tim; Martin, C Ryan; Kendra, Rita; Sermer, Corey; Chahal, Jaskarndip; Ogilvie-Harris, Darrell; Whelan, Daniel; Murnaghan, Lucas; Nauth, Aaron; Theodoropoulos, John
2017-06-01
To determine the interobserver reliability of the International Cartilage Repair Society (ICRS) grading system of chondral lesions in cadavers, to determine the intraobserver reliability of the ICRS grading system comparing arthroscopy and video assessment, and to compare the arthroscopic ICRS grading system with histological grading of lesion depth. Eighteen lesions in 5 cadaveric knee specimens were arthroscopically graded by 7 fellowship-trained arthroscopic surgeons using the ICRS classification system. The arthroscopic video of each lesion was sent to the surgeons 6 weeks later for repeat grading and determination of intraobserver reliability. Lesions were biopsied, and the depth of the cartilage lesion was assessed. Reliability was calculated using intraclass correlations. The interobserver reliability was 0.67 (95% confidence interval, 0.5-0.89) for the arthroscopic grading, and the intraobserver reliability with the video grading was 0.8 (95% confidence interval, 0.67-0.9). A high correlation was seen between the arthroscopic grading of depth and the histological grading of depth (0.91); on average, surgeons graded lesions using arthroscopy a mean of 0.37 (range, 0-0.86) deeper than the histological grade. The arthroscopic ICRS classification system has good interobserver and intraobserver reliability. A high correlation with histological assessment of depth provides evidence of validity for this classification system. As cartilage lesions are treated on the basis of the arthroscopic ICRS classification, it is important to ascertain the reliability and validity of this method. Copyright © 2016 Arthroscopy Association of North America. Published by Elsevier Inc. All rights reserved.
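The abstract above states that reliability was calculated using intraclass correlations but does not say which ICC form was used. The sketch below computes two common single-rater forms from a two-way ANOVA decomposition, ICC(2,1) (absolute agreement) and ICC(3,1) (consistency), following the Shrout and Fleiss definitions. The choice of form is an assumption, and the ratings matrix is the classic worked example from that literature, not data from this study.

```python
def icc_two_way(ratings: list[list[float]]) -> tuple[float, float]:
    """Return (ICC(2,1), ICC(3,1)) for a subjects x raters matrix."""
    n = len(ratings)       # number of subjects (rows)
    k = len(ratings[0])    # number of raters (columns)
    if any(len(row) != k for row in ratings):
        raise ValueError("every subject needs one rating per rater")

    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares for subjects, raters, and residual error.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    icc_2_1 = (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )
    icc_3_1 = (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)
    return icc_2_1, icc_3_1


# Classic illustration: 6 subjects scored by 4 raters.
example_ratings = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
agreement_icc, consistency_icc = icc_two_way(example_ratings)
```

The gap between the two values in this example shows why the ICC form matters: systematic differences between raters lower the absolute-agreement ICC but leave the consistency ICC largely untouched.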
Reliability and Validity of Finger Strength and Endurance Measurements in Rock Climbing
ERIC Educational Resources Information Center
Michailov, Michail Lubomirov; Baláš, Jirí; Tanev, Stoyan Kolev; Andonov, Hristo Stoyanov; Kodejška, Jan; Brown, Lee
2018-01-01
Purpose: An advanced system for the assessment of climbing-specific performance was developed and used to: (a) investigate the effect of arm fixation (AF) on construct validity evidence and reliability of climbing-specific finger-strength measurement; (b) assess reliability of finger-strength and endurance measurements; and (c) evaluate the…
Validity and reliability of Optojump photoelectric cells for estimating vertical jump height.
Glatthorn, Julia F; Gouge, Sylvain; Nussbaumer, Silvio; Stauffacher, Simone; Impellizzeri, Franco M; Maffiuletti, Nicola A
2011-02-01
Vertical jump is one of the most prevalent acts performed in several sport activities. It is therefore important to ensure that the measurements of vertical jump height made as a part of research or athlete support work have adequate validity and reliability. The aim of this study was to evaluate concurrent validity and reliability of the Optojump photocell system (Microgate, Bolzano, Italy) with force plate measurements for estimating vertical jump height. Twenty subjects were asked to perform maximal squat jumps and countermovement jumps, and flight time-derived jump heights obtained by the force plate were compared with those provided by Optojump, to examine its concurrent (criterion-related) validity (study 1). Twenty other subjects completed the same jump series on 2 different occasions (separated by 1 week), and jump heights of session 1 were compared with session 2, to investigate test-retest reliability of the Optojump system (study 2). Intraclass correlation coefficients (ICCs) for validity were very high (0.997-0.998), even if a systematic difference was consistently observed between force plate and Optojump (-1.06 cm; p < 0.001). Test-retest reliability of the Optojump system was excellent, with ICCs ranging from 0.982 to 0.989, low coefficients of variation (2.7%), and low random errors (±2.81 cm). The Optojump photocell system demonstrated strong concurrent validity and excellent test-retest reliability for the estimation of vertical jump height. We propose the following equation that allows force plate and Optojump results to be used interchangeably: force plate jump height (cm) = 1.02 × Optojump jump height + 0.29. In conclusion, the use of Optojump photoelectric cells is legitimate for field-based assessments of vertical jump height.
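The conversion equation proposed in the abstract above can be applied directly. The sketch below encodes it as given (force plate jump height in cm = 1.02 × Optojump jump height + 0.29) together with its algebraic inverse; the function names and the sample value are illustrative assumptions.

```python
SLOPE = 1.02      # coefficient reported in the abstract
INTERCEPT = 0.29  # intercept in cm reported in the abstract


def optojump_to_force_plate(optojump_cm: float) -> float:
    """Convert an Optojump jump height (cm) to the force plate scale."""
    return SLOPE * optojump_cm + INTERCEPT


def force_plate_to_optojump(force_plate_cm: float) -> float:
    """Algebraic inverse of the reported equation."""
    return (force_plate_cm - INTERCEPT) / SLOPE


# Hypothetical reading: a 30 cm jump measured with Optojump.
converted = optojump_to_force_plate(30.0)
```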
Reviewing Reliability and Validity of Information for University Educational Evaluation
NASA Astrophysics Data System (ADS)
Otsuka, Yusaku
To better utilize evaluations in higher education, it is necessary to share the methods of reviewing reliability and validity of examination scores and grades, and to accumulate and share data for confirming results. Before the GPA system is first introduced into a university or college, the reliability of examination scores and grades, especially for essay examinations, must be assured. Validity is a complicated concept, so it should be assured in various ways, including professional audits, theoretical models, and statistical data analysis. Because individual students and teachers are continually improving, using evaluations to appraise their progress is not always compatible with using evaluations in appraising the implementation of accountability in various departments or the university overall. To better utilize evaluations and improve higher education, evaluations should be integrated into the current system by sharing the vision of an academic learning community and promoting interaction between students and teachers based on sufficiently reliable and validated evaluation tools.
Fault-tolerant clock synchronization validation methodology. [in computer systems
NASA Technical Reports Server (NTRS)
Butler, Ricky W.; Palumbo, Daniel L.; Johnson, Sally C.
1987-01-01
A validation method for the synchronization subsystem of a fault-tolerant computer system is presented. The high reliability requirement of flight-crucial systems precludes the use of most traditional validation methods. The method presented utilizes formal design proof to uncover design and coding errors and experimentation to validate the assumptions of the design proof. The experimental method is described and illustrated by validating the clock synchronization system of the Software Implemented Fault Tolerance computer. The design proof of the algorithm includes a theorem that defines the maximum skew between any two nonfaulty clocks in the system in terms of specific system parameters. Most of these parameters are deterministic. One crucial parameter is the upper bound on the clock read error, which is stochastic. The probability that this upper bound is exceeded is calculated from data obtained by the measurement of system parameters. This probability is then included in a detailed reliability analysis of the system.
Savage, Jason W; Moore, Timothy A; Arnold, Paul M; Thakur, Nikhil; Hsu, Wellington K; Patel, Alpesh A; McCarthy, Kathryn; Schroeder, Gregory D; Vaccaro, Alexander R; Dimar, John R; Anderson, Paul A
2015-09-15
The thoracolumbar injury classification system (TLICS) was evaluated in 20 consecutive pediatric spine trauma cases. The purpose of this study was to determine the reliability and validity of the TLICS in pediatric spine trauma. The TLICS was developed to improve the categorization and management of thoracolumbar trauma. TLICS has been shown to have good reliability and validity in the adult population. The clinical and radiographical findings of 20 pediatric thoracolumbar fractures were prospectively presented to 20 surgeons with disparate levels of training and experience with spinal trauma. These injuries were consecutively scored using the TLICS. Cohen unweighted κ coefficients and Spearman rank order correlation values were calculated for the key parameters (injury morphology, status of posterior ligamentous complex, neurological status, TLICS total score, and proposed management) to assess the inter-rater reliabilities. Five surgeons scored the same cases 3 months later to assess the intra-rater reliability. The actual management of each case was then compared with the treatment recommended by the TLICS algorithm to assess validity. The inter-rater κ statistics of all subgroups (injury morphology, status of the posterior ligamentous complex, neurological status, TLICS total score, and proposed treatment) were within the range of moderate to substantial reproducibility (0.524-0.958). All subgroups had excellent intra-rater reliability (0.748-1.000). The various indices for validity were calculated (80.3% correct, 0.836 sensitivity, 0.785 specificity, 0.676 positive predictive value, 0.899 negative predictive value). Overall, TLICS demonstrated good validity. The TLICS has good reliability and validity when used in the pediatric population. 
The inter-rater reliability of predicting management and indices for validity are lower than those in adults with thoracolumbar fractures, which is likely due to differences in the way children are treated for certain types of injuries. TLICS can be used to reliably categorize thoracolumbar injuries in the pediatric population; however, modifications may be needed to better guide treatment in this specific patient population.
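The unweighted Cohen κ named in the abstract above compares observed agreement between two raters with the agreement expected by chance. A minimal Python sketch follows; the function name `cohen_kappa` and the example ratings are hypothetical illustrations, not data or code from the study.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters scoring the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: share of cases where both raters chose the same category.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the two raters' marginal proportions, summed over categories.
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    p_e = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical morphology ratings from two surgeons (illustrative only).
surgeon_1 = ["burst", "burst", "compression", "compression"]
surgeon_2 = ["burst", "burst", "compression", "burst"]
print(cohen_kappa(surgeon_1, surgeon_2))
```

With more than two raters, as in the study, κ is typically computed per rater pair and then summarized; how the authors pooled pairs is not stated in the abstract.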
Tschirren, Lea; Bauer, Susanne; Hanser, Chiara; Marsico, Petra; Sellers, Diane; van Hedel, Hubertus J A
2018-06-01
As there is little evidence for concurrent validity of the Eating and Drinking Ability Classification System (EDACS), this study aimed to determine its concurrent validity and reliability in children and adolescents with cerebral palsy (CP). After an extensive translation procedure, we applied the German language version to 52 participants with CP (30 males, 22 females, mean age 9y 7mo [SD 4y 2mo]). We correlated (Kendall's tau, Kτ) the EDACS levels with the Bogenhausener Dysphagiescore (BODS), and the EDACS level of assistance with the Manual Ability Classification System (MACS) and the item 'eating' of the Functional Independence Measure for Children (WeeFIM). We further quantified the interrater reliability between speech and language therapists (SaLTs) and between SaLTs and parents with Kappa (κ). The EDACS levels correlated highly with the BODS (Kτ=0.79), and the EDACS level of assistance correlated highly with the MACS (Kτ=0.73) and WeeFIM eating item (Kτ=-0.80). Interrater reliability proved almost perfect between SaLTs (EDACS: κ=0.94; EDACS level of assistance: κ=0.89) and SaLTs and parents (EDACS: κ=0.82; EDACS level of assistance: κ=0.89). The EDACS levels and level of assistance seem valid and showed almost perfect interrater reliability when classifying eating and drinking problems in children and adolescents with CP. The Eating and Drinking Ability Classification System (EDACS) correlates well with a dysphagia score. The EDACS level of assistance proves valid. The German version of EDACS is highly reliable. EDACS correlates moderately to highly with other classification systems. © 2018 Mac Keith Press.
Inertial Measurement Units for Clinical Movement Analysis: Reliability and Concurrent Validity
Nicholas, Kevin; Sparkes, Valerie; Sheeran, Liba; Davies, Jennifer L
2018-01-01
The aim of this study was to investigate the reliability and concurrent validity of a commercially available Xsens MVN BIOMECH inertial-sensor-based motion capture system during clinically relevant functional activities. A clinician with no prior experience of motion capture technologies and an experienced clinical movement scientist each assessed 26 healthy participants within each of two sessions using a camera-based motion capture system and the MVN BIOMECH system. Participants performed overground walking, squatting, and jumping. Sessions were separated by 4 ± 3 days. Reliability was evaluated using intraclass correlation coefficient and standard error of measurement, and validity was evaluated using the coefficient of multiple correlation and the linear fit method. Day-to-day reliability was generally fair-to-excellent in all three planes for hip, knee, and ankle joint angles in all three tasks. Within-day (between-rater) reliability was fair-to-excellent in all three planes during walking and squatting, and poor-to-high during jumping. Validity was excellent in the sagittal plane for hip, knee, and ankle joint angles in all three tasks and acceptable in frontal and transverse planes in squat and jump activity across joints. Our results suggest that the MVN BIOMECH system can be used by a clinician to quantify lower-limb joint angles in clinically relevant movements. PMID:29495600
Psychometrics of the MHSIP Adult Consumer Survey.
Jerrell, Jeanette M
2006-10-01
The reliability and validity of the Mental Health Statistics Improvement Program (MHSIP) Adult Consumer Survey were assessed in a statewide convenience sample of 459 persons with severe mental illness served through a public mental health system. Consistent with previous findings and the intent of its developers, three factors were identified that demonstrate good internal consistency, moderate test-retest reliability, and good convergent validity with consumer perceptions of other aspects of their care. The reliability and validity of the MHSIP Adult Consumer Survey documented in this study underscore its scientific and practical utility as an abbreviated tool for assessing access, quality and appropriateness, and outcome in mental health service systems.
The Author’s Guide To Writing 412th Test Wing Technical Reports
2014-12-01
control CAD computer aided design cc cubic centimeters C.O. carry-over c/o checkout USAF United States Air Force C1 rolling moment coefficient...cooling air. Mission Impact: Results in maintenance inability to reliably duplicate and isolate valid aircraft failures, and degrades reliability of system
Grooten, Wilhelmus Johannes Andreas; Sandberg, Lisa; Ressman, John; Diamantoglou, Nicolas; Johansson, Elin; Rasmussen-Barr, Eva
2018-01-08
Clinical examinations are subjective and often show a low validity and reliability. Objective and highly reliable quantitative assessments are available in laboratory settings using 3D motion analysis, but these systems are too expensive to use for simple clinical examinations. Qinematic™ is an interactive movement analysis system based on the Kinect camera and is an easy-to-use clinical measurement system for assessing posture, balance and side-bending. The aim of the study was to test the test-retest reliability and construct validity of Qinematic™ in a healthy population, and to calculate the minimal clinical differences for the variables of interest. A further aim was to identify the discriminative validity of Qinematic™ in people with low-back pain (LBP). We performed a test-retest reliability study (n = 37) with around 1 week between the occasions, a construct validity study (n = 30) in which Qinematic™ was tested against a 3D motion capture system, and a discriminative validity study, in which a group of people with LBP (n = 20) was compared to healthy controls (n = 17). We tested a large range of psychometric properties of 18 variables in three sections: posture (head and pelvic position, weight distribution), balance (sway area and velocity in single- and double-leg stance), and side-bending. The majority of the variables in the posture and balance sections showed poor/fair reliability (ICC < 0.4) and poor/fair validity (Spearman < 0.4), with significant differences between occasions and between Qinematic™ and the 3D motion capture system. In the clinical study, Qinematic™ did not differ between people with LBP and healthy controls for these variables. For one variable, side-bending to the left, there was excellent reliability (ICC = 0.898), excellent validity (r = 0.943), and Qinematic™ could differentiate between LBP and healthy individuals (p = 0.012). 
This paper shows that a novel software program (Qinematic™) based on the Kinect camera for measuring balance, posture and side-bending has poor psychometric properties, indicating that the variables on balance and posture should not be used for monitoring individual changes over time or in research. Future research on the dynamic tasks of Qinematic™ is warranted.
Gao, Zhongyang; Song, Hui; Ren, Fenggang; Li, Yuhuan; Wang, Dong; He, Xijing
2017-12-01
The aim of the present study was to evaluate the reliability of the Cartesian Optoelectronic Dynamic Anthropometer (CODA) motion system in measuring the cervical range of motion (ROM) and verify the construct validity of the CODA motion system. A total of 26 patients with cervical spondylosis and 22 patients with anterior cervical fusion were enrolled and the CODA motion analysis system was used to measure the three-dimensional cervical ROM. Intra- and inter-rater reliability was assessed by intraclass correlation coefficients (ICCs), standard error of measurement (SEm), limits of agreement (LOA) and minimal detectable change (MDC). Independent samples t-tests were performed to examine the differences of cervical ROM between cervical spondylosis and anterior cervical fusion patients. The results revealed that in the cervical spondylosis group, the reliability was almost perfect (intra-rater reliability: ICC, 0.87-0.95; LOA, -12.86-13.70; SEm, 2.97-4.58; inter-rater reliability: ICC, 0.84-0.95; LOA, -13.09-13.48; SEm, 3.13-4.32). In the anterior cervical fusion group, the reliability was high (intra-rater reliability: ICC, 0.88-0.97; LOA, -10.65-11.08; SEm, 2.10-3.77; inter-rater reliability: ICC, 0.86-0.96; LOA, -10.91-13.66; SEm, 2.20-4.45). The cervical ROM in the cervical spondylosis group was significantly higher than that in the anterior cervical fusion group in all directions except for left rotation. In conclusion, the CODA motion analysis system is highly reliable in measuring cervical ROM and the construct validity was verified, as the system was sufficiently sensitive to distinguish between the cervical spondylosis and anterior cervical fusion groups based on their ROM.
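The abstract above reports SEm and MDC alongside the ICC without stating how they were derived. A commonly used convention is SEm = SD × √(1 − ICC) and MDC95 = 1.96 × √2 × SEm; the sketch below assumes that convention, and the function names and input values are hypothetical rather than taken from the study.

```python
import math


def sem_from_icc(sd, icc):
    """Standard error of measurement from the sample SD and a reliability coefficient."""
    if sd < 0:
        raise ValueError("sd must be non-negative")
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie in [0, 1]")
    return sd * math.sqrt(1.0 - icc)


def mdc95(sem):
    """Minimal detectable change at 95% confidence for a test-retest difference."""
    # sqrt(2) accounts for measurement error in both the first and the second session.
    return 1.96 * math.sqrt(2.0) * sem


# Hypothetical inputs: SD of 10 degrees of cervical ROM and an ICC of 0.91.
sem = sem_from_icc(10.0, 0.91)
print(round(sem, 2), round(mdc95(sem), 2))
```

Under this convention a higher ICC shrinks both quantities, so a change smaller than the MDC95 cannot be separated from measurement noise.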
Anderson, Ruth A.; Hsieh, Pi-Ching; Su, Hui Fang; Landerman, Lawrence R.; McDaniel, Reuben R.
2013-01-01
Objectives. To (1) describe participation in decision-making as a systems-level property of complex adaptive systems and (2) present empirical evidence of reliability and validity of a corresponding measure. Method. Study 1 was a mail survey of a single respondent (administrators or directors of nursing) in each of 197 nursing homes. Study 2 was a field study using random, proportionally stratified sampling procedure that included 195 organizations with 3,968 respondents. Analysis. In Study 1, we analyzed the data to reduce the number of scale items and establish initial reliability and validity. In Study 2, we strengthened the psychometric test using a large sample. Results. Results demonstrated validity and reliability of the participation in decision-making instrument (PDMI) while measuring participation of workers in two distinct job categories (RNs and CNAs). We established reliability at the organizational level aggregated items scores. We established validity of the multidimensional properties using convergent and discriminant validity and confirmatory factor analysis. Conclusions. Participation in decision making, when modeled as a systems-level property of organization, has multiple dimensions and is more complex than is being traditionally measured. Managers can use this model to form decision teams that maximize the depth and breadth of expertise needed and to foster connection among them. PMID:24349771
Anderson, Ruth A; Plowman, Donde; Corazzini, Kirsten; Hsieh, Pi-Ching; Su, Hui Fang; Landerman, Lawrence R; McDaniel, Reuben R
2013-01-01
Objectives. To (1) describe participation in decision-making as a systems-level property of complex adaptive systems and (2) present empirical evidence of reliability and validity of a corresponding measure. Method. Study 1 was a mail survey of a single respondent (administrators or directors of nursing) in each of 197 nursing homes. Study 2 was a field study using random, proportionally stratified sampling procedure that included 195 organizations with 3,968 respondents. Analysis. In Study 1, we analyzed the data to reduce the number of scale items and establish initial reliability and validity. In Study 2, we strengthened the psychometric test using a large sample. Results. Results demonstrated validity and reliability of the participation in decision-making instrument (PDMI) while measuring participation of workers in two distinct job categories (RNs and CNAs). We established reliability at the organizational level aggregated items scores. We established validity of the multidimensional properties using convergent and discriminant validity and confirmatory factor analysis. Conclusions. Participation in decision making, when modeled as a systems-level property of organization, has multiple dimensions and is more complex than is being traditionally measured. Managers can use this model to form decision teams that maximize the depth and breadth of expertise needed and to foster connection among them.
Bohannon, Richard W; Harrison, Steven; Kinsella-Shaw, Jeffrey
2009-01-01
Background Spasticity is a common impairment accompanying stroke. Spasticity of the quadriceps femoris muscle can be quantified using the pendulum test. The measurement properties of pendular kinematics captured using a magnetic tracking system have not been studied among patients who have experienced a stroke. Therefore, this study describes the test-retest reliability and known groups and convergent validity of the pendulum test measures obtained with the Polhemus tracking system. Methods Eight patients with chronic stroke underwent pendulum tests with their affected and unaffected lower limbs, with and without the addition of a 2.2 kg cuff weight at the ankle, using the Polhemus magnetic tracking system. Also measured bilaterally were knee resting angles, Ashworth scores (grades 0–4) of quadriceps femoris muscles, patellar tendon (knee jerk) reflexes (grades 0–4), and isometric knee extension force. Results Three measures obtained from pendular traces of the affected side were reliable (intraclass correlation coefficient ≥ .844). Known groups validity was confirmed by demonstration of a significant difference in the measurements between sides. Convergent validity was supported by correlations ≥ .57 between pendulum test measures and other measures reflective of spasticity. Conclusion Pendulum test measures obtained with the Polhemus tracking system from the affected side of patients with stroke have good test-retest reliability and both known groups and convergent validity. PMID:19642989
Bohannon, Richard W; Harrison, Steven; Kinsella-Shaw, Jeffrey
2009-07-30
Spasticity is a common impairment accompanying stroke. Spasticity of the quadriceps femoris muscle can be quantified using the pendulum test. The measurement properties of pendular kinematics captured using a magnetic tracking system have not been studied among patients who have experienced a stroke. Therefore, this study describes the test-retest reliability and known groups and convergent validity of the pendulum test measures obtained with the Polhemus tracking system. Eight patients with chronic stroke underwent pendulum tests with their affected and unaffected lower limbs, with and without the addition of a 2.2 kg cuff weight at the ankle, using the Polhemus magnetic tracking system. Also measured bilaterally were knee resting angles, Ashworth scores (grades 0-4) of quadriceps femoris muscles, patellar tendon (knee jerk) reflexes (grades 0-4), and isometric knee extension force. Three measures obtained from pendular traces of the affected side were reliable (intraclass correlation coefficient ≥ .844). Known groups validity was confirmed by demonstration of a significant difference in the measurements between sides. Convergent validity was supported by correlations ≥ .57 between pendulum test measures and other measures reflective of spasticity. Pendulum test measures obtained with the Polhemus tracking system from the affected side of patients with stroke have good test-retest reliability and both known groups and convergent validity.
Diagnostic classification past, present, and future: implications for pharmacotherapy.
Howland, Robert H
2013-04-01
Making a diagnosis is a key step in understanding the natural course of a disorder, selecting an appropriate treatment for the disorder, and predicting its response to treatment. Diagnostic proposals can be evaluated in two ways: reliability and validity. The reliability and validity of diagnoses are not one and the same, although establishing reliability is usually a necessary step before being able to evaluate and determine validity. There is little evidence that most psychiatric diagnoses are valid, but the reliability of diagnoses using classification systems developed since 1970 have greatly improved and are important for clinical practice and research. Past and current diagnostic systems have not optimally assisted the search for disorder-specific pathophysiological mechanisms, and they do not provide the specificity that clinicians would like when selecting medication. The Research Domain Criteria project is intended to shift research away from categorical diagnoses to focus on dysregulated neurobiological systems, and this approach ultimately may be more useful for understanding the pathophysiology of mental disorders and improving the development and use of treatment interventions. Copyright 2013, SLACK Incorporated.
Lee, Myungmo; Song, Changho; Lee, Kyoungjin; Shin, Doochul; Shin, Seungho
2014-07-14
Treadmill gait analysis was more advantageous than over-ground walking because it allowed continuous measurements of the gait parameters. The purpose of this study was to investigate the concurrent validity and the test-retest reliability of the OPTOGait photoelectric cell system against the treadmill-based gait analysis system by assessing spatio-temporal gait parameters. Twenty-six stroke patients and 18 healthy adults were asked to walk on the treadmill at their preferred speed. The concurrent validity was assessed by comparing data obtained from the 2 systems, and the test-retest reliability was determined by comparing data obtained from the 1st and the 2nd session of the OPTOGait system. The concurrent validity, identified by the intra-class correlation coefficients (ICC [2, 1]), coefficients of variation (CVME), and 95% limits of agreement (LOA), was excellent for the spatio-temporal gait parameters but poor for the temporal parameters expressed as a percentage of the gait cycle. The test-retest reliability of the OPTOGait System, identified by ICC (3, 1), CVME, 95% LOA, standard error of measurement (SEM), and minimum detectable change (MDC95%) for the spatio-temporal gait parameters, was high. These findings indicated that the treadmill-based OPTOGait System had strong concurrent validity and test-retest reliability. This portable system could be useful for clinical assessments.
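The 95% limits of agreement mentioned in the abstract above are usually computed in the Bland-Altman manner: the mean of the paired differences between two systems, plus and minus 1.96 times the standard deviation of those differences. The sketch below assumes that definition; the function name and the step-length values are hypothetical, not data from the study.

```python
import statistics


def limits_of_agreement(system_a, system_b):
    """Return (bias, lower, upper) 95% limits of agreement for paired measurements."""
    if len(system_a) != len(system_b) or len(system_a) < 2:
        raise ValueError("need at least two paired measurements of equal length")
    diffs = [a - b for a, b in zip(system_a, system_b)]
    bias = statistics.mean(diffs)          # systematic difference between the systems
    sd = statistics.stdev(diffs)           # sample SD of the paired differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical step lengths (cm) from two gait analysis systems.
photoelectric = [10.0, 12.0, 14.0, 16.0]
treadmill = [9.0, 12.0, 13.0, 16.0]
print(limits_of_agreement(photoelectric, treadmill))
```

Narrow limits centred near zero indicate that the two systems can be used interchangeably for that parameter.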
Duruöz, M T; Unal, C; Toprak, C Sanal; Sezer, I; Yilmaz, F; Ulutatar, F; Atagündüz, P; Baklacioglu, H S
2017-12-01
Background Systemic lupus erythematosus (SLE) may have a profound impact on quality of life. There is increasing interest in measuring quality of life in lupus patients. The purpose of this study was to investigate the validity and reliability of SLE Quality of Life Questionnaire (L-QoL) in Turkish SLE patients. Methods Patients with SLE according to the 2012 Systemic Lupus International Collaborating Clinics Classification Criteria were recruited into the study. Demographic data, clinical parameters and disease activity, measured with the Systemic Lupus Erythematosus Disease Activity Index-2000 (SLEDAI-2K), were noted. Nottingham Health Profile and Health Assessment Questionnaire were filled out in addition to the Turkish L-QoL (LQoL-TR). Internal consistency, test-retest reliability, and convergent and discriminant validity were evaluated. Results The mean age of participants was 43.55 ± 14.33 years and the mean disease duration was 89.8 ± 92.1 months. The patients filled out LQoL-TR in 2.5 min. Strong correlations of LQoL-TR with all subgroups of the Nottingham Health Profile and the Health Assessment Questionnaire were established, showing convergent validity. The highest correlation was demonstrated with emotional reactions (rho = 0.72) and sleep component (rho = 0.65) of the Nottingham Health Profile scale (p < 0.0001). Its poor and not significant correlation with nonfunctional parameters (age, disease duration, perceived general health, SLEDAI-2K) showed its discriminative properties. LQoL-TR demonstrated good internal reliability with a Cronbach's α of 0.93 and test-retest reliability with intraclass correlation coefficient of 0.87. Conclusion The LQoL-TR is a practical and useful tool which demonstrates good validity and reliability.
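Cronbach's α, reported above as the internal-consistency index, relates the sum of the item variances to the variance of the total score: α = k/(k − 1) × (1 − Σ item variances / total-score variance), where k is the number of items. A minimal sketch of that formula follows; the function name and the response matrix are hypothetical and do not come from the LQoL-TR data.

```python
import statistics


def cronbach_alpha(responses):
    """Cronbach's alpha; `responses` is a list of respondents, each a list of item scores."""
    n_items = len(responses[0])
    if n_items < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    if any(len(row) != n_items for row in responses):
        raise ValueError("every respondent must answer every item")
    # Sample variance of each item across respondents.
    item_vars = [statistics.variance([row[i] for row in responses]) for i in range(n_items)]
    # Sample variance of the summed questionnaire score.
    total_var = statistics.variance([sum(row) for row in responses])
    if total_var == 0:
        raise ValueError("alpha is undefined when total scores do not vary")
    return n_items / (n_items - 1) * (1.0 - sum(item_vars) / total_var)


# Hypothetical answers of four respondents to a three-item scale.
answers = [[1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5, 6]]
print(cronbach_alpha(answers))
```

The example items move in lockstep, so α reaches its upper limit; real questionnaire data give lower values.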
Rodrigues, Letícia C.; Marques, Aline P.; Barros, Paula B.; Michaelsen, Stella M.
2014-01-01
BACKGROUND: The Balance Evaluation Systems Test (BESTest) was recently created to allow the development of treatments according to the specific balance system affected in each patient. The Brazilian version of the BESTest has not been specifically tested after stroke. OBJECTIVE: To evaluate the intra- and inter-rater reliability and concurrent and convergent validity of the total score of the BESTest and BESTest sections for adults with hemiparesis after stroke. METHOD: The study included 16 subjects (61.1±7.5 years) with chronic hemiparesis (54.5±43.5 months after stroke). The BESTest was administered by two raters in the same week and one of the raters repeated the test after a one-week interval. Intraclass correlation coefficient (ICC) was calculated to assess intra- and interrater reliability. Concurrent validity with the Berg Balance Scale (BBS) and convergent validity with the Activities-specific Balance Confidence scale (ABC-Brazil) were assessed using Pearson's correlation coefficient. RESULTS: Both the BESTest total score (ICC=0.98) and the BESTest sections (ICC between 0.85 and 0.96) have excellent intrarater reliability. Interrater reliability for the total score was excellent (ICC=0.93) and, for the sections, it ranged between 0.71 and 0.94. The correlation coefficient between the BESTest and the BBS and ABC-Brazil were 0.78 and 0.59, respectively. CONCLUSIONS: The Brazilian version of the BESTest demonstrated adequate reliability when measured by sections and could identify what balance system was affected in patients after stroke. Concurrent validity was excellent with the BBS total score and good to excellent with the sections. The total scores but not the sections present adequate convergent validity with the ABC-Brazil. However, other psychometric properties should be further investigated. PMID:25003281
2013-01-01
Background In recent years response rates on telephone surveys have been declining. Rates for the behavioral risk factor surveillance system (BRFSS) have also declined, prompting the use of new methods of weighting and the inclusion of cell phone sampling frames. A number of scholars and researchers have conducted studies of the reliability and validity of the BRFSS estimates in the context of these changes. As the BRFSS makes changes in its methods of sampling and weighting, a review of reliability and validity studies of the BRFSS is needed. Methods In order to assess the reliability and validity of prevalence estimates taken from the BRFSS, scholarship published from 2004–2011 dealing with tests of reliability and validity of BRFSS measures was compiled and presented by topics of health risk behavior. Assessments of the quality of each publication were undertaken using a categorical rubric. Higher rankings were achieved by authors who conducted reliability tests using repeated test/retest measures, or who conducted tests using multiple samples. A similar rubric was used to rank validity assessments. Validity tests which compared the BRFSS to physical measures were ranked higher than those comparing the BRFSS to other self-reported data. Literature which undertook more sophisticated statistical comparisons was also ranked higher. Results Overall findings indicated that BRFSS prevalence rates were comparable to other national surveys which rely on self-reports, although specific differences are noted for some categories of response. BRFSS prevalence rates were less similar to surveys which utilize physical measures in addition to self-reported data. There is very little research on reliability and validity for some health topics, but a great deal of information supporting the validity of the BRFSS data for others. 
Conclusions Limitations of the examination of the BRFSS were due to question differences among surveys used as comparisons, as well as mode of data collection differences. As the BRFSS moves to incorporating cell phone data and changing weighting methods, a review of reliability and validity research indicated that past BRFSS landline only data were reliable and valid as measured against other surveys. New analyses and comparisons of BRFSS data which include the new methodologies and cell phone data will be needed to ascertain the impact of these changes on estimates in the future. PMID:23522349
Kobayashi, Sarah; Peduto, Anthony; Simic, Milena; Fransen, Marlene; Refshauge, Kathryn; Mah, Jean; Pappas, Evangelos
2018-04-01
This work aimed to assess inter-rater reliability and agreement of a magnetic resonance imaging (MRI)-based Kellgren and Lawrence (K&L) grading for patellofemoral joint osteoarthritis (OA) and to validate it against the MRI Osteoarthritis Knee Score (MOAKS). MRI scans from people aged 45 to 75 years with chronic knee pain participating in a randomised clinical trial evaluating dietary supplements were utilised. Fifty participants were randomly selected and scored using the MRI-based K&L grading using axial and sagittal MRI scans. Raters performed the inter-rater reliability readings blinded to clinical information, radiology reports and other raters' results. Intra- and inter-rater reliability and agreement were evaluated using the intra-class correlation coefficient (ICC) and Cohen's weighted kappa. There was a 2-week interval between the first and second readings for intra-rater reliability. Validity was assessed using the MOAKS and evaluated using Spearman's correlation coefficient. Intra-rater reliability of the K&L system was excellent: ICC 0.91 (95% CI 0.82-0.95); weighted kappa (κ = 0.69). Inter-rater reliability was high (ICC 0.88; 95% CI 0.79-0.93), while agreement between raters was moderate (κ = 0.49-0.57). Validity analysis demonstrated a strong correlation between the total MOAKS features score and the K&L grading system (ρ = 0.62-0.67) but weak correlations when compared with individual MOAKS features (ρ = 0.19-0.61). The high reliability and good agreement show consistency in grading the severity of patellofemoral OA with the MRI-based K&L score. Our validity results suggest that the scale may be useful, particularly in the clinical environment. Future research should validate this method against clinical findings.
Tabard-Fougère, Anne; Bonnefoy-Mazure, Alice; Hanquinet, Sylviane; Lascombes, Pierre; Armand, Stéphane; Dayer, Romain
2017-01-15
Test-retest study. This study aimed to evaluate the validity and reliability of rasterstereography in patients with adolescent idiopathic scoliosis (AIS) with a major curve Cobb angle (CA) between 10° and 40° for frontal, sagittal, and transverse parameters. Previous studies evaluating the validity and reliability of rasterstereography concluded that this technique had good accuracy compared with radiographs and a high intra- and interday reliability in healthy volunteers. To the best of our knowledge, the validity and reliability have not been assessed in AIS patients. Thirty-five adolescents with AIS (male = 13) aged 13.1 ± 2.0 years were included. To evaluate the validity of the scoliosis angle (SA) provided by rasterstereography, a comparison (t test, Pearson correlation) was performed with the CA obtained using 2D EOS® radiography (XR). Three rasterstereographic repeated measurements were independently performed by two operators on the same day (interrater reliability) and again by the first operator 1 week later (intrarater reliability). The variables of interest were the SA, lumbar lordosis, and thoracic kyphosis angle, trunk length, pelvic obliquity, and maximum, root mean square and amplitude of vertebral rotations. The data analyses used intraclass correlation coefficients (ICCs). The CA and SA were strongly correlated (R = 0.70) and were nonsignificantly different (P = 0.60). The intrarater reliability (same day: ICC [1, 1], n = 35; 1 week later: ICC [1, 3], n = 28) and interrater reliability (ICC [3, 3], n = 16) were globally excellent (ICC > 0.75) except for the assessment of pelvic obliquity. This study showed that the rasterstereographic system allows for the evaluation of AIS patients with a good validity compared with XR with an overall excellent intra- and interrater reliability. 
Based on these results, this automatic, fast, and noninvasive system can be used for monitoring the evolution of AIS in growing patients instead of repetitive radiographs, thereby reducing radiation exposure and decreasing costs.
Validity of a novel computerized screening test system for mild cognitive impairment.
Park, Jin-Hyuck; Jung, Minye; Kim, Jongbae; Park, Hae Yean; Kim, Jung-Ran; Park, Ji-Hyuk
2018-06-20
Background: The mobile screening test system for screening mild cognitive impairment (mSTS-MCI) was developed for clinical use. However, the clinical usefulness of mSTS-MCI to distinguish elderly people with MCI from those who are cognitively healthy has yet to be validated. Moreover, the comparability between this system and traditional screening tests for MCI has not been evaluated. The purpose of this study was to examine the validity and reliability of the mSTS-MCI and confirm the cut-off scores to detect MCI. The data were collected from 107 healthy elderly people and 74 elderly people with MCI. Concurrent validity was examined using the Korean version of Montreal Cognitive Assessment (MoCA-K) as a gold standard test, and test-retest reliability was investigated using 30 of the study participants at four-week intervals. The sensitivity, specificity, positive predictive value, and negative predictive value (NPV) were confirmed through Receiver Operating Characteristic (ROC) analysis, and the cut-off scores for elderly people with MCI were identified. Concurrent validity showed statistically significant correlations between the mSTS-MCI and MoCA-K, and test-retest reliability indicated high correlation. In terms of screening predictability, the mSTS-MCI had a higher NPV than the MoCA-K. The mSTS-MCI was identified as a system with a high degree of validity and reliability. In addition, the mSTS-MCI showed high screening predictability, indicating it can be used in the clinical field as a screening test system for mild cognitive impairment.
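Sensitivity, specificity, positive predictive value and NPV at a chosen cut-off all follow from the 2×2 table of screening result against reference diagnosis. The sketch below shows that arithmetic for a single cut-off; it is not the ROC procedure used in the study, and the function name and counts are hypothetical.

```python
def screening_indices(tp, fp, fn, tn):
    """Diagnostic indices from a 2x2 table of screening result vs. reference diagnosis."""
    if min(tp, fp, fn, tn) < 0:
        raise ValueError("counts must be non-negative")
    if (tp + fn) == 0 or (tn + fp) == 0 or (tp + fp) == 0 or (tn + fn) == 0:
        raise ValueError("every row and column of the 2x2 table must be non-empty")
    return {
        "sensitivity": tp / (tp + fn),   # impaired people correctly flagged
        "specificity": tn / (tn + fp),   # healthy people correctly cleared
        "ppv": tp / (tp + fp),           # flagged people who are truly impaired
        "npv": tn / (tn + fn),           # cleared people who are truly healthy
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


# Hypothetical counts for one cut-off score (illustrative only).
for name, value in screening_indices(tp=40, fp=20, fn=10, tn=30).items():
    print(f"{name}: {value:.3f}")
```

An ROC analysis repeats this calculation across all candidate cut-offs and selects the one with the preferred trade-off between sensitivity and specificity.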
Validation and Improvement of Reliability Methods for Air Force Building Systems
focusing primarily on HVAC systems. This research used contingency analysis to assess the performance of each model for HVAC systems at six Air Force...probabilistic model produced inflated reliability calculations for HVAC systems. In light of these findings, this research employed a stochastic method, a...Nonhomogeneous Poisson Process (NHPP), in an attempt to produce accurate HVAC system reliability calculations. This effort ultimately concluded that
Validity and Reliability of a New Device (WIMU®) for Measuring Hamstring Muscle Extensibility.
Muyor, José M
2017-09-01
The aims of the current study were 1) to evaluate the validity of the WIMU® system for measuring hamstring muscle extensibility in the passive straight leg raise (PSLR) test using an inclinometer for the criterion and 2) to determine the test-retest reliability of the WIMU® system to measure hamstring muscle extensibility during the PSLR test. 55 subjects were evaluated on 2 separate occasions. Data from a Unilever inclinometer and WIMU® system were collected simultaneously. Intraclass correlation coefficients (ICCs) for the validity were very high (0.983-1); a very low systematic bias (-0.21°--0.42°), random error (0.05°-0.04°) and standard error of the estimate (0.43°-0.34°) were observed (left-right leg, respectively) between the 2 devices (inclinometer and the WIMU® system). The R² between the devices was 0.999 (p<0.001) in both the left and right legs. The test-retest reliability of the WIMU® system was excellent, with ICCs ranging from 0.972-0.995, low coefficients of variation (0.01%), and a low standard error of the estimate (0.19-0.31°). The WIMU® system showed strong concurrent validity and excellent test-retest reliability for the evaluation of hamstring muscle extensibility in the PSLR test. © Georg Thieme Verlag KG Stuttgart · New York.
Design and validation of an automated hydrostatic weighing system.
McClenaghan, B A; Rocchio, L
1986-08-01
The purpose of this study was to design and evaluate the validity of an automated technique to assess body density using a computerized hydrostatic weighing system. An existing hydrostatic tank was modified and interfaced with a microcomputer equipped with an analog-to-digital converter. Software was designed to input variables, control the collection of data, calculate selected measurements, and provide a summary of the results of each session. Validity of the data obtained utilizing the automated hydrostatic weighing system was estimated by: evaluating the reliability of the transducer/computer interface to measure objects of known underwater weight; comparing the data against a criterion measure; and determining inter-session subject reliability. Values obtained from the automated system were found to be highly correlated with known underwater weights (r = 0.99, SEE = 0.0060 kg). Data concurrently obtained utilizing the automated system and a manual chart recorder were also found to be highly correlated (r = 0.99, SEE = 0.0606 kg). Inter-session subject reliability was determined utilizing data collected on subjects (N = 16) tested on two occasions approximately 24 h apart. Correlations revealed high relationships between measures of underwater weight (r = 0.99, SEE = 0.1399 kg) and body density (r = 0.98, SEE = 0.00244 g X cm-1). Results indicate that a computerized hydrostatic weighing system is a valid and reliable method for determining underwater weight.
Validity and Reliability Testing of an e-learning Questionnaire for Chemistry Instruction
NASA Astrophysics Data System (ADS)
Guspatni, G.; Kurniawati, Y.
2018-04-01
The aim of this paper is to examine validity and reliability of a questionnaire used to evaluate e-learning implementation in chemistry instruction. 48 questionnaires were filled in by students who had studied chemistry through e-learning system. The questionnaire consisted of 20 indicators evaluating students’ perception on using e-learning. Parametric testing was done as data were assumed to follow normal distribution. Item validity of the questionnaire was examined through item-total correlation using Pearson’s formula while its reliability was assessed with Cronbach’s alpha formula. Moreover, convergent validity was assessed to see whether indicators building a factor had theoretically the same underlying construct. The result of validity testing revealed 19 valid indicators while the result of reliability testing revealed Cronbach’s alpha value of .886. The result of factor analysis showed that questionnaire consisted of five factors, and each of them had indicators building the same construct. This article shows the importance of factor analysis to get a construct valid questionnaire before it is used as research instrument.
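Cronbach's alpha, the reliability statistic reported above, can be illustrated with the usual formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). This is a minimal sketch under that assumption; the function name and the response matrix are made up for illustration and do not reproduce the questionnaire data.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a respondents x items score matrix.

    scores: list of respondents, each a list of k item scores.
    Uses sample variances (n - 1 denominator) throughout.
    """
    k = len(scores[0])
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical responses: 4 respondents, 3 Likert-type items
responses = [[2, 3, 3], [4, 4, 5], [1, 2, 2], [5, 5, 4]]
print(round(cronbach_alpha(responses), 3))
```

Alpha rises when items covary strongly relative to their individual variances, which is why it is read as a measure of internal consistency rather than of validity.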
NASA Aerospace Flight Battery Systems Program Update
NASA Technical Reports Server (NTRS)
Manzo, Michelle; ODonnell, Patricia
1997-01-01
The objectives of NASA's Aerospace Flight Battery Systems Program are to: develop, maintain and provide tools for the validation and assessment of aerospace battery technologies; accelerate the readiness of technology advances and provide infusion paths for emerging technologies; provide NASA projects with the required database and validation guidelines for technology selection of hardware and processes relating to aerospace batteries; disseminate validation and assessment tools, quality assurance, reliability, and availability information to the NASA and aerospace battery communities; and ensure that safe, reliable batteries are available for NASA's future missions.
Suzuki, T; Sato, Y; Sotome, S; Arai, H; Arai, A; Yoshida, H
2017-06-01
This study was designed to investigate the reliability and validity of measurements of finger diameters with a ring gauge. A reliability study enrolled two independent samples (50 participants and seven examiners in Study I; 26 participants and 26 examiners in Study II). The sizes of each participant's little fingers were measured twice with a ring gauge by each examiner. To investigate the validity of the measurements, five hand therapists compared the finger size and hand volume of 30 participants with the ring gauge and with a figure-of-eight technique (Study III). The intra-class correlation coefficient for intra-observer reliability ranged from 0.97 to 0.99 in Study I, and 0.90 to 0.97 in Study II. The intra-class correlation coefficient for inter-observer reliability was 0.95 in Study I and 0.94 in Study II. The validity study showed a Pearson product moment correlation coefficient of 0.75. The ring gauge showed high reliability and validity for measurement of finger size. III, diagnostic.
A Measure of Cognition within the Context of Assertion.
ERIC Educational Resources Information Center
Golden, Morrie
1981-01-01
Described the development and evaluation of a measure of cognitive belief systems and thinking styles. Reliability and validity results were poor for junior college students. For university and nonstudent populations, cognition scores discriminated social anxiety. The Cognition Scale of Assertiveness is a reliable and valid measure of cognitive…
ERIC Educational Resources Information Center
Wray, Kraig; Lai, Cheng-Fei; Sáez, Leilani; Alonzo, Julie; Tindal, Gerald
2013-01-01
We report the results of an alternate form reliability and criterion validity study of kindergarten and grade 1 (N = 84-199) reading measures from the easyCBM© assessment system and Stanford Early School Achievement Test/Stanford Achievement Test, 10th edition (SESAT/SAT-10) across 5 time points. The alternate form reliabilities ranged from…
Meaney, Calvin J; Arabi, Ziad; Venuto, Rocco C; Consiglio, Joseph D; Wilding, Gregory E; Tornatore, Kathleen M
2014-06-12
After renal transplantation, many patients experience adverse effects from maintenance immunosuppressive drugs. When these adverse effects occur, patient adherence with immunosuppression may be reduced and impact allograft survival. If these adverse effects could be prospectively monitored in an objective manner and possibly prevented, adherence to immunosuppressive regimens could be optimized and allograft survival improved. Prospective, standardized clinical approaches to assess immunosuppressive adverse effects by health care providers are limited. Therefore, we developed and evaluated the application, reliability and validity of a novel adverse effects scoring system in renal transplant recipients receiving calcineurin inhibitor (cyclosporine or tacrolimus) and mycophenolic acid based immunosuppressive therapy. The scoring system included 18 non-renal adverse effects organized into gastrointestinal, central nervous system and aesthetic domains developed by a multidisciplinary physician group. Nephrologists employed this standardized adverse effect evaluation in stable renal transplant patients using physical exam, review of systems, recent laboratory results, and medication adherence assessment during a clinic visit. Stable renal transplant recipients in two clinical studies were evaluated and received immunosuppressive regimens comprised of either cyclosporine or tacrolimus with mycophenolic acid. Face, content, and construct validity were assessed to document these adverse effect evaluations. Inter-rater reliability was determined using the Kappa statistic and intra-class correlation. A total of 58 renal transplant recipients were assessed using the adverse effects scoring system confirming face validity. Nephrologists (subject matter experts) rated the 18 adverse effects as: 3.1 ± 0.75 out of 4 (maximum) regarding clinical importance to verify content validity. 
The adverse effects scoring system distinguished 1.75-fold increased gastrointestinal adverse effects (p=0.008) in renal transplant recipients receiving tacrolimus and mycophenolic acid compared to the cyclosporine regimen. This finding demonstrated construct validity. Intra-class correlation was 0.81 (95% confidence interval: 0.65-0.90) and Kappa statistic of 0.68 ± 0.25 for all 18 adverse effects and verified substantial inter-rater reliability. This immunosuppressive adverse effects scoring system in stable renal transplant recipients was evaluated and substantiated face, content and construct validity with inter-rater reliability. The scoring system may facilitate prospective, standardized clinical monitoring of immunosuppressive adverse drug effects in stable renal transplant recipients and improve medication adherence.
MEMS reliability: The challenge and the promise
DOE Office of Scientific and Technical Information (OSTI.GOV)
Miller, W.M.; Tanner, D.M.; Miller, S.L.
1998-05-01
MicroElectroMechanical Systems (MEMS) that think, sense, act and communicate will open up a broad new array of cost effective solutions only if they prove to be sufficiently reliable. A valid reliability assessment of MEMS has three prerequisites: (1) statistical significance; (2) a technique for accelerating fundamental failure mechanisms, and (3) valid physical models to allow prediction of failures during actual use. These already exist for the microelectronics portion of such integrated systems. The challenge lies in the less well understood micromachine portions and their synergistic effects with microelectronics. This paper presents a methodology addressing these prerequisites and a description of the underlying physics of reliability for micromachines.
Sharma, Shreela; Chuang, Ru-Jye; Skala, Katherine; Atteberry, Heather
2012-01-01
The purpose of this study is to describe the initial feasibility, reliability, and validity of an instrument to measure physical activity in preschoolers using direct observation. The System for Observing Fitness Instruction Time for Preschoolers was developed and tested among 3- to 6-year-old children over fall 2008 for feasibility and reliability (Phase I, n=67) and in fall 2009 for concurrent validity (Phase II, n=27). Phase I showed that preschoolers spent >75% of their active time at preschool in light physical activity. The mean inter-observer agreement scores were ≥.75 for physical activity level and type. Correlation coefficients, measuring construct validity between the lesson context and physical activity types with the activity levels, were moderately strong. Phase II showed moderately strong correlations ranging from .50 to .54 between the System for Observing Fitness Instruction Time for Preschoolers and Actigraph accelerometers for physical activity levels. The System for Observing Fitness Instruction Time for Preschoolers shows promising initial results as a new method for measuring physical activity among preschoolers. PMID:22485071
Iwata, Shintaro; Uehara, Kosuke; Ogura, Koichi; Akiyama, Toru; Shinoda, Yusuke; Yonemoto, Tsukasa; Kawai, Akira
2016-09-01
The Musculoskeletal Tumor Society (MSTS) scoring system is a widely used functional evaluation tool for patients treated for musculoskeletal tumors. Although the MSTS scoring system has been validated in English and Brazilian Portuguese, a Japanese version of the MSTS scoring system has not yet been validated. We sought to determine whether a Japanese-language translation of the MSTS scoring system for the lower extremity had (1) sufficient reliability and internal consistency, (2) adequate construct validity, and (3) reasonable criterion validity compared with the Toronto Extremity Salvage Score (TESS) and SF-36 using psychometric analysis. The Japanese version of the MSTS scoring system was developed using accepted guidelines, which included translation of the English version of the MSTS into Japanese by five native Japanese bilingual musculoskeletal oncology surgeons and integrated into one document. One hundred patients with a diagnosis of intermediate or malignant bone or soft tissue tumors located in the lower extremity and who had undergone tumor resection with or without reconstruction or amputation participated in this study. Reliability was evaluated by test-retest analysis, and internal consistency was established by Cronbach's alpha coefficient. Construct validity was evaluated using the principal factor analysis and Akaike information criterion network. Criterion validity was evaluated by comparing the MSTS scoring system with the TESS and SF-36. Test-retest analysis showed a high intraclass correlation coefficient (0.92; 95% CI, 0.88-0.95), indicating high reliability of the Japanese version of the MSTS scoring system, although a considerable ceiling effect was observed, with 23 patients (23%) given the maximum score. Cronbach's alpha coefficient was 0.87 (95% CI, 0.82-0.90), suggesting a high level of internal consistency. 
Factor analysis revealed that all items had high loading values and communalities; we identified a central role for the items "walking" and "gait" according to the Akaike information criterion network. The total MSTS score was correlated with that of the TESS (r = 0.81; 95% CI, 0.73-0.87; p < 0.001) and the physical component summary and physical functioning of the SF-36. The Japanese-language translation of the MSTS scoring system for the lower extremity has sufficient reliability and reasonable validity. Nevertheless, the observation of a ceiling effect suggests poor ability of this system to discriminate from among patients who have a high level of function.
Downer, Jason T.; Booren, Leslie M.; Lima, Olivia K.; Luckner, Amy E.; Pianta, Robert C.
2012-01-01
This paper introduces the Individualized Classroom Assessment Scoring System (inCLASS), an observation tool that targets children’s interactions in preschool classrooms with teachers, peers, and tasks. In particular, initial evidence is reported of the extent to which the inCLASS meets the following psychometric criteria: inter-rater reliability, normal distributions and adequate range, construct validity, and criterion-related validity. These initial findings suggest that the inCLASS has the potential to provide an authentic, contextualized assessment of young children’s classroom behaviors. Future directions for research with the inCLASS are discussed. PMID:23175598
Broderick, Joan E.; Schneider, Stefan; Junghaenel, Doerte U.; Schwartz, Joseph E.; Stone, Arthur A.
2013-01-01
Objective: Evaluation of known group validity, ecological validity, and test-retest reliability of four domain instruments from the Patient Reported Outcomes Measurement System (PROMIS) in osteoarthritis (OA) patients. Methods: Recruitment of an osteoarthritis sample and a comparison general population (GP) through an Internet survey panel. Pain intensity, pain interference, physical functioning, and fatigue were assessed for 4 consecutive weeks with PROMIS short forms on a daily basis and compared with same-domain Computer Adaptive Test (CAT) instruments that use a 7-day recall. Known group validity (comparison of OA and GP), ecological validity (comparison of aggregated daily measures with CATs), and test-retest reliability were evaluated. Results: The recruited samples matched (age, sex, race, ethnicity) the demographic characteristics of the U.S. sample for arthritis and the 2009 Census for the GP. Compliance with repeated measurements was excellent: > 95%. Known group validity for CATs was demonstrated with large effect sizes (pain intensity: 1.42, pain interference: 1.25, and fatigue: .85). Ecological validity was also established through high correlations between aggregated daily measures and weekly CATs (≥ .86). Test-retest reliability (7-day) was very good (≥ .80). Conclusion: PROMIS CAT instruments demonstrated known group and ecological validity in a comparison of osteoarthritis patients with a general population sample. Adequate test-retest reliability was also observed. These data provide encouraging initial data on the utility of these PROMIS instruments for clinical and research outcomes in osteoarthritis patients. PMID:23592494
Mentiplay, Benjamin F; Perraton, Luke G; Bower, Kelly J; Pua, Yong-Hao; McGaw, Rebekah; Heywood, Sophie; Clark, Ross A
2015-07-16
The revised Xbox One Kinect, also known as the Microsoft Kinect V2 for Windows, includes enhanced hardware which may improve its utility as a gait assessment tool. This study examined the concurrent validity and inter-day reliability of spatiotemporal and kinematic gait parameters estimated using the Kinect V2 automated body tracking system and a criterion reference three-dimensional motion analysis (3DMA) marker-based camera system. Thirty healthy adults performed two testing sessions consisting of comfortable and fast paced walking trials. Spatiotemporal outcome measures related to gait speed, speed variability, step length, width and time, foot swing velocity and medial-lateral and vertical pelvis displacement were examined. Kinematic outcome measures including ankle flexion, knee flexion and adduction and hip flexion were examined. To assess the agreement between Kinect and 3DMA systems, Bland-Altman plots, relative agreement (Pearson's correlation) and overall agreement (concordance correlation coefficients) were determined. Reliability was assessed using intraclass correlation coefficients, Cronbach's alpha and standard error of measurement. The spatiotemporal measurements had consistently excellent (r≥0.75) concurrent validity, with the exception of modest validity for medial-lateral pelvis sway (r=0.45-0.46) and fast paced gait speed variability (r=0.73). In contrast kinematic validity was consistently poor to modest, with all associations between the systems weak (r<0.50). In those measures with acceptable validity, the inter-day reliability was similar between systems. In conclusion, while the Kinect V2 body tracking may not accurately obtain lower body kinematic data, it shows great potential as a tool for measuring spatiotemporal aspects of gait. Copyright © 2015 Elsevier Ltd. All rights reserved.
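The Bland-Altman analysis mentioned above summarizes agreement between two measurement systems as a mean difference (bias) with 95% limits of agreement, conventionally bias ± 1.96 standard deviations of the paired differences. The following is a minimal sketch of that convention; the function name and the paired values are hypothetical and are not the study's gait data.

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b):
    """Return (bias, lower_limit, upper_limit) for paired measurements.

    Limits of agreement use the conventional bias +/- 1.96 * SD of the
    paired differences (sample SD, n - 1 denominator).
    """
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical gait speeds (m/s) from two systems on the same trials
system_a = [1.0, 2.0, 3.0, 4.0]
system_b = [0.5, 1.5, 3.5, 3.5]
bias, lower, upper = bland_altman(system_a, system_b)
print(f"bias={bias:.2f}, limits=({lower:.2f}, {upper:.2f})")
```

Unlike a correlation coefficient, this summary exposes a constant offset between systems, which is why the two are typically reported together.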
Baker, Nancy A; Cook, James R; Redfern, Mark S
2009-01-01
This paper describes the inter-rater and intra-rater reliability, and the concurrent validity of an observational instrument, the Keyboard Personal Computer Style instrument (K-PeCS), which assesses stereotypical postures and movements associated with computer keyboard use. Three trained raters independently rated the video clips of 45 computer keyboard users to ascertain inter-rater reliability, and then re-rated a sub-sample of 15 video clips to ascertain intra-rater reliability. Concurrent validity was assessed by comparing the ratings obtained using the K-PeCS to scores developed from a 3D motion analysis system. The overall K-PeCS had excellent reliability [inter-rater: intra-class correlation coefficients (ICC)=.90; intra-rater: ICC=.92]. Most individual items on the K-PeCS had from good to excellent reliability, although six items fell below ICC=.75. Those K-PeCS items that were assessed for concurrent validity compared favorably to the motion analysis data for all but two items. These results suggest that most items on the K-PeCS can be used to reliably document computer keyboarding style.
Gunnarsson, U; Johansson, M; Strigård, K
2011-08-01
The decrease in recurrence rates in ventral hernia surgery has led to a redirection of focus towards other important patient-related endpoints. One such endpoint is abdominal wall function. The aim of the present study was to evaluate the reliability and external validity of abdominal wall strength measurement using the Biodex System-4 with a back abdomen unit. Ten healthy volunteers and ten patients with ventral hernias exceeding 10 cm were recruited. Test-retest reliability, both with and without girdle, was evaluated by comparison of measurements at two test occasions 1 week apart. Reliability was calculated by the intraclass correlation coefficient (ICC) method. Validity was evaluated by correlation with the well-established International Physical Activity Questionnaire (IPAQ) and a self-assessment of abdominal wall strength. One person in the healthy group was excluded after the first test due to neck problems following minor trauma. The reliability was excellent (>0.75), with ICC values between 0.92 and 0.97 for the different modalities tested. No differences were seen between testing with and without a girdle. Validity was also excellent both when calculated as correlation to self-assessment of abdominal wall strength, and to IPAQ, giving Kendall tau values of 0.51 and 0.47, respectively, and corresponding P values of 0.002 and 0.004. Measurement of abdominal muscle function using the Biodex System-4 is a reliable and valid method to assess this important patient-related endpoint. Further investigations will be made to explore the potential of this technique in the evaluation of the results of ventral hernia surgery, and to compare muscle function after different abdominal wall reconstruction techniques.
Monteiro-Soares, M; Martins-Mendes, D; Vaz-Carneiro, A; Sampaio, S; Dinis-Ribeiro, M
2014-10-01
We systematically review the available systems used to classify diabetic foot ulcers in order to synthesize their methodological qualitative issues and accuracy to predict lower extremity amputation, as this may represent a critical point in these patients' care. Two investigators searched, in EBSCO, ISI, PubMed and SCOPUS databases, and independently selected studies published until May 2013 and reporting prognostic accuracy and/or reliability of specific systems for patients with diabetic foot ulcer in order to predict lower extremity amputation. We included 25 studies reporting a prevalence of lower extremity amputation between 6% and 78%. Eight different diabetic foot ulcer descriptions and seven prognostic stratification classification systems were addressed with a variable (1-9) number of factors included, especially peripheral arterial disease (n = 12) or infection at the ulcer site (n = 10) or ulcer depth (n = 10). The Meggitt-Wagner, S(AD)SAD and Texas University Classification systems were the most extensively validated, whereas ten classifications were derived or validated only once. Reliability was reported in a single study, and accuracy measures were reported in five studies with another eight allowing their calculation. Pooled accuracy ranged from 0.65 (for gangrene) to 0.74 (for infection). There are numerous classification systems for diabetic foot ulcer outcome prediction, but only a few studies evaluated their reliability or external validity. Studies rarely validated several systems simultaneously and only a few reported accuracy measures. Further studies assessing reliability and accuracy of the available systems and their composing variables are needed. Copyright © 2014 John Wiley & Sons, Ltd.
A Robust Compositional Architecture for Autonomous Systems
NASA Technical Reports Server (NTRS)
Brat, Guillaume; Deney, Ewen; Farrell, Kimberley; Giannakopoulos, Dimitra; Jonsson, Ari; Frank, Jeremy; Bobby, Mark; Carpenter, Todd; Estlin, Tara
2006-01-01
Space exploration applications can benefit greatly from autonomous systems. Great distances, limited communications and high costs make direct operations impossible while mandating operations reliability and efficiency beyond what traditional commanding can provide. Autonomous systems can improve reliability and enhance spacecraft capability significantly. However, there is reluctance to utilize autonomous systems. In part this is due to general hesitation about new technologies, but a more tangible concern is that of the reliability and predictability of autonomous software. In this paper, we describe ongoing work aimed at increasing robustness and predictability of autonomous software, with the ultimate goal of building trust in such systems. The work combines state-of-the-art technologies and capabilities in autonomous systems with advanced validation and synthesis techniques. The focus of this paper is on the autonomous system architecture that has been defined, and on how it enables the application of validation techniques for resulting autonomous systems.
14 CFR 417.307 - Support systems.
Code of Federal Regulations, 2010 CFR
2010-01-01
... subsystem, component, and part that can affect the reliability of the support system must have written..., evaluate the data for validity, and provide valid data for display and recording; (3) Perform any... input and processed data at a rate that maintains the validity of the data and at no less than 0.1...
Hardware and software reliability estimation using simulations
NASA Technical Reports Server (NTRS)
Swern, Frederic L.
1994-01-01
The simulation technique is used to explore the validation of both hardware and software. It was concluded that simulation is a viable means for validating both hardware and software and associating a reliability number with each. This is useful in determining the overall probability of system failure of an embedded processor unit, and improving both the code and the hardware where necessary to meet reliability requirements. The methodologies were proved using some simple programs, and simple hardware models.
An Innovative Excel Application to Improve Exam Reliability in Marketing Courses
ERIC Educational Resources Information Center
Keller, Christopher M.; Kros, John F.
2011-01-01
Measures of survey reliability are commonly addressed in marketing courses. One statistic of reliability is "Cronbach's alpha." This paper presents an application of survey reliability as a reflexive application of multiple-choice exam validation. The application provides an interactive decision support system that incorporates survey item…
Measurement in Sensory Modulation: The Sensory Processing Scale Assessment
Miller, Lucy J.; Sullivan, Jillian C.
2014-01-01
OBJECTIVE. Sensory modulation issues have a significant impact on participation in daily life. Moreover, understanding phenotypic variation in sensory modulation dysfunction is crucial for research related to defining homogeneous groups and for clinical work in guiding treatment planning. We thus evaluated the new Sensory Processing Scale (SPS) Assessment. METHOD. Research included item development, behavioral scoring system development, test administration, and item analyses to evaluate reliability and validity across sensory domains. RESULTS. Items with adequate reliability (internal reliability >.4) and discriminant validity (p < .01) were retained. Feedback from the expert panel also contributed to decisions about retaining items in the scale. CONCLUSION. The SPS Assessment appears to be a reliable and valid measure of sensory modulation (scale reliability >.90; discrimination between group effect sizes >1.00). This scale has the potential to aid in differential diagnosis of sensory modulation issues. PMID:25184464
Evaluating the Level of Degree Programmes in Higher Education: The Case of Nursing
ERIC Educational Resources Information Center
Rexwinkel, Trudy; Haenen, Jacques; Pilot, Albert
2013-01-01
The European Quality Assurance system demands that the degree programme level is represented in terms of quantitative outcomes to be valid and reliable. To meet this need the Educational Level Evaluator (ELE) was devised. This conceptually designed procedure with instrumentation aiming to evaluate the level of a degree validly and reliably still…
ERIC Educational Resources Information Center
Chang, Chi-Cheng; Liang, Chaoyun; Chen, Yi-Hui
2013-01-01
This study explored the reliability and validity of Web-based portfolio self-assessment. Participants were 72 senior high school students enrolled in a computer application course. The students created learning portfolios, viewed peers' work, and performed self-assessment on the Web-based portfolio assessment system. The results indicated: 1)…
Validity and reliability of the Paprosky acetabular defect classification.
Yu, Raymond; Hofstaetter, Jochen G; Sullivan, Thomas; Costi, Kerry; Howie, Donald W; Solomon, Lucian B
2013-07-01
The Paprosky acetabular defect classification is widely used but has not been appropriately validated. Reliability of the Paprosky system has not been evaluated in combination with standardized techniques of measurement and scoring. This study evaluated the reliability, teachability, and validity of the Paprosky acetabular defect classification. Preoperative radiographs from a random sample of 83 patients undergoing 85 acetabular revisions were classified by four observers, and their classifications were compared with quantitative intraoperative measurements. Teachability of the classification scheme was tested by dividing the four observers into two groups. The observers in Group 1 underwent three teaching sessions; those in Group 2 underwent one session and the influence of teaching on the accuracy of their classifications was ascertained. Radiographic evaluation showed statistically significant relationships with intraoperative measurements of anterior, medial, and superior acetabular defect sizes. Interobserver reliability improved substantially after teaching and did not improve without it. The weighted kappa coefficient went from 0.56 at Occasion 1 to 0.79 after three teaching sessions in Group 1 observers, and from 0.49 to 0.65 after one teaching session in Group 2 observers. The Paprosky system is valid and shows good reliability when combined with standardized definitions of radiographic landmarks and a structured analysis. Level II, diagnostic study. See the Guidelines for Authors for a complete description of levels of evidence.
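Interobserver agreement on an ordinal classification, as in the abstract above, is often summarized with a weighted kappa that penalizes disagreements by how far apart the two grades are. The sketch below assumes linear disagreement weights |i - j| / (c - 1); the abstract does not state which weighting scheme the study used, and the function name and ratings are hypothetical.

```python
def weighted_kappa(rater1, rater2, n_categories):
    """Linear-weighted Cohen's kappa for two raters.

    rater1, rater2: equal-length lists of category indices 0..n_categories-1.
    Linear weights are an assumption made for this illustration.
    """
    n = len(rater1)
    # Observed joint proportions
    observed = [[0.0] * n_categories for _ in range(n_categories)]
    for a, b in zip(rater1, rater2):
        observed[a][b] += 1.0 / n
    row = [sum(observed[i]) for i in range(n_categories)]
    col = [sum(observed[i][j] for i in range(n_categories))
           for j in range(n_categories)]
    weighted_obs = 0.0
    weighted_exp = 0.0
    for i in range(n_categories):
        for j in range(n_categories):
            w = abs(i - j) / (n_categories - 1)  # disagreement weight
            weighted_obs += w * observed[i][j]
            weighted_exp += w * row[i] * col[j]  # chance-expected proportion
    return 1.0 - weighted_obs / weighted_exp


# Hypothetical grades (0, 1, 2) assigned by two observers to three cases
print(weighted_kappa([0, 1, 2], [0, 1, 1], n_categories=3))
```

A value of 1 means perfect agreement and 0 means agreement no better than chance given each rater's marginal grade frequencies.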
COPES Report: System Reliability Study.
ERIC Educational Resources Information Center
Foothill-De Anza Community Coll. District, Los Altos Hills, CA.
The study examines the reliability of the Community College Occupational Programs Evaluation System (COPES). The COPES process is a system for evaluating program strengths and needs. A two-way test, college self-appraisal with third party validation of the self-appraisal, is utilized to assist community colleges in future institutional planning…
Reliability Modeling of Microelectromechanical Systems Using Neural Networks
NASA Technical Reports Server (NTRS)
Perera, J. Sebastian
2000-01-01
Microelectromechanical systems (MEMS) are a broad and rapidly expanding field that is currently receiving a great deal of attention because of the potential to significantly improve the ability to sense, analyze, and control a variety of processes, such as heating and ventilation systems, automobiles, medicine, aeronautical flight, military surveillance, weather forecasting, and space exploration. MEMS are very small and are a blend of electrical and mechanical components, with electrical and mechanical systems on one chip. This research establishes reliability estimation and prediction for MEMS devices at the conceptual design phase using neural networks. At the conceptual design phase, before devices are built and tested, traditional methods of quantifying reliability are inadequate because the device is not in existence and cannot be tested to establish the reliability distributions. A novel approach using neural networks is created to predict the overall reliability of a MEMS device based on its components and each component's attributes. The methodology begins with collecting attribute data (fabrication process, physical specifications, operating environment, property characteristics, packaging, etc.) and reliability data for many types of microengines. The data are partitioned into training data (the majority) and validation data (the remainder). A neural network is applied to the training data (both attribute and reliability); the attributes become the system inputs and the reliability data (cycles to failure) the system output. After the neural network is trained with sufficient data, the validation data are used to verify that the neural network provides accurate reliability estimates. The reliability of a newly proposed MEMS device can then be estimated by using the appropriate trained neural networks developed in this work.
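The partition-train-validate workflow described above can be sketched as follows. This is not the author's network: a single linear neuron fitted by gradient descent stands in for the neural network, and the data, function names, split fraction, and learning settings are all hypothetical.

```python
import random


def split_records(records, train_fraction=0.8, seed=0):
    """Shuffle and partition into training (majority) and validation (remainder)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def train_linear_neuron(train, learning_rate=0.1, epochs=2000):
    """Fit y ~ w*x + b by batch gradient descent on mean squared error.

    A deliberately tiny stand-in for a neural network.
    """
    w, b = 0.0, 0.0
    n = len(train)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in train) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in train) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


def mean_absolute_error(model, records):
    """Average absolute prediction error on held-out records."""
    w, b = model
    return sum(abs(w * x + b - y) for x, y in records) / len(records)


# Made-up data: one scaled device attribute -> scaled cycles to failure.
records = [(i / 19, 2.0 * (i / 19) + 1.0) for i in range(20)]
train, validation = split_records(records)
model = train_linear_neuron(train)
validation_error = mean_absolute_error(model, validation)
```

The point of the held-out validation set is that `validation_error` is computed on records the model never saw during training, which is what lets it serve as a check on the estimates.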
Goode, N; Salmon, P M; Taylor, N Z; Lenné, M G; Finch, C F
2017-10-01
One factor potentially limiting the uptake of Rasmussen's (1997) Accimap method by practitioners is the lack of a contributing factor classification scheme to guide accident analyses. This article evaluates the intra- and inter-rater reliability and criterion-referenced validity of a classification scheme developed to support the use of Accimap by led outdoor activity (LOA) practitioners. The classification scheme has two levels: the system level describes the actors, artefacts and activity context in terms of 14 codes; the descriptor level breaks the system level codes down into 107 specific contributing factors. The study involved 11 LOA practitioners using the scheme on two separate occasions to code a pre-determined list of contributing factors identified from four incident reports. Criterion-referenced validity was assessed by comparing the codes selected by LOA practitioners to those selected by the method creators. Mean intra-rater reliability scores at the system (M = 83.6%) and descriptor (M = 74%) levels were acceptable. Mean inter-rater reliability scores were not consistently acceptable for both coding attempts at the system level (M T1 = 68.8%; M T2 = 73.9%), and were poor at the descriptor level (M T1 = 58.5%; M T2 = 64.1%). Mean criterion referenced validity scores at the system level were acceptable (M T1 = 73.9%; M T2 = 75.3%). However, they were not consistently acceptable at the descriptor level (M T1 = 67.6%; M T2 = 70.8%). Overall, the results indicate that the classification scheme does not currently satisfy reliability and validity requirements, and that further work is required. The implications for the design and development of contributing factors classification schemes are discussed. Copyright © 2017 Elsevier Ltd. All rights reserved.
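Percentage-agreement scores of the kind reported above can be sketched as below. The abstract does not give the exact computation, so mean pairwise agreement between raters, and simple agreement with a criterion coding, are assumptions; the function names and example codes are hypothetical.

```python
from itertools import combinations


def percent_agreement(codes_a, codes_b):
    """Percentage of items to which two coders assigned the same code."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both coders must code the same non-empty list of items")
    matches = sum(1 for a, b in zip(codes_a, codes_b) if a == b)
    return 100.0 * matches / len(codes_a)


def mean_pairwise_agreement(codes_by_rater):
    """Inter-rater agreement as the mean over all pairs of raters."""
    pairs = list(combinations(codes_by_rater, 2))
    return sum(percent_agreement(a, b) for a, b in pairs) / len(pairs)


# Hypothetical codes assigned by three raters to four contributing factors.
rater_1 = ["A", "B", "C", "D"]
rater_2 = ["A", "B", "C", "X"]
rater_3 = ["A", "B", "Y", "X"]
inter_rater = mean_pairwise_agreement([rater_1, rater_2, rater_3])
# Criterion-referenced validity: compare one rater against an expert coding.
criterion = percent_agreement(rater_2, ["A", "B", "C", "D"])
```

Intra-rater reliability follows the same pattern, with `percent_agreement` applied to one rater's codes from the two occasions.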
Reliability and validity of the Microsoft Kinect for evaluating static foot posture
2013-01-01
Background: The evaluation of foot posture in a clinical setting is useful to screen for potential injury; however, disagreement remains as to which method has the greatest clinical utility. An inexpensive and widely available imaging system, the Microsoft Kinect™, may possess the characteristics to objectively evaluate static foot posture in a clinical setting with high accuracy. The aim of this study was to assess the intra-rater reliability and validity of this system for assessing static foot posture. Methods: Three measures were used to assess static foot posture: traditional visual observation using the Foot Posture Index (FPI), a 3D motion analysis (3DMA) system, and software designed to collect and analyse image and depth data from the Kinect. Spearman’s rho was used to assess intra-rater reliability and concurrent validity of the Kinect to evaluate foot posture, and a linear regression was used to examine the ability of the Kinect to predict total visual FPI score. Results: The Kinect demonstrated moderate to good intra-rater reliability for four FPI items of foot posture (ρ = 0.62 to 0.78) and moderate to good correlations with the 3DMA system for four items of foot posture (ρ = 0.51 to 0.85). In contrast, intra-rater reliability of visual FPI items was poor to moderate (ρ = 0.17 to 0.63), and correlations with the Kinect and 3DMA systems were poor (absolute ρ = 0.01 to 0.44). Kinect FPI items with moderate to good reliability predicted 61% of the variance in total visual FPI score. Conclusions: The majority of the foot posture items derived using the Kinect were more reliable than the traditional visual assessment of FPI, and were valid when compared to a 3DMA system. Individual foot posture items recorded using the Kinect were also shown to predict a moderate degree of variance in the total visual FPI score. Combined, these results support the future potential of the Kinect to accurately evaluate static foot posture in a clinical setting. PMID:23566934
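Spearman's rho, used above for both reliability and concurrent validity, is the Pearson correlation of the ranked data. A minimal sketch follows, with average ranks for ties; the function names are illustrative and the study's own software is not known.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson's r computed on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))
```

Because only ranks enter the calculation, any monotonic relationship between two sessions or two systems yields a rho of 1, which suits ordinal item scores such as those of the FPI.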
Validation of the VISA-A questionnaire for Turkish language: the VISA-A-Tr study.
Dogramaci, Yunus; Kalaci, Aydiner; Kücükkübas, Nigar; Inandi, Taceddin; Esen, Erdinc; Yanat, A Nedim
2011-04-01
To evaluate the validity and reliability of the Turkish version of the Victorian Institute of Sports Assessment-Achilles (VISA-A) questionnaire for patients with Achilles tendinopathy. Fifty-five patients with a diagnosis of Achilles tendinopathy and 55 healthy subjects were included in the study. VISA-A questionnaires were translated and culturally adapted into Turkish. The final Turkish version (VISA-A-Tr) was tested for reliability on healthy individuals and patients. Tests for internal consistency, validity and structure were performed on 55 patients. The VISA-A-Tr showed good test-retest reliability (Pearson's r=0.99, p<0.001). The patients with Achilles tendinopathy had a significantly lower score (p<0.001) than the healthy individuals. The VISA-A-Tr score correlated significantly with the Stanish tendon grading system (Spearman's r=-0.86; p<0.001). The VISA-A-Tr is a valid and reliable tool for evaluating the severity of Achilles tendinopathy.
Akpinar, Pinar; Tezel, Canan G; Eliasson, Ann-Christin; Icagasioglu, Afitap
2010-01-01
To determine the reliability and cross-cultural validation of the Turkish translation of the Manual Ability Classification System (MACS) for children with cerebral palsy (CP) and to investigate the relation to gross motor function and other comorbidities. After the forward and backward translation procedures, inter-rater and test-retest reliability was assessed between parents, physiotherapists and physicians using the intra-class correlation coefficient (ICC). Children (N = 118, 4 to 18 years, mean age 9 years 4 months; 68 boys, 50 girls) with various types of CP were classified. Additional data on the Gross Motor Function Classification System (GMFCS), intellectual delay, visual acuity, and epilepsy were collected. The inter-rater reliability was high; the ICC ranged from 0.89 to 0.96 among different professionals and parents. Between two persons of the same profession it ranged from 0.97 to 0.98. For the test-retest reliability it ranged from 0.91 to 0.98. Total agreement between the GMFCS and the MACS occurred in only 45% of the children. The level of the MACS was found to correlate with the accompanying comorbidities, namely intellectual delay and epilepsy. The Turkish version of the MACS is found to be valid and reliable, and is suggested to be appropriate for the assessment of manual ability within the Turkish population.
Reliability and validity of the Microsoft Kinect for assessment of manual wheelchair propulsion.
Milgrom, Rachel; Foreman, Matthew; Standeven, John; Engsberg, Jack R; Morgan, Kerri A
2016-01-01
Concurrent validity and test-retest reliability of the Microsoft Kinect in quantification of manual wheelchair propulsion were examined. Data were collected from five manual wheelchair users on a roller system. Three Kinect sensors were used to assess test-retest reliability with a still pose. Three systems were used to assess concurrent validity of the Kinect to measure propulsion kinematics (joint angles, push loop characteristics): Kinect, Motion Analysis, and Dartfish ProSuite (Dartfish joint angles were limited to shoulder and elbow flexion). Intraclass correlation coefficients revealed good reliability (0.87-0.99) between five of the six joint angles (neck flexion, shoulder flexion, shoulder abduction, elbow flexion, wrist flexion). ICCs suggested good concurrent validity for elbow flexion between the Kinect and Dartfish and between the Kinect and Motion Analysis. Good concurrent validity was revealed for maximum height, hand-axle relationship, and maximum area (0.92-0.95) between the Kinect and Dartfish and maximum height and hand-axle relationship (0.89-0.96) between the Kinect and Motion Analysis. Analysis of variance revealed significant differences (p < 0.05) in maximum length between Dartfish (mean 58.76 cm) and the Kinect (40.16 cm). Results pose promising research and clinical implications for propulsion assessment and overuse injury prevention with the application of current findings to future technology.
Boerebach, Benjamin C M; Lombarts, Kiki M J M H; Arah, Onyebuchi A
2016-03-01
The System for Evaluation of Teaching Qualities (SETQ) was developed as a formative system for the continuous evaluation and development of physicians' teaching performance in graduate medical training. It has been seven years since the introduction and initial exploratory psychometric analysis of the SETQ questionnaires. This study investigates the validity and reliability of the SETQ questionnaires across hospitals and medical specialties using confirmatory factor analyses (CFAs), reliability analysis, and generalizability analysis. The SETQ questionnaires were tested in a sample of 3,025 physicians and 2,848 trainees in 46 hospitals. The CFA revealed acceptable fit of the data to the previously identified five-factor model. The high internal consistency estimates suggest satisfactory reliability of the subscales. These results provide robust evidence for the validity and reliability of the SETQ questionnaires for evaluating physicians' teaching performance. © The Author(s) 2014.
Vuorenmaa, M; Halme, N; Åstedt-Kurki, P; Kaunonen, M; Perälä, M-L
2014-07-01
The Family Empowerment Scale (FES) is a widely used instrument which measures the parents' own sense of their empowerment at the level of the family, service system and community. It was originally developed for parents of children with emotional disabilities. The aims of this study were to evaluate the validity and reliability of the Finnish FES and to examine its responsiveness in measuring the empowerment of parents with small children. The English FES was translated into Finnish using back translation and modified so as to be generic and convenient for all families. The construct, convergent, discriminant and concurrent validities, reliability and responsiveness of the Finnish FES were examined. Participants (n = 955) were the parents of children aged 0-9 years who had been selected using stratified random sampling. Confirmatory factor analysis proved that the Finnish FES had three subscales based on the original FES. Convergent and discriminant validities confirmed and supported the same construct. The relationship between parents' participation and empowerment was tested for concurrent validity. As in previous FES studies, the participating parents were more empowered, which supported the concurrent validity. The reliability of the Finnish FES proved acceptable for both parents. The Finnish FES could also discriminate the responses of the parents. Participation in the activities organized by the family service system influenced parents' perceptions of empowerment more than did their background characteristics. The Finnish FES is a valid and reliable instrument and it is suitable for measuring the empowerment of parents. However, it is necessary to consider how the FES would identify in the best way the parents who perhaps need some help. © 2013 John Wiley & Sons Ltd.
Uehara, Kosuke; Ogura, Koichi; Akiyama, Toru; Shinoda, Yusuke; Iwata, Shintaro; Kobayashi, Eisuke; Tanzawa, Yoshikazu; Yonemoto, Tsukasa; Kawano, Hirotaka; Kawai, Akira
2017-09-01
The Musculoskeletal Tumor Society (MSTS) scoring system developed in 1993 is a widely used disease-specific evaluation tool for assessment of physical function in patients with musculoskeletal tumors; however, only a few studies have confirmed its reliability and validity. The aim of this study was to validate the MSTS scoring system for the upper extremity (MSTS-UE) in Japanese patients with musculoskeletal tumors for use by others in research. Does the MSTS-UE have: (1) sufficient reliability and internal consistency; (2) adequate construct validity; and (3) reasonable criterion validity in comparison to the Toronto Extremity Salvage Score (TESS) or SF-36? Reliability was performed using test-retest analysis, and internal consistency was evaluated with Cronbach's alpha coefficient. Construct validity was evaluated using a scree plot to confirm the construct number and the Akaike information criterion network. Criterion validity was evaluated by comparing the MSTS-UE with the TESS and SF-36. The test-retest reliability with intraclass correlation coefficient (0.95; 95% CI, 0.91-0.97) was excellent, and internal consistency with Cronbach's α (0.7; 95% CI, 0.53-0.81) was acceptable. There were no ceiling and floor effects. The Akaike Information Criterion network showed that lifting ability, pain, and dexterity played central roles among the components. The MSTS-UE showed substantial correlation with the TESS scoring scale (r = 0.75; p < 0.001) and fair correlation with the SF-36 physical component summary (r = 0.37; p = 0.007). Although the MSTS-UE showed slight correlation with the SF-36 mental component summary, the emotional acceptance component of the MSTS-UE showed fair correlation (r = 0.29; p = 0.039). We can conclude that the MSTS is not an adequate measure of general health-related quality of life; however, this system was designed mainly to be a simple measure of function in a single extremity. 
To evaluate the mental state of patients with musculoskeletal tumors in the upper extremity, further study is needed.
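Internal consistency via Cronbach's alpha, as reported above, can be sketched with the standard formula alpha = k/(k-1) × (1 − sum of item variances / variance of total scores). The function name and example scores are hypothetical, and sample (n−1) variances are an assumption.

```python
from statistics import variance


def cronbach_alpha(scores_by_respondent):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(scores_by_respondent[0])
    if k < 2 or len(scores_by_respondent) < 2:
        raise ValueError("need at least two items and two respondents")
    items = list(zip(*scores_by_respondent))  # transpose to per-item columns
    item_variance_sum = sum(variance(column) for column in items)
    total_variance = variance([sum(row) for row in scores_by_respondent])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical scores: three respondents, three perfectly consistent items.
example_alpha = cronbach_alpha([[1, 1, 1], [2, 2, 2], [3, 3, 3]])
```

Alpha approaches 1 when items rise and fall together across respondents and approaches 0 when they vary independently.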
The Facial Expression Coding System (FACES): Development, Validation, and Utility
ERIC Educational Resources Information Center
Kring, Ann M.; Sloan, Denise M.
2007-01-01
This article presents information on the development and validation of the Facial Expression Coding System (FACES; A. M. Kring & D. Sloan, 1991). Grounded in a dimensional model of emotion, FACES provides information on the valence (positive, negative) of facial expressive behavior. In 5 studies, reliability and validity data from 13 diverse…
2011-01-01
Background: The aim of this study was to develop a child-specific classification system for long bone fractures and to examine its reliability and validity on the basis of a prospective multicentre study. Methods: Using the sequentially developed classification system, three samples of between 30 and 185 paediatric limb fractures from a pool of 2308 fractures documented in two multicenter studies were analysed in a blinded fashion by eight orthopaedic surgeons, on a total of 5 occasions. Intra- and interobserver reliability and accuracy were calculated. Results: The reliability improved with successive simplification of the classification. The final version resulted in an overall interobserver agreement of κ = 0.71 with no significant difference between experienced and less experienced raters. Conclusions: The evaluation of the newly proposed classification system resulted in a reliable and routinely applicable system, for which training in its proper use may further improve the reliability. It can be recommended as a useful tool for clinical practice and offers the option for developing treatment recommendations and outcome predictions in the future. PMID:21548939
The, Bertram; Reininga, Inge H F; El Moumni, Mostafa; Eygendaal, Denise
2013-10-01
The modern standard of evaluating treatment results includes the use of rating systems. Elbow-specific rating systems are frequently used in studies aiming at elbow-specific pathology. However, proper validation studies seem to be relatively sparse. In addition, these scoring systems might not always be used for appropriate populations of interest. Both of these issues might give rise to invalid conclusions being reported in the literature. Our aim was to investigate the extent to which the available elbow-specific outcome measurement tools have been validated and the quality of the validation itself. We also aimed to provide characteristics of the populations used for validation of these scales to enable clinicians to use them appropriately. A literature search identified 17 studies of 12 different elbow-specific scoring systems. These were assessed for validity, reliability, and responsiveness characteristics. The quality of these assessments was rated according to the Consensus Based Standards for the Selection of Health Measurement Instruments (COSMIN) checklist criteria, a standardized and validated tool developed specifically for this purpose. Currently, the only elbow-specific rating system that is validated using high-quality methodology is the Oxford Elbow Score, a patient-administered outcome measure tool that has been validated on heterogeneous study populations. Other rating systems still have to be proven in the future to be as good as the Oxford Elbow Score for clinical or research purposes. Additional validation studies are needed. Copyright © 2013 Journal of Shoulder and Elbow Surgery Board of Trustees. Published by Mosby, Inc. All rights reserved.
Rosa-Rizzotto, M; Visonà Dalla Pozza, L; Corlatti, A; Luparia, A; Marchi, A; Molteni, F; Facchin, P; Pagliano, E; Fedrizzi, E
2014-10-01
In hemiplegic children, recognizing the pattern of activity limitation and grading its severity are relevant for clinicians when planning interventions, monitoring results, and predicting outcomes. The aim of the study was to examine the reliability and validity of the Besta Scale, an instrument that measures, in hemiplegic children from 18 months to 12 years of age, both grasp on request (capacity) and spontaneous use of the upper limb (performance) in bimanual play activities and in ADL. A psychometric analysis of the reliability and validity of the Besta scale was performed on an outpatient study sample. Reliability study: a sample of 39 patients was enrolled. The administration of the Besta scale was video-recorded in a standardized manner. All videos were scored by 20 independent raters on subsequent viewing. Three raters randomly selected from the 20-rater group rescored the same video two years later for intra-rater reliability. Intra- and inter-rater reliability were calculated using the intraclass correlation coefficient (ICC) and Kendall's coefficient (K), respectively. Internal consistency was assessed using Cronbach's alpha coefficient. Validity study: a sample of 105 children was assessed 5 times (at t0 and 2, 3, 6 and 12 months later) by 20 independent raters. Each patient was administered and assessed on the QUEST and the Besta scale at the same time. Criterion validity was calculated using Pearson's correlation coefficient. Reliability study: inter-rater reliability calculated with Kendall's coefficient was moderate (K=0.47). Intra-rater (test-retest) reliability for the 3 raters was excellent (ICC=0.927). Cronbach's alpha for internal consistency was 0.972. Validity study: the Besta scale showed good criterion validity compared with the QUEST, increasing with age and severity of impairment. Pearson's correlation coefficient r was 0.81 (P<0.0001). Limitations: in infants, the Besta scale has difficulty distinguishing between mild and moderately impaired hand function. The Besta scale scoring system is a valid and reliable tool that can be used in a clinical setting to monitor the evolution of unimanual and bimanual manipulation and to distinguish the hand's capacity from performance.
Glenn, Jordan M; Galey, Madeline; Edwards, Abigail; Rickert, Bradley; Washington, Tyrone A
2015-07-01
Ability to generate force from the core musculature is a critical factor for sports and general activities with insufficiencies predisposing individuals to injury. This study evaluated isometric force production as a valid and reliable method of assessing abdominal force using the abdominal test and evaluation systems tool (ABTEST). Secondary analysis estimated 1-repetition maximum on commercially available abdominal machine compared to maximum force and average power on ABTEST system. This study utilized test-retest reliability and comparative analysis for validity. Reliability was measured using test-retest design on ABTEST. Validity was measured via comparison to estimated 1-repetition maximum on a commercially available abdominal device. Participants applied isometric, abdominal force against a transducer and muscular activation was evaluated measuring normalized electromyographic activity at the rectus-abdominus, rectus-femoris, and erector-spinae. Test, re-test force production on ABTEST was significantly correlated (r=0.84; p<0.001). Mean electromyographic activity for the rectus-abdominus (72.93% and 75.66%), rectus-femoris (6.59% and 6.51%), and erector-spinae (6.82% and 5.48%) were observed for trial-1 and trial-2, respectively. Significant correlations for the estimated 1-repetition maximum were found for average power (r=0.70, p=0.002) and maximum force (r=0.72, p<0.001). Data indicate the ABTEST can accurately measure rectus-abdominus force isolated from hip-flexor involvement. Negligible activation of erector-spinae substantiates little subjective effort among participants in the lower back. Results suggest ABTEST is a valid and reliable method of evaluating abdominal force. Copyright © 2014 Sports Medicine Australia. Published by Elsevier Ltd. All rights reserved.
2014-01-01
Background: A balance test provides important information such as the standard to judge an individual’s functional recovery or make the prediction of falls. The development of a tool for a balance test that is inexpensive and widely available is needed, especially in clinical settings. The Wii Balance Board (WBB) is designed to test balance, but there is little software used in balance tests, and there are few studies on reliability and validity. Thus, we developed a balance assessment software using the Nintendo Wii Balance Board, investigated its reliability and validity, and compared it with a laboratory-grade force platform. Methods: Twenty healthy adults participated in our study. The participants participated in the test for inter-rater reliability, intra-rater reliability, and concurrent validity. The tests were performed with balance assessment software using the Nintendo Wii balance board and a laboratory-grade force platform. Data such as Center of Pressure (COP) path length and COP velocity were acquired from the assessment systems. The inter-rater reliability, the intra-rater reliability, and concurrent validity were analyzed by an intraclass correlation coefficient (ICC) value and a standard error of measurement (SEM). Results: The inter-rater reliability (ICC: 0.89-0.79, SEM in path length: 7.14-1.90, SEM in velocity: 0.74-0.07), intra-rater reliability (ICC: 0.92-0.70, SEM in path length: 7.59-2.04, SEM in velocity: 0.80-0.07), and concurrent validity (ICC: 0.87-0.73, SEM in path length: 5.94-0.32, SEM in velocity: 0.62-0.08) were high in terms of COP path length and COP velocity. Conclusion: The balance assessment software incorporating the Nintendo Wii balance board was used in our study and was found to be a reliable assessment device. In clinical settings, the device can be remarkably inexpensive, portable, and convenient for the balance assessment. PMID:24912769
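The standard error of measurement reported alongside each ICC above is commonly derived as SEM = SD × √(1 − ICC). The abstract does not state which formula the authors used, so this relation is an assumption, and the function name and sample values below are hypothetical.

```python
from statistics import stdev


def standard_error_of_measurement(scores, icc):
    """SEM = SD * sqrt(1 - ICC), with SD the sample standard deviation of the scores."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("this formula expects an ICC between 0 and 1")
    return stdev(scores) * (1.0 - icc) ** 0.5


# Hypothetical COP path lengths (arbitrary units) and a hypothetical ICC of 0.75.
example_sem = standard_error_of_measurement([10, 12, 14, 16, 18], 0.75)
```

The SEM is expressed in the units of the measurement itself, which is why the abstract lists separate values for path length and velocity.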
Park, Dae-Sung; Lee, GyuChang
2014-06-10
A balance test provides important information such as the standard to judge an individual's functional recovery or make the prediction of falls. The development of a tool for a balance test that is inexpensive and widely available is needed, especially in clinical settings. The Wii Balance Board (WBB) is designed to test balance, but there is little software used in balance tests, and there are few studies on reliability and validity. Thus, we developed a balance assessment software using the Nintendo Wii Balance Board, investigated its reliability and validity, and compared it with a laboratory-grade force platform. Twenty healthy adults participated in our study. The participants participated in the test for inter-rater reliability, intra-rater reliability, and concurrent validity. The tests were performed with balance assessment software using the Nintendo Wii balance board and a laboratory-grade force platform. Data such as Center of Pressure (COP) path length and COP velocity were acquired from the assessment systems. The inter-rater reliability, the intra-rater reliability, and concurrent validity were analyzed by an intraclass correlation coefficient (ICC) value and a standard error of measurement (SEM). The inter-rater reliability (ICC: 0.89-0.79, SEM in path length: 7.14-1.90, SEM in velocity: 0.74-0.07), intra-rater reliability (ICC: 0.92-0.70, SEM in path length: 7.59-2.04, SEM in velocity: 0.80-0.07), and concurrent validity (ICC: 0.87-0.73, SEM in path length: 5.94-0.32, SEM in velocity: 0.62-0.08) were high in terms of COP path length and COP velocity. The balance assessment software incorporating the Nintendo Wii balance board was used in our study and was found to be a reliable assessment device. In clinical settings, the device can be remarkably inexpensive, portable, and convenient for the balance assessment.
Jung, Sung-Hoon; Kwon, Oh-Yun; Jeon, In-Cheol; Hwang, Ui-Jae; Weon, Jong-Hyuck
2018-01-01
The purposes of this study were to determine the intra-rater test-retest reliability of a smart phone-based measurement tool (SBMT) and a three-dimensional (3D) motion analysis system for measuring the transverse rotation angle of the pelvis during single-leg lifting (SLL) and the criterion validity of the transverse rotation angle of the pelvis measurement using SBMT compared with a 3D motion analysis system (3DMAS). Seventeen healthy volunteers performed SLL with their dominant leg without bending the knee until they reached a target placed 20 cm above the table. This study used a 3DMAS, considered the gold standard, to measure the transverse rotation angle of the pelvis to assess the criterion validity of the SBMT measurement. Intra-rater test-retest reliability was determined using the SBMT and 3DMAS using intra-class correlation coefficient (ICC) [3,1] values. The criterion validity of the SBMT was assessed with ICC [3,1] values. Both the 3DMAS (ICC = 0.77) and SBMT (ICC = 0.83) showed excellent intra-rater test-retest reliability in the measurement of the transverse rotation angle of the pelvis during SLL in a supine position. Moreover, the SBMT showed an excellent correlation with the 3DMAS (ICC = 0.99). Measurement of the transverse rotation angle of the pelvis using the SBMT showed excellent reliability and criterion validity compared with the 3DMAS.
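The ICC [3,1] named above is the two-way mixed, consistency, single-measurement form of Shrout and Fleiss: ICC(3,1) = (BMS − EMS) / (BMS + (k − 1)·EMS), where BMS is the between-subjects mean square and EMS the residual mean square. A minimal sketch follows; the function name and the layout of the input are illustrative.

```python
def icc_3_1(scores):
    """ICC(3,1) for a table with one row per subject and one column per session or rater."""
    n = len(scores)          # subjects
    k = len(scores[0])       # sessions or raters
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_rows - ss_cols
    bms = ss_rows / (n - 1)                # between-subjects mean square
    ems = ss_error / ((n - 1) * (k - 1))   # residual mean square
    return (bms - ems) / (bms + (k - 1) * ems)


# Worked example from Shrout and Fleiss (1979): 6 subjects rated by 4 judges.
shrout_fleiss = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
example_icc = icc_3_1(shrout_fleiss)  # about 0.71
```

Because the column (session) effect is removed before the ratio is formed, a constant offset between two sessions does not lower this coefficient; it measures consistency rather than absolute agreement.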
ERIC Educational Resources Information Center
Yao, Shuqiao; Zou, Tao; Zhu, Xiongzhao; Abela, John R. Z.; Auerbach, Randy P.; Tong, Xi
2007-01-01
The objective of the current study was to develop a Chinese translation of the Multidimensional Anxiety Scale for Children (MASC) [March (1997) Multidimensional anxiety scale for children: Technical manual, Multi health systems, Toronto, ON], and to evaluate its reliability and validity. The original version of the MASC was translated into Chinese…
Mohamad Marzuki, Muhamad Fadhil; Yaacob, Nor Azwany; Yaacob, Najib Majdi
2018-05-14
A mobile app is a programmed system designed to be used by a target user on a mobile device. The usability of such a system refers not only to the extent to which a product can be used to achieve the task that it was designed for, but also to its effectiveness and efficiency, as well as user satisfaction. The System Usability Scale is one of the most commonly used questionnaires for assessing the usability of a system. The original 10-item version of the System Usability Scale was developed in English and thus needs to be adapted into local languages to assess the usability of mobile apps developed in other languages. The aim of this study is to translate and validate (with cross-cultural adaptation) the English System Usability Scale questionnaire into Malay, the main language spoken in Malaysia. The development of a translated version will allow the usability of mobile apps to be assessed in Malay. Forward and backward translation of the questionnaire was conducted by groups of Malay native speakers who spoke English as their second language. The final version was obtained after reconciliation and cross-cultural adaptation. The content of the Malay System Usability Scale questionnaire for mobile apps was validated by 10 experts in mobile app development. The efficacy of the questionnaire was further probed by testing the face validity on 10 mobile phone users, followed by reliability testing involving 54 mobile phone users. The content validity index was determined to be 0.91, indicating good relevancy of the 10 items used to assess the usability of a mobile app. Calculation of the face validity index resulted in a value of 0.94, indicating that the questionnaire was easily understood by the users. Reliability testing showed a Cronbach alpha value of .85 (95% CI 0.79-0.91), indicating that the translated System Usability Scale questionnaire is a reliable tool for the assessment of usability of a mobile app. The Malay System Usability Scale questionnaire is a valid and reliable tool to assess the usability of mobile apps in Malaysia. ©Muhamad Fadhil Mohamad Marzuki, Nor Azwany Yaacob, Najib Majdi Yaacob. Originally published in JMIR Human Factors (http://humanfactors.jmir.org), 14.05.2018.
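A content validity index like the 0.91 reported above is often computed as follows: each expert rates each item on a 4-point relevance scale, the item-level CVI is the proportion of experts giving 3 or 4, and the scale-level CVI is the average of the item values. The abstract does not describe the rating scale or the averaging method, so both are assumptions here, as are the function names and ratings.

```python
def item_cvi(ratings, relevant_threshold=3):
    """Proportion of experts who rated the item as relevant (>= threshold)."""
    return sum(1 for r in ratings if r >= relevant_threshold) / len(ratings)


def scale_cvi_average(ratings_by_item):
    """Scale-level CVI as the mean of the item-level CVIs."""
    item_values = [item_cvi(ratings) for ratings in ratings_by_item]
    return sum(item_values) / len(item_values)


# Hypothetical ratings from 10 experts for two questionnaire items.
item_a = [4, 4, 3, 3, 4, 2, 4, 3, 4, 4]   # 9 of 10 experts rate it relevant
item_b = [4, 4, 4, 4, 4, 4, 4, 4, 4, 4]   # all experts rate it relevant
example_scale_cvi = scale_cvi_average([item_a, item_b])
```

A face validity index can be computed the same way, with target users rating clarity instead of experts rating relevance.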
Barmou, Maher M; Hussain, Saba F; Abu Hassan, Mohamed I
2018-06-01
The aim of the study was to assess the reliability and validity of cephalometric variables from MicroScribe-3DXL. Seven cephalometric variables (facial angle, ANB, maxillary depth, U1/FH, FMA, IMPA, FMIA) were measured by a dentist in 60 Malay subjects (30 males and 30 females) with class I occlusion and a balanced face. Two standard images were taken for each subject with conventional cephalometric radiography and MicroScribe-3DXL. All the images were traced and analysed. SPSS version 2.0 was used for statistical analysis, with the significance level set at P<0.05. The results revealed a statistically significant difference in four measurements (U1/FH, FMA, IMPA, FMIA), with P-values ranging from 0.00 to 0.03. The difference in the measurements was considered clinically acceptable. The overall reliability of MicroScribe-3DXL was 92.7% and its validity was 91.8%. The MicroScribe-3DXL is reliable and valid for most of the cephalometric variables, with the advantages of saving time and cost. This is a promising device to assist in diverse areas in dental practice and research. Copyright © 2018. Published by Elsevier Masson SAS.
Gutiérrez-Vilahú, Lourdes; Massó-Ortigosa, Núria; Rey-Abella, Ferran; Costa-Tutusaus, Lluís; Guerra-Balic, Myriam
2016-05-01
People with Down syndrome present skeletal abnormalities in their feet that can be analyzed by commonly used gold standard indices (the Hernández-Corvo index, the Chippaux-Smirak index, the Staheli arch index, and the Clarke angle) based on footprint measurements. The use of Photoshop CS5 software (Adobe Systems Software Ireland Ltd, Dublin, Ireland) to measure footprints has been validated in the general population. The present study aimed to assess the reliability and validity of this footprint assessment technique in the population with Down syndrome. Using optical podography and photography, 44 footprints from 22 patients with Down syndrome (11 men [mean ± SD age, 23.82 ± 3.12 years] and 11 women [mean ± SD age, 24.82 ± 6.81 years]) were recorded in a static bipedal standing position. A blinded observer performed the measurements using a validated manual method three times during the 4-month study, with 2 months between measurements. Test-retest was used to check the reliability of the Photoshop CS5 software measurements. Validity and reliability were obtained by intraclass correlation coefficient (ICC). The reliability test for all of the indices showed very good values for the Photoshop CS5 method (ICC, 0.982-0.995). Validity testing also found no differences between the techniques (ICC, 0.988-0.999). The Photoshop CS5 software method is reliable and valid for the study of footprints in young people with Down syndrome.
Shenker, Bennett S
2014-02-01
To validate a scoring system that evaluates the ability of Internet search engines to correctly predict diagnoses when symptoms are used as search terms. We developed a five point scoring system to evaluate the diagnostic accuracy of Internet search engines. We identified twenty diagnoses common to a primary care setting to validate the scoring system. One investigator entered the symptoms for each diagnosis into three Internet search engines (Google, Bing, and Ask) and saved the first five webpages from each search. Other investigators reviewed the webpages and assigned a diagnostic accuracy score. They rescored a random sample of webpages two weeks later. To validate the five point scoring system, we calculated convergent validity and test-retest reliability using Kendall's W and Spearman's rho, respectively. We used the Kruskal-Wallis test to look for differences in accuracy scores for the three Internet search engines. A total of 600 webpages were reviewed. Kendall's W for the raters was 0.71 (p<0.0001). Spearman's rho for test-retest reliability was 0.72 (p<0.0001). There was no difference in scores based on Internet search engine. We found a significant difference in scores based on the webpage's order on the Internet search engine webpage (p=0.007). Pairwise comparisons revealed higher scores in the first webpages vs. the fourth (corr p=0.009) and fifth (corr p=0.017). However, this significance was lost when creating composite scores. The five point scoring system to assess diagnostic accuracy of Internet search engines is a valid and reliable instrument. The scoring system may be used in future Internet research. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
Reliability and Validity Assessment of a Linear Position Transducer
Garnacho-Castaño, Manuel V.; López-Lastra, Silvia; Maté-Muñoz, José L.
2015-01-01
The objectives of the study were to determine the validity and reliability of peak velocity (PV), average velocity (AV), peak power (PP) and average power (AP) measurements made using a linear position transducer. Validity was assessed by comparing measurements simultaneously obtained using the Tendo Weightlifting Analyzer System and T-Force Dynamic Measurement System (Ergotech, Murcia, Spain) during two resistance exercises, bench press (BP) and full back squat (BS), performed by 71 trained male subjects. For the reliability study, a further 32 men completed both lifts using the Tendo Weightlifting Analyzer System in two identical testing sessions one week apart (session 1 vs. session 2). Intraclass correlation coefficients (ICCs) indicating the validity of the Tendo Weightlifting Analyzer System were high, with values ranging from 0.853 to 0.989. Systematic biases and random errors were low to moderate for almost all variables, being higher in the case of PP (bias ±157.56 W; error ±131.84 W). Proportional biases were identified for almost all variables. Test-retest reliability was strong with ICCs ranging from 0.922 to 0.988. Reliability results also showed minimal systematic biases and random errors, which were only significant for PP (bias -19.19 W; error ±67.57 W). Only PV recorded in the BS showed no significant proportional bias. The Tendo Weightlifting Analyzer System emerged as a reliable system for measuring movement velocity and estimating power in resistance exercises. The low biases and random errors observed here (mainly AV, AP) make this device a useful tool for monitoring resistance training. Key points: This study determined the validity and reliability of peak velocity, average velocity, peak power and average power measurements made using a linear position transducer. The Tendo Weightlifting Analyzer System emerged as a reliable system for measuring movement velocity and power. PMID:25729300
The reliability and validity of the Saliba Postural Classification System
Collins, Cristiana Kahl; Johnson, Vicky Saliba; Godwin, Ellen M.; Pappas, Evangelos
2016-01-01
Objectives To determine the reliability and validity of the Saliba Postural Classification System (SPCS). Methods Two physical therapists classified pictures of 100 volunteer participants standing in their habitual posture for inter- and intra-tester reliability. For validity, 54 participants stood on a force plate in a habitual and a corrected posture, while a vertical force was applied through the shoulders until the clinician felt a postural give. Data were extracted at the time the give was felt and at a time in the corrected posture that matched the peak vertical ground reaction force (VGRF) in the habitual posture. Results Inter-tester reliability demonstrated 75% agreement with a Kappa = 0.64 (95% CI = 0.524–0.756, SE = 0.059). Intra-tester reliability demonstrated 87% agreement with a Kappa = 0.8 (95% CI = 0.702–0.898, SE = 0.05) and 80% agreement with a Kappa = 0.706 (95% CI = 0.594–0.818, SE = 0.057). The examiner applied a significantly higher (p < 0.001) peak vertical force in the corrected posture prior to a postural give when compared to the habitual posture. Within the corrected posture, the %VGRF was higher when the test was ongoing vs. when a postural give was felt (p < 0.001). The %VGRF was not different between the two postures when comparing the peaks (p = 0.214). Discussion The SPCS has substantial agreement for inter- and intra-tester reliability and is largely a valid postural classification system as determined by the larger vertical forces in the corrected postures. Further studies on the correlation between the SPCS and diagnostic classifications are indicated. PMID:27559288
The reliability and validity of the Saliba Postural Classification System.
Collins, Cristiana Kahl; Johnson, Vicky Saliba; Godwin, Ellen M; Pappas, Evangelos
2016-07-01
To determine the reliability and validity of the Saliba Postural Classification System (SPCS). Two physical therapists classified pictures of 100 volunteer participants standing in their habitual posture for inter- and intra-tester reliability. For validity, 54 participants stood on a force plate in a habitual and a corrected posture, while a vertical force was applied through the shoulders until the clinician felt a postural give. Data were extracted at the time the give was felt and at a time in the corrected posture that matched the peak vertical ground reaction force (VGRF) in the habitual posture. Inter-tester reliability demonstrated 75% agreement with a Kappa = 0.64 (95% CI = 0.524-0.756, SE = 0.059). Intra-tester reliability demonstrated 87% agreement with a Kappa = 0.8 (95% CI = 0.702-0.898, SE = 0.05) and 80% agreement with a Kappa = 0.706 (95% CI = 0.594-0.818, SE = 0.057). The examiner applied a significantly higher (p < 0.001) peak vertical force in the corrected posture prior to a postural give when compared to the habitual posture. Within the corrected posture, the %VGRF was higher when the test was ongoing vs. when a postural give was felt (p < 0.001). The %VGRF was not different between the two postures when comparing the peaks (p = 0.214). The SPCS has substantial agreement for inter- and intra-tester reliability and is largely a valid postural classification system as determined by the larger vertical forces in the corrected postures. Further studies on the correlation between the SPCS and diagnostic classifications are indicated.
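The three kappa values in this record are each reported with a standard error and a 95% confidence interval. As a quick consistency check, and assuming the intervals were formed with the usual normal approximation (kappa ± 1.96 × SE, which the abstract does not state explicitly), the reported bounds can be reproduced as follows. The helper name `kappa_ci` is hypothetical.

```python
def kappa_ci(kappa, se, z=1.96):
    """Normal-approximation confidence interval: kappa +/- z * SE."""
    return (round(kappa - z * se, 3), round(kappa + z * se, 3))

# Kappa and SE values as reported in the abstract.
print(kappa_ci(0.64, 0.059))    # (0.524, 0.756)
print(kappa_ci(0.8, 0.05))      # (0.702, 0.898)
print(kappa_ci(0.706, 0.057))   # (0.594, 0.818)
```

All three computed intervals match the reported ones to three decimals.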
Leddy, Abigail L; Crowner, Beth E; Earhart, Gammon M
2011-01-01
Gait impairments, balance impairments, and falls are prevalent in individuals with Parkinson disease (PD). Although the Berg Balance Scale (BBS) can be considered the reference standard for the determination of fall risk, it has a noted ceiling effect. Development of ceiling-free measures that can assess balance and are good at discriminating "fallers" from "nonfallers" is needed. The purpose of this study was to compare the Functional Gait Assessment (FGA) and the Balance Evaluation Systems Test (BESTest) with the BBS among individuals with PD and evaluate the tests' reliability, validity, and discriminatory sensitivity and specificity for fallers versus nonfallers. This was an observational study of community-dwelling individuals with idiopathic PD. The BBS, FGA, and BESTest were administered to 80 individuals with PD. Interrater reliability (n=15) was assessed by 3 raters. Test-retest reliability was based on 2 tests of participants (n=24), 2 weeks apart. Intraclass correlation coefficients (2,1) were used to calculate reliability, and Spearman correlation coefficients were used to assess validity. Cutoff points, sensitivity, and specificity were based on receiver operating characteristic plots. Test-retest reliability was .80 for the BBS, .91 for the FGA, and .88 for the BESTest. Interrater reliability was greater than .93 for all 3 tests. The FGA and BESTest were correlated with the BBS (r=.78 and r=.87, respectively). Cutoff scores to identify fallers were 47/56 for the BBS, 15/30 for the FGA, and 69% for the BESTest. The overall accuracy (area under the curve) for the BBS, FGA, and BESTest was .79, .80, and .85, respectively. Fall reports were retrospective. Both the FGA and the BESTest have reliability and validity for assessing balance in individuals with PD. The BESTest is most sensitive for identifying fallers.
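The reliability figures in this record are intraclass correlation coefficients of form (2,1), i.e. a two-way random-effects model with a single rater and absolute agreement. A minimal pure-Python sketch of that coefficient, built from the two-way ANOVA mean squares, is given below. The function name `icc_2_1` is an assumption, and the ratings table is a small illustrative example of the kind often used in ICC tutorials, not data from the study.

```python
def icc_2_1(ratings):
    """ICC(2,1) for a subjects-by-raters table (list of equal-length rows)."""
    n, k = len(ratings), len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(col) / n for col in zip(*ratings)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between raters
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols                    # residual

    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

# Illustrative table: 6 subjects (rows) rated by 4 raters (columns).
example = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
print(round(icc_2_1(example), 2))
```

Because ICC(2,1) penalizes systematic differences between raters, it is typically lower than a consistency-type ICC computed on the same table.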
Validation of the one pass measure for motivational interviewing competence.
McMaster, Fiona; Resnicow, Ken
2015-04-01
This paper examines the psychometric properties of the OnePass coding system: a new, user-friendly tool for evaluating practitioner competence in motivational interviewing (MI). We provide data on reliability and validity with the current gold-standard: Motivational Interviewing Treatment Integrity tool (MITI). We compared scores from 27 videotaped MI sessions performed by student counselors trained in MI and simulated patients using both OnePass and MITI, with three different raters for each tool. Reliability was estimated using intra-class coefficients (ICCs), and validity was assessed using Pearson's r. OnePass had high levels of inter-rater reliability with 19/23 items found from substantial to almost perfect agreement. Taking the pair of scores with the highest inter-rater reliability on the MITI, the concurrent validity between the two measures ranged from moderate to high. Validity was highest for evocation, autonomy, direction and empathy. OnePass appears to have good inter-rater reliability while capturing similar dimensions of MI as the MITI. Despite the moderate concurrent validity with the MITI, the OnePass shows promise in evaluating both traditional and novel interpretations of MI. OnePass may be a useful tool for developing and improving practitioner competence in MI where access to MITI coders is limited. Copyright © 2015. Published by Elsevier Ireland Ltd.
Gudbergsen, Henrik; Kjærgaard, Morten; Lykkegaard, Kasper Lundberg
2018-01-01
Physical inactivity is important to address, and an objective way of measuring inactivity is by accelerometry. The objective of this study was to determine the reliability and construct validity of the SENS motion system to record physical activity and inactivity in patients with knee osteoarthritis. Participants with an age > 40 years and an average weekly pain above 0 on a numeric rating scale (0 = no pain, 10 = worst pain) were included. Participants had a total of two study visits and at each visit participants completed a standardized activity. Data from 24 participants were analysed. A mean agreement of 99% (SD 3%) for sedentary behaviour and a mean agreement of 97% (SD 9%) for active behaviour were found. The agreement for “walking” was 28% (SD 18%). Mean agreement between recordings on the two visits was 96% (SD 8%) for sedentary behaviour and 99% (SD 1%) for active behaviour. The SENS motion activity measurement system can be regarded as a reliable and valid device for measuring sedentary behaviour in patients with knee OA, whereas detection of walking is not reliable and would require further work. PMID:29686901
Reliability techniques for computer executive programs
NASA Technical Reports Server (NTRS)
1972-01-01
Computer techniques for increasing the stability and reliability of executive and supervisory systems were studied. Program segmentation characteristics are discussed along with a validation system which is designed to retain the natural top down outlook in coding. An analysis of redundancy techniques and roll back procedures is included.
Hoppe, Matthias W; Baumgart, Christian; Polglaze, Ted; Freiwald, Jürgen
2018-01-01
This study aimed to investigate the validity and reliability of global (GPS) and local (LPS) positioning systems for measuring distances covered and sprint mechanical properties in team sports. Here, we evaluated two recently released 18 Hz GPS and 20 Hz LPS technologies together with one established 10 Hz GPS technology. Six male athletes (age: 27±2 years; VO2max: 48.8±4.7 ml/min/kg) performed outdoors on 10 trials of a team sport-specific circuit that was equipped with double-light timing gates. The circuit included various walking, jogging, and sprinting sections that were performed either in straight-lines or with changes of direction. During the circuit, athletes wore two devices of each positioning system. From the reported and filtered velocity data, the distances covered and sprint mechanical properties (i.e., the theoretical maximal horizontal velocity, force, and power output) were computed. The sprint mechanical properties were modeled via an inverse dynamic approach applied to the center of mass. The validity was determined by comparing the measured and criterion data via the typical error of estimate (TEE), whereas the reliability was examined by comparing the two devices of each technology (i.e., the between-device reliability) via the coefficient of variation (CV). Outliers due to measurement errors were statistically identified and excluded from validity and reliability analyses. The 18 Hz GPS showed better validity and reliability for determining the distances covered (TEE: 1.6-8.0%; CV: 1.1-5.1%) and sprint mechanical properties (TEE: 4.5-14.3%; CV: 3.1-7.5%) than the 10 Hz GPS (TEE: 3.0-12.9%; CV: 2.5-13.0% and TEE: 4.1-23.1%; CV: 3.3-20.0%). However, the 20 Hz LPS demonstrated superior validity and reliability overall (TEE: 1.0-6.0%; CV: 0.7-5.0% and TEE: 2.1-9.2%; CV: 1.6-7.3%). For the 10 Hz GPS, 18 Hz GPS, and 20 Hz LPS, the relative loss of data sets due to measurement errors was 10.0%, 20.0%, and 15.8%, respectively. 
This study shows that 18 Hz GPS has enhanced validity and reliability for determining movement patterns in team sports compared to 10 Hz GPS, whereas 20 Hz LPS had superior validity and reliability overall. However, compared to 10 Hz GPS, 18 Hz GPS and 20 Hz LPS technologies had more outliers due to measurement errors, which limits their practical applications at this time.
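Between-device reliability in this record is expressed as a coefficient of variation (CV). The abstract does not say exactly how the CV was derived, so the following is only a sketch of one simple convention: the CV of each pair of simultaneous device readings (sample standard deviation divided by the mean, in percent), averaged over trials. The function names and the toy distances are hypothetical.

```python
from statistics import mean, stdev

def cv_percent(values):
    """Coefficient of variation in percent: sample SD / mean * 100."""
    return stdev(values) / mean(values) * 100

def between_device_cv(pairs):
    """Mean CV across trials, each trial a pair of readings from two devices."""
    return mean([cv_percent(pair) for pair in pairs])

# Toy distances in metres from two devices worn together (not study data).
trials = [(100.0, 102.0), (250.0, 245.0), (80.0, 80.5)]
print(round(between_device_cv(trials), 2))
```

Other conventions exist (for example, a typical error computed from log-transformed differences), so values from this sketch are not directly comparable with the percentages reported in the abstract.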
NASA Astrophysics Data System (ADS)
Launch vehicle propulsion system reliability considerations during the design and verification processes are discussed. The tools available for predicting and minimizing anomalies or failure modes are described and objectives for validating advanced launch system propulsion reliability are listed. Methods for ensuring vehicle/propulsion system interface reliability are examined and improvements in the propulsion system development process are suggested to improve reliability in launch operations. Also, possible approaches to streamline the specification and procurement process are given. It is suggested that government and industry should define reliability program requirements and manage production and operations activities in a manner that provides control over reliability drivers. Also, it is recommended that sufficient funds should be invested in design, development, test, and evaluation processes to ensure that reliability is not inappropriately subordinated to other management considerations.
ERIC Educational Resources Information Center
Clifford, Matthew; Menon, Roshni; Gangi, Tracy; Condon, Christopher; Hornung, Katie
2012-01-01
This policy brief provides principal evaluation system designers information about the technical soundness and cost (i.e., time requirements) of publicly available school climate surveys. The authors focus on the technical soundness of school climate surveys because they believe that using validated and reliable surveys as an outcomes measure can…
A Note on Some Characteristics and Correlates of the Meier Art Test of Aesthetic Perception.
ERIC Educational Resources Information Center
Stallings, William M.; Anderson, Frances E.
The reliability and the predictive and concurrent validity of the MATAP were investigated with the implicit goal of improving the prediction of course grades in the College of Fine and Applied Arts. It was found that reliability and validity coefficients were low, and it was suggested that the scoring system was a source of error variance. (MS)
Classification of mood disorders in DSM-V and DSM-VI.
Joyce, Peter R
2008-10-01
For any diagnostic system to be clinically useful, and go beyond description, it must provide an understanding that informs about aetiology and/or outcome. DSM-III and DSM-IV have provided reliability; the challenge for DSM-V and DSM-VI will be to provide validity. For DSM-V this will not be achieved. Believers in DSM-III and DSM-IV have impeded progress towards a valid classification system, so DSM-V needs to retain continuity with its predecessors to retain reliability and enhance research, but position itself to inform a valid diagnostic system by DSM-VI. This review examines the features of a diagnostic system and summarizes what is really known about mood disorders. The review also questions whether what are called mood disorders are primarily disorders of mood. Finally, it provides suggestions for DSM-VI.
Mouthon, L; Rannou, F; Bérezné, A; Pagnoux, C; Arène, J‐P; Foïs, E; Cabane, J; Guillevin, L; Revel, M; Fermanian, J; Poiraudeau, S
2007-01-01
Objective To develop and assess the reliability and construct validity of a scale assessing disability involving the mouth in systemic sclerosis (SSc). Methods We generated a 34‐item provisional scale from mailed responses of patients (n = 74), expert consensus (n = 10) and literature analysis. A total of 71 other SSc patients were recruited. The test–retest reliability was assessed using the intraclass correlation coefficient and divergent validity using the Spearman correlation coefficient. Factor analysis followed by varimax rotation was performed to assess the factorial structure of the scale. Results The item reduction process retained 12 items with 5 levels of answers (total score range 0–48). The mean total score of the scale was 20.3 (SD 9.7). The test–retest reliability was 0.96. Divergent validity was confirmed for global disability (Health Assessment Questionnaire (HAQ), r = 0.33), hand function (Cochin Hand Function Scale, r = 0.37), inter‐incisor distance (r = −0.34), handicap (McMaster‐Toronto Arthritis questionnaire (MACTAR), r = 0.24), depression (Hospital Anxiety and Depression (HAD); HADd, r = 0.26) and anxiety (HADa, r = 0.17). Factor analysis extracted 3 factors with eigenvalues of 4.26, 1.76 and 1.47, explaining 63% of the variance. These 3 factors could be clinically characterised. The first factor (5 items) represents handicap induced by the reduction in mouth opening, the second (5 items) handicap induced by sicca syndrome and the third (2 items) aesthetic concerns. Conclusion We propose a new scale, the Mouth Handicap in Systemic Sclerosis (MHISS) scale, which has excellent reliability and good construct validity, and assesses specifically disability involving the mouth in patients with SSc. PMID:17502364
Tan, Edwin T.; Martin, Sarah R.; Fortier, Michelle A.; Kain, Zeev N.
2012-01-01
Objective To develop and validate a behavioral coding measure, the Children's Behavior Coding System-PACU (CBCS-P), for children's distress and nondistress behaviors while in the postanesthesia recovery unit. Methods A multidisciplinary team examined videotapes of children in the PACU and developed a coding scheme that subsequently underwent a refinement process (CBCS-P). To examine the reliability and validity of the coding system, 121 children and their parents were videotaped during their stay in the PACU. Participants were healthy children undergoing elective, outpatient surgery and general anesthesia. The CBCS-P was utilized and objective data from medical charts (analgesic consumption and pain scores) were extracted to establish validity. Results Kappa values indicated good-to-excellent (κ's > .65) interrater reliability of the individual codes. The CBCS-P had good criterion validity when compared to children's analgesic consumption and pain scores. Conclusions The CBCS-P is a reliable, observational coding method that captures children's distress and nondistress postoperative behaviors. These findings highlight the importance of considering context in both the development and application of observational coding schemes. PMID:22167123
Patterson, P Daniel; Weaver, Matthew D; Fabio, Anthony; Teasley, Ellen M; Renn, Megan L; Curtis, Brett R; Matthews, Margaret E; Kroemer, Andrew J; Xun, Xiaoshuang; Bizhanova, Zhadyra; Weiss, Patricia M; Sequeira, Denisse J; Coppler, Patrick J; Lang, Eddy S; Higgins, J Stephen
2018-02-15
This study sought to systematically search the literature to identify reliable and valid survey instruments for fatigue measurement in the Emergency Medical Services (EMS) occupational setting. A systematic review study design was used, and six databases, including one website, were searched. The research question guiding the search was developed a priori and registered with the PROSPERO database of systematic reviews: "Are there reliable and valid instruments for measuring fatigue among EMS personnel?" (2016:CRD42016040097). The primary outcome of interest was criterion-related validity. Important outcomes of interest included reliability (e.g., internal consistency), and indicators of sensitivity and specificity. Members of the research team independently screened records from the databases. Full-text articles were evaluated by adapting the Bolster and Rourke system for categorizing findings of systematic reviews, and the data abstracted from the body of literature were rated as favorable, unfavorable, mixed/inconclusive, or no impact. The Grading of Recommendations, Assessment, Development and Evaluation (GRADE) methodology was used to evaluate the quality of evidence. The search strategy yielded 1,257 unique records. Thirty-four unique experimental and non-experimental studies were determined relevant following full-text review. Nineteen studies reported on the reliability and/or validity of ten different fatigue survey instruments. Eighteen different studies evaluated the reliability and/or validity of four different sleepiness survey instruments. None of the retained studies reported sensitivity or specificity. Evidence quality was rated as very low across all outcomes. In this systematic review, limited evidence of the reliability and validity of 14 different survey instruments to assess the fatigue and/or sleepiness status of EMS personnel and related shift worker groups was identified.
Validation of different pediatric triage systems in the emergency department
Aeimchanbanjong, Kanokwan; Pandee, Uthen
2017-01-01
BACKGROUND: Triage in children seems to be more challenging than in adults because of their different response to physiological and psychosocial stressors. This study aimed to determine the best triage system in the pediatric emergency department. METHODS: This was a prospective observational study. This study was divided into two phases. The first phase determined the inter-rater reliability of five triage systems: Manchester Triage System (MTS), Emergency Severity Index (ESI) version 4, Pediatric Canadian Triage and Acuity Scale (CTAS), Australasian Triage Scale (ATS), and Ramathibodi Triage System (RTS) by triage nurses and pediatric residents. In the second phase, to analyze the validity of each triage system, patients were categorized into two groups, i.e., high acuity patients (triage level 1, 2) and low acuity patients (triage level 3, 4, and 5). Then we compared the triage acuity with actual admission. RESULTS: In phase I, RTS illustrated almost perfect inter-rater reliability with kappa of 1.0 (P<0.01). ESI and CTAS illustrated good inter-rater reliability with kappa of 0.8–0.9 (P<0.01). Meanwhile, ATS and MTS illustrated moderate to good inter-rater reliability with kappa of 0.5–0.7 (P<0.01). In phase II, we included 1,041 participants with average age of 4.7±4.2 years, of which 55% were male and 45% were female. In addition, 32% of the participants had underlying diseases, and 123 (11.8%) patients were admitted. We found that ESI illustrated the most appropriate predicting ability for admission with sensitivity of 52%, specificity of 81%, and AUC 0.78 (95%CI 0.74–0.81). CONCLUSION: RTS illustrated almost perfect inter-rater reliability. Meanwhile, ESI and CTAS illustrated good inter-rater reliability. Finally, ESI illustrated the appropriate validity for triage system. PMID:28680520
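Phase II of this record treats triage acuity (high vs. low) as a binary predictor of admission and summarizes it by sensitivity and specificity. As a minimal sketch of how those two quantities follow from a 2x2 table, the function below takes the four cell counts. The function name and the counts are hypothetical toy values and are not taken from the study.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from 2x2 confusion-table counts.

    tp: high acuity and admitted      fn: low acuity but admitted
    tn: low acuity and not admitted   fp: high acuity but not admitted
    """
    sensitivity = tp / (tp + fn)   # share of admitted patients flagged high acuity
    specificity = tn / (tn + fp)   # share of non-admitted patients flagged low acuity
    return sensitivity, specificity

# Toy counts for illustration only.
sens, spec = sensitivity_specificity(tp=40, fn=10, tn=80, fp=20)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

The AUC quoted in the abstract summarizes the same trade-off across all possible cut-offs of the five-level scale rather than the single high/low split used here.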
Lynn, Scott K.; Watkins, Casey M.; Wong, Megan A.; Balfany, Katherine; Feeney, Daniel F.
2018-01-01
The Athos® wearable system integrates surface electromyography (sEMG) electrodes into the construction of compression athletic apparel. The Athos system reduces the complexity and increases the portability of collecting EMG data and provides processed data to the end user. The objective of the study was to determine the reliability and validity of Athos as compared with a research grade sEMG system. Twelve healthy subjects performed 7 trials on separate days (1 baseline trial and 6 repeated trials). In each trial subjects wore the wearable sEMG system and had a research grade sEMG system’s electrodes placed just distal on the same muscle, as close as possible to the wearable system’s electrodes. The muscles tested were the vastus lateralis (VL), vastus medialis (VM), and biceps femoris (BF). All testing was done on an isokinetic dynamometer. Baseline testing involved performing isometric 1 repetition maximum tests for the knee extensors and flexors and three repetitions of concentric-concentric knee flexion and extension at MVC for each testing speed: 60, 180, and 300 deg/sec. Repeated trials 2-7 each comprised 9 sets where each set included three repetitions of concentric-concentric knee flexion-extension. Each repeated trial (2-7) comprised one set at each speed and percent MVC (50%, 75%, 100%) combination. The wearable system and research grade sEMG data were processed using the same methods and aligned in time. The amplitude metrics calculated from the sEMG for each repetition were the peak amplitude, sum of the linear envelope, and 95th percentile. Validity results comprise two main findings. First, there is not a significant effect of system (Athos or research grade system) on the repetition amplitude metrics (95%, peak, or sum). Second, the relationship between torque and sEMG is not significantly different between Athos and the research grade system.
For reliability testing, the variation across trials and averaged across speeds was 0.8%, 7.3%, and 0.2% higher for Athos from BF, VL and VM, respectively. Also, using the standard deviation of the MVC normalized repetition amplitude, the research grade system showed 10.7% variability while Athos showed 12%. The wearable technology (Athos) provides sEMG measures that are consistent with controlled, research grade technologies and data collection procedures. Key points: Surface EMG embedded into athletic garments (Athos) had similar validity and reliability when compared with a research grade system. There was no difference in the torque-EMG relationship between the two systems. No statistically significant difference in reliability across 6 trials between the two systems. The validity and reliability of Athos demonstrates the potential for sEMG to be applied in dynamic rehabilitation and sports settings. PMID:29769821
Rating scales for dystonia in cerebral palsy: reliability and validity.
Monbaliu, E; Ortibus, E; Roelens, F; Desloovere, K; Deklerck, J; Prinzie, P; de Cock, P; Feys, H
2010-06-01
This study investigated the reliability and validity of the Barry-Albright Dystonia Scale (BADS), the Burke-Fahn-Marsden Movement Scale (BFMMS), and the Unified Dystonia Rating Scale (UDRS) in patients with bilateral dystonic cerebral palsy (CP). Three raters independently scored videotapes of 10 patients (five males, five females; mean age 13 y 3 mo, SD 5 y 2 mo, range 5-22 y). One patient each was classified at levels I-IV in the Gross Motor Function Classification System and six patients were classified at level V. Reliability was measured by (1) intraclass correlation coefficient (ICC) for interrater reliability, (2) standard error of measurement (SEM) and smallest detectable difference (SDD), and (3) Cronbach's alpha for internal consistency. Validity was assessed by Pearson's correlations among the three scales used and by content analysis. Moderate to good interrater reliability was found for total scores of the three scales (ICC: BADS=0.87; BFMMS=0.86; UDRS=0.79). However, many subitems showed low reliability, in particular for the UDRS. SEM and SDD were respectively 6.36% and 17.72% for the BADS, 9.88% and 27.39% for the BFMMS, and 8.89% and 24.63% for the UDRS. High internal consistency was found. Pearson's correlations were high. Content validity showed insufficient accordance with the new CP definition and classification. Our results support the internal consistency and concurrent validity of the scales; however, taking into consideration the limitations in reliability, including the large SDD values and the content validity, further research on methods of assessment of dystonia is warranted.
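This record reports a standard error of measurement (SEM) and a smallest detectable difference (SDD) for each scale. The abstract does not state how SDD was derived; a common convention is SDD = 1.96 × √2 × SEM, and the sketch below assumes that convention. Under this assumption the reported SEM values give SDDs within about 0.1 percentage point of the reported ones. The function name `sdd_from_sem` is hypothetical.

```python
import math

def sdd_from_sem(sem, z=1.96):
    """Smallest detectable difference under the assumed rule z * sqrt(2) * SEM."""
    return z * math.sqrt(2) * sem

# SEM values (%) as reported in the abstract; reported SDDs are
# 17.72 (BADS), 27.39 (BFMMS) and 24.63 (UDRS).
for scale, sem in [("BADS", 6.36), ("BFMMS", 9.88), ("UDRS", 8.89)]:
    print(scale, round(sdd_from_sem(sem), 2))
```

The factor √2 reflects that a change score involves measurement error from two occasions, which is why the SDD is nearly three times the SEM.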
The specification-based validation of reliable multicast protocol: Problem Report. M.S. Thesis
NASA Technical Reports Server (NTRS)
Wu, Yunqing
1995-01-01
Reliable Multicast Protocol (RMP) is a communication protocol that provides an atomic, totally ordered, reliable multicast service on top of unreliable IP multicasting. In this report, we develop formal models for RMP using existing automated verification systems, and perform validation on the formal RMP specifications. The validation analysis helps identify some minor specification and design problems. We also use the formal models of RMP to generate a test suite for conformance testing of the implementation. Throughout the process of RMP development, we follow an iterative, interactive approach that emphasizes concurrent and parallel progress of implementation and verification processes. Through this approach, we incorporate formal techniques into our development process, promote a common understanding for the protocol, increase the reliability of our software, and maintain high fidelity between the specifications of RMP and its implementation.
Mills, Sarah D; Kwakkenbos, Linda; Carrier, Marie-Eve; Gholizadeh, Shadi; Fox, Rina S; Jewett, Lisa R; Gottesman, Karen; Roesch, Scott C; Thombs, Brett D; Malcarne, Vanessa L
2018-01-17
Systemic sclerosis (SSc) is an autoimmune disease that can cause disfiguring changes in appearance. This study examined the structural validity, internal consistency reliability, convergent validity, and measurement equivalence of the Social Appearance Anxiety Scale (SAAS) across SSc disease subtypes. Patients enrolled in the Scleroderma Patient-centered Intervention Network Cohort completed the SAAS and measures of appearance-related concerns and psychological distress. Confirmatory factor analysis (CFA) was used to examine the structural validity of the SAAS. Multiple-group CFA was used to determine if SAAS scores can be compared across patients with limited and diffuse disease subtypes. Cronbach's alpha was used to examine internal consistency reliability. Correlations of SAAS scores with measures of body image dissatisfaction, fear of negative evaluation, social anxiety, and depression were used to examine convergent validity. SAAS scores were hypothesized to be positively associated with all convergent validity measures, with correlations significant and moderate to large in size. A total of 938 patients with SSc were included. CFA supported a one-factor structure (CFI: .92; SRMR: .04; RMSEA: .08), and multiple-group CFA indicated that the scalar invariance model best fit the data. Internal consistency reliability was good in the total sample (α = .96) and in disease subgroups. Overall, evidence of convergent validity was found with measures of body image dissatisfaction, fear of negative evaluation, social anxiety, and depression. The SAAS can be reliably and validly used to assess fear of appearance evaluation in patients with SSc, and SAAS scores can be meaningfully compared across disease subtypes. This article is protected by copyright. All rights reserved.
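Cronbach's alpha, used above for internal consistency, can be sketched in a few lines of Python. This is the textbook formula using sample variances; the data layout (one list of item scores per respondent) and the function name are assumptions for illustration, not the authors' code.

```python
from statistics import variance

def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(scores[0])  # number of items
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of the respondents' total scores.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

if __name__ == "__main__":
    # Hypothetical responses from four people to three items.
    demo = [[3, 4, 3], [5, 5, 4], [1, 2, 2], [4, 4, 5]]
    print(round(cronbach_alpha(demo), 3))
```

Alpha approaches 1 when items rise and fall together, because the variance of the totals then dominates the sum of the item variances.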
Boerebach, Benjamin C M; Arah, Onyebuchi A; Busch, Olivier R C; Lombarts, Kiki M J M H
2012-01-01
In surgical education, there is a need for educational performance evaluation tools that yield reliable and valid data. This paper describes the development and validation of robust evaluation tools that provide surgeons with insight into their clinical teaching performance. We investigated (1) the reliability and validity of 2 tools for evaluating the teaching performance of attending surgeons in residency training programs, and (2) whether surgeons' self evaluation correlated with the residents' evaluation of those surgeons. We surveyed 343 surgeons and 320 residents as part of a multicenter prospective cohort study of faculty teaching performance in residency training programs. The reliability and validity of the SETQ (System for Evaluation Teaching Qualities) tools were studied using standard psychometric techniques. We then estimated the correlations between residents' and surgeons' evaluations. The response rate was 87% among surgeons and 84% among residents, yielding 2625 residents' evaluations and 302 self evaluations. The SETQ tools yielded reliable and valid data on 5 domains of surgical teaching performance, namely, learning climate, professional attitude towards residents, communication of goals, evaluation of residents, and feedback. The correlations between surgeons' self and residents' evaluations were low, with coefficients ranging from 0.03 for evaluation of residents to 0.18 for communication of goals. The SETQ tools for the evaluation of surgeons' teaching performance appear to yield reliable and valid data. The lack of strong correlations between surgeons' self and residents' evaluations suggest the need for using external feedback sources in informed self evaluation of surgeons. Copyright © 2012 Association of Program Directors in Surgery. Published by Elsevier Inc. All rights reserved.
Validity and Reliability of Chemistry Systemic Multiple Choices Questions (CSMCQs)
ERIC Educational Resources Information Center
Priyambodo, Erfan; Marfuatun
2016-01-01
Nowadays, Rasch model analysis is widely used in social research, and especially in educational research. In this research, the Rasch model is used to determine the validity and reliability of systemic multiple-choice questions in chemistry teaching and learning. There were 30 multiple-choice questions with a systemic approach for high school student…
Chang, Wen-Dien; Chang, Wan-Yi; Lee, Chia-Lun; Feng, Chi-Yen
2013-10-01
[Purpose] Balance is an integral part of human ability. The smart balance master system (SBM) is a balance test instrument with good reliability and validity, but it is expensive. Therefore, we modified a Wii Fit balance board, which is a convenient balance assessment tool, and analyzed its reliability and validity. [Subjects and Methods] We recruited 20 healthy young adults and 20 elderly people, and administered 3 balance tests. The correlation coefficient and intraclass correlation of both instruments were analyzed. [Results] There were no statistically significant differences in the 3 tests between the Wii Fit balance board and the SBM. The Wii Fit balance board had a good intraclass correlation (0.86-0.99) for the elderly people and positive correlations (r = 0.58-0.86) with the SBM. [Conclusions] The Wii Fit balance board is a balance assessment tool with good reliability and high validity for elderly people, and we recommend it as an alternative tool for assessing balance ability.
Memtsa, Pinelopi Theopisti; Tolia, Maria; Tzitzikas, Ioannis; Bizakis, Ioannis; Pistevou-Gombaki, Kyriaki; Charalambidou, Martha; Iliopoulou, Chrysoula; Kyrgias, George
2017-03-01
Xerostomia after radiation therapy for head and neck (H&N) cancer has serious effects on patients' quality of life. The purpose of this study was to validate the Greek version of the self-reported eight-item xerostomia questionnaire (XQ) in patients treated with radiotherapy for H&N cancer. The XQ was translated into Greek and administered to 100 XQ patients. An exploratory factor analysis was performed. Reliability measures were calculated. Several types of validity were evaluated. The observer-rated scoring system was also used. The mean XQ value was 41.92 (SD 22.71). Factor analysis revealed the unidimensional nature of the questionnaire. High reliability measures (ICC, Cronbach's α, Pearson coefficients) were obtained. Patients differed statistically significantly in terms of XQ score, depending on the RTOG/EORTC classification. The Greek version of XQ is valid and reliable. Its score is well related to observer's findings and it can be used to evaluate the impact of radiation therapy on the subjective feeling of xerostomia.
2011-09-01
a quality evaluation with limited data, a model-based assessment must be...that affect system performance, a multistage approach to system validation, a modeling and experimental methodology for efficiently addressing a wide range
Digital avionics systems - Principles and practices (2nd revised and enlarged edition)
NASA Technical Reports Server (NTRS)
Spitzer, Cary R.
1993-01-01
The state of the art in digital avionics systems is surveyed. The general topics addressed include: establishing avionics system requirements; avionics systems essentials in data bases, crew interfaces, and power; fault tolerance, maintainability, and reliability; architectures; packaging and fitting the system into the aircraft; hardware assessment and validation; software design, assessment, and validation; determining the costs of avionics.
Ahluwalia, Indu B; Helms, Kristen; Morrow, Brian
2013-01-01
We investigated the reliability and validity of three self-reported indicators from the Pregnancy Risk Assessment Monitoring System (PRAMS) survey. We used 2008 PRAMS (n=15,646) data from 12 states that had implemented the 2003 revised U.S. Certificate of Live Birth. We estimated reliability by kappa coefficient and validity by sensitivity and specificity using the birth certificate data as the reference for the following: prenatal participation in the Special Supplemental Nutrition Program for Women, Infants, and Children (WIC); Medicaid payment for delivery; and breastfeeding initiation. These indicators were examined across several demographic subgroups. The reliability was high for all three measures: 0.81 for WIC participation, 0.67 for Medicaid payment of delivery, and 0.72 for breastfeeding initiation. The validity of PRAMS indicators was also high: WIC participation (sensitivity = 90.8%, specificity = 90.6%), Medicaid payment for delivery (sensitivity = 82.4%, specificity = 85.6%), and breastfeeding initiation (sensitivity = 94.3%, specificity = 76.0%). The prevalence estimates were higher on PRAMS than the birth certificate for each of the indicators except Medicaid-paid delivery among non-Hispanic black women. Kappa values within most subgroups remained in the moderate range (0.40-0.80). Sensitivity and specificity values were lower for Hispanic women who responded to the PRAMS survey in Spanish and for breastfeeding initiation among women who delivered very low birthweight and very preterm infants. The validity and reliability of the PRAMS data for measures assessed were high. Our findings support the use of PRAMS data for epidemiological surveillance, research, and planning.
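The three statistics reported above can all be derived from one 2×2 table that cross-classifies the survey response against the reference source. The Python sketch below assumes the birth certificate is the reference, as the abstract states; the function name and the counts in the demo are hypothetical.

```python
def agreement_stats(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Sensitivity, specificity and Cohen's kappa for a 2x2 table.

    tp: survey yes / reference yes    fp: survey yes / reference no
    fn: survey no  / reference yes    tn: survey no  / reference no
    """
    n = tp + fp + fn + tn
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    observed = (tp + tn) / n
    # Agreement expected by chance, from the marginal totals of both sources.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2
    kappa = (observed - expected) / (1 - expected)
    return {"sensitivity": sensitivity, "specificity": specificity, "kappa": kappa}

if __name__ == "__main__":
    # Hypothetical counts, not taken from the PRAMS data.
    print(agreement_stats(tp=40, fp=10, fn=10, tn=40))
```

Kappa treats the two sources symmetrically, whereas sensitivity and specificity depend on which source is taken as the reference.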
A Compact Forearm Crutch Based on Force Sensors for Aided Gait: Reliability and Validity.
Chamorro-Moriana, Gema; Sevillano, José Luis; Ridao-Fernández, Carmen
2016-06-21
Frequently, patients who suffer injuries in some lower member require forearm crutches in order to partially unload weight-bearing. These lesions cause pain in lower limb unloading and their progression should be controlled objectively to avoid significant errors in accuracy and, consequently, complications and after effects in lesions. The design of a new and feasible tool that allows us to control and improve the accuracy of loads exerted on crutches during aided gait is necessary, so as to unburden the lower limbs. In this paper, we describe such a system based on a force sensor, which we have named the GCH System 2.0. Furthermore, we determine the validity and reliability of measurements obtained using this tool via a comparison with the validated AMTI (Advanced Mechanical Technology, Inc., Watertown, MA, USA) OR6-7-2000 Platform. An intra-class correlation coefficient demonstrated excellent agreement between the AMTI Platform and the GCH System. A regression line to determine the predictive ability of the GCH system towards the AMTI Platform was found, which obtained a precision of 99.3%. A detailed statistical analysis is presented for all the measurements and also segregated for several requested loads on the crutches (10%, 25% and 50% of body weight). Our results show that our system, designed for assessing loads exerted by patients on forearm crutches during assisted gait, provides valid and reliable measurements of loads.
A Compact Forearm Crutch Based on Force Sensors for Aided Gait: Reliability and Validity
Chamorro-Moriana, Gema; Sevillano, José Luis; Ridao-Fernández, Carmen
2016-01-01
Frequently, patients who suffer injuries in some lower member require forearm crutches in order to partially unload weight-bearing. These lesions cause pain in lower limb unloading and their progression should be controlled objectively to avoid significant errors in accuracy and, consequently, complications and after effects in lesions. The design of a new and feasible tool that allows us to control and improve the accuracy of loads exerted on crutches during aided gait is necessary, so as to unburden the lower limbs. In this paper, we describe such a system based on a force sensor, which we have named the GCH System 2.0. Furthermore, we determine the validity and reliability of measurements obtained using this tool via a comparison with the validated AMTI (Advanced Mechanical Technology, Inc., Watertown, MA, USA) OR6-7-2000 Platform. An intra-class correlation coefficient demonstrated excellent agreement between the AMTI Platform and the GCH System. A regression line to determine the predictive ability of the GCH system towards the AMTI Platform was found, which obtained a precision of 99.3%. A detailed statistical analysis is presented for all the measurements and also segregated for several requested loads on the crutches (10%, 25% and 50% of body weight). Our results show that our system, designed for assessing loads exerted by patients on forearm crutches during assisted gait, provides valid and reliable measurements of loads. PMID:27338396
Documentation of pharmaceutical care: Validation of an intervention oriented classification system.
Maes, Karen A; Studer, Helene; Berger, Jérôme; Hersberger, Kurt E; Lampert, Markus L
2017-12-01
During the dispensing process, pharmacists may come across technical and clinical issues requiring a pharmaceutical intervention (PI). An intervention-oriented classification system is a helpful tool to document these PIs in a structured manner. Therefore, we developed the PharmDISC classification system (Pharmacists' Documentation of Interventions in Seamless Care). The aim of this study was to evaluate the PharmDISC system in the daily practice environment (in terms of interrater reliability, appropriateness, interpretability, acceptability, feasibility, and validity); to assess its user satisfaction, the descriptive manual, and the online training; and to explore first implementation aspects. Twenty-one pharmacists from different community pharmacies each classified 30 prescriptions requiring a PI with the PharmDISC system on 5 selected days within 5 weeks. Interrater reliability was determined using model PIs and Fleiss's kappa coefficients (κ) were calculated. User satisfaction was assessed by questionnaire with a 4-point Likert scale. The main outcome measures were interrater reliability (κ); appropriateness, interpretability, validity (ratio of completely classified PIs/all PIs); feasibility, and acceptability (user satisfaction and suggestions). The PharmDISC system reached an average substantial agreement (κ = 0.66). Of documented 519 PIs, 430 (82.9%) were completely classified. Most users found the system comprehensive (median user agreement 3 [2/3.25 quartiles]) and practical (3[2.75/3]). The PharmDISC system raised the awareness regarding drug-related problems for most users (n = 16). To facilitate its implementation, an electronic version that automatically connects to the prescription together with a task manager for PIs needing follow-up was suggested. Barriers could be time expenditure and lack of understanding the benefits. 
Substantial interrater reliability and acceptable user satisfaction indicate that the PharmDISC system is a valid system to document PIs in daily community pharmacy practice. © 2017 John Wiley & Sons, Ltd.
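Fleiss's kappa, the agreement measure used above for multiple raters, can be sketched as follows. The input layout (one row per classified case, one column per category, each cell holding the number of raters who chose that category) is an assumption for illustration; the abstract does not describe how the authors arranged their data.

```python
def fleiss_kappa(counts):
    """Fleiss's kappa for a cases x categories table of rater counts.

    Every row must sum to the same number of raters.
    """
    n_cases = len(counts)
    n_raters = sum(counts[0])
    # Per-case agreement: proportion of rater pairs that agree.
    per_case = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(per_case) / n_cases
    # Chance agreement from the overall share of each category.
    n_categories = len(counts[0])
    shares = [
        sum(row[j] for row in counts) / (n_cases * n_raters)
        for j in range(n_categories)
    ]
    p_expected = sum(p * p for p in shares)
    return (p_bar - p_expected) / (1 - p_expected)

if __name__ == "__main__":
    # Hypothetical example: 3 cases, 2 categories, 4 raters per case.
    print(round(fleiss_kappa([[4, 0], [1, 3], [0, 4]]), 3))
```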
Judah, Gaby; de Witt Huberts, Jessie; Drassal, Allan; Aunger, Robert
2017-01-01
The accurate measurement of behaviour is vitally important to many disciplines and practitioners of various kinds. While different methods have been used (such as observation, diaries, questionnaire), none are able to accurately monitor behaviour over the long term in the natural context of people's own lives. The aim of this work was therefore to develop and test a reliable system for unobtrusively monitoring various behaviours of multiple individuals within the same household over a period of several months. A commercial Real Time Location System was adapted to meet these requirements and subsequently validated in three households by monitoring various bathroom behaviours. The results indicate that the system is robust, can monitor behaviours over the long-term in different households and can reliably distinguish between individuals. Precision rates were high and consistent. Recall rates were less consistent across households and behaviours, although recall rates improved considerably with practice at set-up of the system. The achieved precision and recall rates were comparable to the rates observed in more controlled environments using more valid methods of ground truthing. These initial findings indicate that the system is a valuable, flexible and robust system for monitoring behaviour in its natural environment that would allow new research questions to be addressed.
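Precision and recall, as reported above, compare the events a system detects with the events that actually occurred. The Python sketch below assumes events can be matched by an identifier; the abstract does not describe the matching procedure, so the identifiers and the function name are hypothetical.

```python
def precision_recall(detected: set, actual: set) -> tuple:
    """Precision and recall of detected events against ground-truth events."""
    true_pos = len(detected & actual)
    # Precision: share of detections that were real events.
    precision = true_pos / len(detected) if detected else 0.0
    # Recall: share of real events that were detected.
    recall = true_pos / len(actual) if actual else 0.0
    return precision, recall

if __name__ == "__main__":
    # Hypothetical event identifiers.
    print(precision_recall({1, 2, 3, 4}, {2, 3, 4, 5, 6}))
```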
Validating Neuro-QoL short forms and targeted scales with people who have multiple sclerosis.
Miller, Deborah M; Bethoux, Francois; Victorson, David; Nowinski, Cindy J; Buono, Sarah; Lai, Jin-Shei; Wortman, Katy; Burns, James L; Moy, Claudia; Cella, David
2016-05-01
Multiple sclerosis (MS) is a chronic, progressive, and disabling disease of the central nervous system with dramatic variations in the combination and severity of symptoms it can produce. The lack of reliable disease-specific health-related quality of life (HRQL) measures for use in clinical trials prompted the development of the Neurology Quality of Life (Neuro-QOL) instrument, which includes 13 scales that assess physical, emotional, cognitive, and social domains, for use in a variety of neurological illnesses. The objective of this research paper is to conduct an initial assessment of the reliability and validation of the Neuro-QOL short forms (SFs) in MS. We assessed reliability, concurrent validity, known groups validity, and responsiveness between cross-sectional and longitudinal data in 161 recruited MS patients. Internal consistency was high for all measures (α = 0.81-0.95) and ICCs were within the acceptable range (0.76-0.91); concurrent and known groups validity were highest with the Global HRQL question. Longitudinal assessment was limited by the lack of disease progression in the group. The Neuro-QOL SFs demonstrate good internal consistency, test-re-test reliability, and concurrent and known groups validity in this MS population, supporting the validity of Neuro-QOL in adults with MS. © The Author(s), 2015.
Prowse, Ashleigh; Aslaksen, Berit; Kierkegaard, Marie; Furness, James; Gerdhem, Paul; Abbott, Allan
2017-01-18
To investigate the reliability and concurrent validity of the Baseline® Body Level/Scoliosis meter for adolescent idiopathic scoliosis postural assessment in three anatomical planes. This is an observational reliability and concurrent validity study of adolescent referrals to the Orthopaedic department for scoliosis screening at Karolinska University Hospital, Stockholm, Sweden between March-May 2012. A total of 31 adolescents with idiopathic scoliosis (13.6 ± 0.6 years old) of mild-moderate curvatures (25° ± 12°) were consecutively recruited. Measurement of cervical, thoracic and lumbar curvatures, pelvic and shoulder tilt, and axial thoracic rotation (ATR) were performed by two trained physiotherapists in one day. The intraclass correlation coefficient (ICC) was used to determine the inter-examiner reliability (ICC2,1) and the intra-rater reliability (ICC3,3) of the Baseline® Body Level/Scoliosis meter. Spearman's correlation analyses were used to estimate concurrent validity between the Baseline® Body Level/Scoliosis meter and Gold Standard Cobb angles from radiographs and the Orthopaedic Systems Inc. Scoliometer. There was excellent reliability between examiners for thoracic kyphosis (ICC2,1 = 0.94), ATR (ICC2,1 = 0.92) and lumbar lordosis (ICC2,1 = 0.79). There was adequate reliability between examiners for cervical lordosis (ICC2,1 = 0.51), but poor reliability for pelvic and shoulder tilt. Both devices were reproducible in the measurement of ATR when repeated by one examiner (ICC3,3 0.98-1.00). The device had a good correlation with the Scoliometer (rho = 0.78). When compared with Cobb angle from radiographs, there was a moderate correlation for ATR (rho = 0.627). The Baseline® Body Level/Scoliosis meter provides reliable transverse and sagittal cervical, thoracic and lumbar measurements and valid transverse plane measurements of mild-moderate scoliosis deformity.
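The inter-examiner coefficient written ICC2,1 above is conventionally the two-way random-effects, absolute-agreement, single-rater form. The Python sketch below computes that form from a subjects × raters table using the usual ANOVA mean squares. It assumes the conventional definition applies here, and the function name is hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1) for a table with one row per subject and one column per rater."""
    n = len(ratings)      # subjects
    k = len(ratings[0])   # raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    # Two-way ANOVA sums of squares without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )

if __name__ == "__main__":
    # Hypothetical angles (degrees) measured by two examiners on five subjects.
    print(round(icc_2_1([[10, 11], [14, 13], [8, 9], [12, 12], [15, 16]]), 3))
```

Because the rater mean square enters the denominator, a systematic offset between examiners lowers this coefficient even when their rankings agree.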
Environmental education curriculum evaluation questionnaire: A reliability and validity study
NASA Astrophysics Data System (ADS)
Minner, Daphne Diane
The intention of this research project was to bridge the gap between social science research and application to the environmental domain through the development of a theoretically derived instrument designed to give educators a template by which to evaluate environmental education curricula. The theoretical base for instrument development was provided by several developmental theories such as Piaget's theory of cognitive development, Developmental Systems Theory, Life-span Perspective, as well as curriculum research within the area of environmental education. This theoretical base fueled the generation of a list of components which were then translated into a questionnaire with specific questions relevant to the environmental education domain. The specific research question for this project is: Can a valid assessment instrument based largely on human development and education theory be developed that reliably discriminates high, moderate, and low quality in environmental education curricula? The types of analyses conducted to answer this question were interrater reliability (percent agreement, Cohen's Kappa coefficient, Pearson's Product-Moment correlation coefficient), test-retest reliability (percent agreement, correlation), and criterion-related validity (correlation). Face validity and content validity were also assessed through thorough reviews. Overall results indicate that 29% of the questions on the questionnaire demonstrated a high level of interrater reliability and 43% of the questions demonstrated a moderate level of interrater reliability. Seventy-one percent of the questions demonstrated a high test-retest reliability and 5% a moderate level. Fifty-five percent of the questions on the questionnaire were reliable (high or moderate) both across time and raters. Only eight questions (8%) did not show either interrater or test-retest reliability. 
The global overall rating of high, medium, or low quality was reliable across both coders and time, indicating that the questionnaire can discriminate differences in quality of environmental education curricula. Of the 35 curricula evaluated, 6 were high quality, 14 were medium quality and 15 were low quality. The criterion-related validity of the instrument cannot be established at present owing to the lack of comparable measures or a concretely usable set of multidisciplinary standards. Face and content validity were sufficiently demonstrated.
Reliability and validity of an accelerometric system for assessing vertical jumping performance.
Choukou, M-A; Laffaye, G; Taiar, R
2014-03-01
The validity of an accelerometric system (Myotest©) for assessing vertical jump height, vertical force and power, leg stiffness and reactivity index was examined. 20 healthy males performed 3×"5 hops in place", 3×"1 squat jump" and 3× "1 countermovement jump" during 2 test-retest sessions. The variables were simultaneously assessed using an accelerometer and a force platform at a frequency of 0.5 and 1 kHz, respectively. Both reliability and validity of the accelerometric system were studied. No significant differences between test and retest data were found (p < 0.05), showing a high level of reliability. Besides, moderate to high intraclass correlation coefficients (ICCs) (from 0.74 to 0.96) were obtained for all variables whereas weak to moderate ICCs (from 0.29 to 0.79) were obtained for force and power during the countermovement jump. With regards to validity, the difference between the two devices was not significant for 5 hops in place height (1.8 cm), force during squat (-1.4 N · kg(-1)) and countermovement (0.1 N · kg(-1)) jumps, leg stiffness (7.8 kN · m(-1)) and reactivity index (0.4). So, the measurements of these variables with this accelerometer are valid, which is not the case for the other variables. The main causes of non-validity for velocity, power and contact time assessment are temporal biases of the takeoff and touchdown moments detection.
RELIABILITY AND VALIDITY OF AN ACCELEROMETRIC SYSTEM FOR ASSESSING VERTICAL JUMPING PERFORMANCE
Laffaye, G.; Taiar, R.
2014-01-01
The validity of an accelerometric system (Myotest©) for assessing vertical jump height, vertical force and power, leg stiffness and reactivity index was examined. 20 healthy males performed 3ד5 hops in place”, 3ד1 squat jump” and 3× “1 countermovement jump” during 2 test-retest sessions. The variables were simultaneously assessed using an accelerometer and a force platform at a frequency of 0.5 and 1 kHz, respectively. Both reliability and validity of the accelerometric system were studied. No significant differences between test and retest data were found (p < 0.05), showing a high level of reliability. Besides, moderate to high intraclass correlation coefficients (ICCs) (from 0.74 to 0.96) were obtained for all variables whereas weak to moderate ICCs (from 0.29 to 0.79) were obtained for force and power during the countermovement jump. With regards to validity, the difference between the two devices was not significant for 5 hops in place height (1.8 cm), force during squat (-1.4 N · kg−1) and countermovement (0.1 N · kg−1) jumps, leg stiffness (7.8 kN · m−1) and reactivity index (0.4). So, the measurements of these variables with this accelerometer are valid, which is not the case for the other variables. The main causes of non-validity for velocity, power and contact time assessment are temporal biases of the takeoff and touchdown moments detection. PMID:24917690
Health Service Quality Scale: Brazilian Portuguese translation, reliability and validity.
Rocha, Luiz Roberto Martins; Veiga, Daniela Francescato; e Oliveira, Paulo Rocha; Song, Elaine Horibe; Ferreira, Lydia Masako
2013-01-17
The Health Service Quality Scale is a multidimensional hierarchical scale that is based on an interdisciplinary approach. This instrument was specifically created for measuring health service quality based on marketing and health care concepts. The aim of this study was to translate and culturally adapt the Health Service Quality Scale into Brazilian Portuguese and to assess the validity and reliability of the Brazilian Portuguese version of the instrument. We conducted a cross-sectional, observational study, with public health system patients in a Brazilian university hospital. Validity was assessed using Pearson's correlation coefficient to measure the strength of the association between the Brazilian Portuguese version of the instrument and the SERVQUAL scale. Internal consistency was evaluated using Cronbach's alpha coefficient; the intraclass (ICC) and Pearson's correlation coefficients were used for test-retest reliability. One hundred and sixteen consecutive postoperative patients completed the questionnaire. Pearson's correlation coefficient for validity was 0.20. Cronbach's alpha for the first and second administrations of the final version of the instrument were 0.982 and 0.986, respectively. For test-retest reliability, Pearson's correlation coefficient was 0.89 and ICC was 0.90. The culturally adapted, Brazilian Portuguese version of the Health Service Quality Scale is a valid and reliable instrument to measure health service quality.
Harrop, James S; Vaccaro, Alexander R; Hurlbert, R John; Wilsey, Jared T; Baron, Eli M; Shaffrey, Christopher I; Fisher, Charles G; Dvorak, Marcel F; Oner, F C; Wood, Kirkham B; Anand, Neel; Anderson, D Greg; Lim, Moe R; Lee, Joon Y; Bono, Christopher M; Arnold, Paul M; Rampersaud, Y Raja; Fehlings, Michael G
2006-02-01
A new classification and treatment algorithm for thoracolumbar injuries was recently introduced by Vaccaro and colleagues in 2005. A thoracolumbar injury severity scale (TLISS) was proposed for grading and guiding treatment for these injuries. The scale is based on the following: 1) the mechanism of injury; 2) the integrity of the posterior ligamentous complex (PLC); and 3) the patient's neurological status. The reliability and validity of assessing injury mechanism and the integrity of the PLC were assessed. Forty-eight spine surgeons, consisting of neurosurgeons and orthopedic surgeons, reviewed 56 clinical thoracolumbar injury case histories. Each was classified and scored to determine treatment recommendations according to a novel classification system. After 3 months the case histories were reordered and the physicians repeated the exercise. Validity of this classification was good among reviewers; the vast majority (> 90%) agreed with the system's treatment recommendations. Surgeons were unclear as to a cogent description of PLC disruption and fracture mechanism. The TLISS demonstrated acceptable reliability in terms of intra- and interobserver agreement on the algorithm's treatment recommendations. Replacing injury mechanism with a description of injury morphology and better definition of PLC injury will improve inter- and intraobserver reliability of this injury classification system.
Kageyama, M; Nakamura, Y; Kobayashi, S; Yokoyama, K
2016-10-01
WHAT IS KNOWN ON THE SUBJECT?: Empowerment of family caregivers of adults with mental health issues has received increasing attention among mental health nurses in Japan and has been recognized as a new goal of family interventions. The Family Empowerment Scale (FES) was originally developed to measure the empowerment status of parents of children with emotional disorders. However, it was later applied to broader health issues. WHAT THIS PAPER ADDS TO EXISTING KNOWLEDGE?: We developed a Japanese version of the FES for family caregivers of adults with mental health issues (FES-AMJ) and examined the validity and reliability among parents. Results showed that the FES-AMJ had acceptable concurrent validity and reliability; however, insufficient construct validity was found, especially for the subscale regarding the service system. WHAT ARE THE IMPLICATIONS FOR PRACTICE?: Further studies need to modify the scale. Clarification of ideal family empowerment status in the service system through discussion with mental health nurses and family caregivers may be important. Introduction The Family Empowerment Scale (FES) was originally developed for parents of children with emotional disorders. In Japan, family empowerment is gaining increasing attention and may be one goal of nursing interventions. Aim To develop a Japanese version of the FES for family caregivers of adults with mental health issues and to study the validity and reliability of this scale among parents. Method We translated the FES into Japanese and administered this self-report questionnaire to 275 parents. Results The multitrait scaling analysis revealed acceptable convergent validity and insufficient discriminant validity among all subscales. In particular, all items of the Service system subscale had insufficient discriminant and/or convergent validity. Each subscale significantly correlated with the indicator of empowerment. The intraclass correlation coefficients of each subscale were .855-.917. 
Cronbach's alpha of each factor ranged from .867 to .895. Discussion The Service system subscale may not linearly reflect family empowerment, and instead may depend on unclear roles of family caregivers of adults, disorder severity or insufficient services. Implications for practice Further studies need to modify the scale. Clarification of ideal family empowerment status in the service system through discussion with mental health nurses and family caregivers may be important. © 2016 John Wiley & Sons Ltd.
Baumgart, Christian; Polglaze, Ted; Freiwald, Jürgen
2018-01-01
This study aimed to investigate the validity and reliability of global (GPS) and local (LPS) positioning systems for measuring distances covered and sprint mechanical properties in team sports. Here, we evaluated two recently released 18 Hz GPS and 20 Hz LPS technologies together with one established 10 Hz GPS technology. Six male athletes (age: 27±2 years; VO2max: 48.8±4.7 ml/min/kg) performed outdoors on 10 trials of a team sport-specific circuit that was equipped with double-light timing gates. The circuit included various walking, jogging, and sprinting sections that were performed either in straight-lines or with changes of direction. During the circuit, athletes wore two devices of each positioning system. From the reported and filtered velocity data, the distances covered and sprint mechanical properties (i.e., the theoretical maximal horizontal velocity, force, and power output) were computed. The sprint mechanical properties were modeled via an inverse dynamic approach applied to the center of mass. The validity was determined by comparing the measured and criterion data via the typical error of estimate (TEE), whereas the reliability was examined by comparing the two devices of each technology (i.e., the between-device reliability) via the coefficient of variation (CV). Outliers due to measurement errors were statistically identified and excluded from validity and reliability analyses. The 18 Hz GPS showed better validity and reliability for determining the distances covered (TEE: 1.6–8.0%; CV: 1.1–5.1%) and sprint mechanical properties (TEE: 4.5–14.3%; CV: 3.1–7.5%) than the 10 Hz GPS (TEE: 3.0–12.9%; CV: 2.5–13.0% and TEE: 4.1–23.1%; CV: 3.3–20.0%). However, the 20 Hz LPS demonstrated superior validity and reliability overall (TEE: 1.0–6.0%; CV: 0.7–5.0% and TEE: 2.1–9.2%; CV: 1.6–7.3%). For the 10 Hz GPS, 18 Hz GPS, and 20 Hz LPS, the relative loss of data sets due to measurement errors was 10.0%, 20.0%, and 15.8%, respectively. 
This study shows that 18 Hz GPS has enhanced validity and reliability for determining movement patterns in team sports compared to 10 Hz GPS, whereas 20 Hz LPS had superior validity and reliability overall. However, compared to 10 Hz GPS, 18 Hz GPS and 20 Hz LPS technologies had more outliers due to measurement errors, which limits their practical applications at this time. PMID:29420620
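The abstract above reports between-device reliability as a coefficient of variation (CV) and validity as a typical error of estimate (TEE) but gives no formulas. The sketch below follows one common sports-science convention, which is an assumption here and not necessarily the authors' computation: CV is the typical error (SD of paired differences divided by √2) as a percentage of the grand mean, and TEE is the residual standard error of a linear regression of criterion on measured values, as a percentage of the criterion mean. The function names are hypothetical.

```python
import math
import statistics


def between_device_cv(device_a, device_b):
    """Between-device CV (%) from paired readings of two units worn together.

    Assumed convention: typical error = SD(differences) / sqrt(2),
    expressed relative to the grand mean of all readings.
    """
    diffs = [a - b for a, b in zip(device_a, device_b)]
    typical_error = statistics.stdev(diffs) / math.sqrt(2)
    grand_mean = statistics.mean(list(device_a) + list(device_b))
    return 100.0 * typical_error / grand_mean


def typical_error_of_estimate(measured, criterion):
    """TEE (%) as the residual standard error of criterion regressed on measured."""
    n = len(measured)
    mx, my = statistics.mean(measured), statistics.mean(criterion)
    sxx = sum((x - mx) ** 2 for x in measured)
    sxy = sum((x - mx) * (y - my) for x, y in zip(measured, criterion))
    slope = sxy / sxx
    intercept = my - slope * mx
    sse = sum((y - (intercept + slope * x)) ** 2
              for x, y in zip(measured, criterion))
    # n - 2 degrees of freedom: slope and intercept are both estimated.
    return 100.0 * math.sqrt(sse / (n - 2)) / my
```

Note that under this convention a constant offset between two devices does not raise the CV, because only the spread of the differences enters the typical error.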
NASA Astrophysics Data System (ADS)
Yan, Yajing; Barth, Alexander; Beckers, Jean-Marie; Candille, Guillem; Brankart, Jean-Michel; Brasseur, Pierre
2015-04-01
Sea surface height, sea surface temperature and temperature profiles at depth collected between January and December 2005 are assimilated into a realistic eddy permitting primitive equation model of the North Atlantic Ocean using the Ensemble Kalman Filter. 60 ensemble members are generated by adding realistic noise to the forcing parameters related to the temperature. The ensemble is diagnosed and validated by comparison between the ensemble spread and the model/observation difference, as well as by rank histogram before the assimilation experiments. An incremental analysis update scheme is applied in order to reduce spurious oscillations due to the model state correction. The results of the assimilation are assessed according to both deterministic and probabilistic metrics with observations used in the assimilation experiments and independent observations, which goes further than most previous studies and constitutes one of the original points of this paper. Regarding the deterministic validation, the ensemble means, together with the ensemble spreads, are compared to the observations in order to diagnose the ensemble distribution properties in a deterministic way. Regarding the probabilistic validation, the continuous ranked probability score (CRPS) is used to evaluate the ensemble forecast system according to reliability and resolution. The reliability is further decomposed into bias and dispersion by the reduced centred random variable (RCRV) score in order to investigate the reliability properties of the ensemble forecast system. The improvement of the assimilation is demonstrated using these validation metrics. Finally, the deterministic validation and the probabilistic validation are analysed jointly. The consistency and complementarity between both validations are highlighted. Highly reliable situations, in which the RMS error and the CRPS give the same information, are identified for the first time in this paper.
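The two probabilistic scores named in the abstract above can be illustrated for a single variable. This is a minimal sketch, assuming the standard empirical-ensemble form of the CRPS (mean absolute error of the members minus half the mean absolute difference between members) and the usual RCRV definition, in which each observation-minus-ensemble-mean difference is normalised by the combined observation error and ensemble spread. The function names and the scalar `obs_error_sd` argument are hypothetical, and the study's own implementation may differ.

```python
import math
import statistics


def ensemble_crps(members, observation):
    """Empirical CRPS of one ensemble forecast against one observation."""
    m = len(members)
    mean_abs_error = sum(abs(x - observation) for x in members) / m
    mean_abs_spread = sum(abs(xi - xj) for xi in members for xj in members) / (m * m)
    return mean_abs_error - 0.5 * mean_abs_spread


def rcrv(ensemble_means, ensemble_spreads, observations, obs_error_sd):
    """Return (bias, dispersion) of the reduced centred random variable.

    A reliable ensemble has bias close to 0 and dispersion close to 1.
    """
    y = [
        (obs - mean) / math.sqrt(obs_error_sd ** 2 + spread ** 2)
        for mean, spread, obs in zip(ensemble_means, ensemble_spreads, observations)
    ]
    return statistics.mean(y), statistics.pstdev(y)
```

With a single member the CRPS reduces to the absolute error, which is why it can be read alongside deterministic error measures such as the RMS error.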
Schwertner, Debora Soccal; Oliveira, Raul; Mazo, Giovana Zarpellon; Gioda, Fabiane Rosa; Kelber, Christian Roberto; Swarowsky, Alessandra
2016-05-04
Several posture evaluation devices have been used to detect deviations of the vertebral column. However, it has been observed that these instruments present measurement errors related to the equipment, environment or measurement protocol. This study aimed to build, validate, analyze the reliability and describe a measurement protocol for the use of the Posture Evaluation Rotating Platform System (SPGAP, Brazilian abbreviation). The posture evaluation system comprises a Posture Evaluation Rotating Platform, video camera, calibration support and measurement software. Two pilot studies were carried out with 102 elderly individuals (average age 69 years old, SD = ±7.3) to establish a protocol for SPGAP, controlling the measurement errors related to the environment, equipment and the person under evaluation. Content validation was completed with input from judges with expertise in posture measurement. The variation coefficient method was used to validate the measurement by the instrument of an object with known dimensions. Finally, reliability was established using repeated measurements of the known object. Expert content judges gave the system excellent ratings for content validity (mean 9.4 out of 10; SD 1.13). The measurement of an object with known dimensions indicated excellent validity (all measurement errors <1 %) and test-retest reliability. A total of 26 images were needed to stabilize the system. Participants in the pilot studies indicated that they felt comfortable throughout the assessment. The use of only one image can offer measurements that underestimate or overestimate the reality. For the images of objects with known dimensions, the values for width and height were, respectively, CV 0.88 and 2.33, SD 0.22 and 0.35, and minimum and maximum values 24.83-25.2 and 14.56-15.75. In the analysis of different (similar) images of an individual, greater discrepancies were observed in the values found. 
The cervical index, for example, presented minimum and maximum values of 15.38 and 37.5, a coefficient of variation of 0.29 and a standard deviation of 6.78. The SPGAP was shown to be a valid and reliable instrument for the quantitative analysis of body posture with applicability and clinical use, since it managed to reduce several measurement errors, among them parallax distortion.
Evaluation of Urinary Tract Dilation Classification System for Grading Postnatal Hydronephrosis.
Hodhod, Amr; Capolicchio, John-Paul; Jednak, Roman; El-Sherif, Eid; El-Doray, Abd El-Alim; El-Sherbiny, Mohamed
2016-03-01
We assessed the reliability and validity of the Urinary Tract Dilation classification system as a new grading system for postnatal hydronephrosis. We retrospectively reviewed charts of patients who presented with hydronephrosis from 2008 to 2013. We included patients diagnosed prenatally and those with hydronephrosis discovered incidentally during the first year of life. We excluded cases involving urinary tract infection, neurogenic bladder and chromosomal anomalies, those associated with extraurinary congenital malformations and those with followup of less than 24 months without resolution. Hydronephrosis was graded postnatally using the Society for Fetal Urology system, and then the management protocol was chosen. All units were regraded using the Urinary Tract Dilation classification system and compared to the Society for Fetal Urology system to assess reliability. Univariate and multivariate analyses were performed to assess the validity of the Urinary Tract Dilation classification system in predicting hydronephrosis resolution and surgical intervention. A total of 490 patients (730 renal units) were eligible to participate. The Urinary Tract Dilation classification system was reliable in the assessment of hydronephrosis (parallel forms 0.92). Hydronephrosis resolved in 357 units (49%), and 86 units (12%) were managed by surgical intervention. The remainder of renal units demonstrated stable or improved hydronephrosis. Multivariate analysis revealed that the likelihood of surgical intervention was predicted independently by Urinary Tract Dilation classification system risk group, while Society for Fetal Urology grades were predictive of likelihood of resolution. The Urinary Tract Dilation classification system is reliable for evaluation of postnatal hydronephrosis and is valid in predicting surgical intervention. Copyright © 2016 American Urological Association Education and Research, Inc. Published by Elsevier Inc. All rights reserved.
Lu, Liang-Hsuan; Chiang, Shang-Lin; Wei, Shun-Hwa; Lin, Chueh-Ho; Sung, Wen-Hsu
2017-08-01
Being bedridden long-term can cause deterioration in patients' physiological function and performance, limiting daily activities and increasing the incidence of falls and other accidental injuries. Little research has been carried out in designing effective detecting systems to monitor the posture and status of bedridden patients and to provide accurate real-time feedback on posture. The purposes of this research were to develop a computer-aided system for real-time detection of physical activities in bed and to evaluate the system's validity and test-retest reliability in determining eight postures: motion leftward/rightward, turning over leftward/rightward, getting up leftward/rightward, and getting off the bed leftward/rightward. The in-bed physical activity detecting system consists mainly of a clinical sickbed, signal amplifier, a data acquisition (DAQ) system, and operating software for computing and determining postural changes associated with four load cell sensing components. Thirty healthy subjects (15 males and 15 females, mean age = 27.8 ± 5.3 years) participated in the study. All subjects were asked to execute eight in-bed activities in a random order and to participate in an evaluation of the test-retest reliability of the results 14 days later. Spearman's rank correlation coefficient was used to compare the system's determinations of postural states with researchers' recordings of postural changes. The test-retest reliability of the system's ability to determine postures was analyzed using the intraclass correlation coefficient ICC(3,1). The system was found to exhibit high validity and accuracy (r = 0.928, p < 0.001; accuracy rate: 87.9%) in determining in-bed displacement, turning over, sitting up, and getting off the bed. The system was particularly accurate in detecting motion rightward (90%), turning over leftward (83%), sitting up leftward or rightward (87-93%), and getting off the bed (100%). 
The test-retest reliability ICC(3,1) value was 0.968 (p < 0.001). The system developed in this study exhibits satisfactory validity and reliability in detecting changes in in-bed body postures and can be beneficial in assisting caregivers and clinical nursing staff in detecting the in-bed physical activities of bedridden patients and in developing fall prevention warning systems. Copyright © 2017 Elsevier B.V. All rights reserved.
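For readers unfamiliar with the ICC(3,1) statistic used in the abstract above, the following is a minimal sketch of the Shrout and Fleiss two-way mixed, consistency, single-measurement form: (MS_rows − MS_error) / (MS_rows + (k − 1)·MS_error). The function name and the row-per-subject input layout are assumptions for illustration, not the authors' code.

```python
def icc_3_1(scores):
    """ICC(3,1) for a table with one row per subject and one column per session."""
    n = len(scores)          # subjects
    k = len(scores[0])       # sessions (or raters)
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)
```

Because this form measures consistency, a uniform shift between the first and second session (every subject scoring one point higher on retest, for instance) still yields an ICC of 1.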
O'Neil, Margaret E; Fragala-Pinkham, Maria; Lennon, Nancy; George, Ameeka; Forman, Jeffrey; Trost, Stewart G
2016-01-01
Physical therapy for youth with cerebral palsy (CP) who are ambulatory includes interventions to increase functional mobility and participation in physical activity (PA). Thus, reliable and valid measures are needed to document PA in youth with CP. The purpose of this study was to evaluate the inter-instrument reliability and concurrent validity of 3 accelerometer-based motion sensors with indirect calorimetry as the criterion for measuring PA intensity in youth with CP. Fifty-seven youth with CP (mean age=12.5 years, SD=3.3; 51% female; 49.1% with spastic hemiplegia) participated. Inclusion criteria were: aged 6 to 20 years, ambulatory, Gross Motor Function Classification System (GMFCS) levels I through III, able to follow directions, and able to complete the full PA protocol. Protocol activities included standardized activity trials with increasing PA intensity (resting, writing, household chores, active video games, and walking at 3 self-selected speeds), as measured by weight-relative oxygen uptake (in mL/kg/min). During each trial, participants wore bilateral accelerometers on the upper arms, waist/hip, and ankle and a portable indirect calorimeter. Intraclass correlation coefficients (ICCs) were calculated to evaluate inter-instrument reliability (left-to-right accelerometer placement). Spearman correlations were used to examine concurrent validity between accelerometer output (activity and step counts) and indirect calorimetry. Friedman analyses of variance with post hoc pair-wise analyses were conducted to examine the validity of accelerometers to discriminate PA intensity across activity trials. All accelerometers exhibited excellent inter-instrument reliability (ICC=.94-.99) and good concurrent validity (rho=.70-.85). All accelerometers discriminated PA intensity across most activity trials. This PA protocol consisted of controlled activity trials. Accelerometers provide valid and reliable measures of PA intensity among youth with CP. 
© 2016 American Physical Therapy Association.
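The concurrent-validity figures in the abstract above are Spearman correlations. As a minimal sketch, Spearman's rho can be computed as the Pearson correlation of average ranks, which handles tied values. The function names are hypothetical and the example is illustrative only; it does not reproduce the study's data.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for t in range(i, j + 1):
            ranks[order[t]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation as Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)
```

Rank correlation suits this kind of comparison because accelerometer counts and oxygen uptake need only rise together, not follow a straight line.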
Validity and reliability of acoustic analysis of respiratory sounds in infants
Elphick, H; Lancaster, G; Solis, A; Majumdar, A; Gupta, R; Smyth, R
2004-01-01
Objective: To investigate the validity and reliability of computerised acoustic analysis in the detection of abnormal respiratory noises in infants. Methods: Blinded, prospective comparison of acoustic analysis with stethoscope examination. Validity and reliability of acoustic analysis were assessed by calculating the degree of observer agreement using the κ statistic with 95% confidence intervals (CI). Results: 102 infants under 18 months were recruited. Convergent validity for agreement between stethoscope examination and acoustic analysis was poor for wheeze (κ = 0.07 (95% CI, –0.13 to 0.26)) and rattles (κ = 0.11 (–0.05 to 0.27)) and fair for crackles (κ = 0.36 (0.18 to 0.54)). Both the stethoscope and acoustic analysis distinguished well between sounds (discriminant validity). Agreement between observers for the presence of wheeze was poor for both stethoscope examination and acoustic analysis. Agreement for rattles was moderate for the stethoscope but poor for acoustic analysis. Agreement for crackles was moderate using both techniques. Within-observer reliability for all sounds using acoustic analysis was moderate to good. Conclusions: The stethoscope is unreliable for assessing respiratory sounds in infants. This has important implications for its use as a diagnostic tool for lung disorders in infants, and confirms that it cannot be used as a gold standard. Because of the unreliability of the stethoscope, the validity of acoustic analysis could not be demonstrated, although it could discriminate between sounds well and showed good within-observer reliability. For acoustic analysis, targeted training and the development of computerised pattern recognition systems may improve reliability so that it can be used in clinical practice. PMID:15499065
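The κ statistic used in the abstract above corrects raw agreement for the agreement expected by chance. The sketch below shows the unweighted two-rater (Cohen's) form, assuming each rater assigns one categorical label per recording. The function name is hypothetical, and the confidence-interval calculation reported in the study is not reproduced.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters labelling the same items."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of the two raters' marginal proportions per label.
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    # Undefined (division by zero) if both raters use a single identical label throughout.
    return (observed - expected) / (1 - expected)
```

A κ near 0, such as the value reported for wheeze, means the two methods agreed no more often than their marginal rates alone would predict.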
Sleeper, Mark D; Kenyon, Lisa K; Elliott, James M; Cheng, M Samuel
2016-12-01
Despite the availability of various field-tests for many competitive sports, a reliable and valid test specifically developed for use in men's gymnastics has not yet been developed. The Men's Gymnastics Functional Measurement Tool (MGFMT) was designed to assess sport-specific physical abilities in male competitive gymnasts. The purpose of this study was to develop the MGFMT by establishing a scoring system for individual test items and to initiate the process of establishing test-retest reliability and construct validity. A total of 83 competitive male gymnasts ages 7-18 underwent testing using the MGFMT. Thirty of these subjects underwent re-testing one week later in order to assess test-retest reliability. Construct validity was assessed using a simple regression analysis between total MGFMT scores and the gymnasts' USA-Gymnastics competitive level to calculate the coefficient of determination (r²). Test-retest reliability was analyzed using Model 1 Intraclass correlation coefficients (ICC). Statistical significance was set at the p<0.05 level. The relationship between total MGFMT scores and subjects' current USA-Gymnastics competitive level was found to be good (r² = 0.63). Reliability testing of the MGFMT composite test score showed excellent test-retest reliability over a one-week period (ICC = 0.97). Test-retest reliability of the individual component tests ranged from good to excellent (ICC = 0.75-0.97). The results of this study provide initial support for the construct validity and test-retest reliability of the MGFMT. Level 3.
Validation of the Behavioral Risk Factor Surveillance System Sleep Questions
Jungquist, Carla R.; Mund, Jaime; Aquilina, Alan T.; Klingman, Karen; Pender, John; Ochs-Balcom, Heather; van Wijngaarden, Edwin; Dickerson, Suzanne S.
2016-01-01
Study Objective: Sleep problems may constitute a risk for health problems, including cardiovascular disease, depression, diabetes, poor work performance, and motor vehicle accidents. The primary purpose of this study was to assess the validity of the current Behavioral Risk Factor Surveillance System (BRFSS) sleep questions by establishing the sensitivity and specificity for detection of sleep/wake disturbance. Methods: Repeated cross-sectional assessment of 300 community dwelling adults over the age of 18 who did not wear CPAP or oxygen during sleep. Reliability and validity testing of the BRFSS sleep questions was performed by comparing BRFSS responses to data from home sleep study, actigraphy for 14 days, Insomnia Severity Index, Epworth Sleepiness Scale, and PROMIS-57. Results: Only two of the five BRFSS sleep questions were found valid and reliable in determining total sleep time and excessive daytime sleepiness. Conclusions: Refinement of the BRFSS questions is recommended. Citation: Jungquist CR, Mund J, Aquilina AT, Klingman K, Pender J, Ochs-Balcom H, van Wijngaarden E, Dickerson SS. Validation of the behavioral risk factor surveillance system sleep questions. J Clin Sleep Med 2016;12(3):301–310. PMID:26446246
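Sensitivity and specificity, the validity measures named in the abstract above, come from cross-tabulating a screening answer against a reference classification. A minimal sketch follows, assuming both are already coded as booleans per participant. The function name and the coding are hypothetical, and the study's actual cut-offs are not given in the abstract.

```python
def sensitivity_specificity(screen_positive, reference_positive):
    """Return (sensitivity, specificity) of a screen against a reference standard.

    Both inputs are equal-length sequences of booleans, one entry per participant.
    Assumes the reference contains at least one positive and one negative case.
    """
    pairs = list(zip(screen_positive, reference_positive))
    true_pos = sum(1 for s, r in pairs if s and r)
    false_neg = sum(1 for s, r in pairs if not s and r)
    true_neg = sum(1 for s, r in pairs if not s and not r)
    false_pos = sum(1 for s, r in pairs if s and not r)
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity
```

For a survey question, sensitivity is the share of participants with a disturbance on the reference measure whom the question flags, and specificity is the share without one whom it correctly leaves unflagged.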
Clark, Ross A; Pua, Yong-Hao; Oliveira, Cristino C; Bower, Kelly J; Thilarajah, Shamala; McGaw, Rebekah; Hasanki, Ksaniel; Mentiplay, Benjamin F
2015-07-01
The Microsoft Kinect V2 for Windows, also known as the Xbox One Kinect, includes new and potentially far improved depth and image sensors which may increase its accuracy for assessing postural control and balance. The aim of this study was to assess the concurrent validity and reliability of kinematic data recorded using a marker-based three dimensional motion analysis (3DMA) system and the Kinect V2 during a variety of static and dynamic balance assessments. Thirty healthy adults performed two sessions, separated by one week, consisting of static standing balance tests under different visual (eyes open vs. closed) and supportive (single limb vs. double limb) conditions, and dynamic balance tests consisting of forward and lateral reach and an assessment of limits of stability. Marker coordinate and joint angle data were concurrently recorded using the Kinect V2 skeletal tracking algorithm and the 3DMA system. Task-specific outcome measures from each system on Day 1 and 2 were compared. Concurrent validity of trunk angle data during the dynamic tasks and anterior-posterior range and path length in the static balance tasks was excellent (Pearson's r>0.75). In contrast, concurrent validity for medial-lateral range and path length was poor to modest for all trials except single leg eyes closed balance. Within device test-retest reliability was variable; however, the results were generally comparable between devices. In conclusion, the Kinect V2 has the potential to be used as a reliable and valid tool for the assessment of some aspects of balance performance. Copyright © 2015 Elsevier B.V. All rights reserved.
Quek, June; Brauer, Sandra G; Treleaven, Julia; Pua, Yong-Hao; Mentiplay, Benjamin; Clark, Ross Allan
2014-04-17
Concurrent validity and intra-rater reliability using a customized Android phone application to measure cervical-spine range-of-motion (ROM) has not been previously validated against a gold-standard three-dimensional motion analysis (3DMA) system. Twenty-one healthy individuals (age:31 ± 9.1 years, male:11) participated, with 16 re-examined for intra-rater reliability 1-7 days later. An Android phone was fixed on a helmet, which was then securely fastened on the participant's head. Cervical-spine ROM in flexion, extension, lateral flexion and rotation were performed in sitting with concurrent measurements obtained from both a 3DMA system and the phone. The phone demonstrated moderate to excellent (ICC = 0.53-0.98, Spearman ρ = 0.52-0.98) concurrent validity for ROM measurements in cervical flexion, extension, lateral-flexion and rotation. However, cervical rotation demonstrated both proportional and fixed bias. Excellent intra-rater reliability was demonstrated for cervical flexion, extension and lateral flexion (ICC = 0.82-0.90), but poor for right- and left-rotation (ICC = 0.05-0.33) using the phone. Possible reasons for the outcome are that flexion, extension and lateral-flexion measurements are detected by gravity-dependent accelerometers while rotation measurements are detected by the magnetometer which can be adversely affected by surrounding magnetic fields. The results of this study demonstrate that the tested Android phone application is valid and reliable to measure ROM of the cervical-spine in flexion, extension and lateral-flexion but not in rotation likely due to magnetic interference. The clinical implication of this study is that therapists should be mindful of the plane of measurement when using the Android phone to measure ROM of the cervical-spine.
2014-01-01
Background Concurrent validity and intra-rater reliability using a customized Android phone application to measure cervical-spine range-of-motion (ROM) has not been previously validated against a gold-standard three-dimensional motion analysis (3DMA) system. Findings Twenty-one healthy individuals (age:31 ± 9.1 years, male:11) participated, with 16 re-examined for intra-rater reliability 1–7 days later. An Android phone was fixed on a helmet, which was then securely fastened on the participant’s head. Cervical-spine ROM in flexion, extension, lateral flexion and rotation were performed in sitting with concurrent measurements obtained from both a 3DMA system and the phone. The phone demonstrated moderate to excellent (ICC = 0.53-0.98, Spearman ρ = 0.52-0.98) concurrent validity for ROM measurements in cervical flexion, extension, lateral-flexion and rotation. However, cervical rotation demonstrated both proportional and fixed bias. Excellent intra-rater reliability was demonstrated for cervical flexion, extension and lateral flexion (ICC = 0.82-0.90), but poor for right- and left-rotation (ICC = 0.05-0.33) using the phone. Possible reasons for the outcome are that flexion, extension and lateral-flexion measurements are detected by gravity-dependent accelerometers while rotation measurements are detected by the magnetometer which can be adversely affected by surrounding magnetic fields. Conclusion The results of this study demonstrate that the tested Android phone application is valid and reliable to measure ROM of the cervical-spine in flexion, extension and lateral-flexion but not in rotation likely due to magnetic interference. The clinical implication of this study is that therapists should be mindful of the plane of measurement when using the Android phone to measure ROM of the cervical-spine. PMID:24742001
Scherr, Karen A.; Fagerlin, Angela; Williamson, Lillie D.; Davis, J. Kelly; Fridman, Ilona; Atyeo, Natalie; Ubel, Peter A.
2016-01-01
Background Physicians’ recommendations affect patients’ treatment choices. However, most research relies on physicians’ or patients’ retrospective reports of recommendations, which offer a limited perspective and have limitations such as recall bias. Objective To develop a reliable and valid method to measure the strength of physician recommendations using direct observation of clinical encounters. Methods Clinical encounters (n = 257) were recorded as part of a larger study of prostate cancer decision making. We used an iterative process to create the 5-point Physician Recommendation Coding System (PhyReCS). To determine reliability, research assistants double-coded 50 transcripts. To establish content validity, we used one-way ANOVAs to determine whether relative treatment recommendation scores differed as a function of which treatment patients received. To establish concurrent validity, we examined whether patients’ perceived treatment recommendations matched our coded recommendations. Results The PhyReCS was highly reliable (Krippendorff’s alpha = .89, 95% CI [.86, .91]). The average relative treatment recommendation score for each treatment was higher for individuals who received that particular treatment. For example, the average relative surgery recommendation score was higher for individuals who received surgery versus radiation (mean difference = .98, SE = .18, p < .001) or active surveillance (mean difference = 1.10, SE = .14, p < .001). Patients’ perceived recommendations matched coded recommendations 81% of the time. Conclusion The PhyReCS is a reliable and valid way to capture the strength of physician recommendations. We believe that the PhyReCS would be helpful for other researchers who wish to study physician recommendations, an important part of patient decision making. PMID:27343015
Scherr, Karen A; Fagerlin, Angela; Williamson, Lillie D; Davis, J Kelly; Fridman, Ilona; Atyeo, Natalie; Ubel, Peter A
2017-01-01
Physicians' recommendations affect patients' treatment choices. However, most research relies on physicians' or patients' retrospective reports of recommendations, which offer a limited perspective and have limitations such as recall bias. To develop a reliable and valid method to measure the strength of physician recommendations using direct observation of clinical encounters. Clinical encounters (n = 257) were recorded as part of a larger study of prostate cancer decision making. We used an iterative process to create the 5-point Physician Recommendation Coding System (PhyReCS). To determine reliability, research assistants double-coded 50 transcripts. To establish content validity, we used 1-way analyses of variance to determine whether relative treatment recommendation scores differed as a function of which treatment patients received. To establish concurrent validity, we examined whether patients' perceived treatment recommendations matched our coded recommendations. The PhyReCS was highly reliable (Krippendorff's alpha = 0.89, 95% CI [0.86, 0.91]). The average relative treatment recommendation score for each treatment was higher for individuals who received that particular treatment. For example, the average relative surgery recommendation score was higher for individuals who received surgery versus radiation (mean difference = 0.98, SE = 0.18, P < 0.001) or active surveillance (mean difference = 1.10, SE = 0.14, P < 0.001). Patients' perceived recommendations matched coded recommendations 81% of the time. The PhyReCS is a reliable and valid way to capture the strength of physician recommendations. We believe that the PhyReCS would be helpful for other researchers who wish to study physician recommendations, an important part of patient decision making. © The Author(s) 2016.
Development of the Systems Thinking Scale for Adolescent Behavior Change.
Moore, Shirley M; Komton, Vilailert; Adegbite-Adeniyi, Clara; Dolansky, Mary A; Hardin, Heather K; Borawski, Elaine A
2018-03-01
This report describes the development and psychometric testing of the Systems Thinking Scale for Adolescent Behavior Change (STS-AB). Following item development, initial assessments of understandability and stability of the STS-AB were conducted in a sample of nine adolescents enrolled in a weight management program. Exploratory factor analysis of the 16-item STS-AB and internal consistency assessments were then done with 359 adolescents enrolled in a weight management program. Test-retest reliability of the STS-AB was .71, p = .03; internal consistency reliability was .87. Factor analysis of the 16-item STS-AB indicated a one-factor solution with good factor loadings, ranging from .40 to .67. Evidence of construct validity was supported by significant correlations with established measures of variables associated with health behavior change. We provide beginning evidence of the reliability and validity of the STS-AB to measure systems thinking for health behavior change in young adolescents.
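The abstract above reports an internal consistency reliability of .87 without naming the coefficient. Cronbach's alpha is the most common choice for multi-item scales, so the sketch below assumes it. That is an assumption, as are the function name and the row-per-respondent layout.

```python
import statistics


def cronbach_alpha(responses):
    """Cronbach's alpha for a table with one row per respondent, one column per item."""
    k = len(responses[0])
    item_variances = [
        statistics.variance([row[j] for row in responses]) for j in range(k)
    ]
    total_variance = statistics.variance([sum(row) for row in responses])
    # Alpha rises as the items covary, i.e. as total-score variance
    # exceeds the sum of the individual item variances.
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)
```

For a 16-item scale such as the STS-AB, each row would hold one adolescent's 16 item scores.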
Development of the Systems Thinking Scale for Adolescent Behavior Change
Moore, Shirley M.; Komton, Vilailert; Adegbite-Adeniyi, Clara; Dolansky, Mary A.; Hardin, Heather K.; Borawski, Elaine A.
2017-01-01
This report describes the development and psychometric testing of the Systems Thinking Scale for Adolescent Behavior Change (STS-AB). Following item development, initial assessments of understandability and stability of the STS-AB were conducted in a sample of nine adolescents enrolled in a weight management program. Exploratory factor analysis of the 16-item STS-AB and internal consistency assessments were then done with 359 adolescents enrolled in a weight management program. Test–retest reliability of the STS-AB was .71, p = .03; internal consistency reliability was .87. Factor analysis of the 16-item STS-AB indicated a one-factor solution with good factor loadings, ranging from .40 to .67. Evidence of construct validity was supported by significant correlations with established measures of variables associated with health behavior change. We provide beginning evidence of the reliability and validity of the STS-AB to measure systems thinking for health behavior change in young adolescents. PMID:28303755
2011-11-01
assessment to quality of localization/characterization estimates. This protocol includes four critical components: (1) a procedure to identify the...critical factors impacting SHM system performance; (2) a multistage or hierarchical approach to SHM system validation; (3) a model-assisted evaluation...Lindgren, E. A., Buynak, C. F., Steffes, G., Derriso, M., “Model-assisted Probabilistic Reliability Assessment for Structural Health Monitoring
Validation of Clinical Observations of Mastication in Persons with ALS.
Simione, Meg; Wilson, Erin M; Yunusova, Yana; Green, Jordan R
2016-06-01
Amyotrophic lateral sclerosis (ALS) is a progressive neurological disease that can result in difficulties with mastication leading to malnutrition, choking or aspiration, and reduced quality of life. When evaluating mastication, clinicians primarily observe spatial and temporal aspects of jaw motion. The reliability and validity of clinical observations for detecting jaw movement abnormalities are unknown. The purpose of this study is to determine the reliability and validity of clinician-based ratings of chewing performance in neuro-typical controls and persons with varying degrees of chewing impairments due to ALS. Adults chewed a solid food consistency while full-face video was recorded along with jaw kinematic data using a 3D optical motion capture system. Five experienced speech-language pathologists watched the videos and rated the spatial and temporal aspects of chewing performance. The jaw kinematic data served as the gold-standard for validating the clinicians' ratings. Results showed that the clinician-based rating of temporal aspects of chewing performance had strong inter-rater reliability and correlated well with comparable kinematic measures. In contrast, the reliability of rating the spatial and spatiotemporal aspects of chewing (i.e., range of motion of the jaw, consistency of the chewing pattern) was mixed. Specifically, ratings of range of motion were at best only moderately reliable. Ratings of chewing movement consistency were reliable but only weakly correlated with comparable measures of jaw kinematics. These findings suggest that clinician ratings of temporal aspects of chewing are appropriate for clinical use, whereas ratings of the spatial and spatiotemporal aspects of chewing may not be reliable or valid.
A Comparison of Laser and Video Techniques for Determining Displacement and Velocity during Running
ERIC Educational Resources Information Center
Harrison, Andrew J.; Jensen, Randall L.; Donoghue, Orna
2005-01-01
The reliability of a laser system was compared with the reliability of a video-based kinematic analysis in measuring displacement and velocity during running. Validity and reliability of the laser on static measures was also assessed at distances between 10 m and 70 m by evaluating the coefficient of variation and intraclass correlation…
The Adult Attachment Projective Picture System: integrating attachment into clinical assessment.
George, Carol; West, Malcolm
2011-01-01
This article summarizes the development and validation of the Adult Attachment Projective Picture System (AAP), a measure we developed from the Bowlby-Ainsworth developmental tradition to assess adult attachment status. The AAP has demonstrated excellent concurrent validity with the Adult Attachment Interview (George, Kaplan, & Main, 1984/1985/1996; Main & Goldwyn, 1985-1994; Main, Goldwyn, & Hesse, 2003), interjudge reliability, and test-retest reliability, with no effects of verbal intelligence or social desirability. The AAP coding and classification system and application in clinical and community samples are summarized. Finally, we introduce the 3 other articles that are part of this Special Section and discuss the use of the AAP in therapeutic assessment and treatment.
ERIC Educational Resources Information Center
Tella, Adeyinka
2011-01-01
The suitability of 52 items for measuring Blackboard course management system success was investigated with the aim of validating the Blackboard CMS success scale in an educational context. Through a survey, the Blackboard course management system (BCMS) success scale was administered to 503 students at the University of Botswana. Data collected…
Barty, Elizabeth; Caynes, Katy; Johnston, Leanne M
2016-10-01
This paper describes the development, validation, and reliability of the Functional Communication Classification System (FCCS), designed to classify expressive communication skills of children with cerebral palsy (CP) aged 4 years and 5 years (between their fourth and sixth birthdays). The Functional Communication Classification System (FCCS) was developed in 2006 using a literature review, client file audit, and expert consultative committee process in order to devise scale content, structure, and check clinical validity and utility. Interrater reliability was examined between speech-language pathologists (SLPs), other allied health professionals (AHPs), and parents of 48 children with CP. The scale was revised and a clinical reasoning prompt sheet added, then trialled again for 42 children. The result was a five-level system with descriptors and decision-making guides for classification of functional expressive communication for children with CP. Overall interrater reliability was excellent for the final FCCS, intraclass correlation coefficient=0.97 (95% confidence interval 0.95 to 0.98). Kappa values were 0.94 between SLPs and AHPs, 0.59 between SLPs and parents, and 0.60 between AHPs and parents. The FCCS is a reliable tool for describing functional communication in young children with CP, appropriate for use by SLPs, other AHPs, and parents of children with CP. © 2016 Mac Keith Press.
Chang, Wen-Dien; Chang, Wan-Yi; Lee, Chia-Lun; Feng, Chi-Yen
2013-01-01
[Purpose] Balance is an integral part of human ability. The smart balance master system (SBM) is a balance test instrument with good reliability and validity, but it is expensive. Therefore, we modified a Wii Fit balance board, which is a convenient balance assessment tool, and analyzed its reliability and validity. [Subjects and Methods] We recruited 20 healthy young adults and 20 elderly people, and administered 3 balance tests. The correlation coefficient and intraclass correlation of both instruments were analyzed. [Results] There were no statistically significant differences in the 3 tests between the Wii Fit balance board and the SBM. The Wii Fit balance board had a good intraclass correlation (0.86–0.99) for the elderly people and positive correlations (r = 0.58–0.86) with the SBM. [Conclusions] The Wii Fit balance board is a balance assessment tool with good reliability and high validity for elderly people, and we recommend it as an alternative tool for assessing balance ability. PMID:24259769
Alsalaheen, Bara; Haines, Jamie; Yorke, Amy; Broglio, Steven P
2015-12-01
To examine the reliability, convergent, and discriminant validity of the limits of stability (LOS) test to assess dynamic postural stability in adolescents using a portable forceplate system. Cross-sectional reliability observational study. School setting. Adolescents (N=36) completed all measures during the first session. To examine the reliability of the LOS test, a subset of 15 participants repeated the LOS test after 1 week. Not applicable. Outcome measurements included the LOS test, Balance Error Scoring System, Instrumented Balance Error Scoring System, and Modified Clinical Test for Sensory Interaction on Balance. A significant relation was observed among LOS composite scores (r=.36-.87, P<.05). However, no relation was observed between LOS and static balance outcome measurements. The reliability of the LOS composite scores ranged from moderate to good (intraclass correlation coefficient model 2,1=.73-.96). The results suggest that the LOS composite scores provide unique information about dynamic postural stability, and the LOS test completed at 100% of the theoretical limit appeared to be a reliable test of dynamic postural stability in adolescents. Clinicians should use dynamic balance measurement as part of their balance assessment and should not use static balance testing (eg, Balance Error Scoring System) to make inferences about dynamic balance, especially when balance assessment is used to determine rehabilitation outcomes, or when making return to play decisions after injury. Copyright © 2015 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.
Yanai, Toshimasa; Matsuo, Akifumi; Maeda, Akira; Nakamoto, Hiroki; Mizutani, Mirai; Kanehisa, Hiroaki; Fukunaga, Tetsuo
2017-08-01
We developed a force measurement system in a soil-filled mound for measuring ground reaction forces (GRFs) acting on baseball pitchers and examined the reliability and validity of kinetic and kinematic parameters determined from the GRFs. Three soil-filled trays of dimensions that satisfied the official baseball rules were fixed onto 3 force platforms. Eight collegiate pitchers wearing baseball shoes with metal cleats were asked to throw 5 fastballs with maximum effort from the mound toward a catcher. The reliability of each parameter was determined for each subject as the coefficient of variation across the 5 pitches. The validity of the measurements was tested by comparing the outcomes either with the true values or the corresponding values computed from a motion capture system. The coefficients of variation in the repeated measurements of the peak forces ranged from 0.00 to 0.17, and were smaller for the pivot foot than the stride foot. The mean absolute errors in the impulses determined over the entire duration of pitching motion were 5.3 N·s, 1.9 N·s, and 8.2 N·s for the X-, Y-, and Z-directions, respectively. These results suggest that the present method is reliable and valid for determining selected kinetic and kinematic parameters for analyzing pitching performance.
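The per-subject reliability measure described above, the coefficient of variation across repeated pitches, can be sketched as follows. This is a minimal illustration, not the authors' code: the function name and the sample peak-force values are hypothetical, and the sample standard deviation (n - 1 denominator) is an assumption, since the abstract does not say which estimator was used.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """Return the sample standard deviation divided by the mean.

    Assumption: the sample (n - 1) standard deviation is used; the
    abstract does not specify the estimator.
    """
    return stdev(values) / mean(values)

# Hypothetical peak forces (N) for one pitcher over 5 pitches.
peak_forces = [1180.0, 1225.0, 1210.0, 1195.0, 1240.0]
cv = coefficient_of_variation(peak_forces)
print(f"CV across pitches: {cv:.3f}")
```

A smaller value indicates more repeatable measurements across the repeated trials.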
NASA Astrophysics Data System (ADS)
Yan, Y.; Barth, A.; Beckers, J. M.; Candille, G.; Brankart, J. M.; Brasseur, P.
2015-07-01
Sea surface height, sea surface temperature, and temperature profiles at depth collected between January and December 2005 are assimilated into a realistic eddy-permitting primitive equation model of the North Atlantic Ocean using the Ensemble Kalman Filter. Sixty ensemble members are generated by adding realistic noise to the forcing parameters related to the temperature. The ensemble is diagnosed and validated by comparison between the ensemble spread and the model/observation difference, as well as by rank histogram before the assimilation experiments. An incremental analysis update scheme is applied in order to reduce spurious oscillations due to the model state correction. The results of the assimilation are assessed according to both deterministic and probabilistic metrics with independent/semi-independent observations. For deterministic validation, the ensemble means, together with the ensemble spreads, are compared to the observations in order to diagnose the ensemble distribution properties in a deterministic way. For probabilistic validation, the continuous ranked probability score (CRPS) is used to evaluate the ensemble forecast system according to reliability and resolution. The reliability is further decomposed into bias and dispersion by the reduced centered random variable (RCRV) score in order to investigate the reliability properties of the ensemble forecast system. The improvement of the assimilation is demonstrated using these validation metrics. Finally, the deterministic validation and the probabilistic validation are analyzed jointly. The consistency and complementarity between both validations are highlighted.
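For readers unfamiliar with the CRPS mentioned above, the following sketch computes the standard empirical CRPS of a single ensemble forecast against one scalar observation. It illustrates the total score only and does not reproduce the reliability/resolution decomposition or the RCRV diagnostics used in the study; the function name and the sample temperatures are hypothetical.

```python
from itertools import product

def ensemble_crps(members, observation):
    """Empirical CRPS of one ensemble against one scalar observation.

    Uses the common estimator
        mean|x_i - y| - 0.5 * mean|x_i - x_j|,
    where the second mean runs over all ordered member pairs.
    """
    n = len(members)
    # Average distance of the members from the observation.
    accuracy = sum(abs(x - observation) for x in members) / n
    # Average distance between members (a measure of ensemble spread).
    spread = sum(abs(a - b) for a, b in product(members, repeat=2)) / (n * n)
    return accuracy - 0.5 * spread

# Hypothetical sea surface temperatures (deg C) from a small ensemble.
print(ensemble_crps([14.8, 15.1, 15.4, 15.0], observation=15.2))
```

Lower values are better; for a one-member ensemble the score reduces to the absolute error.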
Integrating Reliability Analysis with a Performance Tool
NASA Technical Reports Server (NTRS)
Nicol, David M.; Palumbo, Daniel L.; Ulrey, Michael
1995-01-01
A large number of commercial simulation tools support performance oriented studies of complex computer and communication systems. Reliability of these systems, when desired, must be obtained by remodeling the system in a different tool. This has obvious drawbacks: (1) substantial extra effort is required to create the reliability model; (2) through modeling error the reliability model may not reflect precisely the same system as the performance model; (3) as the performance model evolves one must continuously reevaluate the validity of assumptions made in that model. In this paper we describe an approach, and a tool that implements this approach, for integrating a reliability analysis engine into a production quality simulation based performance modeling tool, and for modeling within such an integrated tool. The integrated tool allows one to use the same modeling formalisms to conduct both performance and reliability studies. We describe how the reliability analysis engine is integrated into the performance tool, describe the extensions made to the performance tool to support the reliability analysis, and consider the tool's performance.
Hung, Man; Baumhauer, Judith F; Latt, L Daniel; Saltzman, Charles L; SooHoo, Nelson F; Hunt, Kenneth J
2013-11-01
In 2012, the American Orthopaedic Foot & Ankle Society® established a national network for collecting and sharing data on treatment outcomes and improving patient care. One of the network's initiatives is to explore the use of computerized adaptive tests (CATs) for patient-level outcome reporting. We determined whether the CAT from the NIH Patient Reported Outcome Measurement Information System® (PROMIS®) Physical Function (PF) item bank provides efficient, reliable, valid, precise, and adequately covered point estimates of patients' physical function. After informed consent, 288 patients with a mean age of 51 years (range, 18-81 years) undergoing surgery for common foot and ankle problems completed a web-based questionnaire. Efficiency was determined by time for test administration. Reliability was assessed with person and item reliability estimates. Validity evaluation included content validity from expert review and construct validity measured against the PROMIS® Pain CAT and patient responses based on tradeoff perceptions. Precision was assessed by standard error of measurement (SEM) across patients' physical function levels. Instrument coverage was based on a person-item map. Average time of test administration was 47 seconds. Reliability was 0.96 for person and 0.99 for item. Construct validity against the Pain CAT had an r value of -0.657 (p < 0.001). Precision had an SEM of less than 3.3 (equivalent to a Cronbach's alpha of ≥ 0.90) across a broad range of function. Concerning coverage, the ceiling effect was 0.32% and there was no floor effect. The PROMIS® PF CAT appears to be an excellent method for measuring outcomes for patients with foot and ankle surgery. Further validation of the PROMIS® item banks may ultimately provide a valid and reliable tool for measuring patient-reported outcomes after injuries and treatment.
Green, Dido; Meroz, Anat; Margalit, Adi Edit; Ratzon, Navah Z
2012-11-01
This study examines a potential instrument for measurement of typing postures of children. This paper describes inter-rater, test-retest reliability and concurrent validity of the Keyboard Personal Computer Style instrument (K-PeCS), an observational measurement of postures and movements during keyboarding, for use with children. Two trained raters independently rated videos of 24 children (aged 7-10 years). Six children returned one week later to establish test-retest reliability. Concurrent validity was assessed by comparing ratings obtained using the K-PeCS to scores from a 3D motion analysis system. Inter-rater reliability was moderate to high for 12 out of 16 items (Kappa: 0.46 to 1.00; correlation coefficients: 0.77-0.95) and test-retest reliability varied across items (Kappa: 0.25 to 0.67; correlation coefficients: r = 0.20 to r = 0.95). Concurrent validity compared favourably across arm pathlength, wrist extension and ulnar deviation. In light of the limitations of other tools, the K-PeCS offers a fairly affordable, reliable and valid instrument to address the gap for measurement of typing styles of children, despite the shortcomings of some items. However, further research is required to refine the instrument for use in evaluating typing among children. Copyright © 2012 Elsevier Ltd and The Ergonomics Society. All rights reserved.
Mousavian, Alireza; Ebrahimzadeh, Mohammad H; Birjandinejad, Ali; Omidi-Kashani, Farzad; Kachooei, Amir Reza
2015-12-01
In this study, we aimed to translate and test the validity and reliability of the Persian version of the Manchester-Oxford Foot Questionnaire in foot and ankle patients. We translated the Manchester-Oxford Foot Questionnaire (MOXFQ) into Persian according to the accepted guidelines, then assessed its psychometric properties, including validity and reliability, in 308 patients with long-standing foot and ankle problems. To test the reliability, we calculated the intra-class correlation coefficient (ICC) for test-retest reliability and measured Cronbach's alpha to test the internal consistency. To test the construct validity of the Manchester-Oxford Foot Questionnaire we also administered the Short-Form 36 (SF36) to patients. Construct validity was supported by significant correlation with SF36 subscales except for the pain subscale of the Persian MOXFQ with mental health of the SF36 (r=0.207). Intraclass correlation coefficient was 0.79 for the total MOXFQ and ranged from 0.83 to 0.89 for the three subscales. Cronbach's alpha for pain, walking/standing, and social interaction was 0.86, 0.88, and 0.89, respectively, and was 0.79 for the total MOXFQ, showing good internal consistency in each domain. The Persian Manchester-Oxford Foot Questionnaire health scoring system is a valid and reliable patient-reported instrument for foot and ankle problems. Copyright © 2015. Published by Elsevier Ltd.
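Cronbach's alpha, the internal-consistency statistic reported above, can be computed from item-level responses with the textbook formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The sketch below is illustrative only: the function name and the response matrix are hypothetical and are not data from the study.

```python
from statistics import variance

def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(responses[0])
    # Transpose so each tuple holds all respondents' scores for one item.
    items = list(zip(*responses))
    item_variance_sum = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)

# Hypothetical responses: 5 respondents x 3 items on a 0-4 scale.
responses = [
    [4, 3, 4],
    [2, 2, 3],
    [1, 0, 1],
    [3, 3, 2],
    [0, 1, 0],
]
print(f"alpha = {cronbach_alpha(responses):.2f}")
```

Values closer to 1 indicate that the items vary together, which is what the abstract calls good internal consistency.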
Wan, Chonghua; Li, Hezhan; Fan, Xuejin; Yang, Ruixue; Pan, Jiahua; Chen, Wenru; Zhao, Rong
2014-06-04
Quality of life (QOL) for patients with coronary heart disease (CHD) is now a concern worldwide, but specific instruments are scarce and none has been developed by the modular approach. This paper aims to develop the CHD scale of the system of Quality of Life Instruments for Chronic Diseases (QLICD-CHD) by the modular approach and validate it by both classical test theory and Generalizability Theory. The QLICD-CHD was developed based on programmed decision procedures with multiple nominal and focus group discussions, in-depth interview, pre-testing and quantitative statistical procedures. Data were collected from 146 inpatients with CHD, whose QOL was measured three times before and after treatments. The psychometric properties of the scale were evaluated with respect to validity, reliability and responsiveness employing correlation analysis, factor analyses, multi-trait scaling analysis, t-tests and also G studies and D studies of Generalizability Theory analysis. Multi-trait scaling analysis, correlation and factor analyses confirmed good construct validity and criterion-related validity when using SF-36 as a criterion. The internal consistency α and test-retest reliability coefficients (Pearson r and Intra-class correlations ICC) for the overall instrument and all domains were higher than 0.70 and 0.80 respectively. The overall score and all domains except the social domain had statistically significant changes after treatments, with moderate effect size SRM (standardized response mean) ranging from 0.32 to 0.67. G-coefficients and index of dependability (Ф coefficients) confirmed the reliability of the scale further with more exact variance components. The QLICD-CHD has good validity, reliability, and moderate responsiveness and some highlights, and can be used as the quality of life instrument for patients with CHD. However, in order to obtain better reliability, the number of items in the social domain should be increased, or the quality rather than the quantity of its items should be improved.
Inferior turbinate classification system, grades 1 to 4: development and validation study.
Camacho, Macario; Zaghi, Soroush; Certal, Victor; Abdullatif, Jose; Means, Casey; Acevedo, Jason; Liu, Stanley; Brietzke, Scott E; Kushida, Clete A; Capasso, Robson
2015-02-01
To develop a validated inferior turbinate grading scale. Development and validation study. Phase 1 development (alpha test) consisted of a proposal of 10 different inferior turbinate grading scales (>1,000 clinic patients). Phase 2 validation (beta test) utilized 10 providers grading 27 standardized endoscopic photos of inferior turbinates using two different classification systems. Phase 3 validation (pilot study) consisted of 100 live consecutive clinic patients (n = 200 inferior turbinates) who were each prospectively graded by 18 different combinations of two independent raters, and grading was repeated by each of the same two raters, two separate times for each patient. In the development phase, 25% (grades 1-4) and 33% (grades 1-4) were the most useful systems. In the validation phase, the 25% classification system was found to be the best balance between potential clinical utility and ability to grade; the photo grading demonstrated a Cohen's kappa (κ) = 0.4671 ± 0.0082 (moderate inter-rater agreement). Live-patient grading with the 25% classification system demonstrated an overall inter-rater reliability of 71.5% (95% confidence interval [CI]: 64.8-77.3), with overall substantial agreement (κ = 0.704 ± 0.028). Intrarater reliability was 91.5% (95% CI: 88.7-94.3). Distribution for the 200 inferior turbinates was as follows: 25% quartile = grade 1, 50% quartile (median) = grade 2, 75% quartile = grade 3, and 90% quartile = grade 4. Mean turbinate size was 2.22 (95% CI: 2.07-2.34; standard deviation 1.02). Categorical κ was as follows: grade 1, 0.8541 ± 0.0289; grade 2, 0.7310 ± 0.0289; grade 3, 0.6997 ± 0.0289, and grade 4, 0.7760 ± 0.0289. The 25% (grades 1-4) inferior turbinate classification system is a validated grading scale with high intrarater and inter-rater reliability. This system can facilitate future research by tracking the effect of interventions on inferior turbinates. Level of evidence: 2c. © 2014 The American Laryngological, Rhinological and Otological Society, Inc.
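Cohen's kappa, reported above for inter-rater agreement, corrects the raw percentage of agreement for the agreement expected by chance. The sketch below shows the unweighted two-rater form only. It is not the study's analysis, which pooled 18 rater combinations, and the function name and grade lists are hypothetical.

```python
from collections import Counter

def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters grading the same cases."""
    n = len(rater_a)
    # Proportion of cases on which the two raters gave the same grade.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal grade frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[g] * count_b[g] for g in count_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical grades (1 to 4) assigned to 8 turbinates by two raters.
grades_a = [1, 2, 3, 4, 1, 2, 3, 4]
grades_b = [1, 2, 3, 4, 1, 2, 3, 3]
print(f"kappa = {cohen_kappa(grades_a, grades_b):.3f}")
```

In this toy example the raw agreement is 87.5%, while kappa is lower because a quarter of that agreement would be expected by chance, which is why the abstract reports both percentage agreement and κ.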
Li-Tsang, Cecilia W P; Wong, Agnes S K; Leung, Howard W H; Cheng, Joyce S; Chiu, Billy H W; Tse, Linda F L; Chung, Raymond C K
2013-09-01
More children have been diagnosed with specific learning difficulties in recent years as awareness of these conditions has grown. A diagnostic tool has been validated to screen for this condition in the population (SpLD test for Hong Kong children). However, for specific assessment of handwriting problems, there seems to be a lack of standardized and objective evaluation tools. The objective of this study was to validate the Chinese Handwriting Analysis System (CHAS), which is designed to measure both the process and production of handwriting. The construct validity, convergent validity, internal consistency and test-retest reliability of CHAS were analyzed using the data from 734 grade 1-6 students from 6 primary schools in Hong Kong. Principal Component Analysis revealed that measurements of CHAS loaded into 4 components which accounted for 77.73% of the variance. The correlation between the handwriting accuracy obtained from CHAS and eyeballing was r=.73. Cronbach's alpha of all measurement items was .65. Except for the SD of writing time per character, all the measurement items regarding handwriting speed, handwriting accuracy and pen pressure showed good to excellent test-retest reliability (r=.72-.96), while the measurement of the number of characters that exceeded the grid showed moderate reliability (r=.48). Although there are still ergonomic, biomechanical or unspecified aspects which may not be determined by the system, the CHAS can definitely assist therapists in identifying primary school students with handwriting problems and implementing interventions accordingly. Copyright © 2013 Elsevier Ltd. All rights reserved.
Torok, Kathryn S; Baker, Nancy A; Lucas, Mary; Domsic, Robyn T; Boudreau, Robert; Medsger, Thomas A
2010-01-01
To determine the reliability and validity of a new measure of finger motion in patients with systemic sclerosis (SSc), the 'delta finger-to-palm' (delta FTP) and compare its psychometric properties to the traditional measure of finger motion, the finger-to-palm (FTP). Phase 1: The reliability of the delta FTP and FTP were examined in 39 patients with SSc. Phase 2: Criterion and convergent construct validity of both measures were examined in 17 patients with SSc by comparing them to other clinical measures: Total Active Range of Motion (TAROM), Hand Mobility in Scleroderma (HAMIS), the Duruoz Hand Index (DHI), Health Assessment Questionnaire (HAQ), and modified Rodnan skin score (mRSS). Phase 3: Sensitivity to change of the delta FTP was investigated in 24 patients with early diffuse cutaneous SSc. Both measures had excellent intra-rater and inter-rater reliability (ICC 0.92 to 0.99). Fair to strong correlations (rs=0.49-0.94) were observed between the delta FTP and TAROM, HAMIS, and DHI. Fair to moderate correlations were observed between delta FTP and HAQ components related to hand function and upper extremity mRSS. Correlations of the traditional FTP with these measures were fair to strong, but most often the delta FTP outperformed the FTP. The effect size and standardised response mean for the mean delta FTP were 0.50 and 1.10 respectively, over a 2-8 month period. The delta FTP is a valid and reliable measure of finger motion in patients with SSc which outperforms the FTP.
Lei, Pingguang; Lei, Guanghe; Tian, Jianjun; Zhou, Zengfen; Zhao, Miao; Wan, Chonghua
2014-10-01
This paper aims to develop the irritable bowel syndrome (IBS) scale of the system of Quality of Life Instruments for Chronic Diseases (QLICD-IBS) by the modular approach and validate it by both classical test theory and generalizability theory. The QLICD-IBS was developed based on programmed decision procedures with multiple nominal and focus group discussions, in-depth interview, and quantitative statistical procedures. Data were collected from 112 inpatients with IBS, whose QOL was measured three times before and after treatments. The psychometric properties of the scale were evaluated with respect to validity, reliability, and responsiveness employing correlation analysis, factor analyses, multi-trait scaling analysis, t tests and also G studies and D studies of generalizability theory analysis. Multi-trait scaling analysis, correlation, and factor analyses confirmed good construct validity and criterion-related validity when using SF-36 as a criterion. Test-retest reliability coefficients (Pearson r and intra-class correlation (ICC)) for the overall score and all domains were higher than 0.80; the internal consistency α for all domains at two measurements was higher than 0.70 except for the social domain (0.55 and 0.67, respectively). The overall score and scores for all domains/facets had statistically significant changes after treatments with moderate or higher effect size standardized response mean (SRM) ranging from 0.72 to 1.02 at domain levels. G coefficients and index of dependability (Ф coefficients) confirmed the reliability of the scale further with more exact variance components. The QLICD-IBS has good validity, reliability, responsiveness, and some highlights and can be used as the quality of life instrument for patients with IBS.
Bertucci, W; Duc, S; Villerius, V; Pernin, J N; Grappe, F
2005-12-01
The SRM power measuring crank system is nowadays a popular device for cycling power output (PO) measurements in the field and in laboratories. The PowerTap (CycleOps, Madison, USA) is a more recent and less well-known device that allows mobile PO measurements of cycling via the rear wheel hub. The aim of this study is to test the validity and reliability of the PowerTap by comparing it with the most accurate (i.e. the scientific model) of the SRM system. The validity of the PowerTap is tested during i) sub-maximal incremental intensities (ranging from 100 to 420 W) on a treadmill with different pedalling cadences (45 to 120 rpm) and cycling positions (standing and seated) on different grades, ii) a continuous sub-maximal intensity lasting 30 min, iii) a maximal intensity (8-s sprint), and iv) real road cycling. The reliability is assessed by repeating ten times the sub-maximal incremental and continuous tests. The results show a good validity of the PowerTap during sub-maximal intensities between 100 and 450 W (mean PO difference -1.2 ± 1.3 %) when it is compared to the scientific SRM model, but less validity for the maximal PO during sprint exercise, where the validity appears to depend on the gear ratio. The reliability of the PowerTap during the sub-maximal intensities is similar to the scientific SRM model (the coefficient of variation is respectively 0.9 to 2.9 % and 0.7 to 2.1 % for PowerTap and SRM). The PowerTap must be considered as a suitable device for PO measurements during sub-maximal real road cycling and in sub-maximal laboratory tests.
Establishing inter-rater reliability scoring in a state trauma system.
Read-Allsopp, Christine
2004-01-01
Trauma systems rely on accurate Injury Severity Scoring (ISS) to describe trauma patient populations. Twenty-seven (27) Trauma Nurse Coordinators and Data Managers across the state of New South Wales, Australia trauma network were instructed in the uses and techniques of the Abbreviated Injury Scale (AIS) from the Association for the Advancement of Automotive Medicine. The aim is to provide accurate, reliable and valid data for the state trauma network. Four (4) months after the course a coding exercise was conducted to assess inter-rater reliability. The results show that inter-rater reliability is within accepted international standards.
Intratester Reliability and Construct Validity of a Hip Abductor Eccentric Strength Test.
Brindle, Richard A; Ebaugh, David; Milner, Clare E
2018-06-06
Side-lying hip abductor strength tests are commonly used to evaluate muscle strength. In a "break" test, the tester applies sufficient force to lower the limb to the table while the patient resists. The peak force is postulated to occur while the leg is lowering, thus representing the participant's eccentric muscle strength. However, it is unclear whether peak force occurs before or after the leg begins to lower. To determine intrarater reliability and construct validity of a hip abductor eccentric strength test. Intrarater reliability and construct validity study. Twenty healthy adults (26 [6] y; 1.66 [0.06] m; 62.2 [8.0] kg) made 2 visits to the laboratory at least 1 week apart. During the hip abductor eccentric strength test, a handheld dynamometer recorded peak force and time to peak force, and limb position was recorded via a motion capture system. Intrarater reliability was determined using intraclass correlation, SEM, and minimal detectable difference. Construct validity was assessed by determining if peak force occurred after the start of the lowering phase using a 1-sample t test. The hip abductor eccentric strength test had substantial intrarater reliability (intraclass correlation (3,3) = .88; 95% confidence interval, .65-.95), SEM of 0.9 %BWh, and a minimal detectable difference of 2.5 %BWh. Construct validity was established as peak force occurred 2.1 (0.6) seconds (range: 0.7-3.7 s) after the start of the lowering phase of the test (P ≤ .001). The hip abductor eccentric strength test is a valid and reliable measure of eccentric muscle strength. This test may be used clinically to assess changes in eccentric muscle strength over time.
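The abstract does not state how the SEM and minimal detectable difference (MDD) were derived. A common convention, assumed here, is SEM = SD × sqrt(1 − ICC) and MDD at 95% confidence = 1.96 × sqrt(2) × SEM. Under that assumption, the reported SEM of 0.9 %BWh yields an MDD close to the reported 2.5 %BWh. Function names are hypothetical.

```python
import math

def sem_from_icc(sd, icc):
    """Standard error of measurement from a between-subject SD and an ICC.

    Assumed convention: SEM = SD * sqrt(1 - ICC).
    """
    return sd * math.sqrt(1.0 - icc)

def minimal_detectable_difference(sem, z=1.96):
    """MDD at the confidence level implied by z (1.96 for 95%).

    Assumed convention: MDD = z * sqrt(2) * SEM, where sqrt(2) accounts
    for measurement error in both of the two sessions being compared.
    """
    return z * math.sqrt(2.0) * sem

# SEM taken from the abstract (0.9 %BWh).
mdd = minimal_detectable_difference(0.9)
print(f"MDD = {mdd:.2f} %BWh")
```

A change between two sessions smaller than the MDD cannot be distinguished from measurement error at the chosen confidence level.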
NASA Technical Reports Server (NTRS)
Chang, C. L.; Stachowitz, R. A.
1988-01-01
Software quality is of primary concern in all large-scale expert system development efforts. Building appropriate validation and test tools for ensuring software reliability of expert systems is therefore required. The Expert Systems Validation Associate (EVA) is a validation system under development at the Lockheed Artificial Intelligence Center. EVA provides a wide range of validation and test tools to check correctness, consistency, and completeness of an expert system. Testing is a major function of EVA. It means executing an expert system with test cases with the intent of finding errors. In this paper, we describe many different types of testing such as function-based testing, structure-based testing, and data-based testing. We describe how appropriate test cases may be selected in order to perform good and thorough testing of an expert system.
ERIC Educational Resources Information Center
Hawkinson, Laura E.; Quick, Heather E.; Muenchow, Susan; Anthony, Jennifer; Weinberg, Emily; Holod, Aleksandra; Parrish, Deborah; Meakin, John; Lee, Dong Hoon; Tarrant, Kate; Cannon, Jill S.; Zellman, Gail L.; Karoly, Lynn A.
2015-01-01
The first step in the Validity and Reliability Study summarizes the history and purpose of California's quality rating and improvement system (QRIS), reviews findings from other QRIS evaluation studies, and describes the approach to validating the system in California. The majority of this report focuses on providing context for the California…
Kenyon, Lisa K.; Elliott, James M; Cheng, M. Samuel
2016-01-01
Purpose/Background Despite the availability of various field-tests for many competitive sports, a reliable and valid test specifically developed for use in men's gymnastics has not yet been developed. The Men's Gymnastics Functional Measurement Tool (MGFMT) was designed to assess sport-specific physical abilities in male competitive gymnasts. The purpose of this study was to develop the MGFMT by establishing a scoring system for individual test items and to initiate the process of establishing test-retest reliability and construct validity. Methods A total of 83 competitive male gymnasts ages 7-18 underwent testing using the MGFMT. Thirty of these subjects underwent re-testing one week later in order to assess test-retest reliability. Construct validity was assessed using a simple regression analysis between total MGFMT scores and the gymnasts’ USA-Gymnastics competitive level to calculate the coefficient of determination (r²). Test-retest reliability was analyzed using Model 1 Intraclass correlation coefficients (ICC). Statistical significance was set at the p<0.05 level. Results The relationship between total MGFMT scores and subjects’ current USA-Gymnastics competitive level was found to be good (r² = 0.63). Reliability testing of the MGFMT composite test score showed excellent test-retest reliability over a one-week period (ICC = 0.97). Test-retest reliability of the individual component tests ranged from good to excellent (ICC = 0.75-0.97). Conclusions The results of this study provide initial support for the construct validity and test-retest reliability of the MGFMT. Level of Evidence Level 3 PMID:27999723
Maddali Bongi, S; Del Rosso, A; Miniati, I; Galluccio, F; Landi, G; Tai, G; Matucci-Cerinic, M
2012-09-01
In systemic sclerosis (SSc), mouth and face involvement leads to problems in oral health-related quality of life (OHRQoL). The Mouth Handicap in Systemic Sclerosis scale (MHISS) is a 12-item questionnaire specifically quantifying mouth disability in SSc, organized in 3 subscales. Our aim was to validate the Italian version of MHISS by assessing its test-retest reliability and internal and external consistency in Italian SSc patients. Forty SSc patients (7 dSSc, 33 lSSc; age and disease duration: 57.27 ± 11.41, 9.4 ± 4.4 years; 22 with sicca syndrome) were evaluated with MHISS. MHISS was translated following a forward-backward translation procedure, with independent translations and counter-translation. Test-retest reliability was evaluated, comparing the results of two administrations, with intraclass correlation coefficient (ICC). Internal consistency was assessed by Cronbach's α and external consistency by comparison with mouth opening. MHISS has a good test-retest reliability (ICC: 0.93) and internal consistency (Cronbach's α: 0.99). A good external consistency was confirmed by correlation with mouth opening (rho: -0.3869, p: 0.0137). Total MHISS score was 17.65 ± 5.20, with scores of subscale 1 (reduced mouth opening) of 6.60 ± 2.85 and scores of subscales 2 (sicca syndrome) and 3 (aesthetic concerns) of 7.82 ± 2.59 and 3.22 ± 1.14. Total and subscale 2 scores are higher in dSSc than in lSSc. This result may be due to the higher presence of sicca syndrome in dSSc than in lSSc (p = 0.0109). Our results support the validity and reliability of MHISS in Italian SSc patients for specifically measuring SSc OHRQoL.
Halim, Isa; Arep, Hambali; Kamat, Seri Rahayu; Abdullah, Rohana; Omar, Abdul Rahman; Ismail, Ahmad Rasdan
2014-06-01
Prolonged standing has been hypothesized as a vital contributor to discomfort and muscle fatigue in the workplace. The objective of this study was to develop a decision support system that could provide systematic analysis and solutions to minimize the discomfort and muscle fatigue associated with prolonged standing. The integration of object-oriented programming and a Model Oriented Simultaneous Engineering System were used to design the architecture of the decision support system. Validation of the decision support system was carried out in two manufacturing companies. The validation process showed that the decision support system produced reliable results. The decision support system is a reliable advisory tool for providing analysis and solutions to problems related to the discomfort and muscle fatigue associated with prolonged standing. Further testing of the decision support system is suggested before it is used commercially.
Halim, Isa; Arep, Hambali; Kamat, Seri Rahayu; Abdullah, Rohana; Omar, Abdul Rahman; Ismail, Ahmad Rasdan
2014-01-01
Background Prolonged standing has been hypothesized as a vital contributor to discomfort and muscle fatigue in the workplace. The objective of this study was to develop a decision support system that could provide systematic analysis and solutions to minimize the discomfort and muscle fatigue associated with prolonged standing. Methods The integration of object-oriented programming and a Model Oriented Simultaneous Engineering System were used to design the architecture of the decision support system. Results Validation of the decision support system was carried out in two manufacturing companies. The validation process showed that the decision support system produced reliable results. Conclusion The decision support system is a reliable advisory tool for providing analysis and solutions to problems related to the discomfort and muscle fatigue associated with prolonged standing. Further testing of the decision support system is suggested before it is used commercially. PMID:25180141
Stoyanova, Rumyana; Dimova, Rositsa; Tarnovska, Miglena; Boeva, Tatyana
2018-05-20
Patient safety (PS) is one of the essential elements of health care quality and a priority of healthcare systems in most countries. Thus the creation of validated instruments and the implementation of systems that measure patient safety are considered to be of great importance worldwide. The present paper aims to illustrate the process of linguistic validation, cross-cultural verification and adaptation of the Bulgarian version of the Hospital Survey on Patient Safety Culture (B-HSOPSC) and its test-retest reliability. The study design is cross-sectional. The HSOPSC questionnaire consists of 42 questions, grouped in 12 different subscales that measure patient safety culture. Internal consistency was assessed using Cronbach's alpha. The Wilcoxon signed-rank test and the split-half method were used; the Spearman-Brown coefficient was calculated. The overall Cronbach's alpha for B-HSOPSC is 0.918. Subscales 7 Staffing and 12 Overall perceptions of safety had the lowest coefficients. The high reliability of the instrument was confirmed by the Split-half method (0.97) and ICC-coefficient (0.95). The lowest values of Spearman-Brown coefficients were found in items A13 and A14. The study offers an analysis of the results of the linguistic validation of the B-HSOPSC and its test-retest reliability. The psychometric characteristics of the questions revealed good validity and reliability, except for two questions. In the future, the instrument will be administered to the target population in the main study so that the psychometric properties of the instrument can be verified.
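The two internal-consistency statistics named in this abstract can be sketched as follows. This is a minimal illustration using the textbook definitions of Cronbach's alpha and the Spearman-Brown correction for a split-half correlation; the function names and the response matrix are hypothetical, and the abstract does not describe how the authors split the items.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for items[respondent][item] (sample variances)."""
    k = len(items[0])                                   # number of items
    item_vars = [variance(col) for col in zip(*items)]  # variance per item
    total_var = variance([sum(row) for row in items])   # variance of total score
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def spearman_brown(r_half):
    """Step a half-test correlation up to an estimate of full-test reliability."""
    return 2 * r_half / (1 + r_half)


# Hypothetical responses of four respondents to three items (illustration only).
responses = [[4, 5, 4], [2, 2, 3], [5, 5, 5], [3, 4, 3]]
print(round(cronbach_alpha(responses), 3))
print(round(spearman_brown(0.5), 3))
```

Alpha rises as the items covary more strongly relative to their individual variances, so perfectly parallel items give a value of 1.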
A Validation of the Classroom Assessment Scoring System in Finnish Kindergartens
ERIC Educational Resources Information Center
Pakarinen, Eija; Lerkkanen, Marja-Kristiina; Poikkeus, Anna-Maija; Kiuru, Noona; Siekkinen, Martti; Rasku-Puttonen, Helena; Nurmi, Jari-Erik
2010-01-01
Research Findings: This study examined the validity and reliability of the Classroom Assessment Scoring System (CLASS; R. C. Pianta, K. M. La Paro, & B. K. Hamre, 2008) in Finnish kindergartens. A pair of trained observers used the CLASS to observe 49 kindergarten teachers (47 female, 2 male) on two different days. Questionnaires measuring…
Durability and Reliability of Large Diameter HDPE Pipe for Water Main Applications (Web Report 4485)
Research validates HDPE as a suitable material for use in municipal piping systems, and more research may help users maximize their understanding of its durability and reliability. Overall, corrosion resistance, hydraulic efficiency, flexibility, abrasion resistance, toughness, f...
McElhone, Kathleen; Abbott, Janice; Shelmerdine, Joanna; Bruce, Ian N; Ahmad, Yasmeen; Gordon, Caroline; Peers, Kate; Isenberg, David; Ferenkeh-Koroma, Ada; Griffiths, Bridget; Akil, Mohamed; Maddison, Peter; Teh, Lee-Suan
2007-08-15
To develop and validate a disease-specific health-related quality of life (HRQOL) instrument for adults with systemic lupus erythematosus (SLE). The work consisted of 6 stages. Stage 1 included item generation for questionnaire content from semistructured interviews with SLE patients. In stage 2 item selection for the draft questionnaire was performed by thematic analysis of the patient interview transcripts and expert panel agreement. In stage 3 the content validity of the draft questionnaire was assessed by patients completing the questionnaire and providing critical feedback. In stages 4 and 5 construct validity and internal reliability of the 3 versions of the LupusQoL were evaluated using principal component analysis with varimax rotation and Cronbach's alpha coefficients, respectively. In stage 6 discriminatory validity, concurrent validity, and test-retest reliability were evaluated. Stages 1, 2, and 3 resulted in a preliminary instrument containing 63 items. In stage 4, 8 domains were identified. This factor structure, accounting for 82% of the variance, was confirmed in stage 5. The domains and Cronbach's alpha coefficients were physical health (0.94), emotional health (0.94), body image (0.89), pain (0.92), planning (0.93), fatigue (0.88), intimate relationships (0.96), and burden to others (0.94). Discriminant validity was demonstrated for different levels of disease activity (British Isles Lupus Assessment Group Index) and damage (Systemic Lupus International Collaborating Clinics/American College of Rheumatology Damage Index). High correlations (r = 0.71-0.79) between comparable domains of the Short Form 36 and the LupusQoL assured acceptable concurrent validity. Good test-retest reliability (r = 0.72-0.93) was demonstrated. The LupusQoL is a validated SLE-specific HRQOL instrument with 34 items across 8 domains defined by patients as being important.
Iglesias-Parra, Maria Rosa; García-Guerrero, Alfonso; García-Mayor, Silvia; Kaknani-Uttumchandani, Shakira; León-Campos, Álvaro; Morales-Asencio, José Miguel
2015-07-01
To develop an evaluation system of clinical competencies for the practicum of nursing students based on the Nursing Interventions Classification (NIC). Psychometric validation study: the first two phases addressed definition and content validation, and the third phase consisted of a cross-sectional study for analyzing reliability. The study population was undergraduate nursing students and clinical tutors. Through the Delphi technique, 26 competencies and 91 interventions were isolated. Cronbach's α was 0.96. Factor analysis yielded 18 factors that explained 68.82% of the variance. Overall inter-item correlation was 0.26, and total-item correlation ranged between 0.66 and 0.19. A competency system for the nursing practicum, structured on the NIC, is a reliable method for assessing and evaluating clinical competencies. Further evaluations in other contexts are needed. The availability of standardized language systems in the nursing discipline supposes an ideal framework to develop the nursing curricula. © 2015 Sigma Theta Tau International.
Prowse, Ashleigh; Aslaksen, Berit; Kierkegaard, Marie; Furness, James; Gerdhem, Paul; Abbott, Allan
2017-01-01
AIM To investigate the reliability and concurrent validity of the Baseline® Body Level/Scoliosis meter for adolescent idiopathic scoliosis postural assessment in three anatomical planes. METHODS This is an observational reliability and concurrent validity study of adolescent referrals to the Orthopaedic department for scoliosis screening at Karolinska University Hospital, Stockholm, Sweden between March-May 2012. A total of 31 adolescents with idiopathic scoliosis (13.6 ± 0.6 years old) of mild-moderate curvatures (25° ± 12°) were consecutively recruited. Measurement of cervical, thoracic and lumbar curvatures, pelvic and shoulder tilt, and axial thoracic rotation (ATR) were performed by two trained physiotherapists in one day. The intraclass correlation coefficient (ICC) was used to determine the inter-examiner reliability (ICC2,1) and the intra-rater reliability (ICC3,3) of the Baseline® Body Level/Scoliosis meter. Spearman’s correlation analyses were used to estimate concurrent validity between the Baseline® Body Level/Scoliosis meter and Gold Standard Cobb angles from radiographs and the Orthopaedic Systems Inc. Scoliometer. RESULTS There was excellent reliability between examiners for thoracic kyphosis (ICC2,1 = 0.94), ATR (ICC2,1 = 0.92) and lumbar lordosis (ICC2,1 = 0.79). There was adequate reliability between examiners for cervical lordosis (ICC2,1 = 0.51), however poor reliability for pelvic and shoulder tilt. Both devices were reproducible in the measurement of ATR when repeated by one examiner (ICC3,3 0.98-1.00). The device had a good correlation with the Scoliometer (rho = 0.78). When compared with Cobb angle from radiographs, there was a moderate correlation for ATR (rho = 0.627). CONCLUSION The Baseline® Body Level/Scoliosis meter provides reliable transverse and sagittal cervical, thoracic and lumbar measurements and valid transverse plane measurements of mild-moderate scoliosis deformity. PMID:28144582
Systematic review of methods for quantifying teamwork in the operating theatre
Marshall, D.; Sykes, M.; McCulloch, P.; Shalhoub, J.; Maruthappu, M.
2018-01-01
Background Teamwork in the operating theatre is becoming increasingly recognized as a major factor in clinical outcomes. Many tools have been developed to measure teamwork. Most fall into two categories: self‐assessment by theatre staff and assessment by observers. A critical and comparative analysis of the validity and reliability of these tools is lacking. Methods MEDLINE and Embase databases were searched following PRISMA guidelines. Content validity was assessed using measurements of inter‐rater agreement, predictive validity and multisite reliability, and interobserver reliability using statistical measures of inter‐rater agreement and reliability. Quantitative meta‐analysis was deemed unsuitable. Results Forty‐eight articles were selected for final inclusion; self‐assessment tools were used in 18 and observational tools in 28, and there were two qualitative studies. Self‐assessment of teamwork by profession varied with the profession of the assessor. The most robust self‐assessment tool was the Safety Attitudes Questionnaire (SAQ), although this failed to demonstrate multisite reliability. The most robust observational tool was the Non‐Technical Skills (NOTECHS) system, which demonstrated both test–retest reliability (P > 0·09) and interobserver reliability (Rwg = 0·96). Conclusion Self‐assessment of teamwork by the theatre team was influenced by professional differences. Observational tools, when used by trained observers, circumvented this.
Devoogdt, Nele; De Groef, An; Hendrickx, Ad; Damstra, Robert; Christiaansen, Anke; Geraerts, Inge; Vervloesem, Nele; Vergote, Ignace; Van Kampen, Marijke
2014-05-01
Patients may develop primary (congenital) or secondary (acquired) lymphedema, causing significant physical and psychosocial problems. To plan treatment for lymphedema and monitor a patient's progress, swelling, and problems in functioning associated with lymphedema development should be assessed at baseline and follow-up. The purpose of this study was to investigate the reliability (test-retest, internal consistency, and measurement variability) and validity (content and construct) of data obtained with the Lymphoedema Functioning, Disability and Health Questionnaire for Lower Limb Lymphoedema (Lymph-ICF-LL). This was a multicenter, cross-sectional study. The Lymph-ICF-LL is a descriptive, evaluative tool containing 28 questions about impairments in function, activity limitations, and participation restrictions in patients with lower limb lymphedema. The questionnaire has 5 domains: physical function, mental function, general tasks/household activities, mobility activities, and life domains/social life. The reliability and validity of the Lymph-ICF-LL were examined in 30 participants with objective lower limb lymphedema. Intraclass correlation coefficients for test-retest reliability ranged from .69 to .94, and Cronbach alpha coefficients for internal consistency ranged from .82 to .97. Measurement variability was acceptable (standard error of measurement=5.9-12.6). Content validity was good because all questions were understandable for 93% of participants, the scoring system (visual analog scale) was clear, and the questionnaire was comprehensive for 90% of participants. Construct validity was good. All hypotheses for assessing convergent validity and divergent validity were accepted. The known-groups validity and responsiveness of the Dutch Lymph-ICF-LL and the cross-cultural validity of the English version of the Lymph-ICF-LL were not investigated. 
The Lymph-ICF-LL is a Dutch questionnaire with evidence of reliability and validity for assessing impairments in function, activity limitations, and participation restrictions in people with primary or secondary lower limb lymphedema.
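The measurement variability reported above is expressed as a standard error of measurement (SEM). The abstract does not state which formula was used; the sketch below assumes the common definition SEM = SD × sqrt(1 − ICC), and the function name and input values are hypothetical.

```python
from math import sqrt


def standard_error_of_measurement(sd, icc):
    """SEM from the between-subject SD and a reliability coefficient (assumed formula)."""
    return sd * sqrt(1 - icc)


# Hypothetical inputs (illustration only): SD of 20 points, ICC of 0.91.
print(round(standard_error_of_measurement(20.0, 0.91), 2))
```

Under this definition a higher ICC shrinks the SEM for the same score spread, which is why the two statistics are usually reported together.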
Park, Hee-Won; Baek, Sora; Kim, Hong Young; Park, Jung-Gyoo; Kang, Eun Kyoung
2017-10-01
To investigate the reliability and validity of a new method for isometric back extensor strength measurement using a portable dynamometer. A chair equipped with a small portable dynamometer was designed (Power Track II Commander Muscle Tester). A total of 15 men (mean age, 34.8±7.5 years) and 15 women (mean age, 33.1±5.5 years) with no current back problems or previous history of back surgery were recruited. Subjects were asked to push the back of the chair while seated, and their isometric back extensor strength was measured by the portable dynamometer. Test-retest reliability was assessed with intraclass correlation coefficient (ICC). For the validity assessment, isometric back extensor strength of all subjects was measured by a widely used physical performance evaluation instrument, BTE PrimusRS system. The limit of agreement (LoA) from the Bland-Altman plot was evaluated between two methods. The test-retest reliability was excellent (ICC=0.82; 95% confidence interval, 0.65-0.91). The Bland-Altman plots demonstrated acceptable agreement between the two methods: the lower 95% LoA was -63.1 N and the upper 95% LoA was 61.1 N. This study shows that isometric back extensor strength measurement using a portable dynamometer has good reliability and validity.
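The Bland-Altman limits of agreement mentioned in this abstract are conventionally computed as the mean difference between two methods plus or minus 1.96 standard deviations of the differences. The following is a minimal sketch under that assumption; the function name and the paired readings are hypothetical and are not the study's data.

```python
from statistics import mean, stdev


def limits_of_agreement(method_a, method_b):
    """Return (lower LoA, bias, upper LoA) for paired measurements."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)      # mean difference between methods
    sd = stdev(diffs)       # sample SD of the differences
    return bias - 1.96 * sd, bias, bias + 1.96 * sd


# Hypothetical paired strength readings in newtons (illustration only).
portable = [100, 110, 120, 130]
reference = [98, 112, 119, 133]
lower, bias, upper = limits_of_agreement(portable, reference)
print(round(lower, 2), round(bias, 2), round(upper, 2))
```

If the reported limits (-63.1 N and 61.1 N) were obtained this way, they would correspond to a mean difference of about -1.0 N and a standard deviation of the differences of roughly 31.7 N.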
Brackley, Victoria; Ball, Kevin; Tor, Elaine
2018-05-12
The effectiveness of the swimming turn is highly influential to overall performance in competitive swimming. The push-off or wall contact, within the turn phase, is directly involved in determining the speed the swimmer leaves the wall. Therefore, it is paramount to develop reliable methods to measure the wall-contact-time during the turn phase for training and research purposes. The aim of this study was to determine the concurrent validity and reliability of the Pool Pad App to measure wall-contact-time during the freestyle and backstroke tumble turn. The wall-contact-times of nine elite and sub-elite participants were recorded during their regular training sessions. Concurrent validity statistics included the standardised typical error estimate, linear analysis and effect sizes while the intraclass correlation coefficient (ICC) was used for the reliability statistics. The standardised typical error estimate resulted in a moderate Cohen's d effect size with an R² value of 0.80 and the ICC between the Pool Pad and 2D video footage was 0.89. Despite these measurement differences, the results from this concurrent validity and reliability analyses demonstrated that the Pool Pad is suitable for measuring wall-contact-time during the freestyle and backstroke tumble turn within a training environment.
Karanikola, Maria N K; Papathanassoglou, Elizabeth D E
2015-02-01
The Index of Work Satisfaction (IWS) is a comprehensive scale assessing nurses' professional satisfaction. The aim of the present study was to explore: a) the applicability, reliability and validity of the Greek version of the IWS and b) contrasts among the factors addressed by IWS against the main themes emerging from a qualitative phenomenological investigation of nurses' professional experiences. A descriptive correlational design was applied using a sample of 246 emergency and critical care nurses. Internal consistency and test-retest reliability were tested. Construct and content validity were assessed by factor analysis, and through qualitative phenomenological analysis with a purposive sample of 12 nurses. Scale factors were contrasted to qualitative themes to assure that IWS embraces all aspects of Greek nurses' professional satisfaction. The internal consistency (α = 0.81) and test-retest (tau = 1, p < 0.0001) reliability were adequate. Following appropriate modifications, factor analysis confirmed the construct validity of the scale and subscales. The qualitative data partially clarified the low reliability of one subscale. The Greek version of the IWS scale is supported for use in acute care. The mixed methods approach constitutes a powerful tool for transferring scales to different cultures and healthcare systems. Copyright © 2014 Elsevier Inc. All rights reserved.
Are Validity and Reliability "Relevant" in Qualitative Evaluation Research?
ERIC Educational Resources Information Center
Goodwin, Laura D.; Goodwin, William L.
1984-01-01
The views of prominent qualitative methodologists on the appropriateness of validity and reliability estimation for the measurement strategies employed in qualitative evaluations are summarized. A case is made for the relevance of validity and reliability estimation. Definitions of validity and reliability for qualitative measurement are presented…
NASA Technical Reports Server (NTRS)
Sizlo, T. R.; Berg, R. A.; Gilles, D. L.
1979-01-01
An augmentation system for a 230 passenger, twin engine aircraft designed with a relaxation of conventional longitudinal static stability was developed. The design criteria are established and candidate augmentation system control laws and hardware architectures are formulated and evaluated with respect to reliability, flying qualities, and flight path tracking performance. The selected systems are shown to satisfy the interpreted regulatory safety and reliability requirements while maintaining the present DC 10 (study baseline) level of maintainability and reliability for the total flight control system. The impact of certification of the relaxed static stability augmentation concept is also estimated with regard to affected federal regulations, system validation plan, and typical development/installation costs.
Reliability and validity in a nutshell.
Bannigan, Katrina; Watson, Roger
2009-12-01
To explore and explain the different concepts of reliability and validity as they are related to measurement instruments in social science and health care. There are different concepts contained in the terms reliability and validity and these are often explained poorly and there is often confusion between them. To develop some clarity about reliability and validity a conceptual framework was built based on the existing literature. The concepts of reliability, validity and utility are explored and explained. Reliability contains the concepts of internal consistency and stability and equivalence. Validity contains the concepts of content, face, criterion, concurrent, predictive, construct, convergent (and divergent), factorial and discriminant. In addition, for clinical practice and research, it is essential to establish the utility of a measurement instrument. To use measurement instruments appropriately in clinical practice, the extent to which they are reliable, valid and usable must be established.
Demonstrating the Alaska Ocean Observing System in Prince William Sound
NASA Astrophysics Data System (ADS)
Schoch, G. Carl; McCammon, Molly
2013-07-01
The Alaska Ocean Observing System and the Oil Spill Recovery Institute developed a demonstration project over a 5 year period in Prince William Sound. The primary goal was to develop a quasi-operational system that delivers weather and ocean information in near real time to diverse user communities. This observing system now consists of atmospheric and oceanic sensors, and a new generation of computer models to numerically simulate and forecast weather, waves, and ocean circulation. A state of the art data management system provides access to these products from one internet portal at http://www.aoos.org. The project culminated in a 2009 field experiment that evaluated the observing system and performance of the model forecasts. Observations from terrestrial weather stations and weather buoys validated atmospheric circulation forecasts. Observations from wave gages on weather buoys validated forecasts of significant wave heights and periods. There was an emphasis on validation of surface currents forecasted by the ocean circulation model for oil spill response and search and rescue applications. During the 18 day field experiment a radar array mapped surface currents and drifting buoys were deployed. Hydrographic profiles at fixed stations, and by autonomous vehicles along transects, were made to acquire measurements through the water column. Terrestrial weather stations were the most reliable and least costly to operate, and in situ ocean sensors were more costly and considerably less reliable. The radar surface current mappers were the least reliable and most costly but provided the assimilation and validation data that most improved ocean circulation forecasts. We describe the setting of Prince William Sound and the various observational platforms and forecast models of the observing system, and discuss recommendations for future development.
Validity, Reliability, and Inertia of Four Different Temperature Capsule Systems.
Bongers, Coen C W G; Daanen, Hein A M; Bogerd, Cornelis P; Hopman, Maria T E; Eijsvogels, Thijs M H
2018-01-01
Telemetric temperature capsule systems are wireless, relatively noninvasive, and easily applicable in field conditions and have therefore great advantages for monitoring core body temperature. However, the accuracy and responsiveness of available capsule systems have not been compared previously. Therefore, the aim of this study was to examine the validity, reliability, and inertia characteristics of four ingestible temperature capsule systems (i.e., CorTemp, e-Celsius, myTemp, and VitalSense). Ten temperature capsules were examined for each system in a temperature-controlled water bath during three trials. The water bath temperature gradually increased from 33°C to 44°C in trials 1 and 2 to assess the validity and reliability, and from 36°C to 42°C in trial 3 to assess the inertia characteristics of the temperature capsules. A systematic difference between capsule and water bath temperature was found for CorTemp (0.077°C ± 0.040°C), e-Celsius (-0.081°C ± 0.055°C), myTemp (-0.003°C ± 0.006°C), and VitalSense (-0.017°C ± 0.023°C; P < 0.010), with the lowest bias for the myTemp system (P < 0.001). A systematic difference was found between trial 1 and trial 2 for CorTemp (0.017°C ± 0.083°C; P = 0.030) and e-Celsius (-0.007°C ± 0.033°C; P = 0.019), whereas temperature values of myTemp (0.001°C ± 0.008°C) and VitalSense (0.002°C ± 0.014°C) did not differ (P > 0.05). Comparable inertia characteristics were found for CorTemp (25 ± 4 s), e-Celsius (21 ± 13 s), and myTemp (19 ± 2 s), whereas the VitalSense system responded more slowly (39 ± 6 s) to changes in water bath temperature (P < 0.001). Although differences in temperature and inertia were observed between capsule systems, an excellent validity, test-retest reliability, and inertia was found for each system between 36°C and 44°C after removal of outliers.
Zbrozek, Arthur; Hebert, Joy; Gogates, Gregory; Thorell, Rod; Dell, Christopher; Molsen, Elizabeth; Craig, Gretchen; Grice, Kenneth; Kern, Scottie; Hines, Sheldon
2013-06-01
Outcomes research literature has many examples of high-quality, reliable patient-reported outcome (PRO) data entered directly by electronic means, ePRO, compared to data entered from original results on paper. Clinical trial managers are increasingly using ePRO data collection for PRO-based end points. Regulatory review dictates the rules to follow with ePRO data collection for medical label claims. A critical component for regulatory compliance is evidence of the validation of these electronic data collection systems. Validation of electronic systems is a process versus a focused activity that finishes at a single point in time. Eight steps need to be described and undertaken to qualify the validation of the data collection software in its target environment: requirements definition, design, coding, testing, tracing, user acceptance testing, installation and configuration, and decommissioning. These elements are consistent with recent regulatory guidance for systems validation. This report was written to explain how the validation process works for sponsors, trial teams, and other users of electronic data collection devices responsible for verifying the quality of the data entered into relational databases from such devices. It is a guide on the requirements and documentation needed from a data collection systems provider to demonstrate systems validation. It is a practical source of information for study teams to ensure that ePRO providers are using system validation and implementation processes that will ensure the systems and services: operate reliably when in practical use; produce accurate and complete data and data files; support management control and comply with any existing regulations. Furthermore, this short report will increase user understanding of the requirements for a technology review leading to more informed and balanced recommendations or decisions on electronic data collection methods. 
Copyright © 2013 International Society for Pharmacoeconomics and Outcomes Research (ISPOR). Published by Elsevier Inc. All rights reserved.
Llorens, Roberto; Latorre, Jorge; Noé, Enrique; Keshner, Emily A
2016-01-01
Posturography systems that incorporate force platforms are considered to assess balance and postural control with greater sensitivity and objectivity than conventional clinical tests. The Wii Balance Board (WBB) system has been shown to have similar performance characteristics as other force platforms, but with lower cost and size. To determine the validity and reliability of a freely available WBB-based posturography system that combined the WBB with several traditional balance assessments, and to assess the performance of a cohort of stroke individuals with respect to healthy individuals. Healthy subjects and individuals with stroke were recruited. Both groups were assessed using the WBB-based posturography system. Individuals with stroke were also assessed using a laboratory grade posturography system and a battery of clinical tests to determine the concurrent validity of the system. A group of subjects were assessed twice with the WBB-based system to determine its reliability. A total of 144 healthy individuals and 53 individuals with stroke participated in the study. Concurrent validity with another posturography system was moderate to high. Correlations with clinical scales were consistent with previous research. The reliability of the system was excellent in almost all measures. In addition, the system successfully characterized individuals with stroke with respect to the healthy population. The WBB-based posturography system exhibited excellent psychometric properties and sensitivity for identifying balance performance of individuals with stroke in comparison with healthy subjects, which supports feasibility of the system as a clinical tool. Copyright © 2015 Elsevier B.V. All rights reserved.
NASA Technical Reports Server (NTRS)
Shin, Jong-Yeob; Belcastro, Christine
2008-01-01
Formal robustness analysis of aircraft control upset prevention and recovery systems could play an important role in their validation and ultimate certification. As a part of the validation process, this paper describes an analysis method for determining a reliable flight regime in the flight envelope within which an integrated resilient control system can achieve the desired performance of tracking command signals and detecting additive faults in the presence of parameter uncertainty and unmodeled dynamics. To calculate a reliable flight regime, a structured singular value analysis method is applied to analyze the closed-loop system over the entire flight envelope. To use the structured singular value analysis method, a linear fractional transform (LFT) model of a transport aircraft's longitudinal dynamics is developed over the flight envelope by using a preliminary LFT modeling software tool developed at the NASA Langley Research Center, which utilizes a matrix-based computational approach. The developed LFT model can capture original nonlinear dynamics over the flight envelope with the Δ block which contains key varying parameters: angle of attack and velocity, and real parameter uncertainty: aerodynamic coefficient uncertainty and moment of inertia uncertainty. Using the developed LFT model and a formal robustness analysis method, a reliable flight regime is calculated for a transport aircraft closed-loop system.
Losa-Iglesias, Marta Elena; Becerro-de-Bengoa-Vallejo, Ricardo; Becerro-de-Bengoa-Losa, Klark Ricardo
2016-06-01
There are downloadable applications (Apps) for cell phones that can measure heart rate in a simple and painless manner. The aim of this study was to assess the reliability of this type of App for a Smartphone using an Android system, compared to the radial pulse and a portable pulse oximeter. We performed a pilot observational study of diagnostic accuracy, randomized in 46 healthy volunteers. The patients' demographic data and cardiac pulse were collected. Radial pulse was measured by palpation of the radial artery with three fingers at the wrist over the radius; a low-cost portable, liquid crystal display finger pulse oximeter; and a Heart Rate Plus for Samsung Galaxy Note®. This study demonstrated high reliability and consistency between systems with respect to the heart rate parameter of healthy adults using three systems. For all parameters, ICC was > 0.93, indicating excellent reliability. Moreover, CVME values for all parameters were between 1.66-4.06 %. We found significant correlation coefficients and no systematic differences between radial pulse palpation and pulse oximeter and a high precision. Low-cost pulse oximeter and App systems can serve as valid instruments for the assessment of heart rate in healthy adults. © The Author(s) 2014.
Validity and reliability of the Ergomopro powermeter.
Kirkland, A; Coleman, D; Wiles, J D; Hopker, J
2008-11-01
The aim of this investigation was to assess the validity and reliability of the Ergomopro powermeter. Nine participants completed trials on a Monark ergometer fitted with Ergomopro and SRM powermeters simultaneously recording power output. Each participant completed multiple trials at power outputs ranging from 50 to 450 W. The work stages recorded were 60 s in duration and were repeated three times. Participants also completed a single trial on a cycle ergometer designed to assess bilateral contributions to work output (Lode Excaliber Sport PFM). The power output during the trials was significantly different between all three systems, (p < 0.01) 231.2 +/- 114.2 W, 233.0 +/- 112.4 W, 227.8 +/- 108.8 W for the Monark, SRM and Ergomopro system, respectively. When the bilateral contributions were factored into the analysis, there were no significant differences between the powermeters (p = 0.58). The reliability of the Ergomopro system (CV%) was 2.31 % (95 % CI 2.13 - 2.52 %) compared to 1.59 % (95 % CI 1.47 to 1.74 %) for the Monark, and 1.37 % (95 % CI 1.26 - 1.50 %) for the SRM powermeter. These results indicate that the Ergomopro system has acceptable accuracy under these conditions. However, based on the reliability data, the increased variability of the Ergomopro system and bilateral balance issues have to be considered when using this device.
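The reliability figures above are coefficients of variation (CV%). The abstract does not say how the CV was pooled across participants and work stages, so the following is only a minimal sketch of the per-stage calculation, assuming the sample standard deviation of repeated readings divided by their mean; the function name and the readings are hypothetical.

```python
from statistics import mean, stdev


def cv_percent(trials):
    """Coefficient of variation (%) of repeated power readings at one work stage."""
    if len(trials) < 2:
        raise ValueError("need at least two repeated trials")
    avg = mean(trials)
    if avg == 0:
        raise ValueError("mean power is zero; CV is undefined")
    # Sample standard deviation relative to the mean, expressed as a percentage.
    return 100.0 * stdev(trials) / avg


# Hypothetical readings (W) from three repeats of one 60 s stage.
print(round(cv_percent([200.0, 204.0, 196.0]), 2))  # prints 2.0
```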
Reliability of the AMA Guides to the Evaluation of Permanent Impairment.
Forst, Linda; Friedman, Lee; Chukwu, Abraham
2010-12-01
AMA's Guides to the Evaluation of Permanent Impairment is used to rate loss of function and determine compensation and ability to work after injury or illness; however, there are few studies that evaluate reliability or construct validity. To evaluate the reliability of the fifth and sixth editions for back injury; to determine best methods for further study. Intra-class correlation coefficients within and between raters were relatively high. There was wider variability for individual cases. Impairment ratings were lower and correlated less well for the sixth edition, though confidence intervals overlapped. The sixth edition may not be an improvement over the fifth. A research agenda should include investigations of reliability and construct validity for different body sites and organ systems along the entire rating scale and among different categories of raters.
Valente, Ana Rita S; Hall, Andreia; Alvelos, Helena; Leahy, Margaret; Jesus, Luis M T
2018-04-12
The appropriate use of language in context depends on the speaker's pragmatic language competencies. A coding system was used to develop a specific, adult-focused, self-administered questionnaire for adults who stutter and adults who do not stutter, The Assessment of Language Use in Social Contexts for Adults, with three categories: precursors, basic exchanges, and extended literal/non-literal discourse. This paper presents the content validity, item analysis, reliability coefficients and evidence of construct validity of the instrument. Content validity analysis was based on a two-stage process: first, 11 pragmatic questionnaires were assessed to identify items that probe each pragmatic competency and to create the first version of the instrument; second, items were assessed qualitatively by an expert panel composed of adults who stutter and controls, and quantitatively and qualitatively by an expert panel composed of clinicians. A pilot study was conducted with five adults who stutter and five controls to analyse items and calculate reliability. Construct validity evidence was obtained using the hypothesized relationships method and factor analysis with 28 adults who stutter and 28 controls. Concerning content validity, the questionnaires assessed up to 13 pragmatic competencies. Qualitative and quantitative analysis revealed ambiguities in item construction. Disagreement between experts was resolved through item modification. The pilot study showed that the instrument presented internal consistency and temporal stability. Significant differences between adults who stutter and controls and different response profiles revealed the instrument's underlying construct. The instrument is reliable and presented evidence of construct validity.
González-Chordá, Víctor M; Mena-Tudela, Desirée; Salas-Medina, Pablo; Cervera-Gasch, Agueda; Orts-Cortés, Isabel; Maciá-Soler, Loreto
2016-02-01
Writing a bachelor thesis (BT) is the last step to obtain a nursing degree. In order to perform an effective assessment of a nursing BT, certain reliable and valid tools are required. To develop and validate a 3-rubric system (drafting process, dissertation, and viva) to assess final year nursing students' BT. A multi-disciplinary study of content validity and psychometric properties. The study was carried out between December 2014 and July 2015. Nursing Degree at Universitat Jaume I, Spain. Eleven experts (9 nursing professors and 2 education professors from 6 different universities) took part in the development and content validity stages. Fifty-two theses presented during the 2014-2015 academic year were included by consecutive sampling of cases in order to study the psychometric properties. First, a group of experts was created to validate the content of the assessment system based on three rubrics (drafting process, dissertation, and viva). Subsequently, a reliability and validity study of the rubrics was carried out on the 52 theses presented during the 2014-2015 academic year. The BT drafting process rubric has 8 criteria (S-CVI=0.93; α=0.837; ICC=0.614), the dissertation rubric has 7 criteria (S-CVI=0.9; α=0.893; ICC=0.74), and the viva rubric has 4 criteria (S-CVI=0.86; α=0.816; ICC=0.895). A nursing BT assessment system based on three rubrics (drafting process, dissertation, and viva) has been validated. This system may be transferred to other nursing degrees or degrees from other academic areas. It is necessary to continue with the validation process taking into account factors that may affect the results obtained. Copyright © 2015 Elsevier Ltd. All rights reserved.
Dong, Ren G; Welcome, Daniel E; McDowell, Thomas W; Wu, John Z
2013-11-25
The relationship between the vibration transmissibility and driving-point response functions (DPRFs) of the human body is important for understanding vibration exposures of the system and for developing valid models. This study identified their theoretical relationship and demonstrated that the sum of the DPRFs can be expressed as a linear combination of the transmissibility functions of the individual mass elements distributed throughout the system. The relationship is verified using several human vibration models. This study also clarified the requirements for reliably quantifying transmissibility values used as references for calibrating the system models. As an example application, this study used the developed theory to perform a preliminary analysis of the method for calibrating models using both vibration transmissibility and DPRFs. The results of the analysis show that the combined method can theoretically result in a unique and valid solution of the model parameters, at least for linear systems. However, the validation of the method itself does not guarantee the validation of the calibrated model, because the validation of the calibration also depends on the model structure and the reliability and appropriate representation of the reference functions. The basic theory developed in this study is also applicable to the vibration analyses of other structures.
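The stated relationship can be illustrated for the simplest case. This is a sketch under assumptions that are not in the abstract: a uniaxial, lumped-parameter model with a single driving point, where $m_i$ is the mass of element $i$, $a_i(\omega)$ its acceleration, $a_0(\omega)$ the driving-point acceleration, and $T_i = a_i / a_0$ its transmissibility. Newton's second law applied to the whole system gives the dynamic driving-point force as the sum of the inertial forces, so the apparent mass (one form of DPRF) is a mass-weighted sum of transmissibilities:

```latex
\begin{align}
F(\omega) &= \sum_{i=1}^{n} m_i \, a_i(\omega), \\
M(\omega) &= \frac{F(\omega)}{a_0(\omega)}
           = \sum_{i=1}^{n} m_i \, \frac{a_i(\omega)}{a_0(\omega)}
           = \sum_{i=1}^{n} m_i \, T_i(\omega).
\end{align}
```

Any mass rigidly attached at the driving point enters with $T_i = 1$. The multi-point case described in the abstract sums the DPRFs over the driving points in the same spirit.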
Reliability and Validity Evidence of Multiple Balance Assessments in Athletes With a Concussion
Murray, Nicholas; Salvatore, Anthony; Powell, Douglas; Reed-Jones, Rebecca
2014-01-01
Context: An estimated 300 000 sport-related concussion injuries occur in the United States annually. Approximately 30% of individuals with concussions experience balance disturbances. Common methods of balance assessment include the Clinical Test of Sensory Organization and Balance (CTSIB), the Sensory Organization Test (SOT), the Balance Error Scoring System (BESS), and the Romberg test; however, the National Collegiate Athletic Association recommended the Wii Fit as an alternative measure of balance in athletes with a concussion. A central concern regarding the implementation of the Wii Fit is whether it is reliable and valid for measuring balance disturbance in athletes with concussion. Objective: To examine the reliability and validity evidence for the CTSIB, SOT, BESS, Romberg test, and Wii Fit for detecting balance disturbance in athletes with a concussion. Data Sources: Literature considered for review included publications with reliability and validity data for the assessments of balance (CTSIB, SOT, BESS, Romberg test, and Wii Fit) from PubMed, PsycINFO, and CINAHL. Data Extraction: We identified 63 relevant articles for consideration in the review. Of the 63 articles, 28 were considered appropriate for inclusion and 35 were excluded. Data Synthesis: No current reliability or validity information supports the use of the CTSIB, SOT, Romberg test, or Wii Fit for balance assessment in athletes with a concussion. The BESS demonstrated moderate to high reliability (interclass correlation coefficient = 0.87) and low to moderate validity (sensitivity = 34%, specificity = 87%). However, the Romberg test and Wii Fit have been shown to be reliable tools in the assessment of balance in Parkinson patients. Conclusions: The BESS can evaluate balance problems after a concussion. However, it lacks the ability to detect balance problems after the third day of recovery. Further investigation is needed to establish the use of the CTSIB, SOT, Romberg test, and Wii Fit for assessing balance in athletes with concussions. PMID:24933431
Lang, Jason M; Connell, Christian M
2017-05-01
Childhood exposure to trauma, including violence and abuse, is a major public health concern that has resulted in increased efforts to promote trauma-informed child-serving systems. Trauma screening is an important component of such trauma-informed systems, yet widespread use of trauma screening is rare in part due to the lack of brief, validated trauma screening measures for children. We describe development and validation of the Child Trauma Screen (CTS), a 10-item screening measure of trauma exposure and posttraumatic stress disorder (PTSD) symptoms for children consistent with the DSM-5 definition of PTSD. Study 1 describes measure development incorporating analysis to derive items based on existing measures from 1,065 children and caregivers together with stakeholder input to finalize item selection. Study 2 describes validation of the CTS with a clinical sample of 74 children and their caregivers. Results support the CTS as an empirically derived, reliable measure to screen children for trauma exposure and PTSD symptoms with strong convergent, divergent, and criterion validity. The CTS is a promising measure for rapidly and reliably screening children for trauma exposure and PTSD symptoms. Future research is needed to confirm validation and to examine feasibility and utility of its use across various child-serving systems. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
1992-04-01
contractor’s existing data collection, analysis and corrective action system shall be utilized, with modification only as necessary to meet the...either from test or from analysis of field data . The procedures of MIL-STD-756B assume that the reliability of a 18 DEFINE IDENTIFY SOFTWARE LIFE CYCLE...to generate sufficient data to report a statistically valid reliability figure for a class of software. Casual data gathering accumulates data more
Ethical Implications of Validity-vs.-Reliability Trade-Offs in Educational Research
ERIC Educational Resources Information Center
Fendler, Lynn
2016-01-01
In educational research that calls itself empirical, the relationship between validity and reliability is that of trade-off: the stronger the bases for validity, the weaker the bases for reliability (and vice versa). Validity and reliability are widely regarded as basic criteria for evaluating research; however, there are ethical implications of…
Comprehensive Design Reliability Activities for Aerospace Propulsion Systems
NASA Technical Reports Server (NTRS)
Christenson, R. L.; Whitley, M. R.; Knight, K. C.
2000-01-01
This technical publication describes the methodology, model, software tool, input data, and analysis result that support aerospace design reliability studies. The focus of these activities is on propulsion systems mechanical design reliability. The goal of these activities is to support design from a reliability perspective. Paralleling performance analyses in schedule and method, this requires the proper use of metrics in a validated reliability model useful for design, sensitivity, and trade studies. Design reliability analysis in this view is one of several critical design functions. A design reliability method is detailed and two example analyses are provided-one qualitative and the other quantitative. The use of aerospace and commercial data sources for quantification is discussed and sources listed. A tool that was developed to support both types of analyses is presented. Finally, special topics discussed include the development of design criteria, issues of reliability quantification, quality control, and reliability verification.
What to Do With "Moderate" Reliability and Validity Coefficients?
Post, Marcel W
2016-07-01
Clinimetric studies may use criteria for test-retest reliability and convergent validity such that correlation coefficients as low as .40 are supportive of reliability and validity. It can be argued that moderate (.40-.60) correlations should not be interpreted in this way and that reliability coefficients <.70 should be considered as indicative of unreliability. Convergent validity coefficients in the .40 to .60 or .40 to .70 range should be considered as indications of validity problems, or as inconclusive at best. Studies on reliability and convergent validity should be designed in such a way that it is realistic to expect high reliability and validity coefficients. Multitrait multimethod approaches are preferred to study construct (convergent-divergent) validity. Copyright © 2016 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.
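The cut-offs argued for above can be written down as a small decision rule. This is only an illustrative sketch: the function name, the label strings, and the choice to expose the upper bound of the moderate band as a parameter (.60 or .70, both mentioned in the abstract) are assumptions. The abstract does not say how to read convergent coefficients below .40, so that label is hypothetical.

```python
def interpret_coefficient(value, kind, moderate_upper=0.60):
    """Label a coefficient using the thresholds argued for in the abstract.

    kind: "reliability" or "convergent".
    moderate_upper: upper bound of the moderate band (.60 or .70).
    """
    if not 0.0 <= value <= 1.0:
        raise ValueError("expected a coefficient between 0 and 1")
    if kind == "reliability":
        # Reliability coefficients below .70 are treated as unreliability.
        return "indicative of unreliability" if value < 0.70 else "not flagged"
    if kind == "convergent":
        if value < 0.40:
            # Not addressed in the abstract; this label is an assumption.
            return "below the moderate band"
        if value <= moderate_upper:
            return "validity problem or inconclusive"
        return "not flagged"
    raise ValueError("kind must be 'reliability' or 'convergent'")


print(interpret_coefficient(0.65, "reliability"))
print(interpret_coefficient(0.65, "convergent", moderate_upper=0.70))
```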
Concurrent validity and reliability of the Alberta Infant Motor Scale in premature infants.
Almeida, Kênnea Martins; Dutra, Maria Virginia Peixoto; Mello, Rosane Reis de; Reis, Ana Beatriz Rodrigues; Martins, Priscila Silveira
2008-01-01
To verify the concurrent validity and interobserver reliability of the Alberta Infant Motor Scale (AIMS) in premature infants followed-up at the outpatient clinic of Instituto Fernandes Figueira, Fundação Oswaldo Cruz (IFF/Fiocruz), in Rio de Janeiro, Brazil. A total of 88 premature infants were enrolled at the follow-up clinic at IFF/Fiocruz, between February and December of 2006. For the concurrent validity study, 46 infants were assessed at either 6 (n = 26) or 12 (n = 20) months' corrected age using the AIMS and the second edition of the Bayley Scales of Infant Development, by two different observers, and applying Pearson's correlation coefficient to analyze the results. For the reliability study, 42 infants between 0 and 18 months were assessed using the Alberta Infant Motor Scale, by two different observers and the results analyzed using the intraclass correlation coefficient. The concurrent validity study found a high level of correlation between the two scales (r = 0.95) and one that was statistically significant (p < 0.01) for the entire population of infants, with higher values at 12 months (r = 0.89) than at 6 months (r = 0.74). The interobserver reliability study found satisfactory intraclass correlation coefficients at all ages tested, varying from 0.76 to 0.99. The AIMS is a valid and reliable instrument for the evaluation of motor development in high-risk infants within the Brazilian public health system.
Reliability and Validity of the Footprint Assessment Method Using Photoshop CS5 Software.
Gutiérrez-Vilahú, Lourdes; Massó-Ortigosa, Núria; Costa-Tutusaus, Lluís; Guerra-Balic, Myriam
2015-05-01
Several sophisticated methods of footprint analysis currently exist. However, it is sometimes useful to apply standard measurement methods of recognized evidence with an easy and quick application. We sought to assess the reliability and validity of a new method of footprint assessment in a healthy population using Photoshop CS5 software (Adobe Systems Inc, San Jose, California). Forty-two footprints, corresponding to 21 healthy individuals (11 men with a mean ± SD age of 20.45 ± 2.16 years and 10 women with a mean ± SD age of 20.00 ± 1.70 years) were analyzed. Footprints were recorded in static bipedal standing position using optical podography and digital photography. Three trials for each participant were performed. The Hernández-Corvo, Chippaux-Smirak, and Staheli indices and the Clarke angle were calculated by manual method and by computerized method using Photoshop CS5 software. Test-retest was used to determine reliability. Validity was obtained by intraclass correlation coefficient (ICC). The reliability test for all of the indices showed high values (ICC, 0.98-0.99). Moreover, the validity test clearly showed no difference between techniques (ICC, 0.99-1). The reliability and validity of a method to measure, assess, and record the podometric indices using Photoshop CS5 software has been demonstrated. This provides a quick and accurate tool useful for the digital recording of morphostatic foot study parameters and their control.
Parkison, Steven A.; Carlson, Jay D.; Chaudoin, Tammy R.; Hoke, Traci A.; Schenk, A. Katrin; Goulding, Evan H.; Pérez, Lance C.; Bonasera, Stephen J.
2016-01-01
Inexpensive, high-throughput, low maintenance systems for precise temporal and spatial measurement of mouse home cage behavior (including movement, feeding, and drinking) are required to evaluate products from large scale pharmaceutical design and genetic lesion programs. These measurements are also required to interpret results from more focused behavioral assays. We describe the design and validation of a highly-scalable, reliable mouse home cage behavioral monitoring system modeled on a previously described, one-of-a-kind system [1]. Mouse position was determined by solving static equilibrium equations describing the force and torques acting on the system strain gauges; feeding events were detected by a photobeam across the food hopper, and drinking events were detected by a capacitive lick sensor. Validation studies show excellent agreement between mouse position and drinking events measured by the system compared with video-based observation – a gold standard in neuroscience. PMID:23366406
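The position estimate described above rests on static equilibrium: the vertical support forces must sum to the load, and their moments about each axis must balance it. A minimal sketch follows, assuming a platform resting on point supports at known coordinates with tared force readings; the sensor layout, the function name, and the numbers are hypothetical and are not taken from the paper.

```python
def load_position(sensors, forces):
    """Estimate the (x, y) position of a point load from its support forces.

    sensors: list of (x, y) support coordinates.
    forces:  tared vertical force at each support, same order as sensors.
    """
    if len(sensors) != len(forces):
        raise ValueError("one force reading is needed per sensor")
    total = sum(forces)
    if total <= 0:
        raise ValueError("no load detected")
    # Torque balance about each axis: total * x = sum(f_i * x_i), same for y.
    x = sum(f * sx for f, (sx, _) in zip(forces, sensors)) / total
    y = sum(f * sy for f, (_, sy) in zip(forces, sensors)) / total
    return x, y


# Hypothetical three-support layout and readings (arbitrary units).
print(load_position([(0, 0), (2, 0), (0, 2)], [1, 1, 2]))  # prints (0.5, 1.0)
```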
Ramnarayan, Padmanabhan; Kapoor, Ritika R; Coren, Michael; Nanduri, Vasantha; Tomlinson, Amanda L; Taylor, Paul M; Wyatt, Jeremy C; Britto, Joseph F
2003-01-01
Few previous studies evaluating the benefits of diagnostic decision support systems have simultaneously measured changes in diagnostic quality and clinical management prompted by use of the system. This report describes a reliable and valid scoring technique to measure the quality of clinical decision plans in an acute medical setting, where diagnostic decision support tools might prove most useful. Sets of differential diagnoses and clinical management plans generated by 71 clinicians for six simulated cases, before and after decision support from a Web-based pediatric differential diagnostic tool (ISABEL), were used. A composite quality score was calculated separately for each diagnostic and management plan by considering the appropriateness value of each component diagnostic or management suggestion, a weighted sum of individual suggestion ratings, relevance of the entire plan, and its comprehensiveness. The reliability and validity (face, concurrent, construct, and content) of these two final scores were examined. Two hundred fifty-two diagnostic and 350 management suggestions were included in the interrater reliability analysis. There was good agreement between raters (intraclass correlation coefficient, 0.79 for diagnoses, and 0.72 for management). No counterintuitive scores were demonstrated on visual inspection of the sets. Content validity was verified by a consultation process with pediatricians. Both scores discriminated adequately between the plans of consultants and medical students and correlated well with clinicians' subjective opinions of overall plan quality (Spearman rho 0.65, p < 0.01). The diagnostic and management scores for each episode showed moderate correlation (r = 0.51). The scores described can be used as key outcome measures in a larger study to fully assess the value of diagnostic decision aids, such as the ISABEL system.
Assessing Peer Entry and Play in Preschoolers at Risk for Maladjustment
ERIC Educational Resources Information Center
Brotman, Laurie Miller; Gouley, Kathleen Kiely; Chesir-Teran, Daniel
2005-01-01
This study evaluated the psychometric properties of an observational rating system for assessing preschoolers' peer entry and play skills: Observed Peer Play in Unfamiliar Settings (OPPUS). Participants were 84 preschoolers at risk for psychopathology. Reliability and concurrent validity are reported. The 30-min paradigm yielded reliable indexes…
Yan, Yu-Xiang; Liu, You-Qin; Li, Man; Hu, Pei-Feng; Guo, Ai-Min; Yang, Xing-Hua; Qiu, Jing-Jun; Yang, Shan-Shan; Shen, Jian; Zhang, Li-Ping; Wang, Wei
2009-01-01
Background Suboptimal health status (SHS) is characterized by ambiguous health complaints, general weakness, and lack of vitality, and has become a new public health challenge in China. It is believed to be a subclinical, reversible stage of chronic disease. Studies of intervention and prognosis for SHS are expected to become increasingly important. Consequently, a reliable and valid instrument to assess SHS is essential. We developed and evaluated a questionnaire for measuring SHS in urban Chinese. Methods Focus group discussions and a literature review provided the basis for the development of the questionnaire. Questionnaire validity and reliability were evaluated in a small pilot study and in a larger cross-sectional study of 3000 individuals. Analyses included tests for reliability and internal consistency, exploratory and confirmatory factor analysis, and tests for discriminative ability and convergent validity. Results The final questionnaire included 25 items on SHS (SHSQ-25), and encompassed 5 subscales: fatigue, the cardiovascular system, the digestive tract, the immune system, and mental status. Overall, 2799 of 3000 participants completed the questionnaire (93.3%). Test-retest reliability coefficients of individual items ranged from 0.89 to 0.98. Item-subscale correlations ranged from 0.51 to 0.72, and Cronbach’s α was 0.70 or higher for all subscales. Factor analysis established 5 distinct domains, as conceptualized in our model. One-way ANOVA showed statistically significant differences in scale scores between 3 occupation groups; these included total scores and subscores (P < 0.01). The correlation between the SHS scores and experienced stress was statistically significant (r = 0.57, P < 0.001). Conclusions The SHSQ-25 is a reliable and valid instrument for measuring sub-health status in urban Chinese. PMID:19749497
Torok, Kathryn S.; Baker, Nancy A.; Lucas, Mary; Domsic, Robyn T.; Boudreau, Robert; Medsger, Thomas A.
2010-01-01
Objectives To determine the reliability and validity of a new measure of finger motion in patients with systemic sclerosis (SSc), the ‘delta finger-to-palm’ (delta FTP) and compare its psychometric properties to the traditional measure of finger motion, the finger-to-palm (FTP). Methods Phase 1: The reliability of the delta FTP and FTP were examined in 39 patients with SSc. Phase 2: Criterion and convergent construct validity of both measures were examined in 17 patients with SSc by comparing them to other clinical measures: Total Active Range of Motion (TAROM), Hand Mobility in Scleroderma (HAMIS), the Duruoz Hand Index (DHI), Health Assessment Questionnaire (HAQ), and modified Rodnan skin score (mRSS). Phase 3: Sensitivity to change of the delta FTP was investigated in 24 patients with early diffuse cutaneous SSc. Results Both measures had excellent intra-rater and inter-rater reliability (ICC 0.92 to 0.99). Fair to strong correlations (rs=0.49–0.94) were observed between the delta FTP and TAROM, HAMIS, and DHI. Fair to moderate correlations were observed between delta FTP and HAQ components related to hand function and upper extremity mRSS. Correlations of the traditional FTP with these measures were fair to strong, but most often the delta FTP outperformed the FTP. The effect size and standardised response mean for the mean delta FTP were 0.50 and 1.10 respectively, over a 2–8 month period. Conclusion The delta FTP is a valid and reliable measure of finger motion in patients with SSc which outperforms the FTP. PMID:20576211
Cushion, Christopher; Harvey, Stephen; Muir, Bob; Nelson, Lee
2012-01-01
We outline the evolution of a computerised systematic observation tool and describe the process for establishing the validity and reliability of this new instrument. The Coach Analysis and Interventions System (CAIS) has 23 primary behaviours related to physical behaviour, feedback/reinforcement, instruction, verbal/non-verbal, questioning and management. The instrument also analyses secondary coach behaviour related to performance states, recipient, timing, content and questioning/silence. The CAIS is a multi-dimensional and multi-level mechanism able to provide detailed and contextualised data about specific coaching behaviours occurring in complex and nuanced coaching interventions and environments that can be applied to both practice sessions and competition.
Gräff, Ingo; Goldschmidt, Bernd; Glien, Procula; Bogdanow, Manuela; Fimmers, Rolf; Hoeft, Andreas; Kim, Se-Chan; Grigutsch, Daniel
2014-01-01
Background The German Version of the Manchester Triage System (MTS) has found widespread use in EDs across German-speaking Europe. Studies about the quality criteria validity and reliability of the MTS currently only exist for the English-language version. Most importantly, the content of the German version differs from the English version with respect to presentation diagrams and change indicators, which have a significant impact on the category assigned. This investigation offers a preliminary assessment in terms of validity and inter-rater reliability of the German MTS. Methods Construct validity of assigned MTS level was assessed based on comparisons to hospitalization (general / intensive care), mortality, ED and hospital length of stay, level of prehospital care and number of invasive diagnostics. A sample of 45,469 patients was used. Inter-rater agreement between an expert and triage nurses (reliability) was calculated separately for a subset group of 167 emergency patients. Results For general hospital admission the area under the curve (AUC) of the receiver operating characteristic was 0.749; for admission to ICU it was 0.871. An examination of MTS-level and number of deceased patients showed that the higher the priority derived from MTS, the higher the number of deaths (p<0.0001 / χ2 Test). There was a substantial difference in the 30-day survival among the 5 MTS categories (p<0.0001 / log-rank test). The AUC for predicting 30-day mortality was 0.613. Categories orange and red had the highest numbers of heart catheter and endoscopy procedures. Categories red and orange were mostly accompanied by an emergency physician, whereas categories blue and green were walk-in patients. Inter-rater agreement between the expert and triage nurses was almost perfect (κ = 0.954). Conclusion The German version of the MTS is a reliable and valid instrument for a first assessment of emergency patients in the emergency department. PMID:24586477
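Inter-rater agreement above is reported as a kappa statistic. The abstract does not state whether a weighted or unweighted kappa was used, so the following is only a sketch of the unweighted Cohen's kappa for two raters, with hypothetical triage labels: observed agreement is corrected for the agreement expected from each rater's category frequencies.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters labelling the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of cases")
    n = len(rater_a)
    # Proportion of cases on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement from the product of each rater's marginal frequencies.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical triage categories assigned by an expert and a nurse.
expert = ["red", "red", "green", "green"]
nurse = ["red", "red", "green", "red"]
print(cohens_kappa(expert, nurse))  # prints 0.5
```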
Park, Juhyun; Kang, Minyong; Jeong, Chang Wook; Oh, Sohee; Lee, Jeong Woo; Lee, Seung Bae; Son, Hwancheol; Jeong, Hyeon; Cho, Sung Yong
2015-08-01
The modified Seoul National University Renal Stone Complexity scoring system (S-ReSC-R) for retrograde intrarenal surgery (RIRS) was developed as a tool to predict stone-free rate (SFR) after RIRS. We externally validated the S-ReSC-R. We retrospectively reviewed 159 patients who underwent RIRS. The S-ReSC-R was assigned from 1 to 12 according to the location and number of sites involved. The stone-free status was defined as no evidence of a stone or with clinically insignificant residual fragment stones less than 2 mm. Interobserver and test-retest reliabilities were evaluated. Statistical performance of the prediction model was assessed by its predictive accuracy, predictive probability, and clinical usefulness. Overall SFR was 73.0%. The SFRs were 86.7%, 70.2%, and 48.6% in low-score (1-2), intermediate-score (3-4), and high-score (5-12) groups, respectively (p<0.001). External validation of S-ReSC-R revealed an area under the curve (AUC) of 0.731 (95% CI 0.650-0.813). The AUC of the three-tiered S-ReSC-R was 0.701 (95% CI 0.609-0.794). The calibration plot showed that the predicted probability of SFR had a concordance comparable to that of observed frequency. The Hosmer-Lemeshow goodness of fit test revealed a p-value of 0.01 for the S-ReSC-R and 0.90 for the three-tiered S-ReSC-R. Interobserver and test-retest reliabilities revealed an almost perfect level of agreement. The present study proved the predictive value of S-ReSC-R to predict SFR following RIRS in an independent cohort. Interobserver and test-retest reliabilities confirmed that S-ReSC-R was reliable and valid.
Applying Resource Utilization Groups (RUG-III) in Hong Kong nursing homes.
Chou, Kee-Lee; Chi, Iris; Leung, Joe C B
2008-01-01
Resource Utilization Groups III (RUG-III) is a case-mix system developed in the United States for categorization of nursing home residents and the financing of residential care services. In Hong Kong, RUG-III is based on several broad groups of residents. The aim of this study was to examine the reliability and validity of the RUG-III in Hong Kong nursing homes. A cross-sectional survey was conducted in seven residential facilities operated by one agency. Residents (N = 1,127) were assessed by the Minimum Data Set (MDS), and nursing as well as auxiliary staff care times were recorded within 2 weeks before or after the completion of MDS assessment. Forty-five out of 1,127 residents were re-interviewed by an independent assessor to assess the inter-rater reliability. The inter-rater reliability of MDS assessment was excellent (kappa = 0.76) and the original RUG-III accounted for about 30 per cent of nursing staff time. Results provide preliminary evidence to support that RUG-III is a reliable and valid case-mix system for Hong Kong nursing homes, but future studies must be explored to reduce the variance of resource use explained by this case-mix system.
NASA Technical Reports Server (NTRS)
Simmons, D. B.
1975-01-01
The DOMONIC system has been modified to run on the Univac 1108 and the CDC 6600 as well as the IBM 370 computer system. The DOMONIC monitor system has been implemented to gather data which can be used to optimize the DOMONIC system and to predict the reliability of software developed using DOMONIC. The areas of quality metrics, error characterization, program complexity, program testing, validation and verification are analyzed. A software reliability model for estimating program completion levels and one on which to base system acceptance have been developed. The DAVE system which performs flow analysis and error detection has been converted from the University of Colorado CDC 6400/6600 computer to the IBM 360/370 computer system for use with the DOMONIC system.
Integrated Human-in-the-Loop Ground Testing - Value, History, and the Future
NASA Technical Reports Server (NTRS)
Henninger, Donald L.
2016-01-01
Systems for very long-duration human missions to Mars will be designed to operate reliably for many years and many of these systems will never be returned to Earth. The need for high reliability is driven by the requirement for safe functioning of remote, long-duration crewed systems and also by unsympathetic abort scenarios. Abort from a Mars mission could be as long as 450 days to return to Earth. The key to developing a human-in-the-loop architecture is a development process that allows for a logical sequence of validating successful development in a stepwise manner, with assessment of key performance parameters (KPPs) at each step; especially important are KPPs for technologies evaluated in a full systems context with human crews on Earth and on space platforms such as the ISS. This presentation will explore the implications of such an approach to technology development and validation including the roles of ground and space-based testing necessary to develop a highly reliable system for long duration human exploration missions. Historical development and systems testing from Mercury to the International Space Station (ISS) to ground testing will be reviewed. Current work as well as recommendations for future work will be described.
Maksymowych, Walter P; Cibere, Jolanda; Loeuille, Damien; Weber, Ulrich; Zubler, Veronika; Roemer, Frank W; Jaremko, Jacob L; Sayre, Eric C; Lambert, Robert G W
2014-02-01
Development of a validated magnetic resonance image (MRI) scoring system is essential in hip OA because radiographs are insensitive to change. We assessed the feasibility and reliability of 2 previously developed scoring methods: (1) the Hip Inflammation MRI Scoring System (HIMRISS) and (2) the Hip Osteoarthritis MRI Scoring System (HOAMS). Six readers (3 radiologists, 3 rheumatologists) participated in 2 reading exercises. In Reading Exercise 1, MRI scans of the hip of 20 subjects were read at a single time point followed by further standardization of methodology. In Reading Exercise 2, MRI scans of the hip of 18 subjects from a randomized controlled trial, assessed at 2 time points, and 27 subjects from a cross-sectional study were read for HIMRISS and HOAMS bone marrow lesions (BML) and synovitis. Reliability was assessed using intraclass correlation coefficient (ICC) and kappa statistics. Both methods were considered feasible. For Reading 1, HIMRISS ICC were 0.52, 0.61, 0.70, and 0.58 for femoral BML, acetabular BML, effusion, and total scores, respectively; and for HOAMS, summed BML and synovitis ICC were 0.52 and 0.46, respectively. For Reading 2, HIMRISS and HOAMS ICC for BML and synovitis-effusion improved substantially. Interobserver reliability for change scores was 0.81 and 0.71 for HIMRISS femoral and HOAMS summed BML, respectively. Responsiveness and discrimination were moderate to high for synovitis-effusion. Significant associations were noted between BML or synovitis scores and Western Ontario and McMaster Universities Osteoarthritis Index pain scores for baseline values (p ≤ 0.001). The BML and synovitis-effusion components of both HIMRISS and HOAMS scoring systems are feasible and reliable, and should be validated further.
Lohrer, Heinz; Nauck, Tanja
2009-10-30
Achilles tendinopathy is the predominant overuse injury in runners. To further investigate this overload injury in transverse and longitudinal studies a valid, responsive and reliable outcome measure is needed. Most questionnaires have been developed for English-speaking populations. This is also true for the VISA-A score, so far representing the only valid, reliable, and disease specific questionnaire for Achilles tendinopathy. To internationally compare research results, to perform multinational studies or to exclude bias originating from subpopulations speaking different languages within one country an equivalent instrument is needed in different languages. The aim of this study was therefore to cross-culturally adapt and validate the VISA-A questionnaire for German-speaking Achilles tendinopathy patients. According to the "guidelines for the process of cross-cultural adaptation of self-report measures" the VISA-A score was cross-culturally adapted into German (VISA-A-G) using six steps: translation, synthesis, back translation, expert committee review, pretesting (n = 77), and appraisal of the adaptation process by an advisory committee determining the adequacy of the cross-cultural adaptation. The resulting VISA-A-G was then subjected to an analysis of reliability, validity, and internal consistency in 30 Achilles tendinopathy patients and 79 asymptomatic people. Concurrent validity was tested against a generic tendon grading system (Percy and Conochie) and against a classification system for the effect of pain on athletic performance (Curwin and Stanish). The advisory committee judged the translation of the VISA-A-G questionnaire to be "acceptable". The VISA-A-G questionnaire showed moderate to excellent test-retest reliability (ICC = 0.60 to 0.97). Concurrent validity showed good coherence when correlated with the grading system of Curwin and Stanish (rho = -0.95) and with the Percy and Conochie grade of severity (rho = 0.95).
Internal consistency (Cronbach's alpha) for the total VISA-A-G scores of the patients was calculated to be 0.737. The VISA-A questionnaire was successfully cross-culturally adapted and validated for use in German-speaking populations. The psychometric properties of the VISA-A-G questionnaire are similar to those of the original English version. It can therefore be recommended as a sufficiently robust tool for measuring the clinical severity of Achilles tendinopathy in German-speaking patients in future studies.
Lohrer, Heinz; Nauck, Tanja
2009-01-01
Background Achilles tendinopathy is the predominant overuse injury in runners. To further investigate this overload injury in transverse and longitudinal studies a valid, responsive and reliable outcome measure is needed. Most questionnaires have been developed for English-speaking populations. This is also true for the VISA-A score, so far representing the only valid, reliable, and disease specific questionnaire for Achilles tendinopathy. To internationally compare research results, to perform multinational studies or to exclude bias originating from subpopulations speaking different languages within one country an equivalent instrument is needed in different languages. The aim of this study was therefore to cross-culturally adapt and validate the VISA-A questionnaire for German-speaking Achilles tendinopathy patients. Methods According to the "guidelines for the process of cross-cultural adaptation of self-report measures" the VISA-A score was cross-culturally adapted into German (VISA-A-G) using six steps: translation, synthesis, back translation, expert committee review, pretesting (n = 77), and appraisal of the adaptation process by an advisory committee determining the adequacy of the cross-cultural adaptation. The resulting VISA-A-G was then subjected to an analysis of reliability, validity, and internal consistency in 30 Achilles tendinopathy patients and 79 asymptomatic people. Concurrent validity was tested against a generic tendon grading system (Percy and Conochie) and against a classification system for the effect of pain on athletic performance (Curwin and Stanish). Results The advisory committee judged the translation of the VISA-A-G questionnaire to be "acceptable". The VISA-A-G questionnaire showed moderate to excellent test-retest reliability (ICC = 0.60 to 0.97). Concurrent validity showed good coherence when correlated with the grading system of Curwin and Stanish (rho = -0.95) and with the Percy and Conochie grade of severity (rho = 0.95).
Internal consistency (Cronbach's alpha) for the total VISA-A-G scores of the patients was calculated to be 0.737. Conclusion The VISA-A questionnaire was successfully cross-culturally adapted and validated for use in German-speaking populations. The psychometric properties of the VISA-A-G questionnaire are similar to those of the original English version. It can therefore be recommended as a sufficiently robust tool for measuring the clinical severity of Achilles tendinopathy in German-speaking patients in future studies. PMID:19878572
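Internal consistency figures such as the Cronbach's alpha quoted above are computed from the item variances and the variance of the total score. The sketch below shows the standard formula, alpha = k/(k-1) × (1 - Σ item variances / total-score variance); the function name and the three-item responses are hypothetical and unrelated to the VISA-A-G data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha; `responses` holds one list of item scores per respondent."""
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in responses]) for i in range(k)]
    # Sample variance of each respondent's summed score.
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1.0 - sum(item_variances) / total_variance)


# Hypothetical scores of five respondents on a three-item questionnaire.
scores = [[2, 3, 3], [4, 4, 5], [1, 2, 2], [5, 5, 4], [3, 3, 4]]
alpha = cronbach_alpha(scores)
```

For these illustrative scores the item variances sum to 5.1 and the total-score variance is 13.5, so alpha works out to 14/15.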
Classification in childhood disability: focusing on function in the 21st century.
Rosenbaum, Peter; Eliasson, Ann-Christin; Hidecker, Mary Jo Cooley; Palisano, Robert J
2014-08-01
Classification systems in health care are usually based on current understanding of the condition. They are often derived empirically and adopted applying sound principles of measurement science to assess whether they are reliable (consistent) and valid (true) for the purposes to which they are applied. In the past 15 years, the authors have developed and validated classification systems for specific aspects of everyday function in people with cerebral palsy--gross motor function, manual abilities, and communicative function. This article describes the approaches used to conceptualize each aspect of function, develop the tools, and assess their reliability and validity. We report on the utility of each system with respect to clinical applicability, use of these tools for research, and the uptake and impact that they have had around the world. We hope that readers will find these accounts interesting, relevant, and applicable to their daily work with children and youth with disabilities. © The Author(s) 2014.
Koontz, Alicia M; Lin, Yen-Sheng; Kankipati, Padmaja; Boninger, Michael L; Cooper, Rory A
2011-01-01
This study describes a new custom measurement system designed to investigate the biomechanics of sitting-pivot wheelchair transfers and assesses the reliability of selected biomechanical variables. Variables assessed include horizontal and vertical reaction forces underneath both hands and three-dimensional trunk, shoulder, and elbow range of motion. We examined the reliability of these measures between 5 consecutive transfer trials for 5 subjects with spinal cord injury and 12 nondisabled subjects while they performed a self-selected sitting-pivot transfer from a wheelchair to a level bench. A majority of the biomechanical variables demonstrated moderate to excellent reliability (r > 0.6). The transfer measurement system recorded reliable and valid biomechanical data for future studies of sitting-pivot wheelchair transfers. We recommend a minimum of five transfer trials to obtain a reliable measure of transfer technique for future studies.
Yu, Ting Yue; Syeda, Fahima; Holmes, Andrew P; Osborne, Benjamin; Dehghani, Hamid; Brain, Keith L; Kirchhof, Paulus; Fabritz, Larissa
2014-08-01
We developed and validated a new optical mapping system for quantification of electrical activation and repolarisation in murine atria. The system makes use of a novel 2nd generation complementary metal-oxide-semiconductor (CMOS) camera with deliberate oversampling to allow both assessment of electrical activation with high spatial and temporal resolution (128 × 2048 pixels) and reliable assessment of atrial murine repolarisation using post-processing of signals. Optical recordings were taken from isolated, superfused and electrically stimulated murine left atria. The system reliably describes activation sequences, identifies areas of functional block, and allows quantification of conduction velocities and vectors. Furthermore, the system records murine atrial action potentials with comparable duration to both monophasic and transmembrane action potentials in murine atria. Copyright © 2014 The Authors. Published by Elsevier Ltd. All rights reserved.
Marques, Alda; Almeida, Sara; Carvalho, Joana; Cruz, Joana; Oliveira, Ana; Jácome, Cristina
2016-12-01
To assess the reliability, validity, and ability to identify fall status of the Balance Evaluation Systems Test (BESTest), Mini-BESTest, and Brief-BESTest, compared with the Berg Balance Scale (BBS), in older people living in the community. Cross-sectional. Community centers. Older adults (N=122; mean age ± SD, 76±9y). Not applicable. Participants reported on falls history in the preceding year and completed the Activities-Specific Balance Confidence (ABC) Scale. The BBS, BESTest, and the Five Times Sit-To-Stand Test were administered. Interrater (2 physiotherapists) and test-retest relative (48-72h) and absolute reliabilities were explored with the intraclass correlation coefficient (ICC) equation (2,1) and the Bland and Altman method. Minimal detectable changes at the 95% confidence level (MDC95) were established. Validity was assessed by correlating the balance tests with each other and with the ABC Scale (Spearman correlation coefficients, ρ). Receiver operating characteristics assessed the ability of each balance test to differentiate between people with and without a history of falls. All balance tests presented good to excellent interrater (ICC=.71-.93) and test-retest (ICC=.50-.82) relative reliability, with no evidence of bias. MDC95 values were 4.6, 9, 3.8, and 4.1 points for the BBS, BESTest, Mini-BESTest, and Brief-BESTest, respectively. All tests were significantly correlated with each other (ρ=.83-.96) and with the ABC Scale (ρ=.46-.61). Acceptable ability to identify fall status (areas under the curve, .71-.78) was found for all tests. Cutoff points were 48.5, 82, 19.5, and 12.5 points for the BBS, BESTest, Mini-BESTest, and Brief-BESTest, respectively. All balance tests are reliable, valid, and able to identify fall status in older people living in the community. Therefore, the choice of which test to use will depend on the level of balance impairment, purpose, and time availability. Copyright © 2016. Published by Elsevier Inc.
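The ICC equation (2,1) named above is the two-way random-effects, absolute-agreement, single-rater form of the intraclass correlation. The sketch below computes it from the two-way ANOVA mean squares, ICC(2,1) = (MSR - MSE) / (MSR + (k-1)·MSE + k·(MSC - MSE)/n), for n subjects scored by k raters. The function name and the rating matrix are illustrative assumptions, not data from the study.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` has one row per subject and one column per rater.
    """
    n, k = len(ratings), len(ratings[0])
    if n < 2 or k < 2:
        raise ValueError("need at least two subjects and two raters")
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    # Partition the total sum of squares into subjects, raters and residual.
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)                  # between-subjects mean square
    ms_cols = ss_cols / (k - 1)                  # between-raters mean square
    ms_error = ss_error / ((n - 1) * (k - 1))    # residual mean square
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Illustrative scores: six subjects, each rated by four raters.
example = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
icc = icc_2_1(example)
```

Because this form penalises systematic differences between raters, the illustrative matrix (where the second rater scores consistently low) yields a modest ICC of about 0.29 even though the raters rank subjects similarly.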
Moore, Amy Lawson; Miller, Terissa M
2018-01-01
The purpose of the current study is to evaluate the validity and reliability of the revised Gibson Test of Cognitive Skills, a computer-based battery of tests measuring short-term memory, long-term memory, processing speed, logic and reasoning, visual processing, as well as auditory processing and word attack skills. This study included 2,737 participants aged 5-85 years. A series of studies was conducted to examine the validity and reliability using the test performance of the entire norming group and several subgroups. The evaluation of the technical properties of the test battery included content validation by subject matter experts, item analysis and coefficient alpha, test-retest reliability, split-half reliability, and analysis of concurrent validity with the Woodcock Johnson III Tests of Cognitive Abilities and Tests of Achievement. Results indicated strong sources of evidence of validity and reliability for the test, including internal consistency reliability coefficients ranging from 0.87 to 0.98, test-retest reliability coefficients ranging from 0.69 to 0.91, split-half reliability coefficients ranging from 0.87 to 0.91, and concurrent validity coefficients ranging from 0.53 to 0.93. The Gibson Test of Cognitive Skills-2 is a reliable and valid tool for assessing cognition in the general population across the lifespan.
Evaluation of tools used to measure calcium and/or dairy consumption in adults.
Magarey, Anthea; Baulderstone, Lauren; Yaxley, Alison; Markow, Kylie; Miller, Michelle
2015-05-01
To identify and critique tools for the assessment of Ca and/or dairy intake in adults, in order to ascertain the most accurate and reliable tools available. A systematic review of the literature was conducted using defined inclusion and exclusion criteria. Articles reporting on originally developed tools or testing the reliability or validity of existing tools that measure Ca and/or dairy intake in adults were included. Author-defined criteria for reporting reliability and validity properties were applied. Studies conducted in Western countries. Adults. Thirty papers, utilising thirty-six tools assessing intake of dairy, Ca or both, were identified. Reliability testing was conducted on only two dairy and five Ca tools, with results indicating that only one dairy and two Ca tools were reliable. Validity testing was conducted for all but four Ca-only tools. There was high reliance in validity testing on lower-order tests such as correlation and failure to differentiate between statistical and clinically meaningful differences. Results of the validity testing suggest one dairy and five Ca tools are valid. Thus one tool was considered both reliable and valid for the assessment of dairy intake and only two tools proved reliable and valid for the assessment of Ca intake. While several tools are reliable and valid, their application across adult populations is limited by the populations in which they were tested. These results indicate a need for tools that assess Ca and/or dairy intake in adults to be rigorously tested for reliability and validity.
2017-01-01
Objective To investigate the reliability and validity of a new method for isometric back extensor strength measurement using a portable dynamometer. Methods A chair equipped with a small portable dynamometer was designed (Power Track II Commander Muscle Tester). A total of 15 men (mean age, 34.8±7.5 years) and 15 women (mean age, 33.1±5.5 years) with no current back problems or previous history of back surgery were recruited. Subjects were asked to push the back of the chair while seated, and their isometric back extensor strength was measured by the portable dynamometer. Test-retest reliability was assessed with intraclass correlation coefficient (ICC). For the validity assessment, isometric back extensor strength of all subjects was measured by a widely used physical performance evaluation instrument, BTE PrimusRS system. The limit of agreement (LoA) from the Bland-Altman plot was evaluated between two methods. Results The test-retest reliability was excellent (ICC=0.82; 95% confidence interval, 0.65–0.91). The Bland-Altman plots demonstrated acceptable agreement between the two methods: the lower 95% LoA was −63.1 N and the upper 95% LoA was 61.1 N. Conclusion This study shows that isometric back extensor strength measurement using a portable dynamometer has good reliability and validity. PMID:29201818
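The Bland-Altman limits of agreement quoted above are conventionally the mean between-method difference plus or minus 1.96 standard deviations of the differences. A minimal sketch under that assumption follows; the function name and the paired force readings are hypothetical and are not the study's measurements.

```python
from statistics import mean, stdev


def limits_of_agreement(method_a, method_b, z=1.96):
    """Return (lower limit, bias, upper limit) for paired measurements."""
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need at least two paired measurements")
    differences = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(differences)               # systematic difference between methods
    half_width = z * stdev(differences)    # spread of the individual differences
    return bias - half_width, bias, bias + half_width


# Hypothetical paired strength readings in newtons (illustration only).
portable = [310, 295, 402, 350, 288]
reference = [300, 305, 390, 345, 300]
lower, bias, upper = limits_of_agreement(portable, reference)
```

If the limits are symmetric about the bias, as in this construction, the reported limits of -63.1 N and 61.1 N would correspond to a mean difference of about -1.0 N between the two methods.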
Validation of a method for assessing resident physicians' quality improvement proposals.
Leenstra, James L; Beckman, Thomas J; Reed, Darcy A; Mundell, William C; Thomas, Kris G; Krajicek, Bryan J; Cha, Stephen S; Kolars, Joseph C; McDonald, Furman S
2007-09-01
Residency programs involve trainees in quality improvement (QI) projects to evaluate competency in systems-based practice and practice-based learning and improvement. Valid approaches to assess QI proposals are lacking. We developed an instrument for assessing resident QI proposals--the Quality Improvement Proposal Assessment Tool (QIPAT-7)--and determined its validity and reliability. QIPAT-7 content was initially obtained from a national panel of QI experts. Through an iterative process, the instrument was refined, pilot-tested, and revised. Seven raters used the instrument to assess 45 resident QI proposals. Principal factor analysis was used to explore the dimensionality of instrument scores. Cronbach's alpha and intraclass correlations were calculated to determine internal consistency and interrater reliability, respectively. QIPAT-7 items comprised a single factor (eigenvalue = 3.4) suggesting a single assessment dimension. Interrater reliability for each item (range 0.79 to 0.93) and internal consistency reliability among the items (Cronbach's alpha = 0.87) were high. This method for assessing resident physician QI proposals is supported by content and internal structure validity evidence. QIPAT-7 is a useful tool for assessing resident QI proposals. Future research should determine the reliability of QIPAT-7 scores in other residency and fellowship training programs. Correlations should also be made between assessment scores and criteria for QI proposal success such as implementation of QI proposals, resident scholarly productivity, and improved patient outcomes.
Reliability evaluation of microgrid considering incentive-based demand response
NASA Astrophysics Data System (ADS)
Huang, Ting-Cheng; Zhang, Yong-Jun
2017-07-01
Incentive-based demand response (IBDR) can guide customers to adjust their electricity consumption behaviour and actively curtail load. Meanwhile, distributed generation (DG) and energy storage systems (ESS) can provide time for the implementation of IBDR. This paper focuses on the reliability evaluation of a microgrid considering IBDR. Firstly, the mechanism of IBDR and its impact on power supply reliability are analysed. Secondly, the IBDR dispatch model considering the customer's comprehensive assessment and the customer response model are developed. Thirdly, a reliability evaluation method considering IBDR, based on Monte Carlo simulation, is proposed. Finally, the validity of the above models and method is studied through numerical tests on a modified RBTS Bus6 test system. Simulation results demonstrate that IBDR can improve the reliability of the microgrid.
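To illustrate the general idea of Monte Carlo reliability evaluation with curtailable load, the sketch below estimates a loss-of-load probability by sampling generator outages. It is a deliberately simplified, non-sequential toy model and not the dispatch or customer response model of the paper; the unit capacities, availability, load and curtailable amount are all hypothetical values chosen for illustration.

```python
import random


def loss_of_load_probability(unit_capacities_kw, unit_availability, load_kw,
                             curtailable_kw=0.0, trials=20000, seed=1):
    """Estimate the probability that available generation cannot meet the load.

    Each unit is independently available with probability `unit_availability`.
    `curtailable_kw` is load that customers agree to shed under demand response.
    """
    rng = random.Random(seed)  # fixed seed so repeated runs are comparable
    shortfalls = 0
    for _ in range(trials):
        # Sample which units are in service in this trial.
        available_kw = sum(
            capacity for capacity in unit_capacities_kw
            if rng.random() < unit_availability
        )
        # A shortfall occurs when generation is below the load left after curtailment.
        if available_kw < load_kw - curtailable_kw:
            shortfalls += 1
    return shortfalls / trials


# Hypothetical microgrid: three units, 95% availability, 650 kW load.
units = [400, 300, 300]
without_dr = loss_of_load_probability(units, 0.95, load_kw=650)
with_dr = loss_of_load_probability(units, 0.95, load_kw=650, curtailable_kw=100)
```

Because both calls use the same seed and therefore the same sampled outages, any difference between `without_dr` and `with_dr` comes only from the curtailable load, which in this toy setting lets the two 300 kW units cover the remaining demand when the largest unit is out.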
Seligman, Sarah C; Giovannetti, Tania; Sestito, John; Libon, David J
2014-01-01
Mild functional difficulties have been associated with early cognitive decline in older adults and increased risk for conversion to dementia in mild cognitive impairment, but our understanding of this decline has been limited by a dearth of objective methods. This study evaluated the reliability and validity of a new system to code subtle errors on an established performance-based measure of everyday action and described preliminary findings within the context of a theoretical model of action disruption. Here 45 older adults completed the Naturalistic Action Test (NAT) and neuropsychological measures. NAT performance was coded for overt errors, and subtle action difficulties were scored using a novel coding system. An inter-rater reliability coefficient was calculated. Validity of the coding system was assessed using a repeated-measures ANOVA with NAT task (simple versus complex) and error type (overt versus subtle) as within-group factors. Correlation/regression analyses were conducted among overt NAT errors, subtle NAT errors, and neuropsychological variables. The coding of subtle action errors was reliable and valid, and episodic memory breakdown predicted subtle action disruption. Results suggest that the NAT can be useful in objectively assessing subtle functional decline. Treatments targeting episodic memory may be most effective in addressing early functional impairment in older age.
Quantifying Engagement: Measuring Player Involvement in Human-Avatar Interactions
Norris, Anne E.; Weger, Harry; Bullinger, Cory; Bowers, Alyssa
2014-01-01
This research investigated the merits of using an established system for rating behavioral cues of involvement in human dyadic interactions (i.e., face-to-face conversation) to measure involvement in human-avatar interactions. Gameplay audio-video and self-report data from a Feasibility Trial and Free Choice study of an effective peer resistance skill building simulation game (DRAMA-RAMA™) were used to evaluate reliability and validity of the rating system when applied to human-avatar interactions. The Free Choice study used a revised game prototype that was altered to be more engaging. Both studies involved girls enrolled in a public middle school in Central Florida that served a predominately Hispanic (greater than 80%), low-income student population. Audio-video data were coded by two raters, trained in the rating system. Self-report data were generated using measures of perceived realism, predictability and flow administered immediately after game play. Hypotheses for reliability and validity were supported: Reliability values mirrored those found in the human dyadic interaction literature. Validity was supported by factor analysis, significantly higher levels of involvement in Free Choice as compared to Feasibility Trial players, and correlations between involvement dimension sub scores and self-report measures. Results have implications for the science of both skill-training intervention research and game design. PMID:24748718
Huang, Min H; Miller, Kara; Smith, Kristin; Fredrickson, Kayle; Shilling, Tracy
2016-01-01
Cancer is primarily a disease of older adults. About 77% of all cancers are diagnosed in persons aged 55 years and older. Cancer and its treatment can cause diverse sequelae impacting body systems underlying balance control. No study has examined the psychometric properties of balance assessment tools in older cancer survivors, presenting a significant challenge in the selection of outcome measures for clinicians treating this fast-growing population. This study aimed to determine the reliability, validity, and minimal detectable change (MDC) of the Balance Evaluation Systems Test (BESTest), Mini-Balance Evaluation Systems Test (Mini-BESTest), and Brief-Balance Evaluation Systems Test (Brief-BESTest) in community-dwelling older cancer survivors. This study was a cross-sectional design. Twenty breast and 8 prostate cancer survivors participated [age (SD) = 68.4 (8.13) years]. The BESTest and Activities-specific Balance Confidence (ABC) Scale were administered during the first session. Scores of Mini-BESTest and Brief-BESTest were extracted on the basis of the scores of BESTest. The BESTest was repeated within 1 to 2 weeks by the same rater to determine the test-retest reliability. For the analysis of the inter-rater reliability, 21 participants were randomly selected to be evaluated by 2 raters. A primary rater administered the test. The 2 raters independently and concurrently scored the performance of the participants. Each rater recorded the ratings separately on the scoring sheet. No discussion among the raters was allowed throughout the testing. Intraclass correlation coefficients (ICCs), standard error of measurement, MDC, and Bland-Altman plots were calculated. Concurrent validity of these balance tests with the ABC Scale was examined using the Spearman correlation. 
The BESTest, Mini-BESTest, and Brief-BESTest had high test-retest (ICC = 0.90-0.94) and interrater reliability (ICC = 0.86-0.96), small standard error of measurement (0.86-2.47 points), and MDC (2.39-6.86 points). The Bland-Altman plot revealed no systematic errors. The scores of BESTest, Mini-BESTest, and Brief-BESTest were correlated significantly with those of ABC Scale (P < .01), supporting their concurrent validity. The BESTest, Mini-BESTest, and Brief-BESTest showed high interrater and test-retest reliability, and excellent concurrent validity with the ABC Scale for community-dwelling cancer survivors aged 55 years and older who had completed cancer treatments for at least 3 months. Future studies are necessary to determine the predictive values for determining fall risks using balance assessment tools in older cancer survivors. Clinicians can utilize the BESTest and its short versions to evaluate balance problems in community-dwelling older cancer survivors and apply the established MDC to assess the intervention outcomes.
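The standard error of measurement (SEM) and MDC reported above are usually derived from the ICC with two conventional formulas: SEM = SD × √(1 - ICC) and MDC95 = 1.96 × √2 × SEM. The abstract does not state which formulas the authors used, so the sketch below is an assumption based on that common convention; the function names and the sample SD are hypothetical.

```python
import math


def standard_error_of_measurement(sd, icc):
    """SEM from the between-subject standard deviation and a reliability coefficient."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie between 0 and 1")
    return sd * math.sqrt(1.0 - icc)


def minimal_detectable_change(sem, z=1.96):
    """MDC at the confidence level implied by z; the sqrt(2) reflects two measurements."""
    return z * math.sqrt(2.0) * sem


# Hypothetical example: SD of 10 points and a test-retest ICC of 0.91.
sem_example = standard_error_of_measurement(10.0, 0.91)
mdc_example = minimal_detectable_change(sem_example)

# Applying the MDC formula to the smallest and largest SEM reported above.
mdc_low = minimal_detectable_change(0.86)
mdc_high = minimal_detectable_change(2.47)
```

Applied to the reported SEM range of 0.86 to 2.47 points, this convention gives roughly 2.38 to 6.85 points, close to the reported MDC range of 2.39 to 6.86; the small gap would be expected if the published SEM values were rounded.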
Burns, C
1991-01-01
Pediatric nurse practitioners (PNPs) need an integrated, comprehensive classification that includes nursing, disease, and developmental diagnoses to effectively describe their practice. No such classification exists. Further, methodologic studies to help evaluate the content validity of any nursing taxonomy are unavailable. A conceptual framework was derived. Then 178 diagnoses from the North American Nursing Diagnosis Association (NANDA) 1986 list, selected diagnoses from the International Classification of Diseases, the Diagnostic and Statistical Manual, Third Revision, and others were selected. This framework identified and listed, with definitions, three domains of diagnoses: Developmental Problems, Diseases, and Daily Living Problems. The diagnoses were ranked using a 4-point scale (4 = highly related to 1 = not related) and were placed into the three domains. The rating scale was assigned by a panel of eight expert pediatric nurses. Diagnoses that were assigned to the Daily Living Problems domain were then sorted into the 11 Functional Health patterns described by Gordon (1987). Reliability was measured using proportions of agreement and Kappas. Content validity of the groups created was measured using indices of content validity and average congruency percentages. The experts used a new method to sort the diagnoses in a new way that decreased overlaps among the domains. The Developmental and Disease domains were judged reliable and valid. The Daily Living domain of nursing diagnoses showed marginally acceptable validity with acceptable reliability. Six Functional Health Patterns were judged reliable and valid, mixed results were determined for four categories, and the Coping/Stress Tolerance category was judged reliable but not valid using either test. There were considerable differences between the panel's, Gordon's (1987), and NANDA's clustering of NANDA diagnoses. 
This study defines the diagnostic practice of nurses from a holistic, patient-centered perspective. It is the first study to use quantitative methods to test a diagnostic classification system for nursing. The classification model could also be adapted for other nurse specialties.
Expert system verification and validation study. Delivery 3A and 3B: Trip summaries
NASA Technical Reports Server (NTRS)
French, Scott
1991-01-01
Key results are documented from attending the 4th workshop on verification, validation, and testing. The most interesting part of the workshop was when representatives from the U.S., Japan, and Europe presented surveys of VV&T within their respective regions. Another interesting part focused on current efforts to define industry standards for artificial intelligence and how that might affect approaches to VV&T of expert systems. The next part of the workshop focused on VV&T methods of applying mathematical techniques to verification of rule bases and techniques for capturing information relating to the process of developing software. The final part focused on software tools. A summary is also presented of the EPRI conference on 'Methodologies, Tools, and Standards for Cost Effective Reliable Software Verification and Validation'. The conference was divided into discussion sessions on the following issues: development process, automated tools, software reliability, methods, standards, and cost/benefit considerations.
Fabricant, Peter D; Robles, Alex; Downey-Zayas, Timothy; Do, Huong T; Marx, Robert G; Widmann, Roger F; Green, Daniel W
2013-10-01
Having simple and reliable validated outcome measures is vital to conducting high-quality outcomes research in the field of orthopaedic surgery. Activity level is a key prognostic variable for patients with sports injuries. There is a paucity of such activity scales for children and adolescents who are otherwise healthy and athletically active. In addition to frequency and intensity of athletic activity, level of play and coach/trainer supervision are important variables unique to children and adolescents that are not captured in available adult scoring systems. To create and validate a concise and comprehensive activity rating scale for athletically active children and adolescents 10 to 18 years of age. Cohort study (diagnosis); Level of evidence, 2. Item generation was performed with a panel of orthopaedic surgeons and adolescent athletes. Item reduction, pilot testing and scale refinement resulted in a final 8-item instrument, the Hospital for Special Surgery Pediatric Functional Activity Brief Scale (HSS Pedi-FABS). Existing methods were used to determine reliability and validation. The Flesch-Kincaid score was calculated at a 6.6th-grade reading level (approximately 13 years old); therefore, although all subjects provided their own answers, parents were allowed to assist children younger than 13 years with reading the questionnaire. Scale reliability was excellent (test-retest reliability, intraclass correlation coefficient = 0.91; internal consistency, Cronbach alpha = .914), and there were no floor or ceiling effects. There was also robust construct validity: Convergent validity testing revealed positive correlations between the HSS Pedi-FABS and level of competition in athletic activity, number of reported hours of athletic activity per week, and existing comparable adult and pediatric scales. Discriminant validity was shown with age, body mass index, and type of sport as measured by the Daniel scale. 
The 8-item HSS Pedi-FABS can be used to reliably and accurately evaluate activity level as a prognostic variable for clinical research studies. It is a simple, reliable, and valid metric to assess activity in children and adolescents 10 to 18 years of age. This instrument will lead to better evaluation of posttreatment outcomes and patient-reported activity for child and adolescent athletes.
2010-01-01
Background Given the growing interest of scientific and medical professionals in stress and health-related quality of life (HRQL), the aim of our research is to validate a Spanish version of the German Bad Sobernheim Stress Questionnaire (BSSQ) (mit Korsett) for adolescents wearing braces. Methods The methodology follows the literature on trans-cultural adaptation, using a translation and a back translation; it involved 35 adolescents, aged between 10 and 16, with Adolescent Idiopathic Scoliosis (AIS) and wearing the same kind of brace (Rigo System Chêneau Brace). The materials used were a socio-demographic data questionnaire, the SRS-22 and the Spanish version of the questionnaire, BSSQ(brace).es. The statistical analysis calculated the reliability (test-retest reliability and internal consistency) and the validity (convergent and construct validity) of the BSSQ(brace).es. Results BSSQ(brace).es is reliable because of its satisfactory internal consistency (Cronbach's alpha coefficient was 0.809, p < 0.001) and temporal stability (test-retest method with a Pearson correlation coefficient of 0.902 (p < 0.01)). It demonstrated convergent validity with SRS-22 since the Pearson correlation coefficient was 0.656 (p < 0.01). An Exploratory Principal Components Analysis found a latent structure based on two components that explain 60.8% of the variance. Conclusions BSSQ(brace).es is reliable and valid and can be used with Spanish adolescents to assess the stress level caused by the brace. PMID:20633253
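The internal consistency statistic reported above (Cronbach's alpha) can be computed from raw item scores. The following is a minimal Python sketch of the standard formula, not the authors' analysis code; the function name `cronbach_alpha` and the sample scores are hypothetical.

```python
from statistics import variance

def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score))
    Sample variances (n - 1 denominator) are used throughout.
    """
    k = len(scores[0])
    items = list(zip(*scores))                     # one tuple per item (column)
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_var_sum / total_var)

# Hypothetical data: 4 respondents answering 3 items.
example = [[2, 3, 3], [4, 4, 5], [1, 2, 1], [5, 5, 4]]
alpha = cronbach_alpha(example)
```

The function assumes complete data (no missing responses) and a total score with non-zero variance.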
Validity and reliability of a pilot scale for assessment of multiple system atrophy symptoms.
Matsushima, Masaaki; Yabe, Ichiro; Takahashi, Ikuko; Hirotani, Makoto; Kano, Takahiro; Horiuchi, Kazuhiro; Houzen, Hideki; Sasaki, Hidenao
2017-01-01
Multiple system atrophy (MSA) is a rare progressive neurodegenerative disorder for which a brief yet sensitive scale is required for use in clinical trials and general screening. We previously compared several scales for the assessment of MSA symptoms and devised an eight-item pilot scale with a large standardized response mean [handwriting, finger taps, transfers, standing with feet together, turning trunk, turning 360°, gait, body sway]. The aim of the present study is to investigate the validity and reliability of a simple pilot scale for assessment of multiple system atrophy symptoms. Thirty-two patients with MSA (15 male/17 female; 20 cerebellar subtype [MSA-C]/12 parkinsonian subtype [MSA-P]) were prospectively registered between January 1, 2014 and February 28, 2015. Patients were evaluated by two independent raters using the Unified MSA Rating Scale (UMSARS), Scale for Assessment and Rating of Ataxia (SARA), and the pilot scale. Correlations between UMSARS, SARA, pilot scale scores, intraclass correlation coefficients (ICCs), and Cronbach's alpha coefficients were calculated. Pilot scale scores significantly correlated with scores for UMSARS Parts I, II, and IV as well as with SARA scores. Intra-rater and inter-rater ICCs and Cronbach's alpha coefficients remained high (> 0.94) for all measures. The results of the present study indicate the validity and reliability of the eight-item pilot scale, particularly for the assessment of symptoms in patients with early-stage multiple system atrophy.
Mitchell, Katy; Graff, Megan; Hedt, Corbin; Simmons, James
2016-08-01
Purpose/hypothesis: This study was designed to investigate the test-retest reliability, concurrent validity, and the standard error of measurement (SEm) of a pulse rate assessment application (Azumio®'s Instant Heart Rate) on both Android® and iOS® (iPhone operating system) smartphones as compared to a FT7 Polar® Heart Rate monitor. Number of subjects: 111. Resting (sitting) pulse rate was assessed twice and then the participants were asked to complete a 1-min standing step test and then immediately re-assessed. The smartphone assessors were blinded to their measurements. Test-retest reliability (intraclass correlation coefficient [ICC 2,1] and 95% confidence interval) for the three tools at rest (time 1/time 2): iOS® (0.76 [0.67-0.83]); Polar® (0.84 [0.78-0.89]); and Android® (0.82 [0.75-0.88]). Concurrent validity at rest time 2 (ICC 2,1) with the Polar® device: iOS® (0.92 [0.88-0.94]) and Android® (0.95 [0.92-0.96]). Concurrent validity post-exercise (time 3) (ICC) with the Polar® device: iOS® (0.90 [0.86-0.93]) and Android® (0.94 [0.91-0.96]). The SEm values for the three devices at rest: iOS® (5.77 beats per minute [BPM]), Polar® (4.56 BPM) and Android® (4.96 BPM). The Android®, iOS®, and Polar® devices showed acceptable test-retest reliability at rest and post-exercise. Both the smartphone platforms demonstrated concurrent validity with the Polar® at rest and post-exercise. The Azumio® Instant Heart Rate application when used by either platform appears to be a reliable and valid tool to assess pulse rate in healthy individuals.
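The ICC(2,1) and SEm statistics named above can be illustrated as follows. This is a minimal Python sketch using the Shrout and Fleiss two-way random-effects, absolute-agreement, single-measure form; it is not the authors' code. The function names are hypothetical, and the SEm formula shown (SD times the square root of 1 minus ICC) is a common convention that the abstract does not confirm.

```python
from math import sqrt

def icc_2_1(data):
    """ICC(2,1) for an n-subjects x k-measurements table (list of rows)."""
    n, k = len(data), len(data[0])
    grand = sum(sum(row) for row in data) / (n * k)
    row_means = [sum(row) / k for row in data]
    col_means = [sum(col) / n for col in zip(*data)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between measurements
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_err = ss_total - ss_rows - ss_cols                    # residual
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

def standard_error_of_measurement(sd, icc):
    """SEm under the assumed convention SEm = SD * sqrt(1 - ICC)."""
    return sd * sqrt(1 - icc)

# Hypothetical resting pulse rates (BPM) at time 1 and time 2 for five people.
pulse = [[62, 64], [75, 73], [58, 60], [81, 80], [69, 72]]
reliability = icc_2_1(pulse)
```

Because ICC(2,1) measures absolute agreement, a systematic offset between the two measurement occasions lowers the coefficient even when the rank order of subjects is preserved.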
Reliability and validity of the instrument used in BRFSS to assess physical activity.
Yore, Michelle M; Ham, Sandra A; Ainsworth, Barbara E; Kruger, Judy; Reis, Jared P; Kohl, Harold W; Macera, Caroline A
2007-08-01
State-level statistics of adherence to the physical activity objectives in Healthy People 2010 are derived from the Behavioral Risk Factor Surveillance System (BRFSS) data. BRFSS physical activity questions were updated in 2001 to include domains of leisure time, household, and transportation-related activity of moderate- and vigorous intensity, and walking questions. This article reports the reliability and validity of these questions. The BRFSS Physical Activity Study (BPAS) was conducted from September 2000 to May 2001 in Columbia, SC. Sixty participants were followed for 22 d; they answered the physical activity questions three times via telephone, wore a pedometer and accelerometer, and completed a daily physical activity log for 1 wk. Measures for moderate, vigorous, recommended (i.e., met the criteria for moderate or vigorous), and strengthening activities were created according to Healthy People 2010 operational definitions. Reliability and validity were assessed using Cohen's kappa (kappa) and Pearson correlation coefficients. Seventy-three percent of participants met the recommended activity criteria compared with 45% in the total U.S. population. Test-retest reliability (kappa) was 0.35-0.53 for moderate activity, 0.80-0.86 for vigorous activity, 0.67-0.84 for recommended activity, and 0.85-0.92 for strengthening. Validity (kappa) of the survey (using the accelerometer as the standard) was 0.17-0.22 for recommended activity. Validity (kappa) of the survey (using the physical activity log as the standard) was 0.40-0.52 for recommended activity. The validity and reliability of the BRFSS physical activity questions suggests that this instrument can classify groups of adults into the levels of recommended and vigorous activity as defined by Healthy People 2010. Repeated administration of these questions over time will help to identify trends in physical activity.
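Cohen's kappa, the agreement statistic used in the BRFSS study above, corrects observed agreement for the agreement expected by chance. The following is a minimal Python sketch of the standard two-rater formula; the function name `cohens_kappa` and the example classifications are hypothetical and are not taken from the study data.

```python
from collections import Counter

def cohens_kappa(first, second):
    """Cohen's kappa for two equally long sequences of categorical labels.

    kappa = (p_observed - p_expected) / (1 - p_expected)
    """
    n = len(first)
    p_observed = sum(a == b for a, b in zip(first, second)) / n
    counts_first, counts_second = Counter(first), Counter(second)
    categories = set(first) | set(second)
    # Chance agreement: product of the two marginal proportions per category.
    p_expected = sum(counts_first[c] * counts_second[c] for c in categories) / n ** 2
    return (p_observed - p_expected) / (1 - p_expected)

# Hypothetical example: whether 50 people "met the recommended activity
# criteria" on two survey administrations (1 = yes, 0 = no).
survey_1 = [1] * 20 + [0] * 15 + [1] * 5 + [0] * 10
survey_2 = [1] * 20 + [0] * 15 + [0] * 5 + [1] * 10
kappa = cohens_kappa(survey_1, survey_2)
```

The function is undefined when chance agreement equals 1 (both raters use a single identical category), so real data should be checked for that case.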
Laerum, Hallvard; Faxvaag, Arild
2004-02-09
Background Evaluation is a challenging but necessary part of the development cycle of clinical information systems like the electronic medical records (EMR) system. It is believed that such evaluations should include multiple perspectives, be comparative and employ both qualitative and quantitative methods. Self-administered questionnaires are frequently used as a quantitative evaluation method in medical informatics, but very few validated questionnaires address clinical use of EMR systems. Methods We have developed a task-oriented questionnaire for evaluating EMR systems from the clinician's perspective. The key feature of the questionnaire is a list of 24 general clinical tasks. It is applicable to physicians of most specialties and covers essential parts of their information-oriented work. The task list appears in two separate sections, about EMR use and task performance using the EMR, respectively. By combining these sections, the evaluator may estimate the potential impact of the EMR system on health care delivery. The results may also be compared across time, site or vendor. This paper describes the development, performance and validation of the questionnaire. Its performance is shown in two demonstration studies (n = 219 and 80). Its content is validated in an interview study (n = 10), and its reliability is investigated in a test-retest study (n = 37) and a scaling study (n = 31). Results In the interviews, the physicians found the general clinical tasks in the questionnaire relevant and comprehensible. The tasks were interpreted concordant to their definitions. However, the physicians found questions about tasks not explicitly or only partially supported by the EMR systems difficult to answer. The two demonstration studies provided unambiguous results and low percentages of missing responses. In addition, criterion validity was demonstrated for a majority of task-oriented questions. 
Their test-retest reliability was generally high, and the non-standard scale was found symmetric and ordinal. Conclusion This questionnaire is relevant for clinical work and EMR systems, provides reliable and interpretable results, and may be used as part of any evaluation effort involving the clinician's perspective of an EMR system. PMID:15018620
[Evaluation of Suicide Risk Levels in Hospitals: Validity and Reliability Tests].
Macagnino, Sandro; Steinert, Tilman; Uhlmann, Carmen
2018-05-01
Examination of in-hospital suicide risk levels concerning their validity and their reliability. The internal suicide risk levels were evaluated in a cross-sectional study of 163 inpatients. A reliability check was performed by determining the interrater reliability of senior physician, therapist and the responsible nurse. Within the scope of the validity check, we conducted analyses of criterion validity and construct validity. For the total sample, an "acceptable" to "good" interrater reliability (Kendall's W = .77) of suicide risk levels was obtained. Schizophrenic disorders showed the lowest values; for personality disorders we found the highest level of interrater reliability. When examining the criterion validity, Item 9 of the BDI-II was substantially correlated with our suicide risk levels (ρ m = .54, p < .01). Within the scope of the construct validity check, affective disorders showed the highest correlation (ρ = .77), compatible also with "convergent validity". They differed from schizophrenic disorders, which showed the least concordance (ρ = .43). In-hospital suicide risk levels may represent an important contribution to the assessment of suicidal behavior of inpatients undergoing psychiatric treatment, owing to their overall good validity and reliability. © Georg Thieme Verlag KG Stuttgart · New York.
ERIC Educational Resources Information Center
Gillham, James; Woelfel, Joseph
1977-01-01
Describes the Galileo system of measurement operations including reliability and validity data. Illustrations of some of the relations between Galileo measures and traditional procedures are provided. (MH)
Educational testing validity and reliability in pharmacy and medical education literature.
Hoover, Matthew J; Jung, Rose; Jacobs, David M; Peeters, Michael J
2013-12-16
To evaluate the reliability and validity of educational testing reported in pharmacy education journals and compare it with that reported in the medical education literature. Descriptions of validity evidence sources (content, construct, criterion, and reliability) were extracted from articles that reported educational testing of learners' knowledge, skills, and/or abilities. Using educational testing, the findings of 108 pharmacy education articles were compared to the findings of 198 medical education articles. For pharmacy educational testing, 14 articles (13%) reported more than 1 validity evidence source while 83 articles (77%) reported 1 validity evidence source and 11 articles (10%) did not have evidence. Among validity evidence sources, content validity was reported most frequently. Compared with pharmacy education literature, more medical education articles reported both validity and reliability (59%; p<0.001). While there were more scholarship of teaching and learning (SoTL) articles in pharmacy education compared to medical education, validity and reliability reporting was limited in the pharmacy education literature.
Lohrer, H; Nauck, T
2010-06-01
The VISA-A questionnaire is currently the only valid, reliable, and disease specific patient administered questionnaire for research in Achilles tendinopathy. To perform multinational and multilingual investigations this instrument was already adapted to several languages. According to the "guidelines for the process of cross-cultural adaptation of self-report measures" we already translated and validated the VISA-A questionnaire for patients with Achilles tendinopathy. To cross-culturally adapt and validate the VISA-A Questionnaire for German-speaking patients suffering from Haglund's disease. The VISA-A-G questionnaire was tested for reliability, validity, and internal consistency in 39 Haglund's disease patients and 79 asymptomatic persons. For concurrent validity the VISA-A-G was compared with the Curwin and Stanish tendon grading system and with the Percy and Conochie classification system for the effect of pain on athletic performance. VISA-A-G results in Haglund's disease were additionally compared with VISA-A-G results obtained from Achilles tendinopathy patients and with VISA-A results presented in the international literature. ICC for the VISA-A-G questionnaire in conservatively treated Haglund's disease patients was 0.96. In asymptomatic students and joggers ICC was 0.97 and 0.60. When correlated with the grading system of Curwin and Stanish and with the Percy and Conochie classification rho was -0.95 and 0.94, respectively. Internal consistency (Cronbach's alpha) for the total VISA-A-G scores of the patients was calculated to be 0.87. Compared with VISA-A-G results obtained from Achilles tendinopathy patients there was no relevant difference discernible. Compared with VISA-A results presented in the original publication no difference was found statistically for students, healthy people, conservative, and preoperative patients, respectively. 
This study confirms that the VISA-A-G is a valid and reliable measure for German-speaking patients suffering from Haglund's disease. Georg Thieme Verlag KG Stuttgart, New York.
Optimizing preventive maintenance policy: A data-driven application for a light rail braking system.
Corman, Francesco; Kraijema, Sander; Godjevac, Milinko; Lodewijks, Gabriel
2017-10-01
This article presents a case study determining the optimal preventive maintenance policy for a light rail rolling stock system in terms of reliability, availability, and maintenance costs. The maintenance policy defines one of the three predefined preventive maintenance actions at fixed time-based intervals for each of the subsystems of the braking system. Based on work, maintenance, and failure data, we model the reliability degradation of the system and its subsystems under the current maintenance policy by a Weibull distribution. We then analytically determine the relation between reliability, availability, and maintenance costs. We validate the model against recorded reliability and availability and get further insights by a dedicated sensitivity analysis. The model is then used in a sequential optimization framework determining preventive maintenance intervals to improve on the key performance indicators. We show the potential of data-driven modelling to determine optimal maintenance policy: same system availability and reliability can be achieved with 30% maintenance cost reduction, by prolonging the intervals and re-grouping maintenance actions. PMID:29278245
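The Weibull model mentioned above describes reliability as a function of operating time. The following is a minimal Python sketch of the two-parameter Weibull reliability and hazard functions; it is an illustration under stated assumptions, not the authors' model. The function names and the shape and scale values are hypothetical, and the article's fitted parameters are not given in the abstract.

```python
from math import exp

def weibull_reliability(t, shape, scale):
    """Probability of surviving to time t: R(t) = exp(-(t / scale) ** shape)."""
    return exp(-((t / scale) ** shape))

def weibull_hazard(t, shape, scale):
    """Instantaneous failure rate: h(t) = (shape / scale) * (t / scale) ** (shape - 1)."""
    return (shape / scale) * (t / scale) ** (shape - 1)

# Hypothetical subsystem with wear-out behaviour (shape > 1) and a
# characteristic life of 400 operating days.
shape, scale = 2.0, 400.0
for interval in (90, 180, 360):
    # Reliability at the end of a maintenance interval, assuming the
    # subsystem is restored to as-good-as-new at each preventive action.
    print(interval, round(weibull_reliability(interval, shape, scale), 3))
```

A shape parameter above 1 gives an increasing hazard, which is the situation in which time-based preventive maintenance can pay off; with a shape of exactly 1 the hazard is constant and lengthening or shortening the interval does not change the failure rate.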
Hamre, Charlotta; Botolfsen, Pernille; Tangen, Gro Gujord; Helbostad, Jorunn L
2017-04-20
The Balance Evaluation Systems Test (BESTest) was developed to assess underlying systems for balance control in order to be able to individually tailor rehabilitation interventions to people with balance disorders. A short form, the Mini-BESTest, was developed as a screening test. The study aimed to assess interrater and test-retest reliability of the Norwegian version of the BESTest and the Mini-BESTest in community-dwelling people with increased risk of falling and to assess concurrent validity with the Fall Efficacy Scale-International (FES-I). It was an observational study with a cross-sectional design. Forty-two persons with increased risk of falling (elderly over 65 years of age, persons with a history of stroke or Multiple Sclerosis) were assessed twice by two raters. Relative reliability was analysed with Intraclass Correlation Coefficient (ICC), and absolute reliability with standard error of measurement (SEM) and smallest detectable change (SDC). Concurrent validity was assessed against the FES-I using Spearman's rho. The BESTest showed very good interrater reliability (ICC = 0.98, SEM = 1.79, SDC 95 = 5.0) and test-retest reliability (rater A/rater B = ICC = 0.89/0.89, SEM = 3.9/4.3, SDC 95 = 10.8/11.8). The Mini-BESTest also showed very good interrater reliability (ICC = 0.95, SEM = 1.19, SDC 95 = 3.3) and test-retest reliability (rater A/rater B = ICC = 0.85/0.84, SEM = 1.8/1.9, SDC 95 = 4.9/5.2). The correlations were moderate between the FES-I and both the BESTest and the Mini-BESTest (Spearman's rho -0.51 and -0.50, p < 0.01). The BESTest and its short form, the Mini-BESTest, showed very good interrater and test-retest reliability when assessed in a heterogeneous sample of people with increased risk of falling. The concurrent validity measured against the FES-I showed moderate correlation. The results are comparable with earlier studies and indicate that the Norwegian versions can be used in daily clinic and in research.
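The relation between SEM and the smallest detectable change can be sketched in Python. The abstract does not state the formula used, so the conventional one, SDC95 = 1.96 × √2 × SEM, is assumed here; the function name `sdc95` is hypothetical. With this assumption the interrater values reported above are reproduced to one decimal place.

```python
from math import sqrt

def sdc95(sem):
    """Smallest detectable change at the 95% level, assuming
    SDC95 = 1.96 * sqrt(2) * SEM (the sqrt(2) reflects that a change
    score involves two measurements, each carrying measurement error)."""
    return 1.96 * sqrt(2) * sem

# Interrater SEM values quoted in the abstract above.
print(round(sdc95(1.79), 1))  # BESTest
print(round(sdc95(1.19), 1))  # Mini-BESTest
```

Small differences for the test-retest figures are to be expected, since the SEM values in the abstract are themselves rounded.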
Reliability and validity of ten consumer activity trackers.
Kooiman, Thea J M; Dontje, Manon L; Sprenger, Siska R; Krijnen, Wim P; van der Schans, Cees P; de Groot, Martijn
2015-01-01
Activity trackers can potentially stimulate users to increase their physical activity behavior. The aim of this study was to examine the reliability and validity of ten consumer activity trackers for measuring step count in both laboratory and free-living conditions. Healthy adult volunteers (n = 33) walked twice on a treadmill (4.8 km/h) for 30 min while wearing ten different activity trackers (i.e. Lumoback, Fitbit Flex, Jawbone Up, Nike+ Fuelband SE, Misfit Shine, Withings Pulse, Fitbit Zip, Omron HJ-203, Yamax Digiwalker SW-200 and Moves mobile application). In free-living conditions, 56 volunteers wore the same activity trackers for one working day. Test-retest reliability was analyzed with the Intraclass Correlation Coefficient (ICC). Validity was evaluated by comparing each tracker with the gold standard (Optogait system for laboratory and ActivPAL for free-living conditions), using paired samples t-tests, mean absolute percentage errors, correlations and Bland-Altman plots. Test-retest analysis revealed high reliability for most trackers except for the Omron (ICC .14), Moves app (ICC .37) and Nike+ Fuelband (ICC .53). The mean absolute percentage errors of the trackers in laboratory and free-living conditions respectively, were: Lumoback (-0.2, -0.4), Fitbit Flex (-5.7, 3.7), Jawbone Up (-1.0, 1.4), Nike+ Fuelband (-18, -24), Misfit Shine (0.2, 1.1), Withings Pulse (-0.5, -7.9), Fitbit Zip (-0.3, 1.2), Omron (2.5, -0.4), Digiwalker (-1.2, -5.9), and Moves app (9.6, -37.6). Bland-Altman plots demonstrated that the limits of agreement varied from 46 steps (Fitbit Zip) to 2422 steps (Nike+ Fuelband) in the laboratory condition, and 866 steps (Fitbit Zip) to 5150 steps (Moves app) in the free-living condition. The reliability and validity of most trackers for measuring step count is good. The Fitbit Zip is the most valid whereas the reliability and validity of the Nike+ Fuelband is low.
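The percentage-error and Bland-Altman statistics used above can be sketched in Python. This is a minimal illustration, not the authors' analysis; the function names and step counts are hypothetical. The abstract labels its figures mean absolute percentage errors although several are negative, so both a signed and an absolute version are shown. The limits of agreement use the usual mean difference ± 1.96 standard deviations.

```python
from statistics import mean, stdev

def mean_percentage_error(measured, reference):
    """Signed mean percentage error of measured values against a reference."""
    return mean((m - r) / r * 100 for m, r in zip(measured, reference))

def mean_absolute_percentage_error(measured, reference):
    """Mean of the absolute percentage errors (always non-negative)."""
    return mean(abs(m - r) / r * 100 for m, r in zip(measured, reference))

def limits_of_agreement(measured, reference):
    """Bland-Altman 95% limits: mean difference +/- 1.96 * SD of differences."""
    diffs = [m - r for m, r in zip(measured, reference)]
    bias, spread = mean(diffs), stdev(diffs)
    return bias - 1.96 * spread, bias + 1.96 * spread

# Hypothetical step counts: tracker versus gold standard for four walks.
tracker = [3010, 2950, 3120, 2890]
gold = [3000, 3000, 3100, 2900]
lower, upper = limits_of_agreement(tracker, gold)
```

A signed error near zero can hide large over- and under-counts that cancel out, which is why the absolute error and the width of the limits of agreement are usually read together.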
Mantarova, Stefka G; Velcheva, Irena V; Georgieva, Spaska O; Stambolieva, Katerina I
2013-01-01
The last twenty years have witnessed a surge of interest in the autonomic symptoms in Parkinson's disease (PD) and the possibilities to diagnose and treat them. The specialized questionnaire assessing the autonomic symptoms in Parkinson's disease (SCOPA-AUT) has been validated and is available in English, Dutch and Spanish. In this study we aim at evaluating the validity, reliability and applicability of the Bulgarian version of SCOPA-AUT (SCOPA-AUT-BG). The study included 55 patients with idiopathic PD (mean age 64.4 +/- 8.9 yrs), and 40 healthy controls (mean age 58.5 +/- 9.4 yrs). Clinical severity and disease stage were assessed by the Unified Parkinson's Disease Rating Scale (UPDRS) and the Hoehn and Yahr scale (H&Y). Thirty-two of the PD patients completed SCOPA-AUT-BG again after a 7-day interval. Questionnaire reliability was analyzed by determining the internal consistency, homogeneity, discriminatory and construct validity and test-retest reliability. Analyses showed good internal consistency of the summary evaluation of SCOPA-AUT-BG (Cronbach's alpha coefficient = 0.79), which indicates the high reliability of the questionnaire. The lowest Cronbach's alpha coefficient (0.53) was found for the subscale "cardiovascular functions". A dominant role belongs to the subscales for gastrointestinal and urinary functions (Cronbach's alpha > 0.7), where a significantly high correlation of PD with the UPDRS scale was observed. We found high test-retest reliability based on the responses associated with dysfunction of the gastrointestinal, urinary, thermoregulatory and pupillary autonomic systems. The correlation of the results of SCOPA-AUT-BG with UPDRS is higher than that with H&Y, and the construct validity is high except for the cardiovascular and pupillomotor functions subscales. The results of this study show that SCOPA-AUT-BG is a valid and reliable specialized questionnaire to evaluate autonomic function in patients with Parkinson's disease. 
Using it allows for more detailed clinical evaluation of these patients and justifies the need to refer them to specialized examination of autonomic functions.
Sung, Ki Hyuk; Kwon, Soon-Sun; Narayanan, Unni G; Chung, Chin Youb; Lee, Kyoung Min; Lee, Seung Yeol; Lee, Damian J; Park, Moon Seok
2015-01-01
The aim of this study was to translate and transculturally adapt the Caregiver Priorities & Child Health Index of Life with Disabilities (CPCHILD) questionnaire into the Korean language, and to test the reliability and validity, including the internal consistency, known-groups validity and factor analysis, of the Korean version of the CPCHILD. A Korean version of CPCHILD was produced according to internationally accepted guidelines. For validity testing, 194 consecutive parents or caregivers of children with cerebral palsy (CP) were recruited and completed the questionnaire. Internal consistency, test-retest reliability, and known-groups validity were evaluated and factor analysis was performed to validate the Korean version of the CPCHILD. In terms of internal consistency, Cronbach's alpha was above 0.90 in all domains of the CPCHILD (range 0.921 to 0.966), except the 5th domain (0.628). In terms of known-groups validity, the total score of the CPCHILD was significantly different according to the Gross Motor Function Classification System (GMFCS) level (p < 0.001). Intra-class correlation coefficients ranged from 0.517 to 0.801. Factor analysis showed that the five-factor solution of the CPCHILD explained 76.7% of the variance, with 59.0, 6.5, 5.1, 4.2 and 3.2% of the variance explained by each component, respectively. The Korean version of CPCHILD was found to be a reliable and valid questionnaire of caregivers' perspectives on the health-related quality of life in severely affected children with CP. However, the Korean version of CPCHILD contains some redundant items, and factor analysis suggested a five-domain questionnaire. Implication for Rehabilitation The Korean version of CPCHILD is a reliable, internally consistent, valid instrument for assessing the health-related quality of life in severely affected children with CP from the perspective of caregivers. 
After the transcultural adaptation and validation of the Korean CPCHILD, it can be reliably used in clinical and research settings to evaluate the health-related quality of life in Korean patients with CP.
Translation and validation of the Dutch new Knee Society Scoring System ©.
Van Der Straeten, Catherine; Witvrouw, Erik; Willems, Tine; Bellemans, Johan; Victor, Jan
2013-11-01
A new version of The Knee Society Knee Scoring System(©) (KSS) has recently been developed. Before this scale can be used in non-English-speaking populations, it has to be translated and validated for a particular population. We evaluated the construct and content validity, the test-retest reliability, and the internal consistency of the Dutch version of the New Knee Society KSS. A Dutch translation was performed using a forward-backward translation protocol. We tested the construct validity of the Dutch New KSS by comparing it with the Dutch versions of the WOMAC, Knee Injury and Osteoarthritis Outcome Score (KOOS), and SF-12 scores in 137 patients undergoing total knee arthroplasty (TKA). Content validity was assessed by comparing pre- and postoperative scores and by checking floor and ceiling effects. To evaluate test-retest reliability and consistency, 47 patients completed the questionnaire a second time with a mean of 8 days interval (range, 2-20 days) between tests. Construct validity was demonstrated because the Dutch New KSS correlated well with the Dutch WOMAC (r = -0.751; p < 0.001), Dutch KOOS (r = -0.723; p < 0.001), and Dutch SF-12 (r = 0.569; p < 0.001). There was a significant difference between pre- and postoperative scores (p < 0.001) in line with the other scores. Test-retest reliability proved excellent with an intraclass correlation coefficient between 0.73 and 0.92 depending on the domain tested. Consistency as indicated by Cronbach's alpha ranging from 0.84 to 0.96 was good to excellent. As demonstrated by the validation procedure, the Dutch New KSS is an excellent instrument to evaluate TKA outcome in Dutch-speaking patients.
A self-report measure of legal and administrative aggression within intimate relationships.
Hines, Denise A; Douglas, Emily M; Berger, Joshua L
2015-01-01
Although experts agree that intimate partner violence (IPV) is a multidimensional phenomenon comprised of both physical and non-physical acts, there is no measure of legal and administrative (LA) forms of IPV. LA aggression is when one partner manipulates the legal and other administrative systems to the detriment of his/her partner. Our measure was developed using the qualitative literature on male IPV victims' experiences. We tested the reliability and validity of our LA aggression measure on two samples of men: 611 men who sustained IPV and sought help, and 1,601 men in a population-based sample. Construct validity of the victimization scale was supported through factor analyses, correlations with other forms of IPV victimization, and comparisons of the rates of LA aggression between the two samples; reliability was established through Cronbach's alpha. Evidence for the validity and reliability of the perpetration scale was mixed and therefore needs further analyses and revisions before we can recommend its use in empirical work. There is initial support for the victimization scale as a valid and reliable measure of LA aggression victimization among men, but work is needed using women's victimization experiences to establish reliability and validity of this measure for women. An LA aggression measure should be developed using LGBTQ victims' experiences, and for couples who are well into the divorce and child custody legal process. Legal personnel and practitioners should be educated on this form of IPV so that they can appropriately work with clients who have been victimized or perpetrate LA aggression. © 2014 Wiley Periodicals, Inc.
Park, Young-Jae; Lee, Jin-Moo; Yoo, Seung-Yeon; Park, Young-Bae
2016-04-01
To examine whether color parameters of tongue inspection (TI) using a digital camera were reliable and valid, and to examine which color parameters serve as predictors of symptom patterns in terms of East Asian medicine (EAM). Two hundred female subjects' tongue substances were photographed by a mega-pixel digital camera. Together with the photographs, the subjects were asked to complete Yin deficiency, Phlegm pattern, and Cold-Heat pattern questionnaires. Using three sets of digital imaging software, each digital image was exposure- and white balance-corrected, and finally L* (luminance), a* (red-green balance), and b* (yellow-blue balance) values of the tongues were calculated. To examine intra- and inter-rater reliabilities and criterion validity of the color analysis method, three raters were asked to calculate color parameters for 20 digital image samples. Finally, four hierarchical regression models were formed. Color parameters showed good or excellent reliability (0.627-0.887 for intra-class correlation coefficients) and significant criterion validity (0.523-0.718 for Spearman's correlation). In the hierarchical regression models, age was a significant predictor of Yin deficiency (β = 0.192), and b* value of the tip of the tongue was a determinant predictor of Yin deficiency, Phlegm, and Heat patterns (β = -0.212, -0.172, and -0.163). Luminance (L*) was predictive of Yin deficiency (β = -0.172) and Cold (β = 0.173) pattern. Our results suggest that color analysis of the tongue using the L*a*b* system is reliable and valid, and that color parameters partially serve as symptom pattern predictors in EAM practice.
A formal approach to validation and verification for knowledge-based control systems
NASA Technical Reports Server (NTRS)
Castore, Glen
1987-01-01
As control systems become more complex in response to desires for greater system flexibility, performance and reliability, the promise is held out that artificial intelligence might provide the means for building such systems. An obstacle to the use of symbolic processing constructs in this domain is the need for verification and validation (V and V) of the systems. Techniques currently in use do not seem appropriate for knowledge-based software. An outline of a formal approach to V and V for knowledge-based control systems is presented.
Validation Test Results for Orthogonal Probe Eddy Current Thruster Inspection System
NASA Technical Reports Server (NTRS)
Wincheski, Russell A.
2007-01-01
Recent nondestructive evaluation efforts within NASA have focused on an inspection system for the detection of intergranular cracking originating in the relief radius of Primary Reaction Control System (PRCS) Thrusters. Of particular concern is deep cracking in this area which could lead to combustion leakage in the event of through wall cracking from the relief radius into an acoustic cavity of the combustion chamber. In order to reliably detect such defects while ensuring minimal false positives during inspection, the Orthogonal Probe Eddy Current (OPEC) system has been developed and an extensive validation study performed. This report describes the validation procedure, sample set, and inspection results as well as comparing validation flaws with the response from naturally occurring damage.
Barrett, Eva; McCreesh, Karen; Lewis, Jeremy
2014-02-01
A wide array of instruments are available for non-invasive thoracic kyphosis measurement. Guidelines for selecting outcome measures for use in clinical and research practice recommend that properties such as validity and reliability are considered. This systematic review reports on the reliability and validity of non-invasive methods for measuring thoracic kyphosis. A systematic search of 11 electronic databases located studies assessing reliability and/or validity of non-invasive thoracic kyphosis measurement techniques. Two independent reviewers used a critical appraisal tool to assess the quality of retrieved studies. Data was extracted by the primary reviewer. The results were synthesized qualitatively using a level of evidence approach. 27 studies satisfied the eligibility criteria and were included in the review. The reliability, validity and both reliability and validity were investigated by sixteen, two and nine studies respectively. 17/27 studies were deemed to be of high quality. In total, 15 methods of thoracic kyphosis measurement were evaluated in retrieved studies. All investigated methods showed high (ICC ≥ .7) to very high (ICC ≥ .9) levels of reliability. The validity of the methods ranged from low to very high. The strongest levels of evidence for reliability exist in support of the Debrunner kyphometer, Spinal Mouse and Flexicurve index, and the strongest evidence for validity supports the arcometer and Flexicurve index. Further reliability and validity studies are required to strengthen the level of evidence for the remaining methods of measurement. This should be addressed by future research. Copyright © 2013 Elsevier Ltd. All rights reserved.
Janssen, Ellen M; Marshall, Deborah A; Hauber, A Brett; Bridges, John F P
2017-12-01
The recent endorsement of discrete-choice experiments (DCEs) and other stated-preference methods by regulatory and health technology assessment (HTA) agencies has placed a greater focus on demonstrating the validity and reliability of preference results. Areas covered: We present a practical overview of tests of validity and reliability that have been applied in the health DCE literature and explore other study qualities of DCEs. From the published literature, we identify a variety of methods to assess the validity and reliability of DCEs. We conceptualize these methods to create a conceptual model with four domains: measurement validity, measurement reliability, choice validity, and choice reliability. Each domain consists of three categories that can be assessed using one to four procedures (for a total of 24 tests). We present how these tests have been applied in the literature and direct readers to applications of these tests in the health DCE literature. Based on a stakeholder engagement exercise, we consider the importance of study characteristics beyond traditional concepts of validity and reliability. Expert commentary: We discuss study design considerations to assess the validity and reliability of a DCE, consider limitations to the current application of tests, and discuss future work to consider the quality of DCEs in healthcare.
Nikolaidis, Pantelis T; Clemente, Filipe M; van der Linden, Cornelis M I; Rosemann, Thomas; Knechtle, Beat
2018-01-01
The objectives of the present study were to examine the validity and reliability of the 10 Hz Johan GPS unit in assessing in-line movement and change of direction. The validity was tested against the criterion measure of 200 m track-and-field (track-and-field athletes, n = 8) and 20 m shuttle run endurance test (female soccer players, n = 20). Intra-unit and inter-unit reliability was tested by intra-class correlation coefficient (ICC) and coefficient of variation (CV), respectively. An analysis of variance examined differences between the GPS measurement and five laps of 200 m at 15 km/h, and a t-test examined differences between the GPS measurement and 20 m shuttle run endurance test. The difference between the GPS measurement and 200 m distance ranged from -0.13 ± 3.94 m (95% CI -3.42; 3.17) in the first lap to 2.13 ± 2.64 m (95% CI -0.08; 4.33) in the fifth lap. A good intra-unit reliability was observed in 200 m (ICC = 0.833, 95% CI 0.535; 0.962). Inter-unit CV ranged from 1.31% (fifth lap) to 2.20% (third lap). The difference between the GPS measurement and 20 m shuttle run endurance test ranged from 0.33 ± 4.16 m (95% CI -10.01; 10.68) in 11.5 km/h to 9.00 ± 5.30 m (95% CI 6.44; 11.56) in 8.0 km/h. A moderate intra-unit reliability was shown in the second and third stage of the 20 m shuttle run endurance test (ICC = 0.718, 95% CI 0.222; 0.898) and good reliability in the fifth, sixth, seventh and eighth (ICC = 0.831, 95% CI -0.229; 0.996). Inter-unit CV ranged from 2.08% (11.5 km/h) to 3.92% (8.5 km/h). Based on these findings, it was concluded that the 10 Hz Johan system offers an affordable, valid and reliable tool for coaches and fitness trainers to monitor training and performance.
Statistical modeling of software reliability
NASA Technical Reports Server (NTRS)
Miller, Douglas R.
1992-01-01
This working paper discusses the statistical simulation part of a controlled software development experiment being conducted under the direction of the System Validation Methods Branch, Information Systems Division, NASA Langley Research Center. The experiment uses guidance and control software (GCS) aboard a fictitious planetary landing spacecraft: real-time control software operating on a transient mission. Software execution is simulated to study the statistical aspects of reliability and other failure characteristics of the software during development, testing, and random usage. Quantification of software reliability is a major goal. Various reliability concepts are discussed. Experiments are described for performing simulations and collecting appropriate simulated software performance and failure data. This data is then used to make statistical inferences about the quality of the software development and verification processes as well as inferences about the reliability of software versions and reliability growth under random testing and debugging.
Cohen, Alysia; McDonald, Samantha; McIver, Kerry; Pate, Russell; Trost, Stewart
2014-05-01
The purpose of this study was to evaluate the validity and interrater reliability of the Observational System for Recording Activity in Children: Youth Sports (OSRAC:YS). Children (N = 29) participating in a parks and recreation soccer program were observed during regularly scheduled practices. Physical activity (PA) intensity and contextual factors were recorded by momentary time-sampling procedures (10-second observe, 20-second record). Two observers simultaneously observed and recorded children's PA intensity, practice context, social context, coach behavior, and coach proximity. Interrater reliability was based on agreement (Kappa) between the observer's coding for each category, and the Intraclass Correlation Coefficient (ICC) for percent of time spent in MVPA. Validity was assessed by calculating the correlation between OSRAC:YS estimated and objectively measured MVPA. Kappa statistics for each category demonstrated substantial to almost perfect interobserver agreement (Kappa = 0.67-0.93). The ICC for percent time in MVPA was 0.76 (95% C.I. = 0.49-0.90). A significant correlation (r = .73) was observed for MVPA recorded by observation and MVPA measured via accelerometry. The results indicate the OSRAC:YS is a reliable and valid tool for measuring children's PA and contextual factors during a youth soccer practice.
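The interobserver agreement above is reported as Kappa. A minimal sketch of Cohen's kappa for two observers coding the same intervals is given here for orientation; it assumes the standard two-rater formula, and the function name and the activity codes in the toy data are hypothetical rather than taken from the OSRAC:YS protocol:

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who coded the same observations."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must code the same non-empty set")
    n = len(rater_a)
    # Observed agreement: share of observations coded identically.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal category frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(count_a[c] * count_b[c] for c in count_a) / n ** 2
    return (p_o - p_e) / (1 - p_e)


# Hypothetical codes for four observation intervals.
obs_1 = ["sit", "sit", "walk", "walk"]
obs_2 = ["sit", "sit", "walk", "sit"]
kappa = cohen_kappa(obs_1, obs_2)  # 0.5
```

Here the observers agree on three of four intervals (0.75), but half of that agreement is expected by chance, so kappa is 0.5. The formula is undefined when chance agreement equals 1, for example when both raters use a single category throughout.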
Federal Register 2010, 2011, 2012, 2013, 2014
2012-09-13
...] Food and Drug Administration/American Glaucoma Society Workshop on the Validity, Reliability, and... entitled ``FDA/American Glaucoma Society (AGS) Workshop on the Validity, Reliability, and Usability of... research. The purpose of this public workshop is to provide a forum for discussing the validity...
Hulteen, Ryan M; Lander, Natalie J; Morgan, Philip J; Barnett, Lisa M; Robertson, Samuel J; Lubans, David R
2015-10-01
It has been suggested that young people should develop competence in a variety of 'lifelong physical activities' to ensure that they can be active across the lifespan. The primary aim of this systematic review is to report the methodological properties, validity, reliability, and test duration of field-based measures that assess movement skill competency in lifelong physical activities. A secondary aim was to clearly define those characteristics unique to lifelong physical activities. A search of four electronic databases (Scopus, SPORTDiscus, ProQuest, and PubMed) was conducted between June 2014 and April 2015 with no date restrictions. Studies addressing the validity and/or reliability of lifelong physical activity tests were reviewed. Included articles were required to assess lifelong physical activities using process-oriented measures, as well as report either one type of validity or reliability. Assessment criteria for methodological quality were adapted from a checklist used in a previous review of sport skill outcome assessments. Movement skill assessments for eight different lifelong physical activities (badminton, cycling, dance, golf, racquetball, resistance training, swimming, and tennis) in 17 studies were identified for inclusion. Methodological quality, validity, reliability, and test duration (time to assess a single participant), for each article were assessed. Moderate to excellent reliability results were found in 16 of 17 studies, with 71% reporting inter-rater reliability and 41% reporting intra-rater reliability. Only four studies in this review reported test-retest reliability. Ten studies reported validity results; content validity was cited in 41% of these studies. Construct validity was reported in 24% of studies, while criterion validity was only reported in 12% of studies. Numerous assessments for lifelong physical activities may exist, yet only assessments for eight lifelong physical activities were included in this review. 
Generalizability of results may be more applicable if more heterogeneous samples are used in future research. Moderate to excellent levels of inter- and intra-rater reliability were reported in the majority of studies. However, future work should look to establish test-retest reliability. Validity was less commonly reported than reliability, and further types of validity other than content validity need to be established in future research. Specifically, predictive validity of 'lifelong physical activity' movement skill competency is needed to support the assertion that such activities provide the foundation for a lifetime of activity.
Validity and Reliability of Turkish Male Breast Self-Examination Instrument.
Erkin, Özüm; Göl, İlknur
2018-04-01
This study aims to measure the validity and reliability of the Turkish male breast self-examination (MBSE) instrument. The methodological study was performed in 2016 at Ege University, Faculty of Nursing, İzmir, Turkey. The MBSE includes ten steps. For validity studies, face validity, content validity, and construct validity (exploratory factor analysis) were done. For reliability study, Kuder Richardson was calculated. The content validity index was found to be 0.94. Kendall W coefficient was 0.80 (p=0.551). The total variance explained by the two factors was found to be 63.24%. Kuder Richardson 21 was done for reliability study and found to be 0.97 for the instrument. The final instrument included 10 steps and two stages. The Turkish version of MBSE is a valid and reliable instrument for early diagnosis. The MBSE can be used in Turkish speaking countries and cultures with two stages and 10 steps.
Koontz, Alicia M.; Lin, Yen-Sheng; Kankipati, Padmaja; Boninger, Michael L.; Cooper, Rory A.
2017-01-01
This study describes a new custom measurement system designed to investigate the biomechanics of sitting-pivot wheelchair transfers and assesses the reliability of selected biomechanical variables. Variables assessed include horizontal and vertical reaction forces underneath both hands and three-dimensional trunk, shoulder, and elbow range of motion. We examined the reliability of these measures between 5 consecutive transfer trials for 5 subjects with spinal cord injury and 12 non-disabled subjects while they performed a self-selected sitting pivot transfer from a wheelchair to a level bench. A majority of the biomechanical variables demonstrated moderate to excellent reliability (r > 0.6). The transfer measurement system recorded reliable and valid biomechanical data for future studies of sitting-pivot wheelchair transfers. We recommend a minimum of five transfer trials to obtain a reliable measure of transfer technique for future studies. PMID:22068376
Mayo, Ann M
2015-01-01
It is important for CNSs and other APNs to consider the reliability and validity of instruments chosen for clinical practice, evidence-based practice projects, or research studies. Psychometric testing uses specific research methods to evaluate the amount of error associated with any particular instrument. Reliability estimates explain more about how well the instrument is designed, whereas validity estimates explain more about scores that are produced by the instrument. An instrument may be architecturally sound overall (reliable), but the same instrument may not be valid. For example, if a specific group does not understand certain well-constructed items, then the instrument does not produce valid scores when used with that group. Many instrument developers may conduct reliability testing only once, yet continue validity testing in different populations over many years. All CNSs should be advocating for the use of reliable instruments that produce valid results. Clinical nurse specialists may find themselves in situations where reliability and validity estimates for some instruments that are being utilized are unknown. In such cases, CNSs should engage key stakeholders to sponsor nursing researchers to pursue this most important work.
Gray, Aaron D; Willis, Brad W; Skubic, Marjorie; Huo, Zhiyu; Razu, Swithin; Sherman, Seth L; Guess, Trent M; Jahandar, Amirhossein; Gulbrandsen, Trevor R; Miller, Scott; Siesener, Nathan J
Noncontact anterior cruciate ligament (ACL) injury in adolescent female athletes is an increasing problem. The knee-ankle separation ratio (KASR), calculated at initial contact (IC) and peak flexion (PF) during the drop vertical jump (DVJ), is a measure of dynamic knee valgus. The Microsoft Kinect V2 has shown promise as a reliable and valid marker-less motion capture device. The Kinect V2 will demonstrate good to excellent correlation between KASR results at IC and PF during the DVJ, as compared with a "gold standard" Vicon motion analysis system. Descriptive laboratory study. Level 2. Thirty-eight healthy volunteer subjects (20 male, 18 female) performed 5 DVJ trials, simultaneously measured by a Vicon MX-T40S system, 2 AMTI force platforms, and a Kinect V2 with customized software. A total of 190 jumps were completed. The KASR was calculated at IC and PF during the DVJ. The intraclass correlation coefficient (ICC) assessed the degree of KASR agreement between the Kinect and Vicon systems. The ICCs of the Kinect V2 and Vicon KASR at IC and PF were 0.84 and 0.95, respectively, showing excellent agreement between the 2 measures. The Kinect V2 successfully identified the KASR at PF and IC frames in 182 of 190 trials, demonstrating 95.8% reliability. The Kinect V2 demonstrated excellent ICC of the KASR at IC and PF during the DVJ when compared with the Vicon system. A customized Kinect V2 software program demonstrated good reliability in identifying the KASR at IC and PF during the DVJ. Reliable, valid, inexpensive, and efficient screening tools may improve the accessibility of motion analysis assessment of adolescent female athletes.
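Many records in this listing summarize agreement between devices or raters with an intraclass correlation coefficient, usually without stating which ICC form was used. The sketch below is therefore only an illustration under assumptions: it computes two common single-measure forms from a two-way ANOVA decomposition, consistency (often written ICC(3,1)) and absolute agreement (often written ICC(2,1)). The function name and the toy data are hypothetical.

```python
def icc_two_way(ratings):
    """Single-measure two-way ICCs.

    ratings: list of n subjects, each a list of k measurements
    (one per device or rater). Returns (consistency, agreement).
    """
    n, k = len(ratings), len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    # Sums of squares for subjects (rows), raters (columns), and residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    consistency = (msr - mse) / (msr + (k - 1) * mse)
    agreement = (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)
    return consistency, agreement


# Toy data: the second device always reads exactly 1 unit higher.
consistency, agreement = icc_two_way([[1, 2], [2, 3], [3, 4]])
```

In the toy data the two devices rank subjects identically but differ by a constant offset, so the consistency form is close to 1 while the absolute-agreement form drops to about 0.67. This is one reason the ICC form matters when comparing a new device against a reference system.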
Smart Water Conservation System for Irrigated Landscape. ESTCP Cost and Performance Report
2016-10-01
water use by as much as 70% in support of meeting EO 13693. Additional performance objectives were to validate energy reduction, cost effectiveness... Additional performance objectives were to validate energy reduction, cost effectiveness, and system reliability while maintaining satisfactory plant health... developments. The demonstration was conducted for two different climatic regions in the southwestern part of the United States (U.S.), where a typical
ERIC Educational Resources Information Center
Sandilos, Lia E.
2012-01-01
The purpose of the current study was to evaluate the structural validity and stability of scores on a measure of global classroom quality, the Classroom Assessment Scoring System, Kindergarten-Third Grade (CLASS K-3; Pianta, La Paro, & Hamre, 2008). Using data from a sample of 417 kindergarten classrooms in the rural Southern and Mid-Atlantic…
DOE Office of Scientific and Technical Information (OSTI.GOV)
Carter, R.J.
1997-04-01
The primary purpose of the "modification and validation of an automotive data processing unit (DPU), compressed video system, and communications equipment" cooperative research and development agreement (CRADA) was to modify and validate both hardware and software, developed by Scientific Atlanta, Incorporated (S-A) for defense applications (e.g., rotary-wing airplanes), for the commercial sector surface transportation domain (i.e., automobiles and trucks). S-A also furnished a state-of-the-art compressed video digital storage and retrieval system (CVDSRS), and off-the-shelf data storage and transmission equipment to support the data acquisition system for crash avoidance research (DASCAR) project conducted by Oak Ridge National Laboratory (ORNL). In turn, S-A received access to hardware and technology related to DASCAR. DASCAR was subsequently removed completely and installation was repeated a number of times to gain an accurate idea of complete installation, operation, and removal of DASCAR. Upon satisfactory completion of the DASCAR construction and preliminary shakedown, ORNL provided NHTSA with an operational demonstration of DASCAR at their East Liberty, OH test facility. The demonstration included an on-the-road demonstration of the entire data acquisition system using NHTSA's test track. In addition, the demonstration also consisted of a briefing, containing the following: ORNL generated a plan for validating the prototype data acquisition system with regard to: removal of DASCAR from an existing vehicle, and installation and calibration in other vehicles; reliability of the sensors and systems; data collection and transmission process (data integrity); impact on the drivability of the vehicle and obtrusiveness of the system to the driver; data analysis procedures; conspicuousness of the vehicle to other drivers; and DASCAR installation and removal training and documentation. 
In order to identify any operational problems not captured by the systems testing and evaluation, the validation plan also addressed a short-term pilot research program to manipulate DASCAR under operational conditions using "naive" drivers. The effort exercised the full capabilities of the data acquisition system. ORNL subsequently evaluated and pilot tested the data acquisition system using the validation plan. The plan was implemented in full at the NHTSA East Liberty, OH test facility, and was carried out as a cooperative effort with the Vehicle Research and Test Center staff. ORNL determined the reliability of the sensors and systems by exercising DASCAR. For one vehicle type, ORNL evaluated systems reliability over a continuous period of 30 days with particular attention paid to maintenance of calibration and data integrity.
Loeding, B L; Greenan, J P
1998-12-01
The study examined the validity and reliability of four assessments, with three instruments per domain. Domains included generalizable mathematics, communication, interpersonal relations, and reasoning skills. Participants were deaf, legally blind, or visually impaired students enrolled in vocational classes at residential secondary schools. The researchers estimated the internal consistency reliability, test-retest reliability, and construct validity correlations of three subinstruments: student self-ratings, teacher ratings, and performance assessments. The data suggest that these instruments are highly internally consistent measures of generalizable vocational skills. Four performance assessments have high-to-moderate test-retest reliability estimates, and were generally considered to possess acceptable validity and reliability.
Internal Consistency, Retest Reliability, and their Implications For Personality Scale Validity
McCrae, Robert R.; Kurtz, John E.; Yamagata, Shinji; Terracciano, Antonio
2010-01-01
We examined data (N = 34,108) on the differential reliability and validity of facet scales from the NEO Inventories. We evaluated the extent to which (a) psychometric properties of facet scales are generalizable across ages, cultures, and methods of measurement; and (b) validity criteria are associated with different forms of reliability. Composite estimates of facet scale stability, heritability, and cross-observer validity were broadly generalizable. Two estimates of retest reliability were independent predictors of the three validity criteria; none of three estimates of internal consistency was. Available evidence suggests the same pattern of results for other personality inventories. Internal consistency of scales can be useful as a check on data quality, but appears to be of limited utility for evaluating the potential validity of developed scales, and it should not be used as a substitute for retest reliability. Further research on the nature and determinants of retest reliability is needed. PMID:20435807
Validity and Inter-Rater Reliability of a Novel Bedside Referral Tool for Spasticity
2018-02-20
Spasticity, Muscle; Muscular Diseases; Musculoskeletal Disease; Muscle Hypertonia; Muscle Spasticity; Neuromuscular Manifestations; Signs and Symptoms; Nervous System Diseases; Neurologic Manifestations
Bock, Astrid; Huber, Eva; Peham, Doris; Benecke, Cord
2015-01-01
The development (Study 1) and validation (Study 2) of a categorical system for the attribution of facial expressions of negative emotions to specific functions. The facial expressions observed in OPD interviews (OPD-Task-Force 2009) are coded according to the Facial Action Coding System (FACS; Ekman et al. 2002) and attributed to categories of basic emotional displays using EmFACS (Friesen & Ekman 1984). In Study 1 we analyze a partial sample of 20 interviews and postulate 10 categories of functions that can be arranged into three main categories (interactive, self and object). In Study 2 we rate the facial expressions (n=2320) from the OPD interviews (10 minutes per interview) of 80 female subjects (16 healthy, 64 with DSM-IV diagnosis; age: 18-57 years) according to the categorical system and correlate them with problematic relationship experiences (measured with IIP; Horowitz et al. 2000). Functions of negative facial expressions can be attributed reliably and validly with the RFE-Coding System. The attribution of interactive, self-related and object-related functions allows for a deeper understanding of the emotional facial expressions of patients with mental disorders.
NASA Technical Reports Server (NTRS)
Jacklin, Stephen; Schumann, Johann; Gupta, Pramod; Richard, Michael; Guenther, Kurt; Soares, Fola
2005-01-01
Adaptive control technologies that incorporate learning algorithms have been proposed to enable automatic flight control and vehicle recovery, autonomous flight, and to maintain vehicle performance in the face of unknown, changing, or poorly defined operating environments. In order for adaptive control systems to be used in safety-critical aerospace applications, they must be proven to be highly safe and reliable. Rigorous methods for adaptive software verification and validation must be developed to ensure that control system software failures will not occur. Of central importance in this regard is the need to establish reliable methods that guarantee convergent learning, rapid convergence (learning) rate, and algorithm stability. This paper presents the major problems of adaptive control systems that use learning to improve performance. The paper then presents the major procedures and tools presently developed or currently being developed to enable the verification, validation, and ultimate certification of these adaptive control systems. These technologies include the application of automated program analysis methods, techniques to improve the learning process, analytical methods to verify stability, methods to automatically synthesize code, simulation and test methods, and tools to provide on-line software assurance.
Aziz, M M; Galal, M A A; Elzohri, M H; El-Nouby, F; Leong, K P
2018-04-01
Systemic lupus erythematosus (SLE) is a chronic autoimmune disease which affects all aspects of quality of life (QoL) of the patients. Comprehensive patient assessment should include QoL measures in addition to the objective clinical measures of the disease. There is no specific Arabic instrument for assessment of QoL of SLE patients. The objective of this study was to translate and cross-culturally adapt the SLEQOL questionnaire into Arabic and test its reliability and validity. The SLEQOL questionnaire was translated into Arabic based on the Guidelines for Translation and Cross-cultural Adaptation into other languages. Reliability was assessed by interviewing patients three times: two interviews on the same day by different interviewers and the third interview 14 days later by one of the first interviewers. Validity was assessed by correlating SLEQOL scores of 91 patients with 36-item Short Form Health Survey (SF-36) scores and clinical parameters of the patients. We found that the Arabic version of SLEQOL has a Cronbach's alpha of 0.936 and interobserver and intraobserver correlation coefficients of 0.809 and 0.886, respectively. Strong correlations were also found between SLEQOL scores and SF-36 Physical and Mental Component summaries. In conclusion, the Arabic version of SLEQOL is a reliable and valid instrument for measuring QoL of Egyptian SLE patients.
Hotchkiss, David R; Aqil, Anwer; Lippeveld, Theo; Mukooyo, Edward
2010-07-03
Sound policy, resource allocation and day-to-day management decisions in the health sector require timely information from routine health information systems (RHIS). In most low- and middle-income countries, the RHIS is viewed as being inadequate in providing quality data and continuous information that can be used to help improve health system performance. In addition, there is limited evidence on the effectiveness of RHIS strengthening interventions in improving data quality and use. The purpose of this study is to evaluate the usefulness of the newly developed Performance of Routine Information System Management (PRISM) framework, which consists of a conceptual framework and associated data collection and analysis tools to assess, design, strengthen and evaluate RHIS. The specific objectives of the study are: a) to assess the reliability and validity of the PRISM instruments and b) to assess the validity of the PRISM conceptual framework. Facility- and worker-level data were collected from 110 health care facilities in twelve districts in Uganda in 2004 and 2007 using records reviews, structured interviews and self-administered questionnaires. The analysis procedures include Cronbach's alpha to assess internal consistency of selected instruments, test-retest analysis to assess the reliability and sensitivity of the instruments, and bivariate and multivariate statistical techniques to assess validity of the PRISM instruments and conceptual framework. Cronbach's alpha analysis suggests high reliability (0.7 or greater) for the indices measuring a promotion of a culture of information, RHIS tasks self-efficacy and motivation. The study results also suggest that a promotion of a culture of information influences RHIS tasks self-efficacy, RHIS tasks competence and motivation, and that self-efficacy and the presence of RHIS staff have a direct influence on the use of RHIS information, a key aspect of RHIS performance. 
The study results provide some empirical support for the reliability and validity of the PRISM instruments and the validity of the PRISM conceptual framework, suggesting that the PRISM approach can be effectively used by RHIS policy makers and practitioners to assess the RHIS and evaluate RHIS strengthening interventions. However, additional studies with larger sample sizes are needed to further investigate the value of the PRISM instruments in exploring the linkages between RHIS data quality and use, and health systems performance.
Study on the Validity and Reliability of Melbourne Decision Making Scale in Turkey
ERIC Educational Resources Information Center
Çolakkadioglu, Oguzhan; Deniz, M. Engin
2015-01-01
This study analyzes the validity and reliability of the Melbourne Decision Making Questionnaire (MDMQ). The sample consisted of 650 university students. The structural validity of the MDMQ, as well as correlations among its sub-scales, measure-bound validity, internal consistency, item total correlations and test-retest reliability coefficients…
A Model for Estimating the Reliability and Validity of Criterion-Referenced Measures.
ERIC Educational Resources Information Center
Edmonston, Leon P.; Randall, Robert S.
A decision model designed to determine the reliability and validity of criterion referenced measures (CRMs) is presented. General procedures which pertain to the model are discussed as to: Measures of relationship, Reliability, Validity (content, criterion-oriented, and construct validation), and Item Analysis. The decision model is presented in…
Children's Reaction to Types of Television. Technical Report No. 28.
ERIC Educational Resources Information Center
Hines, Brainard W.
An observational system having high inter-rater reliability and providing a reliable estimate of patterns of behavior across time periods is developed and tested for use in evaluating children's responses to a number of television styles and modes of presentation. This project was designed to meet three goals: first, to develop a valid and…
A method of measuring three-dimensional scapular attitudes using the optotrak probing system.
Hébert, L J; Moffet, H; McFadyen, B J; St-Vincent, G
2000-01-01
To develop a method to obtain accurate three-dimensional scapular attitudes and to assess their concurrent validity and reliability. In this methodological study, the three-dimensional scapular attitudes were calculated in degrees, using a rotation matrix (cyclic Cardanic sequence), from spatial coordinates obtained with the probing of three noncollinear landmarks first on an anatomical model and second on a healthy subject. Although abnormal movement of the scapula is related to shoulder impingement syndrome, it is not clearly understood whether or not scapular motion impairment is a predisposing factor. Characterization of three-dimensional scapular attitudes in planes and at joint angles for which sub-acromial impingement is more likely to occur is not known. The Optotrak probing system was used. An anatomical model of the scapula was built and allowed us to impose scapular attitudes of known direction and magnitude. A local coordinate reference system was defined with three noncollinear anatomical landmarks to assess accuracy and concurrent validity of the probing method with fixed markers. Axial rotation angles were calculated from a rotation matrix using a cyclic Cardanic sequence of rotations. The same three noncollinear body landmarks were digitized on one healthy subject and the three dimensional scapular attitudes obtained were compared between sessions in order to assess the reliability. The measure of three dimensional scapular attitudes calculated from data using the Optotrak probing system was accurate with means of the differences between imposed and calculated rotation angles ranging from 1.5 degrees to 4.2 degrees. Greatest variations were observed around the third axis of the Cardanic sequence associated with posterior-anterior transverse rotations. The mean difference between the Optotrak probing system method and fixed markers was 1.73 degrees showing a good concurrent validity.
Differences between the two methods were generally very low for one and two direction displacements and the largest discrepancies were observed for imposed displacements combining movement about the three axes. The between-session variation of three dimensional scapular attitudes was less than 10% for most of the arm positions adopted by a healthy subject, suggesting good reliability. The Optotrak probing system used with a standardized protocol led to accurate, valid and reliable measures of scapular attitudes. Although abnormal range of motion of the scapula is often related to shoulder pathologies, reliable outcome measures to quantify three-dimensional scapular motion on subjects are not available. It is important to establish a standardized protocol to characterize three-dimensional scapular motion on subjects using a method for which the accuracy and validity are known. The method used in the present study has provided such a protocol and will now make it possible to verify to what extent scapular motion impairment is linked to the development of specific shoulder pathologies.
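The accuracy check described above (impose a known attitude, recover the rotation angles from the rotation matrix, compare) can be sketched as below. The abstract does not state which cyclic Cardanic sequence was used, so the X-Y-Z order here is an assumption, as are all function names and the imposed angles; the sketch only shows the general technique of extracting Cardan angles from a rotation matrix.

```python
import math


def rot_x(a):
    c, s = math.cos(a), math.sin(a)
    return [[1, 0, 0], [0, c, -s], [0, s, c]]


def rot_y(b):
    c, s = math.cos(b), math.sin(b)
    return [[c, 0, s], [0, 1, 0], [-s, 0, c]]


def rot_z(g):
    c, s = math.cos(g), math.sin(g)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]


def matmul(p, q):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def cardan_xyz(r):
    """Angles (a, b, c) in radians such that r = Rx(a) * Ry(b) * Rz(c).

    Assumed X-Y-Z sequence; valid away from gimbal lock (|b| < 90 degrees).
    """
    b = math.asin(max(-1.0, min(1.0, r[0][2])))
    a = math.atan2(-r[1][2], r[2][2])
    c = math.atan2(-r[0][1], r[0][0])
    return a, b, c


# Impose a known attitude (illustrative values), then recover it.
imposed = [math.radians(d) for d in (10.0, 20.0, 30.0)]
r = matmul(matmul(rot_x(imposed[0]), rot_y(imposed[1])), rot_z(imposed[2]))
recovered = cardan_xyz(r)
errors_deg = [abs(math.degrees(x - y)) for x, y in zip(imposed, recovered)]
print("recovery error per axis (degrees):", errors_deg)
```

With noise-free input the recovery error is at floating-point level; in a measurement setting like the one described, the differences between imposed and calculated angles would instead reflect digitizing error in the landmark coordinates.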
ERIC Educational Resources Information Center
Lincove, Jane Arnold; Osborne, Cynthia; Dillon, Amanda; Mills, Nicholas
2014-01-01
Despite questions about validity and reliability, the use of value-added estimation methods has moved beyond academic research into state accountability systems for teachers, schools, and teacher preparation programs (TPPs). Prior studies of value-added measurement for TPPs test the validity of researcher-designed models and find that measuring…
Validity and reliability of the BOD POD® S/T tracking system.
Tseh, W; Caputo, J L; Keefer, D J
2010-10-01
BOD POD(®) self-testing (S/T) body composition tracking system is a practical assessment tool designed for use in the health and fitness industries. Relative to its parent counterpart, the BOD POD(®) S/T has received little research attention. The primary purpose was to determine the validity of the BOD POD(®) S/T against hydrostatic weighing and 7-site skinfolds. The secondary aim was to determine the within-day and between-day reliability of the BOD POD(®) S/T. After a period of equipment and testing accommodation, volunteers' (N=50) body composition (%BF) via 7-site skinfolds, BOD POD(®) S/T, and hydrostatic weighing were obtained on the second and third visits. BOD POD(®) S/T significantly overestimated %BF when compared to hydrostatic weighing and 7-site skinfolds. There was no statistical difference between 7-site skinfolds and hydrostatic weighing values. Within-day and between-day reliability of the BOD POD(®) S/T was high. While the BOD POD(®) S/T body composition tracking system is deemed reliable both within-day and between-days, it did significantly overestimate %BF in comparison to hydrostatic weighing and skinfolds. Future research should be aimed at deriving a correction factor for this body composition assessment tool. © Georg Thieme Verlag KG Stuttgart · New York.
Carlozzi, Noelle E; Hanks, Robin; Lange, Rael T; Brickell, Tracey A; Ianni, Phillip A; Miner, Jennifer A; French, Louis M; Kallen, Michael A; Sander, Angelle M
2018-06-19
To provide important reliability and validity data to support the use of the PROMIS Mental Health measures in caregivers of civilians or service members/veterans with traumatic brain injury (TBI). Patient-reported outcomes surveys administered through an electronic data collection platform. Three TBI Model Systems rehabilitation hospitals, an academic medical center, and a military medical treatment facility. 560 caregivers of individuals with a documented TBI (344 civilians and 216 military). Intervention: not applicable. Main outcome measures: PROMIS Anxiety, Depression, and Anger Item Banks. Results: Internal consistency for all of the PROMIS Mental Health item banks was very good (all α > .86) and three-week test-retest reliability was good to adequate (ranged from .65 to .85). Convergent validity and discriminant validity of the PROMIS measures were also supported. Caregivers of individuals that were low functioning had worse emotional HRQOL (as measured by the three PROMIS measures) than caregivers of high functioning individuals, supporting known groups validity. Finally, levels of distress, as measured by the PROMIS measures, were elevated for those caring for low-functioning individuals in both samples (rates ranged from 26.2% to 43.6% for caregivers of low-functioning individuals). Results support the reliability and validity of the PROMIS Anxiety, Depression, and Anger item banks in caregivers of civilians and service members/veterans with TBI. Ultimately, these measures can be used to provide a standardized assessment of HRQOL as it relates to mental health in these caregivers. Copyright © 2018. Published by Elsevier Inc.
Wang, Chang-Hwai; Lee, Jin-Chuan; Yuan, Yu-Hsi
2014-01-01
The purpose of this research is to establish and verify the psychometric and structural properties of the self-report Chinese Sexual Assault Symptom Scale (C-SASS) to assess the trauma experienced by Chinese victims of sexual assault. An earlier version of the C-SASS was constructed using a modified list of the same trauma symptoms administered to an American sample and used to develop and validate the Sexual Assault Symptom Scale II (SASS II). The rationale of this study is to revise the earlier version of the C-SASS, using a larger and more representative sample and more robust statistical analysis than in earlier research, to permit a more thorough examination of the instrument and further confirm the dimensions of sexual assault trauma in Chinese victims of rape. In this study, a sample of 418 victims from northern Taiwan was collected to confirm the reliability and validity of the C-SASS. Exploratory factor analysis yielded five common factors: Safety Fears, Self-Blame, Health Fears, Anger and Emotional Lability, and Fears About the Criminal Justice System. Further tests of the validity and composite reliability of the C-SASS were provided by the structural equation modeling (SEM). The results indicated that the C-SASS was a brief, valid, and reliable instrument for assessing sexual assault trauma among Chinese victims in Taiwan. The scale can be used to evaluate victims in sexual assault treatment centers around Taiwan, as well as to capture the characteristics of sexual assault trauma among Chinese victims.
Validity and reliability of Nike + Fuelband for estimating physical activity energy expenditure.
Tucker, Wesley J; Bhammar, Dharini M; Sawyer, Brandon J; Buman, Matthew P; Gaesser, Glenn A
2015-01-01
The Nike + Fuelband is a commercially available, wrist-worn accelerometer used to track physical activity energy expenditure (PAEE) during exercise. However, validation studies assessing the accuracy of this device for estimating PAEE are lacking. Therefore, this study examined the validity and reliability of the Nike + Fuelband for estimating PAEE during physical activity in young adults. Secondarily, we compared PAEE estimation of the Nike + Fuelband with the previously validated SenseWear Armband (SWA). Twenty-four participants (n = 24) completed two, 60-min semi-structured routines consisting of sedentary/light-intensity, moderate-intensity, and vigorous-intensity physical activity. Participants wore a Nike + Fuelband and SWA, while oxygen uptake was measured continuously with an Oxycon Mobile (OM) metabolic measurement system (criterion). The Nike + Fuelband (ICC = 0.77) and SWA (ICC = 0.61) both demonstrated moderate to good validity. PAEE estimates provided by the Nike + Fuelband (246 ± 67 kcal) and SWA (238 ± 57 kcal) were not statistically different than OM (243 ± 67 kcal). Both devices also displayed similar mean absolute percent errors for PAEE estimates (Nike + Fuelband = 16 ± 13 %; SWA = 18 ± 18 %). Test-retest reliability for PAEE indicated good stability for Nike + Fuelband (ICC = 0.96) and SWA (ICC = 0.90). The Nike + Fuelband provided valid and reliable estimates of PAEE, that are similar to the previously validated SWA, during a routine that included approximately equal amounts of sedentary/light-, moderate- and vigorous-intensity physical activity.
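The mean absolute percent error reported above compares each device estimate with the criterion measurement for the same session. A minimal sketch follows; the function name and the kcal values are made up for illustration and are not data from the study.

```python
def mean_absolute_percent_error(estimates, criterion):
    """Average of |estimate - criterion| / |criterion|, expressed in percent."""
    if not estimates or len(estimates) != len(criterion):
        raise ValueError("inputs must be non-empty and of equal length")
    ratios = [abs(e - c) / abs(c) for e, c in zip(estimates, criterion)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical energy expenditure values (kcal) for four sessions.
device = [250.0, 210.0, 300.0, 190.0]
reference = [240.0, 230.0, 280.0, 200.0]
print(f"MAPE = {mean_absolute_percent_error(device, reference):.1f} %")
```

Because errors are taken in absolute value, over- and underestimates do not cancel, which is why a device can match the criterion on average while still showing a MAPE in the mid-teens.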
Krause, Fabian G; Di Silvestro, Matthew; Penner, Murray J; Wing, Kevin J; Glazebrook, Mark A; Daniels, Timothy R; Lau, Johnny T C; Younger, Alastair S E
2012-02-01
End-stage ankle arthritis is operatively treated with numerous designs of total ankle replacement and different techniques for ankle fusion. For superior comparison of these procedures, outcome research requires a classification system to stratify patients appropriately. A postoperative 4-type classification system was designed by 6 fellowship-trained foot and ankle surgeons. Four surgeons reviewed blinded patient profiles and radiographs on 2 occasions to determine the interobserver and intraobserver reliability of the classification. Excellent interobserver reliability (κ = .89) and intraobserver reproducibility (κ = .87) were demonstrated for the postoperative classification system. In conclusion, the postoperative Canadian Orthopaedic Foot and Ankle Society (COFAS) end-stage ankle arthritis classification system appears to be a valid tool to evaluate the outcome of patients operated for end-stage ankle arthritis.
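The κ values above are chance-corrected agreement statistics. The abstract does not say which multi-rater variant was used for the four surgeons, so the sketch below shows only the basic two-rater Cohen's kappa as an illustration; the function name and the example ratings are hypothetical.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning categories to the same cases."""
    if not rater_a or len(rater_a) != len(rater_b):
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed proportion of cases on which the raters agree.
    p_observed = sum(x == y for x, y in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's category frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    p_expected = sum(count_a[c] * count_b[c] for c in count_a) / n ** 2
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical classification types (1 to 4) assigned to eight patients.
first_pass = [1, 2, 2, 3, 4, 1, 3, 2]
second_pass = [1, 2, 2, 3, 4, 1, 4, 2]
print(f"kappa = {cohen_kappa(first_pass, second_pass):.2f}")
```

The same function applies to intraobserver reproducibility if the two lists hold one rater's classifications from two occasions.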
Improving the Selection, Classification, and Utilization of Army Enlisted Personnel. Project A
1987-06-01
performance measures, to determine whether the new predictors have incremental validity over and above the present system. These two components must be...critical aspect of this task is the demonstration of the incremental validity added by new predictors. Task 3. Measurement of School/Training Success...chances of incremental validity and classification efficiency. 3. Retain measures with adequate reliability. Using all accumulated information, the final
Development, reliability, and validity of the My Child's Play (MCP) questionnaire.
Schneider, Eleanor; Rosenblum, Sara
2014-01-01
This article describes the development, reliability, and validity of My Child's Play (MCP), a parent questionnaire designed to evaluate the play of children ages 3-9 yr. The first phase of the study determined the questionnaire's content and face validity. Subsequently, the internal reliability consistency and construct and concurrent validity were demonstrated using 334 completed questionnaires. The MCP showed good internal consistency (α = .86). The factor analysis revealed four distinct factors with acceptable levels of internal reliability (Cronbach's αs = .63-.81) and gender- and age-related differences in play characteristics; both findings attest to the tool's construct validity. Significant correlations (r = .33, p < .0001) with the Parent as a Teacher Inventory demonstrate the MCP's concurrent validity. The MCP demonstrated acceptable reliability and validity. It appears to be a promising standardized assessment tool for use in research and practice to promote understanding of a child's play. Copyright © 2014 by the American Occupational Therapy Association, Inc.
Carlozzi, Noelle E; Ianni, Phillip A; Tulsky, David S; Brickell, Tracey A; Lange, Rael T; French, Louis M; Cella, David; Kallen, Michael A; Miner, Jennifer A; Kratz, Anna L
2018-06-19
To examine the reliability and validity of Patient Reported Outcomes Measurement Information System (PROMIS) measures of sleep disturbance and fatigue in TBI caregivers and to determine the severity of fatigue and sleep disturbance in these caregivers. Cross-sectional survey data collected through an online data capture platform. Four rehabilitation hospitals and Walter Reed National Military Medical Center. Caregivers (N=560) of civilians (n=344) and service member/veterans (n=216) with TBI. Intervention: not applicable. Main outcome measures: PROMIS sleep and fatigue measures administered as both computerized adaptive tests (CATs) and 4-item short forms (SFs). For both samples, floor and ceiling effects for the PROMIS measures were low (<11%), internal consistency was very good (all alphas ≥0.80), and test-retest reliability was acceptable (all r≥0.70 except for the fatigue CAT in the service member/veteran sample r=0.63). Convergent validity was supported by moderate correlations between the PROMIS and related measures. Discriminant validity was supported by low correlations between PROMIS measures and measures of dissimilar constructs. PROMIS scores indicated significantly worse sleep and fatigue for those caring for someone with high levels versus low levels of impairment. Findings support the reliability and validity of the PROMIS CAT and SF measures of sleep disturbance and fatigue in caregivers of civilians and service members/veterans with TBI. Copyright © 2018. Published by Elsevier Inc.
Rantalainen, Timo; Gastin, Paul B; Spangler, Rhys; Wundersitz, Daniel
2018-09-01
The purpose of the present study was to evaluate the concurrent validity and test-retest repeatability of torso-worn IMU-derived power and jump height in a counter-movement jump test. Twenty-seven healthy recreationally active males (age, 21.9 [SD 2.0] y, height, 1.76 [0.7] m, mass, 73.7 [10.3] kg) wore an IMU and completed three counter-movement jumps a week apart. A force platform and a 3D motion analysis system were used to concurrently measure the jumps and subsequently derive power and jump height (based on take-off velocity and flight time). The IMU significantly overestimated power (mean difference = 7.3 W/kg; P < 0.001) compared to force-platform-derived power but good correspondence between methods was observed (Intra-class correlation coefficient [ICC] = 0.69). IMU-derived power exhibited good reliability (ICC = 0.67). Velocity-derived jump heights exhibited poorer concurrent validity (ICC = 0.72 to 0.78) and repeatability (ICC = 0.68) than flight-time-derived jump heights, which exhibited excellent validity (ICC = 0.93 to 0.96) and reliability (ICC = 0.91). Since jump height and power are closely related, and flight-time-derived jump height exhibits excellent concurrent validity and reliability, flight-time-derived jump height could provide a more desirable measure compared to power when assessing athletic performance in a counter-movement jump with IMUs.
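The intra-class correlation coefficients above can be computed from a two-way ANOVA decomposition of a subjects-by-measurements table. The abstract does not state which ICC form was used, so the sketch below assumes ICC(3,1) (two-way mixed effects, consistency, single measurement); the function name and the jump heights are hypothetical.

```python
def icc_consistency(scores):
    """ICC(3,1) for a table of n subjects (rows) by k measurements (columns)."""
    n, k = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]
    # Partition the total sum of squares into subjects, measurements, error.
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)


# Hypothetical jump heights (cm) for five subjects: [IMU, force platform].
heights = [
    [31.0, 30.2],
    [27.5, 28.1],
    [35.2, 34.0],
    [29.8, 29.9],
    [33.1, 32.4],
]
print(f"ICC(3,1) = {icc_consistency(heights):.2f}")
```

A consistency ICC ignores a constant offset between methods, so a systematic overestimate such as the one reported for IMU-derived power would need to be checked separately, for example with the mean difference.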
Soldier Dimensions in Combat Models
1990-05-07
and performance. Questionnaires, SQTs, and ARTEPs were often used. Many scales had estimates of reliability but few had validity data. Most studies...pending its validation. Research plans were provided for applications in simulated combat and with simulation devices, for data previously gathered...regarding reliability and validity. Lack of information following an instrument indicates neither reliability nor validity information was provided by the
Methodology Series Module 9: Designing Questionnaires and Clinical Record Forms – Part II
Setia, Maninder Singh
2017-01-01
This article is a continuation of the previous module on designing questionnaires and clinical record form in which we have discussed some basic points about designing the questionnaire and clinical record forms. In this section, we will discuss the reliability and validity of questionnaires. The different types of validity are face validity, content validity, criterion validity, and construct validity. The different types of reliability are test-retest reliability, inter-rater reliability, and intra-rater reliability. Some of these parameters are assessed by subject area experts. However, statistical tests should be used for evaluation of other parameters. Once the questionnaire has been designed, the researcher should pilot test the questionnaire. The items in the questionnaire should be changed based on the feedback from the pilot study participants and the researcher's experience. After the basic structure of the questionnaire has been finalized, the researcher should assess the validity and reliability of the questionnaire or the scale. If an existing standard questionnaire is translated in the local language, the researcher should assess the reliability and validity of the translated questionnaire, and these values should be presented in the manuscript. The decision to use a self- or interviewer-administered, paper- or computer-based questionnaire depends on the nature of the questions, literacy levels of the target population, and resources. PMID:28584367
Lou, Yanni; Lu, Linghui; Li, Yuan; Liu, Meng; Bredle, Jason M; Jia, Liqun
2015-10-01
The study objective was to determine the reliability and validity of the Chinese version of the Functional Assessment of Chronic Illness Therapy - Ascites Index (FACIT-AI). A forward-backward translation procedure was adopted to develop the Chinese version of the FACIT-AI, which was tested in 69 patients with malignant ascites. Cronbach's α, split-half reliability, and test-retest reliability were used to assess the reliability of the scale. The content validity index was used to assess the content validity, while factor analysis was used for construct validity and correlation analysis was used for criterion validity. The Cronbach's α was 0.772 for the total scale, and the split-half reliability was 0.693. The test-retest correlation was 0.972. The content validity index for the scale was 0.8-1.0. Four factors were extracted by factor analysis, and these contributed 63.51% of the total variance. Item-total correlations ranged from 0.591 to 0.897, and these were correlated with visual analog scale scores (correlation coefficient, 0.889; P<0.01). The Chinese version of the FACIT-AI has good reliability and validity and can be used as a tool to measure quality of life in Chinese patients with malignant ascites.
Mani, Suresh; Sharma, Shobha; Omar, Baharudin; Paungmali, Aatit; Joseph, Leonard
2017-04-01
Purpose: The purpose of this review is to systematically explore and summarise the validity and reliability of telerehabilitation (TR)-based physiotherapy assessment for musculoskeletal disorders. Method: A comprehensive systematic literature review was conducted using a number of electronic databases (PubMed, EMBASE, PsycINFO, Cochrane Library and CINAHL) for articles published between January 2000 and May 2015. Studies that examined the validity and the inter- and intra-rater reliabilities of TR-based physiotherapy assessment for musculoskeletal conditions were included. Two independent reviewers used the Quality Appraisal Tool for studies of diagnostic Reliability (QAREL) and the Quality Assessment of Diagnostic Accuracy Studies (QUADAS) tool to assess the methodological quality of reliability and validity studies respectively. Results: A total of 898 hits were achieved, of which 11 articles meeting the inclusion criteria were reviewed. Nine studies explored the concurrent validity, inter- and intra-rater reliabilities, while two studies examined only the concurrent validity. Reviewed studies were moderate to good in methodological quality. The physiotherapy assessments such as pain, swelling, range of motion, muscle strength, balance, gait and functional assessment demonstrated good concurrent validity. However, the reported concurrent validity of lumbar spine posture, special orthopaedic tests, neurodynamic tests and scar assessments ranged from low to moderate. Conclusion: TR-based physiotherapy assessment was technically feasible with overall good concurrent validity and excellent reliability, except for lumbar spine posture, orthopaedic special tests, neurodynamic tests and scar assessment.
Kwon, Sungjun; Kim, Jeehoon; Kang, Seungwoo; Lee, Youngki; Baek, Hyunjae
2014-01-01
We propose CardioGuard, a brassiere-based reliable electrocardiogram (ECG) monitoring sensor system, for supporting daily smartphone healthcare applications. It is designed to satisfy two key requirements for user-unobtrusive daily ECG monitoring: reliability of ECG sensing and usability of the sensor. The system is validated through extensive evaluations. The evaluation results showed that the CardioGuard sensor reliably measures the ECG during 12 representative daily activities including diverse movement levels; 89.53% of QRS peaks were detected on average. The questionnaire-based user study with 15 participants showed that the CardioGuard sensor was comfortable and unobtrusive. Additionally, the signal-to-noise ratio test and the washing durability test were conducted to show the high-quality sensing of the proposed sensor and its physical durability in practical use, respectively. PMID:25405527
Sensor validation and fusion for gas turbine vibration monitoring
NASA Astrophysics Data System (ADS)
Yan, Weizhong; Goebel, Kai F.
2003-08-01
Vibration monitoring is an important practice throughout regular operation of gas turbine power systems and, even more so, during characterization tests. Vibration monitoring relies on accurate and reliable sensor readings. To obtain accurate readings, sensors are placed such that the signal is maximized. In the case of characterization tests, strain gauges are placed at the location of vibration modes on blades inside the gas turbine. Due to the prevailing harsh environment, these sensors have a limited life and decaying accuracy, both of which impair vibration assessment. At the same time bandwidth limitations may restrict data transmission, which in turn limits the number of sensors that can be used for assessment. Knowing the sensor status (normal or faulty), and more importantly, knowing the true vibration level of the system all the time is essential for successful gas turbine vibration monitoring. This paper investigates a dynamic sensor validation and system health reasoning scheme that addresses the issues outlined above by considering only the information required to reliably assess system health status. In particular, if abnormal system health is suspected or if the primary sensor is determined to be faulted, information from available "sibling" sensors is dynamically integrated. A confidence expresses the complex interactions of sensor health and system health, their reliabilities, conflicting information, and what the health assessment is. Effectiveness of the scheme in achieving accurate and reliable vibration evaluation is then demonstrated using a combination of simulated data and a small sample of a real-world application data where the vibration of compressor blades during a real time characterization test of a new gas turbine power system is monitored.
Validation of LOC-I interventions
DOT National Transportation Integrated Search
2012-08-13
The basic tenet of this paper is that today's national airspace systems, at least in advanced industrial countries, qualify as so-called Highly Reliable Systems (HRS). In an HRS, even the type of accident that causes the most fatalities is a rare e...
Singh, Aparna; Singh, Girish; Patwardhan, Kishor; Gehlot, Sangeeta
2017-01-01
According to Ayurveda, the traditional system of healthcare of Indian origin, Agni is the factor responsible for digestion and metabolism. Four functional states (Agnibala) of Agni have been recognized: regular, irregular, intense, and weak. The objective of the present study was to develop and validate a self-assessment tool to estimate Agnibala. The developed tool was evaluated for its reliability and validity by administering it to 300 healthy volunteers of either gender belonging to the 18 to 40-year age group. Besides confirming the statistical validity and reliability, the practical utility of the newly developed tool was also evaluated by recording serum lipid parameters of all the volunteers. The results show that the lipid parameters vary significantly according to the status of Agni. The tool, therefore, may be used to screen the normal population to look for possible susceptibility to certain health conditions. © The Author(s) 2016.
Spanish-language screening scales: A critical review.
Torres-Castro, S; Mena-Montes, B; González-Ambrosio, G; Zubieta-Zavala, A; Torres-Carrillo, N M; Acosta-Castillo, G I; Espinel-Bermúdez, M C
2018-05-09
Dementia is a chronic, degenerative disease with a strong impact on families and health systems. The instruments currently in use for measuring cognitive impairment have different psychometric characteristics in terms of application time, cut-off point, reliability, and validity. The objective of this review is to describe the characteristics of the validated, Spanish-language versions of the Mini-Cog, Clock-Drawing Test, and Mini-Mental State Examination scales for cognitive impairment screening. We performed a three-stage literature search of articles published on Medline since 1953. We selected articles on validated, Spanish-language versions of the scales that included data on reliability, validity, sensitivity, and specificity. The 3 screening tools assessed in this article provide support for primary care professionals. Timely identification of mild cognitive impairment and dementia is crucial for the prognosis of these patients. Copyright © 2018 Sociedad Española de Neurología. Publicado por Elsevier España, S.L.U. All rights reserved.
Corner, E J; Wood, H; Englebretsen, C; Thomas, A; Grant, R L; Nikoletou, D; Soni, N
2013-03-01
To develop a scoring system to measure physical morbidity in critical care - the Chelsea Critical Care Physical Assessment Tool (CPAx). The development process was iterative involving content validity indices (CVI), a focus group and an observational study of 33 patients to test construct validity against the Medical Research Council score for muscle strength, peak cough flow, Australian Therapy Outcome Measures score, Glasgow Coma Scale score, Bloomsbury sedation score, Sequential Organ Failure Assessment score, Short Form 36 (SF-36) score, days of mechanical ventilation and inter-rater reliability. Trauma and general critical care patients from two London teaching hospitals. Users of the CPAx felt that it possessed content validity, giving a final CVI of 1.00 (P<0.05). Construct validation data showed moderate to strong significant correlations between the CPAx score and all secondary measures, apart from the mental component of the SF-36 which demonstrated weak correlation with the CPAx score (r=0.024, P=0.720). Reliability testing showed internal consistency of α=0.798 and inter-rater reliability of κ=0.988 (95% confidence interval 0.791 to 1.000) between five raters. This pilot work supports proof of concept of the CPAx as a measure of physical morbidity in the critical care population, and is a cogent argument for further investigation of the scoring system. Copyright © 2012 Chartered Society of Physiotherapy. Published by Elsevier Ltd. All rights reserved.
Assessing reliability and validity measures in managed care studies.
Montoya, Isaac D
2003-01-01
To review the reliability and validity literature and develop an understanding of these concepts as applied to managed care studies. Reliability is a test of how well an instrument measures the same input at varying times and under varying conditions. Validity is a test of how accurately an instrument measures what one believes is being measured. A review of reliability and validity instructional material was conducted. Studies of managed care practices and programs abound. However, many of these studies utilize measurement instruments that were developed for other purposes or for a population other than the one being sampled. In other cases, instruments have been developed without any testing of the instrument's performance. The lack of reliability and validity information may limit the value of these studies. This is particularly true when data are collected for one purpose and used for another. The usefulness of certain studies without reliability and validity measures is questionable, especially in cases where the literature contradicts itself.
van Oostveen, Catharina J; Ubbink, Dirk T; Mens, Marian A; Pompe, Edwin A; Vermeulen, Hester
2016-03-01
To investigate the reliability, validity and feasibility of the RAFAELA workforce planning system (including the Oulu patient classification system - OPCq), before deciding on implementation in Dutch hospitals. The complexity of care, budgetary restraints and demand for high-quality patient care have ignited the need for transparent hospital workforce planning. Nurses from 12 wards of two university hospitals were trained to test the reliability of the OPCq by investigating the absolute agreement of nursing care intensity (NCI) measurements among nurses. Validity was tested by assessing whether optimal NCI/nurse ratio, as calculated by a regression analysis in RAFAELA, was realistic. System feasibility was investigated through a questionnaire among all nurses involved. Almost 67 000 NCI measurements were performed between December 2013 and June 2014. Agreement using the OPCq varied between 38% and 91%. For only 1 in 12 wards was the optimal NCI area calculated judged as valid. Although the majority of respondents was positive about the applicability and user-friendliness, RAFAELA was not accepted as useful workforce planning system. Nurses' performance using the RAFAELA system did not warrant its implementation. Hospital managers should first focus on enlarging the readiness of nurses regarding the implementation of a workforce planning system. © 2015 John Wiley & Sons Ltd.
Making Best Practice Standard--And Lasting
ERIC Educational Resources Information Center
Stringfield, Sam; Reynolds, David; Schaffer, Eugene
2012-01-01
If our schools were a research project, we'd say that while some aspects of our schools are highly valid, our overall academic systems aren't reliable. If educators are to meet the challenge of leaving no child behind, schools will have to improve with much greater reliability. In the mid-1990s, a group of British secondary schools decided to work…
The reliability and validity of a three-camera foot image system for obtaining foot anthropometrics.
O'Meara, Damien; Vanwanseele, Benedicte; Hunt, Adrienne; Smith, Richard
2010-08-01
The purpose was to develop a foot image capture and measurement system with web cameras (the 3-FIS) to provide reliable and valid foot anthropometric measures with efficiency comparable to that of the conventional method of using a handheld anthropometer. Eleven foot measures were obtained from 10 subjects using both methods. Reliability of each method was determined over 3 consecutive days using the intraclass correlation coefficient and root mean square error (RMSE). Reliability was excellent for both the 3-FIS and the handheld anthropometer for the same 10 variables, and good for the fifth metatarsophalangeal joint height. The RMSE values over 3 days ranged from 0.9 to 2.2 mm for the handheld anthropometer, and from 0.8 to 3.6 mm for the 3-FIS. The RMSE values between the 3-FIS and the handheld anthropometer were between 2.3 and 7.4 mm. The 3-FIS required less time to collect and obtain the final variables than the handheld anthropometer. The 3-FIS provided accurate and reproducible results for each of the foot variables and in less time than the conventional approach of a handheld anthropometer.
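As an illustration of the between-method RMSE reported above, the following is a minimal sketch in Python. The function name `rmse` and the sample readings are hypothetical and are not taken from the study; the abstract does not state exactly how the authors pooled values across days or subjects.

```python
import math

def rmse(method_a, method_b):
    """Root mean square error between paired measurements (same units as the inputs)."""
    if len(method_a) != len(method_b) or not method_a:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - b) ** 2 for a, b in zip(method_a, method_b)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical foot-length readings in mm (illustrative only, not study data).
anthropometer_mm = [251.0, 262.5, 240.0, 255.5]
image_system_mm = [253.0, 260.5, 243.0, 254.5]
print(round(rmse(anthropometer_mm, image_system_mm), 2))
```

Because the RMSE is expressed in the measurement's own units (mm here), it can be read directly against the tolerance a clinician is willing to accept.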
Spieler, E A; Barth, P S; Burton, J F; Himmelstein, J; Rudolph, L
2000-01-26
The American Medical Association's Guides to the Evaluation of Permanent Impairment, Fourth Edition, is the most commonly used tool in the United States for rating permanent impairments for disability systems. The Guides, currently undergoing revision, has been the focus of considerable controversy. Criticisms have focused on 2 areas: internal deficiencies, including the lack of a comprehensive, valid, reliable, unbiased, and evidence-based system for rating impairments; and the way in which workers' compensation systems use the ratings, resulting in inappropriate compensation. We focus on the internal deficiencies and recommend that the Guides remains a tool for evaluation of permanent impairment, not disability. To maintain wide acceptance of the Guides, its authors need to improve the validity, internal consistency, and comprehensiveness of the ratings; document reliability and reproducibility of the results; and make the Guides easily comprehensible and accessible to physicians.
Merk, Josef; Schlotz, Wolff; Falter, Thomas
2017-01-01
This study presents a new measure of value systems, the Motivational Value Systems Questionnaire (MVSQ), which is based on a theory of value systems by psychologist Clare W. Graves. The purpose of the instrument is to help people identify their personal hierarchies of value systems and thus become more aware of what motivates and demotivates them in work-related contexts. The MVSQ is a forced-choice (FC) measure, making it quicker to complete and more difficult to intentionally distort, but also more difficult to assess its psychometric properties due to ipsativity of FC data compared to rating scales. To overcome limitations of ipsative data, a Thurstonian IRT (TIRT) model was fitted to the questionnaire data, based on a broad sample of N = 1,217 professionals and students. Comparison of normative (IRT) scale scores and ipsative scores suggested that MVSQ IRT scores are largely freed from restrictions due to ipsativity and thus allow interindividual comparison of scale scores. Empirical reliability was estimated using a sample-based simulation approach which showed acceptable and good estimates and, on average, slightly higher test-retest reliabilities. Further, validation studies provided evidence on both construct validity and criterion-related validity. Scale score correlations and associations of scores with both age and gender were largely in line with theoretically- and empirically-based expectations, and results of a multitrait-multimethod analysis support convergent and discriminant construct validity. Criterion validity was assessed by examining the relation of value system preferences to departmental affiliation, which revealed significant relations in line with prior hypothesizing. These findings demonstrate the good psychometric properties of the MVSQ and support its application in the assessment of value systems in work-related contexts. PMID:28979228
Reliability of Long-Term Wave Conditions Predicted with Data Sets of Short Duration
1985-03-01
the validity and reliability of predicted probable wave heights obtained from data of limited duration. BACKGROUND: The basic steps listed by...interest to perform the analysis outlined in steps 2 to 5, the prediction would only be reliable for up to a 3-year return period. For a 5-year data set...for long-term hindcast data. The data retrieval and analysis program known as the Sea State Engineering Analysis System (SEAS) makes handling of the
Francis, Heather M; Osborne-Crowley, Katherine; McDonald, Skye
2017-01-01
To describe the reliability and validity of a new measure, the Social Skills Questionnaire for Traumatic Brain Injury (SSQ-TBI). Fifty-one adults with severe TBI completed the SSQ-TBI questionnaire. Scores were compared to informant- and self-report on questionnaires addressing frontal lobe mediated behaviour, as well as performance on an objective measure of social cognition and neuropsychological tasks, in order to provide evidence of concurrent, divergent and predictive validity. Internal consistency was excellent at α = 0.90. Convergent validity was good, with informant ratings on the SSQ-TBI significantly correlated with Neuropsychiatric Inventory Disinhibition sub-scales (r = 0.50-63), the Current Behaviour Scale (r = 0.39-0.48) and Frontal Systems Behaviour Scale (r = 0.60-0.83). However, no relationship was seen with an objective measure of social skills or neuropsychological tasks of disinhibition. There was a significant relationship with real-world psychosocial outcomes on the Sydney Psychosocial Reintegration Scale-2 (r = -0.38 to -0.69). Conclusions: This study provides preliminary findings of good internal consistency and convergent and predictive validity of a social skills questionnaire adapted to be appropriate for individuals with TBI. Further assessment of psychometric properties such as test-retest reliability and factor structure is warranted.
Kim, Hee-Ju; Abraham, Ivo
2017-01-01
Evidence is needed on the clinicometric properties of single-item or short measures as alternatives to comprehensive measures. We examined whether two single-item fatigue measures (i.e., Likert scale, numeric rating scale) or a short fatigue measure were comparable to a comprehensive measure in reliability (i.e., internal consistency and test-retest reliability) and validity (i.e., convergent, concurrent, and predictive validity) in Korean young adults. For this quantitative study, we selected the Functional Assessment of Chronic Illness Therapy-Fatigue for the comprehensive measure and the Profile of Mood States-Brief, Fatigue subscale for the short measure; and constructed two single-item measures. A total of 368 students from four nursing colleges in South Korea participated. We used Cronbach's alpha and item-total correlation for internal consistency reliability and intraclass correlation coefficient for test-retest reliability. We assessed Pearson's correlation with a comprehensive measure for convergent validity, with perceived stress level and sleep quality for concurrent validity and the receiver operating characteristic curve for predictive validity. The short measure was comparable to the comprehensive measure in internal consistency reliability (Cronbach's alpha=0.81 vs. 0.88); test-retest reliability (intraclass correlation coefficient=0.66 vs. 0.61); convergent validity (r with comprehensive measure=0.79); concurrent validity (r with perceived stress=0.55, r with sleep quality=0.39) and predictive validity (area under curve=0.88). Single-item measures were not comparable to the comprehensive measure. A short fatigue measure exhibited similar levels of reliability and validity to the comprehensive measure in Korean young adults. Copyright © 2016 Elsevier Ltd. All rights reserved.
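The internal consistency statistic named above, Cronbach's alpha, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' analysis: the function name `cronbach_alpha` and the response matrix are hypothetical, rows are respondents and columns are items, and sample variances (n - 1 denominator) are used.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha for a respondents-by-items score matrix.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    item_variance_sum = sum(variance(column) for column in zip(*rows))
    total_variance = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)

# Hypothetical 5 respondents x 3 items on a 1-5 scale (illustrative only).
responses = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [5, 5, 4],
    [3, 3, 4],
]
print(round(cronbach_alpha(responses), 3))
```

Alpha rises as items covary more strongly relative to their individual variances. A single-item measure has no alpha at all, so internal consistency can only be estimated for multi-item measures.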
Minor, M A; Reid, J C; Griffin, J Z; Pittman, C B; Patrick, T B; Cutts, J H
1998-02-01
To identify innovative strategies to support appropriate, self-directed exercise that increase physical activity levels of people with arthritis. This article reports on one interactive, multimedia exercise performance support system (PSS) for people with lower extremity impairments in strength or flexibility. An interdisciplinary team developed the PSS using self-report of lower extremity musculoskeletal impairments (flexibility and strength) to produce an individualized exercise program with video and print educational materials. Initial evaluation has investigated the validity and reliability of program assessments and recommendations. PSS self-report and professional assessments were similar, with more impairments indicated by self-report. PSS exercise recommendations were similar to those made by 3 expert physical therapists using the same exercise data base. Results of PSS impairment assessments were stable over a 1-week period. PSS exercise recommendations appear to be reliable and a valid reflection of current exercise knowledge in rheumatology. Furthermore, users were able to complete the computer-based program with minimal assistance and reported it to be enjoyable and informative.
Borotikar, Bhushan; Lempereur, Mathieu; Lelievre, Mathieu; Burdin, Valérie; Ben Salem, Douraied; Brochard, Sylvain
2017-01-01
To report evidence for the concurrent validity and reliability of dynamic MRI techniques to evaluate in vivo joint and muscle mechanics, and to propose recommendations for their use in the assessment of normal and impaired musculoskeletal function. The search was conducted on articles published in Web of science, PubMed, Scopus, Academic search Premier, and Cochrane Library between 1990 and August 2017. Studies that reported the concurrent validity and/or reliability of dynamic MRI techniques for in vivo evaluation of joint or muscle mechanics were included after assessment by two independent reviewers. Selected articles were assessed using an adapted quality assessment tool and a data extraction process. Results for concurrent validity and reliability were categorized as poor, moderate, or excellent. Twenty articles fulfilled the inclusion criteria with a mean quality assessment score of 66% (±10.4%). Concurrent validity and/or reliability of eight dynamic MRI techniques were reported, with the knee being the most evaluated joint (seven studies). Moderate to excellent concurrent validity and reliability were reported for seven out of eight dynamic MRI techniques. Cine phase contrast and real-time MRI appeared to be the most valid and reliable techniques to evaluate joint motion, and spin tag for muscle motion. Dynamic MRI techniques are promising for the in vivo evaluation of musculoskeletal mechanics; however results should be evaluated with caution since validity and reliability have not been determined for all joints and muscles, nor for many pathological conditions.
Uysal, Hilal; Ozcan, Şeyda
2011-06-01
In recent years, many new measuring instruments have been developed to allow broader psychometric measurement in coronary artery disease, disease-specific health status measurement, and assessment of broader quality of life. The study was intended to determine whether, and to what extent, the MIDAS is a valid and reliable measure for patients in Turkey experiencing a myocardial infarction for the first time. The research was conducted with patients hospitalized and treated for myocardial infarction in the cardiology departments of 2 hospitals in Istanbul, Turkey, between 2007 and 2008. Psychometric evaluations of the TR-MIDAS were used for the validity studies: language validity, content validity, and construct validity were examined. For the reliability studies, the tool's internal consistency reliability (Cronbach's alpha reliability coefficient) and test-retest reliability were assessed. The instrument's content validity index was determined to be 0.95. Principal component analysis revealed six factors with an eigenvalue >1.5. Cronbach's alpha was 0.89 for the total scale, which was an acceptable value. The test-retest reliability of the total score was 0.51 (p<0.01). The data obtained support the Turkish Myocardial Infarction Dimensional Assessment Scale as a valid and reliable disease-specific instrument for assessing quality of life in patients with myocardial infarction in Turkey. Copyright © 2010 European Society of Cardiology. Published by Elsevier B.V. All rights reserved.
A Performance Appraisal System for School Principals.
ERIC Educational Resources Information Center
Knoop, Robert; Common, Ronald W.
The Performance Review, Analysis, and Improvement System for Educators (PRAISE) is a formative evaluation instrument designed to improve the performance of school principals. The system appears to be reliable and valid and is flexible enough to accommodate the needs of a variety of schools. Sample items and categories of the instrument include…
DOE Office of Scientific and Technical Information (OSTI.GOV)
Strons, Philip; Bailey, James L.; Davis, John
2016-03-01
In this work, we apply CFD to the modeling of airflow and particulate transport. This modeling is then compared to field validation studies to both inform and validate the modeling assumptions. Based on the results of field tests, modeling assumptions and boundary conditions are refined and the process is repeated until the results are found to be reliable with a high level of confidence.
ERIC Educational Resources Information Center
Ware, John E.; And Others
This paper summarizes findings of a program of evaluation research. In this research the authors assumed that reliable and valid measurement of consumers' attitudes and beliefs about various aspects of medical care is possible and that more valid evaluation systems could be achieved if consumer perceptions are carefully considered during…
ERIC Educational Resources Information Center
Kaya, Osman Nafiz; Kilic, Ziya
2004-01-01
The student-centered approach to scoring concept maps consisted of three elements, namely a symbol system, an individual portfolio, and a scoring scheme. We scored student-constructed concept maps based on 5 concept map criteria: validity of concepts, adequacy of propositions, significance of cross-links, relevancy of examples, and interconnectedness. With…
Validity-Supporting Evidence of the Self-Efficacy for Teaching Mathematics Instrument
ERIC Educational Resources Information Center
McGee, Jennifer R.; Wang, Chuang
2014-01-01
The purpose of this study is to provide evidence of reliability and validity of the Self-Efficacy for Teaching Mathematics Instrument (SETMI). Self-efficacy, as defined by Bandura, was the theoretical framework for the development of the instrument. The complex belief systems of mathematics teachers, as touted by Ernest provided insights into the…
Longitudinal Models of Reliability and Validity: A Latent Curve Approach.
ERIC Educational Resources Information Center
Tisak, John; Tisak, Marie S.
1996-01-01
Dynamic generalizations of reliability and validity that will incorporate longitudinal or developmental models, using latent curve analysis, are discussed. A latent curve model formulated to depict change is incorporated into the classical definitions of reliability and validity. The approach is illustrated with sociological and psychological…
Scoring Rubric Development: Validity and Reliability.
ERIC Educational Resources Information Center
Moskal, Barbara M.; Leydens, Jon A.
2000-01-01
Provides clear definitions of the terms "validity" and "reliability" in the context of developing scoring rubrics and illustrates these definitions through examples. Also clarifies how validity and reliability may be addressed in the development of scoring rubrics, defined as descriptive scoring schemes developed to guide the analysis of the…
RELIABILITY OF CONFOCAL MICROSCOPY SPECTRAL IMAGING SYSTEMS: USE OF MULTISPECTRAL BEADS
Background: There is a need for a standardized, impartial calibration, and validation protocol on confocal spectral imaging (CSI) microscope systems. To achieve this goal, it is necessary to have testing tools to provide a reproducible way to evaluate instrument performance. ...
Griswold, David; Rockwell, Kyle; Killa, Carri; Maurer, Michael; Landgraff, Nancy; Learman, Ken
2015-01-01
The aim of this study was to determine the reliability and concurrent validity of commonly used physical performance tests using the OmniVR Virtual Rehabilitation System for healthy community-dwelling elders. Participants (N = 40) were recruited by the authors and were screened for eligibility. The initial method of measurement was randomized to either virtual reality (VR) or clinically based measures (CM). Physical performance tests included the five times sit to stand, Timed Up and Go (TUG), Forward Functional Reach (FFR) and 30-s stand test. A random number generator determined the testing order. The test-re-test reliability for the VR and CM was determined. Furthermore, concurrent validity was determined using a Pearson product moment correlation (Pearson r). The VR demonstrated excellent reliability for 5 × STS intraclass correlation coefficient (ICC) = 0.931(3,1), FFR ICC = 0.846(3,1) and the TUG ICC = 0.944(3,1). The concurrent validity data for the VR and CM (ICC 3, k) were moderate for FFR ICC = 0.682, excellent 5 × STS ICC = 0.889 and excellent for the TUG ICC = 0.878. The concurrent validity of the 30-s stand test was good ICC = 0.735(3,1). This study supports the use of VR equipment for measuring physical performance tests in the clinic for healthy community-dwelling elders. Virtual reality equipment is not only used to treat balance impairments but it is also used to measure and determine physical impairments through the use of physical performance tests. Virtual reality equipment is a reliable and valid tool for collecting physical performance data for the 5 × STS, FFR, TUG and 30-s stand test for healthy community-dwelling elders.
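The ICC(3,1) values quoted above follow the Shrout and Fleiss two-way mixed, consistency, single-measure form. A minimal sketch of that calculation is given below, assuming a complete subjects-by-sessions table with no missing values; the function name `icc_3_1` and the sample timings are hypothetical and are not data from the study.

```python
def icc_3_1(table):
    """ICC(3,1): two-way mixed effects, consistency, single measurement.

    table[i][j] is the score of subject i in session (or by rater) j.
    ICC = (MSR - MSE) / (MSR + (k - 1) * MSE)
    """
    n, k = len(table), len(table[0])
    if n < 2 or k < 2:
        raise ValueError("need at least 2 subjects and 2 sessions")
    grand = sum(sum(row) for row in table) / (n * k)
    row_means = [sum(row) / k for row in table]
    col_means = [sum(col) / n for col in zip(*table)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between sessions
    ss_total = sum((x - grand) ** 2 for row in table for x in row)
    ss_error = ss_total - ss_rows - ss_cols                  # residual
    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)

# Hypothetical Timed Up and Go times in seconds, test and retest (illustrative only).
tug_seconds = [[8.1, 8.4], [10.2, 9.9], [7.5, 7.7], [12.0, 12.6], [9.3, 9.0]]
print(round(icc_3_1(tug_seconds), 3))
```

Because this form measures consistency rather than absolute agreement, a constant offset between sessions does not lower the coefficient; an absolute-agreement form such as ICC(2,1) would penalize it.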
Wagenlehner, Florian Martin Erich; Fröhlich, Oliver; Bschleipfer, Thomas; Weidner, Wolfgang; Perletti, Gianpaolo
2014-06-01
Anatomical damage to pelvic floor structures may cause multiple symptoms. The Integral Theory System Questionnaire (ITSQ) is a holistic questionnaire that uses symptoms to help locate damage in specific connective tissue structures as a guide to reconstructive surgery. It is based on the integral theory, which states that pelvic floor symptoms and prolapse are both caused by lax suspensory ligaments. The aim of the present study was to psychometrically validate the ITSQ. Established psychometric properties including validity, reliability, and responsiveness were considered for evaluation. Criterion validity was assessed in a cohort of 110 women with pelvic floor dysfunctions by analyzing the correlation of questionnaire responses with objective clinical data. Test-retest was performed with questionnaires from 47 patients. Cronbach's alpha and "split-half" reliability coefficients were calculated for internal consistency analysis. Psychometric properties of the ITSQ were comparable to those of previously validated Pelvic Floor Questionnaires. Face validity and content validity were approved by an expert group of the International Collaboration of Pelvic Floor surgeons. Convergent validity assessed using a Bayesian method was at least as accurate as the expert assessment of anatomical defects. Objective data measurement in patients demonstrated significant correlations with ITSQ domains, fulfilling criterion validity. Internal consistency values ranged from 0.85 to 0.89 in different scenarios. The ITSQ proved accurate and is able to serve as a holistic Pelvic Floor Questionnaire directing symptoms to site-specific pelvic floor reconstructive surgery.
Cutolo, Maurizio; Vanhaecke, Amber; Ruaro, Barbara; Deschepper, Ellen; Ickinger, Claudia; Melsens, Karin; Piette, Yves; Trombetta, Amelia Chiara; De Keyser, Filip; Smith, Vanessa
2018-06-06
A reliable tool to evaluate flow is paramount in systemic sclerosis (SSc). We describe herein a systematic literature review on the reliability of laser speckle contrast analysis (LASCA) to measure the peripheral blood perfusion (PBP) in SSc, together with an additional pilot study investigating the intra- and inter-rater reliability of LASCA. A systematic search was performed in 3 electronic databases, according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. In the pilot study, 30 SSc patients and 30 healthy subjects (HS) underwent LASCA assessment. Intra-rater reliability was assessed by having a first anchor rater perform the measurements at 2 time-points, and inter-rater reliability by having the anchor rater and a team of second raters perform the measurements in 15 SSc and 30 HS. The measurements were repeated with a second anchor rater in the other 15 SSc patients, as external validation. Only 1 of the 14 records of interest identified through the systematic search was included in the final analysis. In the additional pilot study, the intra-class correlation coefficient (ICC) for intra-rater reliability of the first anchor rater was 0.95 in SSc and 0.93 in HS, and the ICC for inter-rater reliability was 0.97 in SSc and 0.93 in HS. Intra- and inter-rater reliability of the second anchor rater was 0.78 and 0.87, respectively. The identified literature regarding the reliability of LASCA measurements reports good to excellent inter-rater agreement. The present pilot study confirmed the reliability of LASCA measurements, with good to excellent inter-rater agreement, and additionally found good to excellent intra-rater reliability. Furthermore, similar results were found in the external validation. Copyright © 2018. Published by Elsevier B.V.
Developing a weighted measure of speech sound accuracy.
Preston, Jonathan L; Ramsdell, Heather L; Oller, D Kimbrough; Edwards, Mary Louise; Tobin, Stephen J
2011-02-01
To develop a system for numerically quantifying a speaker's phonetic accuracy through transcription-based measures. With a focus on normal and disordered speech in children, the authors describe a system for differentially weighting speech sound errors on the basis of various levels of phonetic accuracy using a Weighted Speech Sound Accuracy (WSSA) score. The authors then evaluate the reliability and validity of this measure. Phonetic transcriptions were analyzed from several samples of child speech, including preschoolers and young adolescents with and without speech sound disorders and typically developing toddlers. The new measure of phonetic accuracy was validated against existing measures, was used to discriminate typical and disordered speech production, and was evaluated to examine sensitivity to changes in phonetic accuracy over time. Reliability between transcribers and consistency of scores among different word sets and testing points are compared. Initial psychometric data indicate that WSSA scores correlate with other measures of phonetic accuracy as well as listeners' judgments of the severity of a child's speech disorder. The measure separates children with and without speech sound disorders and captures growth in phonetic accuracy in toddlers' speech over time. The measure correlates highly across transcribers, word lists, and testing points. Results provide preliminary support for the WSSA as a valid and reliable measure of phonetic accuracy in children's speech.
Yapali, Gökmen; Günel, Mintaze Kerem; Karahan, Sevilay
2012-05-15
The study design was a cross-cultural adaptation and investigation of the reliability and validity of the Copenhagen Neck Functional Disability Scale (CNFDS). The aim of this study was to translate the CNFDS into Turkish and assess its reliability and validity among patients with neck pain in the Turkish population. The CNFDS is a reliable and valid evaluation instrument for disability, but there is no published Turkish version of the CNFDS. One hundred one subjects who had chronic neck pain were included in this study. The CNFDS, Neck Pain and Disability Scale, and visual analogue scale were administered to all subjects. To investigate test-retest reliability, the CNFDS was applied twice at a 1-week interval; the intraclass correlation coefficient for test-retest reliability was 0.86 (95% confidence interval = 0.679-0.935). There was no difference between test-retest scores (P < 0.001). To investigate concurrent validity, the total score of the CNFDS was correlated with the mean visual analogue scale score, giving r = 0.73 (P < 0.001). Concurrent validity of the CNFDS was very good. To investigate construct validity, the total score of the CNFDS was correlated with the Neck Pain and Disability Scale, giving r = 0.78 (P < 0.001). Construct validity of the CNFDS was also very good. Our results suggest that the Turkish version of the CNFDS is a reliable and valid instrument for Turkish people.
Development of a Conservative Model Validation Approach for Reliable Analysis
2015-01-01
CIE 2015 August 2-5, 2015, Boston, Massachusetts, USA [DRAFT] DETC2015-46982 DEVELOPMENT OF A CONSERVATIVE MODEL VALIDATION APPROACH FOR RELIABLE...obtain a conservative simulation model for reliable design even with limited experimental data. Very little research has taken into account the...3, the proposed conservative model validation is briefly compared to the conventional model validation approach. Section 4 describes how to account
Hybrid automated reliability predictor integrated work station (HiREL)
NASA Technical Reports Server (NTRS)
Bavuso, Salvatore J.
1991-01-01
The Hybrid Automated Reliability Predictor (HARP) integrated reliability (HiREL) workstation tool system marks another step toward the goal of producing a totally integrated computer aided design (CAD) workstation design capability. Since a reliability engineer must generally graphically represent a reliability model before he can solve it, the use of a graphical input description language increases productivity and decreases the incidence of error. The captured image displayed on a cathode ray tube (CRT) screen serves as a documented copy of the model and provides the data for automatic input to the HARP reliability model solver. The introduction of dependency gates to a fault tree notation allows the modeling of very large fault tolerant system models using a concise and visually recognizable and familiar graphical language. In addition to aiding in the validation of the reliability model, the concise graphical representation presents company management, regulatory agencies, and company customers a means of expressing a complex model that is readily understandable. The graphical postprocessor computer program HARPO (HARP Output) makes it possible for reliability engineers to quickly analyze huge amounts of reliability/availability data to observe trends due to exploratory design changes.
Evaluation of Validity and Reliability for Hierarchical Scales Using Latent Variable Modeling
ERIC Educational Resources Information Center
Raykov, Tenko; Marcoulides, George A.
2012-01-01
A latent variable modeling method is outlined, which accomplishes estimation of criterion validity and reliability for a multicomponent measuring instrument with hierarchical structure. The approach provides point and interval estimates for the scale criterion validity and reliability coefficients, and can also be used for testing composite or…
ERIC Educational Resources Information Center
Roberts, Clare; Pratt, Chris
1988-01-01
The study evaluated the psychometric properties of reliability and construct validity of the Attitude Toward Mainstreaming Scale (ATMS) in an Australian context. It was concluded that the scale is both reliable and factorially valid in an Australian context. (Author/DB)
Self-esteem among nursing assistants: reliability and validity of the Rosenberg Self-Esteem Scale.
McMullen, Tara; Resnick, Barbara
2013-01-01
To establish the reliability and validity of the Rosenberg Self-Esteem Scale (RSES) when used with nursing assistants (NAs). Testing the RSES used baseline data from a randomized controlled trial testing the Res-Care Intervention. Female NAs were recruited from nursing homes (n = 508). Validity testing for the positive and negative subscales of the RSES was based on confirmatory factor analysis (CFA) using structural equation modeling and Rasch analysis. Estimates of reliability were based on Rasch analysis and the person separation index. Evidence supports the reliability and validity of the RSES in NAs although we recommend minor revisions to the measure for subsequent use. Establishing reliable and valid measures of self-esteem in NAs will facilitate testing of interventions to strengthen workplace self-esteem, job satisfaction, and retention.
Hickman, Ronald L; Clochesy, John M; Hetland, Breanna; Alaamri, Marym
2017-04-01
There are few reliable and valid measures of the patient-provider interaction among adults with hypertension. Therefore, the purpose of this report is to describe the construct validity and reliability of the Questionnaire on the Quality of Physician-Patient Interaction (QQPPI) in community-dwelling adults with hypertension. A convenience sample of 109 participants with hypertension was recruited and administered the QQPPI at baseline and 8 weeks later. The exploratory factor analysis established that a 12-item, 2-factor structure for the QQPPI was valid in this sample. The modified QQPPI proved to have sufficient internal consistency and test-retest reliability. The modified QQPPI is a valid and reliable measure of the provider-patient interaction, a construct posited to impact self-management, in adults with hypertension.
Validation of Commercial Fiber Optic Components for Aerospace Environments
NASA Technical Reports Server (NTRS)
Ott, Melanie N.
2005-01-01
Full qualification for commercial photonic parts as defined by the Military specification system in the past, is not feasible. Due to changes in the photonic components industry and the Military specification system that NASA had relied upon so heavily in the past, an approach to technology validation of commercial off the shelf parts had to be devised. This approach involves knowledge of system requirements, environmental requirements and failure modes of the particular components under consideration. Synthesizing the criteria together with the major known failure modes to formulate a test plan is an effective way of establishing knowledge based "qualification". Although this does not provide the type of reliability assurance that the Military specification system did in the past, it is an approach that allows for increased risk mitigation. The information presented will introduce the audience to the technology validation approach that is currently applied at NASA for the usage of commercial-off-the-shelf (COTS) fiber optic components for space flight environments. The focus will be on how to establish technology validation criteria for commercial fiber products such that continued reliable performance is assured under the harsh environmental conditions of typical missions. The goal of this presentation is to provide the audience with an approach to formulating a COTS qualification test plan for these devices. Examples from past NASA missions will be discussed.
Tomita, Machiko R; Saharan, Sumandeep; Rajendran, Sheela; Nochajski, Susan M; Schweitzer, Jo A
2014-01-01
OBJECTIVE. To identify psychometric properties of the Home Safety Self-Assessment Tool (HSSAT) to prevent falls in community-dwelling older adults. METHOD. We tested content validity, test-retest reliability, interrater reliability, construct validity, convergent and discriminant validity, and responsiveness to change. RESULTS. The content validity index was .98, the intraclass correlation coefficient for test-retest reliability was .97, and the interrater reliability was .89. The difference on identified risk factors between the use and nonuse of the HSSAT was significant (p = .005). Convergent validity with the Centers for Disease Control and Prevention Home Safety Checklist was high (r = .65), and discriminant validity with fear of falling was very low (r = .10). The responsiveness to change was moderate (standardized response mean = 0.57). CONCLUSION. The HSSAT is a reliable and valid instrument to identify fall risks in a home environment, and the HSSAT booklet is effective as educational material leading to improvement in home safety. Copyright © 2014 by the American Occupational Therapy Association, Inc.
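The standardized response mean reported above is conventionally the mean change score divided by the standard deviation of the change scores. A minimal sketch under that definition follows; the pre/post data are invented for illustration and are not from the HSSAT study.

```python
from statistics import mean, stdev

def standardized_response_mean(pre, post):
    """SRM = mean(post - pre) / sample SD of (post - pre)."""
    change = [b - a for a, b in zip(pre, post)]
    return mean(change) / stdev(change)

# Hypothetical scores for four participants before and after an intervention.
pre_scores = [1, 2, 3, 4]
post_scores = [2, 4, 4, 6]
srm = standardized_response_mean(pre_scores, post_scores)
```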
Lee, Myeongjun; Kim, Hyunjung; Shin, Donghee; Lee, Sangyun
2016-01-01
Harassment means systemic and repeated unethical acts. Research on workplace harassment has been conducted widely, and the NAQ-R has been widely used in that research. However, this tool has limitations in revealing differences in sub-factors depending on the culture and in reflecting the unique characteristics of Korean society. Therefore, the workplace harassment questionnaire for Korean finance and service workers has been developed to assess the level of personal harassment at work. This study aims to develop a tool to assess the level of personal harassment at work and to test its validity and reliability while examining specific characteristics of workplace harassment against finance and service workers in Korea. The framework of the survey was established based on a literature review and focus group interviews with Korean finance and service workers. To verify its reliability, Cronbach's alpha coefficient was calculated; and to verify its validity, items and factors of the tool were analyzed. The correlation matrix was examined to verify the tool's convergent validity and discriminant validity. Structural validity was verified by checking statistical significance in relation to the BDI-K. Cronbach's alpha coefficient of this survey was 0.93, which indicates a quite high level of reliability. To verify the appropriateness of this survey tool, its construct validity was examined through factor analysis. As a result of the factor analysis, 3 factors were extracted, explaining 56.5 % of the total variance. The loading values and communalities of the 20 items were 0.85 to 0.48 and 0.71 to 0.46. The convergent validity and discriminant validity were analyzed, and the rate of item discriminant validity was 100 %. Finally, for the concurrent validity, we examined the relationship between the WHI-KFSW and psychosocial stress by examining the correlation with the BDI-K. The results of the chi-square test and multiple logistic analysis indicated that the correlation with the BDI-K was statistically significant. Workplace harassment in actual workplaces was investigated based on interviews, and the statistical analysis contributed to systematizing the types of actual workplace harassment. Using statistical methods, we developed the questionnaire, consisting of 20 items in 3 categories.
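Cronbach's alpha, the reliability coefficient reported above, can be computed from an item-score matrix as k/(k-1) times one minus the ratio of summed item variances to the variance of the total score. The sketch below uses that standard formula; the response data are hypothetical and are not from the study.

```python
from statistics import variance

def cronbach_alpha(scores):
    """scores: one row per respondent, each row a list of k item scores."""
    k = len(scores[0])
    items = list(zip(*scores))                      # one column per item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_var_sum / total_var)

# Hypothetical responses: four respondents, two items.
responses = [[2, 3], [4, 5], [3, 3], [5, 4]]
alpha = cronbach_alpha(responses)
```

The function needs at least two respondents, at least two items, and a non-zero total-score variance.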
Hasanpour, Neda; Attarbashi Moghadam, Behrouz; Sami, Ramin; Tavakol, Kamran
2016-08-01
The clinical COPD questionnaire (CCQ) has been developed to measure the health status of COPD patients. The aim of this study was to translate the CCQ into the Persian language and assess the validity and reliability of the translated version. We used a forward-backward procedure to translate the questionnaire. In a cross-sectional study, 100 COPD patients and 50 healthy subjects over 40 years old were selected to assess the reliability and construct validity of the instrument. Face and content validity were used to assess the questionnaire's validity. Validity was examined in a population of patients with COPD, using the Persian validated version of the St George's Respiratory Questionnaire (PSGRQ). In order to assess the questionnaire's reliability, the intraclass correlation coefficient (ICC) and Cronbach's alpha were calculated. Test-retest reliability was tested by re-administering the Persian version of the CCQ (PCCQ) after 1 week. The test-retest data demonstrated that the PCCQ has excellent reliability (ICCs for all 3 domains were higher than 0.9). Internal consistency was found by Cronbach's alpha to be 0.96, 0.94, 0.97, and 0.98 for the symptom, mental state, functional state and total scores respectively. In addition, the correlation between the components of the PCCQ and PSGRQ showed satisfactory construct validity. Analysis of the data from healthy subjects and patients showed that the PCCQ has acceptable discriminant validity. In general, the PCCQ had satisfactory reliability and validity for assessing health-related quality of life status of Iranian COPD patients.
The Interview and Personnel Selection: Is the Process Valid and Reliable?
ERIC Educational Resources Information Center
Niece, Richard
1983-01-01
Reviews recent literature concerning the job interview. Concludes that such interviews are generally ineffective and proposes that school administrators devise techniques for improving their interviewing systems. (FL)
Scaglioni-Solano, Pietro; Aragón-Vargas, Luis F
2014-06-01
Standing balance is an important motor task. Postural instability associated with age typically arises from deterioration of peripheral sensory systems. The modified Clinical Test of Sensory Integration for Balance and the Tandem test have been used to screen for balance. Timed tests present some limitations, whereas quantification of the motions of the center of pressure (CoP) with portable and inexpensive equipment may help to improve the sensitivity of these tests and give the possibility of widespread use. This study determines the validity and reliability of the Wii Balance Board (Wii BB) to quantify CoP motions during the mentioned tests. Thirty-seven older adults completed three repetitions of five balance conditions: eyes open, eyes closed, eyes open on a compliant surface, eyes closed on a compliant surface, and tandem stance, all performed on a force plate and a Wii BB simultaneously. Twenty participants repeated the trials for reliability purposes. CoP displacement was the main outcome measure. Regression analysis indicated that the Wii BB has excellent concurrent validity, and Bland-Altman plots showed good agreement between devices with small mean differences and no relationship between the difference and the mean. Intraclass correlation coefficients (ICCs) indicated modest-to-excellent test-retest reliability (ICC=0.64-0.85). Standard error of measurement and minimal detectable change were similar for both devices, except the 'eyes closed' condition, with greater standard error of measurement for the Wii BB. In conclusion, the Wii BB is shown to be a valid and reliable method to quantify CoP displacement in older adults.
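The Bland-Altman agreement analysis mentioned above reduces to the mean difference between two devices (the bias) and the 95% limits of agreement, bias ± 1.96 × SD of the differences. A minimal sketch follows; the paired readings are hypothetical and do not come from the Wii BB study.

```python
from statistics import mean, stdev

def bland_altman(device_a, device_b):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    diffs = [a - b for a, b in zip(device_a, device_b)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical CoP displacement readings from a force plate and a second device.
force_plate = [10, 12, 14, 16]
other_device = [9, 12, 13, 16]
bias, lower, upper = bland_altman(force_plate, other_device)
```

Checking for a relationship between the difference and the mean, as the abstract describes, would additionally require regressing the differences on the pairwise means.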
Larsen, Camilla Marie; Juul-Kristensen, Birgit; Lund, Hans; Søgaard, Karen
2014-10-01
The aims were to compile a schematic overview of clinical scapular assessment methods and critically appraise the methodological quality of the involved studies. A systematic, computer-assisted literature search using Medline, CINAHL, SportDiscus and EMBASE was performed from inception to October 2013. Reference lists in articles were also screened for publications. From 50 articles, 54 method names were identified and categorized into three groups: (1) Static positioning assessment (n = 19); (2) Semi-dynamic (n = 13); and (3) Dynamic functional assessment (n = 22). Fifteen studies were excluded from evaluation due to no/few clinimetric results, leaving 35 studies for evaluation. Graded according to the COnsensus-based Standards for the selection of health Measurement INstruments (COSMIN checklist), the methodological quality in the reliability and validity domains was "fair" (57%) to "poor" (43%), with only one study rated as "good". The reliability domain was most often investigated. Few of the assessment methods in the included studies that had "fair" or "good" measurement property ratings demonstrated acceptable results for both reliability and validity. We found a substantially larger number of clinical scapular assessment methods than previously reported. Using the COSMIN checklist, the methodological quality of the included measurement properties in the reliability and validity domains was in general "fair" to "poor". None were examined for all three domains: (1) reliability; (2) validity; and (3) responsiveness. Observational evaluation systems and assessment of scapular upward rotation seem suitably evidence-based for clinical use. Future studies should test and improve the clinimetric properties, and especially diagnostic accuracy and responsiveness, to increase utility for clinical practice.
Bania, Theofani
2014-09-01
We determined the criterion validity and the retest reliability of the ActivPAL™ monitor in young people with diplegic cerebral palsy (CP). Activity monitor data were compared with the criterion of video recording for 10 participants. For the retest reliability, activity monitor data were collected from 24 participants on two occasions. Participants had to have diplegic CP and be between 14 and 22 years of age. They also had to be of Gross Motor Function Classification System level II or III. Outcomes were time spent in standing, number of steps (physical activity) and time spent in sitting (sedentary behaviour). For criterion validity, coefficients of determination were all high (r² ≥ 0.96), and limits of group agreement were relatively narrow, but limits of agreement for individuals were narrow only for number of steps (≥5.5%). Relative reliability was high for number of steps (intraclass correlation coefficient = 0.87) and moderate for time spent in sitting and lying, and time spent in standing (intraclass correlation coefficients = 0.60-0.66). For groups, changes of up to 7% could be due to measurement error with 95% confidence, but for individuals, changes as high as 68% could be due to measurement error. The results support the criterion validity and the retest reliability of the ActivPAL™ to measure physical activity and sedentary behaviour in groups of young people with diplegic CP but not in individuals. Copyright © 2014 John Wiley & Sons, Ltd.
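Statements such as "changes of up to X% could be due to measurement error with 95% confidence" are usually based on the standard error of measurement (SEM) and the minimal detectable change (MDC95). A sketch using the common formulas SEM = SD × sqrt(1 − ICC) and MDC95 = 1.96 × sqrt(2) × SEM is given below. Whether the study used exactly these formulas is an assumption, and the input values are hypothetical.

```python
from math import sqrt

def standard_error_of_measurement(sd, icc):
    """SEM from the between-subject SD and a reliability coefficient (ICC)."""
    return sd * sqrt(1.0 - icc)

def minimal_detectable_change_95(sem):
    """Smallest individual change exceeding measurement error at 95% confidence."""
    return 1.96 * sqrt(2.0) * sem

# Hypothetical values: SD of 10 units and an ICC of 0.75.
sem = standard_error_of_measurement(10.0, 0.75)
mdc95 = minimal_detectable_change_95(sem)
```

For a group of n participants the detectable change shrinks by a factor of sqrt(n), which is one reason an instrument can be adequate for groups but not for individuals.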
The validation of the visual analogue scale for patient satisfaction after total hip arthroplasty.
Brokelman, Roy B G; Haverkamp, Daniel; van Loon, Corné; Hol, Annemiek; van Kampen, Albert; Veth, Rene
2012-06-01
INTRODUCTION: Patient satisfaction becomes more important in our modern health care system. The assessment of satisfaction is difficult because it is a multifactorial item for which no gold standard exists. One of the potential methods of measuring satisfaction is by using the well-known visual analogue scale (VAS). In this study, we validated VAS for satisfaction. PATIENT AND METHODS: In this prospective study, we studied 147 patients (153 hips). The construct validity was measured using the Spearman correlation test that compares the satisfaction VAS with the Harris hip score, pain VAS at rest and during activity, Oxford hip score, Short Form 36 and Western Ontario McMaster Universities Osteoarthritis Index. The reliability was tested using the intra-class coefficient. RESULTS: The Pearson correlation test showed correlations in the range of 0.40-0.80. The satisfaction VAS had a high correlation with the pain VAS and Oxford hip score, which could mean that pain is one of the most important factors in patient satisfaction. The intra-class coefficient was 0.95. CONCLUSIONS: There is a moderate to marked degree of correlation between the satisfaction VAS and the currently available subjective and objective scoring systems. The intra-class coefficient of 0.95 indicates an excellent test-retest reliability. The VAS satisfaction is a simple instrument to quantify the satisfaction of a patient after total hip arthroplasty. In this study, we showed that the satisfaction VAS has a good validity and reliability.
Advancing implementation science through measure development and evaluation: a study protocol.
Lewis, Cara C; Weiner, Bryan J; Stanick, Cameo; Fischer, Sarah M
2015-07-22
Significant gaps related to measurement issues are among the most critical barriers to advancing implementation science. Three issues motivated the study aims: (a) the lack of stakeholder involvement in defining pragmatic measure qualities; (b) the dearth of measures, particularly for implementation outcomes; and (c) unknown psychometric and pragmatic strength of existing measures. Aim 1: Establish a stakeholder-driven operationalization of pragmatic measures and develop reliable, valid rating criteria for assessing the construct. Aim 2: Develop reliable, valid, and pragmatic measures of three critical implementation outcomes, acceptability, appropriateness, and feasibility. Aim 3: Identify Consolidated Framework for Implementation Research and Implementation Outcome Framework-linked measures that demonstrate both psychometric and pragmatic strength. For Aim 1, we will conduct (a) interviews with stakeholder panelists (N = 7) and complete a literature review to populate pragmatic measure construct criteria, (b) Q-sort activities (N = 20) to clarify the internal structure of the definition, (c) Delphi activities (N = 20) to achieve consensus on the dimension priorities, (d) test-retest and inter-rater reliability assessments of the emergent rating system, and (e) known-groups validity testing of the top three prioritized pragmatic criteria. For Aim 2, our systematic development process involves domain delineation, item generation, substantive validity assessment, structural validity assessment, reliability assessment, and predictive validity assessment. We will also assess discriminant validity, known-groups validity, structural invariance, sensitivity to change, and other pragmatic features. For Aim 3, we will refine our established evidence-based assessment (EBA) criteria, extract the relevant data from the literature, rate each measure using the EBA criteria, and summarize the data. 
The study outputs of each aim are expected to have a positive impact as they will establish and guide a comprehensive measurement-focused research agenda for implementation science and provide empirically supported measures, tools, and methods for accomplishing this work.
Validity, Reliability, and the Questionable Role of Psychometrics in Plastic Surgery
2014-01-01
Summary: This report examines the meaning of validity and reliability and the role of psychometrics in plastic surgery. Study titles increasingly include the word “valid” to support the authors’ claims. Studies by other investigators may be labeled “not validated.” Validity simply refers to the ability of a device to measure what it intends to measure. Validity is not an intrinsic test property. It is a relative term most credibly assigned by the independent user. Similarly, the word “reliable” is subject to interpretation. In psychometrics, its meaning is synonymous with “reproducible.” The definitions of valid and reliable are analogous to accuracy and precision. Reliability (both the reliability of the data and the consistency of measurements) is a prerequisite for validity. Outcome measures in plastic surgery are intended to be surveys, not tests. The role of psychometric modeling in plastic surgery is unclear, and this discipline introduces difficult jargon that can discourage investigators. Standard statistical tests suffice. The unambiguous term “reproducible” is preferred when discussing data consistency. Study design and methodology are essential considerations when assessing a study’s validity. PMID:25289354
The Validation of a Case-Based, Cumulative Assessment and Progressions Examination
Coker, Adeola O.; Copeland, Jeffrey T.; Gottlieb, Helmut B.; Horlen, Cheryl; Smith, Helen E.; Urteaga, Elizabeth M.; Ramsinghani, Sushma; Zertuche, Alejandra; Maize, David
2016-01-01
Objective. To assess content and criterion validity, as well as reliability of an internally developed, case-based, cumulative, high-stakes third-year Annual Student Assessment and Progression Examination (P3 ASAP Exam). Methods. Content validity was assessed through the writing-reviewing process. Criterion validity was assessed by comparing student scores on the P3 ASAP Exam with the nationally validated Pharmacy Curriculum Outcomes Assessment (PCOA). Reliability was assessed with psychometric analysis comparing student performance over four years. Results. The P3 ASAP Exam showed content validity through representation of didactic courses and professional outcomes. Similar scores on the P3 ASAP Exam and PCOA with Pearson correlation coefficient established criterion validity. Consistent student performance using Kuder-Richardson coefficient (KR-20) since 2012 reflected reliability of the examination. Conclusion. Pharmacy schools can implement internally developed, high-stakes, cumulative progression examinations that are valid and reliable using a robust writing-reviewing process and psychometric analyses. PMID:26941435
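The Kuder-Richardson coefficient (KR-20) mentioned above applies to items scored right or wrong: KR-20 = k/(k−1) × (1 − Σ p·q / σ²), where p is the proportion answering an item correctly, q = 1 − p, and σ² is the variance of total scores. The sketch below uses the population variance so that it matches the p·q item variances; the response matrix is hypothetical.

```python
from statistics import pvariance

def kr20(responses):
    """responses: one row per examinee, each row a list of 0/1 item scores."""
    n = len(responses)
    k = len(responses[0])
    pq_sum = 0.0
    for item in zip(*responses):
        p = sum(item) / n              # proportion answering the item correctly
        pq_sum += p * (1.0 - p)
    total_var = pvariance([sum(row) for row in responses])
    return k / (k - 1) * (1.0 - pq_sum / total_var)

# Hypothetical exam: four examinees, three dichotomous items.
exam = [[1, 1, 1], [0, 0, 0], [1, 1, 1], [0, 0, 0]]
reliability = kr20(exam)
```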
Wearable vital parameters monitoring system
NASA Astrophysics Data System (ADS)
Caramaliu, Radu Vadim; Vasile, Alexandru; Bacis, Irina
2015-02-01
The system we propose monitors body temperature and heart rate, and it also detects whether the person wearing it has fainted. It uses a digital temperature sensor, a pulse sensor and a gravitational acceleration sensor to detect a possible faint or a free fall from a small height. The system continuously tracks the GPS position when available and stores the last valid data. When it measures abnormal vital parameters, the module sends an SMS over the GSM cellular network with the person's social security number, the last valid GPS position for that person, the heart rate, the body temperature and, where applicable, a valid fall alert or non-valid fall alert. Even though such systems exist, they contain only faint detection or heart rate detection. Usually there is a strong correlation between a low/high heart rate and a possible faint. Combining both features into one system results in a more reliable detection device.
Manual and automatic locomotion scoring systems in dairy cows: a review.
Schlageter-Tello, Andrés; Bokkers, Eddie A M; Koerkamp, Peter W G Groot; Van Hertem, Tom; Viazzi, Stefano; Romanini, Carlos E B; Halachmi, Ilan; Bahr, Claudia; Berckmans, Daniël; Lokhorst, Kees
2014-09-01
The objective of this review was to describe, compare and evaluate agreement, reliability, and validity of manual and automatic locomotion scoring systems (MLSSs and ALSSs, respectively) used in dairy cattle lameness research. There are many different types of MLSSs and ALSSs. Twenty-five MLSSs were found in 244 articles. MLSSs use different types of scale (ordinal or continuous) and different gait and posture traits need to be observed. The most used MLSS (used in 28% of the references) is based on asymmetric gait, reluctance to bear weight, and arched back, and is scored on a five-level scale. Fifteen ALSSs were found that could be categorized according to three approaches: (a) the kinetic approach measures forces involved in locomotion, (b) the kinematic approach measures time and distance of variables associated to limb movement and some specific posture variables, and (c) the indirect approach uses behavioural variables or production variables as indicators for impaired locomotion. Agreement and reliability estimates were scarcely reported in articles related to MLSSs. When reported, inappropriate statistical methods such as PABAK and Pearson and Spearman correlation coefficients were commonly used. Some of the most frequently used MLSSs were poorly evaluated for agreement and reliability. Agreement and reliability estimates for the original four-, five- or nine-level MLSS, expressed in percentage of agreement, kappa and weighted kappa, showed large ranges among and sometimes also within articles. After the transformation into a two-level scale, agreement and reliability estimates showed acceptable estimates (percentage of agreement ≥ 75%; kappa and weighted kappa ≥ 0.6), but still estimates showed a large variation between articles. Agreement and reliability estimates for ALSSs were not reported in any article. Several ALSSs use MLSSs as a reference for model calibration and validation. 
However, varying agreement and reliability estimates of MLSSs make a clear definition of a lameness case difficult, and thus affect the validity of ALSSs. MLSSs and ALSSs showed limited validity for hoof lesion detection and pain assessment. The utilization of MLSSs and ALSSs should aim to the prevention and efficient management of conditions that induce impaired locomotion. Long-term studies comparing MLSSs and ALSSs while applying various strategies to detect and control unfavourable conditions leading to impaired locomotion are required to determine the usefulness of MLSSs and ALSSs for securing optimal production and animal welfare in practice. Copyright © 2014 Elsevier B.V. All rights reserved.
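The agreement statistics discussed in this review, percentage of agreement and kappa, can be computed for two observers as sketched below. Cohen's kappa corrects the observed agreement for the agreement expected by chance from each observer's marginal frequencies. The scores are hypothetical two-level ratings (1 = lame, 0 = not lame) and are not taken from any reviewed article.

```python
from collections import Counter

def percent_agreement(rater1, rater2):
    """Proportion of animals given the same score by both raters."""
    return sum(a == b for a, b in zip(rater1, rater2)) / len(rater1)

def cohens_kappa(rater1, rater2):
    """Chance-corrected agreement; undefined if expected agreement equals 1."""
    n = len(rater1)
    observed = percent_agreement(rater1, rater2)
    c1, c2 = Counter(rater1), Counter(rater2)
    expected = sum(c1[c] * c2[c] for c in set(rater1) | set(rater2)) / (n * n)
    return (observed - expected) / (1.0 - expected)

# Hypothetical two-level scores for eight cows from two observers.
obs1 = [1, 1, 0, 0, 1, 0, 1, 0]
obs2 = [1, 1, 0, 0, 1, 0, 0, 1]
agreement = percent_agreement(obs1, obs2)
kappa = cohens_kappa(obs1, obs2)
```

With these data the two observers agree on 75% of the cows, yet kappa is only 0.5 because half of that agreement is expected by chance.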
Muhamad, Zailani; Ramli, Ayiesah; Amat, Salleh
2015-05-01
The aim of this study was to determine the content validity, internal consistency, test-retest reliability and inter-rater reliability of the Clinical Competency Evaluation Instrument (CCEVI) in assessing the clinical performance of physiotherapy students. This study was carried out between June and September 2013 at Universiti Kebangsaan Malaysia (UKM), Kuala Lumpur, Malaysia. A panel of 10 experts was identified to establish content validity by evaluating and rating each of the items used in the CCEVI with regard to their relevance in measuring students' clinical competency. A total of 50 UKM undergraduate physiotherapy students were assessed throughout their clinical placement to determine the construct validity of these items. The instrument's reliability was determined through a cross-sectional study involving a clinical performance assessment of 14 final-year undergraduate physiotherapy students. The content validity index of the entire CCEVI was 0.91, while the proportion of agreement on the content validity indices ranged from 0.83-1.00. The CCEVI construct validity was established with factor loading of ≥0.6, while internal consistency (Cronbach's alpha) overall was 0.97. Test-retest reliability of the CCEVI was confirmed with a Pearson's correlation range of 0.91-0.97 and an intraclass correlation coefficient range of 0.95-0.98. Inter-rater reliability of the CCEVI domains ranged from 0.59 to 0.97 on initial and subsequent assessments. This pilot study confirmed the content validity of the CCEVI. It also showed high internal consistency and moderate to excellent inter-rater reliability. However, additional refinement in the wording of the CCEVI items, particularly in the domains of safety and documentation, is recommended to further improve the validity and reliability of the instrument.
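A content validity index is commonly computed per item as the proportion of experts who rate the item as relevant, and for the whole scale as the average of the item values. The sketch below assumes a 4-point relevance scale on which ratings of 3 or 4 count as relevant; the abstract does not state the scale used for the CCEVI, so that threshold and the ratings are assumptions for illustration only.

```python
def item_cvi(ratings, relevant_from=3):
    """Proportion of experts rating the item at or above the relevance cut-off."""
    return sum(r >= relevant_from for r in ratings) / len(ratings)

def scale_cvi_average(rating_matrix, relevant_from=3):
    """Average of the item-level indices across all items."""
    values = [item_cvi(item, relevant_from) for item in rating_matrix]
    return sum(values) / len(values)

# Hypothetical ratings: two items, each rated by four experts on a 1-4 scale.
expert_ratings = [[4, 3, 4, 2], [4, 4, 3, 3]]
item_values = [item_cvi(item) for item in expert_ratings]
scale_value = scale_cvi_average(expert_ratings)
```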
A psychometric evaluation of the Rorschach comprehensive system's perceptual thinking index.
Dao, Tam K; Prevatt, Frances
2006-04-01
In this study, we investigated evidence for reliability and validity of the Perceptual Thinking Index (PTI; Exner, 2000a, 2000b) among an adult inpatient population. We conducted reliability and validity analyses on 107 patients who met the Diagnostic and Statistical Manual of Mental Disorders (4th ed., text revision; American Psychiatric Association, 2000) criteria for a schizophrenia-spectrum disorder (SSD) or mood disorder with no psychotic features (MD). Results provided support for interrater reliability as well as internal consistency of the PTI. Furthermore, the PTI was an effective index in differentiating SSD patients from patients diagnosed with an MD. Finally, the PTI demonstrated adequate diagnostic statistics that can be useful in the classification of patients diagnosed with SSD and MD. We discuss methodological issues, implications for assessment practice, and directions for future research.
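Diagnostic statistics for an index that separates two patient groups are typically derived from a 2 × 2 classification table. The sketch below computes sensitivity, specificity and the predictive values; the counts are hypothetical and do not come from the PTI study.

```python
def diagnostic_statistics(tp, fp, fn, tn):
    """Standard accuracy measures from true/false positive/negative counts."""
    return {
        "sensitivity": tp / (tp + fn),   # cases correctly flagged
        "specificity": tn / (tn + fp),   # non-cases correctly cleared
        "ppv": tp / (tp + fp),           # flagged patients who are cases
        "npv": tn / (tn + fn),           # cleared patients who are non-cases
    }

# Hypothetical counts for 100 patients.
stats = diagnostic_statistics(tp=40, fp=5, fn=10, tn=45)
```

Predictive values depend on the base rate of the condition in the sample, so they do not transfer directly to settings with a different prevalence.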
Validity and Reliability of the Turkish Chronic Pain Acceptance Questionnaire
Akmaz, Hazel Ekin; Uyar, Meltem; Kuzeyli Yıldırım, Yasemin; Akın Korhan, Esra
2018-05-29
Pain acceptance is the process of giving up the struggle with pain and learning to live a worthwhile life despite it. In assessing patients with chronic pain in Turkey, making a diagnosis and tracking the effectiveness of treatment is done with scales that have been translated into Turkish. However, there is as yet no valid and reliable scale in Turkish to assess the acceptance of pain. To validate a Turkish version of the Chronic Pain Acceptance Questionnaire developed by McCracken and colleagues. Methodological and cross sectional study. A simple randomized sampling method was used in selecting the study sample. The sample was composed of 201 patients, more than 10 times the number of items examined for validity and reliability in the study, which totaled 20. A patient identification form, the Chronic Pain Acceptance Questionnaire, and the Brief Pain Inventory were used to collect data. Data were collected by face-to-face interviews. In the validity testing, the content validity index was used to evaluate linguistic equivalence, content validity, construct validity, and expert views. In reliability testing of the scale, Cronbach’s α coefficient was calculated, and item analysis and split-test reliability methods were used. Principal component analysis and varimax rotation were used in factor analysis and to examine factor structure for construct concept validity. The item analysis established that the scale, all items, and item-total correlations were satisfactory. The mean total score of the scale was 21.78. The internal consistency coefficient was 0.94, and the correlation between the two halves of the scale was 0.89. The Chronic Pain Acceptance Questionnaire, which is intended to be used in Turkey upon confirmation of its validity and reliability, is an evaluation instrument with sufficient validity and reliability, and it can be reliably used to examine patients’ acceptance of chronic pain.
Validity and Reliability of the Turkish Chronic Pain Acceptance Questionnaire
Akmaz, Hazel Ekin; Uyar, Meltem; Kuzeyli Yıldırım, Yasemin; Akın Korhan, Esra
2018-01-01
Background: Pain acceptance is the process of giving up the struggle with pain and learning to live a worthwhile life despite it. In assessing patients with chronic pain in Turkey, making a diagnosis and tracking the effectiveness of treatment is done with scales that have been translated into Turkish. However, there is as yet no valid and reliable scale in Turkish to assess the acceptance of pain. Aims: To validate a Turkish version of the Chronic Pain Acceptance Questionnaire developed by McCracken and colleagues. Study Design: Methodological and cross sectional study. Methods: A simple randomized sampling method was used in selecting the study sample. The sample was composed of 201 patients, more than 10 times the number of items examined for validity and reliability in the study, which totaled 20. A patient identification form, the Chronic Pain Acceptance Questionnaire, and the Brief Pain Inventory were used to collect data. Data were collected by face-to-face interviews. In the validity testing, the content validity index was used to evaluate linguistic equivalence, content validity, construct validity, and expert views. In reliability testing of the scale, Cronbach’s α coefficient was calculated, and item analysis and split-test reliability methods were used. Principal component analysis and varimax rotation were used in factor analysis and to examine factor structure for construct concept validity. Results: The item analysis established that the scale, all items, and item-total correlations were satisfactory. The mean total score of the scale was 21.78. The internal consistency coefficient was 0.94, and the correlation between the two halves of the scale was 0.89. Conclusion: The Chronic Pain Acceptance Questionnaire, which is intended to be used in Turkey upon confirmation of its validity and reliability, is an evaluation instrument with sufficient validity and reliability, and it can be reliably used to examine patients’ acceptance of chronic pain. 
PMID:29843496
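The internal consistency coefficient reported in the abstract above (Cronbach's α) can be illustrated with a minimal sketch. This is the standard textbook formula, not the authors' own analysis code; the function name `cronbach_alpha` and the `toy_scores` matrix are hypothetical and chosen only for illustration.

```python
import numpy as np


def cronbach_alpha(scores):
    """Cronbach's alpha for a respondents-by-items score matrix.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score))
    """
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]                          # number of items
    item_vars = scores.var(axis=0, ddof=1)       # sample variance of each item
    total_var = scores.sum(axis=1).var(ddof=1)   # sample variance of the total score
    return k / (k - 1) * (1 - item_vars.sum() / total_var)


# Hypothetical toy data: 5 respondents answering 3 items (not from the study).
toy_scores = np.array([
    [1, 2, 2],
    [2, 3, 3],
    [3, 3, 4],
    [4, 5, 5],
    [5, 5, 6],
])

if __name__ == "__main__":
    print(round(cronbach_alpha(toy_scores), 3))
```

The sketch assumes complete data (no missing responses) and at least two items; real questionnaire data would usually need missing-value handling first.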
Solar-Diesel Hybrid Power System Optimization and Experimental Validation
NASA Astrophysics Data System (ADS)
Jacobus, Headley Stewart
As of 2008, 1.46 billion people, or 22 percent of the world's population, were without electricity. Many of these people live in remote areas where decentralized generation is the only method of electrification. Most mini-grids are powered by diesel generators, but new hybrid power systems are becoming a reliable method to incorporate renewable energy while also reducing total system cost. This thesis quantifies the measurable operational costs for an experimental hybrid power system in Sierra Leone. Two software programs, Hybrid2 and HOMER, are used during the system design and subsequent analysis. Experimental data from the installed system are used to validate the two programs and to quantify the savings created by each component within the hybrid system. This thesis bridges the gap between design optimization studies that frequently lack subsequent validation and experimental hybrid system performance studies.
Postert, Christian; Averbeck-Holocher, Marlies; Beyer, Thomas; Müller, Jörg; Furniss, Tilman
2009-03-01
DSM-IV and ICD-10 have limitations in the diagnostic classification of psychiatric disorders at preschool age (0-5 years). The publication of the Diagnostic Classification 0-3 (DC:0-3) in 1994, its fundamentally revised second edition (DC:0-3R) in 2005 and the Research Diagnostic Criteria-Preschool Age (RDC-PA) in 2004 have provided several modifications of these manuals. Taking into account the growing empirical evidence highlighting the need for a diagnostic classification system for psychiatric disorders in preschool children, the main categorical classification systems in preschool psychiatry will be presented and discussed. The paper will focus on issues of validity, usefulness and reliability in DSM-IV, ICD-10, RDC-PA, DC:0-3, and DC:0-3R. The reasons for including or excluding postulated psychiatric disorder categories for preschool children with variable degrees of empirical evidence into the different diagnostic systems will be discussed.
Interrater reliability of the mind map assessment rubric in a cohort of medical students.
D'Antoni, Anthony V; Zipp, Genevieve Pinto; Olson, Valerie G
2009-04-28
Learning strategies are thinking tools that students can use to actively acquire information. Examples of learning strategies include mnemonics, charts, and maps. One strategy that may help students master the tsunami of information presented in medical school is the mind map learning strategy. Currently, there is no valid and reliable rubric to grade mind maps and this may contribute to their underutilization in medicine. Because concept maps and mind maps engage learners similarly at a metacognitive level, a valid and reliable concept map assessment scoring system was adapted to form the mind map assessment rubric (MMAR). The MMAR can assess mind map depth based upon concept-links, cross-links, hierarchies, examples, pictures, and colors. The purpose of this study was to examine interrater reliability of the MMAR. This exploratory study was conducted at a US medical school as part of a larger investigation on learning strategies. Sixty-six (N = 66) first-year medical students were given a 394-word text passage followed by a 30-minute presentation on mind mapping. After the presentation, subjects were again given the text passage and instructed to create mind maps based upon the passage. The mind maps were collected and independently scored using the MMAR by 3 examiners. Interrater reliability was measured using the intraclass correlation coefficient (ICC) statistic. Statistics were calculated using SPSS version 12.0 (Chicago, IL). Analysis of the mind maps revealed the following: concept-links ICC = .05 (95% CI, -.42 to .38), cross-links ICC = .58 (95% CI, .37 to .73), hierarchies ICC = .23 (95% CI, -.15 to .50), examples ICC = .53 (95% CI, .29 to .69), pictures ICC = .86 (95% CI, .79 to .91), colors ICC = .73 (95% CI, .59 to .82), and total score ICC = .86 (95% CI, .79 to .91). The high ICC value for total mind map score indicates strong MMAR interrater reliability. Pictures and colors demonstrated moderate to strong interrater reliability. We conclude that the MMAR may be a valid and reliable tool to assess mind maps in medicine. However, further research on the validity and reliability of the MMAR is necessary.
Interrater reliability of the mind map assessment rubric in a cohort of medical students
D'Antoni, Anthony V; Zipp, Genevieve Pinto; Olson, Valerie G
2009-01-01
Background: Learning strategies are thinking tools that students can use to actively acquire information. Examples of learning strategies include mnemonics, charts, and maps. One strategy that may help students master the tsunami of information presented in medical school is the mind map learning strategy. Currently, there is no valid and reliable rubric to grade mind maps and this may contribute to their underutilization in medicine. Because concept maps and mind maps engage learners similarly at a metacognitive level, a valid and reliable concept map assessment scoring system was adapted to form the mind map assessment rubric (MMAR). The MMAR can assess mind map depth based upon concept-links, cross-links, hierarchies, examples, pictures, and colors. The purpose of this study was to examine interrater reliability of the MMAR. Methods: This exploratory study was conducted at a US medical school as part of a larger investigation on learning strategies. Sixty-six (N = 66) first-year medical students were given a 394-word text passage followed by a 30-minute presentation on mind mapping. After the presentation, subjects were again given the text passage and instructed to create mind maps based upon the passage. The mind maps were collected and independently scored using the MMAR by 3 examiners. Interrater reliability was measured using the intraclass correlation coefficient (ICC) statistic. Statistics were calculated using SPSS version 12.0 (Chicago, IL). Results: Analysis of the mind maps revealed the following: concept-links ICC = .05 (95% CI, -.42 to .38), cross-links ICC = .58 (95% CI, .37 to .73), hierarchies ICC = .23 (95% CI, -.15 to .50), examples ICC = .53 (95% CI, .29 to .69), pictures ICC = .86 (95% CI, .79 to .91), colors ICC = .73 (95% CI, .59 to .82), and total score ICC = .86 (95% CI, .79 to .91). Conclusion: The high ICC value for total mind map score indicates strong MMAR interrater reliability. Pictures and colors demonstrated moderate to strong interrater reliability. We conclude that the MMAR may be a valid and reliable tool to assess mind maps in medicine. However, further research on the validity and reliability of the MMAR is necessary. PMID:19400964
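The intraclass correlation coefficient used above for interrater reliability can be sketched from a two-way ANOVA decomposition. The abstract does not state which ICC form was computed in SPSS, so the sketch below assumes the two-way random-effects, absolute-agreement, single-rater form, often written ICC(2,1). The function name `icc_2_1` and the example ratings matrix are hypothetical.

```python
import numpy as np


def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings: 2-D array with one row per rated subject and one column per rater.
    """
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape                       # n subjects, k raters
    grand = x.mean()

    # Sums of squares for subjects (rows), raters (columns), and residual.
    ss_rows = k * ((x.mean(axis=1) - grand) ** 2).sum()
    ss_cols = n * ((x.mean(axis=0) - grand) ** 2).sum()
    ss_total = ((x - grand) ** 2).sum()
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)              # between-subjects mean square
    msc = ss_cols / (k - 1)              # between-raters mean square
    mse = ss_err / ((n - 1) * (k - 1))   # residual mean square

    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Illustrative ratings: 6 subjects scored by 4 raters (not data from the study).
example_ratings = np.array([
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
])

if __name__ == "__main__":
    print(round(icc_2_1(example_ratings), 2))
```

Because this form penalizes systematic differences between raters, it can be much lower than a consistency-type ICC on the same data; which form is appropriate depends on how the raters were sampled and how scores will be used.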
Lempereur, Mathieu; Lelievre, Mathieu; Burdin, Valérie; Ben Salem, Douraied; Brochard, Sylvain
2017-01-01
Purpose: To report evidence for the concurrent validity and reliability of dynamic MRI techniques to evaluate in vivo joint and muscle mechanics, and to propose recommendations for their use in the assessment of normal and impaired musculoskeletal function. Materials and methods: The search was conducted on articles published in Web of Science, PubMed, Scopus, Academic Search Premier, and Cochrane Library between 1990 and August 2017. Studies that reported the concurrent validity and/or reliability of dynamic MRI techniques for in vivo evaluation of joint or muscle mechanics were included after assessment by two independent reviewers. Selected articles were assessed using an adapted quality assessment tool and a data extraction process. Results for concurrent validity and reliability were categorized as poor, moderate, or excellent. Results: Twenty articles fulfilled the inclusion criteria with a mean quality assessment score of 66% (±10.4%). Concurrent validity and/or reliability of eight dynamic MRI techniques were reported, with the knee being the most evaluated joint (seven studies). Moderate to excellent concurrent validity and reliability were reported for seven out of eight dynamic MRI techniques. Cine phase contrast and real-time MRI appeared to be the most valid and reliable techniques to evaluate joint motion, and spin tag for muscle motion. Conclusion: Dynamic MRI techniques are promising for the in vivo evaluation of musculoskeletal mechanics; however, results should be evaluated with caution since validity and reliability have not been determined for all joints and muscles, nor for many pathological conditions. PMID:29232401
Clinical instruments: reliability and validity critical appraisal.
Brink, Yolandi; Louw, Quinette A
2012-12-01
RATIONALE, AIM AND OBJECTIVES: There is a lack of health care practitioners using objective clinical tools with sound psychometric properties. There is also a need for researchers to improve their reporting of the validity and reliability results of these clinical tools. Therefore, to promote the use of valid and reliable tools or tests for clinical evaluation, this paper reports on the development of a critical appraisal tool to assess the psychometric properties of objective clinical tools. A five-step process was followed to develop the new critical appraisal tool: (1) preliminary conceptual decisions; (2) defining key concepts; (3) item generation; (4) assessment of face validity; and (5) formulation of the final tool. The new critical appraisal tool consists of 13 items, of which five items relate to both validity and reliability studies, four items to validity studies only and four items to reliability studies. The 13 items could be scored as 'yes', 'no' or 'not applicable'. This critical appraisal tool will aid both the health care practitioner to critically appraise the relevant literature and researchers to improve the quality of reporting of the validity and reliability of objective clinical tools. © 2011 Blackwell Publishing Ltd.
Akram, A J; Ireland, A J; Postlethwaite, K C; Sandy, J R; Jerreat, A S
2013-11-01
This article describes the process of validity and reliability testing of a condition-specific quality-of-life measure for patients with hypodontia presenting for orthodontic treatment. The development of the instrument is described in a previous article. Royal Devon and Exeter NHS Foundation Trust & Musgrove Park Hospital, Taunton. The child perception questionnaire was used as a standard against which to test criterion validity. The Bland and Altman method was used to check agreement between the two questionnaires. Construct validity was tested using principal component analysis on the four sections of the questionnaire. Test-retest reliability was tested using intraclass correlation coefficient and Bland and Altman method. Cronbach's alpha was used to test internal consistency reliability. Overall the questionnaire showed good reliability, criterion and construct validity. This together with previous evidence of good face and content validity suggests that the instrument may prove useful in clinical practice and further research. This study has demonstrated that the newly developed condition-specific quality-of-life questionnaire is both valid and reliable for use in young patients with hypodontia. © 2013 John Wiley & Sons A/S. Published by Blackwell Publishing Ltd.
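The Bland and Altman agreement check mentioned above can be sketched as follows. This is the generic method (mean difference plus 95% limits of agreement), not the authors' analysis; the function name `bland_altman` and the sample values are hypothetical.

```python
import numpy as np


def bland_altman(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements.

    bias   = mean of the paired differences (a - b)
    limits = bias +/- 1.96 * sample standard deviation of the differences
    """
    a = np.asarray(method_a, dtype=float)
    b = np.asarray(method_b, dtype=float)
    diff = a - b
    bias = diff.mean()
    sd = diff.std(ddof=1)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


if __name__ == "__main__":
    # Hypothetical paired scores from two questionnaires on the same patients.
    new_scores = [12.0, 15.0, 9.0, 20.0, 14.0]
    reference_scores = [11.0, 16.0, 9.5, 18.0, 14.5]
    bias, lower, upper = bland_altman(new_scores, reference_scores)
    print(f"bias={bias:.2f}, limits of agreement=({lower:.2f}, {upper:.2f})")
```

The limits assume the differences are roughly normally distributed and that both instruments are scored on the same scale; whether the resulting interval is acceptably narrow is a clinical judgment, not a statistical one.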
Validity and reliability of a scale to measure genital body image.
Zielinski, Ruth E; Kane-Low, Lisa; Miller, Janis M; Sampselle, Carolyn
2012-01-01
Women's body image dissatisfaction extends to body parts usually hidden from view--their genitals. Ability to measure genital body image is limited by lack of valid and reliable questionnaires. We subjected a previously developed questionnaire, the Genital Self Image Scale (GSIS), to psychometric testing using a variety of methods. Five experts determined the content validity of the scale. Then, using four participant groups, factor analysis was performed to determine construct validity and to identify factors. Further construct validity was established using the contrasting groups approach. Internal consistency and test-retest reliability were determined. Twenty-one of 29 items were considered content valid. Two items were added based on expert suggestions. Factor analysis was undertaken, resulting in four factors, identified as Genital Confidence, Appeal, Function, and Comfort. The revised scale (GSIS-20) included 20 items explaining 59.4% of the variance. Women indicating an interest in genital cosmetic surgery exhibited significantly lower scores on the GSIS-20 than those who did not. The final 20-item scale exhibited internal reliability across all sample groups as well as test-retest reliability. The GSIS-20 provides a measure of genital body image demonstrating reliability and validity across several populations of women.
Xiao, Yuan-mei; Wang, Zhi-ming; Wang, Mian-zhen; Lan, Ya-jia
2005-06-01
To test the reliability and validity of two mental workload assessment scales, i.e. subjective workload assessment technique (SWAT) and NASA task load index (NASA-TLX). One thousand two hundred and sixty-eight mental workers were sampled from various kinds of occupations, such as scientific research, education, administration and medicine, with randomized cluster sampling. The re-test reliability, split-half reliability, Cronbach's alpha coefficient and correlation coefficients between item score and total score were adopted to test the reliability. The test of validity included structure validity. The re-test reliability coefficients of these two scales and their items ranged from 0.516 to 0.753 (P < 0.01), indicating the two scales had good re-test reliability; the split-half reliability of SWAT was 0.645, and its Cronbach's alpha coefficient was more than 0.80; all the correlation coefficients between its item scores and total score were more than 0.70. As for NASA-TLX, both the split-half reliability and Cronbach's alpha coefficient were more than 0.80, and the correlation coefficients between its item scores and total score were all more than 0.60 (P < 0.01) except the item of performance. Both scales had good internal consistency. The Pearson correlation coefficient between the two scales was 0.492 (P < 0.01), implying the results of the two scales had good consistency. Factor analysis showed that the two scales had good structure validity. Both SWAT and NASA-TLX have good reliability and validity and may be used as a valid tool to assess mental workload in China after being revised properly.
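Split-half reliability, as reported above, is commonly computed by correlating scores on two halves of a scale and then applying the Spearman-Brown correction to estimate full-length reliability. The abstract does not say how the halves were formed or whether the correction was applied, so the odd/even split and the names `spearman_brown` and `split_half_reliability` below are assumptions made for illustration.

```python
import numpy as np


def spearman_brown(r_half):
    """Step a half-test correlation up to a full-length reliability estimate."""
    return 2 * r_half / (1 + r_half)


def split_half_reliability(scores):
    """Odd/even split-half reliability for a respondents-by-items matrix."""
    scores = np.asarray(scores, dtype=float)
    first_half = scores[:, 0::2].sum(axis=1)    # items 1, 3, 5, ...
    second_half = scores[:, 1::2].sum(axis=1)   # items 2, 4, 6, ...
    r_half = np.corrcoef(first_half, second_half)[0, 1]
    return spearman_brown(r_half)


if __name__ == "__main__":
    # Hypothetical responses: 3 respondents, 4 items (not data from the study).
    toy = [[1, 1, 2, 2], [2, 2, 3, 3], [3, 3, 5, 4]]
    print(round(split_half_reliability(toy), 3))
```

Different ways of splitting the items give different estimates, which is one reason Cronbach's alpha is usually reported alongside a split-half coefficient.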
Horn, W; Miksch, S; Egghart, G; Popow, C; Paky, F
1997-09-01
Real-time systems for monitoring and therapy planning, which receive their data from on-line monitoring equipment and computer-based patient records, require reliable data. Data validation has to utilize and combine a set of fast methods to detect, eliminate, and repair faulty data, which may lead to life-threatening conclusions. The strength of data validation results from the combination of numerical and knowledge-based methods applied to both continuously-assessed high-frequency data and discontinuously-assessed data. Dealing with high-frequency data, examining single measurements is not sufficient. It is essential to take into account the behavior of parameters over time. We present time-point-, time-interval-, and trend-based methods for validation and repair. These are complemented by time-independent methods for determining an overall reliability of measurements. The data validation benefits from the temporal data-abstraction process, which provides automatically derived qualitative values and patterns. The temporal abstraction is oriented on a context-sensitive and expectation-guided principle. Additional knowledge derived from domain experts forms an essential part for all of these methods. The methods are applied in the field of artificial ventilation of newborn infants. Examples from the real-time monitoring and therapy-planning system VIE-VENT illustrate the usefulness and effectiveness of the methods.
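As a rough illustration of the simplest validation ideas described above, the sketch below combines a time-point check (is a single value within a plausible range?) with a basic check across consecutive samples (did the value change implausibly fast?). It is not the VIE-VENT implementation; the function name `validate_series`, the flag labels, and all thresholds are hypothetical.

```python
def validate_series(values, low, high, max_step):
    """Flag each sample as 'ok', 'out_of_range', or 'implausible_jump'.

    low, high : plausible range for a single measurement (time-point check)
    max_step  : largest plausible change relative to the last accepted sample
    """
    flags = []
    last_accepted = None
    for value in values:
        if not (low <= value <= high):
            flags.append("out_of_range")
            continue
        if last_accepted is not None and abs(value - last_accepted) > max_step:
            # A real system would also need a rule for recovering after a
            # genuine, sustained change; this sketch deliberately omits it.
            flags.append("implausible_jump")
            continue
        flags.append("ok")
        last_accepted = value
    return flags


if __name__ == "__main__":
    # Hypothetical oxygen-saturation-like readings with two artifacts.
    readings = [95, 96, 250, 97, 60, 96]
    print(validate_series(readings, low=50, high=100, max_step=10))
```

The knowledge-based, trend-based, and repair methods the abstract refers to go well beyond this; the sketch only shows why examining single measurements in isolation is not sufficient for high-frequency data.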
Wenborn, Jennifer; Challis, David; Pool, Jackie; Burgess, Jane; Elliott, Nicola; Orrell, Martin
2008-03-01
Activity is key to maintaining physical and mental health and well-being. However, as dementia affects the ability to engage in activity, care-givers can find it difficult to provide appropriate activities. The Pool Activity Level (PAL) Checklist guides the selection of appropriate, personally meaningful activities. The aim of this study was to assess the reliability and validity of the PAL Checklist when used with older people with dementia. A postal questionnaire sent to activity providers assessed content validity. Validity and reliability were measured in a sample of 60 older people with dementia. The questionnaire response rate was 83% (102/122). Most respondents felt no important items were missing. Seven of the nine activities were ranked as 'very important' or 'essential' by at least 77% of the sample, indicating very good content validity. Correlation with measures of cognition, severity of dementia and activity performance demonstrated strong concurrent validity. Inter-item correlation indicated strong construct validity. Cronbach's alpha coefficient measured internal consistency as excellent (0.95). All items achieved acceptable test-retest reliability, and the majority demonstrated acceptable inter-rater reliability. We conclude that the PAL Checklist demonstrates adequate validity and reliability when used with older people with dementia and appears a useful tool for a variety of care settings.
Lo, Wing-Sze; Ho, Sai-Yin; Wong, Bonny Yee-Man; Mak, Kwok-Kei; Lam, Tai-Hing
2011-06-01
The reliability and validity of Stunkard's Figure Rating Scale (FRS) as a measure of current body size (CBS) have been established in Western adolescent girls but not in non-Western populations. We examined the validity and test-retest reliability of Stunkard's FRS in assessing CBS among Chinese adolescents. Methods. In a school-based survey in Hong Kong, 5666 adolescents (boys: 45.1%; mean age 14.7 years) provided data on self-reported height and weight, CBS, perceived weight status, and health-related quality of life using the Medical Outcomes Study Short-Form version 2 (SF-12v2). Height and weight were also objectively measured. Spearman's correlation was used to assess construct validity, concurrent validity and test-retest reliability. Convergent and discriminant validity were good: CBS correlated strongly with weight and self-reported/measured BMI, but only weakly with SF-12v2. CBS correlated strongly with perceived weight status, showing concurrent validity. Spearman's correlation (r) for CBS was 0.78 for girls and 0.72 for boys, indicating good test-retest reliability. Validity and reliability results did not differ significantly between senior and junior grade adolescents. Our findings support the use of Stunkard's FRS to measure body size among Chinese adolescents.
Rakotonarivo, O Sarobidy; Schaafsma, Marije; Hockley, Neal
2016-12-01
While discrete choice experiments (DCEs) are increasingly used in the field of environmental valuation, they remain controversial because of their hypothetical nature and the contested reliability and validity of their results. We systematically reviewed evidence on the validity and reliability of environmental DCEs from the past thirteen years (Jan 2003-February 2016). 107 articles met our inclusion criteria. These studies provide limited and mixed evidence of the reliability and validity of DCE. Valuation results were susceptible to small changes in survey design in 45% of outcomes reporting reliability measures. DCE results were generally consistent with those of other stated preference techniques (convergent validity), but hypothetical bias was common. Evidence supporting theoretical validity (consistency with assumptions of rational choice theory) was limited. In content validity tests, 2-90% of respondents protested against a feature of the survey, and a considerable proportion found DCEs to be incomprehensible or inconsequential (17-40% and 10-62% respectively). DCE remains useful for non-market valuation, but its results should be used with caution. Given the sparse and inconclusive evidence base, we recommend that tests of reliability and validity are more routinely integrated into DCE studies and suggest how this might be achieved. Copyright © 2016 The Authors. Published by Elsevier Ltd. All rights reserved.
The Draw-A-Person Test: an indicator of children's cognitive and socioemotional adaptation?
ter Laak, J; de Goede, M; Aleva, A; van Rijswijk, P
2005-03-01
The authors examined aspects of reliability and validity of the Goodenough-Harris Draw-A-Person Test (DAP; D. B. Harris, 1963). The participants were 115 seven- to nine-year-old students attending regular or special education schools. Three judges, with a modest degree of training similar to that found among practicing clinicians, rated the students' human figure drawings on developmental and personality variables. The authors found that counting details and determining developmental level in the DAP test could be carried out reliably by judges with limited experience. However, the reliability of judgments of children's social and emotional development and personality was insufficient. Older students and students attending regular schools received significantly higher scores than did younger students or students attending special education schools. The authors found that the success of the DAP test as an indicator of cognitive level, socioemotional development, and personality is limited when global judgments are used. The authors concluded that more specific, reliable, valid, and useful scoring systems are needed for the DAP test.
2013-01-01
Background: Yearly formative knowledge testing (also known as progress testing) was shown to have a limited construct-validity and reliability in postgraduate medical education. One way to improve construct-validity and reliability is to improve the authenticity of a test. As easily accessible internet has become inseparably linked to daily clinical practice, we hypothesized that allowing internet access for a limited amount of time during the progress test would improve the perception of authenticity (face-validity) of the test, which would in turn improve the construct-validity and reliability of postgraduate progress testing. Methods: Postgraduate trainees taking the yearly knowledge progress test were asked to participate in a study where they could access the internet for 30 minutes at the end of a traditional pen and paper test. Before and after the test they were asked to complete a short questionnaire regarding the face-validity of the test. Results: Mean test scores increased significantly for all training years. Trainees indicated that the face-validity of the test improved with internet access and that they would like to continue to have internet access during future testing. Internet access did not improve the construct-validity or reliability of the test. Conclusion: Improving the face-validity of postgraduate progress testing, by adding the possibility to search the internet for a limited amount of time, positively influences test performance and face-validity. However, it did not change the reliability or the construct-validity of the test. PMID:24195696
Validity and reliability of Nintendo Wii Fit balance scores.
Wikstrom, Erik A
2012-01-01
Interactive gaming systems have the potential to help rehabilitate patients with musculoskeletal conditions. The Nintendo Wii Balance Board, which is part of the Wii Fit game, could be an effective tool to monitor progress during rehabilitation because the board and game can provide objective measures of balance. However, the validity and reliability of Wii Fit balance scores remain unknown. To determine the concurrent validity of balance scores produced by the Wii Fit game and the intrasession and intersession reliability of Wii Fit balance scores. Descriptive laboratory study. Sports medicine research laboratory. Forty-five recreationally active participants (age = 27.0 ± 9.8 years, height = 170.9 ± 9.2 cm, mass = 72.4 ± 11.8 kg) with a heterogeneous history of lower extremity injury. Participants completed a single-limb-stance task on a force plate and the Star Excursion Balance Test (SEBT) during the first test session. Twelve Wii Fit balance activities were completed during 2 test sessions separated by 1 week. Postural sway in the anteroposterior (AP) and mediolateral (ML) directions and the AP, ML, and resultant center-of-pressure (COP) excursions were calculated from the single-limb stance. The normalized reach distance was recorded for the anterior, posteromedial, and posterolateral directions of the SEBT. Wii Fit balance scores that the game software generated also were recorded. All 96 of the calculated correlation coefficients among Wii Fit activity outcomes and established balance outcomes were interpreted as poor (r < 0.50). Intrasession reliability for Wii Fit balance activity scores ranged from good (intraclass correlation coefficient [ICC] = 0.80) to poor (ICC = 0.39), with 8 activities having poor intrasession reliability. Similarly, 11 of the 12 Wii Fit balance activity scores demonstrated poor intersession reliability, with scores ranging from fair (ICC = 0.74) to poor (ICC = 0.29). Wii Fit balance activity scores had poor concurrent validity relative to COP outcomes and SEBT reach distances. In addition, the included Wii Fit balance activity scores generally had poor intrasession and intersession reliability.
von Dincklage, Falk; Lichtner, Gregor; Suchodolski, Klaudiusz; Ragaller, Maximilian; Friesdorf, Wolfgang; Podtschaske, Beatrice
2017-08-01
The implementation of computerized critical care information systems (CCIS) can improve the quality of clinical care and staff satisfaction, but also holds risks of disrupting the workflow with consecutive negative impacts. The usability of CCIS is one of the key factors determining their benefits and weaknesses. However, no tailored instrument exists to measure the usability of such systems. Therefore, the aim of this study was to design and validate a questionnaire that measures the usability of CCIS. Following a mixed-method design approach, we developed a questionnaire comprising two evaluation models to assess the usability of CCIS: (1) the task-specific model rates the usability individually for several tasks which CCIS could support and which we derived by analyzing work processes in the ICU; (2) the characteristic-specific model rates the different aspects of the usability, as defined by the international standard "ergonomics of human-system interaction". We tested validity and reliability of the digital version of the questionnaire in a sample population. In the sample population of 535 participants both usability evaluation models showed a strong correlation with the overall rating of the system (multiple correlation coefficients ≥0.80) as well as a very high internal consistency (Cronbach's alpha ≥0.93). The novel questionnaire is a valid and reliable instrument to measure the usability of CCIS and can be used to study the influence of the usability on their implementation benefits and weaknesses.
Ross, Amy M; Ilic, Kelley; Kiyoshi-Teo, Hiroko; Lee, Christopher S
2017-12-26
The purpose of this study was to establish the psychometric properties of the new 16-item leadership environment scale. The leadership environment scale was based on complexity science concepts relevant to complex adaptive health care systems. A workforce survey of direct-care nurses was conducted (n = 1,443) in Oregon. Confirmatory factor analysis, exploratory factor analysis, concordant validity test and reliability tests were conducted to establish the structure and internal consistency of the leadership environment scale. Confirmatory factor analysis indices approached acceptable thresholds of fit with a single factor solution. Exploratory factor analysis showed improved fit with a two-factor model solution; the factors were labelled 'influencing relationships' and 'interdependent system supports'. Moderate to strong convergent validity was observed between the leadership environment scale/subscales and both the nursing workforce index and the safety organising scale. Reliability of the leadership environment scale and subscales was strong, with all alphas ≥.85. The leadership environment scale is structurally sound and reliable. Nursing management can employ adaptive complexity leadership attributes, measure their influence on the leadership environment, subsequently modify system supports and relationships and improve the quality of health care systems. The leadership environment scale is an innovative fit to complex adaptive systems and how nurses act as leaders within these systems. © 2017 John Wiley & Sons Ltd.
ERIC Educational Resources Information Center
Worrell, Frank C.; Mello, Zena R.
2007-01-01
In this study, the authors examined the reliability, structural validity, and concurrent validity of Zimbardo Time Perspective Inventory (ZTPI) scores in a group of 815 academically talented adolescents. Reliability estimates of the purported factors' scores were in the low to moderate range. Exploratory factor analysis supported a five-factor…
ERIC Educational Resources Information Center
Cizek, Gregory J.
2009-01-01
Reliability and validity are two characteristics that must be considered whenever information about student achievement is collected. However, those characteristics--and the methods for evaluating them--differ in large-scale testing and classroom testing contexts. This article presents the distinctions between reliability and validity in the two…
ERIC Educational Resources Information Center
Arce-Ferrer, Alvaro J.; Castillo, Irene Borges
2007-01-01
The use of face-to-face interviews is controversial for college admissions decisions in light of the lack of availability of validity and reliability evidence for most college admission processes. This study investigated reliability and incremental predictive validity of a face-to-face postgraduate college admission interview with a sample of…
An Integrated Approach to Establish Validity and Reliability of Reading Tests
ERIC Educational Resources Information Center
Razi, Salim
2012-01-01
This study presents the processes of developing and establishing reliability and validity of a reading test by administering an integrative approach, as conventional reliability and validity measures only superficially reveal the difficulty of a reading test. In this respect, analysing vocabulary frequency of the test is regarded as a more eligible way…
ERIC Educational Resources Information Center
Klesius, Janell P.; Homan, Susan P.
1985-01-01
The article reviews validity and reliability studies on the informal reading inventory, a diagnostic instrument to identify reading grade-level placement and strengths and weaknesses in word recognition and comprehension. Gives suggestions to improve the validity and reliability of existing inventories and to evaluate them in newly published…
The type of mat (Contact vs. Photocell) affects vertical jump height estimated from flight time.
García-López, Juan; Morante, Juan C; Ogueta-Alday, Ana; Rodríguez-Marroyo, Jose A
2013-04-01
The purposes of this study were to analyze the validity and reliability of 2 photocell mats and to probe the possible influence of the type of mat (contact vs. photocell) on vertical jump height estimated from flight time. In 2 separate studies, 89 and 92 physical students performed 3 countermovement jumps that were simultaneously registered by a Force Plate (gold standard method), 2 photocell mats (SportJump System Pro and ErgoJump Plus), and a contact mat (SportJump-v1.0). The first study showed that the 2 photocell mats underestimated the vertical jump height (1.3 ± 0.2 cm and 5.9 ± 5.2 cm, respectively), but only SportJump System Pro showed a high correlation with the Force Plate (r = 0.999 and 0.676, respectively) and good intraday reliability (coefficient of variation = 2.98 and 15.94%, intraclass correlation coefficients = 0.95-0.97 and 0.45-0.57, respectively). The second study demonstrated a strong correlation (r = 0.994) between the 2 technologies (contact vs. photocell mats) with differences in vertical jump height of 2.0 ± 0.8 cm (95% confidence interval = 1.9-2.1 cm), which depended on both flight time and subjects' body mass. In conclusion, SportJump System Pro was a valid and reliable device. New devices that measure vertical jump height from flight time should be validated. The type of mat (contact vs. photocell) affected the vertical jump height by approximately 6% (approximately 2 cm in this study), which should be considered in further studies. The use of validated photocell mats instead of contact mats was recommended.
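Estimating jump height from flight time, as described above, usually relies on the projectile-motion relation h = g·t²/8, which assumes the jumper's take-off and landing postures are the same. The abstract does not give the formula the devices use, so the sketch below is the standard textbook relation; the function name `jump_height_cm` and the chosen value of g are assumptions.

```python
STANDARD_GRAVITY = 9.81  # m/s^2, assumed value for illustration


def jump_height_cm(flight_time_s, g=STANDARD_GRAVITY):
    """Vertical jump height in centimetres from flight time in seconds.

    The body rises for half the flight time, so h = g * (t / 2)**2 / 2 = g * t**2 / 8.
    """
    return g * flight_time_s ** 2 / 8 * 100


if __name__ == "__main__":
    for t in (0.45, 0.50, 0.55):
        print(f"flight time {t:.2f} s -> {jump_height_cm(t):.1f} cm")
```

Because height scales with the square of flight time, a small timing difference between sensor types is amplified: under this relation, a device that registers flight times about 3% longer would report heights roughly 6% higher.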
Validity and reliability of the Diagnostic Adaptive Behaviour Scale.
Tassé, M J; Schalock, R L; Balboni, G; Spreat, S; Navas, P
2016-01-01
The Diagnostic Adaptive Behaviour Scale (DABS) is a new standardised adaptive behaviour measure that provides information for evaluating limitations in adaptive behaviour for the purpose of determining a diagnosis of intellectual disability. This article presents validity evidence and reliability data for the DABS. Validity evidence was based on comparing DABS scores with scores obtained on the Vineland Adaptive Behaviour Scale, second edition. The stability of the test scores was measured using a test and retest, and inter-rater reliability was assessed by computing the inter-respondent concordance. The DABS convergent validity coefficients ranged from 0.70 to 0.84, while the test-retest reliability coefficients ranged from 0.78 to 0.95, and the inter-rater concordance as measured by intraclass correlation coefficients ranged from 0.61 to 0.87. All obtained validity and reliability indicators were strong and comparable with the validity and reliability coefficients of the most commonly used adaptive behaviour instruments. These results and the advantages of the DABS for clinician and researcher use are discussed. © 2015 MENCAP and International Association of the Scientific Study of Intellectual and Developmental Disabilities and John Wiley & Sons Ltd.
[Reliability and validity of Driving Anger Scale in professional drivers in China].
Li, Z; Yang, Y M; Zhang, C; Li, Y; Hu, J; Gao, L W; Zhou, Y X; Zhang, X J
2017-11-10
Objective: To assess the reliability and validity of the Chinese version of Driving Anger Scale (DAS) in professional drivers in China and provide a scientific basis for the application of the scale in drivers in China. Methods: Professional drivers, including taxi drivers, bus drivers, truck drivers and school bus drivers, were selected to complete the questionnaire. Cronbach's α and split-half reliability were calculated to evaluate the reliability of DAS, and content, construct, discriminant and convergent validity were assessed to measure the validity of the scale. Results: The overall Cronbach's α of DAS was 0.934 and the split-half reliability was 0.874. The correlation coefficient of each subscale with the total scale was 0.639-0.922. The simplified version of DAS supported a presupposed six-factor structure, explaining 56.371% of the total variance revealed by exploratory factor analysis. The DAS had good convergent and discriminant validity, with the success rate of calibration experiment of 100%. Conclusion: DAS has good reliability and validity in professional drivers in China, and the use of DAS is worth promoting in drivers.
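For readers unfamiliar with the two reliability statistics named above, the following is a minimal sketch of Cronbach's α and of a split-half coefficient with the Spearman-Brown correction. The odd/even item split and all function names are assumptions; the abstract does not say how the halves were formed or which software was used.

```python
import math
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha; `items` is a list of item columns (one list per item,
    each holding that item's scores for all respondents)."""
    k = len(items)
    n = len(items[0])
    totals = [sum(col[i] for col in items) for i in range(n)]
    sum_item_var = sum(variance(col) for col in items)
    return k / (k - 1) * (1.0 - sum_item_var / variance(totals))


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def split_half_reliability(items):
    """Split-half reliability using an odd/even item split (an assumption),
    stepped up to full test length with the Spearman-Brown formula."""
    n = len(items[0])
    half_a = [sum(col[i] for col in items[0::2]) for i in range(n)]
    half_b = [sum(col[i] for col in items[1::2]) for i in range(n)]
    r = pearson(half_a, half_b)
    return 2.0 * r / (1.0 + r)


if __name__ == "__main__":
    # Hypothetical responses: 4 items, 5 respondents.
    data = [[1, 2, 3, 4, 5], [2, 3, 3, 5, 4], [1, 3, 3, 4, 5], [2, 2, 4, 5, 5]]
    print(round(cronbach_alpha(data), 3), round(split_half_reliability(data), 3))
```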
On the next generation of reliability analysis tools
NASA Technical Reports Server (NTRS)
Babcock, Philip S., IV; Leong, Frank; Gai, Eli
1987-01-01
The current generation of reliability analysis tools concentrates on improving the efficiency of the description and solution of the fault-handling processes and providing a solution algorithm for the full system model. The tools have improved user efficiency in these areas to the extent that the problem of constructing the fault-occurrence model is now the major analysis bottleneck. For the next generation of reliability tools, it is proposed that techniques be developed to improve the efficiency of the fault-occurrence model generation and input. Further, the goal is to provide an environment permitting a user to provide a top-down design description of the system from which a Markov reliability model is automatically constructed. Thus, the user is relieved of the tedious and error-prone process of model construction, permitting an efficient exploration of the design space, and an independent validation of the system's operation is obtained. An additional benefit of automating the model construction process is the opportunity to reduce the specialized knowledge required. Hence, the user need only be an expert in the system he is analyzing; the expertise in reliability analysis techniques is supplied.
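The idea of generating a Markov reliability model from a higher-level description can be illustrated with a toy case. The sketch below is not the tool proposed in the report: it assumes a k-of-n system of identical, non-repairable components with a constant failure rate, builds the transition rates from that description, and integrates the state probabilities with a simple Euler scheme. All names and parameters are hypothetical.

```python
import math


def build_k_of_n_rates(n, k, lam):
    """Generate Markov transition rates from a high-level description.

    State = number of working components. With i components working,
    the next failure occurs at rate i * lam. State k - 1 stands for
    "system failed" and is absorbing.
    """
    return {(i, i - 1): i * lam for i in range(n, k - 1, -1)}


def reliability(n, k, lam, t_end, steps=10000):
    """Probability that at least k of n components still work at t_end,
    obtained by forward-Euler integration of the Markov model."""
    rates = build_k_of_n_rates(n, k, lam)
    prob = {state: 0.0 for state in range(k - 1, n + 1)}
    prob[n] = 1.0  # all components work at t = 0
    dt = t_end / steps
    for _ in range(steps):
        flow = {state: 0.0 for state in prob}
        for (src, dst), rate in rates.items():
            moved = rate * prob[src] * dt
            flow[src] -= moved
            flow[dst] += moved
        for state in prob:
            prob[state] += flow[state]
    return sum(p for state, p in prob.items() if state >= k)


if __name__ == "__main__":
    lam, t = 1e-3, 1000.0  # hypothetical failure rate (per hour) and mission time
    closed_form = 2 * math.exp(-lam * t) - math.exp(-2 * lam * t)
    print(round(reliability(2, 1, lam, t), 4), round(closed_form, 4))
```

For the duplex case (n = 2, k = 1) the numerical result can be checked against the closed form 2e^(-λt) - e^(-2λt), which is a useful sanity test before trusting a generated model of a larger system.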
ERIC Educational Resources Information Center
Watson, J. Allen; And Others
1989-01-01
Describes study that was conducted to determine the feasibility of networking home microcomputers with a university mainframe system in order to investigate a new family process research paradigm, as well as the design and function of the microcomputer/mainframe system. Test instrumentation is described and systems' reliability and validity are…
Effect of individual shades on reliability and validity of observers in colour matching.
Lagouvardos, P E; Diamanti, H; Polyzois, G
2004-06-01
The effect of individual shades in shade guides on the reliability and validity of measurements in a colour matching process is very important. Observers' agreement on shades and the sensitivity/specificity of shades can give an estimate of each shade's effect on observers' reliability and validity. In the present study, a group of 16 students matched 15 shades of a Kulzer's guide and 10 human incisors to Kulzer's and/or Vita's shade tabs, in 4 different tests. The results showed that shades I, B10, C40, A35 and A10 were those with the highest reliability and validity values. In conclusion, a) the matching process with shades of different materials was not accurate enough, b) some shades produce a more reliable and valid match than others and c) teeth are matched with relative difficulty.
The reliability and validity of a sexual functioning questionnaire.
Corty, E W; Althof, S E; Kurit, D M
1996-01-01
The present study assessed the reliability and validity of a measure of sexual functioning, the CMSH-SFQ, for male patients and their partners. The CMSH-SFQ measures erectile and orgasmic functioning, sexual drive, frequency of sexual behavior, and sexual satisfaction. Test-retest reliability was assessed with 19 males and 19 females for the baseline CMSH-SFQ. Criterion validity was measured by comparing the answers of 25 male patients to those of their partners at baseline and follow-up. The majority of items had acceptable levels of reliability and validity. The CMSH-SFQ provides a reliable and valid device that can be used to measure global sexual functioning in men and their partners and may be used to evaluate the efficacy of treatments for sexual dysfunctions. Limitations and suggestions for use of the CMSH-SFQ are addressed.
Reliability and validity of the McDonald Play Inventory.
McDonald, Ann E; Vigen, Cheryl
2012-01-01
This study examined the ability of a two-part self-report instrument, the McDonald Play Inventory, to reliably and validly measure the play activities and play styles of 7- to 11-yr-old children and to discriminate between the play of neurotypical children and children with known learning and developmental disabilities. A total of 124 children ages 7-11 recruited from a sample of convenience and a subsample of 17 parents participated in this study. Reliability estimates yielded moderate correlations for internal consistency, total test intercorrelations, and test-retest reliability. Validity estimates were established for content and construct validity. The results suggest that a self-report instrument yields reliable and valid measures of a child's perceived play performance and discriminates between the play of children with and without disabilities. Copyright © 2012 by the American Occupational Therapy Association, Inc.
ERIC Educational Resources Information Center
Viglione, Donald J.; Perry, William; Giromini, Luciano; Meyer, Gregory J.
2011-01-01
We used multiple regression to calculate a new Ego Impairment Index (EII-3). The aim was to incorporate changes in the component variables and distribution of the number of responses as found in the new Rorschach Performance Assessment System, while sustaining the validity and reliability of previous EIIs. The EII-3 formula was derived from a…
A Rasch-Based Validation of the Hooper Visual Organization Test in Chinese-Speaking Children
ERIC Educational Resources Information Center
Wuang, Yee-Pay; Wang, Li-Chen; Su, Chwen-Yng
2010-01-01
The aim of this study was to examine the validation of the Hooper Visual Organization Test (HVOT) for use in children by testing for item fit, unidimensionality, item hierarchy, reliability, and screening capacity. A modified scoring system was devised for the HVOT so that children received some credit for being able to describe the function of…
ERIC Educational Resources Information Center
Doménech-Betoret, Fernando; Fortea-Bagán, Miguel Angel
2015-01-01
Introduction: Education research has clearly verified that a student's perception of the system to evaluate the subject matter will play a fundamental role in his/her implication (deep approach vs. surface approach) in the teaching/learning process of the subject matter. The present work aims to examine the factorial validity and reliability of a…
Eslami, Ahmad Ali; Amidi Mazaheri, Maryam; Mostafavi, Firoozeh; Abbasi, Mohamad Hadi; Noroozi, Ensieh
2014-01-01
Assessment of social skills is a necessary requirement to develop and evaluate the effectiveness of cognitive and behavioral interventions. This paper reports the cultural adaptation and psychometric properties of the Farsi version of the social skills rating system-secondary students form (SSRS-SS) questionnaire (Gresham and Elliot, 1990), in a normative sample of secondary school students. A two-phase design was used: phase 1 consisted of the linguistic adaptation, and in phase 2, using cross-sectional sample survey data, the construct validity and reliability of the Farsi version of the SSRS-SS were examined in a sample of 724 adolescents aged from 13 to 19 years. Content validity index was excellent, and the floor/ceiling effects were low. After deleting five of the original SSRS-SS items, the findings gave support for the item convergent and divergent validity. Factor analysis revealed four subscales. Results showed good internal consistency (0.89) and temporal stability (0.91) for the total scale score. Findings demonstrated support for the use of the 27-item Farsi version in the school setting. Directions for future research regarding the applicability of the scale in other settings and populations of adolescents are discussed.
Validation of the breast evaluation questionnaire for breast hypertrophy and breast reduction.
Lewin, Richard; Elander, Anna; Lundberg, Jonas; Hansson, Emma; Thorarinsson, Andri; Claudelin, Malin; Bladh, Helena; Lidén, Mattias
2018-06-13
There is a lack of published, validated questionnaires for evaluating psychosocial morbidity in patients with breast hypertrophy undergoing breast reduction surgery. The aim was to validate the breast evaluation questionnaire (BEQ), originally developed for the assessment of breast augmentation patients, for the assessment of psychosocial morbidity in patients with breast hypertrophy undergoing breast reduction surgery. Validation study. Subjects: Women with macromastia. Methods: The validation of the BEQ, adapted to breast reduction, was performed in several steps. Content validity, reliability, construct validity and responsiveness were assessed. The original version was adjusted according to the results for content validity and resulted in item reduction and a modified BEQ (mBEQ) that was then assessed for reliability, construct validity and responsiveness. Internal and external validation was performed for the modified BEQ. Convergent validity was tested against Breast-Q (reduction) and discriminant validity was tested against the SF-36. Known-groups validation revealed significant differences between the normal population and patients undergoing breast reduction surgery. The BEQ showed good reliability by test-retest analysis and high responsiveness. The modified BEQ may be a reliable, valid and responsive instrument for assessing women who undergo breast reduction.
The case against one-shot testing for initial dental licensure.
Chambers, David W; Dugoni, Arthur A; Paisley, Ian
2004-03-01
High-stakes tests are expected to meet standards for cost-effectiveness, fairness, transparency, high reliability, and high validity. It is questionable whether initial licensure examinations in dentistry meet such standards. Decades of piecemeal adjustments in the system have resulted in limited improvement. The essential flaw in the system is reliance on a one-shot sample of a small segment of the skills, understanding, and supporting values needed for today's professional practice of dentistry. The "snapshot" approach to testing produces inherently substandard levels of reliability and validity. A three-step alternative is proposed: boards should (1) define the competencies required of beginning practitioners, (2) establish the psychometric standards needed to make defensible judgments about candidates, and (3) base licensure decisions only on portfolios of evidence that test for defined competencies at established levels of quality.
Formiga, Magno F; Roach, Kathryn E; Vital, Isabel; Urdaneta, Gisel; Balestrini, Kira; Calderon-Candelario, Rafael A; Campos, Michael A; Cahalin, Lawrence P
2018-01-01
The Test of Incremental Respiratory Endurance (TIRE) provides a comprehensive assessment of inspiratory muscle performance by measuring maximal inspiratory pressure (MIP) over time. The integration of MIP over inspiratory duration (ID) provides the sustained maximal inspiratory pressure (SMIP). Evidence on the reliability and validity of these measurements in COPD is not currently available. Therefore, we assessed the reliability, responsiveness and construct validity of the TIRE measures of inspiratory muscle performance in subjects with COPD. Test-retest reliability, known-groups and convergent validity assessments were implemented simultaneously in 81 male subjects with mild to very severe COPD. TIRE measures were obtained using the portable PrO2 device, following standard guidelines. All TIRE measures were found to be highly reliable, with SMIP demonstrating the strongest test-retest reliability with a nearly perfect intraclass correlation coefficient (ICC) of 0.99, while MIP and ID clustered closely together behind SMIP with ICC values of about 0.97. Our findings also demonstrated known-groups validity of all TIRE measures, with SMIP and ID yielding larger effect sizes when compared to MIP in distinguishing between subjects of different COPD status. Finally, our analyses confirmed convergent validity for both SMIP and ID, but not MIP. The TIRE measures of MIP, SMIP and ID have excellent test-retest reliability and demonstrated known-groups validity in subjects with COPD. SMIP and ID also demonstrated evidence of moderate convergent validity and appear to be more stable measures in this patient population than the traditional MIP.
The reliability and validity of ultrasound to quantify muscles in older adults: a systematic review
Scafoglieri, Aldo; Jager‐Wittenaar, Harriët; Hobbelen, Johannes S.M.; van der Schans, Cees P.
2017-01-01
This review evaluates the reliability and validity of ultrasound to quantify muscles in older adults. The databases PubMed, Cochrane, and Cumulative Index to Nursing and Allied Health Literature were systematically searched for studies. In 17 studies, the reliability (n = 13) and validity (n = 8) of ultrasound to quantify muscles in community‐dwelling older adults (≥60 years) or a clinical population were evaluated. Four out of 13 reliability studies investigated both intra‐rater and inter‐rater reliability. Intraclass correlation coefficient (ICC) scores for reliability ranged from −0.26 to 1.00. The highest ICC scores were found for the vastus lateralis, rectus femoris, upper arm anterior, and the trunk (ICC = 0.72 to 1.000). All included validity studies found ICC scores ranging from 0.92 to 0.999. Two studies describing the validity of ultrasound to predict lean body mass showed good validity as compared with dual‐energy X‐ray absorptiometry (r² = 0.92 to 0.96). This systematic review shows that ultrasound is a reliable and valid tool for the assessment of muscle size in older adults. More high‐quality research is required to confirm these findings in both clinical and healthy populations. Furthermore, ultrasound assessment of small muscles needs further evaluation. Ultrasound to predict lean body mass is feasible; however, future research is required to validate prediction equations in older adults with varying function and health. PMID:28703496
Kolodziejczyk, Julia K; Norman, Gregory J; Rock, Cheryl L; Arredondo, Elva M; Roesch, Scott C; Madanat, Hala; Patrick, Kevin
2016-01-01
This study evaluates the reliability and validity of the strategies for weight management (SWM) measure, a questionnaire that assesses weight management strategies for adults. The SWM includes 20 items that are categorized within the following subscales: (1) energy intake, (2) energy expenditure, (3) self-monitoring, and (4) self-regulation. Baseline and 6-month data were collected from 404 overweight/obese adults (mean age=22±3.8 years, 68% ethnic minority) enrolled in a randomized controlled trial aiming to reduce weight by improving diet and physical activity behaviours. Reliability and validity were assessed for each subscale separately. Cronbach alpha was conducted to assess reliability. Concurrent, construct I (sensitivity to the study treatment condition), and construct II (relationship to the outcomes) validity were assessed using linear regressions with the following outcome measures: weight, self-reported diet, and weekly energy expenditure. All subscales showed strong internal consistency. The strength of the validity evidence depended on subscale and validity type. The strongest validity evidence was concurrent validity of the energy intake and energy expenditure subscales; construct I validity of the energy intake and self-monitoring subscales; and construct II validity of the energy intake, energy expenditure, and self-regulation subscales. Results indicate that the SWM can be used to assess weight management strategies among an ethnically diverse sample of adults as each subscale showed evidence of reliability and select types of validity. As validity is an accumulation of evidence over multiple studies, this study provides initial reliability and validity evidence in one population segment. Copyright © 2015 Asia Oceania Association for the Study of Obesity. Published by Elsevier Ltd. All rights reserved.
[Reliability and validity of the Chinese version on Alcohol Use Disorders Identification Test].
Zhang, C; Yang, G P; Li, Z; Li, X N; Li, Y; Hu, J; Zhang, F Y; Zhang, X J
2017-08-10
Objective: To assess the reliability and validity of the Chinese version of the Alcohol Use Disorders Identification Test (AUDIT) among medical students in China and to guide the correct application of the scale. Methods: An E-questionnaire was developed and sent to medical students in five different colleges. All students took part voluntarily. Cronbach's α and split-half reliability were calculated to evaluate the reliability of AUDIT, while content, construct, discriminant and convergent validity were assessed to measure the validity of the scale. Results: The overall Cronbach's α of AUDIT was 0.782 and the split-half reliability was 0.711. The domain Cronbach's α and split-half reliability were 0.796 and 0.794 for hazardous alcohol use, 0.561 and 0.623 for dependence symptoms, and 0.647 and 0.640 for harmful alcohol use. The item-level content validity indices (I-CVI) ranged from 0.83 to 1.00, the scale-level content validity index (S-CVI/UA) was 0.90, the average scale-level content validity index (S-CVI/Ave) was 0.99 and the content validity ratios (CVR) ranged from 0.80 to 1.00. The simplified version of AUDIT supported a presupposed three-factor structure which could explain 61.175% of the total variance revealed through exploratory factor analysis. AUDIT seemed to have good convergent and discriminant validity, with the success rate of calibration experiment as 100%. Conclusion: AUDIT showed good reliability and validity among medical students in China, and its use is worth promoting.
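The content validity indices reported above (I-CVI, S-CVI/UA, S-CVI/Ave) can be sketched as follows, using the commonly cited definitions: an item's I-CVI is the share of experts rating it 3 or 4 on a 4-point relevance scale, S-CVI/Ave is the mean I-CVI, and S-CVI/UA is the share of items on which all experts agree. The abstract does not spell out its computation, so these definitions, the rating threshold and the function names are assumptions.

```python
def item_cvi(ratings, relevant_from=3):
    """I-CVI: proportion of experts who rate the item as relevant
    (rating >= relevant_from on an assumed 4-point scale)."""
    return sum(1 for r in ratings if r >= relevant_from) / len(ratings)


def scale_cvi(rating_matrix):
    """Return (S-CVI/Ave, S-CVI/UA) for a list of per-item expert ratings.

    S-CVI/Ave: mean of the item-level indices.
    S-CVI/UA: proportion of items rated relevant by every expert.
    """
    i_cvis = [item_cvi(ratings) for ratings in rating_matrix]
    ave = sum(i_cvis) / len(i_cvis)
    ua = sum(1 for value in i_cvis if value == 1.0) / len(i_cvis)
    return ave, ua


if __name__ == "__main__":
    # Hypothetical ratings: 3 items, 6 experts each.
    ratings = [
        [4, 4, 3, 4, 3, 4],
        [4, 3, 4, 4, 2, 4],
        [3, 4, 4, 4, 4, 3],
    ]
    print([round(item_cvi(r), 2) for r in ratings], scale_cvi(ratings))
```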
2014-01-01
Background Foot disease complications, such as foot ulcers and infection, contribute to considerable morbidity and mortality. These complications are typically precipitated by “high-risk factors”, such as peripheral neuropathy and peripheral arterial disease. High-risk factors are more prevalent in specific “at risk” populations such as diabetes, kidney disease and cardiovascular disease. To the best of the authors’ knowledge a tool capturing multiple high-risk factors and foot disease complications in multiple at risk populations has yet to be tested. This study aimed to develop and test the validity and reliability of a Queensland High Risk Foot Form (QHRFF) tool. Methods The study was conducted in two phases. Phase one developed a QHRFF using an existing diabetes foot disease tool, literature searches, stakeholder groups and expert panel. Phase two tested the QHRFF for validity and reliability. Four clinicians, representing different levels of expertise, were recruited to test validity and reliability. Three cohorts of patients were recruited; one tested criterion measure reliability (n = 32), another tested criterion validity and inter-rater reliability (n = 43), and another tested intra-rater reliability (n = 19). Validity was determined using sensitivity, specificity and positive predictive values (PPV). Reliability was determined using Kappa, weighted Kappa and intra-class correlation (ICC) statistics. Results A QHRFF tool containing 46 items across seven domains was developed. Criterion measure reliability of at least moderate categories of agreement (Kappa > 0.4; ICC > 0.75) was seen in 91% (29 of 32) tested items. Criterion validity of at least moderate categories (PPV > 0.7) was seen in 83% (60 of 72) tested items. Inter- and intra-rater reliability of at least moderate categories (Kappa > 0.4; ICC > 0.75) was seen in 88% (84 of 96) and 87% (20 of 23) tested items respectively. 
Conclusions The QHRFF had acceptable validity and reliability across the majority of items; particularly items identifying relevant co-morbidities, high-risk factors and foot disease complications. Recommendations have been made to improve or remove identified weaker items for future QHRFF versions. Overall, the QHRFF possesses suitable practicality, validity and reliability to assess and capture relevant foot disease items across multiple at risk populations. PMID:24468080
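The validity and agreement statistics named in this abstract (sensitivity, specificity, PPV and Kappa) can be sketched for a single yes/no item as below. This is an illustrative sketch with hypothetical counts and ratings; it shows unweighted Cohen's kappa only, and the study's weighted Kappa and ICC analyses are not reproduced.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity and positive predictive value from a 2x2 table
    (tp/fp/fn/tn = true/false positives and negatives against the criterion)."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
    }


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters scoring the same cases.

    kappa = (observed agreement - chance agreement) / (1 - chance agreement).
    Undefined (division by zero) if chance agreement equals 1.
    """
    n = len(rater_a)
    observed = sum(1 for a, b in zip(rater_a, rater_b) if a == b) / n
    categories = set(rater_a) | set(rater_b)
    expected = sum(
        (rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories
    )
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Hypothetical counts and ratings, not data from the QHRFF study.
    print(diagnostic_metrics(tp=40, fp=10, fn=5, tn=45))
    print(round(cohen_kappa([1, 1, 0, 0, 1, 0], [1, 0, 0, 0, 1, 1]), 3))
```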
Evaluation of Two Observational Assessment Systems for Children's Development and Learning
ERIC Educational Resources Information Center
Kim, Do-Hong; Smith, JaneDiane
2010-01-01
This study provided preliminary evidence for the reliability and validity of "Teaching Strategies GOLD", a recently developed observational system for assessing young children's development and learning. The measurement properties of "Teaching Strategies GOLD" were compared with those of an older instrument, "The Creative…
Environmental Validation of Legionella Control in a VHA Facility Water System.
Jinadatha, Chetan; Stock, Eileen M; Miller, Steve E; McCoy, William F
2018-03-01
OBJECTIVES We conducted this study to determine what sample volume, concentration, and limit of detection (LOD) are adequate for environmental validation of Legionella control. We also sought to determine whether time required to obtain culture results can be reduced compared to spread-plate culture method. We also assessed whether polymerase chain reaction (PCR) and in-field total heterotrophic aerobic bacteria (THAB) counts are reliable indicators of Legionella in water samples from buildings. DESIGN Comparative Legionella screening and diagnostics study for environmental validation of a healthcare building water system. SETTING Veterans Health Administration (VHA) facility water system in central Texas. METHODS We analyzed 50 water samples (26 hot, 24 cold) from 40 sinks and 10 showers using spread-plate cultures (International Standards Organization [ISO] 11731) on samples shipped overnight to the analytical lab. In-field, on-site cultures were obtained using the PVT (Phigenics Validation Test) culture dipslide-format sampler. A PCR assay for genus-level Legionella was performed on every sample. RESULTS No practical differences regardless of sample volume filtered were observed. Larger sample volumes yielded more detections of Legionella. No statistically significant differences at the 1 colony-forming unit (CFU)/mL or 10 CFU/mL LOD were observed. Approximately 75% less time was required when cultures were started in the field. The PCR results provided an early warning, which was confirmed by spread-plate cultures. The THAB results did not correlate with Legionella status. CONCLUSIONS For environmental validation at this facility, we confirmed that (1) 100 mL sample volumes were adequate, (2) 10× concentrations were adequate, (3) 10 CFU/mL LOD was adequate, (4) in-field cultures reliably reduced time to get results by 75%, (5) PCR provided a reliable early warning, and (6) THAB was not predictive of Legionella results. 
Infect Control Hosp Epidemiol 2018;39:259-266.
Construction of Valid and Reliable Test for Assessment of Students
ERIC Educational Resources Information Center
Osadebe, P. U.
2015-01-01
The study was carried out to construct a valid and reliable test in Economics for secondary school students. Two research questions were drawn to guide the establishment of validity and reliability for the Economics Achievement Test (EAT). It is a multiple choice objective test of five options with 100 items. A sample of 1000 students was randomly…
Validity and Reliability of a Medicine Ball Explosive Power Test.
ERIC Educational Resources Information Center
Stockbrugger, Barry A.; Haennel, Robert G.
2001-01-01
Evaluated the validity and reliability of a medicine ball throw test to evaluate explosive power. Data on competitive sand volleyball players who performed a medicine ball throw and a standard countermovement jump indicated that the medicine ball throw test was a valid and reliable way to assess explosive power for an analogous total-body movement…
The Validity and Reliability of the Mobbing Scale (MS)
ERIC Educational Resources Information Center
Yaman, Erkan
2009-01-01
The aim of this research is to develop the Mobbing Scale and examine its validity and reliability. The sample of the study consisted of 515 persons from Sakarya and Bursa. In this study, construct validity, internal consistency, test-retest reliability, and item analysis of the scale were examined. As a result of factor analysis for construct…
ERIC Educational Resources Information Center
Barbu, Otilia C.; Levine-Donnerstein, Deborah; Marx, Ronald W.; Yaden, David B., Jr.
2013-01-01
This study examined reliability and validity of the Devereux Early Childhood Assessment (DECA), based on samples of parents and teachers' ratings of 1,145 entering kindergartners in the Southwest. Confirmatory factor analysis showed that DECA presented good reliability and validity for manifest variables, corroborating previous findings. Three…
ERIC Educational Resources Information Center
Smith, Jack E.; Hakel, Milton D.
1979-01-01
Examined are questions pertinent to the use of the Position Analysis Questionnaire: Who can use the PAQ reliably and validly? Must one rely on trained job analysts? Can people having no direct contact with the job use the PAQ reliably and validly? Do response biases influence PAQ responses? (Author/KC)
Validity and Reliability of the Arabic Token Test for Children
ERIC Educational Resources Information Center
Alkhamra, Rana A.; Al-Jazi, Aya B.
2016-01-01
Background: The Token Test for Children (2nd edition) (TTFC) is a measure for assessing receptive language. In this study we describe the translation process, validity and reliability of the Arabic Token Test for Children (A-TTFC). Aims: The aim of this study is to translate, validate and establish the reliability of the Arabic Token Test for…
Construction and Evaluation of Reliability and Validity of Reasoning Ability Test
ERIC Educational Resources Information Center
Bhat, Mehraj A.
2014-01-01
This paper is based on the construction and evaluation of reliability and validity of reasoning ability test at secondary school students. In this paper an attempt was made to evaluate validity, reliability and to determine the appropriate standards to interpret the results of reasoning ability test. The test includes 45 items to measure six types…
Conceptualizing Essay Tests' Reliability and Validity: From Research to Theory
ERIC Educational Resources Information Center
Badjadi, Nour El Imane
2013-01-01
The current paper on writing assessment surveys the literature on the reliability and validity of essay tests. The paper aims to examine the two concepts in relationship with essay testing as well as to provide a snapshot of the current understandings of the reliability and validity of essay tests as drawn in recent research studies. Bearing in…
ERIC Educational Resources Information Center
Markon, Kristian E.; Chmielewski, Michael; Miller, Christopher J.
2011-01-01
In 2 meta-analyses involving 58 studies and 59,575 participants, we quantitatively summarized the relative reliability and validity of continuous (i.e., dimensional) and discrete (i.e., categorical) measures of psychopathology. Overall, results suggest an expected 15% increase in reliability and 37% increase in validity through adoption of a…
Paediatric Automatic Phonological Analysis Tools (APAT).
Saraiva, Daniela; Lousada, Marisa; Hall, Andreia; Jesus, Luis M T
2017-12-01
To develop the pediatric Automatic Phonological Analysis Tools (APAT) and to estimate inter and intrajudge reliability, content validity, and concurrent validity. The APAT were constructed using Excel spreadsheets with formulas. The tools were presented to an expert panel for content validation. The corpus used in the Portuguese standardized test Teste Fonético-Fonológico - ALPE produced by 24 children with phonological delay or phonological disorder was recorded, transcribed, and then inserted into the APAT. Reliability and validity of APAT were analyzed. The APAT present strong inter- and intrajudge reliability (>97%). The content validity was also analyzed (ICC = 0.71), and concurrent validity revealed strong correlations between computerized and manual (traditional) methods. The development of these tools contributes to fill existing gaps in clinical practice and research, since previously there were no valid and reliable tools/instruments for automatic phonological analysis, which allowed the analysis of different corpora.
Measurement Properties of a Park Use Questionnaire
Evenson, Kelly R.; Wen, Fang; Golinelli, Daniela; Rodríguez, Daniel A.; Cohen, Deborah A.
2012-01-01
We determined the criterion validity and test-retest reliability of a brief park use questionnaire. From five US locations, 232 adults completed a brief survey four times and wore a global positioning system (GPS) monitor for three weeks. We assessed validity for park visits during the past week and during a usual week by examining agreement between frequency and duration of park visits reported in the questionnaire to the GPS monitor results. Spearman correlation coefficients (SCC) were used to measure agreement. For past week park visit frequency and duration, the SCC were 0.62–0.65 and 0.62–0.67, respectively. For usual week park visit frequency and duration, the SCC were 0.40–0.50 and 0.50–0.53, respectively. Usual park visit frequency reliability was 0.78–0.88 (percent agreement 69%–82%) and usual park visit duration was 0.75–0.84 (percent agreement 64%–73%). These results suggest that the questionnaire to assess usual and past week park use had acceptable validity and reliability. PMID:23853386
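Agreement between questionnaire reports and GPS-derived visits was summarized above with Spearman correlation coefficients. A minimal sketch of that statistic, computed as the Pearson correlation of average ranks so that ties are handled, is given below; the sample values are hypothetical and the function names are assumptions.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared_rank = (pos + end) / 2.0 + 1.0
        for idx in order[pos:end + 1]:
            ranks[idx] = shared_rank
        pos = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    reported_visits = [0, 1, 2, 2, 5, 7]  # hypothetical questionnaire answers
    gps_visits = [0, 2, 1, 3, 4, 6]       # hypothetical GPS-derived counts
    print(round(spearman(reported_visits, gps_visits), 3))
```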
Assessing the competences associated with a nursing Bachelor thesis by means of rubrics.
Llaurado-Serra, M; Rodríguez, E; Gallart, A; Fuster, P; Monforte-Royo, C; De Juan, M Á
2018-07-01
Writing a Bachelor thesis is the last step in obtaining a university degree. The thesis may be job- or research-orientated, but it must demonstrate certain degree-level competences. Rubrics are a useful way of unifying the assessment criteria. To design a system of rubrics for assessing the competences associated with the Bachelor thesis of a nursing degree, to examine the system's reliability and validity and to analyse results in relation to the final thesis mark. Cross-sectional and psychometric study conducted between 2012 and 2014. Nursing degree at a Spanish university. Twelve tutors who designed the system of rubrics. Students (n = 76) who wrote their Bachelor thesis during the 2013-2014 academic year. After deciding which aspects would be assessed, who would assess them and when, the tutors developed seven rubrics (drafting process, assessment of the written thesis by the supervisor and by a panel, student self-assessment, peer assessment, tutor evaluation of the peer assessment and panel assessment of the viva). We analysed the reliability (inter-rater and internal consistency) and validity (convergent and discriminant) of the rubrics, and also the relationship between the competences assessed and the final thesis mark. All the rubrics had internal consistency coefficients >0.80. The rubric for oral communication skills (viva) yielded inter-rater reliability of 0.95. Factor analysis indicated a unidimensional structure for all but one of the rubrics, the exception being the rubric for peer assessment, which had a two-factor structure. The main competences associated with a good quality Bachelor thesis were written communication skills and the ability to work independently. The assessment system based on seven rubrics is shown to be valid and reliable. Writing a Bachelor thesis requires a range of degree-level competences and it offers nursing students the opportunity to develop their evidence-based practice skills. Copyright © 2018 Elsevier Ltd. 
Reliability and validity: Part II.
Davis, Debora Winders
2004-01-01
Determining measurement reliability and validity involves complex processes. There is usually room for argument about most instruments. It is important that the researcher clearly describes the processes upon which she made the decision to use a particular instrument, and presents the evidence available showing that the instrument is reliable and valid for the current purposes. In some cases, the researcher may need to conduct pilot studies to obtain evidence upon which to decide whether the instrument is valid for a new population or a different setting. In all cases, the researcher must present a clear and complete explanation for the choices she has made regarding reliability and validity. The consumer must then judge the degree to which the researcher has provided an adequate and theoretically sound rationale. Although I have tried to touch on most of the important concepts related to measurement reliability and validity, it is beyond the scope of this column to be exhaustive. There are textbooks devoted entirely to specific measurement issues if readers require more in-depth knowledge.
Toward Reliable and Energy Efficient Wireless Sensing for Space and Extreme Environments
NASA Technical Reports Server (NTRS)
Choi, Baek-Young; Boyd, Darren; Wilkerson, DeLisa
2017-01-01
Reliability is the critical challenge of wireless sensing in space systems operating in extreme environments. Energy efficiency is another concern for battery-powered wireless sensors. Considering the physics of wireless communications, we propose an approach called Software-Defined Wireless Communications (SDC) that dynamically selects reliable channel(s), avoiding unnecessary redundancy of channels, out of multiple distinct electromagnetic frequency bands such as radio and infrared frequencies. We validate the concept with Android and Raspberry Pi sensors and pseudo-extreme experiments. SDC can be utilized in many areas beyond space applications.
NASA Technical Reports Server (NTRS)
Dunham, J. R. (Editor); Knight, J. C. (Editor)
1982-01-01
The state of the art in the production of crucial software for flight control applications was addressed. The association between reliability metrics and software is considered. Thirteen software development projects are discussed. A short term need for research in the areas of tool development and software fault tolerance was indicated. For the long term, research in formal verification or proof methods was recommended. Formal specification and software reliability modeling were recommended as topics for both short and long term research.
Reliability and validity of the Safe Routes to school parent and student surveys
2011-01-01
Background The purpose of this study is to assess the reliability and validity of the U.S. National Center for Safe Routes to School's in-class student travel tallies and written parent surveys. Over 65,000 tallies and 374,000 parent surveys have been completed, but no published studies have examined their measurement properties. Methods Students and parents from two Charlotte, NC (USA) elementary schools participated. Tallies were conducted on two consecutive days using a hand-raising protocol; on day two students were also asked to recall the previous day's travel. The recall from day two was compared with day one to assess 24-hour test-retest reliability. Convergent validity was assessed by comparing parent-reports of students' travel mode with student-reports of travel mode. Two-week test-retest reliability of the parent survey was assessed by comparing within-parent responses. Reliability and validity were assessed using kappa statistics. Results A total of 542 students participated in the in-class student travel tally reliability assessment and 262 parent-student dyads participated in the validity assessment. Reliability was high for travel to and from school (kappa > 0.8); convergent validity was lower but still high (kappa > 0.75). There were no differences by student grade level. Two-week test-retest reliability of the parent survey (n = 112) ranged from moderate to very high for objective questions on travel mode and travel times (kappa range: 0.62 - 0.97) but was substantially lower for subjective assessments of barriers to walking to school (kappa range: 0.31 - 0.76). Conclusions The in-class student travel tally exhibited high reliability and validity at all elementary grades. The parent survey had high reliability on questions related to student travel mode, but lower reliability for attitudinal questions identifying barriers to walking to school. 
Parent survey design should be improved so that responses clearly indicate issues that influence parental decision making in regards to their children's mode of travel to school. PMID:21651794
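The abstract above reports "kappa statistics" without naming the variant. As a hedged sketch, assuming the unweighted Cohen's kappa for two sets of categorical answers, the chance-corrected agreement could be computed like this. The function name and the travel-mode data are hypothetical.

```python
from collections import Counter

def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two equal-length lists of category labels."""
    n = len(ratings_a)
    # Observed agreement: share of pairs with identical labels.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement from the marginal label frequencies of each list.
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / n**2
    # Note: undefined (division by zero) when expected agreement equals 1.
    return (observed - expected) / (1 - expected)

# Hypothetical travel modes: day-one tally vs. day-two recall of day one.
day_one = ["walk", "bus", "car", "car", "walk", "bus"]
recall = ["walk", "bus", "car", "walk", "walk", "bus"]
kappa = cohen_kappa(day_one, recall)  # 0.75 for this made-up data
```

Here five of six answers match (observed agreement 0.83), but one third of matches would be expected by chance alone, which is why kappa comes out lower than raw agreement.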
Taffarel, Marilda Onghero; Luna, Stelio Pacca Loureiro; de Oliveira, Flavia Augusta; Cardoso, Guilherme Schiess; Alonso, Juliana de Moura; Pantoja, Jose Carlos; Brondani, Juliana Tabarelli; Love, Emma; Taylor, Polly; White, Kate; Murrell, Joanna C
2015-04-01
Quantification of pain plays a vital role in the diagnosis and management of pain in animals. In order to refine and validate an acute pain scale for horses, a prospective, randomized, blinded study was conducted. Twenty-four client-owned adult horses were recruited and allocated to one of the following four groups: anaesthesia only (GA); pre-emptive analgesia and anaesthesia (GAA); anaesthesia, castration and postoperative analgesia (GC); or pre-emptive analgesia, anaesthesia and castration (GCA). One investigator, unaware of the treatment group, assessed all horses at time-points before and after intervention and completed the pain scale. Videos were also obtained at these time-points and were evaluated by a further four blinded evaluators who also completed the scale. The data were used to investigate the relevance, specificity, criterion validity and inter- and intra-observer reliability of each item on the pain scale, and to evaluate construct validity and responsiveness of the scale. Construct validity was demonstrated by the observed differences in scores between the groups, four hours after anaesthetic recovery and before administration of systemic analgesia in the GC group. Inter- and intra-observer reliability for the items was only satisfactory. Subsequently the pain scale was refined, based on results for relevance, specificity and total item correlation. Scale refinement and exclusion of items that did not meet predefined requirements generated a selection of relevant pain behaviours in horses. After further validation for reliability, these may be used to evaluate pain under clinical and experimental conditions.
Singh, Amika S; Vik, Froydis N; Chinapaw, Mai J M; Uijtdewilligen, Léonie; Verloigne, Maïté; Fernández-Alvira, Juan M; Stomfai, Sarolta; Manios, Yannis; Martens, Marloes; Brug, Johannes
2011-12-09
Insight in children's energy balance-related behaviours (EBRBs) and their determinants is important to inform obesity prevention research. Therefore, reliable and valid tools to measure these variables in large-scale population research are needed. To examine the test-retest reliability and construct validity of the child questionnaire used in the ENERGY-project, measuring EBRBs and their potential determinants among 10-12 year old children. We collected data among 10-12 year old children (n = 730 in the test-retest reliability study; n = 96 in the construct validity study) in six European countries, i.e. Belgium, Greece, Hungary, the Netherlands, Norway, and Spain. Test-retest reliability was assessed using the intra-class correlation coefficient (ICC) and percentage agreement comparing scores from two measurements, administered one week apart. To assess construct validity, the agreement between questionnaire responses and a subsequent face-to-face interview was assessed using ICC and percentage agreement. Of the 150 questionnaire items, 115 (77%) showed good to excellent test-retest reliability as indicated by ICCs > .60 or percentage agreement ≥ 75%. Test-retest reliability was moderate for 34 items (23%) and poor for one item. Construct validity appeared to be good to excellent for 70 (47%) of the 150 items, as indicated by ICCs > .60 or percentage agreement ≥ 75%. From the other 80 items, construct validity was moderate for 39 (26%) and poor for 41 items (27%). Our results demonstrate that the ENERGY-child questionnaire, assessing EBRBs of the child as well as personal, family, and school-environmental determinants related to these EBRBs, has good test-retest reliability and moderate to good construct validity for the large majority of items.
2011-01-01
Background Insight in children's energy balance-related behaviours (EBRBs) and their determinants is important to inform obesity prevention research. Therefore, reliable and valid tools to measure these variables in large-scale population research are needed. Objective To examine the test-retest reliability and construct validity of the child questionnaire used in the ENERGY-project, measuring EBRBs and their potential determinants among 10-12 year old children. Methods We collected data among 10-12 year old children (n = 730 in the test-retest reliability study; n = 96 in the construct validity study) in six European countries, i.e. Belgium, Greece, Hungary, the Netherlands, Norway, and Spain. Test-retest reliability was assessed using the intra-class correlation coefficient (ICC) and percentage agreement comparing scores from two measurements, administered one week apart. To assess construct validity, the agreement between questionnaire responses and a subsequent face-to-face interview was assessed using ICC and percentage agreement. Results Of the 150 questionnaire items, 115 (77%) showed good to excellent test-retest reliability as indicated by ICCs > .60 or percentage agreement ≥ 75%. Test-retest reliability was moderate for 34 items (23%) and poor for one item. Construct validity appeared to be good to excellent for 70 (47%) of the 150 items, as indicated by ICCs > .60 or percentage agreement ≥ 75%. From the other 80 items, construct validity was moderate for 39 (26%) and poor for 41 items (27%). Conclusions Our results demonstrate that the ENERGY-child questionnaire, assessing EBRBs of the child as well as personal, family, and school-environmental determinants related to these EBRBs, has good test-retest reliability and moderate to good construct validity for the large majority of items. PMID:22152048
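The two test-retest measures used in this study, the intra-class correlation coefficient and percentage agreement, together with the stated "good to excellent" rule (ICC > .60 or agreement ≥ 75%), can be sketched as below. The abstract does not say which ICC form was used, so the one-way random-effects, single-measure form is an assumption here, and all names and scores are hypothetical.

```python
def icc_oneway(scores):
    """One-way random-effects, single-measure ICC.

    scores: one list per subject, each holding k repeated measurements.
    """
    n, k = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    # Mean squares between subjects and within subjects.
    ms_between = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(scores, row_means) for x in row
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

def percent_agreement(scores):
    """Share of subjects (in percent) whose repeated answers are all identical."""
    same = sum(len(set(row)) == 1 for row in scores)
    return 100 * same / len(scores)

def good_to_excellent(scores):
    """Threshold rule quoted in the abstract: ICC > .60 or agreement >= 75%."""
    return icc_oneway(scores) > 0.60 or percent_agreement(scores) >= 75

# Hypothetical item scores from two administrations one week apart.
item = [[2, 2], [3, 4], [5, 5], [1, 1]]
agreement = percent_agreement(item)  # 75.0 for this made-up data
```

Reporting both measures is useful because the ICC can be unstable when nearly all children give the same answer, a case where percentage agreement still carries information.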
Validation and reliability of the Turkish Utian Quality-of-Life Scale in postmenopausal women.
Abay, Halime; Kaplan, Sena
2016-04-01
There are a limited number of menopause-specific quality-of-life scales for the Turkish population. This study was conducted to evaluate the validity and reliability of the Turkish Utian Quality-of-Life Scale in postmenopausal women. The study group comprised 250 postmenopausal women who applied to a training and research hospital's menopause clinic in Turkey. A survey form and the Turkish Utian Quality-of-Life Scale were used to collect data, and the Turkish version of Short Form-36 was used to evaluate reliability with an equivalent form. Language-validity, content-validity, and construct-validity methods were used to assess the validity of the scale, and Cronbach's α coefficient calculation and the equivalent-form reliability methods were used to assess the reliability of the scale. The Turkish Utian Quality-of-Life Scale was determined to be a valid and reliable instrument for measuring the quality of life of postmenopausal women. Confirmatory factor analysis demonstrated that the instrument fits well with 23 items and a four-factor model. The Cronbach's α coefficients for the quality-of-life domains were as follows: 0.88 overall, 0.79 health, 0.78 emotional, 0.76 sexual, and 0.75 occupational. Reliability of the instrument was confirmed through significant correlations between scores on the Turkish version of the Utian Quality-of-Life Scale and the Turkish version of the Short Form-36 (r = 0.745, P < 0.001). This research emphasizes that the Turkish Utian Quality-of-Life Scale is reliable and valid in postmenopausal women; it is a useful instrument for measuring quality of life during menopause.
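Cronbach's α, the internal-consistency coefficient reported above, follows the standard formula α = k/(k−1) · (1 − Σ item variances / variance of total score). The following is a minimal sketch of that formula; the function name and the response data are hypothetical and unrelated to the study's dataset.

```python
from statistics import variance

def cronbach_alpha(responses):
    """Cronbach's alpha.

    responses: one list per respondent, each holding scores on k items.
    """
    k = len(responses[0])
    # Transpose so each tuple holds all respondents' scores on one item.
    items = list(zip(*responses))
    item_variance_sum = sum(variance(col) for col in items)
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)

# Hypothetical 5-point answers from four respondents on three items.
answers = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [1, 2, 2],
]
alpha = cronbach_alpha(answers)
```

Alpha approaches 1 when the items rise and fall together across respondents, which is why it is usually computed per domain (as in the abstract) rather than only for the whole scale.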
Alberta infant motor scale: reliability and validity when used on preterm infants in Taiwan.
Jeng, S F; Yau, K I; Chen, L C; Hsiao, S F
2000-02-01
The goal of this study was to examine the reliability and validity of measurements obtained with the Alberta Infant Motor Scale (AIMS) for evaluation of preterm infants in Taiwan. Two independent groups of preterm infants were used to investigate the reliability (n=45) and validity (n=41) for the AIMS. In the reliability study, the AIMS was administered to the infants by a physical therapist, and infant performance was videotaped. The performance was then rescored by the same therapist and by 2 other therapists to examine the intrarater and interrater reliability. In the validity study, the AIMS and the Bayley Motor Scale were administered to the infants at 6 and 12 months of age to examine criterion-related validity. Intraclass correlation coefficients (ICCs) for intrarater and interrater reliability of measurements obtained with the AIMS were high (ICC=.97-.99). The AIMS scores correlated with the Bayley Motor Scale scores at 6 and 12 months (r=.78 and .90), although the AIMS scores at 6 months were only moderately predictive of the motor function at 12 months (r=.56). The results suggest that measurements obtained with the AIMS have acceptable reliability and concurrent validity but limited predictive value for evaluating preterm Taiwanese infants.
Poon, Vickie Wan-kei; Lam, Linda Chiu-wa; Wong, Samuel Yeung-shan
2008-09-01
With the rapid growth of the older population, early detection of cognitive deficits is crucial in slowing down functional deterioration of elderly persons. To examine the validity and reliability of the Chinese (Cantonese) version of the Hierarchic Dementia Scale (CV-HDS) for Chinese older persons in Hong Kong. The HDS was translated into Cantonese Chinese. The content and cultural validity were evaluated by six expert panel members. Sixty-two participants with a diagnosis of dementia were recruited for evaluation. Inter-rater reliability, test-retest reliability, internal consistency and concurrent validity were examined. The CV-HDS demonstrated satisfactory psychometric properties. Inter-rater reliability and test-retest reliability were high (alpha=0.89 and alpha=0.94 respectively). A high value of Cronbach's alpha (alpha=0.94) demonstrated good internal consistency. The concurrent validity of the CV-HDS, through correlation of its scores with those of the Chinese version of the Mini Mental Status Examination, was established (ranged from r=0.58 to r=0.78, p<0.01). The CV-HDS is a reliable and valid instrument for assessing severity of cognitive impairment in Cantonese-speaking Chinese people with dementia. It facilitates treatment planning to optimize the effects of functional training and rehabilitation.
Reliability and validity of electrothermometers and associated thermocouples.
Jutte, Lisa S; Knight, Kenneth L; Long, Blaine C
2008-02-01
Examine thermocouple model uncertainty (reliability+validity). First, a 3x3 repeated measures design with independent variables electrothermometer and thermocouple model. Second, a 1x3 repeated measures design with independent variable subprobe. Three electrothermometers, 3 thermocouple models, a multi-sensor probe and a mercury thermometer measured a stable water bath. Temperature and absolute temperature differences between thermocouples and a mercury thermometer. Thermocouple uncertainty was greater than manufacturers' claims. For all thermocouple models, validity and reliability were better in the Iso-Thermex than the Datalogger, but there were no practical differences between models within an electrothermometer. Validity of multi-sensor probes and thermocouples within a probe were not different but were greater than manufacturers' claims. Reliability of multiprobes and thermocouples within a probe were within manufacturers' claims. Thermocouple models vary in reliability and validity. Scientists should test and report the uncertainty of their equipment rather than depending on manufacturers' claims.
Doğramac, Sera N; Watsford, Mark L; Murphy, Aron J
2011-03-01
Subjective notational analysis can be used to track players and analyse movement patterns during match-play of team sports such as futsal. The purpose of this study was to establish the validity and reliability of the Event Recorder for subjective notational analysis. A course was designed, replicating ten minutes of futsal match-play movement patterns, where ten participants undertook the course. The course allowed a comparison of data derived from subjective notational analysis, to the known distances of the course, and to GPS data. The study analysed six locomotor activity categories, focusing on total distance covered, total duration of activities and total frequency of activities. The values between the known measurements and the Event Recorder were similar, whereas the majority of significant differences were found between the Event Recorder and GPS values. The reliability of subjective notational analysis was established with all ten participants being analysed on two occasions, as well as analysing five random futsal players twice during match-play. Subjective notational analysis is a valid and reliable method of tracking player movements, and may be a preferred and more effective method than GPS, particularly for indoor sports such as futsal, and field sports where short distances and changes in direction are observed.
Reliability and Validity of Nonsymbolic and Symbolic Comparison Tasks in School-Aged Children.
Castro, Danilka; Estévez, Nancy; Gómez, David; Dartnell, Pablo Ricardo
2017-12-04
Basic numerical processing has been regularly assessed using numerical nonsymbolic and symbolic comparison tasks. It has been assumed that these tasks index similar underlying processes. However, the evidence concerning the reliability and convergent validity across different versions of these tasks is inconclusive. We explored the reliability and convergent validity between two numerical comparison tasks (nonsymbolic vs. symbolic) in school-aged children. The relations between performance in both tasks and mental arithmetic were described and a developmental trajectories' analysis was also conducted. The influence of verbal and visuospatial working memory processes and age was controlled for in the analyses. Results show significant reliability (p < .001) between Block 1 and 2 for nonsymbolic task (global adjusted RT (adjRT): r = .78, global efficiency measures (EMs): r = .74) and, for symbolic task (adjRT: r = .86, EMs: r = .86). Also, significant convergent validity between tasks (p < .001) for both adjRT (r = .71) and EMs (r = .70) were found after controlling for working memory and age. Finally, it was found the relationship between nonsymbolic and symbolic efficiencies varies across the sample's age range. Overall, these findings suggest both tasks index the same underlying cognitive architecture and are appropriate to explore the Approximate Number System (ANS) characteristics. The evidence supports the central role of ANS in arithmetic efficiency and suggests there are differences across the age range assessed, concerning the extent to which efficiency in nonsymbolic and symbolic tasks reflects ANS acuity.
Chevat, Catherine; Viala-Danten, Muriel; Dias-Barbosa, Carla; Nguyen, Van Hung
2009-01-01
Background Influenza is among the most common infectious diseases. The main protection against influenza is vaccination. A self-administered questionnaire was developed and validated for use in clinical trials to assess subjects' perception and acceptance of influenza vaccination and its subsequent injection site reactions (ISR). Methods The VAPI questionnaire was developed based on interviews with vaccinees. The initial version was administered to subjects in international clinical trials comparing intradermal with intramuscular influenza vaccination. Item reduction and scale construction were carried out using principal component and multitrait analyses (n = 549). Psychometric validation of the final version was conducted per country (n = 5,543) and included construct and clinical validity and internal consistency reliability. All subjects gave their written informed consent before being interviewed or included in the clinical studies. Results The final questionnaire comprised 4 dimensions ("bother from ISR"; "arm movement"; "sleep"; "acceptability") grouping 16 items, and 5 individual items (anxiety before vaccination; bother from pain during vaccination; satisfaction with injection system; willingness to be vaccinated next year; anxiety about vaccination next year). Construct validity was confirmed for all scales in most of the countries. Internal consistency reliability was good for all versions (Cronbach's alpha ranging from 0.68 to 0.94), as was clinical validity: scores were positively correlated with the severity of ISR and pain. Conclusion The VAPI questionnaire is a valid and reliable tool, assessing the acceptance of vaccine injection and reactions following vaccination. Trial registration NCT00258934, NCT00383526, NCT00383539. PMID:19261173
First year progress report on the development of the Texas flexible pavement database.
DOT National Transportation Integrated Search
2008-01-01
Comprehensive and reliable databases are essential for the development, validation, and calibration of any pavement design and rehabilitation system. These databases should include material properties, pavement structural characteristics, highway...
Cardiovascular fitness strengthening using portable device.
Alqudah, Hamzah; Kai Cao; Tao Zhang; Haddad, Azzam; Su, Steven; Celler, Branko; Nguyen, Hung T
2016-08-01
The paper describes a reliable and valid Portable Exercise Monitoring system, developed using the TI eZ430-Chronos watch, which can control exercise intensity through audio stimulation in order to strengthen cardiovascular fitness.
Validity and reliability of the Utrecht Work Engagement Scale-Student Version in Sri Lanka.
Wickramasinghe, Nuwan Darshana; Dissanayake, Devani Sakunthala; Abeywardena, Gihan Sajiwa
2018-05-04
The present study was aimed at assessing the validity and the reliability of the Sinhala version of the Utrecht Work Engagement Scale-Student Version (UWES-S) among collegiate cycle students in Sri Lanka. The 17-item UWES-S was translated to Sinhala and the judgmental validity was assessed by a multi-disciplinary panel of experts. Construct validity of the UWES-S was appraised by using multi-trait scaling analysis and exploratory factor analysis (EFA) on data obtained from a sample of 194 grade thirteen students in the Kurunegala district, Sri Lanka. Reliability of the UWES-S was assessed by using internal consistency and test-retest reliability. Except for item 13, all other items showed good psychometric properties in judgemental validity, item-convergent validity and item-discriminant validity. EFA using principal component analysis with Oblimin rotation, suggested a three-factor solution (including vigor, dedication and absorption subscales) explaining 65.4% of the total variance for the 16-item UWES-S (with item 13 deleted). All three subscales show high internal consistency with Cronbach's α coefficient values of 0.867, 0.819, and 0.903 and test-retest reliability was high (p < 0.001). Hence, the Sinhala version of the 16-item UWES-S is a valid and a reliable instrument to assess work engagement among collegiate cycle students in Sri Lanka.
Kadar, Masne; Ibrahim, Suhaili; Razaob, Nor Afifi; Chai, Siaw Chui; Harun, Dzalani
2018-02-01
The Lawton Instrumental Activities of Daily Living Scale is a tool often used to assess independence among the elderly at home. Its suitability for use with the elderly population in Malaysia has not been validated. The current study aimed to assess the validity and reliability of the Lawton Instrumental Activities of Daily Living Scale - Malay Version for Malay-speaking elderly in Malaysia. This study was divided into three phases: (1) translation and linguistic validity involving both forward and backward translations; (2) establishment of face validity and content validity; and (3) establishment of reliability involving inter-rater, test-retest and internal consistency analyses. Data used for these analyses were obtained by interviewing 65 elderly respondents. Percentages of Content Validity Index for 4 criteria were from 88.89 to 100.0. The Cronbach α coefficient for internal consistency was 0.838. Intra-class Correlation Coefficients of inter-rater reliability and test-retest reliability were 0.957 and 0.950 respectively. The results show that the Lawton Instrumental Activities of Daily Living Scale - Malay Version has excellent reliability and validity for use with Malay-speaking elderly people in Malaysia. This scale could be used by professionals to assess the functional ability of elderly people who live independently in the community. © 2018 Occupational Therapy Australia.
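The Content Validity Index mentioned above is commonly computed per item as the proportion of experts who rate the item relevant. The sketch below assumes a 4-point relevance scale on which ratings of 3 or 4 count as relevant; the abstract does not state the scale or the number of experts, so the cutoff, the panel size, and the ratings are all hypothetical.

```python
def item_cvi(ratings, relevant_min=3):
    """Item-level Content Validity Index as a percentage.

    ratings: one relevance rating per expert (assumed 1-4 scale).
    relevant_min: lowest rating counted as 'relevant' (assumed cutoff).
    """
    relevant = sum(r >= relevant_min for r in ratings)
    return 100 * relevant / len(ratings)

# Hypothetical panel of nine experts rating one criterion.
panel = [4, 4, 3, 4, 3, 4, 4, 2, 4]
cvi = round(item_cvi(panel), 2)  # 88.89: eight of nine rate it relevant
```

With small panels the index can only take a few distinct values, so a single dissenting expert moves it noticeably.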
Bengtsson, Ulrika; Kjellgren, Karin; Höfer, Stefan; Taft, Charles; Ring, Lena
2014-10-01
Self-management support tools using technology may improve adherence to hypertension treatment. There is a need for user-friendly tools facilitating patients' understanding of the interconnections between blood pressure, wellbeing and lifestyle. This study aimed to examine comprehension, comprehensiveness and relevance of items, and further to evaluate the usability and reliability of an interactive hypertension-specific mobile phone self-report system. Areas important in supporting self-management and candidate items were derived from five focus group interviews with patients and healthcare professionals (n = 27), supplemented by a literature review. Items and response formats were drafted to meet specifications for mobile phone administration and were integrated into a mobile phone data-capture system. Content validity and usability were assessed iteratively in four rounds of cognitive interviews with patients (n = 21) and healthcare professionals (n = 4). Reliability was examined using a test-retest. Focus group analyses yielded six areas covered by 16 items. The cognitive interviews showed satisfactory item comprehension, relevance and coverage; however, one item was added. The mobile phone self-report system was reliable and perceived easy to use. The mobile phone self-report system appears efficiently to capture information relevant in patients' self-management of hypertension. Future studies need to evaluate the effectiveness of this tool in improving self-management of hypertension in clinical practice.
Siahaan, Laura A; Syam, Ari F; Simadibrata, Marcellus; Setiati, Siti
2017-01-01
To obtain a valid and reliable GERD-QOL questionnaire for Indonesian application. At the initial stage, the GERD-QOL questionnaire was first translated into the Indonesian language and the translated questionnaire was subsequently translated back into the original language (back-to-back translation). The results were evaluated by the research team, and an Indonesian version of the GERD-QOL questionnaire was developed. Ninety-one patients who had been clinically diagnosed with GERD based on the Montreal criteria were interviewed using the Indonesian version of the GERD-QOL questionnaire and the SF-36 questionnaire. Validity was evaluated using construct validity and external validity, and reliability was tested by internal consistency and test-retest methods. The Indonesian version of the GERD-QOL questionnaire had good internal consistency reliability with a Cronbach alpha of 0.687-0.842 and good test-retest reliability with an intra-class correlation coefficient of 0.756-0.936 (p<0.05). The questionnaire was also demonstrated to have good validity, with a high correlation to each question of the SF-36 (p<0.05). The Indonesian version of the GERD-QOL questionnaire has been proven valid and reliable for evaluating the quality of life of GERD patients.
Beyhun, Nazim Ercument; Can, Gamze; Tiryaki, Ahmet; Karakullukcu, Serdar; Bulut, Bekir; Yesilbas, Sehbal; Kavgaci, Halil; Topbas, Murat
2016-01-01
Background Needs based biopsychosocial distress instrument for cancer patients (CANDI) is a scale based on needs arising due to the effects of cancer. Objectives The aim of this research was to determine the reliability and validity of the CANDI scale in the Turkish language. Patients and Methods The study was performed with the participation of 172 cancer patients aged 18 and over. Factor analysis (principal components analysis) was used to assess construct validity. Criterion validities were tested by computing Spearman correlation between CANDI and hospital anxiety depression scale (HADS), and brief symptom inventory (BSI) (convergent validity) and quality of life scales (FACT-G) (divergent validity). Test-retest reliabilities and internal consistencies were measured with intraclass correlation (ICC) and Cronbach-α. Results A three-factor solution (emotional, physical and social) was found with factor analysis. Internal reliability (α = 0.94) and test-retest reliability (ICC = 0.87) were significantly high. Correlations between CANDI and HADS (rs = 0.67), and BSI (rs = 0.69) and FACT-G (rs = -0.76) were moderate and significant in the expected direction. Conclusions CANDI is a valid and reliable scale in cancer patients with a three-factor structure (emotional, physical and social) in the Turkish language. PMID:27621931
DOE Office of Scientific and Technical Information (OSTI.GOV)
Liu, Ren; Srivastava, Anurag K.; Bakken, David E.; ...
2017-08-17
Intermittency of wind energy poses a great challenge for power system operation and control. Wind curtailment might be necessary under certain operating conditions to keep the line flow within the limit. A Remedial Action Scheme (RAS) offers a quick control action mechanism to keep reliability and security of the power system operation with high wind energy integration. In this paper, a new RAS is developed to maximize the wind energy integration without compromising the security and reliability of the power system based on specific utility requirements. A new Distributed Linear State Estimation (DLSE) is also developed to provide fast and accurate input data for the proposed RAS. A distributed computational architecture is designed to guarantee the robustness of the cyber system to support RAS and DLSE implementation. The proposed RAS and DLSE are validated using the modified IEEE-118 Bus system. Simulation results demonstrate the satisfactory performance of the DLSE and the effectiveness of the RAS. A real-time cyber-physical testbed has been utilized to validate the cyber-resiliency of the developed RAS against computational node failure.
Cutting planes for the multistage stochastic unit commitment problem
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jiang, Ruiwei; Guan, Yongpei; Watson, Jean-Paul
2016-04-20
As renewable energy penetration rates continue to increase in power systems worldwide, new challenges arise for system operators in both regulated and deregulated electricity markets to solve the security-constrained coal-fired unit commitment problem with intermittent generation (due to renewables) and uncertain load, in order to ensure system reliability and maintain cost effectiveness. In this paper, we study a security-constrained coal-fired stochastic unit commitment model, which we use to enhance the reliability unit commitment process for day-ahead power system operations. In our approach, we first develop a deterministic equivalent formulation for the problem, which leads to a large-scale mixed-integer linear program. Then, we verify that the turn on/off inequalities provide a convex hull representation of the minimum-up/down time polytope under the stochastic setting. Next, we develop several families of strong valid inequalities mainly through lifting schemes. In particular, by exploring sequence independent lifting and subadditive approximation lifting properties for the lifting schemes, we obtain strong valid inequalities for the ramping and general load balance polytopes. Lastly, branch-and-cut algorithms are developed to employ these valid inequalities as cutting planes to solve the problem. Our computational results verify the effectiveness of the proposed approach.
Symptom-based categorization of in-flight passenger medical incidents.
Mahony, Paul H; Myers, Julia A; Larsen, Peter D; Powell, David M C; Griffiths, Robin F
2011-12-01
The majority of in-flight passenger medical events are managed by cabin crew. Our study aimed to evaluate the reliability of cabin crew reports of in-flight medical events and to develop a symptom-based categorization system. All cabin crew in-flight passenger medical incident reports for an airline over a 9-yr period were examined retrospectively. Validation of incident descriptions was undertaken on a sample of 162 cabin crew reports where medically trained persons' reports were available for comparison, using a three-round Delphi technique and testing concordance using Cohen's Kappa. A hierarchical symptom-based categorization system was designed and validated. The rate was 159 incidents per 10^6 passengers carried, or 70.4/113.3 incidents per 10^6 revenue passenger kilometres/miles, respectively. Concordance between cabin crew and medical reports was 96%, with a high validity rating (mean 4.6 on a 1-5 scale) and high Cohen's Kappa (0.94). The most common in-flight medical events were transient loss of consciousness (41%), nausea/vomiting/diarrhea (19.5%), and breathing difficulty (16%). Cabin crew records provide reliable data regarding in-flight passenger medical incidents, complementary to diagnosis-based systems, and allow the use of currently underutilized data. The categorization system provides a means for tracking passenger medical incidents internationally and an evidence base for cabin crew first aid training.
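The concordance statistic named in this abstract, Cohen's Kappa, corrects raw agreement for the agreement expected by chance. The sketch below is illustrative only: the function name, category labels, and ratings are hypothetical and do not reproduce the study's data.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning one category per case."""
    n = len(rater_a)
    # Observed agreement: share of cases where both raters chose the same category.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal proportions, summed over categories.
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical categorizations of four incidents by cabin crew and by a clinician.
crew = ["syncope", "syncope", "nausea", "nausea"]
clinician = ["syncope", "syncope", "nausea", "syncope"]
print(cohens_kappa(crew, clinician))  # 0.5
```

In this toy case raw agreement is 75%, but half of that is expected by chance, so kappa is 0.5.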
Jácome, Cristina; Cruz, Joana; Oliveira, Ana; Marques, Alda
2016-11-01
The Berg Balance Scale (BBS), Balance Evaluation Systems Test (BESTest), Mini-BESTest, and Brief-BESTest are useful in the assessment of balance. Their psychometric properties, however, have not been tested in patients with chronic obstructive pulmonary disease (COPD). This study aimed to compare the validity, reliability, and ability to identify fall status of the BBS, BESTest, Mini-BESTest, and the Brief-BESTest in patients with COPD. A cross-sectional study was conducted. Forty-six patients (24 men, 22 women; mean age=75.9 years, SD=7.1) were included. Participants were asked to report their falls during the previous 12 months and to fill in the Activity-specific Balance Confidence (ABC) Scale. The BBS and the BESTest were administered. Mini-BESTest and Brief-BESTest scores were computed based on the participants' BESTest performance. Validity was assessed by correlating balance tests with each other and with the ABC Scale. Interrater reliability (2 raters), intrarater reliability (48-72 hours), and minimal detectable changes (MDCs) were established. Receiver operating characteristics assessed the ability of each balance test to differentiate between participants with and without a history of falls. Balance test scores were significantly correlated with each other (Spearman correlation rho=.73-.90) and with the ABC Scale (rho=.53-.75). Balance tests presented high interrater reliability (intraclass correlation coefficient [ICC]=.85-.97) and intrarater reliability (ICC=.52-.88) and acceptable MDCs (MDC=3.3-6.3 points). Although all balance tests were able to identify fall status (area under the curve=0.74-0.84), the BBS (sensitivity=73%, specificity=77%) and the Brief-BESTest (sensitivity=81%, specificity=73%) had the higher ability to identify fall status. Findings are generalizable mainly to older patients with moderate COPD. The 4 balance tests are valid, reliable, and valuable in identifying fall status in patients with COPD. 
The Brief-BESTest presented slightly higher interrater reliability and ability to differentiate participants' fall status. © 2016 American Physical Therapy Association.
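The sensitivity and specificity figures in this abstract come from classifying participants as fallers or non-fallers at a score cutoff. A minimal sketch of that calculation follows; the function name, scores, and cutoff are hypothetical, and it assumes that a lower balance score indicates worse balance.

```python
def sensitivity_specificity(scores, has_fallen, cutoff):
    """Return (sensitivity, specificity) when scores <= cutoff flag a likely faller."""
    tp = fn = tn = fp = 0
    for score, fell in zip(scores, has_fallen):
        flagged = score <= cutoff  # assumption: low score means poor balance
        if fell and flagged:
            tp += 1
        elif fell:
            fn += 1
        elif flagged:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical balance scores and 12-month fall histories.
toy_scores = [10, 12, 20, 22]
toy_falls = [True, True, False, True]
sens, spec = sensitivity_specificity(toy_scores, toy_falls, cutoff=15)
print(round(sens, 2), round(spec, 2))  # 0.67 1.0
```

Sweeping the cutoff over all observed scores and plotting sensitivity against 1 - specificity gives the receiver operating characteristic curve whose area the abstract reports.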
Determining the Appropriateness of the "What If" Situations Test (WIST) with Turkish Pre-Schoolers.
Citak Tunc, Gulseren; Gorak, Gulay; Ozyazicioglu, Nurcan; Ak, Bedriye; Isil, Ozlem; Vural, Pinar
2018-04-01
Measurement instruments are needed to assess child sexual abuse prevention programs. The purpose of the study was to determine the reliability and validity of the WIST (What If Situations Test) for Turkish culture. Participants were children of the 3-6 age group attending pre-school education institutions and the sample size was identified by means of a power analysis. Seventy children were identified as the sample with 0.85 power and 0.05 type I error according to the power analysis. Language validity, content validity, internal validity coefficient (Cronbach alpha coefficient), and test-retest analyses were conducted in terms of validity and reliability in the scope of efforts for adaptation to Turkish culture. Firstly, Kendall W = 0.83 was the score for the expert opinions concerning the content validity of the language validity scale. It was found that the Cronbach alpha coefficients were between 0.68 and 0.90 for the scale sub-dimensions of appropriate and inappropriate recognition, saying, doing, telling, and reporting. The test-retest reliability of the scale was found to be r = 0.89 and the test-retest reliabilities for the sub-dimensions (appropriate recognition, inappropriate recognition, say skills, do skills, tell skills, and reporting skills) were between r = 0.48 and r = 0.92. The test-retest reliability for the Personal Safety Questionnaire (PSQ), which has items complementary to the WIST, was found to be r = 0.82. The reliability and validity analysis of the 'What If' Situations Test (WIST), used to evaluate pre-schoolers' skills regarding self-protection against sexual abuse, showed that the Test's adaptation to Turkish culture was reliable and valid.
Singh, Amika S; Chinapaw, Mai J M; Uijtdewilligen, Léonie; Vik, Froydis N; van Lippevelde, Wendy; Fernández-Alvira, Juan M; Stomfai, Sarolta; Manios, Yannis; van der Sluijs, Maria; Terwee, Caroline; Brug, Johannes
2012-08-13
Insight into parental energy balance-related behaviours, their determinants and parenting practices is important to inform childhood obesity prevention. Therefore, reliable and valid tools to measure these variables in large-scale population research are needed. The objective of the current study was to examine the test-retest reliability and construct validity of the parent questionnaire used in the ENERGY-project, assessing parental energy balance-related behaviours, their determinants, and parenting practices among parents of 10-12 year old children. We collected data among parents (n = 316 in the test-retest reliability study; n = 109 in the construct validity study) of 10-12 year-old children in six European countries, i.e. Belgium, Greece, Hungary, the Netherlands, Norway, and Spain. Test-retest reliability was assessed using the intra-class correlation coefficient (ICC) and percentage agreement comparing scores from two measurements, administered one week apart. To assess construct validity, the agreement between questionnaire responses and a subsequent interview was assessed using ICC and percentage agreement. All but one item showed good to excellent test-retest reliability as indicated by ICCs > .60 or percentage agreement ≥ 75%. Construct validity appeared to be good to excellent for 92 out of 121 items, as indicated by ICCs > .60 or percentage agreement ≥ 75%. Of the other 29 items, construct validity was moderate for 24 and poor for 5 items. The reliability and construct validity of the items of the ENERGY-parent questionnaire on multiple energy balance-related behaviours, their potential determinants, and parenting practices appear to be good. Based on the results of the validity study, we strongly recommend adapting parts of the ENERGY-parent questionnaire if used in future research.
Reliability and Validity of the Turkish Version of the Gastrointestinal Symptom Rating Scale.
Turan, Nuray; Aşt, Türkinaz Atabek; Kaya, Nurten
The purpose of this methodological study is to investigate the validity and reliability of the Turkish version of the Gastrointestinal Symptom Rating Scale (GSRS). The scale was adapted to the Turkish language via backward translation. Content validity was examined by referring to experts. Reliability was examined via test-retest reliability and internal consistency, and validity was examined with divergent and convergent validity. The Epworth Sleepiness Scale (ESS) and the Marlowe-Crowne Social Desirability Scale (MCSDS) were used for divergent validity. As for convergent validity, the Constipation Severity Instrument (CSI) and the Patient Assessment of Constipation Quality of Life Scale (PAC-QOLQ) were utilized. The relationship between the GSRS and the health-related quality of life (36-item short-form health survey [SF-36]) was also analyzed. The study population consisted of patients in orthopedic clinic who volunteered to participate. Test-retest reliability was examined with the participation of 30 patients; internal consistency and validity were examined with 150 patients. Test-retest reliability correlation coefficients of the GSRS varied from 0.39 to 0.87 for all items. For internal consistency, the GSRS's item total correlation was found to be 0.17-0.67, and Cronbach α was 0.82 for all items. There was a positive linear significant correlation between the GSRS, CSI, and PAC-QOLQ. There was no significant correlation between the GSRS, MCSDS, and ESS. Higher GSRS scores inversely correlated with general quality of life (SF-36). The Turkish version of the GSRS has been found to be a reliable and valid instrument for assessing patients' gastrointestinal symptoms. Therefore, this instrument can be confidently used with Turkish individuals.
Reliability and Validity of the Greek Migraine Disability Assessment (MIDAS) Questionnaire.
Oikonomidi, Theodora; Vikelis, Michail; Artemiadis, Artemios; Chrousos, George P; Darviri, Christina
2018-03-01
The Migraine Disability Assessment (MIDAS) Questionnaire is a reliable and valid instrument for migraine-related disability. Such a tool is needed to quantify migraine-related disability in the Greek population. This validation study aims to assess the test-retest reliability, internal consistency, item discriminant and convergent validity of the Greek translation of the MIDAS. Adults diagnosed with migraine completed the MIDAS Questionnaire on two occasions 3 weeks apart to assess reliability, and completed the RAND-36 to assess validity. Participants (n = 152) had a median MIDAS score of 24 and mostly severe disability (58% were grade IV). The test-retest reliability analysis (N = 59) revealed excellent reliability for the total score. Internal consistency was α = 0.71 for initial and α = 0.82 for retest completion. For item discriminant validity, the correlations between each question and the total score were significant, with high correlations for questions 2-5 (range 0.67 ≤ r ≤ 0.79; p < 0.01). For convergent validity, there was significant negative correlation between the total score and all RAND-36 subscales except for 'emotional wellbeing'. The negative correlation indicates that patients with a lower degree of disability according to their MIDAS score tended to have better wellbeing. Psychometric properties are comparable with those of other published validation studies of the MIDAS and the original. Findings on question 1 show that missing work/school days may be closely related with increased affect issues. The Greek version of the MIDAS Questionnaire has good reliability and validity. This study allowed for cross-cultural comparability of research findings.
The Reliability and Validity of Big Five Inventory Scores with African American College Students
ERIC Educational Resources Information Center
Worrell, Frank C.; Cross, William E., Jr.
2004-01-01
This article describes a study that examined the reliability and validity of scores on the Big Five Inventory (BFI; O. P. John, E. M. Donahue, & R. L. Kentle, 1991) in a sample of 336 African American college students. Results from the study indicated moderate reliability and structural validity for BFI scores. Additionally, BFI subscales had few…
ERIC Educational Resources Information Center
Song, Ji Hoon; Kim, Jin Yong; Chermack, Thomas J.; Yang, Baiyin
2008-01-01
The primary purpose of this research was to adapt the Dimensions of Learning Organization Questionnaire (DLOQ) from Watkins and Marsick (1993, 1996) and examine its validity and reliability in a Korean context. Results indicate that the DLOQ produces valid and reliable scores of learning organization characteristics in a Korean cultural context.…
ERIC Educational Resources Information Center
Harlen, Wynne
2005-01-01
This paper summarizes the findings of a systematic review of research on the reliability and validity of teachers' assessment used for summative purposes. In addition to the main question, the review also addressed the question "What conditions affect the reliability and validity of teachers' summative assessment?" The initial search for studies…
ERIC Educational Resources Information Center
Boonstra, Anne M.; Reneman, Michiel F.; Stewart, Roy E.; Balk, Gerlof A.
2012-01-01
The aim of this study was to determine the reliability and discriminant validity of the Dutch version of the life satisfaction questionnaire (Lisat-9 DV) to assess patients with an acquired brain injury. The reliability study used a test-retest design, and the validity study used a cross-sectional design. The setting was the general rehabilitation…
Validity and Reliability of the Academic Resilience Scale in Turkish High School
ERIC Educational Resources Information Center
Kapikiran, Sahin
2012-01-01
The present study aims to determine the validity and reliability of the academic resilience scale in Turkish high school. The participants of the study include 378 high school students in total (192 female and 186 male). A set of analyses were conducted in order to determine the validity and reliability of the study. Firstly, both exploratory…
ERIC Educational Resources Information Center
Pfeiffer, Karin A.; Pivarnik, James M.; Womack, Christopher J.; Reeves, Mathew J.; Malina, Robert M.
2002-01-01
Investigated the reliability and validity of the Borg and OMNI rating of perceived exertion (RPE) scales in adolescent girls during treadmill exercise. Girls were randomly assigned to one of the RPE scales during various treadmill exercise conditions. Results indicated that the OMNI cycle pictorial scale was reliable and valid for use with…
Reliability and validity of the Outcome Expectations for Exercise Scale-2.
Resnick, Barbara
2005-10-01
Development of a reliable and valid measure of outcome expectations for exercise for older adults will help establish the relationship between outcome expectations and exercise and facilitate the development of interventions to increase physical activity in older adults. The purpose of this study was to test the reliability and validity of the Outcome Expectations for Exercise-2 Scale (OEE-2), a 13-item measure with two subscales: positive OEE (POEE) and negative OEE (NOEE). The OEE-2 scale was given to 161 residents in a continuing-care retirement community. There was some evidence of validity based on confirmatory factor analysis, Rasch-analysis INFIT and OUTFIT statistics, and convergent validity and test criterion relationships. There was some evidence for reliability of the OEE-2 based on alpha coefficients, person- and item-separation reliability indexes, and R² values. Based on analyses, suggested revisions are provided for future use of the OEE-2. Although ongoing reliability and validity testing are needed, the OEE-2 scale can be used to identify older adults with low outcome expectations for exercise, and interventions can then be implemented to strengthen these expectations and improve exercise behavior.
Romero-Franco, Natalia; Jiménez-Reyes, Pedro; Montaño-Munuera, Juan A
2017-11-01
Lower limb isometric strength is a key parameter to monitor the training process or recognise muscle weakness and injury risk. However, valid and reliable methods to evaluate it often require high-cost tools. The aim of this study was to analyse the concurrent validity and reliability of a low-cost digital dynamometer for measuring isometric strength in lower limb. Eleven physically active and healthy participants performed maximal isometric strength for: flexion and extension of ankle, flexion and extension of knee, flexion, extension, adduction, abduction, internal and external rotation of hip. Data obtained by the digital dynamometer were compared with the isokinetic dynamometer to examine its concurrent validity. Data obtained by the digital dynamometer from 2 different evaluators and 2 different sessions were compared to examine its inter-rater and intra-rater reliability. Intra-class correlation (ICC) for validity was excellent in every movement (ICC > 0.9). Intra and inter-tester reliability was excellent for all the movements assessed (ICC > 0.75). The low-cost digital dynamometer demonstrated strong concurrent validity and excellent intra and inter-tester reliability for assessing isometric strength in the main lower limb movements.
A comprehensive review of the psychometric properties of the Drug Abuse Screening Test.
Yudko, Errol; Lozhkina, Olga; Fouts, Adriana
2007-03-01
This article reviews the reliability and the validity of the (10-, 20-, and 28-item) Drug Abuse Screening Test (DAST). The reliability and the validity of the adolescent version of the DAST are also reviewed. An extensive literature review was conducted using the Medline and Psychinfo databases from the years 1982 to 2005. All articles that addressed the reliability and the validity of the DAST were examined. Publications in which the DAST was used as a screening tool but had no data on its psychometric properties were not included. Descriptive information about each version of the test, as well as discussion of the empirical literature that has explored measures of the reliability and the validity of the DAST, has been included. The DAST tended to have moderate to high levels of test-retest, interitem, and item-total reliabilities. The DAST also tended to have moderate to high levels of validity, sensitivity, and specificity. In general, all versions of the DAST yield satisfactory measures of reliability and validity for use as clinical or research tools. Furthermore, these tests are easy to administer and have been used in a variety of populations.
Ertuğ, Nurcan
2018-06-01
The aim of this study was to determine the validity and reliability of the Turkish version of the V-scale, which measures nurses' attitudes towards vital signs monitoring in the detection of clinical deterioration. This validity and reliability study was conducted at a tertiary hospital in Ankara, Turkey, in 2016. A total of 169 ward nurses participated in the study. Exploratory factor analysis, Cronbach's alpha coefficient, and the intraclass correlation coefficient were used to determine the validity and reliability of the scale. A 5-factor, 16-item scale explained 60.823% of the total variance according to the validity analysis. Our version matched the original scale in terms of the number of items and factor structure. Cronbach's alpha coefficient of the Turkish version of the V-scale was 0.764. The test-retest reliability results were 0.855 for the overall intraclass correlation coefficient, and the t-test result was P > 0.05. The V-scale is a reliable and valid instrument to measure Turkish nurses' attitudes towards vital signs monitoring in the detection of clinical deterioration. © 2018 John Wiley & Sons Australia, Ltd.
Wright, F Virginia; Rosenbaum, Peter; Fehlings, Darcy; Mesterman, Ronit; Breuer, Ute; Kim, Marie
2014-08-01
Optimizing movement quality is a common rehabilitation goal for children with cerebral palsy (CP). The new Quality Function Measure (QFM), a revision of the Gross Motor Performance Measure (GMPM), evaluates five attributes: Alignment, Co-ordination, Dissociated movement, Stability, and Weight-shift, for the Gross Motor Function Measure (GMFM) Stand and Walk/Run/Jump items. This study evaluated the reliability and discriminant validity of the QFM. Thirty-three children with CP (17 females, 16 males; mean age 8y 11mo, SD 3y 1mo; Gross Motor Function Classification System [GMFCS] levels I [n=17], II [n=7], III [n=9]) participated in reliability testing. Each did a GMFM Stand/Walk assessment, repeated 2 weeks later. Both GMFM assessments were videotaped. A physiotherapist assessor pair independently scored the QFM from an assigned child's GMFM video. GMFM data from 112 children (GMFCS I [n=38], II [n=27], III [n=47]) were used for discriminant validity evaluation. QFM mean scores varied from 45.0% (SD 27.2; Stability) to 56.2% (SD 27.5; Alignment). Reliability was excellent across all attributes: intraclass correlation coefficients (ICCs) ≥0.97 (95% confidence intervals [CI] 0.95-0.99), interrater ICCs ≥0.89 (95% CI 0.80-0.98), and test-retest ICCs ≥0.90 (95% CI 0.79-0.99). QFM discriminated qualitative attributes of motor function among GMFCS levels (maximum p<0.05). The QFM is reliable and valid, making it possible to assess how well young people with CP move and what areas of function to target to enhance quality of motor control. © 2014 Mac Keith Press.
Cheong, Sau Kuan; Lang, Cathryne P; Hemphill, Sheryl A; Johnston, Leanne M
2017-06-01
To evaluate the preliminary validity and reliability of the myTREEHOUSE Self-Concept Assessment for children with cerebral palsy (CP) aged 8 to 12 years. The myTREEHOUSE Self-Concept Assessment includes 26 items divided into eight domains, assessed across three Performance Perspectives (Personal, Social, and Perceived) and an additional Importance Rating. Face and content validity was assessed by semi-structured interviews with seven expert professionals regarding the assessment construct, content, and clinical utility. Reliability was assessed with 50 children aged 8 to 12 years with CP (29 males, 21 females; mean age 10y 2mo; Gross Motor Function Classification System [GMFCS] level I=35, II=8, III=5, IV=1; mean Wechsler Intelligence Scale for Children - Fourth Edition [WISC-IV]=104), whose data was used to calculate internal consistency of the scale, and a subset of 35 children (20 males, 15 females; mean age 10y 5mo; GMFCS level I=26, II=4, III=4, IV=1; mean WISC-IV=103) who participated in test-retest reliability within 14 to 28 days. Face and content validity was supported by positive expert feedback, with only minor adjustments suggested to clarify the wording of some items. After these amendments, strong internal consistency (Cronbach's α 0.84-0.91) and moderate to good test-retest reliability (intraclass correlation coefficient 0.64-0.75) was found for each component. The myTREEHOUSE Self-Concept Assessment is a valid and reliable assessment of self-concept for children with CP aged 8 to 12 years. © 2017 Mac Keith Press.
2013-01-01
Background The Parent-Infant Relationship Global Assessment Scale (PIR-GAS) signifies a conceptually relevant development in the multi-axial, developmentally sensitive classification system DC:0-3R for preschool children. However, information about the reliability and validity of the PIR-GAS is rare. A review of the available empirical studies suggests that in research, PIR-GAS ratings can be based on a ten-minute videotaped interaction sequence. The qualification of raters may be very heterogeneous across studies. Methods To test whether the use of the PIR-GAS still allows for a reliable assessment of the parent-infant relationship, our study compared PIR-GAS ratings based on a full-information procedure across multiple settings with ratings based on a ten-minute video by two doctoral candidates of medicine. For each mother-child dyad at a family day hospital (N = 48), we obtained two video ratings and one full-information rating at admission to therapy and at discharge. This pre-post design allowed for a replication of our findings across the two measurement points. We focused on the inter-rater reliability between the video coders, as well as between the video and full-information procedure, including mean differences and correlations between the raters. Additionally, we examined aspects of the validity of video and full-information ratings based on their correlation with measures of child and maternal psychopathology. Results Our results showed that ten-minute video and full-information PIR-GAS ratings were not interchangeable. Most results at admission could be replicated by the data obtained at discharge. We concluded that a higher degree of standardization of the assessment procedure should increase the reliability of the PIR-GAS, and a more thorough theoretical foundation of the manual should increase its validity. PMID:23705962
Storch, Eric A; Wood, Jeffrey J; Ehrenreich-May, Jill; Jones, Anna M; Park, Jennifer M; Lewin, Adam B; Murphy, Tanya K
2012-11-01
The psychometric properties of the Pediatric Anxiety Rating Scale (PARS), a clinician-administered measure for assessing severity of anxiety symptoms, were examined in 72 children and adolescents diagnosed with an autism spectrum disorder (ASD). The internal consistency of the PARS was 0.59, suggesting that the items were related but not repetitive. The PARS showed high 26-day test-retest (ICC = 0.83) and inter-rater reliability (ICC = 0.86). The PARS was strongly correlated with clinician-ratings of overall anxiety severity and parent-report anxiety measures, supporting convergent validity. Results for divergent validity were mixed. Although the PARS was not associated with the sum of the Social and Communication items on the Autism Diagnostic Observation System, it was moderately correlated with parent-reported inattention, aggression and externalizing behavior. Overall, these results suggest that the psychometric properties of the PARS are adequate for assessing anxiety symptoms in youth with ASD, although additional clarification of divergent validity is needed.
Hirdes, John P; Poss, Jeff W; Caldarelli, Hilary; Fries, Brant E; Morris, John N; Teare, Gary F; Reidel, Kristen; Jutan, Norma
2013-02-26
Evidence informed decision making in health policy development and clinical practice depends on the availability of valid and reliable data. The introduction of interRAI assessment systems in many countries has provided valuable new information that can be used to support case mix based payment systems, quality monitoring, outcome measurement and care planning. The Continuing Care Reporting System (CCRS) managed by the Canadian Institute for Health Information has served as a data repository supporting national implementation of the Resident Assessment Instrument (RAI 2.0) in Canada for more than 15 years. The present paper aims to evaluate data quality for the CCRS using an approach that may be generalizable to comparable data holdings internationally. Data from the RAI 2.0 implementation in Complex Continuing Care (CCC) hospitals/units and Long Term Care (LTC) homes in Ontario were analyzed using various statistical techniques that provide evidence for trends in validity, reliability, and population attributes. Time series comparisons included evaluations of scale reliability, patterns of associations between items and scales that provide evidence about convergent validity, and measures of changes in population characteristics over time. Data quality with respect to reliability, validity, completeness and freedom from logical coding errors was consistently high for the CCRS in both CCC and LTC settings. The addition of logic checks further improved data quality in both settings. The only notable change of concern was a substantial inflation in the percentage of long term care home residents qualifying for the Special Rehabilitation level of the Resource Utilization Groups (RUG-III) case mix system after the adoption of that system as part of the payment system for LTC. The CCRS provides a robust, high quality data source that may be used to inform policy, clinical practice and service delivery in Ontario. 
Only one area of concern was noted, and the statistical techniques employed here may be readily used to target organizations with data quality problems in that (or any other) area. There was also evidence that data quality was good in both CCC and LTC settings from the outset of implementation, meaning data may be used from the entire time series. The methods employed here may continue to be used to monitor data quality in this province over time and they provide a benchmark for comparisons with other jurisdictions implementing the RAI 2.0 in similar populations.
Reliability Validation and Improvement Framework
2012-11-01
systems. Steps in that direction include the use of the Architecture Tradeoff Analysis Method® (ATAM®) developed at the Carnegie Mellon...embedded software • cyber-physical systems (CPSs) to indicate that the embedded software interacts with, manages, and controls a physical system [Lee...the use of formal static analysis methods to increase our confidence in system operation beyond testing. However, analysis results
Abma, Femke I; van der Klink, Jac J L; Bültmann, Ute
2013-03-01
The promotion of a sustainable, healthy and productive working life attracts more and more attention. Recently the Work Role Functioning Questionnaire (WRFQ) has been cross-culturally translated and adapted to Dutch. This questionnaire aims to measure the health-related work functioning of workers with health problems. The aim of this study is to evaluate the reliability, validity (including five new items) and responsiveness of the WRFQ 2.0 in the working population. A longitudinal study was conducted among workers. The reliability (internal consistency, test-retest reliability, measurement error), validity (structural validity-factor analysis, construct validity by means of hypotheses testing) and responsiveness of the WRFQ 2.0 were evaluated. A total of N = 553 workers completed the survey. The final WRFQ 2.0 has four subscales and showed very good internal consistency, moderate test-retest reliability, good construct validity and moderate responsiveness in the working population. The WRFQ was able to distinguish between groups with different levels of mental health, physical health, fatigue and need for recovery. A moderate correlation was found between WRFQ and related constructs respectively work ability and work productivity. A weak relationship was found with general self-rated health, work engagement and work involvement. The WRFQ 2.0 is a reliable and valid instrument to measure health-related work functioning in the working population. Further validation in larger samples is recommended, especially for test-retest reliability, responsiveness and the questionnaire's ability to predict the future course of health-related work functioning.
The Arthroscopic Surgical Skill Evaluation Tool (ASSET).
Koehler, Ryan J; Amsdell, Simon; Arendt, Elizabeth A; Bisson, Leslie J; Braman, Jonathan P; Bramen, Jonathan P; Butler, Aaron; Cosgarea, Andrew J; Harner, Christopher D; Garrett, William E; Olson, Tyson; Warme, Winston J; Nicandri, Gregg T
2013-06-01
Surgeries employing arthroscopic techniques are among the most commonly performed in orthopaedic clinical practice; however, valid and reliable methods of assessing the arthroscopic skill of orthopaedic surgeons are lacking. The Arthroscopic Surgery Skill Evaluation Tool (ASSET) will demonstrate content validity, concurrent criterion-oriented validity, and reliability when used to assess the technical ability of surgeons performing diagnostic knee arthroscopic surgery on cadaveric specimens. Cross-sectional study; Level of evidence, 3. Content validity was determined by a group of 7 experts using the Delphi method. Intra-articular performance of a right and left diagnostic knee arthroscopic procedure was recorded for 28 residents and 2 sports medicine fellowship-trained attending surgeons. Surgeon performance was assessed by 2 blinded raters using the ASSET. Concurrent criterion-oriented validity, interrater reliability, and test-retest reliability were evaluated. Content validity: The content development group identified 8 arthroscopic skill domains to evaluate using the ASSET. Concurrent criterion-oriented validity: Significant differences in the total ASSET score (P < .05) between novice, intermediate, and advanced experience groups were identified. Interrater reliability: The ASSET scores assigned by each rater were strongly correlated (r = 0.91, P < .01), and the intraclass correlation coefficient between raters for the total ASSET score was 0.90. Test-retest reliability: There was a significant correlation between ASSET scores for both procedures attempted by each surgeon (r = 0.79, P < .01). The ASSET appears to be a useful, valid, and reliable method for assessing surgeon performance of diagnostic knee arthroscopic surgery in cadaveric specimens. Studies are ongoing to determine its generalizability to other procedures as well as to the live operating room and other simulated environments.
Helmerhorst, Hendrik J F; Brage, Søren; Warren, Janet; Besson, Herve; Ekelund, Ulf
2012-08-31
Physical inactivity is one of the four leading risk factors for global mortality. Accurate measurement of physical activity (PA) and in particular by physical activity questionnaires (PAQs) remains a challenge. The aim of this paper is to provide an updated systematic review of the reliability and validity characteristics of existing and more recently developed PAQs and to quantitatively compare the performance between existing and newly developed PAQs. A literature search of electronic databases was performed for studies assessing reliability and validity data of PAQs using an objective criterion measurement of PA between January 1997 and December 2011. Articles meeting the inclusion criteria were screened and data were extracted to provide a systematic overview of measurement properties. Due to differences in reported outcomes and criterion methods a quantitative meta-analysis was not possible. In total, 31 studies testing 34 newly developed PAQs, and 65 studies examining 96 existing PAQs were included. Very few PAQs showed good results on both reliability and validity. Median reliability correlation coefficients were 0.62-0.71 for existing, and 0.74-0.76 for new PAQs. Median validity coefficients ranged from 0.30-0.39 for existing, and from 0.25-0.41 for new PAQs. Although the majority of PAQs appear to have acceptable reliability, the validity is moderate at best. Newly developed PAQs do not appear to perform substantially better than existing PAQs in terms of reliability and validity. Future PAQ studies should include measures of absolute validity and the error structure of the instrument.
Formiga, Magno F; Roach, Kathryn E; Vital, Isabel; Urdaneta, Gisel; Balestrini, Kira; Calderon-Candelario, Rafael A
2018-01-01
Purpose The Test of Incremental Respiratory Endurance (TIRE) provides a comprehensive assessment of inspiratory muscle performance by measuring maximal inspiratory pressure (MIP) over time. The integration of MIP over inspiratory duration (ID) provides the sustained maximal inspiratory pressure (SMIP). Evidence on the reliability and validity of these measurements in COPD is not currently available. Therefore, we assessed the reliability, responsiveness and construct validity of the TIRE measures of inspiratory muscle performance in subjects with COPD. Patients and methods Test–retest reliability, known-groups and convergent validity assessments were implemented simultaneously in 81 male subjects with mild to very severe COPD. TIRE measures were obtained using the portable PrO2 device, following standard guidelines. Results All TIRE measures were found to be highly reliable, with SMIP demonstrating the strongest test–retest reliability with a nearly perfect intraclass correlation coefficient (ICC) of 0.99, while MIP and ID clustered closely together behind SMIP with ICC values of about 0.97. Our findings also demonstrated known-groups validity of all TIRE measures, with SMIP and ID yielding larger effect sizes when compared to MIP in distinguishing between subjects of different COPD status. Finally, our analyses confirmed convergent validity for both SMIP and ID, but not MIP. Conclusion The TIRE measures of MIP, SMIP and ID have excellent test–retest reliability and demonstrated known-groups validity in subjects with COPD. SMIP and ID also demonstrated evidence of moderate convergent validity and appear to be more stable measures in this patient population than the traditional MIP. PMID:29805255
Validity and Reliability of the Upper Extremity Work Demands Scale.
Jacobs, Nora W; Berduszek, Redmar J; Dijkstra, Pieter U; van der Sluis, Corry K
2017-12-01
Purpose To evaluate validity and reliability of the upper extremity work demands (UEWD) scale. Methods Participants from different levels of physical work demands, based on the Dictionary of Occupational Titles categories, were included. A historical database of 74 workers was added for factor analysis. Criterion validity was evaluated by comparing observed and self-reported UEWD scores. To assess structural validity, a factor analysis was executed. For reliability, the difference between two self-reported UEWD scores, the smallest detectable change (SDC), test-retest reliability and internal consistency were determined. Results Fifty-four participants were observed at work and 51 of them filled in the UEWD twice with a mean interval of 16.6 days (SD 3.3, range = 10-25 days). Criterion validity of the UEWD scale was moderate (r = .44, p = .001). Factor analysis revealed that 'force and posture' and 'repetition' subscales could be distinguished with Cronbach's alpha of .79 and .84, respectively. Reliability was good; there was no significant difference between repeated measurements. An SDC of 5.0 was found. Test-retest reliability was good (intraclass correlation coefficient for agreement = .84) and all item-total correlations were >.30. There were two pairs of highly related items. Conclusion Reliability of the UEWD scale was good, but criterion validity was moderate. Based on current results, a modified UEWD scale (2 items removed, 1 item reworded, divided into 2 subscales) was proposed. Since observation appeared to be an inappropriate gold standard, we advise to investigate other types of validity, such as construct validity, in further research.
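Note: the internal-consistency figures quoted above (Cronbach's alpha of .79 and .84) follow the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The following is a minimal sketch of that formula only, not the authors' analysis; the function name and the sample scores are hypothetical.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    Hypothetical helper for illustration; uses sample variances (n - 1).
    """
    k = len(scores[0])
    if k < 2:
        raise ValueError("need at least two items")
    # Variance of each item across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Variance of each respondent's total score.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


if __name__ == "__main__":
    # Made-up data: 5 respondents, 3 items.
    demo = [[2, 3, 3], [4, 4, 5], [1, 2, 1], [5, 4, 5], [3, 3, 4]]
    print(round(cronbach_alpha(demo), 3))
```

Alpha rises when items covary strongly relative to their individual variances, which is why it is reported per subscale rather than for unrelated items pooled together.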
Chen, Hong-Lin; Cao, Ying-Juan; Zhang, Wei; Wang, Jing; Huai, Bao-Sha
2017-02-01
The inter-rater reliability of the Braden scale is not very good. We created a modified Braden(ALB) scale by defining the nutrition subscale based on serum albumin, then assessed its validity and reliability in hospital patients. We designed a retrospective study for validity analysis, and a prospective study for reliability analysis. The receiver operating characteristic (ROC) curve and area under the curve (AUC) were used to evaluate the predictive validity. The intra-class correlation coefficient (ICC) was used to investigate the inter-rater reliability. Two thousand five hundred twenty-five patients were included for validity analysis; 76 patients (3.0%) developed pressure ulcers. A positive correlation was found between serum albumin and the nutrition score in the Braden scale (Spearman's coefficient 0.2203, P<0.0001). The AUCs for the Braden scale and the Braden(ALB) scale predicting pressure ulcer risk were 0.813 (95% CI 0.797-0.828; P<0.0001), and 0.859 (95% CI 0.845-0.872; P<0.0001), respectively. The Braden(ALB) scale had a higher AUC than the Braden scale, although the difference did not reach conventional statistical significance (z=1.860, P=0.0628). In different age subgroups, the Braden(ALB) scale also seemed more valid than the original Braden scale, but no statistically significant differences were found (P>0.05). The inter-rater reliability study showed that the ICC value increased by 45.9% for the nutrition subscale and by 4.3% for the total score. The Braden(ALB) scale has similar validity compared with the original Braden scale for in-hospital patients. However, the inter-rater reliability was significantly increased. Copyright © 2016 Elsevier Inc. All rights reserved.
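Note: the AUC values reported above can be read as the probability that a randomly chosen patient who developed a pressure ulcer is ranked as higher risk than a randomly chosen patient who did not. The sketch below computes that rank-based quantity directly; it is an illustration under stated assumptions, not the study's software. The function name and sample values are hypothetical, and because lower Braden totals indicate higher risk, the example negates the scores before ranking.

```python
def auc(risk_cases, risk_controls):
    """Rank-based AUC: P(case risk > control risk), counting ties as 0.5.

    Hypothetical helper; inputs are risk values where larger means riskier.
    """
    wins = 0.0
    for c in risk_cases:
        for n in risk_controls:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(risk_cases) * len(risk_controls))


if __name__ == "__main__":
    # Made-up Braden totals; lower total = higher risk, so negate them.
    ulcer = [11, 12, 14]
    no_ulcer = [13, 16, 18, 19]
    print(auc([-s for s in ulcer], [-s for s in no_ulcer]))
```

The pairwise loop is quadratic in sample size; it is used here for clarity rather than efficiency.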
2012-01-01
Physical inactivity is one of the four leading risk factors for global mortality. Accurate measurement of physical activity (PA) and in particular by physical activity questionnaires (PAQs) remains a challenge. The aim of this paper is to provide an updated systematic review of the reliability and validity characteristics of existing and more recently developed PAQs and to quantitatively compare the performance between existing and newly developed PAQs. A literature search of electronic databases was performed for studies assessing reliability and validity data of PAQs using an objective criterion measurement of PA between January 1997 and December 2011. Articles meeting the inclusion criteria were screened and data were extracted to provide a systematic overview of measurement properties. Due to differences in reported outcomes and criterion methods a quantitative meta-analysis was not possible. In total, 31 studies testing 34 newly developed PAQs, and 65 studies examining 96 existing PAQs were included. Very few PAQs showed good results on both reliability and validity. Median reliability correlation coefficients were 0.62–0.71 for existing, and 0.74–0.76 for new PAQs. Median validity coefficients ranged from 0.30–0.39 for existing, and from 0.25–0.41 for new PAQs. Although the majority of PAQs appear to have acceptable reliability, the validity is moderate at best. Newly developed PAQs do not appear to perform substantially better than existing PAQs in terms of reliability and validity. Future PAQ studies should include measures of absolute validity and the error structure of the instrument. PMID:22938557
Duracinsky, Martin; Lalanne, Christophe; Le Coeur, Sophie; Herrmann, Susan; Berzins, Baiba; Armstrong, Andrew Richard; Lau, Joseph Tak Fai; Fournier, Isabelle; Chassany, Olivier
2012-04-15
This study reports the psychometric validation of a new HIV/AIDS-specific health-related quality of life (HRQL) questionnaire, the Patient Reported Outcomes Quality of Life-HIV. The instrument was developed simultaneously across Europe, North and South America, Africa, Asia, and Australia to assess multidimensional quality of life impairments in the era of highly active antiretroviral therapy. A cross-sectional study was performed in 8 countries. The pilot 70-item questionnaire was co-administered with the HIV symptoms index, the EQ-5D and Medical Outcomes Study-HIV questionnaires. Demographic and biomedical data were collected. After item analysis and reduction, convergent discriminant concurrent validity and known-group validity were examined. Internal consistency and reliability scores were assessed using Cronbach alpha and intraclass correlation. The final sample of 791 patients was composed of 64% males (median age: 41 years, HIV diagnosis = 5 years), 13.8% were treatment naive. Item reduction yielded a 43-item form surveying 8 dimensions and 1 global health item that showed good convergent and discriminant validity and reliability (98% scaling success; Cronbach alphas 0.77-0.89). Correlations with EQ-5D and Medical Outcomes Study-HIV complied with concurrent validity expectations; likewise, correlations against the number of self-reported symptoms and depression showed good support for criterion validity. A test-retest study on French patients (n = 34) showed temporal stability (intraclass correlation coefficient = 0.86). Significant and meaningful differences of HRQL scores between countries were found. The Patient Reported Outcomes Quality of Life-HIV questionnaire is a valid and reliable instrument for assessing HRQL specific to HIV disease in different cultures and healthcare systems.
Predictors of validity and reliability of a physical activity record in adolescents
2013-01-01
Background Poor to moderate validity of self-reported physical activity instruments is commonly observed in young people in low- and middle-income countries. However, the reasons for such low validity have not been examined in detail. We tested the validity of a self-administered daily physical activity record in adolescents and assessed if personal characteristics or the convenience level of reporting physical activity modified the validity estimates. Methods The study comprised a total of 302 adolescents from an urban and rural area in Ecuador. Validity was evaluated by comparing the record with accelerometer recordings for seven consecutive days. Test-retest reliability was examined by comparing registrations from two records administered three weeks apart. Time spent on sedentary (SED), low (LPA), moderate (MPA) and vigorous (VPA) intensity physical activity was estimated. Bland Altman plots were used to evaluate measurement agreement. We assessed if age, sex, urban or rural setting, anthropometry and convenience of completing the record explained differences in validity estimates using a linear mixed model. Results Although the record provided higher estimates for SED and VPA and lower estimates for LPA and MPA compared to the accelerometer, it showed an overall fair measurement agreement for validity. There was modest reliability for assessing physical activity in each intensity level. Validity was associated with adolescents’ personal characteristics: sex (SED: P = 0.007; LPA: P = 0.001; VPA: P = 0.009) and setting (LPA: P = 0.000; MPA: P = 0.047). Reliability was associated with the convenience of completing the physical activity record for LPA (low convenience: P = 0.014; high convenience: P = 0.045). Conclusions The physical activity record provided acceptable estimates for reliability and validity on a group level. 
Sex and setting were associated with validity estimates, whereas convenience to fill out the record was associated with better reliability estimates for LPA. This tendency of improved reliability estimates for adolescents reporting higher convenience merits further consideration. PMID:24289296
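Note: the Bland-Altman agreement analysis mentioned above summarizes paired measurements by the mean difference (bias) and the 95% limits of agreement, bias ± 1.96 × SD of the differences. A minimal sketch of that calculation follows; the function name and the minutes-per-day values are hypothetical and are not data from the study.

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b):
    """Return (bias, lower_limit, upper_limit) for paired measurements.

    Hypothetical helper: bias is mean(a - b); limits are bias +/- 1.96 * SD.
    """
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


if __name__ == "__main__":
    # Made-up minutes/day: self-report record vs. accelerometer.
    record = [62, 45, 80, 30, 55]
    accel = [50, 48, 70, 35, 40]
    print(bland_altman(record, accel))
```

A positive bias here would mean the record reports more time than the accelerometer, and wide limits indicate poor agreement for individuals even when group-level bias is small.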
NASA Technical Reports Server (NTRS)
Castell, Karen; Day, John H. (Technical Monitor)
2001-01-01
ST5 mission requirements include validation of Lithium-ion battery in orbit. Accommodation in the power system for Li-ion battery can be reduced with smaller amp-hour size, highly matched cells when compared to the larger amp-hour size approach. Result can be lower system mass and increased reliability.
NASA Technical Reports Server (NTRS)
Wilson, Larry W.
1989-01-01
The long-term goal of this research is to identify or create a model for use in analyzing the reliability of flight control software. The immediate tasks addressed are the creation of data useful to the study of software reliability and production of results pertinent to software reliability through the analysis of existing reliability models and data. The completed data creation portion of this research consists of a Generic Checkout System (GCS) design document created in cooperation with NASA and Research Triangle Institute (RTI) experimenters. This will lead to design and code reviews with the resulting product being one of the versions used in the Terminal Descent Experiment being conducted by the Systems Validations Methods Branch (SVMB) of NASA/Langley. An appended paper details an investigation of the Jelinski-Moranda and Geometric models for software reliability. The models were given data from a process that they have correctly simulated and asked to make predictions about the reliability of that process. It was found that either model will usually fail to make good predictions. These problems were attributed to randomness in the data and replication of data was recommended.
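Note: in their usual textbook form, the Jelinski-Moranda model takes the failure rate before the i-th failure to be phi × (N − i + 1), where N is the initial number of faults, and the geometric model takes it to be D × k^(i−1) with 0 < k < 1. The sketch below simulates inter-failure times under those forms. It assumes these textbook definitions; the appended paper's exact setup is not given in the abstract, and all names and parameter values here are hypothetical.

```python
import random


def jm_hazard(n_faults, phi, i):
    """Jelinski-Moranda failure rate before the i-th failure (i is 1-based)."""
    return phi * (n_faults - i + 1)


def geometric_hazard(d, k, i):
    """Geometric-model failure rate before the i-th failure (i is 1-based)."""
    return d * k ** (i - 1)


def simulate_jm(n_faults, phi, rng):
    """Draw exponential inter-failure times until all faults are removed."""
    return [rng.expovariate(jm_hazard(n_faults, phi, i))
            for i in range(1, n_faults + 1)]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    times = simulate_jm(n_faults=10, phi=0.5, rng=rng)
    print([round(t, 2) for t in times])
```

Repeating the simulation with different seeds gives a feel for how much individual inter-failure times scatter around their expected values, which is the kind of randomness the abstract cites as a cause of poor predictions.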
RELIABILITY AND VALIDITY OF SUBJECTIVE ASSESSMENT OF LUMBAR LORDOSIS IN CONVENTIONAL RADIOGRAPHY.
Ruhinda, E; Byanyima, R K; Mugerwa, H
2014-10-01
Reliability and validity studies of different lumbar curvature analysis and measurement techniques have been documented; however, there is limited literature on the reliability and validity of subjective visual analysis. Radiological assessment of lumbar lordotic curve aids in early diagnosis of conditions even before neurologic changes set in. To ascertain the level of reliability and validity of subjective assessment of lumbar lordosis in conventional radiography. A blinded, repeated-measures diagnostic test was carried out on lumbar spine x-ray radiographs. Radiology Department at Joint Clinical Research Centre (JCRC), Mengo-Kampala-Uganda. Seventy (70) lateral lumbar x-ray films were used for this study and were obtained from the archive of JCRC radiology department at Butikiro house, Mengo-Kampala. Poor observer agreement, both inter- and intra-observer, with kappa values of 0.16 was found. Inter-observer agreement was poorer than intra-observer agreement. Kappa values significantly rose when the lumbar lordosis was clustered into four categories without grading each abnormality. The results confirm that subjective assessment of lumbar lordosis has low reliability and validity. Film quality has limited influence on the observer reliability. This study further shows that fewer scale categories of lordosis abnormalities produce better observer reliability.
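Note: the kappa statistic reported above corrects raw observer agreement for the agreement expected by chance: kappa = (p_o − p_e) / (1 − p_e). The following sketch shows the two-rater (Cohen) form; the abstract does not state which kappa variant was used, so this is an assumption, and the function name and category labels are hypothetical.

```python
from collections import Counter


def cohen_kappa(rater1, rater2):
    """Cohen's kappa for two raters labelling the same items.

    Hypothetical helper; undefined (division by zero) when chance
    agreement p_e equals 1, i.e. both raters always use one category.
    """
    n = len(rater1)
    # Observed proportion of items on which the raters agree.
    p_o = sum(a == b for a, b in zip(rater1, rater2)) / n
    # Chance agreement from each rater's marginal category frequencies.
    c1, c2 = Counter(rater1), Counter(rater2)
    p_e = sum(c1[cat] * c2[cat] for cat in c1) / n ** 2
    return (p_o - p_e) / (1 - p_e)


if __name__ == "__main__":
    # Made-up readings of six films by two observers.
    obs_a = ["normal", "hypo", "hyper", "normal", "hypo", "normal"]
    obs_b = ["normal", "normal", "hyper", "hypo", "hypo", "normal"]
    print(round(cohen_kappa(obs_a, obs_b), 3))
```

With more categories, chance agreement falls but exact matches also become harder, which is consistent with the abstract's finding that fewer categories gave better observer agreement.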
Ward, Dianne S; Mazzucca, Stephanie; McWilliams, Christina; Hales, Derek
2015-09-26
Early care and education (ECE) centers are important settings influencing young children's diet and physical activity (PA) behaviors. To better understand their impact on diet and PA behaviors as well as to evaluate public health programs aimed at ECE settings, we developed and tested the Environment and Policy Assessment and Observation - Self-Report (EPAO-SR), a self-administered version of the previously validated, researcher-administered EPAO. Development of the EPAO-SR instrument included modification of items from the EPAO, community advisory group and expert review, and cognitive interviews with center directors and classroom teachers. Reliability and validity data were collected across 4 days in 3-5 year old classrooms in 50 ECE centers in North Carolina. Center teachers and directors completed relevant portions of the EPAO-SR on multiple days according to a standardized protocol, and trained data collectors completed the EPAO for 4 days in the centers. Reliability and validity statistics calculated included percent agreement, kappa, correlation coefficients, coefficients of variation, deviations, mean differences, and intraclass correlation coefficients (ICC), depending on the response option of the item. Data demonstrated a range of reliability and validity evidence for the EPAO-SR instrument. Reporting from directors and classroom teachers was consistent and similar to the observational data. Items that produced strongest reliability and validity estimates included beverages served, outside time, and physical activity equipment, while items such as whole grains served and amount of teacher-led PA had lower reliability (observation and self-report) and validity estimates. To overcome lower reliability and validity estimates, some items need administration on multiple days. This study demonstrated appropriate reliability and validity evidence for use of the EPAO-SR in the field. 
The self-administered EPAO-SR is an advancement of the measurement of ECE settings and can be used by researchers and practitioners to assess the nutrition and physical activity environments of ECE settings.
Chen, Y-W; HajGhanbari, B; Road, J D; Coxson, H O; Camp, P G; Reid, W D
2018-06-08
Pain is prevalent in chronic obstructive pulmonary disease (COPD) and the Brief Pain Inventory (BPI) appears to be a feasible questionnaire to assess this symptom. However, the reliability and validity of the BPI have not been determined in individuals with COPD. This study aimed to determine the internal consistency, test-retest reliability and validity (construct, convergent, divergent and discriminant) of the BPI in individuals with COPD. In order to examine the test-retest reliability, individuals with COPD were recruited from pulmonary rehabilitation programmes to complete the BPI twice 1 week apart. In order to investigate validity, de-identified data was retrieved from two previous studies, including forced expiratory volume in 1-s, age, sex and data from four questionnaires: the BPI, short-form McGill Pain Questionnaire (SF-MPQ), 36-Item Short Form Survey (SF-36) and Community Health Activities Model Program for Seniors (CHAMPS) questionnaire. In total, 123 participants were included in the analyses (eligible data were retrieved from 86 participants and additional 37 participants were recruited). The BPI demonstrated excellent internal consistency and test-retest reliability. It also showed convergent validity with the SF-MPQ and divergent validity with the SF-36. The factor analysis yielded two factors of the BPI, which demonstrated that the two domains of the BPI measure the intended constructs. The BPI can also discriminate pain levels among COPD patients with varied levels of quality of life (SF-36) and physical activity (CHAMPS). The BPI is a reliable and valid pain questionnaire that can be used to evaluate pain in COPD. This study formally established the reliability and validity of the BPI in individuals with COPD, which have not been determined in this patient group. The results of this study provide strong evidence that assessment results from this pain questionnaire are reliable and valid. © 2018 European Pain Federation - EFIC®.
Auvinet, E; Multon, F; Manning, V; Meunier, J; Cobb, J P
2017-01-01
Gait asymmetry information is a key point in disease screening and follow-up. Constant Relative Phase (CRP) has been used to quantify within-stride asymmetry index, which requires noise-free and accurate motion capture, which is difficult to obtain in clinical settings. This study explores a new index, the Longitudinal Asymmetry Index (ILong) which is derived using data from a low-cost depth camera (Kinect). ILong is based on depth images averaged over several gait cycles, rather than derived joint positions or angles. This study aims to evaluate (1) the validity of CRP computed with Kinect, (2) the validity and sensitivity of ILong for measuring gait asymmetry based solely on data provided by a depth camera, (3) the clinical applicability of a posteriorly mounted camera system to avoid occlusion caused by the standard front-fitted treadmill consoles and (4) the number of strides needed to reliably calculate ILong. The gait of 15 subjects was recorded concurrently with a marker-based system (MBS) and Kinect, and asymmetry was artificially reproduced by introducing a 5 cm sole attached to one foot. CRP computed with Kinect was not reliable. ILong detected this disturbed gait reliably and could be computed from a posteriorly placed Kinect without loss of validity. A minimum of five strides was needed to achieve a correlation coefficient of 0.9 between standard MBS and low-cost depth camera based ILong. ILong provides a clinically pragmatic method for measuring gait asymmetry, with application for improved patient care through enhanced disease screening, diagnosis and monitoring. Copyright © 2016. Published by Elsevier B.V.
Li, Yingshuang; Ding, Chunge
2017-01-01
The Adult Carer Quality of Life questionnaire (AC-QoL) is a reliable and valid instrument used to assess the quality of life (QoL) of adult family caregivers. We explored the psychometric properties of a Chinese version of the AC-QoL, testing its reliability and validity in 409 Chinese stroke caregivers. We used item-total correlation and extreme group comparison for item analysis. To evaluate its reliability, we used a test-retest approach with the intraclass correlation coefficient (ICC), together with Cronbach’s alpha and a model-based internal consistency index; to evaluate its validity, we used scale content validity, confirmatory factor analysis (CFA) and exploratory factor analysis (EFA) via principal component analysis with varimax rotation. We found that the CFA did not in fact confirm the original factor model and our EFA yielded a 31-item measure with a five-factor model. In conclusion, although some items performed differently in our analysis of the original English language version and our Chinese language version, our translated AC-QoL is a reliable and valid tool which can be used to assess the quality of life of stroke caregivers in mainland China. The Chinese version of the AC-QoL is a comprehensive and sound measure for understanding caregivers and has the potential to be a screening tool to assess caregivers' QoL. PMID:29131845
Nascimento-Ferreira, Marcus Vinícius; De Moraes, Augusto César Ferreira; Toazza-Oliveira, Paulo Vinícius; Forjaz, Claudia L M; Aristizabal, Juan Carlos; Santaliesra-Pasías, Alba M; Lepera, Candela; Nascimento-Junior, Walter Viana; Skapino, Estela; Delgado, Carlos Alberto; Moreno, Luis Alberto; Carvalho, Heráclito Barbosa
2018-03-01
The objective of this article is to test the reliability and validity of the new and innovative physical activity (PA) questionnaire. Subsamples from the South American Youth/Child Cardiovascular and Environment Study (SAYCARE) study were included to examine its reliability (children: n = 161; adolescents: n = 177) and validity (children: n = 82; adolescents: n = 60). The questionnaire consists of three dimensions of PA (leisure, active commuting, and school) performed during the last week. To assess its validity, the subjects wore accelerometers for at least 3 days and 8 h/d (at least one weekend day). The reliability was analyzed by correlation coefficients. In addition, Bland-Altman analysis and a multilevel regression were applied to estimate the measurement bias, limits of agreement, and influence of contextual variables. In children, the questionnaire showed consistent reliability (ρ = 0.56) and moderate validity (ρ = 0.46), and the contextual variable variance explained 43.0% with -22.9 min/d bias. In adolescents, the reliability was higher (ρ = 0.76) and the validity was almost excellent (ρ = 0.88), with 66.7% of the variance explained by city level with 16.0 min/d PA bias. The SAYCARE PA questionnaire shows acceptable (in children) to strong (in adolescents) reliability and strong validity in the measurement of PA in the pediatric population from low- to middle-income countries. © 2018 The Obesity Society.
NASA Astrophysics Data System (ADS)
Zheng, W.; Gao, J. M.; Wang, R. X.; Chen, K.; Jiang, Y.
2017-12-01
This paper puts forward a new method of technical characteristics deployment based on Reliability Function Deployment (RFD) by analysing the advantages and shortcomings of related research on mechanical reliability design. The matrix decomposition structure of RFD was used to describe the correlative relation between failure mechanisms, soft failures and hard failures. By considering the correlation of multiple failure modes, the reliability loss of one failure mode to the whole part was defined, and a calculation and analysis model for reliability loss was presented. According to the reliability loss, the reliability index value of the whole part was allocated to each failure mode. On the basis of the deployment of the reliability index value, the inverse reliability method was employed to acquire the values of the technical characteristics. The feasibility and validity of the proposed method were illustrated by a development case of a machining centre’s transmission system.
Verification and Validation in a Rapid Software Development Process
NASA Technical Reports Server (NTRS)
Callahan, John R.; Easterbrook, Steve M.
1997-01-01
The high cost of software production is driving development organizations to adopt more automated design and analysis methods such as rapid prototyping, computer-aided software engineering (CASE) tools, and high-level code generators. Even developers of safety-critical software systems have adopted many of these new methods while striving to achieve high levels of quality and reliability. While these new methods may enhance productivity and quality in many cases, we examine some of the risks involved in the use of new methods in safety-critical contexts. We examine a case study involving the use of a CASE tool that automatically generates code from high-level system designs. We show that while high-level testing on the system structure is highly desirable, significant risks exist in the automatically generated code and in re-validating releases of the generated code after subsequent design changes. We identify these risks and suggest process improvements that retain the advantages of rapid, automated development methods within the quality and reliability contexts of safety-critical projects.
Ford, Sarah; Hall, Angela
2004-09-01
The Medical Interaction Process System (MIPS) was originally developed in order to create a reliable observation tool for analysing doctor-patient encounters in the oncology setting. This paper reports a series of analyses carried out to establish whether the behaviour categories of the MIPS can discriminate between skilled and less skilled communicators. This involved the use of MIPS coded cancer consultations to compare the MIPS indices of 10 clinicians evaluated by an independent professional as skilled communicators with 10 who were considered less skilled. Eleven out of the 15 MIPS variables tested were able to distinguish the skilled from the less skilled group. Although limitations to the study are discussed, the results indicate that the MIPS has satisfactory discriminatory power and the results provide validity data that meet key objectives for developing the system. There is an ever-increasing need for reliable methods of assessing doctors' communication skills and evaluating medical interview teaching programmes. Copyright 2004 Elsevier Ireland Ltd.
Ruff, Jessica; Wang, Tiffany L; Quatman-Yates, Catherine C; Phieffer, Laura S; Quatman, Carmen E
2015-02-01
Commercially available gaming systems (CAGS) such as the Wii Balance Board (WBB) and Microsoft Xbox with Kinect (Xbox Kinect) are increasingly used as balance training and rehabilitation tools. The purpose of this review was to answer the question, "Are commercially available gaming systems valid and reliable instruments for use as clinical diagnostic and functional assessment tools in orthopaedic settings?" and provide a summary of relevant studies, identify their strengths and weaknesses, and generate conclusions regarding general validity/reliability of WBB and Xbox Kinect in orthopaedics. A systematic search was performed using MEDLINE (1996-2013) and Scopus (1996-2013). Inclusion criteria were minimum of 5 subjects, full manuscript provided in English or translated, and studies incorporating investigation of CAG measurement properties. Exclusion criteria included reviews, systematic reviews, summary/clinical commentaries, or case studies; conference proceedings/presentations; cadaveric studies; studies of non-reversible, non-orthopaedic-related musculoskeletal disease; non-human trials; and therapeutic studies not reporting comparative evaluation to already established functional assessment criteria. All studies meeting inclusion and exclusion criteria were appraised for quality by two independent reviewers. Evidence levels (I-V) were assigned to each study based on established methodological criteria. Three Level II, seven Level III, and one Level IV studies met inclusion criteria and provided information related to the use of the WBB and Xbox Kinect as clinical assessment tools in the field of orthopaedics. Studies have used the WBB in a variety of clinical applications, including the measurement of center of pressure (COP), measurement of medial-to-lateral (M/L) or anterior-to-posterior (A/P) symmetry, assessment of anatomic landmark positioning, and assessment of fall risk.
However, no uniform protocols or outcomes were used to evaluate the quality of the WBB as a clinical assessment tool; therefore a wide range of sensitivities, specificities, accuracies, and validities were reported. Currently it is not possible to make a universal generalization about the clinical utility of CAGS in the field of orthopaedics. However, there is evidence to support using the WBB and the Xbox Kinect as tools to obtain reliable and valid COP measurements. The Wii Fit Game may specifically provide reliable and valid measurements for predicting fall risk. Copyright © 2014 Elsevier Ltd. All rights reserved.
Reliability and validity of the de Morton Mobility Index in individuals with sub-acute stroke.
Braun, Tobias; Marks, Detlef; Thiel, Christian; Grüneberg, Christian
2018-02-04
To establish the validity and reliability of the de Morton Mobility Index (DEMMI) in patients with sub-acute stroke. This cross-sectional study was performed in a neurological rehabilitation hospital. We assessed unidimensionality, construct validity, internal consistency reliability, inter-rater reliability, minimal detectable change and possible floor and ceiling effects of the DEMMI in adult patients with sub-acute stroke. The study included a total sample of 121 patients with sub-acute stroke. We analysed validity (n = 109) and reliability (n = 51) in two sub-samples. Rasch analysis indicated unidimensionality with an overall fit to the model (chi-square = 12.37, p = 0.577). All hypotheses on construct validity were confirmed. Internal consistency reliability (Cronbach's alpha = 0.94) and inter-rater reliability (intraclass correlation coefficient = 0.95; 95% confidence interval: 0.92-0.97) were excellent. The minimal detectable change with 90% confidence was 13 points. No floor or ceiling effects were evident. These results indicate unidimensionality, sufficient internal consistency reliability, inter-rater reliability, and construct validity of the DEMMI in patients with a sub-acute stroke. Advantages of the DEMMI in clinical application are the short administration time, no need for special equipment and interval level data. The de Morton Mobility Index, therefore, may be a useful performance-based bedside test to measure mobility in individuals with a sub-acute stroke across the whole mobility spectrum. Implications for Rehabilitation The de Morton Mobility Index (DEMMI) is a unidimensional measurement instrument of mobility in individuals with sub-acute stroke. The DEMMI has excellent internal consistency and inter-rater reliability, and sufficient construct validity. The minimal detectable change of the DEMMI with 90% confidence in stroke rehabilitation is 13 points.
The lack of any floor or ceiling effects on hospital admission indicates applicability across the whole mobility spectrum of patients with sub-acute stroke.
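A minimal detectable change such as the one reported above is commonly derived from the standard error of measurement (SEM). The abstract does not state the exact computation, so the following Python sketch only illustrates the usual textbook relations, SEM = SD * sqrt(1 - ICC) and MDC90 = 1.645 * sqrt(2) * SEM. The function names and the input values are hypothetical and are not taken from the study.

```python
import math


def sem_from_icc(sd: float, icc: float) -> float:
    """Standard error of measurement from a sample SD and a reliability coefficient."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie between 0 and 1")
    return sd * math.sqrt(1.0 - icc)


def minimal_detectable_change(sem: float, z: float = 1.645) -> float:
    """MDC for a test-retest difference; z = 1.645 gives 90% confidence (assumed convention)."""
    # sqrt(2) accounts for measurement error being present in both measurements.
    return z * math.sqrt(2.0) * sem


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
example_sem = sem_from_icc(sd=10.0, icc=0.75)          # 10 * sqrt(0.25) = 5.0
example_mdc90 = minimal_detectable_change(example_sem)  # about 11.63
print(round(example_sem, 2), round(example_mdc90, 2))
```

Passing z = 1.96 instead would give the 95% variant; which confidence level a study uses should be read from its methods section rather than assumed.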
Charalambous, A; Molassiotis, A
2017-01-01
The Short Form Chronic Respiratory Questionnaire (SF-CRQ) is frequently used in patients with obstructive pulmonary disease and it has demonstrated excellent psychometric properties. Since there is no psychometric information for its use with lung cancer patients, this study explored its validity and reliability in this population. Forty-six patients were assessed at two time points (with a 4-week interval) using the SF-CRQ, the modified Borg Scale, five numerical rating scales related to Perceived Severity of Breathlessness, and the Hospital Anxiety and Depression Scale. Internal consistency reliability was investigated with Cronbach's alpha reliability coefficient and test-retest reliability with the Spearman-Brown reliability coefficient (P). Content validity was explored, as was convergent validity, using Pearson's correlation coefficient between the SF-CRQ and the conceptually similar scales mentioned above. A principal component factor analysis was performed. The internal consistency was high [α = 0.88 (baseline) and 0.91 (after 1 month)]. The SF-CRQ had good stability with test-retest reliability ranging from r = 0.64 to 0.78, P < 0.001. Factor analysis suggests a single construct in this population. The preliminary data analyses supported the convergent, content, and construct validity of the SF-CRQ, providing promising evidence that this can be a valid and reliable instrument for the assessment of quality of life related to breathlessness in lung cancer patients. © 2015 John Wiley & Sons Ltd.
Sun, Yi; Arning, Martin; Bochmann, Frank; Börger, Jutta; Heitmann, Thomas
2018-06-01
The Occupational Safety and Health Monitoring and Assessment Tool (OSH-MAT) is a practical instrument that is currently used in the German woodworking and metalworking industries to monitor safety conditions at workplaces. The 12-item scoring system has three subscales rating technical, organizational, and personnel-related conditions in a company. Each item has a rating value ranging from 1 to 9, with higher values indicating a higher standard of safety conditions. The reliability of this instrument was evaluated in a cross-sectional survey among 128 companies and its validity among 30,514 companies. The inter-rater reliability of the instrument was examined independently and simultaneously by two well-trained safety engineers. Agreement between the double ratings was quantified by the intraclass correlation coefficient and absolute agreement of the rating values. The content validity of the OSH-MAT was evaluated by quantifying the association between OSH-MAT values and 5-year average injury rates by Poisson regression analysis adjusted for the size of the companies and industrial sectors. The construct validity of OSH-MAT was examined by principal component factor analysis. Our analysis indicated good to very good inter-rater reliability (intraclass correlation coefficient = 0.64-0.74) of OSH-MAT values with an absolute agreement of between 72% and 81%. Factor analysis identified three component subscales that exactly matched the theoretical structure of this instrument. The Poisson regression analysis demonstrated a statistically significant exposure-response relationship between OSH-MAT values and the 5-year average injury rates. These analyses indicate that OSH-MAT is a valid and reliable instrument that can be used effectively to monitor safety conditions at workplaces.
Sim, Joong Hiong; Tong, Wen Ting; Hong, Wei-Han; Vadivelu, Jamuna; Hassan, Hamimah
2015-01-01
Assessment environment, synonymous with climate or atmosphere, is multifaceted. Although there are valid and reliable instruments for measuring the educational environment, there is no validated instrument for measuring the assessment environment in medical programs. This study aimed to develop an instrument for measuring students' perceptions of the assessment environment in an undergraduate medical program and to examine the psychometric properties of the new instrument. The Assessment Environment Questionnaire (AEQ), a 40-item, four-point (1=Strongly Disagree to 4=Strongly Agree) Likert scale instrument designed by the authors, was administered to medical undergraduates from the authors' institution. The response rate was 626/794 (78.84%). To establish construct validity, exploratory factor analysis (EFA) with principal component analysis and varimax rotation was conducted. To examine the internal consistency reliability of the instrument, Cronbach's α was computed. Mean scores for the entire AEQ and for each factor/subscale were calculated. Mean AEQ scores of students from different academic years and sex were examined. Six hundred and eleven completed questionnaires were analysed. EFA extracted four factors: feedback mechanism (seven items), learning and performance (five items), information on assessment (five items), and assessment system/procedure (three items), which together explained 56.72% of the variance. Based on the four extracted factors/subscales, the AEQ was reduced to 20 items. Cronbach's α for the 20-item AEQ was 0.89, whereas Cronbach's α for the four factors/subscales ranged from 0.71 to 0.87. Mean score for the AEQ was 2.68/4.00. The factor/subscale of 'feedback mechanism' recorded the lowest mean (2.39/4.00), whereas the factor/subscale of 'assessment system/procedure' scored the highest mean (2.92/4.00). Significant differences were found among the AEQ scores of students from different academic years. 
The AEQ is a valid and reliable instrument. Initial validation supports its use to measure students' perceptions of the assessment environment in an undergraduate medical program.
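Several of the abstracts in this listing report Cronbach's α as their measure of internal consistency. As a rough illustration of what that coefficient computes, here is a minimal Python sketch of the standard formula α = k/(k - 1) * (1 - Σ item variances / variance of total scores). The function name and the response data are hypothetical; none of the values come from the studies above.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    n_respondents = len(scores)
    k = len(scores[0])
    if n_respondents < 2 or k < 2:
        raise ValueError("need at least two respondents and two items")
    if any(len(row) != k for row in scores):
        raise ValueError("every respondent must answer the same number of items")
    # Sample variance of each item (column) across respondents.
    item_variance_sum = sum(variance(column) for column in zip(*scores))
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Hypothetical responses: 4 respondents answering 3 items on a 1-4 scale.
hypothetical_responses = [
    [2, 3, 3],
    [4, 4, 3],
    [1, 2, 2],
    [3, 3, 4],
]
print(round(cronbach_alpha(hypothetical_responses), 3))
```

The sketch uses sample variances (n - 1 denominator) throughout; because the same denominator appears in both the numerator and denominator of the ratio, population variances would give the same result.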
Improving Water Level and Soil Moisture Over Peatlands in a Global Land Modeling System
NASA Technical Reports Server (NTRS)
Bechtold, M.; De Lannoy, G. J. M.; Roose, D.; Reichle, R. H.; Koster, R. D.; Mahanama, S. P.
2017-01-01
New model structure for peatlands results in improved skill metrics (without any parameter calibration). Simulated surface soil moisture is strongly affected by the new model, but reliable soil moisture data are lacking for validation.
NASA Technical Reports Server (NTRS)
Harper, Richard E.; Elks, Carl
1995-01-01
An Army Fault Tolerant Architecture (AFTA) has been developed to meet real-time fault tolerant processing requirements of future Army applications. AFTA is the enabling technology that will allow the Army to configure existing processors and other hardware to provide high throughput and ultrahigh reliability necessary for TF/TA/NOE flight control and other advanced Army applications. A comprehensive conceptual study of AFTA has been completed that addresses a wide range of issues including requirements, architecture, hardware, software, testability, producibility, analytical models, validation and verification, common mode faults, VHDL, and a fault tolerant data bus. A Brassboard AFTA for demonstration and validation has been fabricated, and two operating systems and a flight-critical Army application have been ported to it. Detailed performance measurements have been made of fault tolerance and operating system overheads while AFTA was executing the flight application in the presence of faults.
Powers, John H; Bacci, Elizabeth D; Guerrero, M Lourdes; Leidy, Nancy Kline; Stringer, Sonja; Kim, Katherine; Memoli, Matthew J; Han, Alison; Fairchok, Mary P; Chen, Wei-Ju; Arnold, John C; Danaher, Patrick J; Lalani, Tahaniyat; Ridoré, Michelande; Burgess, Timothy H; Millar, Eugene V; Hernández, Andrés; Rodríguez-Zulueta, Patricia; Smolskis, Mary C; Ortega-Gallegos, Hilda; Pett, Sarah; Fischer, William; Gillor, Daniel; Macias, Laura Moreno; DuVal, Anna; Rothman, Richard; Dugas, Andrea; Ruiz-Palacios, Guillermo M
2018-02-01
To assess the reliability, validity, and responsiveness of InFLUenza Patient-Reported Outcome (FLU-PRO©) scores for quantifying the presence and severity of influenza symptoms. An observational prospective cohort study of adults (≥18 years) with influenza-like illness in the United States, the United Kingdom, Mexico, and South America was conducted. Participants completed the 37-item draft FLU-PRO daily for up to 14 days. Item-level and factor analyses were used to remove items and determine factor structure. Reliability of the final tool was estimated using Cronbach α and intraclass correlation coefficients (2-day reliability). Convergent and known-groups validity and responsiveness were assessed using global assessments of influenza severity and return to usual health. Of the 536 patients enrolled, 221 influenza-positive subjects comprised the analytical sample. The mean age of the patients was 40.7 years, 60.2% were women, and 59.7% were white. The final 32-item measure has six factors/domains (nose, throat, eyes, chest/respiratory, gastrointestinal, and body/systemic), with a higher order factor representing symptom severity overall (comparative fit index = 0.92; root mean square error of approximation = 0.06). Cronbach α was high (total = 0.92; domain range = 0.71-0.87); test-retest reliability (intraclass correlation coefficient, day 1-day 2) was 0.83 for total scores and 0.57 to 0.79 for domains. Day 1 FLU-PRO domain and total scores were moderately to highly correlated (≥0.30) with Patient Global Rating of Flu Severity (except nose and throat). Consistent with known-groups validity, scores differentiated severity groups on the basis of global rating (total: F = 57.2, P < 0.001; domains: F = 8.9-67.5, P < 0.001). Subjects reporting return to usual health showed significantly greater (P < 0.05) FLU-PRO score improvement by day 7 than did those who did not, suggesting score responsiveness. 
Results suggest that FLU-PRO scores are reliable, valid, and responsive to change in influenza-positive adults. Copyright © 2018 International Society for Pharmacoeconomics and Outcomes Research (ISPOR). Published by Elsevier Inc. All rights reserved.
Validity and reliability of head posture measurement using Microsoft Kinect.
Oh, Baek-Lok; Kim, Jongmin; Kim, Jongshin; Hwang, Jeong-Min; Lee, Jehee
2014-11-01
To investigate the validity and reliability of Microsoft Kinect-based head tracker (KHT) for measuring head posture. Considering the cervical range of motion (CROM) as a reference, one-dimensional and three-dimensional (1D and 3D) head postures of 12 normal subjects (28-58 years of age; 6 women and 6 men) were obtained using the KHT. The KHT was validated by Pearson's correlation coefficient and intraclass correlation (ICC) coefficient. Test-retest reliability of the KHT was determined by its 95% limit of agreement (LoA) with the Bland-Altman plot. Face recognition success rate was evaluated for each head posture. Measurements of 1D and 3D head posture performed using the KHT were very close to those of the CROM with correlation coefficients of 0.99 and 0.97 (p<0.05), respectively, as well as with an ICC of >0.99 and 0.98, respectively. The reliability tests of the KHT in terms of 1D and 3D head postures had 95% LoA angles of approximately ±2.5° and ±6.5°, respectively. The KHT showed good agreement with the CROM and relatively favourable test-retest reliability. Considering its high performance, convenience and low cost, KHT could be clinically used as a head posture-measuring system. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.
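The 95% limits of agreement mentioned above come from a Bland-Altman analysis, in which the differences between paired measurements are summarised by their mean (the bias) plus or minus 1.96 standard deviations. The following Python sketch illustrates that calculation under the usual assumption of approximately normally distributed differences. The function name and the angle values are hypothetical and unrelated to the study data.

```python
from statistics import mean, stdev


def limits_of_agreement(first, second):
    """Return (lower limit, bias, upper limit) for two lists of paired measurements."""
    if len(first) != len(second):
        raise ValueError("both lists must contain the same number of measurements")
    if len(first) < 2:
        raise ValueError("need at least two pairs")
    differences = [a - b for a, b in zip(first, second)]
    bias = mean(differences)
    # 1.96 sample standard deviations on either side of the bias.
    half_width = 1.96 * stdev(differences)
    return bias - half_width, bias, bias + half_width


# Hypothetical head-rotation angles (degrees) from two measurement sessions.
session_one = [10.0, 12.0, 14.0, 16.0]
session_two = [9.0, 13.0, 13.0, 17.0]
lower, bias, upper = limits_of_agreement(session_one, session_two)
print(round(lower, 2), round(bias, 2), round(upper, 2))
```

A narrow interval centred near zero indicates that the two sets of measurements can be used interchangeably for practical purposes; what counts as narrow enough is a clinical judgment, not a statistical one.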
The validation of Huffaz Intelligence Test (HIT)
NASA Astrophysics Data System (ADS)
Rahim, Mohd Azrin Mohammad; Ahmad, Tahir; Awang, Siti Rahmah; Safar, Ajmain
2017-08-01
In general, a hafiz who can memorize the Quran has many strengths, especially with respect to academic performance. In this study, the theory of multiple intelligences introduced by Howard Gardner is embedded in a newly developed psychometric instrument, the Huffaz Intelligence Test (HIT). This paper presents the validation and reliability of the HIT among tahfiz students in Malaysian Islamic schools. A pilot study was conducted involving 87 huffaz who were randomly selected to answer the items in the HIT. The analysis used Partial Least Squares (PLS) to assess reliability and convergent and discriminant validity. The study has validated nine intelligences. The findings also indicated that the composite reliabilities for the nine types of intelligences are greater than 0.8. Thus, the HIT is a valid and reliable instrument to measure the multiple intelligences among huffaz.
Exploring the reliability and validity of the social-moral awareness test.
Livesey, Alexandra; Dodd, Karen; Pote, Helen; Marlow, Elizabeth
2012-11-01
The aim of the study was to explore the validity of the social-moral awareness test (SMAT), a measure designed for assessing socio-moral rule knowledge and reasoning in people with learning disabilities. Comparisons between Theory of Mind and socio-moral reasoning allowed the exploration of construct validity of the tool. Factor structure, reliability and discriminant validity were also assessed. Seventy-one participants with mild-moderate learning disabilities completed the two scales of the SMAT and two False Belief Tasks for Theory of Mind. Reliability of the SMAT was very good, and the scales were shown to be uni-dimensional in factor structure. There was a significant positive relationship between Theory of Mind and both SMAT scales. There is early evidence of the construct validity and reliability of the SMAT. Further assessment of the validity of the SMAT will be required. © 2012 Blackwell Publishing Ltd.
ERIC Educational Resources Information Center
Shogren, Karrie A.; Wehmeyer, Michael L.; Seo, Hyojeong; Thompson, James R.; Schalock, Robert L.; Hughes, Carolyn; Little, Todd D.; Palmer, Susan B.
2017-01-01
This study compared the reliability, validity, and measurement properties of the "Supports Intensity Scale-Children's Version" (SIS-C) in children with autism and intellectual disability (n = 2,124) and children with intellectual disability only (n = 1,861). The results suggest that SIS-C is a valid and reliable tool in both populations.…
ERIC Educational Resources Information Center
John, A. C.
2015-01-01
The aim of the study was to examine the importance of reliability and validity as necessary foundation for fair assessment. The concepts of reliability, validity, fair assessment and their relationships were analysed. Qualities of fair assessment were discussed. A number of recommendations were made to make assessors be more cautious in award of…
Jackson, T
2001-05-01
Casemix-funding systems for hospital inpatient care require a set of resource weights which will not inadvertently distort patterns of patient care. Few health systems have very good sources of cost information, and specific studies to derive empirical cost relativities are themselves costly. This paper reports a 5 year program of research into the use of data from hospital management information systems (clinical costing systems) to estimate resource relativities for inpatient hospital care used in Victoria's DRG-based payment system. The paper briefly describes international approaches to cost weight estimation. It describes the architecture of clinical costing systems, and contrasts process and job costing approaches to cost estimation. Techniques of data validation and reliability testing developed in the conduct of four of the first five of the Victorian Cost Weight Studies (1993-1998) are described. Improvements in sampling, data validity and reliability are documented over the course of the research program, and the advantages of patient-level data are highlighted. The usefulness of these byproduct data for estimation of relative resource weights and other policy applications may be an important factor in hospital and health system decisions to invest in clinical costing technology.
Reliability and validity of the Incontinence Quiz-Turkish version.
Kara, Kerime C; Çıtak Karakaya, İlkim; Tunalı, Nur; Karakaya, Mehmet G
2018-01-01
The aim of this study was to investigate the reliability and validity of the Turkish version of the Incontinence Quiz, which was developed by Branch et al. (1994), to assess women's knowledge of and attitudes toward urinary incontinence. Comprehensibility of the Turkish version of the 14-item Incontinence Quiz, which was prepared following translation-back translation procedures, was tested on a pilot group of eight women, and its internal reliability, test-retest reliability and construct validity were assessed in 150 women who attended the gynecology clinics of three hospitals in İçel, Turkey. Physical and sociodemographic characteristics and presence of incontinence complaints were also recorded. Data were analyzed at the 0.05 alpha level, using SPSS version 22. The scale had good reliability and validity. The internal reliability coefficient (Cronbach α) was 0.80 and test-retest correlation coefficients were 0.83-0.94; with regard to construct validity, the Kaiser-Meyer-Olkin coefficient was 0.76 and the Bartlett sphericity test statistic was 562.777 (P = 0.000). The Turkish version of the Incontinence Quiz had a four-factor structure, with eigenvalues ranging from 1.17 to 4.08. The Incontinence Quiz-Turkish version is a highly comprehensible, reliable and valid scale, which may be used to assess Turkish-speaking women's knowledge of and attitudes toward urinary incontinence. © 2017 Japan Society of Obstetrics and Gynecology.
Cha, Young Joo; Lee, Jae Jin; Kim, Do Hyun; You, Joshua Sung H
2017-10-23
Core stabilization plays an important role in the regulation of postural stability. To overcome shortcomings associated with pain and severe core instability during conventional core stabilization tests, we recently developed the dynamic neuromuscular stabilization-based heel sliding (DNS-HS) test. The purpose of this study was to establish the criterion validity and test-retest reliability of the novel DNS-HS test. Twenty young adults with core instability completed both the bilateral straight leg lowering test (BSLLT) and DNS-HS test for the criterion validity study and repeated the DNS-HS test for the test-retest reliability study. Criterion validity was determined by comparing hip joint angle data that were obtained from BSLLT and DNS-HS measures. The test-retest reliability was determined by comparing hip joint angle data. Criterion validity was (ICC2,3) = 0.700 (p< 0.05), suggesting a good relationship between the two core stability measures. Test-retest reliability was (ICC3,3) = 0.953 (p< 0.05), indicating excellent consistency between the repeated DNS-HS measurements. Criterion validity data demonstrated a good relationship between the gold standard BSLLT and DNS-HS core stability measures. Test-retest reliability data suggest that the DNS-HS test is a reliable test of core stability. Clinically, the DNS-HS test is useful to objectively quantify core instability and allow early detection and evaluation.
Boerboom, T B B; Dolmans, D H J M; Jaarsma, A D C; Muijtjens, A M M; Van Beukelen, P; Scherpbier, A J J A
2011-01-01
Feedback to aid teachers in improving their teaching requires validated evaluation instruments. When implementing an evaluation instrument in a different context, it is important to collect validity evidence from multiple sources. We examined the validity and reliability of the Maastricht Clinical Teaching Questionnaire (MCTQ) as an instrument to evaluate individual clinical teachers during short clinical rotations in veterinary education. We examined four sources of validity evidence: (1) Content was examined based on theory of effective learning. (2) Response process was explored in a pilot study. (3) Internal structure was assessed by confirmatory factor analysis using 1086 student evaluations and reliability was examined utilizing generalizability analysis. (4) Relations with other relevant variables were examined by comparing factor scores with other outcomes. Content validity was supported by theory underlying the cognitive apprenticeship model on which the instrument is based. The pilot study resulted in an additional question about supervision time. A five-factor model showed a good fit with the data. Acceptable reliability was achievable with 10-12 questionnaires per teacher. Correlations between the factors and overall teacher judgement were strong. The MCTQ appears to be a valid and reliable instrument to evaluate clinical teachers' performance during short rotations.
Wesnes, Keith A; Brooker, Helen; Ballard, Clive; McCambridge, Laura; Stenton, Robert; Corbett, Anne
2017-12-01
The advent of long-term remotely conducted clinical trials requires assessments which can be administered online. This paper considers the utility, reliability, sensitivity and validity of an internet-based system for measuring changes in cognitive function which is being used in one such trial. The Platform for Research Online to investigate Genetics and Cognition in Ageing is a 10-year longitudinal and entirely remote study launched in November 2015. The CogTrack™ System is being used to monitor changes in important aspects of cognitive function using tests of attention, information processing and episodic memory. On study entry, the participants performed CogTrack™ up to three times over seven days, and these data are evaluated in this paper. During the first six months of the study, 14 531 individuals aged 50 to 94 years enrolled and performed the CogTrack™ System, 8627 of whom completed three test sessions. On the first administration, 99.4% of the study tasks were successfully completed. Repeated testing showed training/familiarisation effects on four of the ten measures which had largely stabilised by the third test session. The factor structure of the various measures was found to be robust. Evaluation of the influence of age identified clinically relevant declines over the age range of the population on one or more measures from all tasks. The results of these analyses identify CogTrack™ to be a practical and valid method to reliably, sensitively, remotely and repeatedly collect cognitive data from large samples of individuals aged 50 and over. Copyright © 2017 John Wiley & Sons, Ltd.
Numerical aerodynamic simulation facility. Preliminary study extension
NASA Technical Reports Server (NTRS)
1978-01-01
The primary objective of this report was to produce an optimized design of key elements of the candidate facility. This was accomplished through work on the following tasks: (1) to further develop, optimize and describe the functional description of the custom hardware; (2) to delineate trade-off areas between performance, reliability, availability, serviceability, and programmability; (3) to develop metrics and models for validation of the candidate system's performance; (4) to conduct a functional simulation of the system design; (5) to perform a reliability analysis of the system design; and (6) to develop the software specifications, including a user-level high-level programming language and a correspondence between the programming language and the instruction set, and to outline the operating system requirements.
Ahmad, Sohail; Ismail, Ahmad Izuanuddin; Khan, Tahir Mehmood; Akram, Waqas; Mohd Zim, Mohd Arif; Ismail, Nahlah Elkudssiah
2017-04-01
The stigmatisation degree, self-esteem and knowledge either directly or indirectly influence the control and self-management of asthma. To date, there is no valid and reliable instrument that can assess these key issues collectively. The main aim of this study was to test the reliability and validity of the newly devised and translated "Stigmatisation Degree, Self-Esteem and Knowledge Questionnaire" among adult asthma patients using the Rasch measurement model. This cross-sectional study recruited thirty adult asthma patients from two respiratory specialist clinics in Selangor, Malaysia. The newly devised self-administered questionnaire was adapted from relevant publications and translated into the Malay language using international standard translation guidelines. Content and face validation was done. The data were extracted and analysed for real item reliability and construct validation using the Rasch model. The translated "Stigmatisation Degree, Self-Esteem and Knowledge Questionnaire" showed high real item reliability values of 0.90, 0.86 and 0.89 for stigmatisation degree, self-esteem, and knowledge of asthma, respectively. Furthermore, all values of point measure correlation (PTMEA Corr) analysis were within the acceptable specified range of the Rasch model. Infit/outfit mean square values and Z standard (ZSTD) values of each item verified the construct validity and suggested retaining all the items in the questionnaire. The reliability analyses and output tables of item measures for construct validation proved the translated Malaysian version of "Stigmatisation Degree, Self-Esteem and Knowledge Questionnaire" as a valid and highly reliable questionnaire.